273776 |
28-Oct-2014 |
mav |
MFS10 r273767 / MFC r273638: Revert the somewhat hackish geom_disk optimization committed as part of r256880, and the follow-up commit r273143, which was meant to work around the resulting issue with a quite innocent-looking change.
Although it is not clearly understood why, r273143 is blamed for data corruption in some environments with high I/O load. I personally don't see any problem in that commit, and possibly it is just a trigger for some other bug somewhere, but better safe than sorry for now.
Requested by: scottl@ Approved by: re (kib@) |
272461 |
03-Oct-2014 |
gjb |
Copy stable/10@r272459 to releng/10.1 as part of the 10.1-RELEASE process.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
272006 |
23-Sep-2014 |
cperciva |
MFC r271664: Cache GELI passphrases entered at the console during the boot process, in order to improve user-friendliness when a system has multiple disks encrypted using the same passphrase.
Relnotes: yes Approved by: re (gjb)
|
271636 |
15-Sep-2014 |
emaste |
MFC EFI support for the installer
r264978 (nwhitehorn):
Add EFI support to the installer. This requires that the kernel provide a sysctl to determine what firmware is in use. This sysctl does not exist yet, so the following blocks are in front of the wheels: - I've provisionally called this "hw.platform" after the equivalent thing on PPC - The logic to check the sysctl is short-circuited to always choose BIOS. There's a comment in the top of the file about how to turn this off.
If IA64 acquired a boot1.efifat-like thing (probably with very few modifications), the same code could be adapted there.
r265016 (nwhitehorn):
Finish connecting up installer UEFI support. If the kernel was booted using EFI, set up the disks for an EFI system. If booted from BIOS/CSM, set up for BIOS.
r268256 (nwhitehorn):
After EFI support was added to the installer, it needed to allow boot partitions of types other than "freebsd-boot" (in particular, "efi"). This allows the removal of some nasty hacks for supporting PowerPC systems, in particular aliasing freebsd-boot to apple-boot on APM and an IBM-specific code on MBR.
This changes the installer to use the correct names, which also breaks a degeneracy in the meaning of "freebsd-boot" that allows the addition of support for some newer IBM systems that can boot from GPT in addition to MBR. Since I have no idea how to detect which those systems are, leave the default on IBM PPC systems as MBR for now.
Approved by: re PR: 193658 Relnotes: Yes
|
271238 |
07-Sep-2014 |
smh |
MFC r256956: Improve ZFS N-way mirror read performance by using load and locality information.
MFC r260713: Fix ZFS mirror code for handling multiple DVA's
Also make the addition of the d_rotation_rate binary compatible. This allows storage drivers compiled for 10.0 to work by preserving the ABI for disks.
Approved by: re (gjb) Sponsored by: Multiplay
|
270552 |
25-Aug-2014 |
ae |
MFC r268407 (by gjb): Fix non-version text after .Fx macro usage.
MFC r269487 (by issyl0): Add generic list, status, load and unload docs to gpart(8)
- In the style of gmirror(8). PR: docs/191534
MFC r269852: Add sysctl and loader tunable kern.geom.part.mbr.enforce_chs that is set by default. It can be used to disable automatic alignment to CHS geometry, that GEOM_PART_MBR does.
|
269456 |
03-Aug-2014 |
marcel |
MFC 268986; fix file system corruption by creating as many BIOs as needed to satisfy the original request -- in other words: no short reads.
Obtained from: Juniper Networks, Inc.
|
268091 |
01-Jul-2014 |
ae |
MFC r267355: Add UUIDs for DragonFlyBSD's partition types.
MFC r267356: Add DragonFlyBSD's Hammer FS types and type names.
MFC r267357: Add aliases for DragonFlyBSD's partition types.
MFC r267358: Allow dumping to DragonFlyBSD's swap partition.
MFC r267359: Add disklabel64 support to GEOM_PART class.
This partitioning scheme is used in DragonFlyBSD. It is similar to BSD disklabel, but has the following improvements: * metadata has own dedicated place and isn't accessible through partitions; * all offsets are 64-bit; * supports 16 partitions by default (has reserved place for more); * has reserved place for backup label (but not yet implemented); * has UUIDs for partitions and partition types;
MFC r267360: Add disklabel64 support
Relnotes: yes
|
267860 |
25-Jun-2014 |
marius |
MFC: r267145
Fix the keyfile being cleared prematurely after r259428 (MFCed to stable/10 in r266749).
PR: 185084 Submitted by: fk@fabiankeil.de Reviewed by: pjd
|
267156 |
06-Jun-2014 |
ae |
MFC r266880: Use g_conf_printf_escaped() to escape symbols, which can break an XML tree.
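To illustrate why unescaped device strings can break the XML tree, here is a minimal userland sketch of entity escaping. The helper name `xml_escape` and its return convention are assumptions made for this example; it is not the kernel's g_conf_printf_escaped() and handles only the five predefined XML entities.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, replacing the five characters
 * that have predefined XML entities. Returns 0 on success, or -1 if dst
 * cannot hold the escaped text plus the terminating NUL.
 */
static int
xml_escape(char *dst, size_t dstlen, const char *src)
{
	size_t used = 0;

	for (; *src != '\0'; src++) {
		char one[2] = { *src, '\0' };
		const char *rep = one;	/* default: copy the byte as-is */
		size_t n;

		switch (*src) {
		case '&':  rep = "&amp;";  break;
		case '<':  rep = "&lt;";   break;
		case '>':  rep = "&gt;";   break;
		case '"':  rep = "&quot;"; break;
		case '\'': rep = "&apos;"; break;
		default:   break;
		}
		n = strlen(rep);
		if (used + n >= dstlen)
			return (-1);	/* no room for text plus NUL */
		memcpy(dst + used, rep, n);
		used += n;
	}
	if (dstlen == 0)
		return (-1);
	dst[used] = '\0';
	return (0);
}
```

A device description such as `a<b&c` would come out as `a&lt;b&amp;c`, which an XML parser can read back without seeing a stray tag.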
|
266970 |
02-Jun-2014 |
ae |
MFC r266444: There are two functions from which a geom orphan method can be called: g_orphan_register and g_resize_provider_event. Both are called from the event queue. The GEOM_DEV class also does a deferred destroy of its consumers via g_dev_destroy (also called from the event queue). It is therefore possible for an orphan method to be called twice for some consumers, which triggers a panic in g_dev_orphan. Check that the consumer isn't already orphaned before calling the orphan method.
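The guard described here can be modelled in userland as a flag that is tested and set before the callback runs. This is only a sketch: the structure, the flag value `CF_ORPHAN` and the function names are hypothetical and are not the kernel's definitions.

```c
#include <stddef.h>

#define CF_ORPHAN	0x4	/* hypothetical flag value, for illustration */

struct consumer {		/* hypothetical stand-in for a GEOM consumer */
	unsigned flags;
	void (*orphan)(struct consumer *);
	int orphan_calls;	/* bookkeeping for the example only */
};

/* Deliver the orphan notification at most once per consumer. */
static void
deliver_orphan(struct consumer *cp)
{
	if (cp->flags & CF_ORPHAN)
		return;		/* already orphaned: skip the second call */
	cp->flags |= CF_ORPHAN;
	if (cp->orphan != NULL)
		cp->orphan(cp);
}

/* Example orphan method that just counts invocations. */
static void
count_orphan(struct consumer *cp)
{
	cp->orphan_calls++;
}
```

With this shape, two queued events that both reach `deliver_orphan()` for the same consumer result in a single call to its orphan method.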
|
266749 |
27-May-2014 |
marius |
MFC: r259428
Clear content of keyfiles loaded by the loader after processing them.
MFC: r259429
Clear some more places with potentially sensitive data.
|
266679 |
26-May-2014 |
ae |
MFC r266445: Add a topology trace to the g_spoil_event.
|
266608 |
24-May-2014 |
mav |
MFC r266319: Make GEOM DISK to account also BIO_FLUSH operations.
|
266220 |
16-May-2014 |
loos |
MFC r260522, r260523, r261439, r261440, r261586, r264504, r264769, r265193, r265194, r265197
r260522: Add the manual page for geom_uncompress(4).
r260523: Build the geom_uncompress(4) module by default.
Fix geom_uncompress(4) module loading. Don't link zlib.c (which is a module itself) directly.
r261439: Remove some unnecessary code. The offsets read from the first block are overwritten a few lines below.
r261440: Fix a logic error. Because of this inflateReset() wasn't being called and the output buffer wasn't being cleared between the inflate() calls, producing zeroed output after the first inflate() call.
This fixes the read of mkuzip(8) images with geom_uncompress(4).
r261586: Fix the build with DEBUG enabled. Where possible, fix style(9) issues.
r264504: Make sure not to do I/O for more than MAXPHYS bytes. Doing so can cause problems in our providers, such as a KASSERT in md(4). We can initiate I/O for more than MAXPHYS bytes if we've been given a BIO for MAXPHYS bytes, the blocks from which we're reading couldn't be compressed and we had compression in preceding blocks resulting in misalignment of the blocks we're trying to read relative to the sector. We're forced to round up the I/O length to make it a multiple of the sector size.
When we detect the condition, we'll reduce the block count and perform a "short" read. In g_uzip_done() we need to consider the original I/O length and stop early if we're about to deflate a block that we didn't read. By using bio_completed in the cloned BIO and not bio_length to check for this, we automatically and gracefully handle short reads that our providers may be doing on top of the short reads we may initiate ourselves.
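The block-count reduction described for r264504 can be sketched as follows. This is an assumption-laden model, not the geom_uzip code: the function names, the offset-table layout (nblk + 1 entries, block i occupying `[offsets[i], offsets[i+1])`) and the `max_io` parameter standing in for MAXPHYS are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of y (y must be non-zero). */
static uint64_t
roundup_u64(uint64_t x, uint64_t y)
{
	return ((x + y - 1) / y * y);
}

/*
 * Return how many of the first nblk compressed blocks can be read in one
 * I/O without the sector-aligned length exceeding max_io. The read starts
 * at the sector containing offsets[0] and is rounded up to a whole sector.
 */
static size_t
blocks_within_limit(const uint64_t *offsets, size_t nblk,
    uint32_t sector_bytes, uint64_t max_io)
{
	uint64_t start = offsets[0] - offsets[0] % sector_bytes;
	size_t n;

	for (n = nblk; n > 0; n--) {
		uint64_t len = roundup_u64(offsets[n] - start, sector_bytes);

		if (len <= max_io)
			break;	/* this many blocks fit: a "short" read */
	}
	return (n);
}
```

The completion path then has to stop after the blocks that were actually read, which is why the text above checks bio_completed rather than bio_length.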
r264769: Keep geom_uncompress(4) in line with geom_uzip(4), bring in the r264504 fix.
Make sure not to start I/O bigger than MAXPHYS bytes.
r265193: Some style and whitespace fixes. Reduce the difference between geom_uzip(4) and geom_uncompress(4). Now, they produce an almost clean diff(1) output.
Remove a duplicated variable from g_uncompress.c and an unnecessary header from g_uzip.c.
r265194: Actually the FEATURE() macro is defined on sys/sysctl.h.
r265197: Fix a leak in g_uzip_taste(). After retrieving all the block offsets from the uzip image, free the last data read.
|
266036 |
14-May-2014 |
bdrewery |
MFC r265072:
Remove redundant include
|
266031 |
14-May-2014 |
bdrewery |
MFC r264499:
Make g_access() KASSERT() more useful.
|
265912 |
12-May-2014 |
ae |
MFC r256690: Add an automatic resize support to the GEOM_PART class.
When the parent provider has been resized, the scheme-specific G_PART_RESIZE method updates the scheme's metadata. However, the changes are not saved to disk until `gpart commit` is called.
MFC r265336: Add advice on what to do when a partition was automatically resized.
|
265910 |
12-May-2014 |
ae |
MFC r265318: For schemes that do an automatic partition aligning move this code to separate function.
MFC r265331: Prevent an unexpected shrinking on resizing due to alignment for MBR, PC98 and VTOC8 schemes.
MFC r265333: Add better error description for case when we are doing resize and scheme-specific method returns EBUSY.
MFC r265539: It is safe to allow shrinking, when aligned size is bigger than current.
|
265669 |
08-May-2014 |
mav |
MFC r265054: Reduce number of opens by GEOM RAID during provider taste.
Instead of opening/closing the provider in each of the metadata classes, do it only once in core code. Since for SCSI disks open/close means sending some SCSI commands to the device, this change reduces taste time.
Sponsored by: iXsystems, Inc.
|
265668 |
08-May-2014 |
mav |
MFC r264313: Do not increment bio_data in case of BIO_DELETE.
This fixes KASSERT() panic in g_io_request().
|
264868 |
24-Apr-2014 |
mav |
MFC r264318: Fix wrong sizes used to access PD_Type and PD_State DDF metadata fields.
This caused incorrect behavior of arrays with big-endian DDF metadata. Little-endian metadata (like that used by Adaptec controllers) should not be harmed. The added workaround should be enough to manage compatibility.
|
264713 |
21-Apr-2014 |
bdrewery |
MFC r264142:
Show error code when failing to destroy a mirror on delay
|
264712 |
21-Apr-2014 |
bdrewery |
MFC r264320:
Fix spelling error in g_trace() call.
|
262318 |
22-Feb-2014 |
delphij |
MFC r261618:
In g_eli_crypto_hmac_init(), zero out after using the ipad buffer, k_ipad.
Note that the two consumers in geli(4) are not affected by this issue because of the way the code is constructed and as such, we believe there is no security impact with or without this change with geli(4)'s usage.
Reported by: Serge van den Boom <serge vdboom.org> Reviewed by: pjd
|
261993 |
16-Feb-2014 |
marcel |
MFC r258448: Have the GPT probe return a lower priority when the MBR is not a PMBR.
|
261455 |
04-Feb-2014 |
eadler |
MFC r258779,r258780,r258787,r258822:
Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this shifts into the sign bit. Instead use (1U << 31) which gets the expected result.
Similar to the (1 << 31) case it is not defined to do (2 << 30).
This fix is not ideal as it assumes a 32 bit int, but does fix the issue for most cases.
A similar change was made in OpenBSD.
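The difference between the two shifts can be shown in a few lines of C. The helper names below are hypothetical; the second one is a variant (not taken from the commit) that avoids the 32-bit int assumption mentioned above by shifting a fixed-width type.

```c
#include <stdint.h>

/*
 * (1 << 31) shifts a signed int into its sign bit, which is undefined
 * when int is 32 bits wide. With an unsigned literal, bit 31 is an
 * ordinary value bit and the result is well defined.
 */
static inline uint32_t
bit31_unsigned(void)
{
	return (1U << 31);
}

/*
 * Variant that does not depend on the width of int: shift a uint32_t.
 * Valid for n in the range 0..31.
 */
static inline uint32_t
bit_n(unsigned n)
{
	return ((uint32_t)1 << n);
}
```

The same reasoning applies to (2 << 30): writing it as (2U << 30) keeps the operation in unsigned arithmetic.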
|
261391 |
02-Feb-2014 |
mav |
MFC r260883: Remove unneeded and dangerous assignment. It would probably cause a NULL dereference panic if the compiler did not optimize it out.
|
261285 |
30-Jan-2014 |
ae |
MFC r261084: malloc() with M_WAITOK doesn't return NULL.
MFC r261085: Fix typo in r261084. Add to the gctl_error() an ability to specify error description even if numeric error code is already specified. Also by default set error code to EINVAL.
PR: 185852
MFC r261086: In gctl_copyin() remove unused error variable. geom_alloc_copyin() can't return ENOMEM, so describe its fail as bad control request. Add check for NULL pointer in gctl_dump(), since it can be NULL when geom_alloc_copyin() failed.
MFC r261089: Remove another unneeded NULL check from geom_alloc_copyin(). Do copyout in case of gctl version mismatch and fix sbuf leak in g_ctl_ioctl_ctl().
MFC r261091: Always free sbuf in gctl_free().
|
260980 |
21-Jan-2014 |
marck |
MFC r259925-259926:
Add GPT UUID for VMware vSAN meta-data partition.
Approved by: ae
|
260503 |
10-Jan-2014 |
ae |
MFC r259634: Prevent users from deactivating the last component of a mirror.
MFC r259929: Add an ability to stop gmirror and clear its metadata in one command. This fixes the problem, when gmirror starts again just after stop.
The problem occurs when gmirror's component has a geom label with equal size. E.g. gpt and gptid have the same size as the partition, diskid has the same size as the entire disk. When gmirror's geom has been destroyed, glabel creates its providers and this initiates a retaste.
Now "gmirror destroy" command is available. It destroys geom and also erases gmirror's metadata.
PR: 184985
|
260502 |
10-Jan-2014 |
ae |
MFC r258357: Add "resize" verb to gmirror(8) and such functionality to geom_mirror(4). Now it is easy to expand the size of the mirror when all its components are replaced. Also add g_resize method to geom_mirror class. It will write updated metadata to new last sector, when parent provider is resized.
|
260479 |
09-Jan-2014 |
mav |
MFC r258683: Escape special XML chars, returned by some devices, confusing XML parsers.
|
260478 |
09-Jan-2014 |
mav |
MFC r258220, r258251: Implement automatic live resize support for GEOM MULTIPATH class.
In "manual" mode just automatically resize provider in any direction. In "automatic" mode allow growth (with new metadata write); in case of shrinking check if there is already valid metadata found at the new location. This should allow easy transparent recovery if first resize was done by mistake.
While there, unify metadata write code and fix minor memory leak.
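The resize policy described for the two modes can be written as a small decision function. This is a sketch of the rules as stated in the text, not the GEOM MULTIPATH implementation; the enum, the function name and the `metadata_at_new_end` parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum mp_mode { MP_MANUAL, MP_AUTOMATIC };	/* hypothetical names */

/*
 * Decide whether a provider size change should be followed.
 * metadata_at_new_end: whether valid metadata was found at the location
 * implied by the new (smaller) size.
 */
static bool
mp_accept_resize(enum mp_mode mode, uint64_t old_size, uint64_t new_size,
    bool metadata_at_new_end)
{
	if (mode == MP_MANUAL)
		return (true);		/* follow the provider either way */
	if (new_size >= old_size)
		return (true);		/* growth: metadata is rewritten */
	return (metadata_at_new_end);	/* shrink only onto valid metadata */
}
```

The shrink rule is what makes recovery transparent: if a device was grown by mistake and then shrunk back, the old metadata is still at the old end and the shrink is accepted.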
|
260385 |
07-Jan-2014 |
scottl |
MFC Alexander Motin's GEOM direct dispatch work:
r256603: Introduce new function devstat_end_transaction_bio_bt(), adding new argument to specify present time. Use this function to move binuptime() out of lock, substantially reducing lock congestion when slow timecounter is used.
r256606: Move g_io_deliver() out of the lock, as required for direct dispatch. Move g_destroy_bio() out too to reduce lock scope even more.
r256607: Fix passing uninitialized bio_resid argument to g_trace().
r256610: Add unmapped I/O support to GEOM RAID.
r256830: Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodone() when unmapping temporary mapped buffer. That fixes double unmap if biodone() called twice for the same BIO (but with different done methods).
r256880: Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests to GEOM g_up/g_down thread, executing them directly in the caller context. That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid several context switches per I/O.
r259247: Fix bug introduced at r256607. We have to recalculate bp_resid here since sizes of original and completed requests may differ due to end of media.
Testing of the stable/10 merge was done by Netflix, but all of the credit goes to Alexander and iX Systems.
Submitted by: mav Sponsored by: iX Systems
|
259383 |
14-Dec-2013 |
ae |
MFC r257965: Add missing line breaks.
PR: 181900
|
259328 |
13-Dec-2013 |
trasz |
MFC r256724:
Make geom_label(4) resize-aware. This fixes a situation when "gpart resize" would resize a partition, but label providers - e.g. /dev/gptid/XXX - would stay the same size.
MFC r256766:
Fix build with gcc by spelling unused format string as "unused" instead of NULL.
Sponsored by: The FreeBSD Foundation
|
258505 |
24-Nov-2013 |
mjg |
MFC r256951: gnop: make sure that newly allocated memory for softc is zeroed
This prevents mtx_init from encountering non-zeros and panicking the kernel as a result.
Approved by: re
|
257718 |
05-Nov-2013 |
delphij |
MFC r257539:
When zero'ing out a buffer, make sure we are using right size.
Without this change, in the worst but unlikely case scenario, certain administrative operations, including change of configuration, set or delete key from a GEOM ELI provider, may leave potentially sensitive information in buffer allocated from kernel memory.
We believe that it is not possible to actively exploit these issues, nor does it impact the security of normal usage of GEOM ELI providers when these operations are not performed after system boot.
Security: possible sensitive information disclosure Submitted by: Clement Lecigne <clecigne google com> Approved by: re (glebius)
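As a general illustration of this class of bug (not necessarily the exact defect fixed here), a wipe routine is only as good as the length it is given: applying sizeof to a pointer yields the pointer's size, not the buffer's. The sketch below takes the length explicitly and writes through a volatile pointer so the stores are not optimized away; the name `wipe` is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: overwrite len bytes at buf with zeros.
 * The caller must pass the real buffer length; sizeof(buf) on a pointer
 * argument would only cover the first few bytes.
 */
static void
wipe(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}
```

For a key buffer declared as an array, `wipe(key, sizeof(key))` clears all of it; for a heap buffer, the allocation size has to be carried alongside the pointer.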
|
256281 |
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
255860 |
24-Sep-2013 |
des |
Introduce a kern.geom.notaste sysctl that can be used to temporarily disable GEOM tasting to avoid the "bouncing GEOM" problem where, when you shut down the consumer of a provider which can be viewed in multiple ways (typically a mirror whose members are labeled partitions), GEOM will immediately taste that provider's alter ego and reattach the consumer.
Approved by: re (glebius)
|
255237 |
05-Sep-2013 |
ae |
Remove stub implementation.
MFC after: 1 week
|
255144 |
02-Sep-2013 |
mav |
Make ELI destruction (including orphanization) less aggressive, making it always wait for provider close. The old algorithm was reported to cause a NULL dereference panic on an attempt to close the provider after softc destruction. If not for the global workaround in GEOM, that could even cause destruction with requests still in flight.
|
254936 |
26-Aug-2013 |
mav |
MFprojects/camlock r254895: Add unmapped BIO support to GEOM ZERO if kern.geom.zero.clear is cleared.
|
254766 |
24-Aug-2013 |
mav |
Add new attribute lunname to report only textual LUN-specific device IDs. While lunid attribute prefers to report numeric ones, having both may be useful in some situations.
|
254389 |
15-Aug-2013 |
ken |
Change the way that unmapped I/O capability is advertised.
The previous method was to set the D_UNMAPPED_IO flag in the cdevsw for the driver. The problem with this is that in many cases (e.g. sa(4)) there may be some instances of the driver that can handle unmapped I/O and some that can't. The isp(4) driver can handle unmapped I/O, but the esp(4) driver currently cannot. The cdevsw is shared among all driver instances.
So instead of setting a flag on the cdevsw, set a flag on the cdev. This allows drivers to indicate support for unmapped I/O on a per-instance basis.
sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it with an SI_UNMAPPED cdev flag.
kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine whether or not a particular driver can handle unmapped I/O.
geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs. Since GEOM will create a temporary mapping when needed, setting SI_UNMAPPED unconditionally will work.
Remove the D_UNMAPPED_IO flag.
nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here if NVME_UNMAPPED_BIO_SUPPORT is enabled.
vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a cdev instead of the D_UNMAPPED_IO flag on the cdevsw.
sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from setting the D_UNMAPPED_IO flag in the cdevsw to setting SI_UNMAPPED in the cdev.
Reviewed by: kib, jimharris MFC after: 1 week Sponsored by: Spectra Logic
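The reason for moving the flag can be seen in a small userland model: one switch structure is shared by every instance, so only a per-instance field can distinguish them. The `demo_` types and the flag value below are hypothetical stand-ins, not the kernel's struct cdevsw, struct cdev or SI_UNMAPPED definitions.

```c
#define DEMO_SI_UNMAPPED	0x1	/* hypothetical flag value */

struct demo_cdevsw {			/* shared by all driver instances */
	const char *d_name;
};

struct demo_cdev {			/* one per device instance */
	const struct demo_cdevsw *si_devsw;
	unsigned si_flags;
};

/* Per-instance check, in the spirit of the physio-side test. */
static int
demo_accepts_unmapped(const struct demo_cdev *dev)
{
	return ((dev->si_flags & DEMO_SI_UNMAPPED) != 0);
}
```

Two instances pointing at the same `demo_cdevsw` can now give different answers, which a flag on the shared structure could not express.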
|
254275 |
13-Aug-2013 |
mav |
Return error when opening read-only volumes (like RAID4/5/...) for writing. Previously opens succeeded, but actual write operations returned errors.
Requested by: peter MFC after: 2 weeks
|
254271 |
13-Aug-2013 |
mav |
Oops, wrong constant at r254269.
|
254269 |
13-Aug-2013 |
mav |
Fix reasonable but safe Clang warnings.
|
254252 |
12-Aug-2013 |
ed |
Fix the formatting of the error message.
The G_MIRROR_DEBUG() macro already appends a newline. Also, most of the log messages emitted by gmirror start with an uppercase letter.
|
254095 |
08-Aug-2013 |
ae |
gpt_entries is used as limit for the number of partition entries in the GEOM_PART. Instead of just using number of entries from the GPT header, calculate this limit based on the reserved space between GPT header and first available LBA.
MFC after: 2 weeks
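A rough sketch of such a calculation, assuming the entry array starts at a known LBA and runs up to the first usable LBA. The function and parameter names are hypothetical, and the real code may apply additional bounds.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of partition entries that fit in the
 * sectors between the start of the entry array (table_lba) and the
 * first LBA available for partitions (first_usable_lba).
 */
static uint32_t
gpt_entry_limit(uint64_t table_lba, uint64_t first_usable_lba,
    uint32_t sector_bytes, uint32_t entry_bytes)
{
	uint64_t n;

	if (entry_bytes == 0 || first_usable_lba <= table_lba)
		return (0);
	n = (first_usable_lba - table_lba) * sector_bytes / entry_bytes;
	return (n > UINT32_MAX ? UINT32_MAX : (uint32_t)n);
}
```

With the common layout of 512-byte sectors, 128-byte entries, an entry array at LBA 2 and a first usable LBA of 34, this yields 128 entries; a disk whose first usable LBA is 2048 has room for far more than its header may advertise.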
|
254015 |
07-Aug-2013 |
marcel |
Change <sys/diskpc98.h> to not redefine the same symbols that are being defined in <sys/diskmbr.h>. Instead give the symbols here a "PC98_" prefix. This way, both <sys/diskmbr.h> and <sys/diskpc98.h> can be included in the same C source file.
The renaming is trivial. The only gotcha is that DOSBBSECTOR is also redefined from 0 to 1. This because DOSBBSECTOR was always used in conjunction with an addition of 1. The PC98_BBSECTOR symbol is defined as 1 and the expression is simplified.
Note: it is not believed that ports are seriously impacted; or at all for that matter.
Approved by: nyan@
|
253938 |
04-Aug-2013 |
marcel |
Remove inclusion of <sys/diskmbr.h>. We have no business knowing anything related to MBR in this file.
|
253706 |
27-Jul-2013 |
mav |
Introduce 3 seconds timeout on `graid stop` command (mostly with -f flag). Since completion waiting goes in g_event thread, it may cause GEOM deadlock if consumer on top (for example, ZFS) uses g_event thread for closing.
|
253141 |
10-Jul-2013 |
kib |
When panicking due to the gjournal overflow, print the geom metadata journal id.
Requested by: Andreas Longwitz <longwitz@incore.de> MFC after: 1 week
|
253106 |
09-Jul-2013 |
kib |
There are several code sequences like vfs_busy(mp); vfs_write_suspend(mp); which are problematic if another thread starts an unmount between the two calls. The unmount starts a write, while vfs_write_suspend() drains writers. On the other hand, unmount drains busy references, causing the deadlock.
Add a flag argument to vfs_write_suspend and require the callers of it to specify VS_SKIP_UNMOUNT flag, when the call is performed not in the mount path, i.e. the covered vnode is not locked. The suspension is not attempted if VS_SKIP_UNMOUNT is specified and unmount is in progress.
Reported and tested by: Andreas Longwitz <longwitz@incore.de> Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
252657 |
03-Jul-2013 |
smh |
Bump disk(9) ABI version to signify the addition of d_delmaxsize by r249940.
Ensure that d_delmaxsize is always set, removing init to 0 which could cause future issues if use cases change.
Allow kern.cam.da.X.delete_max (which maps to d_delmaxsize) to be increased up to the calculated max after being reduced.
MFC after: 1 day X-MFC-With: r249940
|
252330 |
28-Jun-2013 |
jeff |
- Add a general purpose resource allocator, vmem, from NetBSD. It was originally inspired by the Solaris vmem detailed in the proceedings of usenix 2001. The NetBSD version was heavily refactored for bugs and simplicity. - Use this resource allocator to allocate the buffer and transient maps. Buffer cache defrags are reduced by 25% when used by filesystems with mixed block sizes. Ultimately this may permit dynamic buffer cache sizing on low KVA machines.
Discussed with: alc, kib, attilio Tested by: pho Sponsored by: EMC / Isilon Storage Division
|
252011 |
19-Jun-2013 |
scottl |
Fix a mystery cut-n-paste corruption from the previous commit.
Submitted by: Brenden Fabeny
|
252010 |
19-Jun-2013 |
scottl |
Mark geom_mirror as capable of unmapped i/o
Obtained from: Netflix MFC after: 3 days
|
251654 |
12-Jun-2013 |
mav |
Make CAM return and GEOM DISK pass through new GEOM::lunid attribute.
SPC-4 specification states that serial number may be property of device, but not a specific logical unit. People reported about FC storages using serial number in that way, making it unusable for purposes of LUN multipath detection. SPC-4 states that designators associated with logical unit from the VPD page 83h "Device Identification" should be used for that purpose. Report first of them in the new attribute in such preference order: NAA, EUI-64, T10 and SCSI name string.
While there, make GEOM DISK properly report GEOM::ident in XML output also using d_getattr() method, if available. This fixes serial numbers reporting for SCSI disks in `geom disk list` output and confxml.
Discussed with: gibbs, ken Sponsored by: iXsystems, Inc. MFC after: 2 weeks
|
251616 |
11-Jun-2013 |
mav |
Don't update provider properties and don't set DISKFLAG_OPEN if d_open() disk method call returned error. GEOM considers devices in such case as still closed, and won't call symmetric d_close() for them.
|
251588 |
09-Jun-2013 |
marcel |
Change the set and unset ctlreqs by making the index argument optional. This allows setting attributes on tables. One simply does not provide an index in that case. Otherwise the entry corresponding to the index has the attribute set or unset.
Use this change to fix a relatively longstanding bug in our GPT scheme that's the result of rev 198097 (relatively harmless) followed by rev 237057 (damaging). The damaging part being that our GPT scheme always has the active flag set on the PMBR slice. This is in violation with EFI. Existing EFI implementations for both x86 and ia64 reject the GPT. As such, GPT disks created by us aren't usable under EFI because of that.
After this change, GPT disks never have the active flag set on the PMBR slice. In order to make the GPT disk bootable under some x86 BIOSes, the reason of rev 198097, one must now set the active attribute on the gpt table. The kernel will apply this to the PMBR slice. For (S)ATA: gpart set -a active ada0
To fix an existing GPT disk that has the active flag set in the PMBR, and that does not need the flag, use (again for (S)ATA): gpart unset -a active ada0
The EBR, MBR & PC98 schemes, which also implement at least 1 attribute, now check to make sure the entry passed is valid. They do not have attributes that apply to the table.
|
251587 |
09-Jun-2013 |
marcel |
Remove stub implementation.
|
251117 |
30-May-2013 |
brooks |
MFP4 @222836
Add support for partitioning CFI disks from FDT using geom_flashmap.
Sponsored by: DARPA, AFRL
|
250868 |
21-May-2013 |
jh |
Remove an extra semicolon from the DOT language output.
PR: kern/178540 Submitted by: Trond Endrestol MFC after: 1 week
|
250819 |
20-May-2013 |
mav |
Fix vdc->Secondary_Element_Count metadata field access from 16 to 8 bit. In some cases it could cause kernel panic during failed drive replacement.
Reported by: trasz MFC after: 1 week
|
250264 |
05-May-2013 |
stas |
- Use int8_t type for the mftrecsz field in g_label_ntfs. char type used previously caused probe failure on platforms where char is unsigned (e.g. ARM), as mftrecsz can be negative.
Submitted by: Ilya Bakulin <ilya@bakulin.de> MFC after: 2 weeks
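The effect of the field's signedness can be seen in a small decoder. This sketch assumes the usual NTFS convention that a positive value counts clusters while a negative value n means a record of 2^(-n) bytes; the function name and parameters are hypothetical and this is not the g_label_ntfs code.

```c
#include <stdint.h>

/*
 * Hypothetical decoder for the one-byte MFT record size field.
 * Returns the record size in bytes, or 0 if the value is not usable.
 */
static uint32_t
mft_record_bytes(int8_t mftrecsz, uint32_t cluster_bytes)
{
	if (mftrecsz > 0)
		return ((uint32_t)mftrecsz * cluster_bytes);
	if (mftrecsz == 0 || mftrecsz < -31)
		return (0);	/* invalid, or shift would exceed 32 bits */
	return ((uint32_t)1 << -mftrecsz);
}
```

A raw byte of 0xF6 read as int8_t is -10 and decodes to 1024 bytes. Read through a plain char on a platform where char is unsigned, the same byte is 246, the negative branch is never taken, and the probe fails as described.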
|
249974 |
27-Apr-2013 |
mav |
Return "descr" field alike to "Intel RAID1 volume" for GEOM RAID to make it look better in bsdinstall.
|
249940 |
26-Apr-2013 |
smh |
Teach GEOM and CAM about the difference between the max "size" of r/w and delete requests.
sys/geom/geom_disk.h: - Added d_delmaxsize which represents the maximum size of individual device delete requests in bytes. This can be used by devices to inform geom of their size limitations regarding delete operations which are generally different from the read / write limits as data is not usually transferred from the host to physical device.
sys/geom/geom_disk.c: - Use new d_delmaxsize to calculate the size of chunks passed through to the underlying strategy during deletes instead of using read / write optimised values. This defaults to d_maxsize if unset (0).
- Moved d_maxsize default up so it can be used to default d_delmaxsize
sys/cam/ata/ata_da.c: - Added d_delmaxsize calculations for TRIM and CFA
sys/cam/scsi/scsi_da.c: - Added re-calculation of d_delmaxsize whenever delete_method is set.
- Added kern.cam.da.X.delete_max sysctl which allows the max size for delete requests to be limited. This is useful in preventing timeouts on devices whose delete methods are slow. It should be noted that this limit is reset when the device delete method is changed and that it can only be lowered, not increased, from the device max.
Reviewed by: mav Approved by: pjd (mentor)
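The defaulting and chunking rule described for geom_disk.c can be sketched in userland as follows. The structure is a hypothetical stand-in carrying only the two fields named above, and the helper names are assumptions; this is not the geom_disk code itself.

```c
#include <stdint.h>

struct disk_limits {		/* hypothetical stand-in for struct disk */
	uint64_t d_maxsize;	/* read/write request limit, in bytes */
	uint64_t d_delmaxsize;	/* delete request limit, in bytes; 0 = unset */
};

/* Delete limit, falling back to the read/write limit when unset. */
static uint64_t
delete_chunk_limit(const struct disk_limits *dp)
{
	return (dp->d_delmaxsize != 0 ? dp->d_delmaxsize : dp->d_maxsize);
}

/* Number of chunks needed to cover a delete of the given length. */
static uint64_t
delete_chunk_count(const struct disk_limits *dp, uint64_t length)
{
	uint64_t lim = delete_chunk_limit(dp);

	if (lim == 0)
		return (0);	/* no usable limit configured */
	return (length / lim + (length % lim != 0));
}
```

A device that advertises a large delete limit needs far fewer requests for the same range than one limited to its read/write size, which is the point of tracking the two limits separately.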
|
249930 |
26-Apr-2013 |
smh |
Added a sysctl (kern.geom.dev.delete_max_sectors) to control the maximum size of a delete request sent to the providing device performed by g_dev_ioctl.
This allows the kernel and apps via ioctl e.g. newfs -E to request large LBA deletes which significantly improves performance.
Previously this was hard coded to 65536 sectors, the new default is 262144 which doubles the throughput of deletes on commonly available SSD's.
In tests on an Intel 520 120GB FW: 400i disk it improved the delete throughput from 1.6GB/s to over 2.6GB/s on a full disk delete such as that done via newfs -E
For some SSD's where delete time is pretty much constant, no matter what the request, setting this to 0 will provide significantly better throughput e.g. Samsung 840 240GB FW DXT07B0Q @ 262144 = 79G/s, @ 0 = 2259G/s
Reviewed by: mav Approved by: pjd (mentor) MFC after: 2 weeks
|
249571 |
16-Apr-2013 |
ivoras |
Comment typo fix.
Is aware of the importance of comments: dim
|
249564 |
16-Apr-2013 |
ivoras |
Fix the buffer-overflow-fixing fixes.
Pointy-hat to: me, for not realizing snprintf() is available in kernel. Thanks to: jh, for bringing me the good news of snprintf(), Pawel Worach, for noting that the panic can be provoked in i386 and not in amd64
|
249556 |
16-Apr-2013 |
brooks |
Partial MFP4 of 222836:
Only look for FDT partitions if our potential parent is a DISK device.
Excluding direct recursion on the flashmap geoms was insufficient because it did not prevent the underlying device from being retrieved if flashmap geoms were further partitioned.
Reviewed by: imp Sponsored by: DARPA, AFRL
|
249508 |
15-Apr-2013 |
ivoras |
Introduce glabel labels based on GEOM ident attributes. In this initial implementation, error on the side of conservatism and only create labels for GEOMs of classes DISK and MULTIPATH.
Discussed with: trasz Approved by: silence from freebsd-geom@
|
249507 |
15-Apr-2013 |
ivoras |
Introduce a symbol for the GEOM class name instead of using the ad-hoc string constant.
|
249440 |
13-Apr-2013 |
jmg |
move the error report to a lower log level... Now you can see when it returns an error without getting every single io that went through it..
MFC after: 1 week
|
249193 |
06-Apr-2013 |
trasz |
Make it possible to submit FLUSH bios through geom_dev strategy. This is required for CTL to work with device-backed LUNs.
Reviewed by: mav
|
249161 |
05-Apr-2013 |
mav |
Following r241022, replace iteration over the provider list on media events by taking first one and asserting that there is no others.
MFC after: 1 week
|
248722 |
26-Mar-2013 |
mav |
geom_slice.c and its consumers like GEOM_LABEL are not touching the data unless hotspots are used. Pass G_PF_ACCEPT_UNMAPPED flag through except such rare cases (obsolete GEOM_SUNLABEL and GEOM_BSD).
|
248721 |
26-Mar-2013 |
mav |
GEOM NOP does not touch the data, so pass G_PF_ACCEPT_UNMAPPED flag through.
|
248720 |
26-Mar-2013 |
mav |
Remove extra bio_data and bio_length copying to child request after calling g_clone_bio(), that already copied them.
|
248712 |
26-Mar-2013 |
kan |
Do not pass unmapped buffers to drivers that cannot handle them
In physio, check if device can handle unmapped IO and pass an appropriately mapped buffer to the driver strategy routine. The only driver in the tree that can handle unmapped buffers is one exposed by GEOM, so mark it as such with the new flag in the driver cdevsw structure.
This fixes insta-panics on hosts, running dconschat, as /dev/fwmem is an example of the driver that makes use of physio routine, but bypasses the g_down thread, where the buffer gets mapped normally.
Discussed with: kib (earlier version)
|
248696 |
25-Mar-2013 |
mav |
Make GEOM MULTIPATH report unmapped bio support if the underlying path reports it. GEOM MULTIPATH itself never touches the data and so is transparent.
|
248694 |
25-Mar-2013 |
mav |
In GEOM DISK: - Replace single done mutex with per-disk ones. On system with several disks on several HBAs that removes small, but measurable lock congestion. - Modify disk destruction process to not destroy the mutex prematurely. - Remove some extra pointer dereferences.
|
248679 |
24-Mar-2013 |
mav |
Fix long known deadlock between geom dev destruction and d_close() call. Use destroy_dev_sched_cb() to not wait for device destruction while holding GEOM topology lock (that actually caused deadlock). Use request counting protected by mutex to properly wait for outstanding requests completion in cases of device closing and geom destruction. Unlike r227009, this code does not block taskqueue thread for indefinite time, waiting for completion.
|
248674 |
24-Mar-2013 |
mav |
Make g_wither_washer() to not loop by itself, but only when there was some more topology change done that may require its attention. Add few missing g_do_wither() calls in respective places to signal it.
This fixes potential infinite loop here when some provider is withered, but still opened or connected for some reason and so can not be destroyed. For example, see r227009 and r227510.
|
248596 |
21-Mar-2013 |
kib |
Correct the page count when excess length is trimmed from the bio.
Reported and tested by: Ivan Klymenko <fidaj@ukr.net>
|
248568 |
21-Mar-2013 |
kib |
Assert that transient mapping of the bio is only done when unmapped buffers are allowed.
Sponsored by: The FreeBSD Foundation
|
248517 |
19-Mar-2013 |
kib |
The geom_part provider supports unmapped bio iff the underlying provider does so, since geom_part never inspects the bio_data.
Sponsored by: The FreeBSD Foundation Tested by: pho
|
248516 |
19-Mar-2013 |
kib |
A flag for the geom disk driver to indicate that it accepts the unmapped i/o requests.
Sponsored by: The FreeBSD Foundation Tested by: pho
|
248508 |
19-Mar-2013 |
kib |
Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads.
The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag.
When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation.
Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bios carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap.
The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests.
Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is no longer subject to the maxbufspace limit. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached.
By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions.
Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks
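The hand-off rule described above (a provider must declare that it accepts unmapped bios, otherwise the g_down thread maps the pages transiently) can be summarized with a small Python sketch. It models only the decision, not the kernel data structures; the function name and the returned strings are hypothetical.

```python
def prepare_for_provider(bio_is_unmapped, provider_accepts_unmapped):
    """Decide how a bio is handed to a provider.

    Returns "as-is" when no extra work is needed and "transient-map"
    when the pages must first be mapped into KVA.
    """
    if bio_is_unmapped and not provider_accepts_unmapped:
        return "transient-map"
    return "as-is"
```

Mapped bios are never affected, which is why classes that do not inspect the data can simply pass the accept flag through from the provider below them.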
|
248295 |
14-Mar-2013 |
pjd |
We don't need buffer to handle BIO_DELETE, so don't check buffer size for it. This fixes handling BIO_DELETE larger than MAXPHYS.
|
248068 |
08-Mar-2013 |
sbruno |
Add legacy support to geom raid to create a /dev/arX device for support of upgrading older machines using ataraid(4) to newer releases.
This optional parameter is controlled via kern.geom.raid.legacy_aliases and will create a /dev/ar0 device that will point at /dev/raid/r0 for example.
Tested on Dell SC 1425 DDF-1 format software raid controllers installing from stable/7 and upgrading to stable/9 without having to adjust /etc/fstab
Reviewed by: mav Obtained from: Yahoo! MFC after: 2 Weeks
|
248058 |
08-Mar-2013 |
dumbbell |
g_label_ntfs_taste: Abort taste if recsize == 0
This will avoid a 0-byte read (in g_read_data()) leading to a panic, if previously read data are erroneous.
Suggested by: John-Mark Gurney <jmg@funkthat.com>
|
247961 |
07-Mar-2013 |
gavin |
Support the FAT16 partition type in gpart(8)
PR: kern/174714 Submitted by: 4721 at hushmail dot com MFC after: 1 week
|
247918 |
07-Mar-2013 |
mav |
Fix panic when Secondary_Element_Count == 1 and Secondary_Element_Seq is not set (255).
Reported by: sbruno MFC after: 1 week
|
247837 |
05-Mar-2013 |
dumbbell |
g_label_ntfs.c: Mark structures as __packed
Without this, read data is mis-interpreted. This could trigger a panic, as was the case on one computer where computed "recsize" was zero, leading to a call to g_read_page() asking for 0 bytes.
|
247662 |
02-Mar-2013 |
attilio |
Remove ntfs headers dependency for g_label_ntfs.c by redefining the used structs and values.
This patch is not targeted for MFC.
|
246876 |
16-Feb-2013 |
mckusick |
Add barrier write capability to the VFS buffer interface. A barrier write is a disk write request that tells the disk that the buffer being written must be committed to the media along with any writes that preceded it before any future blocks may be written to the drive.
Barrier writes are provided by adding the functions bbarrierwrite (bwrite with barrier) and babarrierwrite (bawrite with barrier).
Following a bbarrierwrite the client knows that the requested buffer is on the media. It does not ensure that buffers written before that buffer are on the media. It only ensures that buffers written before that buffer will get to the media before any buffers written after that buffer. A flush command must be sent to the disk to ensure that all earlier written buffers are on the media.
Reviewed by: kib Tested by: Peter Holm
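The ordering guarantee described above can be made concrete with a small Python model. This is a sketch of the stated semantics only, not of the buffer cache code; the function name `respects_barriers` and the tuple layout are assumptions. Each write is assigned an epoch that increases after every barrier write, and a completion order is acceptable when epochs never decrease.

```python
def respects_barriers(submitted, completed):
    """Check a completion order against barrier-write ordering.

    submitted: list of (write_id, is_barrier) in submission order.
    completed: list of write_ids in the order they reached the media.
    """
    epoch = {}
    current = 0
    for write_id, is_barrier in submitted:
        epoch[write_id] = current
        if is_barrier:
            current += 1  # later writes belong to the next epoch
    epochs = [epoch[w] for w in completed]
    # Writes may be reordered freely inside an epoch, never across one.
    return all(a <= b for a, b in zip(epochs, epochs[1:]))
```

Note that the model allows the barrier write itself to reach the media before earlier writes of the same epoch, matching the statement that a barrier write does not prove that earlier buffers are already on the media.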
|
245946 |
26-Jan-2013 |
avg |
g_mirror: g_getattr() failure should not be fatal
This allows to use gmirror e.g. on top of ZVOLs.
PR: kern/175323 Submitted by: Alexei.Volkov@softlynx.ru, mav Reported by: Alexei.Volkov@softlynx.ru Tested by: Alexei.Volkov@softlynx.ru Reviewed by: ae, mav, pjd MFC after: 1 week
|
245533 |
17-Jan-2013 |
mav |
- Fix rebuild position broken at r245522.
- Identify one more metadata field.
|
245522 |
17-Jan-2013 |
mav |
For Promise/AMD metadata add support for disks with capacity above 2TiB and for volumes with sector size above 512 bytes.
|
245519 |
17-Jan-2013 |
mav |
Recalculate volume size only for real CONCATs. For SINGLE trust volume size given by metadata, as it should be correct and in some cases can be smaller than subdisk size.
|
245456 |
15-Jan-2013 |
mav |
Allow to insert new component to geom_raid3 without specifying number.
PR: kern/160562 MFC after: 2 weeks
|
245444 |
15-Jan-2013 |
mav |
Alike to r242314 for GRAID make GRAID3 more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping.
ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes.
MFC after: 2 weeks
|
245443 |
15-Jan-2013 |
mav |
Alike to r242314 for GRAID make GMIRROR more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping.
ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes.
PR: kern/113957 MFC after: 2 weeks
|
245433 |
14-Jan-2013 |
mav |
Keep value of orig_config_id metadata field. Windows driver writes there previous value of config_id when it is changed in some cases. I guess it may be used to avoid some split-brain conditions.
|
245425 |
14-Jan-2013 |
mav |
Small cosmetic tuning of the IRRT status constants.
|
245423 |
14-Jan-2013 |
mav |
Print some more metadata fields.
|
245400 |
14-Jan-2013 |
mav |
Windows driver writes relative volume IDs to metadata field. Use that value as a hint for raid/rX device number to make it persistent across reboots.
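Using a stored ID as a "hint" for the unit number could work along the lines of the Python sketch below. The fallback policy (lowest free unit when the hinted one is taken or absent) and the name `choose_unit` are assumptions for illustration; the commit message does not describe what happens on a conflict.

```python
def choose_unit(hint, used):
    """Pick a raid/rX unit number, preferring the hinted one.

    hint: volume ID read from metadata, or None if unavailable.
    used: set of unit numbers already taken.
    """
    if hint is not None and hint >= 0 and hint not in used:
        return hint
    unit = 0
    while unit in used:  # assumed fallback: lowest free unit
        unit += 1
    return unit
```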
|
245398 |
13-Jan-2013 |
mav |
- Add checks for Intel metadata version and attributes. Ignore disks with unsupported metadata types like Intel Smart Response to not corrupt them.
- Improve setting of these things during metadata writing to protect from incapable BIOSes and other implementations.
|
245363 |
13-Jan-2013 |
mav |
Improve support for disabled disks. If disabled disk disconnected and then reconnected back, leave it as disconnected. If new disk inserted instead of disabled, rebuild it and leave as enabled.
|
245341 |
12-Jan-2013 |
mav |
Windows handles INIT and VERIFY as array-wide and it doesn't specify which disks should be rebuilt. Our rebuild code is, at the same time, disk-centric. To handle this situation properly check all disks for RBLD flags, and if no disk specified try rebuild/resync all of them except newly inserted.
|
245338 |
12-Jan-2013 |
mav |
Implement migration from single disk to RAID1/IRRT for Intel metadata. Windows driver uses such migration when it creates new arrays. While GEOM RAID has no mechanism to implement migration in general case, this specific case still can be handled easily via degraded RAID1 creation followed by regular rebuild.
|
245326 |
12-Jan-2013 |
mav |
Add basic support for Intel Rapid Recover Technology (Intel RRT). It is alike to RAID1, but with dedicating master and recovery disks and providing manual control over synchronization. It allows to use recovery disk as snapshot of the master disk from the time of the last sync.
This implementation is not functionally complete compared to Windows, but it is better than silent conversion to RAID1 on first boot.
|
245286 |
11-Jan-2013 |
kib |
Add flags argument to vfs_write_resume() and remove vfs_write_resume_flags().
Sponsored by: The FreeBSD Foundation
|
244716 |
26-Dec-2012 |
pjd |
Reset provider-specific fields when resending I/O request in low memory conditions. This fixes assertion which checks those fields when kernel is compiled with DIAGNOSTIC.
Reported by: kib, pho MFC after: 1 week
|
244585 |
22-Dec-2012 |
jh |
Mangle label names containing spaces, non-printable characters, '%' or '"'. Mangling is only done for label names read from file system metadata. Encoding resembles URL encoding. For example, the space character becomes %20.
Help by: kib Discussed with: imp, kib, pjd
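The mangling rule described above can be illustrated with a short Python sketch. It is not the kernel implementation: the name `mangle_label` is hypothetical, and treating every byte outside ASCII 0x21 to 0x7e as non-printable is an assumption. Only the `%20` result for a space is taken from the commit message.

```python
def mangle_label(label: str) -> str:
    """Percent-encode characters that are awkward in a device name.

    Assumption: printable means ASCII 0x21..0x7e; space, '%' and '"'
    are always encoded.
    """
    out = []
    for byte in label.encode("utf-8"):
        ch = chr(byte)
        if byte <= 0x20 or byte >= 0x7f or ch in '%"':
            out.append("%%%02x" % byte)  # e.g. 0x20 becomes "%20"
        else:
            out.append(ch)
    return "".join(out)


print(mangle_label("my disk"))  # my%20disk
```

Encoding '%' itself keeps the mapping unambiguous, since an unencoded '%' can then never appear in a mangled name.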
|
243333 |
20-Nov-2012 |
jh |
- Don't pass geom and provider names as format strings.
- Add __printflike() attributes.
- Remove an extra argument for the g_new_geomf() call in swapongeom_ev().
Reviewed by: pjd
|
242439 |
01-Nov-2012 |
alfred |
Provide a device name in the sysctl tree for programs to query the state of crashdump target devices.
This will be used to add a "-l" (ell) flag to dumpon(8) to list the currently configured dumpdev.
Reviewed by: phk
|
242379 |
30-Oct-2012 |
trasz |
Fix problem with geom_label(4) not recognizing UFS labels on filesystems extended using growfs(8). The problem here is that geom_label checks if the filesystem size recorded in UFS superblock is equal to the provider (i.e. device) size. This check cannot be removed due to backward compatibility. On the other hand, in most cases growfs(8) cannot set fs_size in the superblock to match the provider size, because, differently from newfs(8), it cannot recompute cylinder group sizes.
To fix this problem, add another superblock field, fs_providersize, used only for this purpose. The geom_label(4) will attach if either fs_size (filesystem created with newfs(8)) or fs_providersize (filesystem expanded using growfs(8)) matches the device size.
PR: kern/165962 Reviewed by: mckusick Sponsored by: FreeBSD Foundation
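The resulting attach condition can be written down as a one-line check. The Python sketch below assumes all three sizes are expressed in the same unit; the function and parameter names are hypothetical and only mirror the superblock field names mentioned above.

```python
def ufs_label_should_attach(provider_size: int, fs_size: int,
                            fs_providersize: int) -> bool:
    """Return True if either recorded size matches the provider size.

    fs_size matches for filesystems created by newfs(8);
    fs_providersize matches for filesystems expanded by growfs(8).
    """
    return provider_size in (fs_size, fs_providersize)
```

Keeping the `fs_size` comparison preserves the old behavior for existing filesystems, which is the backward compatibility constraint the message refers to.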
|
242328 |
29-Oct-2012 |
mav |
Minor addition to r242323: Alike to BIO_WRITE, report success if at least one subdisk succeeded with BIO_DELETE. But unlike BIO_WRITE don't fail disk on BIO_DELETE error.
Sponsored by: iXsystems, Inc. MFC after: 1 month
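The "success if at least one subdisk succeeded" rule can be sketched in Python as below. The function name and the choice to return the first error when every subdisk fails are assumptions; the commit message only specifies the success case.

```python
def raid_delete_status(subdisk_errors):
    """Combine per-subdisk BIO_DELETE results into one volume-level result.

    subdisk_errors: list of errno-style ints, 0 meaning success.
    Returns 0 if any subdisk succeeded, otherwise the first error.
    """
    if not subdisk_errors:
        raise ValueError("at least one subdisk result is required")
    if any(err == 0 for err in subdisk_errors):
        return 0
    return subdisk_errors[0]  # assumed: report the first failure
```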
|
242323 |
29-Oct-2012 |
mav |
Add basic BIO_DELETE support to GEOM RAID class for all RAID levels.
If at least one subdisk in the volume supports it, BIO_DELETE requests will be propagated down. Unfortunately, for RAID levels with redundancy unmapped blocks will be mapped back during first rebuild/resync process.
Sponsored by: iXsystems, Inc. MFC after: 1 month
|
242322 |
29-Oct-2012 |
trasz |
Fix locking problem in disk_resize(); previously it would run without topology lock, resulting in assertion when running with DIAGNOSTIC.
Reviewed by: mav (earlier version)
|
242314 |
29-Oct-2012 |
mav |
Make GEOM RAID more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync to shutdown_post_sync stage to avoid extra flapping.
ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes.
MFC after: 2 weeks
|
241896 |
22-Oct-2012 |
kib |
Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes.
Conducted and reviewed by: attilio Tested by: pho
|
241706 |
18-Oct-2012 |
attilio |
It seems that it is preferable to keep support for glabel also for filesystems that we don't support natively. Revert part of r241636 to do so.
This patch is not targeted for MFC.
Requested by: gleb, jhb
|
241636 |
17-Oct-2012 |
attilio |
Disconnect non-MPSAFE NTFS from the build in preparation for dropping GIANT from VFS. This code is particularly broken and fragile and other in-kernel implementations around, found in other operating systems, don't really seem clean and solid enough to be imported at all. If someone wants to reconsider in-kernel NTFS implementation for inclusion again, a fair effort for completely fixing and cleaning it up is expected.
In the meantime, regular NTFS users can use the FUSE interface and the ntfs-3g port to work with their NTFS partitions.
This is not targeted for MFC.
|
241418 |
10-Oct-2012 |
mav |
NULL-ify last previously used pointer instead of last possible pointer. This should be only a cosmetic change.
Found by: Clang Static Analyzer
|
241329 |
07-Oct-2012 |
mav |
Make graid command line a bit more friendly by allowing volume name or provider name to be specified instead of geom name (first argument in all subcommands except label). In most cases there is only one array used any way, so it is not really useful to make user type ugly geom names like Intel-f0bdf223 or SiI-732c2b9448cf. Though they can be used in some cases.
Sponsored by: iXsystems, Inc. MFC after: 1 month
|
241296 |
06-Oct-2012 |
avg |
g_part_taste: directly destroy consumer and geom here, no need for withering
Besides, withered but still alive consumers may interfere with re-tasting.
MFC after: 16 days
|
241022 |
28-Sep-2012 |
pjd |
Remove the topology lock from disk_gone(), it might be called with regular mutexes held and the topology lock is an sx lock.
The topology lock was there to protect traversing through the list of providers of disk's geom, but it seems that disk's geom has always exactly one provider.
Change the code to call g_wither_provider() for this one provider, which is safe to do without holding the topology lock and assert that there is indeed only one provider.
Discussed with: ken MFC after: 1 week
|
240822 |
22-Sep-2012 |
pjd |
Use the topology lock to protect list of providers while withering them. It is possible that provider is destroyed while we are iterating over the list.
Reported by: Brian Parkison <parkison@panzura.com> Discussed with: phk MFC after: 1 week
|
240629 |
18-Sep-2012 |
avg |
g_disk_flushcache definitely should not be traced under G_T_TOPOLOGY
... use G_T_BIO instead
MFC after: 1 week
|
240465 |
13-Sep-2012 |
mav |
Add global and per-module sysctls/tunables to enable/disable metadata taste. That should help to handle some cases when disk has some RAID metadata that should be ignored, especially during boot.
MFC after: 3 days
|
240371 |
11-Sep-2012 |
glebius |
When synchronizing, include in the config dump the number of bytes synchronized. The rationale behind this is the following: for large disks the percent synchronisation counter ticks too seldom, and monitoring software (as well as human operator) can't tell whether synchronisation goes on or one of disks got stuck. On an idle server one can look into gstat and see whether synchronisation goes on or not, but on a busy server that won't work. Also, new value monitored can be differentiated obtaining the synchronisation speed quite precisely.
Submitted by: Konstantin Kukushkin <dark ramtel.ru> Reviewed by: pjd
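Differentiating the reported byte counter, as suggested above, amounts to dividing the change in bytes by the change in time between two samples. The following Python sketch shows that arithmetic; the function name and the way samples are obtained are assumptions.

```python
def sync_speed(bytes_then, time_then, bytes_now, time_now):
    """Average synchronization speed in bytes per second between samples."""
    elapsed = time_now - time_then
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    return (bytes_now - bytes_then) / elapsed
```

A monitoring script could also treat a speed of zero over several consecutive samples as a sign that synchronization is stuck.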
|
239987 |
01-Sep-2012 |
pjd |
Allow to pass providers with /dev/ prefix to g_provider_by_name().
MFC after: 3 days
|
239790 |
28-Aug-2012 |
ed |
Remove unneeded G_PF_CANDELETE flag.
This flag is only used by GEOM so it can be propagated to the character device's SI_CANDELETE. Unfortunately, SI_CANDELETE seems to do nothing.
|
239673 |
25-Aug-2012 |
thomas |
(g_multipath_rotate): Fix algorithm so that it does rotate over all good providers, not just the last two.
PR: kern/170379 Reviewed by: mav MFC after: 2 weeks
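Rotation over all good providers, rather than only the last two, can be pictured as a wrap-around search starting just after the current one. The Python sketch below is an illustration under assumed data structures (a list of name and state pairs), not the g_multipath_rotate() code itself.

```python
def next_active(providers, current):
    """Pick the next good provider after `current`, wrapping around.

    providers: list of (name, is_good) tuples in configuration order.
    Returns the chosen name, or None if no provider is good.
    """
    names = [name for name, _ in providers]
    start = names.index(current) if current in names else -1
    count = len(providers)
    for step in range(1, count + 1):
        name, is_good = providers[(start + step) % count]
        if is_good:
            return name
    return None
```

Calling it repeatedly with its own result visits every good provider once before returning to the first.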
|
239184 |
10-Aug-2012 |
pjd |
Always initialize sc_ekey, because as of r238116 it is always used.
If GELI provider was created on FreeBSD HEAD r238116 or later (but before this change), it is using very weak keys and the data is not protected. The bug was introduced on 4th July 2012.
One can verify if its provider was created with weak keys by running:
# geli dump <provider> | grep version
If the version is 7 and the system didn't include this fix when provider was initialized, then the data has to be backed up, underlying provider overwritten with random data, system upgraded and provider recreated.
Reported by: Fabian Keil <fk@fabiankeil.de> Tested by: Fabian Keil <fk@fabiankeil.de> Discussed with: so MFC after: 3 days
|
239175 |
10-Aug-2012 |
mav |
Add missing FAILED event to g_raid_subdisk_event2str() to print it properly in debug messages.
Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>
|
239132 |
07-Aug-2012 |
jimharris |
Clone BIO_ORDERED flag, for disk drivers (namely CAM) that try to consume it.
Sponsored by: Intel Discussed with: gibbs, scottl
|
239131 |
07-Aug-2012 |
trociny |
In g_gate_dumpconf() always check the result of g_gate_hold().
This fixes "Negative sc_ref" panic possible when sysctl_kern_geom_confxml() is run simultaneously with destroying GATE device.
Reviewed by: pjd MFC after: 3 days
|
239021 |
03-Aug-2012 |
jimharris |
In virstor_ctl_stop(), check for a valid softc before trying to update metadata.
Sponsored by: Intel Reported and tested by: Marcelo Gondim <gondim at bsdinfo dot com dot br> PR: kern/170199 MFC after: 3 days
|
239012 |
03-Aug-2012 |
thomas |
New command "gmultipath prefer" to force selection of a specified provider in an Active/Passive configuration.
Reviewed by: mav MFC after: 4 weeks
|
238892 |
29-Jul-2012 |
mav |
Partially revert r238886 in part of GEOM_VFS spoiling.
This change triggered interesting foot shooting condition in GEOM when RW access to root partition by fsck spoils VFS geom there, which has it opened RO at the same time. Seems spoiling concept needs some rework.
|
238886 |
29-Jul-2012 |
mav |
Implement media change notification for DA and CD removable media devices. It includes three parts:
1) Modifications to CAM to detect media changes and report them to disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes Asynchronous Notification mechanism to receive events from hardware. Active polling with TEST UNIT READY commands with 3 seconds period is used for incapable hardware. After that both CD and DA drivers work the same way, detecting two conditions: "NOT READY: Medium not present" after medium was detected previously, and "UNIT ATTENTION: Not ready to ready change, medium may have changed". First one reported to disk(9) as media removal, second as media insert/change. To reliably receive second event new AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by generic error handling code in cam_periph_error().
2) Modifications to GEOM core to handle media remove and change events. Media removal handled by spoiling all consumers attached to the provider. Media change event also schedules provider retaste after spoiling to probe new media. New flag G_CF_ORPHAN was added to consumers to reflect that consumer is in process of destruction. It allows retaste to create new geom instance of the same class, while previous one is still dying.
3) Modifications to some GEOM classes: DEV -- to report media change events to devd; VFS -- to handle spoiling same as orphan to prevent accessing replaced media. PART class already handles spoiling alike to orphan.
Reviewed by: silence on geom@ and scsi@ Tested by: avg Sponsored by: iXsystems, Inc. / PC-BSD MFC after: 2 months
|
238868 |
28-Jul-2012 |
trociny |
Reorder things in g_gate_create() so at the moment when g_new_geomf() is called name is properly initialized.
Discussed with: pjd MFC after: 2 weeks
|
238657 |
20-Jul-2012 |
trasz |
Make it possible to resize opened partitions.
Sponsored by: FreeBSD Foundation
|
238565 |
18-Jul-2012 |
trasz |
Add missing free.
|
238559 |
17-Jul-2012 |
ken |
Add back spare fields consumed in r237545. It seems that these should only be consumed to maintain backward compatibility in stable, but should not be consumed in head.
Submitted by: trasz, attilio (indirectly)
|
238534 |
16-Jul-2012 |
trasz |
The resize GEOM event has no references, thus cannot be canceled.
|
238533 |
16-Jul-2012 |
trasz |
Add back spare fields reused in r238213. According to Attilio, the rule is to reuse spares only when MFC-ing, not in CURRENT.
|
238219 |
07-Jul-2012 |
trasz |
Add trivial resize handling to gnop(8).
Reviewed by: mav Sponsored by: FreeBSD Foundation
|
238218 |
07-Jul-2012 |
trasz |
Add trivial resize handling to gmountver(8).
Reviewed by: mav Sponsored by: FreeBSD Foundation
|
238216 |
07-Jul-2012 |
trasz |
Add disk_resize(), to make it possible for the disk drivers such as da(4) to notify GEOM about LUN size change.
Reviewed by: mav (earlier version) Sponsored by: FreeBSD Foundation
|
238213 |
07-Jul-2012 |
trasz |
Add a new GEOM method, resize(), which is called after provider size changes. Add a new routine, g_resize_provider(), to use to notify GEOM about provider change.
Reviewed by: mav Sponsored by: FreeBSD Foundation
|
238198 |
07-Jul-2012 |
trasz |
Fix orphan() methods of several GEOM classes to not assume that there is an error set on the provider. With GEOM resizing, class can become orphaned when it doesn't implement resize() method and the provider size decreases.
Reviewed by: mav Sponsored by: FreeBSD Foundation
|
238171 |
06-Jul-2012 |
trasz |
Fix typo in the comment.
|
238119 |
04-Jul-2012 |
pjd |
Extend GEOM Gate class to handle read I/O requests directly within the kernel. This will allow HAST to read directly from the local component without even communicating with the userland daemon.
Sponsored by: Panzura, http://www.panzura.com MFC after: 1 month
|
238116 |
04-Jul-2012 |
pjd |
Use correct part of the Master-Key for generating encryption keys. Before this change the IV-Key was used to generate encryption keys, which was incorrect, but safe - for the XTS mode this key was unused anyway and for CBC mode it was used differently to generate IV vectors, so there is no risk that IV vector collides with encryption key somehow.
Bump version number and keep compatibility for older versions.
MFC after: 2 weeks
|
238115 |
04-Jul-2012 |
pjd |
Correct comment.
MFC after: 3 days
|
238114 |
04-Jul-2012 |
pjd |
Correct a comment and correct style of a flag check.
MFC after: 3 days
|
237930 |
01-Jul-2012 |
glebius |
Make geom_mirror more friendly to SSDs. To properly support TRIM, we need to pass BIO_DELETE requests down to providers that support it. Also, we need to announce our support for BIO_DELETE to upper consumer. This requires:
- In g_mirror_start() return true for "GEOM::candelete" request.
- In g_mirror_init_disk() probe below provider for "GEOM::candelete" attribute, and mark disk with a flag if it does support BIO_DELETE.
- In g_mirror_register_request() distribute BIO_DELETE requests only to those disks, that do support it.
Note that we announce "GEOM::candelete" as true unconditionally of whether we have TRIM-capable media down below or not. This is made intentionally, because upper consumer (usually UFS) requests the attribute only once at mount time. And if user ever migrates his mirror from HDDs to SSDs, then he/she would get TRIM working without remounting filesystem.
Reviewed by: pjd
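The two halves of this design (always announce delete support upward, but forward deletes only to capable disks) can be sketched in Python as follows. The function names and the tuple layout are hypothetical; the boolean stands in for the per-disk flag set when the disk is probed.

```python
def candelete_attribute():
    """Answer to a "GEOM::candelete" query: always true, by design."""
    return True


def delete_targets(disks):
    """Select the disks that should receive a BIO_DELETE request.

    disks: list of (name, supports_delete) tuples.
    """
    return [name for name, supports_delete in disks if supports_delete]
```

With a mixed mirror such as one HDD and one SSD, only the SSD receives the delete, and a mirror made entirely of HDDs simply forwards it to no disk.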
|
237929 |
01-Jul-2012 |
glebius |
In g_mirror_regular_request() upon successful delivery treat BIO_DELETE requests same way as BIO_WRITE removing them from queue. This fixes panic with BIO_DELETE operations on geom_mirror.
Reviewed by: pjd
|
237875 |
01-Jul-2012 |
imp |
Use %j to match intmax_t.
|
237820 |
29-Jun-2012 |
brooks |
MFP4 #212266
Fix compile on MIPS64.
Sponsored by: DARPA, AFRL
|
237648 |
27-Jun-2012 |
ken |
In g_disk_providergone(), don't continue if the softc is NULL. This may be the case if we've already gone through g_disk_destroy().
Reported by: Michael Butler <imb@protected-networks.net> MFC after: 3 days
|
237545 |
25-Jun-2012 |
ken |
Consume spare fields for the providergone pointers added to the g_class and g_geom structures in change 237518. The original change would have broken the ABI.
Suggested by: ae MFC after: 4 days
|
237518 |
24-Jun-2012 |
ken |
Fix a bug which causes a panic in daopen(). The panic is caused by a da(4) instance going away while GEOM is still probing it.
In this case, the GEOM disk class instance has been created by disk_create(), and the taste of the disk is queued in the GEOM event queue.
While that event is queued, the da(4) instance goes away. When the open call comes into the da(4) driver, it dereferences the freed (but non-NULL) peripheral pointer provided by GEOM, which results in a panic.
The solution is to add a callback to the GEOM disk code that is called when all of its resources are cleaned up. This is implemented inside GEOM by adding an optional callback that is called when all consumers have detached from a provider, and the provider is about to be deleted.
scsi_cd.c, scsi_da.c: In the register routine for the cd(4) and da(4) routines, acquire a reference to the CAM peripheral instance just before we call disk_create().
Use the new GEOM disk d_gone() callback to register a callback (dadiskgonecb()/cddiskgonecb()) that decrements the peripheral reference count once GEOM has finished cleaning up its resources.
In the cd(4) driver, clean up open and close behavior slightly. GEOM makes sure we only get one open() and one close call, so there is no need to set an open flag and decrement the reference count if we are not the first open.
In the cd(4) driver, use cam_periph_release_locked() in a couple of error scenarios to avoid extra mutex calls.
geom.h: Add a new, optional, providergone callback that is called when a provider is about to be deleted.
geom_disk.h: Add a new d_gone() callback to the GEOM disk interface.
Bump the DISK_VERSION to version 2. This probably should have been done after a couple of previous changes, especially the addition of the d_getattr() callback.
geom_disk.c: Add a providergone callback for the disk class, g_disk_providergone(), that calls the user's d_gone() callback if it exists.
Bump the DISK_VERSION to 2.
geom_subr.c: In g_destroy_provider(), call the providergone callback if it has been provided.
In g_new_geomf(), propagate the class's providergone callback to the new geom instance.
blkfront.c: Callers of disk_create() are supposed to pass in DISK_VERSION, not an explicit disk API version number. Update the blkfront driver to do that.
disk.9: Update the disk(9) man page to include information on the new d_gone() callback, as well as the previously added d_getattr() callback, d_descr field, and HBA PCI ID fields.
MFC after: 5 days
|
237057 |
14-Jun-2012 |
ae |
Always reconstruct partition entries in the PMBR when Boot Camp is disabled. This helps to easily recover from situations when PMBR is damaged and contains no entries.
MFC after: 1 week
|
236619 |
05-Jun-2012 |
mav |
Add missing newlines into XML output.
MFC after: 3 days Sponsored by: iXsystems, Inc.
|
236023 |
25-May-2012 |
marcel |
Add a partition type for nandfs to the apm, bsd, gpt and vtoc8 schemes. The gpart alias for these partition types is "freebsd-nandfs".
|
235989 |
25-May-2012 |
trasz |
Revert r235918 for now and add comment explaining the reason for the size check.
|
235918 |
24-May-2012 |
trasz |
Make g_label(4) ignore provider size when looking for UFS labels. Without it, it fails to create labels for filesystems resized by growfs(8).
PR: kern/165962 Submitted by: Olivier Cochard-Labbe <olivier at cochard dot me>
|
235858 |
23-May-2012 |
delphij |
- Correct signedness for casts;
- Wrap long line while I'm there.
Noticed by: pjd, avg
|
235852 |
23-May-2012 |
delphij |
Use %ju to match uintmax_t usage
|
235849 |
23-May-2012 |
delphij |
Use %j and cast off_t to intmax_t for now to fix build.
Noticed by: bz
|
235778 |
22-May-2012 |
gber |
Add a new geom class which allows to divide NAND Flash chip into partitions.
Partitions are created based on data in dts file which are extracted and interpreted by slicer.
Obtained from: Semihalf Supported by: FreeBSD Foundation, Juniper Networks
|
235600 |
18-May-2012 |
ae |
Prevent removing of the last active component from a mirror.
PR: kern/154860 Reviewed by: pjd MFC after: 1 week
|
235599 |
18-May-2012 |
ae |
Introduce new device flag G_MIRROR_DEVICE_FLAG_TASTING. It should protect geom from being destroyed while it is tasting.
PR: kern/154860 Reviewed by: pjd MFC after: 1 week
|
235419 |
13-May-2012 |
eadler |
Add missing period at the end of the error message
Submitted by: pjd Approved by: cperciva (implicit) MFC after: 3 days X-MFC-With: r235201
|
235270 |
11-May-2012 |
mav |
- Prevent error status leak if write to some of the RAID1/1E volume disks failed while write to some other succeeded. Instead mark disk as failed.
- Make RAID1E less aggressive in failing disks to avoid volume breakage.
MFC after: 2 weeks
|
235201 |
09-May-2012 |
eadler |
Clarify error that geli generates when it finds corrupt data.
PR: kern/165695 Submitted by: Robert Simmons <rsimmons0@gmail.com> Reviewed by: pjd Approved by: cperciva MFC after: 1 week
|
235096 |
06-May-2012 |
mav |
Remove some hardcoded constants from code.
|
235080 |
06-May-2012 |
mav |
Plug small memory leaks.
|
235076 |
06-May-2012 |
mav |
Add support for RAID5R. Slightly improve support for RAIDMDF.
|
235069 |
06-May-2012 |
mav |
Fix `gmultipath configure` for big-endian machines.
MFC after: 1 week
|
234994 |
04-May-2012 |
mav |
Fix bug causing memory corruption and panics with big-endian metadata.
|
234993 |
04-May-2012 |
mav |
Implement read-only support for volumes in optimal state (without using redundancy) for the following RAID levels: RAID4/5E/5EE/6/MDF.
|
234940 |
03-May-2012 |
mav |
Add optional -o argument to the `graid label` to specify some metadata format options. Use it for specifying byte order for the DDF metadata: big-endian defined by specification and little-endian used by Adaptec.
|
234899 |
01-May-2012 |
mav |
Improve spare disks support. Unluckily, for some reason Adaptec 1430SA RAID BIOS doesn't want to understand spare disks created by graid. But at least spares created by BIOS are working fine now.
|
234869 |
01-May-2012 |
mav |
Implement volume deletion if disk has more than one partition.
|
234868 |
01-May-2012 |
mav |
Improve DDF metadata writing.
|
234848 |
30-Apr-2012 |
mav |
Add a module to the GEOM RAID class supporting the DDF metadata format, as defined by the SNIA Common RAID Disk Data Format Specification v2.0.
Supports multiple volumes per array and multiple partitions per disk. Supports standard big-endian and Adaptec's little-endian byte ordering. Supports all single-layer RAID levels. Dual-layer RAID levels except RAID10 are not supported now because of GEOM RAID design limitations.
Some work is still to be done, but the present code already manages basic interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller.
MFC after: 1 month Sponsored by: iXsystems, Inc.
|
234816 |
29-Apr-2012 |
mav |
s/gmirror/graid/
|
234727 |
27-Apr-2012 |
mav |
Fix RAID5 level names changed at r234603.
|
234610 |
23-Apr-2012 |
mav |
Fix copy-paste typo in r234603.
Submitted by: kan
|
234603 |
23-Apr-2012 |
mav |
Add names for all primary RAID levels defined by DDF 2.0 specification.
|
234601 |
23-Apr-2012 |
mav |
Add sos@ copyrights to RAID metadata modules, respecting his efforts in decoding metadata formats in ataraid(4) code.
|
234458 |
19-Apr-2012 |
mav |
Add a module to the GEOM RAID class for reading non-degraded RAID5 volumes and some environment to differentiate the 4 possible RAID5 on-disk layouts.
Tested with Intel and AMD RAID BIOSes.
MFC after: 2 weeks
|
234417 |
18-Apr-2012 |
marck |
VMware environments are not unusual now. Add VMware partitions recognition (both MBR for ESXi <= 4.1 and GPT for ESXi 5) to g_part.
Reviewed by: ae Approved by: ae MFC after: 2 weeks
|
234415 |
18-Apr-2012 |
mav |
Some improvements to GEOM MULTIPATH:
- Implement "configure" command to allow switching the operation mode of a running device on the fly, without destroying and recreating it.
- Implement Active/Read mode as a hybrid of Active/Active and Active/Passive. In this mode all paths not marked FAIL may handle reads at the same time, but unlike Active/Active only one path handles write requests at any point in time. This allows following the original write request order more closely if the layers above need it for data consistency (not waiting for requisite write completion before sending a dependent write).
- Hide duplicate messages about device status change.
- Remove periodic thread wake up with 10Hz rate.
MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
234026 |
08-Apr-2012 |
mckusick |
Expand locking around identification of filesystem mount point when accounting for I/O counts at completion of I/O operation. Also switch from using global devmtx to vnode mutex to reduce contention.
Suggested and reviewed by: kib
|
233652 |
29-Mar-2012 |
ae |
VMDB offset should be greater than logical volume size only for MBR.
|
233651 |
29-Mar-2012 |
ae |
Do proper cleanup for the GPT case when an error occurs.
|
233627 |
28-Mar-2012 |
mckusick |
Keep track of the mount point associated with a special device to enable the collection of counts of synchronous and asynchronous reads and writes for its associated filesystem. The counts are displayed using `mount -v'.
Ensure that buffers used for paging indicate the vnode from which they are operating so that counts of paging I/O operations from the filesystem are collected.
This checkin only adds the setting of the mount point for the UFS/FFS filesystem, but it would be trivial to add the setting and clearing of the mount point at filesystem mount/unmount time for other filesystems too.
Reviewed by: kib
|
233342 |
23-Mar-2012 |
ae |
Check that scheme is not already registered. This may happen when a KLD is preloaded with loader(8) and leads to infinite loops.
Also do not return EEXIST error code from MOD_LOAD handler, because we have an undocumented(?) ability to replace the kernel's module with a preloaded one. If that is the case, the preloaded module will be initialized first, so the error in the MOD_LOAD handler will be triggered for the kernel.
PR: kern/165573 MFC after: 3 weeks
|
233181 |
19-Mar-2012 |
ae |
Add CTLFLAG_TUN to sysctls.
MFC after: 1 month
|
233176 |
19-Mar-2012 |
ae |
Add new GEOM_PART_LDM module that implements the Logical Disk Manager scheme. The LDM is a logical volume manager for MS Windows NT and it is also known as dynamic volumes. It supports about 2000 partitions and also provides the capability for software RAID implementations.
This version implements only the partitioning scheme capability and is based on the linux-ntfs project documentation and several publications across the Web. NOTE: JBOD, RAID0 and RAID5 volumes aren't supported.
An access to the LDM metadata is read-only. When LDM is on the disk partitioned with MBR we can also destroy metadata. For the GPT partitioned disks destroy action is not supported.
Reviewed by: ivoras (previous version) MFC after: 1 month
|
233175 |
19-Mar-2012 |
ae |
Make kern.geom.part node not static. Also add CTLFLAG_TUN to the check_integrity sysctl.
MFC after: 1 month
|
233000 |
15-Mar-2012 |
ae |
Add MODULE_DEPEND() to geom_part modules.
MFC after: 2 weeks
|
232680 |
08-Mar-2012 |
emaste |
Remove unactionable message about label geometry
It's not clear to a user what they should do after seeing the "geometry does not match label" kernel message, and it does not appear to present a problem in practice. Thus, just remove the messages.
Approved by: marcel
|
231929 |
20-Feb-2012 |
ae |
If nested scheme allows dump kernel to its partition, we may allow dump for the parent partition too.
MFC after: 2 weeks
|
231928 |
20-Feb-2012 |
ae |
Add alias for the partition type 0x0f. Now "ebr" name is used for both types 0x05 and 0x0f, but 0x05 is preferred and used when partition is created with "gpart add -t ebr ...". This should keep EBR partitions accessible after r231754 for those, who have EBR on the partition with type 0x0f.
|
231754 |
15-Feb-2012 |
ae |
Add additional check to EBR probe and create methods: don't try to probe and create the EBR scheme when the parent partition type is not "ebr". This fixes error messages about corrupted EBR for some partitions that actually contain another partition scheme.
NOTE: if you have EBR on a partition with a type other than "ebr" (0x05), then you will lose access to its partitions until the type is changed.
MFC after: 2 weeks
|
231751 |
15-Feb-2012 |
ae |
Add PART::type attribute handler. It returns partition type as string.
MFC after: 2 weeks
|
231367 |
10-Feb-2012 |
ae |
Add alias for the partition with type 0x42 to the MBR scheme.
MFC after: 1 week
|
231349 |
10-Feb-2012 |
ae |
Let's be more realistic and limit maximum number of partition to 4k.
MFC after: 1 week
|
231075 |
06-Feb-2012 |
kib |
Current implementations of sync(2) and syncer vnode fsync() VOP use the mnt_noasync counter to temporarily remove the MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of the MNTK_ASYNC option is harmful because not only i/o started from the corresponding thread becomes synchronous, but all i/o initiated on the filesystem during sync(2) or syncer activity becomes synchronous.
Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC.
Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons.
Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks
|
230990 |
04-Feb-2012 |
emaste |
Correct typo in comment (numbver)
|
230861 |
01-Feb-2012 |
ae |
The scheme code may not know about some inconsistency in the metadata. So, add an integrity check after recovery attempt.
MFC after: 1 week
|
230643 |
28-Jan-2012 |
attilio |
Avoid checking the same cache line/variable from all the locking primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly.
STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage.
In collaboration with: flo Reviewed by: avg MFC after: 3 months (or never) X-MFC: r228424
|
230522 |
25-Jan-2012 |
nwhitehorn |
Experimental support for booting CHRP-type PowerPC systems from hard disks.
|
230064 |
13-Jan-2012 |
truckman |
Allow an MBR primary or extended Linux swap partition to be specified as the system dump device. This was already allowed for GPT. The Linux swap metadata at the beginning of the partition should not be disturbed because the crash dump is written at the end.
Reviewed by: alfred, pjd, marcel MFC after: 2 weeks
|
229886 |
09-Jan-2012 |
jimharris |
Add support for >2TB disks in GEOM RAID for Intel metadata format.
Reviewed by: mav Approved by: scottl MFC after: 1 week
|
229537 |
04-Jan-2012 |
ray |
GEOM_UNCOMPRESS module, can be used with uzip images and with new ulzma images.
Approved by: adrian (mentor)
|
228634 |
17-Dec-2011 |
avg |
replace uses of libkern gets with cngets
MFC after: 2 months
|
228204 |
02-Dec-2011 |
mav |
Close race between geom destruction on g_vfs_close() when softc destroyed and g_vfs_orphan() call that tries to access softc, introduced at r227015.
PR: kern/162997
|
228076 |
28-Nov-2011 |
ae |
Add an ability to increase number of allocated APM entries when we have reserved free space in the APM area. Also instead of one write request per each APM entry, use MAXPHY sized writes when we are updating APM.
MFC after: 1 month
|
228061 |
28-Nov-2011 |
ae |
The size of APM could be bigger than number of already allocated entries. And the first usable sector should not start from the inside of APM area.
MFC after: 1 month
|
227510 |
14-Nov-2011 |
mav |
Temporarily revert r227009 to fix freeze on UP systems without PREEMPTION.
Before r215687, if some withered geom or provider could not be destroyed, g_event thread went to sleep for 0.1s before retrying. After that change it is just restarting immediately. r227009 made orphaned (withered) provider to not detach immediately, but only after context switch. That made loop inside g_event thread infinite on UP systems without PREEMPTION.
To address original problem with possible dead lock addressed by r227009 we have to fix r215687 change first, that needs some time to think and test.
|
227464 |
12-Nov-2011 |
mav |
Major GEOM MULTIPATH class rewrite:
- Improved locking and destruction process to fix crashes.
- Improved "automatic" configuration method to make it consistent and safe by reading metadata back from all specified paths after writing to one.
- Added provider size check to reduce chance of ordering conflict with other GEOM classes.
- Added "manual" configuration method without using on-disk metadata.
- Added "add" and "remove" commands to allow managing paths manually.
- Failed paths are no longer dropped from geom, but only marked as FAIL and excluded from I/O operations.
- Automatically restore failed paths when all other paths are marked as failed, for example, because of device-caused (not transport) errors.
- Added "fail" and "restore" commands to manually control FAIL flag.
- geom is now destroyed on last path disconnection.
- Added optional Active/Active mode support. Unlike Active/Passive mode, load is evenly distributed between all working paths. If supported by the device, it can significantly improve performance by utilizing the bandwidth of all paths. It is controlled by the -A option during creation. Disabled by default now.
- Improved `status` and `list` commands output.
Sponsored by: iXsystems, inc. MFC after: 1 month
|
227309 |
07-Nov-2011 |
ed |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
227293 |
07-Nov-2011 |
ed |
Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
|
227015 |
02-Nov-2011 |
mav |
Add mutex and two flags to make orphan() call properly asynchronous:
- delay consumer closing and detaching on orphan() until all I/Os complete;
- prevent new I/Os submission after orphan() called.
Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.
|
227009 |
01-Nov-2011 |
mav |
Make orphan() method in geom_dev asynchronous using destroy_dev_sched_cb() instead of destroy_dev(). It moves device destruction waiting out of the topology lock and so fixes dead lock between orphanization and closing. Real provider and geom destruction called from swi context after device destroyed as callback of the destroy_dev_sched_cb().
|
227004 |
01-Nov-2011 |
mav |
Refactor disk disconnection and geom destruction handling sequences. Do not close/destroy opened consumer directly in case of disconnect. Instead keep it existing until it will be closed in regular way in response to upstream provider destruction. Delay geom destruction in the same way. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.
|
226998 |
01-Nov-2011 |
mav |
Refactor disk disconnection and geom destruction handling sequences. Do not close/destroy opened consumer directly in case of disconnect. Instead keep it existing until it will be closed in regular way in response to upstream provider destruction. Delay geom destruction in the same way. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.
|
226985 |
01-Nov-2011 |
mav |
Workaround the problem introduced by combination of r162200 and r215687. r162200 delays provider orphanization until all running requests complete, to workaround broken orphan() method implementation in some classes. r215687 removes persistent periodic (10Hz) event thread wake ups. Together these changes can indefinitely delay orphanization until some other event wake up the event thread. One consequence of this is inability of CAM to destroy device disconnected when busy and, as consequence, create new one after reconnection.
While the best solution would be to revert r162200, it is not easy, as some classes still look broken in that way. Instead conditionally wake up event thread if there are some providers waiting for orphanization.
MFC after: 1 week
|
226880 |
28-Oct-2011 |
ae |
Our geom withering function could take some time before a geom with its providers and consumers is destroyed. Before taking any action on a geom, check that it is not being destroyed at the moment.
Tested by: nwhitehorn MFC after: 1 week
|
226840 |
27-Oct-2011 |
pjd |
Before this change, when GELI detected hardware crypto acceleration it would start only one worker thread. For software crypto it would start by default N worker threads, where N is the number of available CPUs.
This is not optimal if hardware crypto is AES-NI, which uses CPU for AES calculations.
Change that to always start one worker thread for every available CPU. Number of worker threads per GELI provider can be easily reduced with kern.geom.eli.threads sysctl/tunable and even for software crypto it should be reduced when using more providers.
While here, when number of threads exceeds number of CPUs available don't reduce this number, assume the user knows what he is doing.
Reported by: Yuri Karaban <dev@dev97.com> MFC after: 3 days
|
226816 |
26-Oct-2011 |
mav |
Clarify disks/volumes above 2TiB support in geom_raid:
- add support for volumes above 2TiB with Promise metadata format;
- enforce and document other limitations:
  - Intel and Promise metadata formats do not support disks above 2TiB;
  - NVIDIA metadata format does not support volumes above 2TiB.
Sponsored by: iXsystems, Inc. MFC after: 2 weeks
|
226737 |
25-Oct-2011 |
pjd |
Allow upper layers to discover that BIO_DELETE and/or BIO_FLUSH is not supported by returning EOPNOTSUPP instead of 0 or ENODEV.
MFC after: 3 days
|
226736 |
25-Oct-2011 |
pjd |
Improve style a bit.
MFC after: 3 days
|
226735 |
25-Oct-2011 |
pjd |
Simplify disk_alloc().
MFC after: 3 days
|
226733 |
25-Oct-2011 |
pjd |
Add support for creating GELI devices with older metadata version for use with older FreeBSD versions:
- Add -V option to 'geli init' to specify version number. If no -V is given the most recent version is used.
- If -V is given don't allow the use of features not supported by this version.
- Print version in 'geli list' output.
- Update manual page and add table describing which GELI version is supported by which FreeBSD version, so one can use it when preparing GELI device for older FreeBSD version.
Inspired by: Garrett Cooper <yanegomi@gmail.com> MFC after: 3 days
|
226730 |
25-Oct-2011 |
pjd |
When decoding metadata, check magic string, so we know this is not GELI device before we check its version. We don't want to report that some garbage is unsupported version if this is not even GELI provider.
MFC after: 3 days
|
226728 |
25-Oct-2011 |
pjd |
Prefer G_ELI_VERSION_* defines for version numbers over plain digits.
MFC after: 3 days
|
226727 |
25-Oct-2011 |
pjd |
Fit lines into 80 chars.
MFC after: 3 days
|
226721 |
25-Oct-2011 |
pjd |
When metadata is at newer version than the highest supported, return EOPNOTSUPP when decoding.
MFC after: 3 days
|
226647 |
23-Oct-2011 |
marcel |
Add support for Boot Camp. The support is defined as follows:
o Detect when Boot Camp is enabled (i.e. the MBR mirrors the GPT).
o When Boot Camp is enabled, update the MBR whenever we write the GPT.
o Creation of a Boot Camp enabled GPT is not supported.
o Automatically disable Boot Camp when the GPT has been changed so that there's either no EFI partition or no HFS+ partition.
o The first 4 partitions (by index) get mirrored in the MBR.
Requested by, discussed with and tested by: kris@pcbsd.org MFC after: 1 week
|
226522 |
18-Oct-2011 |
marius |
Allow to dump on Solaris swap partitions.
PR: 161764 Submitted by: Peter Jeremy
|
224147 |
17-Jul-2011 |
pjd |
Add some spare fields to the g_class and g_geom structures needed to implement direct I/O handling and provider's property changes handling.
|
223930 |
11-Jul-2011 |
ae |
Remove include of sys/sbuf.h from geom/geom.h. sbuf support is not always required for geom/geom.h users, and there is no need to depend on it.
PR: kern/158398
|
223921 |
11-Jul-2011 |
ae |
Include sys/sbuf.h directly.
Reviewed by: pjd
|
223900 |
10-Jul-2011 |
mckusick |
Allow disk partitions associated with UFS read-only mounted filesystems to be opened for writing. This functionality used to be special-cased for just the root filesystem, but with this change is now available for all UFS filesystems. This change is needed for journaled soft updates recovery.
Discussed with: Jeff Roberson
|
223660 |
29-Jun-2011 |
ae |
Initialize elements of state array when creating the GPT table. This fixes the problem, when the secondary GPT header is not erased when partition table destroyed. Move equal operations from g_part_gpt_create and g_part_gpt_recover to the separate function g_gpt_set_defaults.
Reported by: dwhite MFC after: 1 week
|
223594 |
27-Jun-2011 |
ae |
EBR could contain an early stage of boot code. But we do not support it. Remove message about non empty bootcode, we can not break something while GEOM_PART_EBR_COMPAT is defined.
But without GEOM_PART_EBR_COMPAT any changes in EBR are allowed and we can accidentally wipe the boot code. To avoid breaking anything, save the first EBR chunk and keep it untouched each time we change the EBR. Note that we still do not support boot code for EBR.
PR: kern/141235 MFC after: 1 month
|
223587 |
27-Jun-2011 |
ae |
MS Windows NT+ uses 4 bytes at offset 0x1b8 in the MBR to identify the disk drive. The boot0cfg(8) utility preserves these 4 bytes when writing bootcode, to keep multiboot ability. Change gpart's bootcode method to keep the DSN if it is not zero. Also do not allow writing bootcode with size not equal to MBRSIZE.
PR: kern/157819 Tested by: Eir Nym MFC after: 1 month
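The DSN handling described in this entry can be sketched as below. This is a userland illustration under stated assumptions, not the committed g_part code: the helper name `mbr_install_bootcode` and the constants DSN_OFFSET, DSN_SIZE and BOOTCODE_SIZE are hypothetical names, and the 446-byte bootcode area (everything before the partition table) is an assumption about the MBR layout rather than something stated in the log.

```c
#include <stdint.h>
#include <string.h>

#define MBRSIZE		512	/* size of the MBR, as named in the entry */
#define DSN_OFFSET	0x1b8	/* disk signature offset, per the entry */
#define DSN_SIZE	4	/* the "4 bytes" the entry mentions */
#define BOOTCODE_SIZE	446	/* assumption: bytes before partition table */

/*
 * Hypothetical helper: copy new bootcode into an in-memory MBR while
 * keeping a non-zero disk signature (DSN). Returns 0 on success and
 * -1 if the bootcode image is not exactly MBRSIZE bytes long.
 */
static int
mbr_install_bootcode(uint8_t *mbr, const uint8_t *code, size_t codesize)
{
	static const uint8_t zero[DSN_SIZE];
	uint8_t dsn[DSN_SIZE];

	if (codesize != MBRSIZE)
		return (-1);
	/* Remember the current signature before it is overwritten. */
	memcpy(dsn, mbr + DSN_OFFSET, DSN_SIZE);
	/* Replace only the bootcode area; the partition table stays. */
	memcpy(mbr, code, BOOTCODE_SIZE);
	/* Put the old signature back unless it was all zeroes. */
	if (memcmp(dsn, zero, DSN_SIZE) != 0)
		memcpy(mbr + DSN_OFFSET, dsn, DSN_SIZE);
	return (0);
}
```

When the existing signature is zero, the bytes supplied by the new bootcode image are left in place.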
|
223332 |
20-Jun-2011 |
ae |
Change the way how we update bootcode for BSD scheme. Since the only parameter that we check is size of bootcode, then allow only two sizes: size of boot1 and size of /boot/boot. This partially protects users from losing ability to boot if incorrect bootcode is specified.
Requested by: ru
|
223089 |
14-Jun-2011 |
gibbs |
Plumb device physical path reporting from CAM devices, through GEOM and DEVFS, and make it accessible via the diskinfo utility.
Extend GEOM's generic attribute query mechanism into generic disk consumers.
sys/geom/geom_disk.c:
sys/geom/geom_disk.h:
sys/cam/scsi/scsi_da.c:
sys/cam/ata/ata_da.c:
- Allow disk providers to implement a new method which can override the default BIO_GETATTR response, d_getattr(struct bio *). This function returns -1 if not handled, otherwise it returns 0 or an errno to be passed to g_io_deliver().
sys/cam/scsi/scsi_da.c:
sys/cam/ata/ata_da.c:
- Don't copy the serial number to dp->d_ident anymore, as the CAM XPT is now responsible for returning this information via d_getattr()->(a)dagetattr()->xpt_getatr().
sys/geom/geom_dev.c:
- Implement a new ioctl, DIOCGPHYSPATH, which returns the GEOM attribute "GEOM::physpath", if possible. If the attribute request returns a zero-length string, ENOENT is returned.
usr.sbin/diskinfo/diskinfo.c:
- If the DIOCGPHYSPATH ioctl is successful, report physical path data when diskinfo is executed with the '-v' option.
Submitted by: will Reviewed by: gibbs Sponsored by: Spectra Logic Corporation
Add generic attribute change notification support to GEOM.
sys/sys/geom/geom.h:
- Add a new attrchanged method field to both g_class and g_geom.
sys/sys/geom/geom.h:
sys/geom/geom_event.c:
- Provide the g_attr_changed() function that providers can use to advertise attribute changes.
- Perform delivery of attribute change notifications from a thread context via the standard GEOM event mechanism.
sys/geom/geom_subr.c:
- Inherit the attrchanged method from class to geom (class instance).
sys/geom/geom_disk.c:
- Provide disk_attr_changed() to provide g_attr_changed() access to consumers of the disk API.
sys/cam/scsi/scsi_pass.c:
sys/cam/scsi/scsi_da.c:
sys/geom/geom_dev.c:
sys/geom/geom_disk.c:
- Use attribute changed events to track updates to physical path information.
sys/cam/scsi/scsi_da.c:
- Add AC_ADVINFO_CHANGED to the registered asynchronous CAM events for this driver. When this event occurs, and the updated buffer type references our physical path attribute, emit a GEOM attribute changed event via the disk_attr_changed() API.
sys/cam/scsi/scsi_pass.c:
- Add AC_ADVINFO_CHANGED to the registered asynchronous CAM events for this driver. When this event occurs, update the physical path devfs alias for this pass instance.
Submitted by: gibbs Sponsored by: Spectra Logic Corporation
|
222813 |
07-Jun-2011 |
attilio |
Retire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are capped to 32. MAXCPU, then, can now be arbitrarily bumped to whatever value. Anyway, as long as several structures in the kernel are statically allocated and sized as MAXCPU, it is suggested to keep it as low as possible for the time being.
Technical notes on this commit itself:
- More functions to handle cpuset_t objects are introduced. The most notable are cpusetobj_ffs() (which calculates a ffs(3) for a cpuset_t object), cpusetobj_strprint() (which prepares a string representing a cpuset_t object) and cpusetobj_strscan() (which creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are targeted to be removed soon. With the move from cpumask_t to cpuset_t they are now inefficient and not really useful. Anyway, for the time being, please note that access to pcpu data is protected by sched_pin() in order to avoid migrating the CPU while reading more than one (possible) word.
- Please note that the size of cpuset_t objects may differ between kernel and userland. While this is not directly related to the patch itself, it is good to understand that concept and possibly use the patch as a reference on how to deal with cpuset_t objects in userland, when accessing kernel members.
- KTR_CPUMASK is changed and now is represented through a string, to be set as the example reported in NOTES.
Please additionally note that MAXCPU is not bumped in this patch, but private testing has been done up to MAXCPU=128 on a real 8x8x2(htt) machine (amd64).
Please note that the FreeBSD version is not yet bumped because of the upcoming pcpu changes. However, note that this patch is not targeted for MFC.
People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested several revisions of the patches and really helped in improving stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted review on some specific part of the patch.
- marius, art, nwhitehorn and andreast reviewed MD specific part of the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific implementations of the patch.
- Other people have made contributions on other patches that have been already committed and have been listed separately.
Companies that should be mentioned for having participated at several degrees:
- Yahoo! for having offered the machines used for testing on big count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance, which has been instrumental.
- Sandvine for having offered offices and infrastructure during development.
(I really hope I didn't forget anyone, if it happened I apologize in advance).
|
222652 |
03-Jun-2011 |
mav |
Update disk's stripesize and stripeoffset parameters on provider open. They are media-dependent and may change in run-time, same as sectorsize and/or mediasize.
SCSI devices return physical sector size and offset via READ CAPACITY(16) command and so can not report it until media inserted or at least until probe sequence completed. UNMAP support is also reported there.
|
222642 |
03-Jun-2011 |
ae |
Add diagnostic message about not aligned partitions.
Idea from: ivoras
|
222603 |
02-Jun-2011 |
ae |
Do not hide stripeoffset from libgeom(3), it may be useful even when stripesize is zero.
MFC after: 1 week
|
222341 |
27-May-2011 |
ae |
Some partitioning tools may have a different opinion about disk geometry and partitions may start from within the first track. If we find such partitions, then do not reserve the space of the first track, only the first sector.
|
222283 |
25-May-2011 |
ae |
Prevent non-aligned reading from provider while tasting. Reject providers with unsupported sectorsize.
Reported by: Joerg Wunsch MFC after: 1 week
|
222281 |
25-May-2011 |
ae |
Do not truncate available disk space to the closest track boundary.
|
222280 |
25-May-2011 |
ae |
Do not truncate available disk space to the closest track boundary.
|
222279 |
25-May-2011 |
ae |
Do not truncate available disk space to the closest track boundary.
|
222244 |
24-May-2011 |
ae |
Remove unused variable.
MFC after: 1 week
|
222243 |
24-May-2011 |
ae |
Remove unused variable.
MFC after: 1 week
|
222225 |
23-May-2011 |
pjd |
Recognize BIO_FLUSH requests and pass them to userland.
MFC after: 1 week
|
221992 |
16-May-2011 |
ae |
Make diagnostic messages more specific. With bootverbose print out all inconsistencies of integrity in the partition table, not first found only.
Requested by: kib
|
221984 |
16-May-2011 |
ae |
Add diagnostic messages for integrity checks.
|
221972 |
15-May-2011 |
ae |
Add a sysctl kern.geom.part.check_integrity for those who have corrupt partition tables and lost the ability to boot after r221788. Also unhide an error message from bootverbose; this should make it easier to determine the problem.
|
221953 |
15-May-2011 |
trociny |
Fix a memory leak possible in g_eli_key_allocate() if the key with the same keyno is added while we aren't holding the lock.
Approved by: pjd (mentor) MFC after: 1 week
|
221792 |
11-May-2011 |
thompsa |
Move the three geom kprocs as threads under a single pid.
Reviewed by: julian
|
221788 |
11-May-2011 |
ae |
Add basic metadata integrity check. If the partition table was probed and read successfully, but it contains invalid values (e.g. overlapped partitions, offset or size is out of bounds), then the table will be rejected.
MFC after: 1 month
|
221658 |
08-May-2011 |
ae |
Limit number of sectors that can be addressed.
MFC after: 1 week
|
221656 |
08-May-2011 |
ae |
Limit number of sectors that can be addressed.
MFC after: 1 week
|
221654 |
08-May-2011 |
ae |
Limit number of sectors that can be addressed. Reject table if blkcount from metadata is greater than provider.
|
221652 |
08-May-2011 |
ae |
Limit number of sectors that can be addressed.
MFC after: 1 week
|
221647 |
08-May-2011 |
ae |
Replace UINT_MAX with UINT32_MAX.
Pointed out by: kib MFC after: 1 week
|
221645 |
08-May-2011 |
ae |
Limit number of sectors that can be addressed.
MFC after: 1 week
|
221644 |
08-May-2011 |
ae |
Limit number of sectors that can be addressed.
MFC after: 1 week
|
221631 |
08-May-2011 |
pjd |
Export GELI class version via sysctl kern.geom.eli.version.
MFC after: 1 week
|
221630 |
08-May-2011 |
pjd |
Version 6 is compatible with version 5 when it comes to control commands.
MFC after: 1 week
|
221629 |
08-May-2011 |
pjd |
Detect and handle metadata of version 6.
MFC after: 1 week
|
221628 |
08-May-2011 |
pjd |
When support for multiple encryption keys was committed, GELI integrity mode was not updated to pass CRD_F_KEY_EXPLICIT flag to opencrypto. This resulted in always using first key.
We need to support providers created with this bug, so set special G_ELI_FLAG_FIRST_KEY flag for GELI provider in integrity mode with version smaller than 6 and pass the CRD_F_KEY_EXPLICIT flag to opencrypto only if G_ELI_FLAG_FIRST_KEY doesn't exist.
Reported by: Anton Yuzhaninov <citrin@citrin.ru> MFC after: 1 week
|
221626 |
08-May-2011 |
pjd |
Remove prototype for a function that no longer exists.
MFC after: 1 week
|
221625 |
08-May-2011 |
pjd |
Drop proper key.
MFC after: 1 week
|
221624 |
08-May-2011 |
pjd |
Add magic field to the g_eli_key structure to detect if we are really operating on proper structures.
MFC after: 1 week
|
221500 |
05-May-2011 |
adrian |
Updates to geom_map from the author.
The major update here is to support 64 bit size/offsets. There's also style related changes.
Submitted by: ray@dlink.ua
|
221453 |
04-May-2011 |
ae |
Remove unneeded code.
MFC after: 1 week
|
221452 |
04-May-2011 |
ae |
Remove unneeded code.
MFC after: 1 week
|
221451 |
04-May-2011 |
ae |
Remove unneeded code.
MFC after: 1 week
|
221449 |
04-May-2011 |
ae |
Removed KASSERT, g_new_providerf() can not fail.
MFC after: 1 week
|
221447 |
04-May-2011 |
ae |
Remove "for a moment" assignment. struct g_geom zeroed when allocated.
MFC after: 1 week
|
221446 |
04-May-2011 |
ae |
Remove unneeded checks, g_new_xxx functions can not fail.
MFC after: 1 week
|
221433 |
04-May-2011 |
ae |
When checking existence of providers skip those which are orphaned.
PR: kern/132273 MFC after: 2 week
|
221400 |
03-May-2011 |
mav |
Use make_dev_alias_p(), added in r221397, to create the alias dev entry. This avoids a panic if the alias name is already in use for some reason.
|
221101 |
27-Apr-2011 |
mav |
Implement relaxed comparison for hardcoded provider names so that it ignores the adX/adaY difference in both directions, simplifying migration to the CAM-based ATA or back.
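A rough sketch of what such a relaxed comparison could look like. It assumes that the unit number is ignored entirely and that only the part after the adX or adaY prefix is compared; the function names and that rule are illustrative assumptions, not taken from the commit.

```python
import re

# Matches an "ad<N>" or "ada<N>" disk name and captures whatever follows it.
_ATA_DISK = re.compile(r"^(?:ada|ad)(\d+)(.*)$")


def _strip_dev(name):
    """Drop a leading /dev/ so both spellings of a name compare equal."""
    return name[len("/dev/"):] if name.startswith("/dev/") else name


def names_match(a, b):
    """Hypothetical relaxed comparison of two provider names."""
    a, b = _strip_dev(a), _strip_dev(b)
    if a == b:
        return True
    ma, mb = _ATA_DISK.match(a), _ATA_DISK.match(b)
    if ma and mb:
        # Assumption: unit numbers are ignored, only the suffix must agree.
        return ma.group(2) == mb.group(2)
    return False
```

With this rule, "/dev/ad4s1a" and "ada0s1a" are treated as the same provider, while "da0" and "ada0" are not.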
|
221071 |
26-Apr-2011 |
mav |
- Add shim to simplify migration to the CAM-based ATA. For each new adaX device in /dev/ create symbolic link with adY name, trying to mimic old ATA numbering. Imitation is not complete, but should be enough in most cases to mount file systems without touching /etc/fstab.
- To know what behavior to mimic, restore ATA_STATIC_ID option in cases where it was present before.
- Add some more details to UPDATING.
|
220984 |
24-Apr-2011 |
pjd |
One key is expected from providers smaller than or equal to (2^20)*sectorsize bytes. Remove bogus assertion and while here remove another too obvious assertion.
Reported by: Fabian Keil <freebsd-listen@fabiankeil.de> MFC after: 2 weeks
|
220923 |
21-Apr-2011 |
pjd |
If the number of keys for the given provider doesn't exceed the limit, allocate all of them at attach time. This avoids moving keys around in the most-recently-used queue and requires neither mutex synchronization nor refcounting.
MFC after: 2 weeks
|
220922 |
21-Apr-2011 |
pjd |
Instead of allocating memory for all the keys at device attach, create a reasonably large cache for the keys that is filled when needed. The previous version was problematic for very large providers (hundreds of terabytes or several petabytes). Every terabyte of data needs around 256kB for keys. Make the default cache limit big enough to fit all the keys needed for 4TB providers, which will eat at most 1MB of memory.
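The figures in this message can be checked with a little arithmetic. The sketch below assumes 512-byte sectors, one key per 2^20 sectors, and roughly 128 bytes of memory per cached key. The per-key size is inferred from the 256kB-per-terabyte figure and is not stated in the commit; the function names are illustrative only.

```python
# Back-of-the-envelope check of the key cache figures quoted above.
# Assumptions (not from the commit): 512-byte sectors, ~128 bytes per key.
SECTORS_PER_KEY = 1 << 20   # one data key per 2^20 sectors
BYTES_PER_KEY = 128         # assumed in-memory cost of one cached key
TIB = 1 << 40


def keys_needed(mediasize, sectorsize=512):
    """Number of data keys needed to cover a provider of mediasize bytes."""
    span = SECTORS_PER_KEY * sectorsize      # bytes covered by one key
    return (mediasize + span - 1) // span    # round up


def key_memory(mediasize, sectorsize=512):
    """Approximate bytes of memory needed to hold every key at once."""
    return keys_needed(mediasize, sectorsize) * BYTES_PER_KEY


print(key_memory(TIB) // 1024, "kB of keys per TiB")
print(key_memory(4 * TIB) // 1024, "kB of keys for 4 TiB")
```

Under these assumptions a 1 TiB provider needs 2048 keys, which matches the order of magnitude given in the message.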
MFC after: 2 weeks
|
220790 |
18-Apr-2011 |
mav |
Reduce geom_raid log verbosity.
|
220652 |
15-Apr-2011 |
gavin |
Remove an incorrect be16toh() that prevented geom_part_apm from working on little-endian machines.
Reviewed by: marcel MFC after: 2 weeks
|
220559 |
12-Apr-2011 |
adrian |
Introduce geom_map, a GEOM provider designed for use by embedded flash stores.
Some devices - notably those with uboot - don't have an explicit partition table (eg like Redboot's FIS.) geom_map thus provides an easy way to export the hard-coded flash layout as geom providers for use by filesystems and other tools.
It also includes a "search" function which allows for dynamic creation of partition layouts where the device only has a single hard-coded partition. For example, if there is a "kernel+rootfs" partition, a single image can be created which appends the rootfs after the kernel with an appropriate search string. geom_map can be told to search for said search string and create a partition beginning after it.
Submitted by: Aleksandr Rybalko <ray@dlink.ua>
|
220299 |
03-Apr-2011 |
trociny |
In g_eli_read_done() and g_eli_write_done(), for a bio with bio_children > 1, g_destroy_bio() is never called and the bio leaks. Fix this by calling g_destroy_bio() earlier, before the check.
Submitted by: Victor Balada Diaz <victor@bsdes.net> (initial version) Approved by: pjd (mentor) MFC after: 1 week
|
220264 |
02-Apr-2011 |
pjd |
GEOM has an internal mechanism to deal with ENOMEM errors returned via g_io_deliver(). In such a case it increases the 'pace' counter on each ENOMEM and reschedules the request. The 'pace' counter is decreased for each request going down, but as long as 'pace' is greater than zero, GEOM will handle at most 10 requests per second. For GEOM GATE users that are proxies to local GEOM providers (like ggatel(8) and HAST) we can end up with an almost permanent slowdown of the GEOM down queue. This is because once we reach the GEOM GATE queue limit, we return ENOMEM to GEOM. This means that we have, e.g., 1024 I/O requests in the GEOM GATE queue. To make room in the queue and stop returning ENOMEM we need to process the requests, of course, but those requests are handled by userland daemons that handle them by also reading/writing from/to local GEOM providers. For example with HAST, a new request comes to /dev/hast/data, which is a GEOM GATE provider. GEOM GATE passes the request to hastd(8) and hastd(8) reads/writes from/to /dev/da0. Once we reach the GEOM GATE queue limit, to free up a slot in the GEOM GATE queue, hastd(8) has to read/write from/to /dev/da0, but this request will also be very slow, because GEOM now slows down all the requests. We end up with a full queue that we can unload at the speed of 10 requests per second. This simply looks like a deadlock.
Fix it by allowing userland daemons that work with both GEOM GATE and local GEOM providers to specify unlimited queue size, so GEOM GATE will never return ENOMEM to the GEOM.
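To get a feel for why the throttled state looks like a deadlock, here is a small worked calculation using the two numbers from the message (a queue of 1024 requests, drained at 10 requests per second). The function name is made up for illustration, and the model ignores new requests arriving while the queue drains.

```python
def drain_seconds(queued_requests, requests_per_second=10):
    """Time to empty a queue when GEOM is throttled to a fixed rate."""
    return queued_requests / requests_per_second


# A full GEOM GATE queue of 1024 requests at 10 requests per second:
print(drain_seconds(1024), "seconds")
```

Even in this optimistic model the queue takes well over a minute and a half to empty, during which all other I/O through GEOM is throttled as well.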
MFC after: 1 week
|
220210 |
31-Mar-2011 |
mav |
Bunch of small bugfixes and cleanups.
Found with: Clang Static Analyzer
|
220209 |
31-Mar-2011 |
mav |
Bunch of small bugfixes and cleanups.
Found with: Coverity Prevent(tm) CID: 9656, 9658, 9693, 9705, 9706, 9707, 9808, 9809, 9810, 9711, 9712, 9713, 9714
|
220184 |
31-Mar-2011 |
ae |
Remove unneeded checks, g_new_xxx functions can not return NULL.
Reviewed by: pjd MFC after: 1 week
|
220173 |
30-Mar-2011 |
trociny |
Increase debug level on g_gate device destruction and add message on device creation.
Suggested by: danger Approved by: pjd (mentor) MFC after: 3 days
|
220062 |
27-Mar-2011 |
trociny |
In g_gate_create() there is a window between when g_gate_softc is registered in the g_gate_units array and when its sc_provider field is filled. If during this period g_gate_units is accessed by another thread that is checking for a provider name collision, a crash is possible.
Fix this by adding an sc_name field to struct g_gate_softc. In g_gate_create(), when g_gate_softc is created but sc_provider is still not set, sc_name points to the provider name stored in the local array.
Approved by: pjd (mentor) Reported by: Freddie Cash <fjwcash@gmail.com> MFC after: 1 week
|
219974 |
24-Mar-2011 |
mav |
MFgraid/head: Add new RAID GEOM class, that is going to replace ataraid(4) in supporting various BIOS-based software RAIDs. Unlike ataraid(4) this implementation does not depend on legacy ata(4) subsystem and can be used with any disk drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4) with `options ATA_CAM`). To make code more readable and extensible, this implementation follows modular design, including core part and two sets of modules, implementing support for different metadata formats and RAID levels.
Support for such popular metadata formats is now implemented: Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage.
Such RAID levels are now supported: RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT.
For all of these RAID levels and metadata formats this class supports the full cycle of volume operations: reading, writing, creation, deletion, disk removal and insertion, rebuilding, dirty shutdown detection and resynchronization, bad sector recovery, faulty disks tracking, hot-spare disks. For Intel and Promise formats there is support for multiple volumes per disk set.
See the graid(8) manual page for additional details.
Co-authored by: imp Sponsored by: Cisco Systems, Inc. and iXsystems, Inc.
|
219970 |
24-Mar-2011 |
mav |
MFgraid/head r218212, r218257: Introduce new type of BIO_GETATTR -- GEOM::setstate, used to inform lower GEOM about the state of its providers from the point of view of upper layers. Make geom_disk use the led(4) subsystem to indicate states as follows: FAILED - "1" (on), REBUILD - "f5" (slow blink), RESYNC - "f1" (fast blink), ACTIVE - "0" (off). LED name should be set for each disk via kern.geom.disk.%s.led sysctl. Later the disk API could be extended to allow a disk driver to report this info in a custom way via its own facilities.
|
219950 |
24-Mar-2011 |
mav |
MFgraid/head r217827: Change BIO_GETATTR("GEOM::kerneldump") API to make set_dumper() called by consumer (geom_dev) instead of provider (geom_disk). This allows any geom to insert its code into the dump call chain, implementing more sophisticated functionality than just disk partitioning.
|
219400 |
08-Mar-2011 |
sobomax |
Some linux distros put mount point into the ext2fs labels, such as '/', or '/boot', which confuses the devfs code and can cause userland programs to fail reading /dev/ext2fs directory with weird error code, such as any program that uses pwlib.
Strip any leading slashes before feeding the label to the geom_label code.
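The stripping step described above amounts to the following. This is only an illustration with a made-up function name; note that a label consisting solely of slashes becomes empty, and how the kernel code treats that case is not covered by the message.

```python
def sanitize_label(label):
    """Drop leading slashes so the label can be used as a devfs entry name."""
    return label.lstrip("/")


for raw in ("/", "/boot", "data"):
    print(repr(raw), "->", repr(sanitize_label(raw)))
```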
Sponsored by: Sippy Software, Inc.
MFC after: 1 week
|
219056 |
26-Feb-2011 |
nwhitehorn |
Add the disk ident and a human-meaningful description (here, the disk model string) to the geom_disk config XML so that they are easily accessible from userland.
MFC after: 1 week
|
219029 |
25-Feb-2011 |
netchild |
Add some FEATURE macros for various GEOM classes.
No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed.
Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: silence on geom@ during 2 weeks X-MFC after: to be determined in last commit with code from this project
|
218909 |
21-Feb-2011 |
brucec |
Fix typos - remove duplicate "the".
PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
|
218845 |
19-Feb-2011 |
nyan |
Add support to set a slice name.
|
218675 |
14-Feb-2011 |
luigi |
Correct a subtle bug in the 'gsched_rr' disk scheduler. The algorithm is supposed to work as follows: in order to prevent starvation, when a new client starts being served we record the start time and reset the counter of bytes served. We then switch to a new client after a certain amount of time or bytes, even if the current one still has pending requests. To avoid charging a new client the time of the first seek, we start counting time when the first request is served.
Unfortunately a bug in the previous version of the code failed to set the start time in certain cases, resulting in some processes exceeding their timeslice.
The fix (in this patch) is trivial, though it took a while to find out and replicate the bug. Thanks to Tommaso Caprai for investigating and fixing the problem.
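The accounting rule described above can be sketched as a small budget object. This is not the gsched_rr code: the class and method names are hypothetical, and time is an abstract number supplied by the caller. The point it illustrates is that the clock starts at the first served request, so the initial seek is not charged, and that the start time must always be set at that moment.

```python
class Slice:
    """Service budget for the client currently being served (illustrative)."""

    def __init__(self, max_time, max_bytes):
        self.max_time = max_time
        self.max_bytes = max_bytes
        self.start_time = None   # unset until the first request is served
        self.bytes_served = 0

    def serve(self, now, nbytes):
        """Account for one request of nbytes served at time now."""
        if self.start_time is None:
            # Start the clock here so the initial seek is not charged.
            self.start_time = now
        self.bytes_served += nbytes

    def expired(self, now):
        """True once either the time budget or the byte budget is used up."""
        if self.start_time is None:
            return False
        return (now - self.start_time >= self.max_time
                or self.bytes_served >= self.max_bytes)
```

A scheduler built on this would switch to the next client as soon as expired() returns true, even if the current client still has pending requests.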
Submitted by: Tommaso Caprai MFC after: 1 week
|
218663 |
13-Feb-2011 |
marcel |
Use the preload_fetch_addr() and preload_fetch_size() convenience functions to obtain the address and size of the preloaded key files.
Sponsored by: Juniper Networks.
|
218558 |
11-Feb-2011 |
nyan |
Add support to write boot menu.
|
218014 |
28-Jan-2011 |
ae |
Add new user-friendly aliases for partition types for the MBR and EBR schemes: fat32, ebr, linux-data, linux-raid, linux-swap and linux-lvm. Add bios-boot GUID and alias for the GPT scheme. It is used by the GRUB 2 loader. Also sort the type definitions in diskmbr.h and in g_part.c.
PR: bin/120990, kern/147664 MFC after: 2 weeks
|
217924 |
27-Jan-2011 |
ae |
While inspecting the disklabel, check that the start offset of each partition is within the provider's bounds. If not, reject the disklabel. Set bbarea to NULL so that it is not freed again in the destroy method.
MFC after: 1 week
|
217915 |
26-Jan-2011 |
mdf |
Remove the CTLFLAG_NOLOCK as it seems to be both unused and non-functional. Wiring the user buffer has only been done explicitly since r101422.
Mark the kern.disks sysctl as MPSAFE since it is and it seems to have been mis-using the NOLOCK flag.
Partially break the KPI (but not the KBI) for the sysctl_req 'lock' field since this member should be private and the "REQ_LOCKED" state seems meaningless now.
|
217880 |
26-Jan-2011 |
kib |
Treat async buffer writes from the gjournal switcher thread the same as from syncer. We shall not sleep on running buffer space when suspending.
Reproduced and tested by: pho PR: kern/154228 MFC after: 1 week
|
217531 |
18-Jan-2011 |
ae |
Limit the maximum number of GPT entries to 4k. This is the most realistic value and can prevent kernel memory exhaustion when a big value is specified on the command line.
Split reading and writing operations into several iterations so as not to trigger a KASSERT when the data length is greater than MAXPHYS.
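The splitting described above can be pictured as follows. The MAXPHYS value of 128 KiB and the 128-byte GPT entry size are assumptions made for the example, and split_io is a hypothetical helper, not a GEOM function.

```python
MAXPHYS = 128 * 1024   # assumed value; the real limit is a kernel constant
GPT_ENTRY_SIZE = 128   # assumed size of one GPT entry in bytes


def split_io(offset, length, maxphys=MAXPHYS):
    """Return (offset, length) pairs, each at most maxphys bytes long."""
    chunks = []
    while length > 0:
        n = min(length, maxphys)
        chunks.append((offset, n))
        offset += n
        length -= n
    return chunks


# A table with the maximum of 4096 entries, starting at byte offset 1024:
for off, n in split_io(1024, 4096 * GPT_ENTRY_SIZE):
    print(off, n)
```

Under these assumptions the largest permitted table is 512 KiB and is transferred in four requests, none of which exceeds the limit.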
PR: kern/144962, kern/147851 MFC after: 2 weeks
|
217324 |
12-Jan-2011 |
mdf |
sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the geom piece.
|
217305 |
12-Jan-2011 |
ae |
Sector size can not be greater than MAXPHYS. Since GRAID3 calculates the sector size from the user-specified block size, report an error to the user when the block size is too big.
PR: kern/147851 MFC after: 1 week
|
217303 |
12-Jan-2011 |
ae |
Sector size can not be greater than MAXPHYS.
MFC after: 1 week
|
217263 |
11-Jan-2011 |
ae |
Remove redundant check.
MFC after: 1 week
|
217262 |
11-Jan-2011 |
ae |
Round GNOP provider's mediasize to its sectorsize. This prevents a KASSERT in g_io_request when geom classes are tasting.
PR: kern/147852 MFC after: 1 week
|
217109 |
07-Jan-2011 |
mdf |
Fix a memory overflow where the input length to g_gpt_utf8_to_utf16() was specified incorrectly, causing the bzero to run past the end of a malloc(9)'d object.
Submitted by: Eric Youngblut < eyoungblut AT isilon DOT com > MFC after: 3 days
|
217040 |
06-Jan-2011 |
nwhitehorn |
Add an entry to the gpart XML to determine if the geom has pending changes that need to be committed (or undone).
MFC after: 2 weeks
|
216952 |
04-Jan-2011 |
kib |
Finish r210923, 210926. Mark some devices as eternal.
MFC after: 2 weeks
|
216794 |
29-Dec-2010 |
kib |
Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4). Non-zero value of attribute means that device supports BIO_DELETE.
Suggested and reviewed by: pjd Tested by: pho MFC after: 1 week
|
216755 |
28-Dec-2010 |
ae |
Allow destroying EBR in COMPAT (default) mode.
MFC after: 2 week
|
216754 |
28-Dec-2010 |
ae |
Make the EBR probe method less strict so that it can detect EBRs with small, non-fatal inconsistencies. An EBR may contain a boot loader, and sometimes it just has some garbage data. This no longer prevents FreeBSD from using extended partitions. But since we do not support bootcode for EBR, we mark tables that have a non-empty boot area as corrupt. This makes them read-only, so we cannot damage this data.
PR: kern/141235 MFC after: 1 month
|
216269 |
07-Dec-2010 |
brucec |
Don't warn if a partition appears not to be aligned on a track boundary. Modern disks use LBA and create a fake CHS geometry that doesn't have any relation to the on-disk layout of data.
|
216132 |
02-Dec-2010 |
ivoras |
Add a note about the magic number 20. Actually, 22.75 entries fit in a 512 byte sector but when choosing magic numbers, 20 looks nicer.
Discussed with: marcel
|
216098 |
01-Dec-2010 |
jh |
- Report an error when glabel(8) is asked to create a label with an invalid name.
- Fix a typo in an error message.
- Fix comment typos.
Approved by: pjd
|
215687 |
22-Nov-2010 |
jh |
Use g_eventlock to protect against losing wakeups in the g_event process and replace tsleep(9) with msleep(9) which doesn't use a timeout. The previously used timeout caused the event process to wake up ten times per second on an idle system.
one_event() is now called with the topology lock held and it returns with both the topology and event locks held when there are no more events in the queue.
Reported by: mav, Marius Nünnerich Reviewed by: freebsd-geom
|
215299 |
14-Nov-2010 |
ed |
Add support for asterisk characters when filling in the GELI password during boot.
Change the last argument of gets() to indicate a visibility flag and add definitions for the numerical constants. Except for the value 2, gets() will behave exactly the same, so existing consumers shouldn't break. We only use it in two places, though.
Submitted by: lme (older version)
|
215118 |
11-Nov-2010 |
ae |
Fix regression introduced in r215088: gpart(8) reports "arg0 'provider': Invalid argument" after creating new partition table. Move code for search of existing geom into g_part_find_geom function and use this function instead of g_part_parm_geom in g_part_ctl_create.
Approved by: kib (mentor)
|
215088 |
10-Nov-2010 |
ae |
In r212554 name of G_PART_PARM_GEOM and G_PART_PARM_PROVIDER ctlreq parameters was changed to "arg0". Fix the last place where it is used.
Approved by: kib (mentor)
|
214748 |
03-Nov-2010 |
jh |
Extend the g_eventlock mutex coverage in one_event() to include setting of the EV_DONE flag and use the mutex to protect against losing wakeups in g_waitfor_event().
Reported by: davidxu Tested by: davidxu Discussed on: freebsd-current
|
214352 |
25-Oct-2010 |
ae |
Reimplemented "gpart destroy -F". Now it does all work in kernel. This was needed for recover implementation.
Implement the recover command for GPT. A GPT is now marked as corrupt when any of three types of corruption is detected:
1. Damaged primary GPT header or table
2. Damaged secondary GPT header or table
3. Secondary header is not located in the last LBA
A GPT marked this way becomes read-only. Any changes to a corrupt table are prohibited. Only the "destroy" and "recover" commands are allowed.
Discussed with: geom@ (mostly silence) Tested by: Ilya A. Arhipov Approved by: mav (mentor) MFC after: 2 weeks
|
214229 |
22-Oct-2010 |
pjd |
- Improve error messages, so instead of 'Not fully done', the user will get information that the device is already suspended or that the device is using a one-time key and suspend is not supported.
- 'geli suspend -a' silently skips devices that use a one-time key. This is fine, but because we log which devices were suspended on the console, also log which devices were skipped.
|
214228 |
22-Oct-2010 |
pjd |
Close a race between checking if device is already suspended and suspending it.
|
214227 |
22-Oct-2010 |
pjd |
Add State tag, so 'geli status' will report active/suspended status, eg:
# geli status
Name     Status     Components
da0.eli  SUSPENDED  da0
da1.eli  ACTIVE     da1
|
214226 |
22-Oct-2010 |
pjd |
Encryption keys array might be NULL if device is suspended. Check for this, so we don't panic when we detach suspended device.
|
214225 |
22-Oct-2010 |
pjd |
Move sc_akeyctx and sc_ivctx initialization to the g_eli_mkey_propagate() function which eliminates code duplication and will ensure proper order of operation.
|
214163 |
21-Oct-2010 |
pjd |
Free opencrypto sessions on suspend, as they also might keep encryption keys.
|
214133 |
21-Oct-2010 |
pjd |
Fix a bug introduced in r213067 where we use authentication key before initializing it.
|
214118 |
20-Oct-2010 |
pjd |
Bring in geli suspend/resume functionality (finally).
Before this change if you wanted to suspend your laptop and be sure that your encryption keys are safe, you had to stop all processes that use file system stored on encrypted device, unmount the file system and detach geli provider.
This isn't very handy. If you are a lucky user of a laptop where suspend/resume actually works with FreeBSD (I'm not!) you most likely want to suspend your laptop, because you don't want to start everything over again when you turn your laptop back on.
And this is where geli suspend/resume steps in. When you execute:
# geli suspend -a
geli will wait for all in-flight I/O requests, suspend new I/O requests, remove all geli sensitive data from the kernel memory (like encryption keys) and will wait for either 'geli resume' or 'geli detach'.
Now with no keys in memory you can suspend your laptop without stopping any processes or unmounting any file systems.
When you resume your laptop you have to resume geli devices using 'geli resume' command. You need to provide your passphrase, etc. again so the keys can be restored and suspended I/O requests released.
Of course you need to remember that 'geli suspend' won't clear file system cache and other places where data from your geli-encrypted file system might be present. But to get rid of those stopping processes and unmounting file system won't help either - you have to turn your laptop off. Be warned.
Also note that suspending a geli device which contains the file system with the geli utility (or anything used by 'geli resume') is not a very good idea, as you won't be able to resume it - when you execute geli(8), the kernel will try to read it and this read I/O request will be suspended.
|
214116 |
20-Oct-2010 |
pjd |
- Add missing comments.
- Make a comment consistent with others.
|
214063 |
19-Oct-2010 |
jh |
Use make_dev_p(9) with the MAKEDEV_CHECKNAME flag instead of make_dev(9) and print a diagnostic if the call fails.
This avoids a panic when a device with an invalid name is attempted to be registered. For example the label class gets device names from untrusted input.
Reviewed by: freebsd-geom
|
213769 |
13-Oct-2010 |
rpaulo |
The canonical way to print __func__ when using KASSERT() is to write ("%s", __func__). This avoids clang's -Wformat-string warnings.
|
213662 |
09-Oct-2010 |
ae |
Replace strlen(_PATH_DEV) with sizeof(_PATH_DEV) - 1.
Suggested by: kib Approved by: kib (mentor) MFC after: 5 days
|
213318 |
01-Oct-2010 |
lulf |
- Check flag with the bitwise operator, not the logical operator.
Submitted by: arundel MFC after: 1 week
|
213174 |
25-Sep-2010 |
ae |
Some schemes can allocate memory for internal purposes, but when GEOM does withering this memory is not freed. Add a G_PART_DESTROY call to g_part_wither. Also add the missing g_free() call to the G_PART_READ method for the MBR and PC98 schemes.
Submitted by: jh (previous version) Reviewed by: pjd Approved by: kib (mentor)
|
213165 |
25-Sep-2010 |
pjd |
Change g_eli_debug to int, so one can turn off any GELI output by setting kern.geom.eli.debug sysctl to -1.
MFC after: 2 weeks
|
213164 |
25-Sep-2010 |
pjd |
Ignore errors from BIO_FLUSH. It might confuse users that provider wasn't really killed. What we really care about are write errors only.
MFC after: 2 weeks
|
213135 |
24-Sep-2010 |
pjd |
Allow configuring GPT attributes. It shouldn't be allowed to set the bootfailed attribute (it should be allowed only to unset it), but for test purposes it might be useful, so the current code allows it.
Reviewed by: arch@ (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>) MFC after: 2 weeks
|
213072 |
23-Sep-2010 |
pjd |
Update copyright years.
MFC after: 1 week
|
213070 |
23-Sep-2010 |
pjd |
Add support for AES-XTS. This will be the default now.
MFC after: 1 week
|
213067 |
23-Sep-2010 |
pjd |
Implement switching of data encryption key every 2^20 blocks. This ensures the same encryption key won't be used for more than 2^20 blocks (sectors). This will be the default now.
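In other words, the key in use is a function of the sector number. A minimal sketch of that mapping follows; the 512-byte default sector size and the function name are assumptions for illustration, not details from the commit.

```python
KEY_SHIFT = 20  # switch keys every 2^20 sectors, as described above


def key_index(offset, sectorsize=512):
    """Index of the data key covering the sector at byte offset `offset`."""
    sector = offset // sectorsize
    return sector >> KEY_SHIFT


last_byte_of_first_span = (1 << KEY_SHIFT) * 512 - 1
print(key_index(0), key_index(last_byte_of_first_span),
      key_index(last_byte_of_first_span + 1))
```

Every sector below 2^20 maps to key 0, and the first byte past that span maps to key 1.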
MFC after: 1 week
|
213063 |
23-Sep-2010 |
pjd |
Make the code similar to the code in g_eli_integrity.c.
MFC after: 1 week
|
213062 |
23-Sep-2010 |
pjd |
Define default overwrite count, so that userland can use it.
MFC after: 1 week
|
213055 |
23-Sep-2010 |
pjd |
When trashing metadata, flush after each write.
MFC after: 1 week
|
212845 |
19-Sep-2010 |
brian |
Support attaching version 4 metadata
Reviewed by: pjd
|
212754 |
16-Sep-2010 |
mav |
Add support for dumping kernel to gconcat. Dumping goes to the component, where dump partition begins.
|
212706 |
15-Sep-2010 |
pjd |
Make the message printed when setting or unsetting an attribute less confusing. Before:
ada0 has <attrib> set
After:
<attrib> set on ada0
MFC after: 2 weeks
|
212703 |
15-Sep-2010 |
pjd |
Make the message that informs about bootcode being written to disk less confusing.
Note there is still no information about 'partcode' being written to disk (gpart bootcode -p <partcode> <disk>).
Maybe in the future all the messages printed by gpart(8) on success could be hidden under -v?
PR: bin/150239 Reported by: Roddi <roddi@me.com> Submitted by: arundel MFC after: 2 weeks
|
212614 |
14-Sep-2010 |
pjd |
- Change all places where G_TYPE_ASCNUM is used to G_TYPE_NUMBER. It turns out the new type wasn't really needed.
- Reorganize code a little bit.
|
212609 |
14-Sep-2010 |
pjd |
Simplify the code a bit.
|
212554 |
13-Sep-2010 |
pjd |
- Remove gc_argname field. It was introduced for gpart(8), but if I understand everything correctly, we don't really need it.
- Provide default numeric values as strings. This simplifies a lot of code.
- Bump version number.
|
212547 |
13-Sep-2010 |
pjd |
- Allow values to be specified as const pointers.
- Make optional string values always an empty string.
|
212160 |
02-Sep-2010 |
gibbs |
Correct bioq_disksort so that bioq_insert_tail() offers barrier semantics. Add the BIO_ORDERED flag for struct bio and update bio clients to use it.
The barrier semantics of bioq_insert_tail() were broken in two ways:
o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio.
o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice.
sys/geom/sched/subr_disk.c: sys/kern/subr_disk.c: o Update last_offset in bioq_insert_tail().
o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active.
o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to its next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one, no-op, loop iteration in the while loop that immediately follows.
o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction.
sys/sys/bio.h: Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.
sys/cam/ata/ata_da.c: sys/cam/scsi/scsi_da.c Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set.
sys/cam/scsi/scsi_da.c Use an ordered tag when issuing a synchronize cache command.
Wrap some lines to 80 columns.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c sys/geom/geom_io.c Mark bios with the BIO_FLUSH command as BIO_ORDERED.
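The barrier behaviour described in this commit can be modelled with a toy queue. This is a heavily simplified sketch, not the kernel code: it sorts by plain ascending offset instead of the elevator ordering keyed on last_offset, it tracks the barrier as a list index, and all names other than those mentioned in the message are hypothetical.

```python
class BioQueue:
    """Toy model of a sorted I/O queue with barrier semantics."""

    def __init__(self):
        self.queue = []     # offsets, in dispatch order
        self.barrier = 0    # index of the first entry after the last barrier

    def insert_tail(self, offset):
        """Append a request; nothing queued later may pass it."""
        self.queue.append(offset)
        self.barrier = len(self.queue)

    def disksort(self, offset, ordered=False):
        """Sorted insert, restricted to the region after the last barrier."""
        if ordered:
            # Ordered requests bypass sorting and act as a barrier.
            self.insert_tail(offset)
            return
        i = self.barrier
        while i < len(self.queue) and self.queue[i] <= offset:
            i += 1
        self.queue.insert(i, offset)

    def takefirst(self):
        """Remove and return the head of the queue, or None if empty."""
        if not self.queue:
            return None
        if self.barrier > 0:
            self.barrier -= 1
        return self.queue.pop(0)
```

For example, after disksort(50), disksort(10), insert_tail(30), disksort(5), the request at offset 5 stays behind the barrier at 30 even though its sort key is smaller than everything queued before it.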
Sponsored by: Spectra Logic Corporation MFC after: 1 month
|
211927 |
28-Aug-2010 |
pjd |
Correct offset conversion to little endian. It was implemented in version 2, but because of a bug it was a no-op, so we were still using offsets in native byte order for the host. Do it properly this time, bump version to 4 and set the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4.
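The difference between a fixed little-endian encoding and the host's native byte order can be shown in a few lines. The sketch assumes the offset is a 64-bit unsigned value serialized into a byte buffer; how GELI consumes that buffer is not described in the message, and the function names are illustrative.

```python
import struct


def offset_le(offset):
    """Serialize a 64-bit offset in little-endian order on any host."""
    return struct.pack("<Q", offset)


def offset_native(offset):
    """Serialize a 64-bit offset in the byte order of the running host."""
    return struct.pack("=Q", offset)


print(offset_le(0x0102030405060708).hex())
```

On a little-endian host the two functions return identical bytes, which is one way a conversion that is accidentally a no-op can go unnoticed there; on a big-endian host the results differ.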
MFC after: 2 weeks
|
211455 |
18-Aug-2010 |
mav |
Remove bintime_cmp() function, unused since r200086.
MFC after: 1 week
|
210795 |
03-Aug-2010 |
ae |
Check that gsp is not NULL before access. It can be NULL for some cases.
Approved by: kib (mentor) MFC after: 1 week
|
210792 |
03-Aug-2010 |
ae |
Check that table is not NULL before access, it can be NULL for some cases.
Approved by: mav (mentor) MFC after: 2 weeks
|
210747 |
02-Aug-2010 |
ae |
Forward ioctl requests to original geom.
PR: 148540 Silence from: luigi Reviewed by: pjd Approved by: mav (mentor) MFC after: 2 weeks
|
210746 |
02-Aug-2010 |
ae |
Release access for consumers that are opened, but will be destroyed indirectly by orphan method.
PR: 148688 Silence from: marcel Approved by: mav (mentor) MFC after: 2 weeks
|
210471 |
25-Jul-2010 |
mav |
Export PCI IDs of ATA/SATA controllers through CAM and ata(4) layers to GEOM. This information is needed for proper reading and writing of soft-RAID on-disk metadata.
|
210401 |
23-Jul-2010 |
ae |
Prevent access after free to a table entry when the user deletes a partition that has not yet been created (changes not yet committed to disk).
PR: 148687 Approved by: mav (mentor) MFC after: 7 days
|
210046 |
14-Jul-2010 |
ru |
Fixed cache size decoding read from a label.
PR: kern/144732 Submitted by: Eugene Grosbein MFC after: 3 days
|
209536 |
26-Jun-2010 |
rpaulo |
Add NTFS partition type to GEOM_MBR.
|
209187 |
14-Jun-2010 |
pjd |
'unit' can be negative, so use signed type for it.
Found by: Coverity Prevent CID: 3731 MFC after: 3 days
|
209186 |
14-Jun-2010 |
pjd |
BIO_DELETE contains range we want to delete and doesn't provide any useful data, so there is no need to copy it to userland.
MFC after: 3 days
|
209062 |
11-Jun-2010 |
avg |
fix a few cases where a string is passed via format argument instead of via %s
Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed to drop an intermediate buffer.
Found by: clang MFC after: 2 week
|
208992 |
10-Jun-2010 |
trasz |
Untangle g_print_bio(), silencing Coverity.
Found with: Coverity Prevent CID: 3566, 3567
|
208927 |
08-Jun-2010 |
mjacob |
Try and narrow the gap in which you act on an event that has been canceled.
Obtained from: Jaako Heinonen MFC after: 1 month
|
208812 |
05-Jun-2010 |
trasz |
Make sure not to pass NULL to g_orphan_provider().
Found with: Coverity Prevent CID: 3411
|
208746 |
02-Jun-2010 |
marius |
Don't leak memory on destruction.
Reviewed by: marcel MFC after: 3 days
|
208672 |
31-May-2010 |
avg |
g_label: fix possible NULL pointer dereference
in case glabel debug level is >= 1 and gp->provider list is empty for some reason
Found by: clang static analyzer MFC after: 4 days
|
208515 |
24-May-2010 |
marius |
Fix some whitespace nits.
|
208173 |
16-May-2010 |
nwhitehorn |
Teach gpart about bootcode on APM.
|
208101 |
14-May-2010 |
mjacob |
Yet another potential dereference of a dead provider.
Sponsored by: Panasas MFC after: 1 week
|
208082 |
14-May-2010 |
mjacob |
Make sure to check that the active provider pointer points to something before dereferencing the pointer.
Sponsored by: Panasas MFC after: 1 week
|
207878 |
10-May-2010 |
jh |
- Don't return EAGAIN from gv_unload(). It was used to work around the deadlock fixed in r207671.
- Wait for worker process to exit at class unload. The worker process was not guaranteed to exit before the linker unloaded the module.
- Use 0 as the worker process exit status instead of ENXIO and style the NOTREACHED comment.
Reviewed by: lulf X-MFC after: r207671
|
207877 |
10-May-2010 |
jh |
In g_zero_destroy_geom(), return 0 instead of EBUSY in the success case. EBUSY was probably used as a workaround for the deadlock fixed in r207671.
Approved by: pjd X-MFC after: r207671
|
207789 |
08-May-2010 |
lulf |
- Remove obsolete flags.
MFC after: 1 week
|
207671 |
05-May-2010 |
jh |
Fix deadlock between GEOM class unloading and withering. Withering can't proceed while g_unload_class() blocks the event thread. Fix this by not running g_unload_class() as a GEOM event and dropping the topology lock when withering needs to proceed.
PR: kern/139847 Silence on: freebsd-geom
|
207181 |
25-Apr-2010 |
marcel |
Re-calculate a geometry when reprobing as well.
PR: kern/145452 Reported by: "Andrey V. Elsukov" <bu7cher@yandex.ru>
|
207178 |
25-Apr-2010 |
marcel |
Fix undo for schemes that have internal partitions. Internal partitions do not constitute user-visible or active partitions and as such should not prevent undoing pending operations.
While here, initialize the last usable sector for the placeholder geom based on the null scheme, created to allow undoing the destruction of a scheme. This gives consistent output with "gpart show".
Based on a patch from: "Andrey V. Elsukov" <bu7cher@yandex.ru>
|
207094 |
23-Apr-2010 |
marcel |
Implement the resize verb and add support for resizing partitions for all schemes but EBR. Quality work by Andrey!
Submitted by: "Andrey V. Elsukov" <bu7cher@yandex.ru>
|
206859 |
19-Apr-2010 |
jh |
Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. Don't assert that the topology lock is held when g_valid_obj() is called from debugger.
MFC after: 1 week
|
206665 |
15-Apr-2010 |
pjd |
Use lower priority for GELI worker threads. This improves system responsiveness under heavy GELI load.
MFC after: 3 days
|
206650 |
15-Apr-2010 |
avg |
g_io_check: respond to zero pp->mediasize with ENXIO
Previously this condition was reported with EIO by the bio_offset > mediasize check. Perhaps that check should be extended to bio_offset+bio_length > mediasize.
MFC after: 1 week
|
206552 |
13-Apr-2010 |
luigi |
fix copyright format, as requested by Joel Dahl
|
206551 |
13-Apr-2010 |
luigi |
make code compile with KTR
|
206497 |
12-Apr-2010 |
luigi |
Bring in geom_sched, support for scheduling disk I/O requests in a device independent manner. Also include an example anticipatory scheduler, gsched_rr, which gives very nice performance improvements in presence of competing random access patterns.
This is joint work with Fabio Checconi, developed last year and presented at BSDCan 2009. You can find details in the README file or at
http://info.iet.unipi.it/~luigi/geom_sched/
|
206130 |
03-Apr-2010 |
avg |
g_vfs_open: allow only one mount per device vnode
In other words, deny multiple read-only mounts of the same device. Shared read-only mounts should theoretically be possible, but, unfortunately, can not be implemented correctly using current buffer cache code/interface and results in an eventual system crash. Also, using nullfs seems to be a more efficient way to achieve the same goal.
This gets us back to where we were before GEOM and where other BSDs are.
Submitted by: pjd (idea for checking for shared mounting) Discussed with: phk, pjd Silence from: fs@, geom@ MFC after: 2 weeks
|
206097 |
02-Apr-2010 |
avg |
bo_bsize: revert r205860 and take an alternative approach in getblk
In r205860 I missed the fact that there is code that strongly assumes that devvp bo_bsize is equal to the underlying provider's sectorsize. In those places it is hard to obtain the sectorsize in an alternative way if devvp bo_bsize is set to something else. So, I am reverting the bo_bsize assignment in g_vfs_open. Instead, in getblk I use the DEV_BSIZE block size for the b_offset calculation if vp is a disk vp as reported by vn_isdisk. This should coincide with vp being a devvp.
Reported by: Mykola Dzham <i@levsha.me> Tested by: Mykola Dzham <i@levsha.me> Pointyhat to: avg MFC after: 2 weeks X-ToDo: convert bread(devvp) in all fs to use bo_bsize-d blocks
|
205860 |
29-Mar-2010 |
avg |
g_vfs_open: correctly set devvp.v_bufobj.bo_bsize to DEV_BSIZE
Because of how breadn -> bufstrategy -> g_vfs_strategy are currently implemented, bread on devvp always expects DEV_BSIZE block size. Thus, devvp bo_bsize must always be DEV_BSIZE irrespective of media properties or filesystem implementation details.
Reviewed by: mckusick MFC after: 2 weeks
|
205847 |
29-Mar-2010 |
mjacob |
Change how multipath labels are created and managed. This makes it easier to support various storage boxes which really aren't active-active.
We only write the label on the *first* provider. For all other providers we just "add" the disk. This also allows for an "add" verb.
A usage implication is that you should specify the currently active storage path as the first provider.
Note that this does not add RDAC-like functionality, but better allows for autovolumefailover configurations (additional checkins elsewhere will support this).
Sponsored by: Panasas MFC after: 1 month
|
205619 |
24-Mar-2010 |
mav |
Do not fetch precise time of request start when stats collection disabled.
Reviewed by: pjd, phk
|
205412 |
21-Mar-2010 |
mjacob |
Add 'rotate' and 'getactive' verbs to provide some control and information about what the currently active path is.
Sponsored by: Panasas MFC after: 1 month
|
205385 |
20-Mar-2010 |
jh |
Escape characters unsafe for XML output in GEOM class, instance and provider names.
- Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced with '?'. Those characters are disallowed in XML. - '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are replaced with XML numeric character reference.
If the kern.geom.confxml sysctl provides invalid XML, libgeom geom_xml2tree() fails and utilities using it do not work. Unsafe characters are common in msdosfs and cd9660 labels.
PR: kern/104389 Submitted by: Doug Steinwand (original version) Reviewed by: pjd Discussed on: freebsd-geom MFC after: 3 weeks
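The two escaping rules described in this commit can be sketched in Python as follows. This is only an illustration of the rules as stated in the message, not the kernel implementation: the function name `escape_geom_name` and the exact `&#x..;` spelling of the numeric character reference are assumptions.

```python
def escape_geom_name(name: bytes) -> str:
    """Escape a GEOM class/instance/provider name for XML output.

    Hypothetical helper that mirrors the rules in the commit message.
    NUL bytes are not handled specially because the names are C strings.
    """
    out = []
    for b in name:
        if b in (0x09, 0x0A, 0x0D):
            # '\t', '\n' and '\r' are legal in XML and pass through.
            out.append(chr(b))
        elif 0x01 <= b <= 0x1F:
            # Other control characters are disallowed in XML: use '?'.
            out.append("?")
        elif b in b"&<>'\"" or b >= 0x7F:
            # Markup characters and 0x7f-0xff become numeric references.
            out.append(f"&#x{b:02x};")
        else:
            out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    # A label of the kind found on msdosfs or cd9660 media.
    print(escape_geom_name(b"MY&DISK\x01"))
```

With escaping of this kind in place, a label containing `&` or a stray control byte no longer makes the `kern.geom.confxml` output unparseable.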
|
205279 |
18-Mar-2010 |
pjd |
Simplify loops.
|
204886 |
08-Mar-2010 |
lulf |
- Set missing flag when initiating a plex rebuild with the rebuildparity command. - Check if plex is already syncing or rebuilding before initiating a parity rebuild or check.
|
204076 |
18-Feb-2010 |
pjd |
Please welcome HAST - Highly Available Storage.
HAST allows data to be stored transparently on two physically separated machines connected over a TCP/IP network. HAST works in a Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only the Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total.
HAST operates on block level - it provides disk-like devices in the /dev/hast/ directory for use by file systems and/or applications. Working on block level makes it transparent for file systems and applications. There is no difference between using a HAST-provided device and a raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD.
For more information please consult hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST.
Sponsored by: FreeBSD Foundation Sponsored by: OMCnet Internet Service GmbH Sponsored by: TransIP BV
|
204071 |
18-Feb-2010 |
pjd |
- Style fixes. - Prefer strlcpy() over strncpy().
|
204070 |
18-Feb-2010 |
pjd |
Correct comment.
|
204069 |
18-Feb-2010 |
pjd |
Log attach just like we log detach.
|
203411 |
03-Feb-2010 |
gonzo |
- Give geom_redboot taste of flash/spi. Now there is another provider of redboot partitions. This patch was missed during merge from projects/mips.
|
203408 |
02-Feb-2010 |
delphij |
Prevent NULL dereference by checking the return value of gctl_get_asciiparam.
MFC after: 2 weeks
|
203261 |
30-Jan-2010 |
marcel |
Export the UUID of the partition in the XML. The partition UUID is used by EFI's device path to identify a partition. In order for FreeBSD to add EFI boot options, proper device paths need to be constructed.
|
202987 |
25-Jan-2010 |
ivoras |
Go through with write_metadata() non-error-handling and make it return "void". This is mostly to avoid dead variable assignment warning by LLVM. No functional change.
Pointed out by: trasz Approved by: gnn (mentor)
|
202977 |
25-Jan-2010 |
trasz |
Remove unneeded variables.
Found with: clang
|
202976 |
25-Jan-2010 |
trasz |
Remove pointless assignment.
Found with: clang
|
202974 |
25-Jan-2010 |
trasz |
Remove some pointless variable assignments.
Found with: clang
|
202972 |
25-Jan-2010 |
trasz |
Remove unused variable.
Found with: clang
|
202454 |
17-Jan-2010 |
delphij |
Expose stripe offset and stripe size through libgeom and geom(8) userland utilities.
Reviewed by: pjd, mav (earlier version)
|
202437 |
16-Jan-2010 |
trasz |
Add gmountver, disk mount verification GEOM class.
Note that due to e.g. write throttling ('wdrain'), it can stall all the disk I/O instead of just the device it's configured for. Using it for removable media is therefore not a good idea.
Reviewed by: pjd (earlier version)
|
201645 |
06-Jan-2010 |
mav |
Change the way in which zero stripesize is handled. Instead of reporting zero stripeoffset in such case (as if device has no stripes), report offset from the beginning of the media (as if device has single infinite stripe).
This gives partitioning tools the information required to guess a better partition alignment in case the hardware doesn't report its stripe size. For example, it should give disklabel info about an odd offset made by fdisk.
|
201567 |
05-Jan-2010 |
mav |
Move wakeup() out of mutex to reduce contention.
|
201566 |
05-Jan-2010 |
mav |
Move wakeup() out of mutex to reduce contention.
|
201545 |
05-Jan-2010 |
mav |
Slightly optimize XOR calculation.
|
201374 |
02-Jan-2010 |
marcel |
Properly return the UUID represented by the alias.
PR: 142174 Submitted by: Przemyslaw Laczynski <torindel@gmail.com> Pointy hat to: rpaulo
|
201264 |
30-Dec-2009 |
mav |
Call wakeup() only for the first request on the queue.
|
201145 |
28-Dec-2009 |
antoine |
(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used.
PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month
|
201139 |
28-Dec-2009 |
mav |
Add BIO_DELETE support to ada(4): - For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by ACS-2 specification working draft. - For CompactFlash use CFA ERASE command, same as ad(4) does.
With this patch, `newfs -E /dev/ada1` was able to restore the write speed of my heavily worn OCZ Vertex SSD (firmware 1.4) to its initial level for most of its capacity. The previous 1.3 firmware, despite reporting the TRIM capability bit as set, did not work and returned an ABORT error for every DSM command.
I have no idea whether it is normal, but for some reason it takes 200ms to handle any TRIM command on this drive, which made deletes extremely slow. However, a TRIM command can accept a long list of LBAs, and the length of that list does not seem to affect its execution time. The implemented request clustering algorithm allowed me to raise the delete rate to reasonable numbers when many parallel DELETE requests are running.
|
200942 |
24-Dec-2009 |
mav |
Make geom_concat pass through the stripe parameters of the first component, hoping that the rest will fit.
|
200940 |
24-Dec-2009 |
mav |
Since geom_raid3 reports its own stripe as the sector size, report the largest underlying provider's stripe, multiplied by the number of data disks in the array (because of the transformation done), as the array stripe.
|
200935 |
24-Dec-2009 |
mav |
Since a mirror has no stripes of its own, report the largest stripe of the underlying components, hoping the others fit if they are not equal.
|
200934 |
24-Dec-2009 |
mav |
Add two disk ioctls, giving user-level tools information about disk/array stripe (optimal access block) size and offset.
|
200933 |
24-Dec-2009 |
mav |
Make geom_stripe report its stripe size to upper layers.
|
200821 |
21-Dec-2009 |
mav |
Make graid3 fall back to malloc() when the component request size is bigger than the maximal prepared UMA zone size. This fixes a crash with MAXPHYS > 128K.
|
200539 |
14-Dec-2009 |
rpaulo |
Add Microsoft and NetBSD partition types handling.
|
200534 |
14-Dec-2009 |
rpaulo |
Simplify partition type parsing by using a data-oriented model. While there add more Apple and Linux partition types.
|
200086 |
03-Dec-2009 |
mav |
Change 'load' balancing mode algorithm:
- Instead of measuring the last request execution time for each drive and choosing the one with the smallest time, use the averaged number of requests running on each drive. This information is more accurate and timely. It allows load to be distributed between drives in a more even and predictable way.
- For each drive, track the offset of the last submitted request. If a new request's offset matches the previous one, or is close to it, for some drive, prefer that drive. This significantly speeds up simultaneous sequential reads.
PR: kern/113885 Reviewed by: sobomax
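The two ideas in this commit (an averaged count of running requests per drive, plus a preference for the drive whose last request ended near the new offset) can be sketched roughly as below. Everything concrete here is an assumption made for illustration: the `Disk` class, the `NEAR` distance, the `LOCALITY_BONUS` weight and the 7/8 averaging factor are hypothetical and are not taken from the gmirror source.

```python
from dataclasses import dataclass

NEAR = 128 * 1024        # hypothetical: how close counts as "close"
LOCALITY_BONUS = 2.0     # hypothetical: weight given to a nearby offset


@dataclass
class Disk:
    name: str
    load: float = 0.0      # averaged number of running requests
    inflight: int = 0      # requests currently running
    last_offset: int = -1  # end offset of the last submitted request


def choose_disk(disks, offset):
    """Pick the drive with the lowest cost for a read at `offset`."""
    def cost(d):
        c = d.load
        if d.last_offset >= 0 and abs(offset - d.last_offset) <= NEAR:
            c -= LOCALITY_BONUS  # prefer the drive already positioned here
        return c
    return min(disks, key=cost)


def submit(disks, offset, length):
    """Choose a drive, then update its load average and last offset."""
    d = choose_disk(disks, offset)
    d.inflight += 1
    d.load = (d.load * 7 + d.inflight) / 8  # simple moving average
    d.last_offset = offset + length
    return d


if __name__ == "__main__":
    mirror = [Disk("a"), Disk("b")]
    picks = [submit(mirror, off, 4096).name for off in (0, 4096, 10**9)]
    print(picks)
```

In this sketch a sequential stream keeps landing on the same drive, while an unrelated request at a distant offset goes to the less loaded one.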
|
199875 |
28-Nov-2009 |
trasz |
Provide a set of sysctls and tunables to disable device node creation for specific "kinds" of disk labels - for example, GPT UUIDs. Reason for this is that sometimes, other GEOM classes attach to these device nodes instead of the proper ones - e.g. they attach to /dev/gptid/XXX instead of /dev/ada0p2, which is annoying.
Reviewed by: pjd (earlier version) MFC after: 1 month
|
199232 |
12-Nov-2009 |
rpaulo |
Add a missing check for Apple HFS partitions.
MFC after: 1 week
|
199228 |
12-Nov-2009 |
rnoland |
We need to allocate space for the header in the create path also.
This fixes a null pointer dereference with "gpart create -s GPT" after the previous commit.
Reported by: Yuri Pankov Pointyhat to: me MFC after: 1 week
|
199017 |
07-Nov-2009 |
rnoland |
Fix handling of GPT headers when size is > 92 bytes.
It is valid for an on-disk GPT header to report a header size which is greater than 92 bytes. Previously, we would read in the sector and copy only the 92 bytes that we know how to deal with before calculating the checksum for comparison. This meant that when we did the checksum, we overshot the buffer and took in random memory, so the checksum would fail.
We now determine the size of the header and allocate enough space to preserve the entire on-disk contents. This allows us to correctly calculate the checksum and to modify and write the header back to the disk, while preserving data that we might not understand.
Reported by: Kris Weston Approved by: marcel@ MFC after: 2 weeks
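The checksum issue described above can be illustrated with a small Python sketch. It assumes the standard GPT header layout (signature at byte 0, header size at byte 12, header CRC32 at byte 16, CRC computed with its own field zeroed); the helper names `header_crc`, `header_is_valid` and `make_header` are hypothetical and exist only for this example.

```python
import struct
import zlib

HDR_SIZE_OFF = 12   # little-endian uint32: size of the header in bytes
HDR_CRC_OFF = 16    # little-endian uint32: CRC32 of the header


def header_crc(sector: bytes) -> int:
    """CRC32 over the full on-disk header size, not a fixed 92 bytes."""
    (size,) = struct.unpack_from("<I", sector, HDR_SIZE_OFF)
    if size < 92 or size > len(sector):
        raise ValueError("implausible GPT header size")
    hdr = bytearray(sector[:size])
    hdr[HDR_CRC_OFF:HDR_CRC_OFF + 4] = b"\0\0\0\0"  # CRC field reads as 0
    return zlib.crc32(bytes(hdr)) & 0xFFFFFFFF


def header_is_valid(sector: bytes) -> bool:
    (stored,) = struct.unpack_from("<I", sector, HDR_CRC_OFF)
    return sector[:8] == b"EFI PART" and stored == header_crc(sector)


def make_header(size: int = 128) -> bytes:
    """Build a toy sector whose header claims `size` bytes (test data)."""
    sector = bytearray(512)
    sector[0:8] = b"EFI PART"
    struct.pack_into("<I", sector, HDR_SIZE_OFF, size)
    sector[92:size] = b"\xaa" * (size - 92)  # data we do not understand
    struct.pack_into("<I", sector, HDR_CRC_OFF, header_crc(bytes(sector)))
    return bytes(sector)


if __name__ == "__main__":
    print(header_is_valid(make_header(128)))
```

Because the CRC covers every byte up to the reported header size, the bytes past offset 92 have to be kept in memory and written back unchanged for the header to stay valid.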
|
198097 |
14-Oct-2009 |
rnoland |
Set the active flag in the PMBR when we install bootcode on a GPT partitioned disk. Some BIOS require this to be set before they will boot the device.
Approved by: marcel MFC after: 2 weeks
|
197898 |
09-Oct-2009 |
pjd |
If a provider is open for writing when we taste it, skip it for classes that depend on on-disk metadata. This way we won't attach to providers that are used by other classes. For example, we don't want to configure partitions on da0 if it is part of a gmirror; what we really want is partitions on mirror/foo.
During regular work it works like this: if a provider is open for writing, a class receives the spoiled event from GEOM and detaches; once the provider is closed, the taste event is sent again and the class can rediscover its metadata if it is still there. This doesn't work that way when a new class arrives, because GEOM gives it all existing providers to taste, including those open for writing. Classes have to decide on their own whether they want to deal with such providers (e.g. geom_dev) or not (classes modified by this commit).
Reported by: des, Oliver Lehmann <lehmann@ans-netz.de> Tested by: des, Oliver Lehmann <lehmann@ans-netz.de> Discussed with: phk, marcel Reviewed by: marcel MFC after: 3 days
|
197767 |
05-Oct-2009 |
lulf |
- Improve error message consistency and wording.
|
197608 |
28-Sep-2009 |
marcel |
The first 96 bytes may not be zeroes. It can contain trivial boot code that merely emits an error and waits for a key press before rebooting. The error being that extended partitions are not bootable. The origin is presumed to be Windows 2000; Windows XP does not do this...
For now, ignore the first 96 bytes when checking that the EBR is (for the most part) all zeroes.
Tested by: Mario Lobo <mlobo@digiart.art.br> MFC after: 1 week
|
197449 |
24-Sep-2009 |
marcel |
Don't create more partitions than can fit in the table by checking that the index is within bounds.
|
196986 |
08-Sep-2009 |
trasz |
Remove unused variable.
|
196964 |
08-Sep-2009 |
mav |
Do not check proper request alignment here in geom_dev in production. It will be checked any way later by g_io_check() in g_io_schedule_down(). It is only needed here to not trigger panic from additional check, when INVARIANTS enabled. So cover it with #ifdef INVARIANTS. It saves two 64bit divisions per request.
|
196904 |
06-Sep-2009 |
mav |
MFp4: Remove msleep() timeout from g_io_schedule_up/down(). It works fine without it, saving few percents of CPU on high request rates without need to rearm callout twice per request.
|
196879 |
06-Sep-2009 |
pjd |
Add support for changing providers priority.
Submitted by: Mel Flynn
|
196837 |
04-Sep-2009 |
mav |
Remove artificial MAX_IO_SIZE constant, equal to DFLTPHYS * 2. Use MAXPHYS instead. It is NULL change for GENERIC kernel, but allows 'fast' mode to work on systems with increased MAXPHYS.
|
196823 |
04-Sep-2009 |
pjd |
Simplify g_disk_ident_adjust() function and allow any printable character in serial number.
Discussed with: trasz Obtained from: Wheel Sp. z o.o. (http://www.wheel.pl)
|
196580 |
27-Aug-2009 |
pjd |
There's no need for checking result of M_WAITOK allocation.
|
196579 |
27-Aug-2009 |
pjd |
Fix an obvious topology lock leak.
MFC after: 3 days
|
196333 |
17-Aug-2009 |
marcel |
The start of the EFI GPT partition in the PMBR can always be represented by CHS addressing. Don't define these fields as 0xff, but rather define them correctly. This prevents boot problems on PCs where GPT is being used.
PR: 115406 Submitted by: Kent Hauser <kent@khauser.net> Approved by: re (kib)
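As a worked example of why the start of the protective partition is always representable: it begins at LBA 1, which maps to cylinder 0, head 0, sector 2 under any geometry with at least two sectors per track. The sketch below uses the conventional LBA-to-CHS formula and the 3-byte MBR encoding; the function name and the 255/63 default geometry are assumptions for illustration.

```python
def lba_to_chs_bytes(lba: int, heads: int = 255, sectors: int = 63) -> bytes:
    """Encode an LBA as the 3-byte CHS field of an MBR partition entry.

    Byte 0 is the head, byte 1 holds the sector in bits 0-5 and the two
    high cylinder bits in bits 6-7, byte 2 is the low 8 cylinder bits.
    """
    cyl, rem = divmod(lba, heads * sectors)
    head, sec = divmod(rem, sectors)
    sec += 1  # CHS sectors are numbered from 1
    if cyl > 1023:
        raise ValueError("LBA not representable in CHS")
    return bytes([head, (sec & 0x3F) | ((cyl >> 2) & 0xC0), cyl & 0xFF])


if __name__ == "__main__":
    # Start of the EFI GPT protective partition: LBA 1.
    print(lba_to_chs_bytes(1).hex())
```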
|
195752 |
18-Jul-2009 |
lulf |
- Fix the issue with read access count modification on RAID-5 plexes properly. If the access counts were not increased and decreased in equal numbers by gvinum consumers, the read access count would be inconsistent with the write access count. Instead, modify the read access count with the write access count directly to prevent any inconsistencies.
Approved by: re (kib)
|
195436 |
08-Jul-2009 |
marcel |
Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c is invalid because the ioctl happens without prior open. The ioctl got introduced to provide backward compatibility for extended partitions, but it ended up not being used because it didn't work as expected. Since there are no consumers of the ioctl and the implementation is broken, the best fix is to remove the code entirely.
Spotted by: phk Approved by: re (kensmith)
|
195257 |
01-Jul-2009 |
trasz |
Fix a panic which (reportedly) can happen when unmounting a filesystem with I/O requests in flight on kernels compiled with "options INVARIANTS". Also, make it obvious it's not right to call g_valid_obj() (and macros using it, e.g. G_VALID_CONSUMER()) without topology lock held.
Approved by: re (kib) Reported by: pho
|
195195 |
30-Jun-2009 |
trasz |
Make gjournal work with kernel compiled with "options DIAGNOSTIC". Previously, it would panic immediately.
Reviewed by: pjd Approved by: re (kib)
|
194924 |
24-Jun-2009 |
lulf |
- Apply the same naming rules of LVM names as done in the LVM code itself.
PR: kern/135874
|
194811 |
24-Jun-2009 |
jhay |
Do not stop the loop when an empty or deleted directory entry is found. Rather just skip over it.
|
194433 |
18-Jun-2009 |
ivoras |
Fix tabs, slightly improve comments.
Approved by: gnn (mentor) (original) Noticed by: stas
|
194092 |
13-Jun-2009 |
ivoras |
Add support for labels derived from GPT metadata.
Approved by: gnn (mentor) Reviewed by: pjd PR: 128398 Submitted by: Marius Nuennerich < marius at nuenneri.ch >
|
193981 |
11-Jun-2009 |
luigi |
As discussed in the devsummit, introduce two fields in the struct bio to store classification information, and a hook for classifier functions that can be called by g_io_request().
This code is from Fabio Checconi as part of his GSOC work.
|
193547 |
05-Jun-2009 |
pjd |
Simplify.
|
193131 |
30-May-2009 |
dougb |
Crank the debug level necessary to display the "Label foo is removed" and "Label for provider ..." messages up from 0 to 1.
|
193066 |
29-May-2009 |
jamie |
Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible.
The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed.
Approved by: bz (mentor)
|
192808 |
26-May-2009 |
lulf |
- Unbreak 64 bit platforms by casting off_t to intmax.
|
192803 |
26-May-2009 |
lulf |
- Fix wrong print on BIO_DONE. - Use db_printf instead of printf. While here, apply this to other ddb commands as well.
Pointed out by: pjd
|
192797 |
26-May-2009 |
lulf |
- Add 'show bio' DDB command.
MFC after: 3 weeks
|
192021 |
12-May-2009 |
trasz |
Check return value of gctl_get_asciiparam().
Found with: Coverity Prevent(tm) CID: 1118
|
191990 |
11-May-2009 |
attilio |
Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option.
The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal this.
|
191856 |
06-May-2009 |
lulf |
- Split up the BIO queue into a queue for new and one for completed requests. This is necessary for two reasons: 1) In order to avoid collisions with the use of a BIO's flags set by a consumer or a provider 2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags were available, so the consumer flags of a BIO had to be misused in order to support enough flags. The new queue makes it possible to recycle the GV_BIO_DONE flag into GV_BIO_GROW. As a consequence, gvinum will now work with any other GEOM class under it or on top of it.
- Use bio_pflags for storing internal flags on downgoing BIOs, as the requests appear to come from a consumer of a gvinum volume. Use bio_cflags only for cloned BIOs. - Move gv_post_bio to be used internally for maintenance requests. - Remove some cases where flags were set without need.
PR: kern/133604
|
191855 |
06-May-2009 |
lulf |
- Fix a case where a RAID5 volume would think that it is supposed to grow a new subdisk after a parity rebuild.
|
191854 |
06-May-2009 |
lulf |
- Check if any plexes are doing internal maintenance before removing them.
|
191853 |
06-May-2009 |
lulf |
- Add forgotten KASSERT.
|
191852 |
06-May-2009 |
lulf |
- Fix a bug where the bio_data field of the wrong BIO is freed if an error occurs when doing a RAID5 request.
|
191850 |
06-May-2009 |
lulf |
- GV_BIO_RETRY is not used, and it is actually impossible with more than 8 values for bio_cflags/bio_pflags.
|
191849 |
06-May-2009 |
lulf |
- Split the queue mutex into one for the event queue and one for the BIO queue, as they do not really relate and to prepare for an additional queue to be covered by the BIO queue mutex. - Implement wrappers for fetching the next element from the event queue as well as for putting a new element into the BIO queue.
|
191787 |
04-May-2009 |
lulf |
- Make the gvinum softc invisible to userland, as it is not needed.
|
191248 |
18-Apr-2009 |
lulf |
- Remove assertion of topology lock remaining from 7.x gvinum. It is not needed, as the renaming only changes internal gvinum names and will not alter the geom topology. - The topology lock was not held when calling g_wither_geom after renaming.
|
191134 |
16-Apr-2009 |
marcel |
Precision '*' expects an int and strlen() returns a size_t. Compensate.
|
191130 |
15-Apr-2009 |
marcel |
Add a compat option to the EBR scheme that controls the naming of the partitions (GEOM_PART_EBR_COMPAT). When compatibility is enabled, changes to the partitioning are disallowed.
Remove the device name aliasing added previously to provide backward compatibility, but which in practice doesn't give us anything.
Enable compatibility on amd64 and i386.
|
190881 |
10-Apr-2009 |
lulf |
- Move out allocation part of different gvinum objects into its own routine and make use of it in the gvinum userland code.
|
190878 |
10-Apr-2009 |
thompsa |
Revert r190676,190677
The geom and CAM changes for root_hold are the wrong solution for USB design quirks.
Requested by: scottl
|
190849 |
08-Apr-2009 |
marcel |
Don't use hexadecimal in the EBR partition names, because 'a'..'f' are more commonly known as BSD partition names.
Discussed with: ivoras@
|
190677 |
03-Apr-2009 |
thompsa |
Add interleaving root hold tokens from the CAM probe to disk_create and geom provider tasting. This is needed for disk attachments that happen after threads are running in the boot process.
Tested by: rnoland
|
190676 |
03-Apr-2009 |
thompsa |
Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called in situations where sleeping isn't allowed.
|
190667 |
03-Apr-2009 |
marcel |
The 9 bytes immediately prior to the partition table can contain signatures or disk serial numbers. Don't assume those to be zero in all cases. This fixes a false negative.
Tested by: avatar@mmlab.cse.yzu.edu.tw
|
190537 |
30-Mar-2009 |
marcel |
Sharpen the saw: o PC98 uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger. The 32-bit block numbers are implicit (16-bit cylinder * 8-bit head * 8-bit sector).
|
190536 |
30-Mar-2009 |
marcel |
Sharpen the saw: o MBR uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger.
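The limit applied in these "Sharpen the saw" commits is a simple clamp on the block count. A minimal sketch, assuming a hypothetical helper name and a 512-byte default sector size:

```python
UINT32_MAX = 2**32 - 1


def usable_blocks(mediasize: int, sectorsize: int = 512) -> int:
    """Blocks a scheme with 32-bit block numbers can address."""
    return min(mediasize // sectorsize, UINT32_MAX)


if __name__ == "__main__":
    three_tib = 3 * 2**40
    # 3 TiB holds 6442450944 sectors of 512 bytes, more than 2^32-1.
    print(usable_blocks(three_tib))
```

With 512-byte sectors this clamp corresponds to just under 2 TiB of addressable space, so anything beyond that on a larger disk is simply not covered by the scheme.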
|
190535 |
30-Mar-2009 |
marcel |
Sharpen the saw: o EBR uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger. o Calculate the number of entries based on the rounded media size, rather than the raw media size.
|
190534 |
30-Mar-2009 |
marcel |
Sharpen the saw: o Don't create a GPT scheme underneath another scheme when the probe doesn't allow it.
|
190513 |
28-Mar-2009 |
lulf |
- Add files that should have been added in r190507.
|
190507 |
28-Mar-2009 |
lulf |
Import the gvinum work that has been done during and after Summer of Code 2007. The work has been under testing and fixing since then, and it is mature enough to be put into HEAD for further testing.
A lot has changed in this time, and here are the most important changes:
- Gvinum now uses a single worker thread instead of one thread for each volume and each plex. The reason for this is that the previous scheme was very complex, and was the cause of many of the bugs discovered in gvinum. Instead, gvinum now uses one worker thread with an event queue, quite similar to what is used in gmirror.
- The rebuild/grow/initialize/parity check routines no longer run in separate threads, but are run as regular I/O requests with special flags. This made it easier to support mounted growing and parity rebuild.
- Support for growing striped and raid5 plexes, meaning that one can extend the volumes for these plex types in addition to the concat type. Also works while the volume is mounted.
- Implementation of many of the missing commands from the old vinum: attach/detach, start (was partially implemented), stop (was partially implemented), concat, mirror, stripe, raid5 (shortcuts for creating volumes with one plex of these organizations).
- The parity check and rebuild no longer go between userland and kernel, meaning that the gvinum command will not stay and wait forever for the rebuild to finish. You can instead watch the status with the list command.
- Many problems with gvinum have been reported since 5.x, and some have been hard to fix due to the complicated architecture. Hopefully, it should be more stable and better handle edge cases that previously made gvinum crash.
- Failed drives no longer disappear entirely, but now leave behind a dummy drive that makes sure the original state is not forgotten in case the system is rebooted between drive failures/swaps.
- Update manpage to reflect new commands and extend it with some examples.
Sponsored by: Google Summer of Code 2007 Mentored by: le Tested by: Rick C. Petty <rick-freebsd2008 -at- kiwi-computer.com>
|
190463 |
27-Mar-2009 |
marcel |
Sharpen the saw: o BSD uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger.
|
190461 |
27-Mar-2009 |
marcel |
Sharpen the saw: o Don't create an APM scheme underneath another scheme when the probe doesn't allow it. o APM uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger.
|
190443 |
26-Mar-2009 |
marcel |
Change the priority from high to normal. This makes sure that the BSD or GPT schemes can take precedence as appropriate.
|
190423 |
25-Mar-2009 |
ivoras |
Create GEOM labels from UFS IDs, e.g. /dev/ufsid/49c97b1faa2adc43. UFS IDs are always present and can be used to identify file systems (useful if hardware devices move often).
Actually-by: pjd Approved by: gnn (mentor)
|
190232 |
22-Mar-2009 |
ivoras |
Be more explicit and complain if kernel dumps are performed on unsupported partition types. This is to help users used to the old behaviour.
Reviewed by: marcel Approved by: gnn (mentor)
|
190058 |
19-Mar-2009 |
ivoras |
Make GEOM provider names starting with "/dev/" acceptable as well as their "raw" names. While there, change the formatting of extended MSDOS partitions so that the dot (".") is not used to separate two numbers (which kind of looks like the whole is a decimal number). Use "+" instead, which also hints that the second part of the name is the offset from the start of the partition in the first part of the name. Also change the offset from decimal to hexadecimal notation, simply for aesthetic reasons and future compatibility.
GEOM_PART is the default in 8-CURRENT but not yet in 7-STABLE so this changeset can be MFC-ed without causing major problems from the second part.
Reviewed by: marcel Approved by: gnn (mentor) MFC after: 2 weeks
|
189900 |
16-Mar-2009 |
pjd |
Detach GELI providers on shutdown/reboot, which will allow providers underneath to close properly.
Reported, reviewed and tested by: guido MFC after: 1 week
|
189762 |
13-Mar-2009 |
guido |
Back out this commit while a better solution is developed.
|
189695 |
11-Mar-2009 |
nyan |
Move the PC98_[MS]ID_* defines from g_part_pc98.c to diskpc98.h.
Reviewed by: marcel
|
189660 |
11-Mar-2009 |
sam |
o disallow write to RedBoot and FIS directory partitions; these are painful to resurrect (maybe honor foot shooting bit in kern.geom_debugflags) o fix match macro so we now recognize we want to merge FIS dir with RedBoot config parameters even if we don't actually do it
|
189625 |
10-Mar-2009 |
guido |
When attaching a geli on boot make sure that it is detached upon last close. (needed for a gmirror to properly shutdown upon reboot when a geli is on top the gmirror)
|
189616 |
10-Mar-2009 |
nyan |
Restore the return statement. It was accidentally removed by rev 188429.
|
189608 |
09-Mar-2009 |
sam |
add geom_redboot, a geom module that exports RedBoot FIS partitions as named slices in dev/redboot/*
|
188899 |
21-Feb-2009 |
marcel |
o When creating the EBR scheme, set the number of entries properly. Otherwise the minimum of 1 is used and you can only insert a single partition/slice and only at sector 0 (index 1). o When adding a partition/slice, recalculate the index after the start and size of the partition/slice are adjusted to make them a multiple of the track size. Since the precheck method sets the index based on the start of the partition as provided by the user, we know that we're off by at most 1 and adjusting the index is safe.
|
188893 |
21-Feb-2009 |
marcel |
Add bootcode handling.
|
188839 |
20-Feb-2009 |
marcel |
Provide compatibility symlink for logical partitions: 1. Extend geom_dev by having it create the symlink (i.e. call make_dev_alias) based on the DIOCGPROVIDERALIAS ioctl. In this way the functionality is generic and thus usable by any geom/provider. 2. Have g_part handle said ioctl through the devalias method, so that it's under control of the scheme itself. By design the alias will not be created for newly added partitions.
|
188838 |
20-Feb-2009 |
marcel |
Fix an infinite loop created when the last logical partition is removed.
|
188723 |
17-Feb-2009 |
marcel |
Add a default implementation for pre-check. It should always succeed if not implemented.
Pointy hat: marcel
|
188705 |
17-Feb-2009 |
marcel |
Remove gpt_offset and related code. It was introduced for use by the BSD scheme, ended up not to be needed. Remove to avoid abuse and to keep the bloat to a minimum.
|
188667 |
16-Feb-2009 |
marcel |
Add support to add, delete and modify logical partitions, as well as to create and destroy the extended partitioning scheme. In other words: full support.
|
188659 |
15-Feb-2009 |
marcel |
Add method precheck to the g_part interface. The precheck method allows schemes to reject the ctl request, pre-check the parameters and/or modify/set parameters. There are 2 use cases that triggered the addition:
1. When implementing a R/O scheme, deletes will still happen to the in-memory representation. The scheme is not involved in that operation. The pre-check method can be used to fail the delete up-front. Without this the write to disk will typically fail, but at that time the delete already happened.
2. The EBR scheme uses a linked list to record slices. There's no index. The EBR scheme defines the index as a function of the start LBA of the partition. The add verb picks an index for the range and then invokes the add method of the scheme to fill in the blanks. It is too late for the add method to change the index. The pre-check is used to set the index up-front. This also (silently) overrides/nullifies any (pointless) user-specified index value.
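The first use case can be sketched in Python as follows. The class, function and error names are hypothetical stand-ins, not the g_part interface; the only point illustrated is that the pre-check runs before the in-memory table is modified, and that a scheme without a pre-check is treated as always succeeding.

```python
class ReadOnlyScheme:
    """Hypothetical scheme whose on-disk table cannot be written."""

    def precheck(self, verb, params):
        # Reject modifying verbs before the in-memory table is touched.
        if verb in ("add", "delete", "modify"):
            return "EROFS"
        return None


def handle_request(scheme, table, verb, params):
    """Run the scheme's pre-check first; only then change the in-memory table."""
    precheck = getattr(scheme, "precheck", None)
    # A scheme without a pre-check method is treated as always succeeding.
    error = precheck(verb, params) if precheck else None
    if error is not None:
        return error
    if verb == "delete":
        table.pop(params["index"])
    return None
```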
|
188492 |
11-Feb-2009 |
lulf |
- Use the correct argument when determining the buffer size.
PR: kern/131575 MFC after: 2 days
|
188429 |
10-Feb-2009 |
imp |
Fix g_part_dumpconf and g_part_name prototypes.
Submitted by: marcel@
|
188354 |
09-Feb-2009 |
marcel |
Add the EBR scheme. The EBR scheme supports the Extended Boot Records found inside extended partitions and used to create logical partitions. At this time write/modify support is not (yet) present. The EBR and MBR schemes both check the parent scheme. The MBR will back-off when nested under another MBR, whereas the EBR only nests under a MBR.
|
188352 |
08-Feb-2009 |
marcel |
Allow gpe_offset to be set by the scheme. When gpe_offset is zero, or invalid, initialize it to the start of the partition. Adjust the mediasize when the offset lies somewhere inside the partition.
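A rough Python sketch of the offset handling described here. The function name and the exact definition of an "invalid" offset (anything outside the partition) are assumptions made for illustration; only the fallback to the partition start and the mediasize adjustment come from the message.

```python
def provider_extent(start_lba, end_lba, sector_size, scheme_offset=0):
    """Return (offset, mediasize) in bytes for a partition's provider."""
    first_byte = start_lba * sector_size
    last_byte = (end_lba + 1) * sector_size  # one past the final byte
    offset = scheme_offset
    # Zero or out-of-range offsets fall back to the start of the partition.
    if offset < first_byte or offset >= last_byte:
        offset = first_byte
    # An offset inside the partition shrinks the usable media size.
    return offset, last_byte - offset
```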
|
188329 |
08-Feb-2009 |
marcel |
o Add the "PART::scheme" attribute that returns the name of the underlying partitioning scheme.
o Put the start and end of the partition in the XML configuration. The start and end are the LBAs of the first and last sector (resp.) of the partition. They are currently identical to the offset and size attributes, which describe the partition as an offset and size in bytes, but may not in the future. The start and end will be used for the logical partition boundaries and may include metadata. The offset and size will always represent the useful storage space within the partition. Typically these two notions are the same, but for logical partitions in an extended partition, the EBR is more naturally treated as being part of the partition.
|
188303 |
08-Feb-2009 |
imp |
Fix g_part_*dumpconf to return void to match kobj definition. Fix g_part_*name to return a const char * rather than a char *.
|
188054 |
03-Feb-2009 |
marcel |
In g_handleattr(), set bp->bio_completed also for the case where len is 0. Otherwise g_getattr() will never succeed when it is handled by g_handleattr_str().
|
187973 |
01-Feb-2009 |
marcel |
Constify val in g_handleattr() and str in g_handleattr_str(). This allows passing string constants to g_handleattr_str().
|
187672 |
24-Jan-2009 |
ed |
Remove unused unrhdr from GEOM character device module.
Now that make_dev() doesn't require unit numbers to be unique, there is no need to use an unrhdr here to generate the numbers. Remove the entire init-routine, because it is optional.
|
187053 |
11-Jan-2009 |
trasz |
Prevent a panic that happens on SMP machines when removing a disk with many writes queued up.
Reviewed by: phk, scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
186823 |
06-Jan-2009 |
marius |
- Don't enforce an upper-bound to the number of sectors or heads, allowing the full 16-bit width of the corresponding fields in the VTOC8 label to be used. The removed limits basically only held true for providers labeled using the synthetic geometry provided by cam_calc_geometry(9) but neither SCSI disks labeled with Solaris nor sufficiently large ATA disks.
- Given that providers (originally) labeled with Solaris typically use the native geometry as reported by the target while FreeBSD typically uses a synthetic one, put the message complaining about mismatching geometries between what the label indicates and what GEOM thinks the provider has, which we generally can't help, under bootverbose in order to not unnecessarily scare users.
- For informational purposes add the non-matching values to the message complaining about them, similar to what r186501 did for g_part_bsd_read() except also indicating the origin of the values.
- Make it clear that the messages emitted by this code refer to the VTOC8 support rather than to another existing scheme or to VTOC32.
|
186807 |
06-Jan-2009 |
marcel |
Don't enforce an upper-bound to the number of sectors or heads that the provider has. The limits we imposed were PC BIOS specific and not always applicable.
|
186733 |
04-Jan-2009 |
marcel |
Improve probing.
o Don't check the dummy fields.
o The entry is unused if either dp_mid is 0 or dp_sid is 0.
o The start or end cylinder cannot be 0.
o The start CHS cannot be equal to the end CHS.
Submitted by: nyan
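The probing rules listed above can be sketched in Python. The function name, the tuple representation of CHS values and the three result strings are hypothetical; only the three rules themselves come from the message.

```python
def pc98_entry_state(dp_mid, dp_sid, start_chs, end_chs):
    """Classify one PC98 entry as 'unused', 'valid' or 'invalid'.

    start_chs and end_chs are (cylinder, head, sector) tuples.
    """
    # The entry is unused if either identifier is zero.
    if dp_mid == 0 or dp_sid == 0:
        return "unused"
    # Neither the start nor the end cylinder may be zero.
    if start_chs[0] == 0 or end_chs[0] == 0:
        return "invalid"
    # A partition cannot start and end at the same CHS address.
    if start_chs == end_chs:
        return "invalid"
    return "valid"
```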
|
186517 |
27-Dec-2008 |
lulf |
- Fix an issue with access permissions to underlying disks used by a gvinum plex. If the plex is a raid5 plex, and is being written to, parity data might have to be read from the underlying disks, requiring them to be opened for reading as well as writing.
MFC after: 1 week
|
186501 |
26-Dec-2008 |
obrien |
When the geometry does not match the label, print out the values.
|
186188 |
16-Dec-2008 |
trasz |
Implement g_vfs_orphan(). Without it, the filesystem never closes the device, which means refcount on periph drivers never drops, which means cam_sim_free() never returns, which results in umass sleeping there ad infinitum.
Submitted by: pjd Reviewed by: scottl, pjd Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
185768 |
08-Dec-2008 |
lulf |
- Add missing word in comment.
|
185693 |
06-Dec-2008 |
trasz |
Make it possible to use gjournal for the root filesystem. Previously, an unclean shutdown would make it impossible to mount rootfs at boot.
PR: kern/128529 Reviewed by: pjd Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
185518 |
01-Dec-2008 |
ivoras |
Trivial patch to show on which geom the error has been detected.
Submitted by: Rick C. Petty Approved by: gnn (mentor) MFC after: 1 month
|
185497 |
01-Dec-2008 |
marcel |
Allow boot code to be smaller than what the scheme expects. This effectively changes the boot code size to be an upper bound and makes the interface more flexible.
|
185327 |
26-Nov-2008 |
marcel |
Allow dumpon to a partition of type FS_UNUSED as well.
|
185318 |
25-Nov-2008 |
lulf |
- Fix a potential NULL pointer reference. Note that this should not happen in practice, but it is a good programming practice and allows the kernel to not depend on userland correctness.
- While there, make sizeof usage match the rest of the code.
Found with: Coverity Prevent(tm) CID: 660, 662
|
185309 |
25-Nov-2008 |
lulf |
- Fix a potential NULL pointer reference. Note that this cannot happen in practice, but it is a good programming practice nonetheless and it allows the kernel to not depend on userland correctness.
Found with: Coverity Prevent(tm) CID: 655-659, 664-667
|
185048 |
18-Nov-2008 |
marcel |
Partition type FS_UNUSED does not mean the partition entry is unused. Unused partition entries have a partition size of zero. Therefore, partitions can have type FS_UNUSED.
MFC after: 3 days
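As a minimal Python illustration of the distinction drawn here: an entry counts as in use when its size is non-zero, regardless of its type. The helper name and the tuple layout are hypothetical, and the numeric value of FS_UNUSED is assumed for the example.

```python
FS_UNUSED = 0  # assumed numeric value of the type, for illustration only


def used_entries(entries):
    """entries: list of (p_size, p_fstype); keep those with a non-zero size."""
    # The type is deliberately ignored: FS_UNUSED entries with a size are kept.
    return [(size, fstype) for size, fstype in entries if size != 0]
```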
|
184734 |
06-Nov-2008 |
marcel |
Fix a panic caused by a corrupted table when the header is still valid. We were checking the state of the header and not the table.
PR: 119868 Based on a patch from: Jaakko Heinonen <jh@saunalahti.fi> MFC after: 1 week
|
184554 |
02-Nov-2008 |
attilio |
Improve VFS locking:
- Implement real draining for vfs consumers by not relying on the mnt_lock and using instead a refcount in order to keep track of lock requesters.
- Due to the change above, remove the mnt_lock lockmgr because it is now useless.
- Due to the change above, vfs_busy() is no longer linked to a lockmgr. Change its KPI accordingly by removing the interlock argument and defining 2 new flags for it: MBF_NOWAIT, which basically replaces the LK_NOWAIT of the old version (which was unlinked from the lockmgr already), and MBF_MNTLSTLOCK, which provides the ability to drop the mountlist_mtx once the mnt interlock is held (ability still desired by most consumers).
- The stub used in vfs_mount_destroy(), which allows overriding the mnt_ref if running for more than 3 seconds, makes it totally useless. Remove it, as it was meant to work in older versions. If a problem of "refcount held never going away" should appear, we will need to fix it properly rather than trust such a hackish solution.
- Fix a bug where returning (with an error) from dounmount() was still leaving the MNTK_MWAIT flag on even if the waiters were actually woken up. Only one place in vfs_mount_destroy() is left because it is going to recycle the structure in any case, so it doesn't matter.
- Remove the markercnt refcount as it is useless.
This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and __FreeBSD_version will be modified accordingly.
Discussed with: kib Tested by: pho
|
184552 |
02-Nov-2008 |
imp |
Add support for reading Tivo Series 1 partitioning. This likely needs a little refinement, but is good enough to commit as is.
# Should look to see if I should move swab(3) into the kernel or just # provide the unoptimized routine here.
Reviewed by: marcel@
|
184499 |
31-Oct-2008 |
kib |
Revert r184136. Instead, push the check for crashdumpmap overflow into the MD i386 and amd64 dump code.
Requested by: jhb Retested by: pho MFC after: 3 days (+ 176304 + 184136)
|
184292 |
26-Oct-2008 |
lulf |
- Import macros used in gmirror for printing gvinum debug messages and making the output more standardized.
- Add a sysctl to set the verbosity of the debug messages.
- While there, fixup typos and wording in the messages.
|
184264 |
25-Oct-2008 |
marcel |
Invalid BSD disklabels have been created by sysinstall and are possibly still being created. The d_secperunit field contains the number of sectors of the disk and not of the slice/partition to which the disklabel applies. Rather than reject the disklabel, we now silently adjust the field. Existing code, like bsdlabel(8), does not seem to check the label that extensively and seems to adjust fields as a side-effect as well. In other words, it's not that important apparently, so gpart should not be too strict about it.
Reported by: nyan@ Reported by: Andriy Gapon <avg@icyb.net.ua>
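The lenient behaviour can be sketched in Python as below. The function name is hypothetical and the clamp-to-provider-size policy is an assumption about what "silently adjust" means; the message itself does not spell out the adjusted value.

```python
def adjust_secperunit(d_secperunit, provider_sectors):
    """Return a usable d_secperunit instead of rejecting an oversized one."""
    # A value larger than the provider describes the whole disk, not the slice.
    if d_secperunit > provider_sectors:
        return provider_sectors
    return d_secperunit
```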
|
184151 |
22-Oct-2008 |
marcel |
Allow dumps to partitions with a tag of 0. The legacy sunlabel implementation in FreeBSD does not use VTOC information and as such has no partition types.
|
184136 |
21-Oct-2008 |
kib |
Do not overflow crashdumpmap.
Reported and tested by: pho Reviewed by: jhb MFC after: 1 week
|
184069 |
20-Oct-2008 |
marcel |
The active and bootable flags are not part of the type. Export the active and bootable flags as attributes in the configuration XML and allow them to be manipulated with the set/unset commands.
Since libdisk treats the flags as part of the partition type, preserve behavior by keeping them included in the configuration text.
|
183754 |
10-Oct-2008 |
attilio |
Remove the unneeded struct thread argument from the bufobj interface. In particular, the KPI of the following functions is modified:
- bufobj_invalbuf()
- bufsync()
as well as that of the BO_SYNC() "virtual method" of the buffer objects set. The main consumers of the bufobj functions are affected by this change too and, in particular, the functions which changed their KPI are:
- vinvalbuf()
- g_vfs_close()
Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit.
As a side note, please consider just temporary the 'curthread' argument passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP
Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
183546 |
02-Oct-2008 |
lulf |
- Use the new gv_write_header function to write out the header when removing a drive to make sure that the header is in the correct format.
|
183545 |
02-Oct-2008 |
lulf |
- Remove unneeded macro since the config_length field in the header was changed to 64 bit in the new format.
|
183514 |
01-Oct-2008 |
lulf |
- Make gvinum header on-disk structure consistent on all platforms by storing the gvinum header in fields of fixed size and in a big endian byte order rather than the size and byte order of the actual platform.
Note that the change is backwards compatible with the old gvinum configuration format, but will save the configuration in the new format when the 'saveconfig' command is executed.
Submitted by: Rick C. Petty <rick-freebsd -at- kiwi-computer.com>
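The idea of a platform-independent header can be shown with Python's struct module. The layout below (a 64-bit magic, a 64-bit config_length and a 32-byte name) is a made-up example, not the real gvinum header; the ">" prefix selects big-endian byte order with fixed field sizes and no native padding.

```python
import struct

# Hypothetical layout: 64-bit magic, 64-bit config_length, 32-byte drive name.
HEADER = struct.Struct(">QQ32s")


def pack_header(magic, config_length, name):
    """Serialize the header identically on every platform."""
    return HEADER.pack(magic, config_length, name.encode("ascii"))


def unpack_header(raw):
    """Parse a header produced by pack_header()."""
    magic, config_length, name = HEADER.unpack(raw)
    return magic, config_length, name.rstrip(b"\0").decode("ascii")
```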
|
183455 |
29-Sep-2008 |
marcel |
Return G_PART_PROBE_PRI_HIGH instead of G_PART_PROBE_PRI_NORM if the probe succeeds. This guarantees that the BSD scheme wins over the MBR scheme when MBR gets to probe first. Build- or link-time conditions can cause schemes to end up in the linker set in a different order. Normally BSD is before MBR in the linker set and as such gets to probe first. But typically when the kernel gets rebuilt or relinked, this can change.
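Why the priority matters can be seen in a short Python sketch of "highest priority wins, first prober wins ties". The numeric values and the function are illustrative assumptions, not the g_part definitions; None stands for a failed probe.

```python
PRI_LOW, PRI_NORM, PRI_HIGH = -10, -5, 0  # illustrative values only


def pick_scheme(results):
    """results: list of (name, priority or None) in linker-set order."""
    best = None
    for name, pri in results:
        if pri is None:
            continue  # probe failed
        # Strictly greater: on a tie the scheme that probed first is kept.
        if best is None or pri > best[1]:
            best = (name, pri)
    return best[0] if best else None
```

With equal priorities the outcome depends on the order of the list; once BSD returns the higher value, it is selected in either order.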
|
183454 |
29-Sep-2008 |
marcel |
Insert the null scheme at the head. This does not change any functionality, but creates an invariant: the first element on the list is always the null scheme.
|
183420 |
27-Sep-2008 |
marcel |
Export the partition name in the conftxt and confxml output. The conftxt output is used by libdisk, and the confxml output is used by gpart itself (gpart show -l).
Submitted by: nyan@
|
183419 |
27-Sep-2008 |
marcel |
Hold the root mount while we're tasting. It is possible that a nested partition (typically the BSD disklabel) is not done tasting while the root file system is being mounted. While this is rare, it's still possible.
|
183410 |
27-Sep-2008 |
marcel |
Allow 255 sectors/track for the BSD disklabel. The previous limit of 63 sectors/track is too PC BIOS specific. On pc98, where the BSD disklabel is used as well, 255 sectors/track is not uncommon.
Submitted by: nyan@
|
183381 |
26-Sep-2008 |
ed |
Remove unit2minor() use from kernel code.
When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macros. The unit2minor() and minor2unit() macros were no-ops.
We'd better not remove these four macros from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit().
Reviewed by: kib
|
183146 |
18-Sep-2008 |
sbruno |
Just a fixup for a KTRACE message I stumbled upon many moons ago.
Reviewed by: Scott Long MFC after: 2 days
|
182843 |
07-Sep-2008 |
lulf |
- Add a new ioctl for getting the provider name of a geom provider.
- Add a routine for looking up a device and checking if it is a valid geom provider given a partial or full path to its device node.
Reviewed by: phk Approved by: pjd (mentor)
|
182798 |
05-Sep-2008 |
rpaulo |
Fix build.
|
182797 |
05-Sep-2008 |
rpaulo |
Keep entries sorted.
|
182793 |
05-Sep-2008 |
rpaulo |
Include the vendor in the partition name.
|
182784 |
05-Sep-2008 |
rpaulo |
Detect Apple HFS GPT slices.
|
182542 |
31-Aug-2008 |
attilio |
Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions.
Manpages are updated accordingly.
Tested by: Diego Sardina <siarodx at gmail dot com>
|
181803 |
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
181646 |
12-Aug-2008 |
pjd |
Style(9).
|
181463 |
09-Aug-2008 |
des |
Add sbuf_new_auto as a shortcut for the very common case of creating a completely dynamic sbuf.
Obtained from: Varnish MFC after: 2 weeks
|
180717 |
22-Jul-2008 |
peter |
Trivial commit to attempt to diagnose a svn problem. Add comment that Tivo disks are APM, but do not have a DDR record.
|
180638 |
20-Jul-2008 |
pjd |
Clear passphrase buffer after use.
Submitted by: Fabian Keil <fk@fabiankeil.de> (a bit different version)
|
180612 |
19-Jul-2008 |
lulf |
- When renaming a drive, also set the drive name in the gvinum header.
PR: kern/125632 Approved by: pjd (mentor) MFC after: 3 days
|
180451 |
11-Jul-2008 |
lulf |
- Fix a logic error when updating plex configuration.
Approved by: pjd (mentor)
|
180291 |
05-Jul-2008 |
rwatson |
Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates.
Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted.
MFC after: 3 weeks
|
180120 |
30-Jun-2008 |
delphij |
Avoid NULL dereference.
Reviewed by: ivoras
|
179897 |
20-Jun-2008 |
lulf |
- Fix spelling errors.
Approved by: kib (mentor) PR: kern/124788 Submitted by: Hywel Mallett <Hywel -at- hmallett.co.uk>
|
179853 |
18-Jun-2008 |
marcel |
Add the set and unset verbs used to set and clear attributes for partition entries. Implement the setunset method for the MBR scheme to control the active flag.
|
179763 |
12-Jun-2008 |
marcel |
Finish the support for partition labels and add it to the XML.
|
179756 |
12-Jun-2008 |
marcel |
Add the raw partition type to the XML.
|
179755 |
12-Jun-2008 |
marcel |
Add the raw partition type to the XML.
|
179752 |
12-Jun-2008 |
marcel |
Add the raw partition type to the XML.
|
179751 |
12-Jun-2008 |
marcel |
Add the raw partition type to the XML.
|
179750 |
12-Jun-2008 |
marcel |
Add the raw partition type to the XML.
|
179748 |
12-Jun-2008 |
marcel |
Add the partition label and the raw partition type to the XML.
|
179413 |
29-May-2008 |
ed |
Remove the distinction between device minor and unit numbers.
Even though we got rid of device major numbers some time ago, device drivers still need to provide unique device minor numbers to make_dev(). These numbers are only used inside the kernel. They are not related to device major and minor numbers which are visible in devfs. These are actually based on the inode number of the device.
It would eventually be nice to remove minor numbers entirely, but we don't want to be too aggressive here.
Because the 8-15 bits of the device number field (si_drv0) are still reserved for the major number, there is no 1:1 mapping of the device minor and unit numbers. Because this is now unused, remove the restrictions on these numbers.
The MAXMAJOR definition was actually used for two purposes. It was used to convert both the userspace and kernelspace device numbers to their major/minor pair, which is why it is now named UMINORMASK.
minor2unit() and unit2minor() have now become useless. Both minor() and dev2unit() now serve the same purpose. We should eventually remove some of them, at least turning them into macros. If devfs would become completely minor number unaware, we could consider using si_drv0 directly, just like si_drv1 and si_drv2.
Approved by: philip (mentor)
|
179206 |
22-May-2008 |
lulf |
- Recognize the 'volume' parameter when creating a plex.
PR: kern/75632 Approved by: pjd (mentor) MFC after: 1 day
|
179097 |
18-May-2008 |
pjd |
- Assert that we don't send new provider event for a provider which has G_PF_WITHER flag set.
- Fix typo in assertion condition (sorry, but I forgot who reported that).
|
179094 |
18-May-2008 |
pjd |
Play nice with DDB pager.
Educated by: jhb's BSDCan presentation
|
178444 |
23-Apr-2008 |
marcel |
Implement the G_PART_DUMPCONF method for all 6 schemes. Also call the method for the (indent == NULL) case (i.e. the kern.geom.conftxt sysctl). The purpose is to extend the conftxt output with scheme-specific fields which can be used by libdisk. In particular, have the schemes dump the xs and xt fields, which contain the backward compatible values for class type and partition type. This allows libdisk to work with the legacy slicers as well as with gpart and helps/promotes migration.
|
178180 |
13-Apr-2008 |
marcel |
Add the bootcode verb for installing boot code. Boot code is supported for the MBR, GPT and PC98 schemes, where GPT installs boot code into the PMBR.
|
177713 |
29-Mar-2008 |
marcel |
Change the order from SI_ORDER_FIRST to SI_ORDER_ANY (within SI_SUB_DRIVERS) to avoid loading schemes before all the GEOM classes have been loaded and initialized. Otherwise we may end up using mutexes that haven't been initialized (due to g_retaste() posting an event).
|
177692 |
28-Mar-2008 |
marcel |
Add support for PC-9800 partition tables.
|
177681 |
28-Mar-2008 |
marcel |
When retasting, wither any existing GEOMs of the same class. This allows the class to create a different GEOM for the same provider as well as avoid that we end up with multiple GEOMs of the same class with the same name.
For example, when a disk contains a PC98 partition table but only MBR is supported, then the partition table can be treated as a MBR. If support for PC98 is later loaded as a module, the MBR scheme is pre-empted for the PC98 scheme as expected.
|
177510 |
23-Mar-2008 |
marcel |
Redefine G_PART_SCHEME_DECLARE() from populating a private linker set to declaring a proper module. The module event handler is part of the gpart core and will add the scheme to an internal list on module load and will remove the scheme from the internal list on module unload. This makes it possible to dynamically load and unload partitioning schemes.
|
177509 |
23-Mar-2008 |
marcel |
Add g_retaste(), which given a class will present all non-open providers to it for tasting. This is useful when the class, through means outside the scope of GEOM, can claim providers previously unclaimed.
The g_retaste() function posts an event which is handled by the g_retaste_event().
Event suggested by: phk
|
177345 |
18-Mar-2008 |
lulf |
- Fix a memory leak when re-discovering a gvinum configuration.
Approved by: pjd (mentor) MFC after: 1 week
|
176718 |
02-Mar-2008 |
marcel |
Add support for VTOC8 labels (aka sun disk labels). When a label does not have VTOC information about the partitions, it will be created. This is because the VTOC information is used for the partition type and FreeBSD's sunlabel(8) does not create nor use VTOC information. For this purpose, new tags have been added to support FreeBSD's partition types.
|
176672 |
29-Feb-2008 |
marcel |
Follow-up improvements to the handling of false positives: If the partition table is empty, check to see if we have something that looks sufficiently like a BPB. On non-i386 machines, the boot sector typically doesn't contain boot code; the end of the boot sector is all zeroes. This is also where the partition table is for MBRs. We only check the sector size and cluster size, as that seems to be the most reliable across implementations, BPB versions and platforms.
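A Python sketch of such a BPB plausibility check follows. The byte offsets (bytes per sector at offset 11, sectors per cluster at offset 13) are the standard FAT boot sector layout; the accepted ranges and the function name are assumptions for illustration and may not match the committed code.

```python
def looks_like_bpb(sector):
    """Heuristic: does a boot sector carry a plausible BPB?"""

    def power_of_two(n):
        return n > 0 and (n & (n - 1)) == 0

    bytes_per_sector = int.from_bytes(sector[11:13], "little")
    sectors_per_cluster = sector[13]
    # Assumed bounds: 512..4096 bytes per sector, 1..128 sectors per cluster.
    if not (512 <= bytes_per_sector <= 4096 and power_of_two(bytes_per_sector)):
        return False
    return 1 <= sectors_per_cluster <= 128 and power_of_two(sectors_per_cluster)
```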
|
176650 |
28-Feb-2008 |
marcel |
Better handle false positives. The MBR differs from the boot sector only because there's a partition table where the boot sector has boot code. Boot sectors without boot code look like a MBR for all practical purposes. This change adds a check for the partition table and fails the probe when it's obviously invalid. The assumption being that the sector contains a boot sector and not a MBR. More checks are needed to distinguish a boot sector without boot code from a (empty) MBR.
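One way to detect an "obviously invalid" partition table is sketched below in Python. The criterion used (every entry's flag byte must be 0x00 or 0x80) is an assumption about what the check looks at; the table location (offset 446, four 16-byte entries) is the standard MBR layout.

```python
def table_obviously_invalid(sector):
    """True if any of the four entries has a flag byte other than 0x00 or 0x80."""
    for i in range(4):
        flag = sector[446 + 16 * i]
        # Boot code occupying this area rarely satisfies the constraint.
        if flag not in (0x00, 0x80):
            return True
    return False
```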
|
176419 |
20-Feb-2008 |
thompsa |
geom_lvm(4) is now known as geom_linux_lvm(4).
|
176417 |
20-Feb-2008 |
thompsa |
Add a geom class to map Linux LVM logical volumes.
The logical disks will appear as /dev/lvm/<vol group>-<logical vol>, for instance /dev/lvm/vg0-home. G_LINUX_LVM currently supports linear stripes with segments on multiple physical disks. The metadata is read only, logical volumes can not be allocated or resized.
Reviewed by: Ivan Voras
Previously known as geom_lvm(4), rename requested by des, phk.
|
176304 |
15-Feb-2008 |
scottl |
Teach the dump and minidump code to respect the maxiosize attribute of the disk; the hard-coded assumption of 64K doesn't work in all cases.
|
176183 |
11-Feb-2008 |
thompsa |
Unbreak build, size_t is larger on 64bit platforms.
|
176166 |
11-Feb-2008 |
thompsa |
Add a geom class to map Linux LVM logical volumes.
The logical disks will appear as /dev/lvm/<vol group>-<logical vol>, for instance /dev/lvm/vg0-home. GLVM currently supports linear stripes with segments on multiple physical disks. The metadata is read only, logical volumes can not be allocated or resized.
Reviewed by: Ivan Voras
|
174882 |
24-Dec-2007 |
marcel |
Various fixes:
o BSD disklabels have relative offsets. Even for the BSD in MBR slice setup, except when the mbroffset ioctl is supported. Since we don't support that ioctl, bsdlabel(8) expects relative offsets. So, when reading an existing disklabel, correct for disklabels that mistakenly have the mbroffset offsets.
o Don't take the geometry seriously, because it's untrustworthy. We do expect the numbers to be within range. This means that the secperunit field will not be computed from secpercyl and ncyls, but simply is the mediasize in sectors.
o Don't enforce partitions to be aligned to track boundaries. The default label, constructed by bsdlabel(8), puts partition a at offset BBSIZE bytes, which commonly means sector 16.
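The offset correction in the first item might look roughly like the Python sketch below. It assumes that the raw partition (index 2, i.e. 'c') records the base that was mistakenly added and that unused entries have size 0; both the detection rule and the function name are illustrative guesses, not the committed logic.

```python
def to_relative(partitions, raw_index=2):
    """partitions: list of (offset, size) in sectors; returns relative offsets."""
    base = partitions[raw_index][0]
    if base == 0:
        return list(partitions)  # offsets are already relative
    fixed = []
    for offset, size in partitions:
        if size == 0:
            fixed.append((offset, size))  # unused entry, leave untouched
        elif offset < base:
            raise ValueError("partition starts before the raw partition")
        else:
            fixed.append((offset - base, size))
    return fixed
```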
|
174674 |
16-Dec-2007 |
phk |
Chop DIOCGDELETE from userland up in 1024 sector chunks to give geom_disk or any other bio chopping geom a reasonable size of work.
Check for delivered signals between chunks, because the request size and service time is unbounded.
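The chunking can be sketched in Python as below. The 1024-sector chunk size comes from the message; the function names, the callbacks and the "EINTR" return value are hypothetical stand-ins for the kernel mechanics.

```python
def delete_chunks(offset, length, sector_size, max_sectors=1024):
    """Yield (offset, length) pieces of at most max_sectors sectors each."""
    chunk = max_sectors * sector_size
    while length > 0:
        n = min(chunk, length)
        yield offset, n
        offset += n
        length -= n


def delete_range(offset, length, sector_size, issue, signal_pending):
    """Issue one delete per chunk, stopping early if a signal is pending."""
    for off, n in delete_chunks(offset, length, sector_size):
        # Checked between chunks because the total request size is unbounded.
        if signal_pending():
            return "EINTR"
        issue(off, n)
    return None
```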
|
174669 |
16-Dec-2007 |
phk |
Don't limit BIO_DELETE requests to MAXPHYS, they perform no data transfers, so they are not subject to the VM system limitation.
|
174500 |
09-Dec-2007 |
marcel |
Decode as many or as few partition entries as the label claims there are. We have already checked it against the caller provided maxpart.
|
174499 |
09-Dec-2007 |
marcel |
Fix a bug in the add verb, where we failed to keep the list of partitions in index-order. This is assumed by the APM, MBR and BSD partitioning schemes.
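Keeping the list in index order on insertion can be sketched in Python with bisect. The list-of-tuples representation and the function name are hypothetical; the kernel uses its own linked list.

```python
import bisect


def insert_entry(entries, index, entry):
    """Insert (index, entry) into a list kept sorted by index."""
    indices = [i for i, _ in entries]
    pos = bisect.bisect_left(indices, index)
    if pos < len(entries) and entries[pos][0] == index:
        raise ValueError("index already in use")
    # Inserting at the bisect position preserves index order.
    entries.insert(pos, (index, entry))
```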
|
174465 |
08-Dec-2007 |
marcel |
Internal partitions can not be deleted or modified.
|
174456 |
08-Dec-2007 |
marcel |
Skip internal partitions in the check for (user) partitions for the destroy command. Previously a freshly created BSD disklabel could not be destroyed because of the internal partition.
|
174437 |
08-Dec-2007 |
marcel |
Add support for FS_ZFS.
|
174347 |
06-Dec-2007 |
jhb |
Only attach to a GPT partition if it has the GPT_ENT_TYPE_FREEBSD type.
XXX: This only works currently with GEOM_GPT which only exists in 6.x. XXX: I didn't add 'mbroffset' support for a GPT partition holding a BSD label as I'm not sure if they use relative or absolute offsets.
MFC after: 3 days
|
174326 |
06-Dec-2007 |
marcel |
Add a BSD disklabel backend to g_part:
o Disklabels can have between 8 and 20 partitions (inclusive).
o No device special file is created for the raw partition.
o Switch ia64 to use this backend.
o No support for boot code yet.
|
173746 |
19-Nov-2007 |
jb |
On some arches, openssl is built with OPENSSL_NO_CAMELLIA, so the code here needs to depend on that too.
|
173677 |
16-Nov-2007 |
maxim |
o s/resiserfs_sb/reiserfs_sb/.
Submitted by: Ighighi
|
173001 |
26-Oct-2007 |
pjd |
Save stack only when KTR_GEOM is both compiled into the kernel and enabled in debug.ktr.mask. Because saving stack is very expensive, it's better only to do it when one really wants to.
Reported by: Dan Nelson
|
172940 |
24-Oct-2007 |
jhb |
First cut at support for booting a GPT labeled disk via the BIOS bootstrap on i386 and amd64 machines. The overall process is that /boot/pmbr lives in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot, /boot/gptboot lives in its own dedicated GPT partition with a new "FreeBSD boot" type. This partition does not have a fixed size in that /boot/pmbr will load the entire partition into the lower 640k. However, it is limited in that it can only be 545k. That's still a lot better than the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like boot2 in that it reads /boot.config and loads up /boot/loader. Some more details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using /boot/pmbr and /boot/gptboot. Note that the disk must have some free space for the boot partition.
- This required exposing the backend of the 'add' function as a gpt_add_part() function to the rest of gpt(8). 'boot' uses this to create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O unlike boot1 since /boot/pmbr loads all of gptboot up front. The C portion of gptboot (gptboot.c) has been repocopied from boot2.c. The primary changes are to parse the GPT to find a root filesystem and to use 64-bit disk addresses. Currently gptboot assumes that the first UFS partition on the disk is the / filesystem, but this algorithm will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables. GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month Discussed with: marcel (some things might still change, but am committing what I have so far)
|
172857 |
21-Oct-2007 |
marcel |
Add the freebsd-zfs alias. Both APM and GPT have ZFS partition types.
|
172836 |
20-Oct-2007 |
julian |
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
|
172354 |
27-Sep-2007 |
pjd |
When orphaning a provider, cancel events related to it. Without this change the following situation was possible:
1. Provider is orphaned from within class' access() method on last write close - orphan provider event is sent.
2. GEOM detects last write close on a provider and sends new provider event.
3. g_orphan_register() is called, and calls all orphan methods of attached consumers.
4. New provider event is executed on orphaned provider, all classes can taste already orphaned provider, and some may attach consumers to it. Those consumers will never go away, because the g_orphan_register() was already called.
We end up with a zombie provider.
With this change, at step 3, we will cancel new provider event.
How to repeat this problem:
# mdconfig -a -t malloc -s 10m
# geli init -i 0 md0
# geli attach md0
# newfs -L test /dev/md0.eli
# mount /dev/ufs/test /mnt/tmp
# geli detach -l md0.eli
# umount /mnt/tmp
# glabel status
     Name  Status  Components
 ufs/test     N/A  N/A
Reviewed by: phk Approved by: re (kensmith)
|
172304 |
23-Sep-2007 |
pjd |
LINT compiled just fine for me, but it seems it breaks tinderbox's way of compiling LINT.
Approved by: re (implicitly)
|
172302 |
23-Sep-2007 |
pjd |
Bring in the GEOM Virtualisation class, which allows to create huge GEOM providers with limited physical storage and add physical storage as needed.
Submitted by: Ivan Voras Sponsored by: Google Summer of Code 2006 Approved by: re (kensmith)
|
172031 |
01-Sep-2007 |
pjd |
Add support for Camellia encryption algorithm.
PR: kern/113790 Submitted by: Yoshisato YANAGISAWA <yanagisawa@csg.is.titech.ac.jp> Approved by: re (bmah)
|
170897 |
17-Jun-2007 |
marcel |
Have gpart synthesize a disk geometry if the underlying provider doesn't have one. Some partitioning schemes, as well as file systems, operate on the geometry and without it such schemes (e.g. MBR) and file systems (e.g. FAT) can't be created. This is useful for memory disks.
|
170651 |
13-Jun-2007 |
marcel |
Add the MBR partitioning scheme to g_part. This does not yet support the ability to install boot code.
|
170362 |
06-Jun-2007 |
marcel |
Prefix unknown (i.e. un-aliased) partition types with '!'. This is how they had to be given with ctlreq.
|
170361 |
06-Jun-2007 |
marcel |
Call sbuf_finish() before sbuf_data() and sbuf_len().
|
170307 |
05-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
- Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170289 |
04-Jun-2007 |
dwmalone |
Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples.
In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.
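The truncation described above can be illustrated in plain userland C. This is only a sketch, not the kernel code: export_as_int() and export_as_quad() are hypothetical helpers standing in for an int-sized and a 64-bit export, and the example assumes a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an int-sized export: only sizeof(int) bytes
 * of the source object are copied out, whatever its real width is.
 */
unsigned int
export_as_int(const void *src)
{
	unsigned int out;

	assert(src != NULL);
	memcpy(&out, src, sizeof(out));
	return (out);
}

/* Hypothetical stand-in for a 64-bit export: the whole value is copied. */
uint64_t
export_as_quad(const uint64_t *src)
{
	assert(src != NULL);
	return (*src);
}
```

Which half of a 64-bit value survives the int-sized copy depends on the byte order of the machine, but in either case the exported number no longer matches the variable.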
|
169588 |
15-May-2007 |
marcel |
Fix a dereference in KASSERT.
|
169585 |
15-May-2007 |
marcel |
o Implement automatic commit. It's enabled when the flags parameter exists and contains the 'C' flag.
o The partition label can be the empty string. It's how labels are cleared.
o When an action fails, lower permissions when they were raised in order to allow the action. A failed action will not result in any uncommitted changes.
o Allow the flags parameter to be present but empty. It's the equivalent of not being present.
|
169404 |
09-May-2007 |
marcel |
Write the output parameter (if present) for the add, create, delete, destroy and modify verbs.
|
169398 |
09-May-2007 |
marcel |
When reverting the creation of a partitioning scheme on a provider, the failure to probe an existing partitioning scheme means that no previous partitioning scheme existed. Don't error. Just destroy the geom.
|
169389 |
08-May-2007 |
marcel |
MFp4:
119373: o Remove the query verb, along with the request and response parameters. o Add the version and output parameters.
119390: [APM,GPT] Properly clear deleted entries.
119394: o Make the alias the standard and use the '!' to prefix literal partition types. o Treat schemes and partition types as case insensitive.
119462: [GPT] Fix a page fault caused when modifying a partition entry without a new partition type.
|
169313 |
06-May-2007 |
pjd |
When deleting key, flush write cache after each overwrite, so we don't overwrite data N times in cache and only once on disk.
|
169289 |
05-May-2007 |
pjd |
Allow ':' in d_ident; it is quite a handy character.
|
169288 |
05-May-2007 |
pjd |
Handle GEOM::ident attribute by attaching 'sX' string at the end of ident received from the underlying provider, where X is pp->index value.
OK'ed by: phk
|
169287 |
05-May-2007 |
pjd |
Because there is a lot of strange hardware out there, allow only [a-zA-Z0-9-_@#%.] characters in the d_ident field.
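A character-class restriction like the one above might be checked as follows. This is a hedged userland sketch: ident_is_valid() is a hypothetical name, and the log does not say what the kernel does with an ident that contains other characters, so the sketch only reports whether a string passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if every character of ident is in [a-zA-Z0-9-_@#%.].
 * Hypothetical helper; the real d_ident handling may differ.
 */
bool
ident_is_valid(const char *ident)
{
	static const char extra[] = "-_@#%.";
	size_t i;

	assert(ident != NULL);
	for (i = 0; ident[i] != '\0'; i++) {
		char c = ident[i];

		if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
		    (c >= '0' && c <= '9'))
			continue;
		if (strchr(extra, c) != NULL)
			continue;
		return (false);
	}
	return (true);
}
```

With this character set a string such as "a:b" is rejected, because ':' is not in the list.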
|
169285 |
05-May-2007 |
pjd |
- Extend disk structure to allow to store disk's serial number, which can be retrieved via GEOM::ident attribute. - Bump disk(9) ABI version.
OK'ed by: phk
|
169284 |
05-May-2007 |
pjd |
Implement three new ioctls that can be used with GEOM provider:
DIOCGFLUSH - Flush write cache (sends BIO_FLUSH).
DIOCGDELETE - Delete data (mark as unused) (sends BIO_DELETE).
DIOCGIDENT - Get provider's unique and fixed identifier (asks for GEOM::ident attribute).
First two are self-explanatory, but the last one might not be. Here are properties of provider's ident:
- ident value is preserved between reboots,
- provider can be detached/attached and ident is preserved,
- provider's name can change - ident can't,
- ident value should not be based on on-disk metadata; in other words copying whole data from one disk to another should not yield the same ident for the other disk,
- there could be more than one provider with the same ident, but only if they point at exactly the same physical storage, this is the case for multipathing for example,
- GEOM classes that consume single providers and provide single providers, like geli, gbde, should just attach class name to the ident of the underlying provider,
- ident is an ASCII string (is printable),
- ident is optional and applications can't rely on its presence.
The main purpose for this is that an application can remember a provider's ident and once it tries to open the provider by its name again, it may compare idents to be sure this is the right provider. If it is not (idents don't match), then it can open the provider by its ident.
OK'ed by: phk
|
169283 |
05-May-2007 |
pjd |
Implement g_delete_data() similar to g_read_data() and g_write_data().
OK'ed by: phk
|
169282 |
05-May-2007 |
pjd |
- Implement helper g_handleattr_str() function for string attributes handling. - Extend g_handleattr() to treat attribute as string when len=0.
OK'ed by: phk
|
169065 |
27-Apr-2007 |
marcel |
Put the scheme (APM, GPT, etc) in the XML.
|
168999 |
24-Apr-2007 |
simokawa |
If compressed length is zero, return a zero-filled block.
MFC after: 1 week
|
168670 |
12-Apr-2007 |
le |
-) Correct sdcount for a plex when removing or adding subdisks.
-) Set correct sizes for plexes and volumes after a subdisk has been removed.
Submitted by: Ulf Lilleengen <lulf_AT_freebsd.org>
|
168669 |
12-Apr-2007 |
le |
Avoid infinite loop if the device string given for a drive only consists of "/".
Submitted by: Ulf Lilleengen <lulf_AT_freebsd.org>
|
168507 |
08-Apr-2007 |
pjd |
Use root_mounted().
|
168445 |
07-Apr-2007 |
simokawa |
Fix a bug for over 4GB media.
MFC after: 3 days
|
168426 |
06-Apr-2007 |
pjd |
Sysctl description is not a format string, so one % is enough.
|
168052 |
30-Mar-2007 |
delphij |
- Be more verbose when saying "foo" not found.
- In gctl_get_geom(), don't issue error when we were not provided with a parameter, like gctl_get_provider() did.
Reviewed by: pjd
|
167913 |
26-Mar-2007 |
kris |
make_dev(9) can be (and is) called without Giant, so there is no need to drop the topology lock and acquire Giant around this call.
Reviewed by: phk
|
167800 |
22-Mar-2007 |
pjd |
Add missing \n.
|
167755 |
21-Mar-2007 |
sam |
Overhaul driver/subsystem api's:
o make all crypto drivers have a device_t; pseudo drivers like the s/w crypto driver synthesize one
o change the api between the crypto subsystem and drivers to use kobj; cryptodev_if.m defines this api
o use the fact that all crypto drivers now have a device_t to add support for specifying which of several potential devices to use when doing crypto operations
o add new ioctls that allow user apps to select a specific crypto device to use (previous ioctls maintained for compatibility)
o overhaul crypto subsystem code to eliminate lots of cruft and hide implementation details from drivers
o bring in numerous fixes from Michael Richardson/hifn; mostly for 795x parts
o add an optional mechanism for mmap'ing the hifn 795x public key h/w to user space for use by openssl (not enabled by default)
o update crypto test tools to use new ioctl's and add cmd line options to specify a device to use for tests
These changes will also enable much future work on improving the core crypto subsystem; including proper load balancing and interposing code between the core and drivers to dispatch small operations to the s/w driver as appropriate.
These changes were instigated by the work of Michael Richardson.
Reviewed by: pjd Approved by: re
|
167229 |
05-Mar-2007 |
pjd |
Warn when the user uses a sectorsize bigger than the page size, which will lead to problems when the geli device is used with a file system or as swap.
Hopefully will prevent problems like kern/98742 in the future.
MFC after: 1 week
|
167164 |
02-Mar-2007 |
pjd |
Fix geli after last commit for UP systems that are running SMP kernel.
Submitted by: Hyo geol, Lee <hyogeollee@gmail.com> MFC after: 1 week
|
167086 |
27-Feb-2007 |
jhb |
Use pause() rather than tsleep() on stack variables and function pointers.
|
167050 |
27-Feb-2007 |
mjacob |
First cut at GEOM based multipath. This is an active/passive{/passive...} arrangement that has no intrinsic internal knowledge of whether devices it is given are truly multipath devices. As such, this is a simplistic approach, but still a useful one.
The basic approach is to (at present- this will change soon) use camcontrol to find likely identical devices and label the trailing sector of the first one. This label contains both a full UUID and a name. The name is what is presented in /dev/multipath, but the UUID is used as a true distinguisher at g_taste time, thus making sure we don't have chaos on a shared SAN where everyone names their data multipath as "Fred".
The first of N identical devices (and N *may* be 1!) becomes the active path until a BIO request is failed with EIO or ENXIO. When this occurs, the active disk is ripped away and the next in a list is picked to (retry and) continue with.
During g_taste events new disks that meet the match criteria for existing multipath geoms get added to the tail end of the list.
Thus, this active/passive setup actually does work for devices which go away and come back, as do (now) mpt(4) and isp(4) SAN based disks.
There is still a lot to do to improve this- like about 5 of the 12 recommendations I've received about it, but it's been functional enough for a while that it deserves a broader test base.
Reviewed by: pjd Sponsored by: IronPort Systems MFC: 2 months
|
166934 |
23-Feb-2007 |
jhb |
Use tsleep() rather than msleep() with a NULL mtx parameter.
|
166861 |
21-Feb-2007 |
n_hibma |
Reduce the noise when plugging in (USB) mass storage devices, like a 4 port flash card reader. Also remove an 'Opened da0 -> <random number>' which is not needed on a daily basis (available through bootverbose).
Reviewed by: phk, ken MFC after: 1 week
|
166561 |
08-Feb-2007 |
rodrigc |
#include <sys/systm.h> before <sys/geom.h> to get KASSERT(), and fix LINT build.
|
166551 |
07-Feb-2007 |
marcel |
Evolve the ctlreq interface added to geom_gpt into a generic partitioning class that supports multiple schemes. Current schemes supported are APM (Apple Partition Map) and GPT. Change all GEOM_APPLE and GEOM_GPT options into GEOM_PART_APM and GEOM_PART_GPT (resp).
The ctlreq interface supports verbs to create and destroy partitioning schemes on a disk; to add, delete and modify partitions; and to commit or undo changes made.
|
166325 |
28-Jan-2007 |
pjd |
We expect 'bio_data != NULL' for BIO_{READ,WRITE,GETATTR}, but for BIO_{DELETE,FLUSH} we expect 'bio_data == NULL'.
Reviewed by: phk
|
166321 |
28-Jan-2007 |
pjd |
It is possible that GEOM tastes a provider before SMP is started. We can't bind to a CPU which is not yet on-line, so add code that waits for CPUs to go on-line before binding to them.
Reported by: Alin-Adrian Anton <aanton@spintech.ro> MFC after: 2 weeks
|
166193 |
23-Jan-2007 |
kib |
Cylinder group bitmaps and blocks containing inode for a snapshot file are after snaplock, while other ffs device buffers are before snaplock in global lock order. By itself, this could cause deadlock when bdwrite() tries to flush dirty buffers on snapshotted ffs. If, during the flush, COW activity for snapshot needs to allocate block and ffs_alloccg() selects the cylinder group that is being written by bdwrite(), then kernel would panic due to recursive buffer lock acquisition.
Avoid dealing with buffers in bdwrite() that are from other side of snaplock divisor in the lock order than the buffer being written. Add new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in the bdwrite(). Default implementation, bufbdflush(), refactors the code from bdwrite(). For ffs device buffers, specialized implementation is used.
Reviewed by: tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes) Tested by: Peter Holm X-MFC after: 3 weeks (if ever: it changes ABI)
|
164821 |
02-Dec-2006 |
pjd |
Softc may be NULL in g_journal_orphan(), so don't be surprised.
|
163912 |
02-Nov-2006 |
pjd |
Fix ia64 build breakage.
|
163906 |
02-Nov-2006 |
pjd |
- Use g_duplicate_bio() instead of g_clone_bio(), so that memory is allocated with M_WAITOK flag.
- Check 'buf' instead of 'error' so Prevent is not confused.
CID: 1562, 1563 Found by: Coverity Prevent analysis tool
|
163905 |
02-Nov-2006 |
pjd |
I want CPU number here.
Noticed by: ru
|
163894 |
02-Nov-2006 |
pjd |
Grr, fix one more build breakage.
|
163888 |
01-Nov-2006 |
pjd |
Now that we have gjournal in the tree, add the possibility to configure gmirror and graid3 in a way that it is not resynchronized after a power failure or system crash. It is safe when gjournal is running on top of gmirror/graid3.
|
163886 |
01-Nov-2006 |
pjd |
Change spaces to tabs where needed.
|
163877 |
01-Nov-2006 |
pjd |
Skip disabled CPU, because after we sched_bind() to a disabled CPU, we won't be able to exit from the thread.
Function g_eli_cpu_is_disabled() stolen from kern_pmc.c.
PR: 104669 Reported by: Nikolay Mirin <nik@optim.com.ru> MFC after: 1 week
|
163875 |
01-Nov-2006 |
pjd |
Forgot to remove this line.
Reported by: maxim
|
163869 |
01-Nov-2006 |
pjd |
Add BIO_FLUSH support to GSHSEC class.
|
163868 |
01-Nov-2006 |
pjd |
Add BIO_FLUSH support to GPT class.
|
163865 |
01-Nov-2006 |
pjd |
Update the code to the current sync(2) version: - Do not modify mnt_flag without mount interlock held. - Do not touch MNT_ASYNC flag, as this can lead to a race with nmount(2).
Pointed out by: tegge Reviewed by: tegge
|
163853 |
01-Nov-2006 |
pjd |
Remove debugging code I accidentally committed.
|
163837 |
31-Oct-2006 |
pjd |
Add gjournal GEOM class (kernel side), which implements block level journaling and can be taught about marking a file system as clean before doing a journal switch, which easily allows adding journaling to file systems that don't have this feature.
Sponsored by: home.pl
|
163836 |
31-Oct-2006 |
pjd |
Implement BIO_FLUSH handling by simply passing it down to the components.
Sponsored by: home.pl
|
163833 |
31-Oct-2006 |
pjd |
Add a new disk flag - DISKFLAG_CANFLUSHCACHE, which indicates that the disk can handle BIO_FLUSH requests.
Sponsored by: home.pl
|
163832 |
31-Oct-2006 |
pjd |
Add a new I/O request - BIO_FLUSH, which basically tells providers below to flush their caches. For now will mostly be used by disks to flush their write cache.
Sponsored by: home.pl
|
163206 |
10-Oct-2006 |
pjd |
Guard against invalid metadata.
MFC after: 1 week
|
163048 |
06-Oct-2006 |
ru |
A GEOM cache can speed up read performance by sending fixed size read requests to its consumer. It has been developed to address the problem of a horrible read performance of a 64k blocksize FS residing on a RAID3 array with 8 data components, where a single disk component would only get 8k read requests, thus effectively killing disk performance under high load. Documentation will be provided later. I'd like to thank Vsevolod Lobko for his bright ideas, and Pawel Jakub Dawidek for helping me fix the nasty bug.
|
162835 |
30-Sep-2006 |
pjd |
One more white space fix.
|
162834 |
30-Sep-2006 |
pjd |
Remove trailing spaces.
|
162832 |
30-Sep-2006 |
pjd |
Remove trailing spaces.
|
162357 |
16-Sep-2006 |
pjd |
Fix detecting of UFS1 label when mediasize%fragsize != 0.
Submitted by: Stanislav Sedov PR: kern/84637 MFC after: 1 week
|
162353 |
16-Sep-2006 |
pjd |
Add 'configure' subcommand which for now only allows setting and removing of the BOOT flag. It can be performed on both attached and detached providers.
Requested by: Matthias Lederhofer <matled@gmx.net> MFC after: 1 week
|
162352 |
16-Sep-2006 |
pjd |
Add __printflike() to gctl_error().
Approved by: phk MFC after: 1 week
|
162350 |
16-Sep-2006 |
pjd |
Small fixes after adding __printflike() to gctl_error().
Approved by: phk MFC after: 3 days
|
162345 |
16-Sep-2006 |
pjd |
Remove extra arguments.
MFC after: 3 days
|
162326 |
15-Sep-2006 |
pjd |
Add 'show geom [addr]' ddb(4) command, which prints entire GEOM topology if no additional argument is given or details about the given GEOM object (class, geom, provider or consumer).
Approved by: phk
|
162282 |
13-Sep-2006 |
pjd |
Fix synchronization in gmirror and graid3 which I broke. Synchronization request can still have bio_to set to sc_provider (this is READ part of a synchronization request) and in this case g_{mirror,raid3}_sync() wasn't called as it should be.
MFC after: 1 week
|
162200 |
10-Sep-2006 |
pjd |
Delay an orphan event if provider has still in-flight I/O requests. This way GEOM classes can safely detach from provider when an orphan event is received. This fixes 'detach with active requests' panic for gstripe/gconcat under load.
PR: kern/102766 Submitted by: mjacob OK'ed by: phk MFC after: 1 week
|
162188 |
09-Sep-2006 |
jmg |
move created/detected/activated under debug level 1 to quiet the common case..
add count of active and total components to the launched line so you can see at a glance if your mirror/raid3 is complete...
now: GEOM_MIRROR: Device mirror/sam launched (2/2).
Reviewed by: pjd
|
162153 |
08-Sep-2006 |
pjd |
Fix format character.
Reported by: andre
|
162149 |
08-Sep-2006 |
pjd |
Bump copyright year.
|
162148 |
08-Sep-2006 |
pjd |
Use __FBSDID in .c files.
|
162142 |
08-Sep-2006 |
pjd |
- Split failure probability configuration into read failure probability and write failure probability.
- Allow to specify an error number to return on failure.
MFC after: 3 days
|
162056 |
05-Sep-2006 |
pjd |
Fix problems with destroy and forcible destroy functionality:
- hold/release device in start/done routines, this will probably slow down things a bit, but previous code was racy;
- only release device if g_gate_destroy() failed - if it succeeded device is dead and there is nothing to release;
- various other changes which make forcible destruction reliable.
MFC after: 3 days
|
161425 |
17-Aug-2006 |
imp |
while (0); -> while (0) in multi-line macros
|
161246 |
12-Aug-2006 |
pjd |
Handle MSDOS file systems properly. Before the change file systems created on Windows XP (and others maybe) were not detected. We detected only those created with newfs_msdos(8).
Submitted by: Tobias Reifenberger <treif@mayn.de> style(9)ified by: pjd
|
161245 |
12-Aug-2006 |
pjd |
Verify if a label doesn't point to the parent directory.
|
161220 |
11-Aug-2006 |
pjd |
Before using byte offset for IV creation, convert it to little endian. This way one will be able to use provider encrypted on eg. i386 on eg. sparc64. This doesn't really buy us much today, because UFS isn't endian agnostic.
We retain backward compatibility by setting G_ELI_FLAG_NATIVE_BYTE_ORDER flag on devices with version number less than 2 and not converting the offset.
|
161217 |
11-Aug-2006 |
pjd |
Forgot to bump version number after G_ELI_FLAG_READONLY flag addition.
|
161136 |
09-Aug-2006 |
marcel |
Strengthen the check for a PMBR:
o PMBR partitions count to the number of partitions on the disk, which means that if a PMBR entry is invalid we will not treat the MBR as a PMBR by virtue of it not describing any partitions. Previously the checks were inconsistent in that an invalid PMBR entry would be harmless when no other partitions exist (we would treat the MBR as a PMBR by virtue of it being empty), but it would be fatal when there is at least one other partition.
o The partition size of a PMBR partition is one less than the media size because the GPT starts at the second sector (LBA 1) and extends to the end of the media. For backward bug-compatibility we accept a size that's exactly the media size (FreeBSD bug). Also, when the partition size can not be represented in a 32-bit integral, the partition size in the MBR is to be set to 0xFFFFFFFF. Accept this as a valid size, even if the size can be represented.
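The size rules in the second item can be summarized in a few lines of C. This is a sketch under assumptions: pmbr_size_ok() is a hypothetical name, both arguments are taken to be sector counts, and the remaining PMBR checks (entry type, start LBA) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept a PMBR entry size if it is one sector less than the media,
 * exactly the media size (bug-compatibility), or 0xFFFFFFFF.
 */
bool
pmbr_size_ok(uint32_t part_sectors, uint64_t media_sectors)
{
	assert(media_sectors > 0);
	if (part_sectors == UINT32_MAX)
		return (true);	/* accepted even if the size is representable */
	if ((uint64_t)part_sectors == media_sectors - 1)
		return (true);	/* GPT starts at LBA 1 and runs to the end */
	if ((uint64_t)part_sectors == media_sectors)
		return (true);	/* older, off-by-one layout */
	return (false);
}
```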
|
161127 |
09-Aug-2006 |
pjd |
Allow geli to operate on read-only providers.
Initial patch from: vd MFC after: 2 weeks
|
161116 |
09-Aug-2006 |
pjd |
Not only a request from us can be passed to g_{mirror,raid3}_worker() function, but also a request to us, in which case checking bio_cflags is wrong, because the class above us is controlling it, not us.
MFC after: 1 week
|
161107 |
08-Aug-2006 |
marcel |
Fix a phase-ordering bug: check the mediasize and sectorsize after we obtained access. It is possible that GPT gets to taste a disk first, which means the disk has not been opened before and it will not get opened until after we checked the mediasize and sectorsize. However, since the mediasize and sectorsize are determined at open and that happens when access is obtained, checking the mediasize and sectorsize before obtaining access may result in GPT rejecting the disk.
|
160964 |
04-Aug-2006 |
yar |
Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved.
PR: misc/101245 Submitted by: Darren Pilgrim <darren pilgrim bitfreak org> Tested by: md5(1) MFC after: 1 week
|
160895 |
01-Aug-2006 |
pjd |
Don't use the f-word in comments. We are gentlemen.
Pointed out by: Maciej Sobczak
|
160741 |
27-Jul-2006 |
yar |
Fix what looks like a typo: MODULE_DEPEND() takes module names, not KLD file names; and GELI module's name is g_eli, not geom_eli.
Approved by: pjd (silence) MFC after: 5 days
|
160569 |
22-Jul-2006 |
pjd |
Don't forget to initialize crp_olen field, which is used to calculate bio_completed value.
|
160330 |
13-Jul-2006 |
pjd |
Always allow to specify components with /dev/ prefix.
MFC after: 3 days
|
160301 |
12-Jul-2006 |
pjd |
Only check if we're freeing a valid object if we hold the topology lock. This prevents panic under heavy load with DIAGNOSTIC compiled in.
|
160248 |
10-Jul-2006 |
pjd |
Use proper defines instead of magic values.
MFC after: 1 week
|
160203 |
09-Jul-2006 |
pjd |
When kern.geom.raid3.use_malloc tunable is set to 1, malloc(9) instead of uma(9) will be used for memory allocation. In case of problems or tracking bugs, there are more useful tools for malloc(9) debugging than for uma(9) debugging, like memguard(9) and redzone(9).
MFC after: 1 week
|
160155 |
07-Jul-2006 |
pjd |
Remove bogus assertion.
Reported by: Bradley W. Dutton <brad-fbsd-stable@duttonbros.com> MFC after: 3 days
|
160081 |
03-Jul-2006 |
pjd |
Allow to close access even if device is already destroyed.
Reported by: Ulrich Spoerlein <uspoerlein@gmail.com> PR: kern/98093 MFC after: 1 week
|
159936 |
26-Jun-2006 |
sobomax |
Improve check for protective MBR. Instead of assuming that protective MBR should have only one entry of type 0xEE, consider protective MBR to be one that has at least one entry of type 0xEE covering the whole unit. This makes GEOM_GPT compatible with disks partitioned by Apple's BootCamp.
Approved in principle by: marcel MFC After: 1 month
|
159756 |
18-Jun-2006 |
simon |
In g_dev_strategy(), when failing an IO request with EINVAL due to offset or request size which is not a multiple of the sector size, make sure that the bio is set to indicate that no data has actually been transferred.
The result of this is that the file offset is no longer incremented for these requests. The fact that the file offset was incremented broke fdisk(8)'s probing of sector size for non-512 byte sector sizes.
Reviewed by: phk, cperciva Submitted by: mdodd MFC after: 2 weeks
|
159361 |
06-Jun-2006 |
pjd |
Allow to use the old -a option to specify an encryption algorithm to use (for backward compatibility), but print a warning to inform about the change.
|
159343 |
06-Jun-2006 |
pjd |
- Unbreak the build when geli is compiled into the kernel (on as module), by silencing unfounded compiler warning.
Reported by:
|
159307 |
05-Jun-2006 |
pjd |
Implement data integrity verification (data authentication) for geli(8).
Supported by: Wheel Sp. z o.o. (http://www.wheel.pl)
|
159306 |
05-Jun-2006 |
pjd |
Make kern.geom.eli.overwrites sysctl a tunable as well.
|
159304 |
05-Jun-2006 |
pjd |
Add g_duplicate_bio() function which does the same thing as g_clone_bio(), but g_duplicate_bio() allocates new bio with M_WAITOK flag.
|
159238 |
04-Jun-2006 |
marcel |
Fix unaligned memory accesses on Alpha and possibly other platforms. By using a pointer to struct dos_partition, we implicitly tell the compiler that the pointer is 4-bytes aligned, even though we know that's not the case. The fact that we only dereference the pointer to access a byte-wide field (field dp_ptyp) is not a guarantee that the compiler will in fact use a byte-wide load. On some platforms it's more efficient to use long word or quad word loads and use bit-shifting and bit-masking to get the intended byte. On those platforms a misaligned load will be the result. The fix is to use byte-wide pointer arithmetic based on sizeof() and offsetof() to avoid invalid casts, so that the compiler does not make invalid assumptions.
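The sizeof()/offsetof() approach described above can be sketched as follows. The structure here, struct part_entry, is a hypothetical 16-byte layout used only for illustration (it is not the real struct dos_partition), and part_type_at() is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte partition entry; used only for its layout. */
struct part_entry {
	uint8_t		flag;
	uint8_t		shd, ssect, scyl;
	uint8_t		typ;
	uint8_t		ehd, esect, ecyl;
	uint32_t	start;
	uint32_t	size;
};

/*
 * Read the type byte of entry idx straight from the sector buffer.
 * No struct pointer is ever formed, so the compiler cannot assume
 * 4-byte alignment of the table inside the buffer.
 */
uint8_t
part_type_at(const unsigned char *sector, size_t table_off, unsigned int idx)
{
	assert(sector != NULL);
	return (sector[table_off + idx * sizeof(struct part_entry) +
	    offsetof(struct part_entry, typ)]);
}
```

Because the access is an index into an array of unsigned char, it is a byte-wide load by construction, whatever the alignment of table_off.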
Backtrace provided by: wilko@ MFC after: 1 week
|
158875 |
24-May-2006 |
ceri |
Remove the trailing half of a sentence which was clearly superseded by the preceding one some time during editing.
|
158290 |
04-May-2006 |
pjd |
Use G_RAID3_FOREACH_SAFE_BIO() macro instead of G_RAID3_FOREACH_BIO() in two places where g_io_request() is called. g_io_request() can free bio structure so we can't reference it after and G_RAID3_FOREACH_BIO() macro was doing this.
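The difference between the two iteration styles can be shown with a small userland list. All names below (struct req, REQ_FOREACH_SAFE, dispatch_all, make_list) are hypothetical; free() stands in for a call that, like g_io_request() in the message above, may release the element it is given.

```c
#include <assert.h>
#include <stdlib.h>

struct req {
	struct req	*next;
	int		 id;
};

/* Fetch the next pointer before the loop body can release var. */
#define	REQ_FOREACH_SAFE(var, head, tmp)				\
	for ((var) = (head);						\
	    (var) != NULL && ((tmp) = (var)->next, 1);			\
	    (var) = (tmp))

/* Build a list with ids 1..n; returns NULL when n is 0. */
struct req *
make_list(int n)
{
	struct req *head = NULL;
	int i;

	for (i = n; i >= 1; i--) {
		struct req *r = malloc(sizeof(*r));

		assert(r != NULL);
		r->id = i;
		r->next = head;
		head = r;
	}
	return (head);
}

/* Release every element while iterating; returns the number handled. */
int
dispatch_all(struct req *head)
{
	struct req *r, *tmp;
	int n = 0;

	REQ_FOREACH_SAFE(r, head, tmp) {
		free(r);	/* r must not be touched after this point */
		n++;
	}
	return (n);
}
```

A non-safe variant would read r->next after the body has run, which is a use after free whenever the body releases r.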
Found by: Coverity Prevent analysis tool (with my new models) MFC after: 1 day
|
158195 |
30-Apr-2006 |
pjd |
We shouldn't lock the topology here - we will panic on assertion inside g_raid3_bump_syncid().
Reported by: Bradley W. Dutton <brad-fbsd-stable@duttonbros.com> MFC after: 1 day
|
158117 |
28-Apr-2006 |
pjd |
- Don't hold the device sx lock when going to sleep. - Prevent possible live-lock in case of memory problems by freeing already completed requests first.
Reported and tested by: markus, Bradley W. Dutton <brad-fbsd-stable@duttonbros.com> MFC after: 1 day
|
158116 |
28-Apr-2006 |
pjd |
- Remove dead code. - Comment possible event miss, which isn't critical, but probably can be fixed by replacing the event lock usage with the queue lock.
MFC after: 2 weeks
|
158114 |
28-Apr-2006 |
pjd |
Be sure to not destroy device twice. This is not possible in theory, but with this change there is even no theoretical race.
MFC after: 2 weeks
|
158112 |
28-Apr-2006 |
pjd |
Be sure to not destroy device twice. This is not possible in theory, but with this change there is even no theoretical race.
MFC after: 2 weeks
|
157900 |
20-Apr-2006 |
pjd |
geli(8) provides keys on newsession time, so remove CRD_F_KEY_EXPLICIT flag as HW crypto drivers don't support it.
|
157838 |
18-Apr-2006 |
pjd |
Fix storing offset of already synchronized data. Offset in entire array was stored in metadata instead of an offset in single disk. After reboot/crash synchronization process started from a wrong offset skipping (not synchronizing) part of the component which can lead to data corruption (when synchronization process was interrupted on initial synchronization) or other strange situations like 'graid3 status' showing value more than 100%.
Reported, reviewed and tested by: ru Reported by: Dmitry Morozovsky <marck@rinet.ru> MFC after: 1 day
|
157783 |
15-Apr-2006 |
pjd |
Correct debug: we are sending child bio here, not parent bio.
MFC after: 1 week
|
157740 |
13-Apr-2006 |
cracauer |
Make CCD be able to read and write Linux software raids.
Supported for raid-0 with <n> disks, raid-1 with 2 disks.
Manpages have examples, warnings etc.
Test scripts on http://www.cons.org/cracauer/ccdconfig-linux/ Reviewed by: alfred
|
157686 |
12-Apr-2006 |
pjd |
Pass BIO_GETATTR requests down.
MFC after: 1 week
|
157630 |
10-Apr-2006 |
pjd |
Introduce and use delayed-destruction functionality from a pre-sync hook, which means that devices will be destroyed on last close.
This fixes destruction order problems when, eg. RAID3 array is build on top of RAID1 arrays.
Requested, reviewed and tested by: ru MFC after: 2 weeks
|
157620 |
10-Apr-2006 |
marcel |
MFp4:
o Implement the remove verb to remove a partition entry.
o Improve error reporting by first checking that the verb is valid.
o Add an entry parameter to the add verb. This parameter can be both read-only as well as read-write and specifies the entry number of the newly added partition.
o Make sure that the provider is alive when passed to us. It may be withering away.
o When adding a new partition entry, test for overlaps with existing partitions.
|
157619 |
10-Apr-2006 |
marcel |
Add g_wither_provider() to abstract the details of destroying a particular provider. Use this function where g_orphan_provider() is being called so that the flags are updated correctly and g_orphan_provider() is called only when allowed.
|
157581 |
07-Apr-2006 |
marcel |
Change gctl_set_param() to return an error instead of setting an error on the request. Add a wrapper, gctl_set_param_err(), that sets the error on the request from the error returned by gctl_set_param() and update current callers of gctl_set_param() to call gctl_set_param_err() instead. This makes gctl_set_param() much more usable in situations where the caller knows better what to do with certain (apparent) error conditions and setting an error on the request is not one of the things that need to be done.
|
157548 |
05-Apr-2006 |
pjd |
Typos.
|
157305 |
30-Mar-2006 |
pjd |
Revert previous change, as I fixed MD5(9).
|
157293 |
30-Mar-2006 |
pjd |
md_hash field in g_eli_metadata structure is not 4 byte aligned, which causes a panic on sparc64.
The problem is in MD5(9) implementation. The Encode() function takes 'unsigned char *output' as its first argument, which is then assigned to 'u_int32_t *op'. If the 'output' argument is not 4 byte aligned (and in geli(8) case it is not), sparc64 machine will panic.
I don't know how to fix MD5(9) in a clean way, so I'm implementing a work-around in geli(8).
Reported by: brueffer MFC after: 3 days
|
157292 |
30-Mar-2006 |
le |
Protect from creating striped and RAID5 plexes with unequally sized subdisks.
|
157290 |
30-Mar-2006 |
pjd |
- 'ndisks' variable is not boolean, so compare it with a value. - Keep conditions order consistent with the comment above.
MFC after: 3 days
|
157222 |
28-Mar-2006 |
pjd |
Preserve previous behaviour of kern.geom.raid3.n{64,16,4}k tunables where 0 means unlimited.
Reported by: ru MFC after: 3 days
|
157134 |
25-Mar-2006 |
pjd |
Increase debug level for "Thread exiting." message. It's not that important and is 0 by accident.
MFC after: 3 days
|
157053 |
23-Mar-2006 |
le |
Fix whitespace.
|
157052 |
23-Mar-2006 |
le |
Implement the 'resetconfig' command.
PR: kern/94835 Submitted by: Ulf Lilleengen <lulf@stud.ntnu.no>
|
156878 |
19-Mar-2006 |
pjd |
Update copyright for 2006.
|
156876 |
19-Mar-2006 |
pjd |
kern.geom.raid3.sync_requests=2 seems to be a better default - it still keeps disks very busy, but makes system much more responsive.
While here, kill extra space.
|
156873 |
19-Mar-2006 |
pjd |
kern.geom.mirror.sync_requests=2 seems to be a better default - it still keeps disks very busy, but makes system much more responsive.
While here, kill extra space.
|
156686 |
13-Mar-2006 |
ru |
Fix a typo.
|
156684 |
13-Mar-2006 |
ru |
Fix build on 64-bit platforms.
|
156612 |
13-Mar-2006 |
pjd |
- Reimplement I/O data allocation to prevent deadlocks.
Submitted by: green
- Speed up synchronization process by using configurable number of I/O requests in parallel.
  + Add kern.geom.raid3.sync_requests tunable which defines how many parallel I/O requests should be used.
  + Retire kern.geom.raid3.reqs_per_sync and kern.geom.raid3.syncs_per_sec sysctls.
- Fix race between regular and synchronization requests.
- Reimplement raid3's data synchronization - do not use the topology lock for this purpose, as it may cause deadlocks.
- Stop synchronization from pre-sync hook.
- Fix some other minor issues.
Tested by: Mike Tancsa <mike@sentex.net> MFC after: 3 days
|
156610 |
13-Mar-2006 |
pjd |
- Speed up synchronization process by using configurable number of I/O requests in parallel.
  + Add kern.geom.mirror.sync_requests tunable which defines how many parallel I/O requests should be used.
  + Retire kern.geom.mirror.reqs_per_sync and kern.geom.mirror.syncs_per_sec sysctls.
- Fix race between regular and synchronization requests.
- Reimplement mirror's data synchronization - do not use the topology lock for this purpose, as it may cause deadlocks.
- Stop synchronization from pre-sync hook.
- Fix some other minor issues.
MFC after: 3 days
|
156527 |
10-Mar-2006 |
pjd |
When inserting a new component md_provsize metadata field wasn't set, which means that the old problem was triggered (when two providers end at the same offset, e.g. ad0 and ad0s1, and the wrong one is picked up by gmirror/graid3).
Reported by: Michal Suszko <dry@dry.pl> MFC after: 3 days
|
156421 |
08-Mar-2006 |
pjd |
Allow to dump kernel to gmirror providers. Some conditions have to be met to make it work properly. This will be described in the manual page.
MFC after: 3 days
|
156299 |
04-Mar-2006 |
pjd |
We need to check if file system size is equal to provider's size, because sysinstall(8) still bogusly puts first partition at offset 0 instead of 16, so glabel/ufs will find file system on slice instead of partition.
Before sysinstall is fixed, we must keep this code, which means that we won't be able to detect UFS file systems created with 'newfs -s ...'.
PS. bsdlabel(8) creates partitions properly.
MFC after: 3 days
|
156201 |
02-Mar-2006 |
jeff |
- Lock Giant if needed around the call to vnode_create_vobject(). This is only important if devfs is not mpsafe.
Sponsored by: Isilon Systems, Inc. Found by: kris
|
156170 |
01-Mar-2006 |
pjd |
Assert proper use of bio_caller1, bio_caller2, bio_cflags, bio_driver1, bio_driver2 and bio_pflags fields.
Reviewed by: phk
|
155906 |
22-Feb-2006 |
pjd |
Do not use bio structure after g_io_deliver(), it may no longer be valid.
Found and fixed by: Vsevolod Lobko <seva@ip.net.ua> MFC after: 3 days
|
155803 |
18-Feb-2006 |
pjd |
Inform when label disappears.
MFC after: 3 days
|
155802 |
18-Feb-2006 |
pjd |
Allow to use g_slice_orphan() from outside.
MFC after: 3 days
|
155801 |
18-Feb-2006 |
pjd |
- Do not depend on the fact that the file system covers the entire provider. It won't work for file systems created with -s option. Use better file system verification.
- Add myself to the copyright.
MFC after: 3 days
|
155798 |
18-Feb-2006 |
pjd |
This function returns nothing.
|
155797 |
18-Feb-2006 |
pjd |
If provider's sector size prevents reading SBLOCKSIZE bytes, return immediately.
|
155582 |
12-Feb-2006 |
pjd |
On component state change to ACTIVE don't forget to update metadata.
MFC after: 3 days
|
155581 |
12-Feb-2006 |
pjd |
Use time_uptime instead of time_second, as the latter may go backwards.
Suggested by: ru MFC after: 3 days
|
155560 |
12-Feb-2006 |
pjd |
Allow to set kern.geom.raid3.disconnect_on_failure from loader.conf.
MFC after: 3 days
|
155546 |
11-Feb-2006 |
pjd |
- Add kern.geom.raid3.disconnect_on_failure sysctl/tunable (default to 1 to preserve current behaviour). When set to 0, components are not disconnected - graid3 will try to still use them (only first error will be logged). This is helpful when we have two broken components, but in different places, so actually all data is available. Such buggy component will be visible in 'graid3 list' output with flag BROKEN.
- Never disconnect the last valid component. If we detect errors there we will just pass them up. It wasn't reasonable to deny access to the whole provider because of one broken sector.
Prodded by: ru MFC after: 3 days
|
155545 |
11-Feb-2006 |
pjd |
- Add kern.geom.mirror.disconnect_on_failure sysctl/tunable (default to 1 to preserve current behaviour). When set to 0, components are not disconnected - gmirror will try to still use them (only first error will be logged). This is helpful when we have two broken components, but in different places, so actually all data is available. Such buggy component will be visible in 'gmirror list' output with flag BROKEN.
- Never disconnect the last valid component. If we detect errors there we will just pass them up. It wasn't reasonable to deny access to the whole provider because of one broken sector.
Prodded by: ru MFC after: 3 days
|
155544 |
11-Feb-2006 |
pjd |
Correct typo. 'fbp' is NULL here so this will result in a panic.
MFC after: 3 days
|
155540 |
11-Feb-2006 |
pjd |
Mark array as CLEAN when there are no write requests in kern.geom.raid3.idletime seconds. Write, not any requests. Mark array as clean immediately on last write close.
Prodded by: ru MFC after: 3 days
|
155539 |
11-Feb-2006 |
pjd |
Mark array as CLEAN when there are no write requests in kern.geom.mirror.idletime seconds. Write, not any requests. Mark array as clean immediately on last write close.
Prodded by: ru MFC after: 3 days
|
155537 |
11-Feb-2006 |
pjd |
Teach geli how to load keyfiles before root file system is mounted. Example entries for loader.conf to make it possible:
geli_da0_keyfile0_load="YES"
geli_da0_keyfile0_type="da0:geli_keyfile0"
geli_da0_keyfile0_name="/boot/keys/da0.key0"
geli_da0_keyfile1_load="YES"
geli_da0_keyfile1_type="da0:geli_keyfile1"
geli_da0_keyfile1_name="/boot/keys/da0.key1"
geli_da0_keyfile2_load="YES"
geli_da0_keyfile2_type="da0:geli_keyfile2"
geli_da0_keyfile2_name="/boot/keys/da0.key2"
geli_da1s3a_keyfile0_load="YES"
geli_da1s3a_keyfile0_type="da1s3a:geli_keyfile0"
geli_da1s3a_keyfile0_name="/boot/keys/da1s3a.key"
Thanks to jhb and kan who showed me the right direction.
MFC after: 3 days
|
155535 |
11-Feb-2006 |
pjd |
Check rootvnode variable to see if we still want to ask for passphrase on boot. Other methods just don't work properly.
MFC after: 3 days
|
155462 |
08-Feb-2006 |
le |
Catch the case when a subdisk has no provider or no consumer attached to it.
|
155432 |
07-Feb-2006 |
brueffer |
Clean up some sysctl descriptions, debug messages etc.
Approved by: pjd MFC after: 3 days
|
155174 |
01-Feb-2006 |
pjd |
Remove trailing spaces.
|
155071 |
30-Jan-2006 |
pjd |
Allow to specify only one disk. This is helpful when we want to extend our concatenated device later.
MFC after: 1 week
|
155070 |
30-Jan-2006 |
pjd |
Fix typo which caused the 64kB elements limit to not be set properly and the 16kB elements limit to not be set at all.
Submitted by: Vsevolod Lobko <seva@ip.net.ua> MFC after: 3 days
|
154686 |
22-Jan-2006 |
fjoe |
Rename geom_uzip class to g_uzip in order to be consistent with the naming of other GEOM modules.
PR: 89998
|
154540 |
18-Jan-2006 |
pjd |
Fix bio leak in case of malloc(9) failure.
Found by: Coverity Prevent(tm) Coverity ID: CID794 MFC after: 3 days
|
154539 |
18-Jan-2006 |
pjd |
Remove dead code.
Found by: Coverity Prevent(tm) Coverity ID: CID105 MFC after: 3 days
|
154538 |
18-Jan-2006 |
pjd |
Remove dead code.
Found by: Coverity Prevent(tm) Coverity ID: CID104 MFC after: 3 days
|
154513 |
18-Jan-2006 |
pjd |
Style cleanups.
X-MFC-after: Already MFCed to RELENG_6 by accident.
|
154473 |
17-Jan-2006 |
pjd |
Move $FreeBSD$ from comment to __FBSDID().
|
154463 |
17-Jan-2006 |
pjd |
- Use better types. - Log problems at level 0 when killing providers.
MFC after: 3 days
|
154462 |
17-Jan-2006 |
pjd |
Check return value.
Found by: Coverity Prevent(tm) MFC after: 3 days
|
154461 |
17-Jan-2006 |
pjd |
Remove dead code.
Found by: Coverity Prevent(tm) MFC after: 3 days
|
154460 |
17-Jan-2006 |
pjd |
Remove unused value.
Found by: Coverity Prevent(tm) MFC after: 3 days
|
154459 |
17-Jan-2006 |
pjd |
Log situation when EIO is returned.
|
154458 |
17-Jan-2006 |
pjd |
Remove bio leak when EIO error is emulated.
Found by: Coverity Prevent(tm) MFC after: 3 days
|
154075 |
06-Jan-2006 |
le |
Get rid of the gv_bioq hack in most parts of the I/O path and use the standard bioq structures.
|
153532 |
19-Dec-2005 |
pjd |
MFp4: Typo fix (without it the XML GEOM tree wasn't consistent).
Reported by: Eric Anderson <anderson@centtech.com>
|
153265 |
09-Dec-2005 |
pjd |
Fix build breakage by fixing typo.
Reported by: glebius
|
153251 |
08-Dec-2005 |
pjd |
- Allow to specify the byte which will be used for filling read buffer. - Improve sysctl description a bit.
Submitted by: Ivan Voras <ivoras@gmail.com>
|
153250 |
08-Dec-2005 |
pjd |
Teach NOP GEOM class how to gather the following statistics: - number of read I/O requests, - number of write I/O requests, - number of read bytes, - number of written bytes. Add 'reset' subcommand for resetting statistics.
|
152972 |
30-Nov-2005 |
sobomax |
It is unclear who is wrong and who is right, but when operating on plain file bsdlabel(8) always writes label at a fixed offset from its beginning (512 bytes), regardless of the sector size. At the same time, bsdlabel geom class expects label to be available at the very beginning of the second sector.
As a result, images prepared in userland for media with sector size different from 512 bytes (i.e. 2k for cdroms) are not recognized by the tasting mechanism.
Solve the problem by always looking for the label at 512-byte offset if we can't find it at the beginning of the second sector and sector size is not 512 bytes.
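The probing order described above can be sketched as a small helper that lists the byte offsets to try. label_probe_offsets() and LABEL_FILE_OFFSET are hypothetical names for illustration; the actual class reads and validates the label at each offset, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed offset at which bsdlabel(8) writes the label into a plain file. */
#define LABEL_FILE_OFFSET 512

/*
 * Fill 'offs' with the byte offsets to probe, in order, and return how
 * many there are: the start of the second sector first, then the fixed
 * 512-byte offset if the sector size differs from 512.
 */
size_t
label_probe_offsets(uint32_t sectorsize, uint64_t offs[2])
{
	size_t n = 0;

	offs[n++] = sectorsize;
	if (sectorsize != LABEL_FILE_OFFSET)
		offs[n++] = LABEL_FILE_OFFSET;
	return (n);
}
```

With 512-byte sectors both candidates coincide, so only one probe is made; with 2k sectors the second probe catches images prepared in userland.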
|
152971 |
30-Nov-2005 |
sobomax |
Don't pass error value pointer to g_read_data(9) at all if we don't have any use of it.
Suggested by: pjd
|
152967 |
30-Nov-2005 |
sobomax |
Check for g_read_data(9) errors properly:
o The only indication of error condition is NULL value returned by the function;
o value pointed to by error argument is undefined in the case when operation completes successfully.
Discussed with: phk
|
152966 |
30-Nov-2005 |
sobomax |
Kill leading whitespace.
|
152922 |
29-Nov-2005 |
pjd |
We do nothing with returned error value, so just remove it.
|
152913 |
29-Nov-2005 |
sobomax |
Check value returned by g_read_data(9), otherwise we can end in panic(9) if read error happens.
MFC after: 1 week
|
152784 |
25-Nov-2005 |
le |
Add sysctl descriptions.
|
152773 |
24-Nov-2005 |
le |
Since we want a vinum geom created anytime the module loads, move the geom creation to a separate init function and ignore the tasting.
The config is now parsed only in the vinumdrive geom, which hopefully fixes the problem, that the drive class tasted before the vinum class had a chance, for good.
Also restore the behaviour that the module can be loaded at boot time and on a running system.
|
152634 |
20-Nov-2005 |
le |
Whitespace.
|
152633 |
20-Nov-2005 |
le |
Always declare variables at the start of the function. Don't allocate potentially large variables on the stack. Check strsep() return values when the string comes from userland. Shorten variable names for lucidity's sake.
most of the stuff: Pointed out by: njl@
|
152632 |
20-Nov-2005 |
le |
Fix whitespace issue.
Pointed out by: joel@
|
152615 |
19-Nov-2005 |
le |
Finally bring in what was produced during Google SoC 2005:
Add functions to rename objects and to move a subdisk from one drive to another.
Obtained from: Chris Jones <chris.jones@ualberta.ca> Sponsored by: Google Summer of Code 2005 MFC in: 1 week
|
152565 |
18-Nov-2005 |
jdp |
Fix a bug that caused some /dev entries to continue to exist after the underlying drive had been hot-unplugged from the system. Here is a specific example. Filesystem code had opened /dev/da1s1e. Subsequently, the drive was hot-unplugged. This (correctly) caused all of the associated /dev/da1* entries to be deleted. When the filesystem later realized that the drive was gone it closed the device, reducing the write-access counts to 0 on the geom providers for da1s1e, da1s1, and da1. This caused geom to re-taste the providers, resulting in the devices being created again. When the drive was hot-plugged back in, it resulted in duplicate /dev entries for da1s1e, da1s1, and da1.
This fix adds a new disk_gone() function which is called by CAM when a drive goes away. It orphans all of the providers associated with the drive, setting an error condition of ENXIO in each one. In addition, we prevent a re-taste on last close for writing if an error condition has been set in the provider.
Sponsored by: Isilon Systems Reviewed by: phk MFC after: 1 week
|
152401 |
13-Nov-2005 |
marcel |
o Slightly refactor the ctlreq code to maximize code sharing between verbs. Only the create verb operates on a provider. All other verbs operate on a GPT geom. Also, the GPT entry oriented verbs require a non-downgraded GPT.
o Have all verbs take an optional flags parameter. The flags parameter is a string of single-letter flags. The typical use of these flags is to enable certain behaviour in support of the gpt(8) tool.
o Add dummy implementations for the destroy and recover verbs.
This change causes test 2 of the GPT regression test suite to fail. The presence of a geom parameter is now required even for unknown verbs.
|
152342 |
12-Nov-2005 |
marcel |
Make the kern.geom.conftxt sysctl more usable by also dumping the MD class. Previously only the DISK class was dumped. The only consumer of this sysctl is libdisk (i.e. sysinstall) and it tests explicitly for instances of the DISK class. Dumping other classes is therefore harmless. By also dumping the MD class regression tests can be written that use the MD class for operations that would normally be done on the DISK class. The sysctl can now be used to test if those operations took an effect. An example is partitioning.
|
151897 |
31-Oct-2005 |
rwatson |
Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
- Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
- Disambiguate some collisions by adding subsystem prefixes to some memory types.
- Generally prefer lower case to upper case.
- If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
|
151822 |
28-Oct-2005 |
pjd |
Fix possible live-lock under heavy load where we can't allocate more memory for request. I was sure graid3 should handle such situations well, but green@ reported it is not and we want to fix it before 6.0.
Submitted by: green
|
151684 |
26-Oct-2005 |
takawata |
Add checking for File record magic.
|
151172 |
09-Oct-2005 |
marcel |
Rough implementation of the create and add verbs. The verbs cause in-memory changes only and as such are only useful for prototyping and regression testing purposes.
|
150759 |
30-Sep-2005 |
tegge |
Move some devstat collection to below where large IO operations are chopped up. This make iostat report operations passed down to the device driver instead of operations passed down to GEOM disk. The transfer size limit imposed by the device driver is no longer hidden, improving the correlation between iostat output and device driver workload.
|
150735 |
29-Sep-2005 |
fjoe |
- Fix "end_blk out of range" panic when INVARIANTS. - Do not allow rw access.
Submitted by: Dario Freni <saturnero at freesbie dot org> MFC after: 3 days
|
150304 |
18-Sep-2005 |
marcel |
o Don't cause a panic when the control request lacks a verb. o Don't set the error twice when the named class does not exist. It causes ioctl(2) to return with error EEXIST.
|
150240 |
17-Sep-2005 |
marcel |
Complete rewrite in preparation of adding support for control requests. The following features have been added:
1. Extensive checking and validation of both the primary and secondary headers to protect against corrupted data and to take advantage of the redundancy to allow the GPT to be used in the face of recoverable corruption.
2. Dynamic data-structures to avoid hardcoding gratuitous table limits so as to support the creation of GPT tables of (as of yet) unspecified size.
3. Only allow kernel dumps to swap partitions to provide the necessary anti-footshooting measures. Linux swap partitions are allowed.
4. Complete dump of the GPT configuration, including labels.
5. Supports Byte Order Mark (U+FEFF) handling for big-endian, little-endian and mixed-endian partition names.
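One way to read the Byte Order Mark handling in the last item is sketched below. It assumes that a byte-swapped BOM (0xFFFE as read) flips the byte order for the units that follow, which is how a name can end up mixed-endian. name_to_ascii() is a hypothetical helper, and reducing non-ASCII characters to '?' is a simplification made only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert up to 'n' UTF-16 code units to a NUL-terminated ASCII string.
 * A unit that reads as 0xFEFF is a BOM in the current byte order and is
 * skipped. A unit that reads as 0xFFFE is a byte-swapped BOM: it toggles
 * swapping for all following units. Returns the number of characters
 * written, not counting the terminating NUL.
 */
size_t
name_to_ascii(const uint16_t *src, size_t n, char *dst, size_t dstlen)
{
	size_t i, o = 0;
	uint16_t ch;
	int swap = 0;

	if (dstlen == 0)
		return (0);
	for (i = 0; i < n && o + 1 < dstlen; i++) {
		ch = src[i];
		if (swap)
			ch = (uint16_t)((ch >> 8) | (ch << 8));
		if (ch == 0)
			break;
		if (ch == 0xfeff)
			continue;
		if (ch == 0xfffe) {
			swap = !swap;
			continue;
		}
		dst[o++] = (ch < 0x80) ? (char)ch : '?';
	}
	dst[o] = '\0';
	return (o);
}
```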
|
150177 |
15-Sep-2005 |
jhb |
- Add a new simple facility for marking the current thread as being in a state where sleeping on a sleep queue is not allowed. The facility doesn't support recursion but uses a simple private per-thread flag (TDP_NOSLEEPING). The sleepq_add() function will panic if the flag is set and INVARIANTS is enabled. - Use this new facility to replace the g_xup and g_xdown mutexes that were (ab)used to achieve similar behavior. - Disallow sleeping in interrupt threads when invoking interrupt handlers.
MFC after: 1 week Reviewed by: phk
|
150143 |
14-Sep-2005 |
rodrigc |
Fix so that when a slice or a partition is removed through g_slice_config(), it is destroyed in GEOM, in addition to being removed from /dev. Before this patch, if you applied a new MBR which deleted a slice, the deleted slice would not be in /dev, but it would still appear in kern.geom.conftxt and kern.geom.confxml, which would confuse the diskPartitionEditor in sysinstall.
Submitted by: pjd Tested by: pjd, rodrigc MFC after: 1 week
|
149931 |
10-Sep-2005 |
pjd |
Fix copy&paste typo.
MFC after: 3 days
|
149930 |
10-Sep-2005 |
pjd |
Don't forget to initialize crp_etype field.
Reported by: Nick Evans <nevans@syphen.net> MFC after: 3 days
|
149895 |
08-Sep-2005 |
le |
Set the G_PF_WITHER flag on the subdisk provider that is about to be destroyed. That way the GEOM system handles all deallocations and we don't have to do it ourselves.
|
149787 |
04-Sep-2005 |
phk |
Remove a race condition that could result in processes being stuck waiting for geom events to happen:
Instead of maintaining a count of outstanding events, simply look if the queue is empty. Make sure to not remove events from the queue until they are executed in order to not open a new race.
Much work by: pjd Tested by: kris MT6: yes, should be.
|
149757 |
03-Sep-2005 |
phk |
Typo.
|
149576 |
29-Aug-2005 |
pjd |
Use KTR to log allocations and destructions of bios. This should hopefully allow to track down "duplicate free of g_bio" panics.
|
149555 |
28-Aug-2005 |
le |
Prevent that sync operations can be started when they are already in progress, and be a bit more user friendly in terms of error messages returned from the kernel.
|
149538 |
28-Aug-2005 |
pjd |
Verify length of the data to read as well.
|
149501 |
26-Aug-2005 |
le |
Shuffle around the order in which the components are compiled.
This way, the VINUMDRIVE class is loaded before the VINUM class, but since geom does the tasting for newly arrived classes last-in-first-out, the VINUM class tastes first.
This removes the need to call gv_parse_config() in the drive taste path.
|
149495 |
26-Aug-2005 |
pjd |
Verify offset before reading.
MFC after: 2 days
|
149492 |
26-Aug-2005 |
takawata |
Add NTFS labeling function.
Reviewed by: pjd
|
149395 |
23-Aug-2005 |
pjd |
Verify if we can actually read the data at given offset.
Reported by: Martin <nakal@nurfuerspam.de>
|
149379 |
22-Aug-2005 |
le |
Correct the check if a plex is accessible in case it is not up. This makes degraded RAID5 plexes actually work.
|
149353 |
21-Aug-2005 |
pjd |
By default, when doing crypto work in software, start as many threads as we have active CPUs and bind each thread to its own CPU.
MFC after: 3 days
|
149352 |
21-Aug-2005 |
pjd |
Remove stale comment (we now always start worker thread).
MFC after: 3 days
|
149339 |
20-Aug-2005 |
pjd |
Back-out the change from revision 1.14 and allow for '/' in labels again.
Convinced by: green, Gavin Atkinson, dougb, gordon MFC after: 1 day
|
149323 |
20-Aug-2005 |
pjd |
Add a __packed keyword to g_eli_metadata struct definition, so sizeof(struct g_eli_metadata) will return the exact number of bytes needed for storing it on the disk. Without this change GELI was unusable on amd64 (and probably other 64-bit archs), because sizeof(struct g_eli_metadata) was greater than 512 bytes and geli(8) was failing on assertion.
Reported by: Michael Reifenberger <mike@Reifenberger.com> MFC after: 3 days
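The effect of packing on sizeof() can be illustrated with a toy structure. The sketch assumes a GCC- or Clang-compatible compiler, where the kernel's __packed macro corresponds to __attribute__((__packed__)). The structures below are hypothetical and much smaller than the real g_eli_metadata.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with natural alignment: the compiler may pad it. */
struct hdr_natural {
	uint32_t version;
	uint64_t provsize;
	uint8_t flags;
};

/* Same fields, packed: no padding between or after the members. */
struct hdr_packed {
	uint32_t version;
	uint64_t provsize;
	uint8_t flags;
} __attribute__((__packed__));

/* Number of bytes the fields occupy when laid out back to back. */
size_t
hdr_wire_size(void)
{
	return (sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint8_t));
}
```

Only the packed variant has a sizeof() that equals the on-disk byte count on every ABI, which is what a fixed-size sector buffer relies on.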
|
149304 |
19-Aug-2005 |
pjd |
Allow to change number of iterations for PKCS#5v2. It can only be used when there is only one key set.
MFC after: 3 days
|
149303 |
19-Aug-2005 |
pjd |
- Add a missing period. - Fix number of spaces.
MFC after: 3 days
|
149300 |
19-Aug-2005 |
pjd |
Avoid code duplication and implement bitcount32() function in systm.h only.
Reviewed by: cperciva MFC after: 3 days
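For readers unfamiliar with the technique, a branch-free 32-bit population count can be sketched as follows. The name bitcount32_sketch() is chosen to avoid clashing with the kernel function; this is an illustration of the well-known parallel bit-summing method, not a copy of what was committed to systm.h.

```c
#include <stdint.h>

/*
 * Count the set bits in 'x' by summing adjacent bit fields of growing
 * width: pairs, nibbles, bytes, then the four byte counts.
 */
uint32_t
bitcount32_sketch(uint32_t x)
{
	x = (x & 0x55555555) + ((x & 0xaaaaaaaa) >> 1);
	x = (x & 0x33333333) + ((x & 0xcccccccc) >> 2);
	x = (x + (x >> 4)) & 0x0f0f0f0f;
	x = x + (x >> 8);
	x = (x + (x >> 16)) & 0x000000ff;
	return (x);
}
```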
|
149193 |
17-Aug-2005 |
pjd |
Always run dedicated kernel thread (even when we have hardware support). There is no performance impact, but it allows allocating memory with the M_WAITOK flag. As a side effect this simplifies the code a bit.
MFC after: 3 days
|
149192 |
17-Aug-2005 |
pjd |
We should now return 0.
|
149187 |
17-Aug-2005 |
pjd |
Even if crypto_dispatch() returns an error, the request is not canceled and our callback will still be called, just to tell us that the request failed...
Reported by: Mike Tancsa <mike@sentex.net> MFC after: 3 days
|
149185 |
17-Aug-2005 |
pjd |
We don't need to clear allocated memory. This will speed-up things a bit.
MFC after: 3 days
|
149150 |
16-Aug-2005 |
phk |
remove stale comments
|
149140 |
16-Aug-2005 |
le |
Make it possible to remove stale, left-over subdisks.
|
149094 |
15-Aug-2005 |
le |
Fix a stupid logic bug introduced in geom_vinum_drive.c rev 1.18:
When a drive is newly created, its state is initially set to 'down', so it won't allow saving the config to it (thus it will never know of itself being created). Work around this by adding a new flag, that's also checked when saving the config to a drive.
|
149030 |
13-Aug-2005 |
pjd |
Because code paths for I/O requests are quite complex, add comments above the functions which participate in I/O paths.
MFC after: 1 day
|
148979 |
12-Aug-2005 |
pjd |
Provide more complete "How to add a new file system to glabel." list.
MFC after: 1 week
|
148978 |
12-Aug-2005 |
pjd |
Add code for Ext2FS and ReiserFS labels recognition.
Submitted by: Stanislav Sedov <stas@310.ru> PR: kern/84638 MFC after: 1 week
|
148977 |
12-Aug-2005 |
pjd |
Avoid creating directories in devfs by changing all '/' in labels to '_'.
Idea from: Stanislav Sedov <stas@310.ru> MFC after: 3 days
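The substitution described above amounts to a single pass over the label string. A minimal sketch, with label_sanitize() as a hypothetical name:

```c
/*
 * Replace every '/' in a NUL-terminated label with '_' in place, so the
 * label cannot introduce a subdirectory when used as a devfs name.
 */
void
label_sanitize(char *label)
{
	for (; *label != '\0'; label++) {
		if (*label == '/')
			*label = '_';
	}
}
```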
|
148961 |
11-Aug-2005 |
pjd |
GELI doesn't need cryptodev.
MFC after: 3 days
|
148867 |
08-Aug-2005 |
pjd |
Be case-insensitive when dealing with algorithm names.
PR: kern/84659 Submitted by: Benjamin Lutz <benlutz@datacomm.ch>
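Case-insensitive matching of algorithm names is typically done with strcasecmp(3). The sketch below shows the idea; the ALG_* identifiers, alg_str2id() and the specific list of names are assumptions made for illustration, not the constants used by GELI.

```c
#include <stddef.h>
#include <strings.h>

/* Hypothetical algorithm identifiers for this example only. */
enum {
	ALG_UNKNOWN = -1,
	ALG_AES,
	ALG_BLOWFISH,
	ALG_3DES
};

/* Map an algorithm name to its identifier, ignoring letter case. */
int
alg_str2id(const char *name)
{
	if (strcasecmp(name, "aes") == 0)
		return (ALG_AES);
	if (strcasecmp(name, "blowfish") == 0)
		return (ALG_BLOWFISH);
	if (strcasecmp(name, "3des") == 0)
		return (ALG_3DES);
	return (ALG_UNKNOWN);
}
```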
|
148460 |
27-Jul-2005 |
pjd |
MFp4: Export more information about encrypted providers.
MFC after: 1 week
|
148458 |
27-Jul-2005 |
pjd |
Reduce default debug level to 0.
MFC after: 1 week
|
148456 |
27-Jul-2005 |
pjd |
Add GEOM_ELI class which provides GEOM providers encryption. For features list and usage see manual page: geli(8).
Sponsored by: Wheel Sp. z o.o. http://www.wheel.pl MFC after: 1 week
|
148440 |
27-Jul-2005 |
pjd |
Use root_mount KPI for RAID3 to delay root file system mount. Actually, one cannot setup root file system on RAID3 device, but when other file system exist in /etc/fstab which are placed on RAID3 device, boot process will be interrupted when these devices are missing.
MFC after: 3 days X-MFC-note: MFC only to RELENG_6, as RELENG_5 doesn't have root_mount KPI.
|
148410 |
25-Jul-2005 |
phk |
By design I left a tiny race in updating the I/O statistics based on the assumption that performance was more important than beancounter quality statistics.
As it transpires the microoptimization is not measurable in the real world and the inconsistent statistics confuse users, so revert the decision.
MT6 candidate: possibly MT5 candidate: possibly
|
148382 |
25-Jul-2005 |
pjd |
Add a very simple and small GEOM class - ZERO. It creates very huge provider (41PB) /dev/gzero. On BIO_READ request it zero-fills bio_data and on BIO_WRITE it does nothing. You can also set kern.geom.zero.clear sysctl to 0 to do nothing even for BIO_READ.
I'm using it for performance testing where it is very helpful.
MFC after: 3 days
|
148192 |
20-Jul-2005 |
phk |
Comment typo
|
148092 |
17-Jul-2005 |
pjd |
Before calling g_orphan_provider(), add G_PF_WITHER flag, so GEOM will know to destroy it.
PR: kern/81758 Submitted by: trasz <trasz@buziaczek.pl> MFC after: 3 days
|
148061 |
15-Jul-2005 |
nyan |
Merged from geom_mbr.c revisions 1.62 and 1.66. - Implement a gctl handler and the verb "write MBR".
|
148048 |
15-Jul-2005 |
le |
*) Implement round-robin reads for multiplex volumes.
*) Plug a possible memory leak. [1]
[1] obtained from: pjd@.
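Round-robin selection among plexes can be sketched with a per-volume cursor. struct volume and volume_next_read_plex() are hypothetical names; a real implementation would also have to skip plexes that are not up, which is left out here.

```c
#include <stddef.h>

/* Hypothetical volume state: plex count and index of the last plex read. */
struct volume {
	size_t nplexes;
	size_t last_read;
};

/*
 * Advance the cursor and return the index of the plex that should serve
 * the next read. Assumes v->nplexes is greater than zero.
 */
size_t
volume_next_read_plex(struct volume *v)
{
	v->last_read = (v->last_read + 1) % v->nplexes;
	return (v->last_read);
}
```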
|
148034 |
15-Jul-2005 |
phk |
Implement a gctl handler and the verb "write MBR" which can be used to update metadata and bootcode while the MBR is in use.
MFC candidate
|
147843 |
08-Jul-2005 |
pjd |
Add CANCEL command which allows to remove one request from the queue or all requests from the queue if request number is not given.
Bump version number.
Approved by: re (scottl)
|
146624 |
25-May-2005 |
pjd |
After provider creation!!
|
146616 |
25-May-2005 |
pjd |
- Call root_mount_rel() when provider IS created, not earlier. This should close the race observed by Daniel Eriksson. - Remove redundant wakeup().
|
146538 |
23-May-2005 |
pjd |
Add some debug code to diagnose root-on-mirror problems with recent -current.
Reported by: Daniel Eriksson
|
146353 |
18-May-2005 |
pjd |
Correct typo.
|
146325 |
17-May-2005 |
le |
When a drive dies, don't call g_wither_geom() directly, but instead post an event to the geom event queue that will take care of it, letting outstanding bios finish, and closing the consumers.
Plus some cosmetic clean ups.
|
146118 |
11-May-2005 |
pjd |
cp can't be NULL.
Noticed by: Coverity Prevent analysis tool
|
146117 |
11-May-2005 |
pjd |
gp can't be NULL.
Noticed by: Coverity Prevent analysis tool
|
146110 |
11-May-2005 |
pjd |
Add KASSERT() to be sure there is an active component.
Suggested by: Coverity Prevent analysis tool
|
146109 |
11-May-2005 |
pjd |
Check return value.
Found by: Coverity Prevent analysis tool
|
145761 |
01-May-2005 |
nyan |
Fix signed vs unsigned warning.
|
145619 |
28-Apr-2005 |
le |
Only allow RAID5 plexes to be parity checked.
PR: kern/80427 Submitted by: Stijn Hoop <stijn@win.tue.nl>
|
145502 |
25-Apr-2005 |
pjd |
Fix provider's size check for 'insert' command. Before this fix one was able to insert one sector too small provider.
MFC after: 3 days
|
145306 |
19-Apr-2005 |
wollman |
The size of a filesystem may be less than the size of the provider it resides on. Fix the special case of the filesystem fragment size not evenly dividing the size of the provider. Fixing the general case probably requires better superblock validation (left as an exercise to the reader).
|
145305 |
19-Apr-2005 |
pjd |
Remove the hack which allowed to use gmirror for root file system, use root_mount KPI instead.
|
145259 |
19-Apr-2005 |
phk |
Call g_waitidle() instead of GEOM using the root_mount_hold() KPI. GEOM could (and will) get events as a result of drivers coming in late so a one-shot method is not good enough for GEOM.
|
145250 |
18-Apr-2005 |
phk |
Add a named reference-count KPI to hold off mounting of the root filesystem.
While we wait for holds to be released, print a list of who holds us back once per second.
Use the new KPI from GEOM instead of vfs_mount.c calling g_waitidle().
Use the new KPI also from ata.
With ATAmkIII's newbusification, ata could narrowly miss the window and ad0 would not exist when we tried to mount root.
|
144934 |
12-Apr-2005 |
pjd |
Protect against recursive labels creation in a similar way as it is done in BSD and MBR classes, i.e. if provider below us uses the same metadata, don't create labels based on the metadata. This allows creating labels on geoms with rank != 1 without hacks.
Tested by: Chris Elsworth <chris@shagged.org> on sparc64 OK'ed by: phk MFC after: 2 weeks
|
144789 |
08-Apr-2005 |
pjd |
Fix a long-standing bug. Error string has to be copied from the user process context.
Approved by: phk MFC after: 3 days
|
144592 |
03-Apr-2005 |
pjd |
- Add a missing g_io_deliver() in case of allocation failure - we didn't complete I/O requests here.
- First allocate all needed bios, so if any of the allocations fail, we can free memory before sending any I/O requests down.
Reported by: Pawel Malachowski MFC after: 3 days
|
144333 |
30-Mar-2005 |
nyan |
Remove geometry translations here.
|
144328 |
30-Mar-2005 |
joerg |
Support VTOC volume names. This can be useful to distinguish multiple disks in a system. Solaris' format(1m) displays the volume names in the disk overview.
MFC after: 1 month
|
144157 |
26-Mar-2005 |
phk |
fix a "modify after free" bug which is practically impossible to experience.
Found by: Coverity (id #540 #541)
|
144144 |
26-Mar-2005 |
pjd |
If an error occurs, clean up before returning from g_raid3_connect_disk().
|
144143 |
26-Mar-2005 |
pjd |
Make the code more obvious - when an error occurs in g_mirror_connect_disk(), detach and destroy consumer before returning.
|
144142 |
26-Mar-2005 |
pjd |
Check for return values.
Submitted by: sam Found by: Coverity Prevent analysis tool
|
143792 |
18-Mar-2005 |
phk |
g_read_data() can return NULL, check for it.
Found by: Coverity (ID#258)
|
143791 |
18-Mar-2005 |
phk |
After rejecting the bio request early, return instead of panicking.
Found by: Coverity (ID#450)
|
143790 |
18-Mar-2005 |
phk |
Avoid null pointer dereference.
|
143719 |
16-Mar-2005 |
pjd |
Plug memory leak.
Submitted by: Ted Unangst Found by: Coverity Prevent analysis tool Approved by: phk MFC after: 3 days
|
143627 |
15-Mar-2005 |
phk |
forward declare struct disk.
|
143590 |
14-Mar-2005 |
phk |
Do not attach MBR on top of an MBR. This removes some confusing slice names on disks with extended partitions.
Spotted on: Mother-in-law's computer.
|
143418 |
11-Mar-2005 |
ume |
stop including rijndael-api-fst.h from rijndael.h. this is required to integrate opencrypto into crypto.
|
143259 |
07-Mar-2005 |
le |
Remove test for zero sectorsize when tasting. This check doesn't seem to be necessary anymore, and it prevents tasting a valid drive when booting with geom_vinum already loaded, since SCSI disks set their sectorsize not until first opening them.
|
143238 |
07-Mar-2005 |
phk |
Add placeholder mutex argument to new_unrhdr().
|
143130 |
04-Mar-2005 |
le |
Don't allow synchronizing a plex that is already synchronizing.
Reset the 'syncing' flag in case of errors, too.
Some cosmetics.
|
142727 |
27-Feb-2005 |
pjd |
- Add md_provsize field to metadata, which will help with the shared-last-sector problem. After this change, even if there is more than one provider with the same last sector, the proper one will be chosen based on its size. It still doesn't fix the 'c' partition problem (when da0s1 can be confused with da0s1c) or the situation when the 'a' partition starts at offset 0 (then da0s1a can be confused with da0s1 and da0s1c). One can use the '-h' option there when creating the device, or avoid sharing the last sector. Actually, when providers share the same last sector and their sizes are equal, they provide exactly the same data, so the name (da0s1, da0s1a, da0s1c) isn't important at all.
- Provide backward compatibility.
- Update copyright's year.
MFC after: 1 week
|
142301 |
23-Feb-2005 |
le |
Correctly calculate what to do and how to retry a request to a plex when the previous one failed and there is more than one plex in the volume.
This could have led to a flood of error messages on the console and probably a deadlock in certain situations.
|
142079 |
19-Feb-2005 |
phk |
Try to unbreak the vnode locking around vop_reclaim() (based mostly on patch from kan@).
Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on close. This is not yet a generally safe function, but for this very specific use it is safe. This solves the problem with buffers not being flushed by unmount or after failed mount attempts.
|
142020 |
17-Feb-2005 |
le |
In case of drive errors, don't close the associated consumer and detach it, but instead let the geom wither away.
Bump copyright year.
|
141998 |
16-Feb-2005 |
pjd |
Fix year in copyrights.
|
141994 |
16-Feb-2005 |
pjd |
Update copyright in files changed this year.
|
141993 |
16-Feb-2005 |
pjd |
Fix year in copyrights.
|
141973 |
16-Feb-2005 |
pjd |
Remove mutex assertion from g_gate_find(). We don't want the g_gate_list_mtx mutex to be held here, because we want speed here.
|
141972 |
16-Feb-2005 |
pjd |
Remove TDP_GEOM flag from thread after ggate device creation. This flag means "wait for all pending requests before returning to userland". There are pending events for sure, because we just created a new provider and other classes want to taste it, but we cannot answer I/O requests while we are still here.
|
141742 |
12-Feb-2005 |
pjd |
Fix typo. We want to unlock mutex here.
Submitted by: Andreas Kohn <andreas.kohn@gmail.com> MFC after: 1 week
|
141624 |
10-Feb-2005 |
phk |
Make various random things static
|
141561 |
09-Feb-2005 |
pjd |
- Remove g_gate_hold()/g_gate_release() from start/done paths. It saves 4 mutex operations per I/O request.
- Use only one mutex to protect both (incoming and outgoing) queues. As MUTEX_PROFILING(9) shows, there is no big contention for this lock.
- Protect sc_queue_count with queue mutex, instead of doing atomic operations on it.
- Remove DROP_GIANT()/PICKUP_GIANT() - ggate is marked as MPSAFE and no Giant there.
|
141513 |
08-Feb-2005 |
des |
merge from geom_vol_ffs.c rev 1.14 (avoid unaligned I/O requests)
|
141498 |
08-Feb-2005 |
des |
Take care not to issue unaligned I/O requests while tasting a provider.
|
141312 |
05-Feb-2005 |
pjd |
- Use bioq_insert_tail()/bioq_insert_head() instead of bioq_disksort().
- Improve mediasize checking.
MFC after: 1 week
|
140968 |
29-Jan-2005 |
phk |
When dumping to an unpartitioned disk, make sure to chop the length of the dump area accordingly.
Run into by: scottl
|
140940 |
28-Jan-2005 |
jeff |
- If mpsafevfs is off, acquire giant around all calls to bufdone().
Sponsored by: Isilon Systems, Inc.
|
140822 |
25-Jan-2005 |
phk |
Introduce and use g_vfs_close().
|
140773 |
24-Jan-2005 |
phk |
Create correctly sized vnode objects for disk devices.
|
140722 |
24-Jan-2005 |
jeff |
- Don't acquire giant around calls to bufdone().
Sponsored By: Isilon Systems, Inc.
|
140591 |
21-Jan-2005 |
le |
Only report state changes of subdisks and plexes when there's really a state change.
Reword the info a bit.
|
140590 |
21-Jan-2005 |
le |
Don't initialize error with ENXIO as we might end up here when the plex has no more consumers (e.g. orphaning).
|
140532 |
20-Jan-2005 |
pjd |
Protect against recursive slice creation in a similar way as it is done in the BSD class, i.e. if the provider below us uses the same metadata, don't create slices based on the metadata. This allows creating slices on geoms with rank != 1 without hacks.
Discussed with: phk Approved by: phk MFC after: 2 weeks
|
140476 |
19-Jan-2005 |
le |
Rename synchronization and initialization threads and prefix them with 'gv_' for consistency.
|
140475 |
19-Jan-2005 |
le |
Although an object may already be known in the configuration, its worker thread may have been destroyed (e.g. during orphaning).
Make sure that objects get back their worker threads when they get a new geom.
|
140474 |
19-Jan-2005 |
le |
Reset object flags after killing off an object's worker thread.
|
140367 |
17-Jan-2005 |
phk |
Discontinue zero-length g_ctl arguments as "just give him this pointer" transfers. The necessary context for calling copyin() isn't available anyway and automatic code-validation chokes on this.
|
140261 |
14-Jan-2005 |
phk |
CAM will sometimes remove a disk again even before it has finished being initialized. We already cancel the pending events, but we also must not dereference the geom pointer, which was never set to anything other than NULL.
|
140074 |
11-Jan-2005 |
pjd |
Introduce a new GEOM class - SHSEC. It shares a secret between the given providers. If even one of the configured components is missing, there should be no way to get the secret.
Supported by: WHEEL Sp. z o.o. http://www.wheel.pl
|
140056 |
11-Jan-2005 |
phk |
Add BO_SYNC() and add a default which uses the secret vnode pointer and VOP_FSYNC() for now.
|
139940 |
09-Jan-2005 |
pjd |
Increase default synchronization speed.
MFC after: 3 days
|
139778 |
06-Jan-2005 |
imp |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
139671 |
04-Jan-2005 |
pjd |
- Fix 'rebuild' command - it can no longer rely on retaste event (we ignore it).
- Remove code used for handling spoil events, as spoiling is not possible anymore, because we keep consumers open for writing all the time.
MFC after: 4 days
|
139670 |
04-Jan-2005 |
pjd |
Spoiling is now not possible, because we keep consumers open for writing all the time. Remove unused code then.
MFC after: 4 days
|
139650 |
03-Jan-2005 |
pjd |
Fix 'rebuild' command (we ignore retaste event now, so don't rely on it).
|
139622 |
03-Jan-2005 |
pjd |
Remove unused #include.
|
139451 |
30-Dec-2004 |
jhb |
Stop explicitly touching td_base_pri outside of the scheduler and simply set a thread's priority via sched_prio() when that is the desired action. The schedulers will start managing td_base_pri internally shortly.
|
139379 |
28-Dec-2004 |
pjd |
Remove debug code.
|
139295 |
25-Dec-2004 |
pjd |
- Add genid field to the metadata, which will allow improving reliability a bit. After this change, when a component is disconnected because of an I/O error, it will not be connected and synchronized automatically; it will be logged as broken and skipped. Autosynchronization can occur when a component is disconnected (on orphan event) and connected again: there was no I/O error, so there is no need to refuse to connect the component, but if there were writes while it wasn't connected, it will be synchronized. This fixes cases where a component is disconnected because of an I/O error and can be connected again and again.
- Bump version number.
- Implement backward compatibility mechanism. After this change, when metadata in an old version is detected, it is automatically upgraded to the new (current) version.
|
139246 |
23-Dec-2004 |
pjd |
Update disk->d_genid field when increasing sc->sc_genid.
|
139213 |
22-Dec-2004 |
pjd |
- Add genid field to the metadata, which will allow improving reliability a bit. After this change, when a component is disconnected because of an I/O error, it will not be connected and synchronized automatically; it will be logged as broken and skipped. Autosynchronization can occur when a component is disconnected (on orphan event) and connected again: there was no I/O error, so there is no need to refuse to connect the component, but if there were writes while it wasn't connected, it will be synchronized. This fixes cases where a component is disconnected because of an I/O error and can be connected again and again.
- Bump version number.
- Add version change history.
- Implement backward compatibility mechanism. After this change, when metadata in an old version is detected, it is automatically upgraded to the new (current) version.
|
139146 |
21-Dec-2004 |
pjd |
Now that forced device destruction is done on shutdown, hide the warning that a device cannot be destroyed immediately behind debug=1.
Suggested by: simon
|
139144 |
21-Dec-2004 |
pjd |
Improve reliability and clean up code a bit. For more details check src/sys/geom/mirror/g_mirror.c rev.1.47,1.48,1.49,1.50.
|
139140 |
21-Dec-2004 |
pjd |
This should not be permitted, but some GEOM classes held the topology lock while doing g_(read|write)_data() (e.g. BSD). This can cause a deadlock in MIRROR class. Not sure if this is safe to drop the topology lock in BSD class, so change the code in MIRROR class to avoid this deadlock.
|
139139 |
21-Dec-2004 |
pjd |
Implement g_topology_try_lock().
No objection from: phk
|
139054 |
19-Dec-2004 |
pjd |
Remove unused variables.
|
139053 |
19-Dec-2004 |
pjd |
- Argument 'flags' in g_mirror_destroy_consumer() function is unused - mark it as such.
- Before closing consumer check if it is open. It can be closed here when g_mirror_connect_disk() fails on g_access().
|
139051 |
19-Dec-2004 |
pjd |
Some major cleanups.
Keeping consumers open when the device is closed is very hard. We need to open consumers sometimes to update metadata, etc. Many hacks were introduced in the past to make it possible. You cannot be sure that you can always open a consumer for writing, even if you think it should be allowed. If one of the mirror components is for example da0 and you try to open it, you can get EPERM when da0s1 is opened for reading (because the BSD class opens consumers (da0) with an extra 'e' bit set). Waiting for the events queue to be empty may do the trick, but it makes the code much uglier (as you cannot always call g_waitidle()), it doesn't solve all edge cases and it can introduce deadlocks if there are events in the queue that wait for gmirror.
I removed those hacks. Now all consumers are always open r1w1e1, even if the device is closed. Maybe it is less clean from the GEOM perspective, but it simplifies the code a lot and makes it much more reliable. The only issue was the retaste event which is sent when we close consumers opened for writing. I ignore the retaste event by not detaching the consumer immediately (so the retaste event is not sent to my class) and sending an event right after it to detach and destroy the consumer.
|
139050 |
19-Dec-2004 |
pjd |
Don't quit on first failure, just skip failures.
|
138888 |
15-Dec-2004 |
brueffer |
Fix typo in a comment.
MFC after: 3 days
|
138801 |
13-Dec-2004 |
pjd |
bioq_insert_head() function is already in subr_disk.c.
|
138732 |
12-Dec-2004 |
phk |
Pass the file->flags down to geom ioctl handlers.
Reject certain ioctls if write permission is not indicated.
Bump geom API version.
Reported by: Ruben de Groot <mail25@bzerk.org>
|
138623 |
09-Dec-2004 |
pjd |
- Turn off 'fast' mode by default and increase maximum memory to consume when this mode is used.
- Manual page update.
|
138382 |
05-Dec-2004 |
marcel |
o Don't limit GPT as a rank 2 provider. Allow it to be connected anywhere in the DAG. This includes configurations that are not allowed by the EFI specification.
o Reject a GPT partition table if it's not preceded by a PMBR. There's no need to preserve the MBR partitioning anymore as GPT is mature and with the first bullet extending the applicability of GPT, it's better to be a bit more strict.
|
138374 |
04-Dec-2004 |
pjd |
When initializing device, set d_softc and d_no fields for all components, because we know it then and we need it when inserting a component which wasn't destroyed while device was running.
Reported by: Michael Handler <handler@grendel.net> MFC after: 1 week
|
138221 |
30-Nov-2004 |
imp |
Add observations of the Linux98 and Grub/98 boot loaders. These observations lead me to believe that the convention for pc98 boot loaders is to have a jump instruction, followed by a string, followed by code. The jump usually doesn't have a nop after it and usually the string is NUL terminated, but Grub/98 breaks both of these rules.
# I looked for, but failed to find the Minux boot blocks for PC-9801 port.
|
138219 |
30-Nov-2004 |
imp |
Reject tasting of this provider if the sector size isn't a multiple of 512. If I had an audio cdrom in my cd player when I booted my system, I'd get a panic from geom because you can't read 8192 bytes from an audio cdrom.
Remove XXX comment about IPL1 and replace it with some information from my soon to be published web page on the pc98 disk layout. The IPL1 test was the result of an observation of a disk with FreeBSD's boot0 program. It was testing part of an area that appears to be reserved for a boot loader name, which comes after a jump over this area. I don't yet know if it is required to be any specific jump instruction, or if the destination has to be location 11. [1]
[1] FreeBSD Press No. 13, page 115, poorly translated by myself. The picture there shows offset 8 as the destination of the jump, but FreeBSD's boot0 program has three padding NULs after the IPL1 name and uses a 16-bit 'jmp' instruction.
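The sector-size test described at the top of this entry can be sketched as a small predicate. This is an illustration, not the geom_pc98 source: the name sk_pc98_sector_ok() is hypothetical, and rejecting a zero sector size is an extra assumption made here so that the predicate is safe to use before any division.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if a provider with this sector size may be tasted:
 * the size must be a non-zero multiple of 512 bytes.
 * (The zero check is an assumption of this sketch.)
 */
bool
sk_pc98_sector_ok(unsigned int sectorsize)
{
	return (sectorsize != 0 && sectorsize % 512 == 0);
}
```

Under this rule 512-byte and 2048-byte sectors pass, while a 2352-byte raw audio CD sector is rejected, which is the case the entry says caused the panic.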
|
138171 |
28-Nov-2004 |
phk |
Fix a long standing bug in geom_mbr which is only now exposed by the correct open/close behaviour of filesystems:
When an ioctl to modify the MBR arrives, we cannot take for granted that we have the consumer open.
The symptom is that one cannot run 'boot0cfg -s2 /dev/ad0' in single-user mode because / is the only open partition and it is only open r1w0e1.
If it is not, we attempt to increase the write count by one and decrease it again afterwards.
Presumably most if not all other slices suffer from the same problem.
|
138112 |
26-Nov-2004 |
le |
Implement 'setstate' to allow setting the state of drives and subdisks for debugging and emergency purposes.
|
138110 |
26-Nov-2004 |
le |
Implement checkparity/rebuildparity.
|
138014 |
23-Nov-2004 |
pjd |
- Add missing Giant drop before acquiring the topology lock.
- Move DROP_GIANT()/PICKUP_GIANT() to g_gate_ioctl().
|
137936 |
20-Nov-2004 |
fjoe |
Use M_ZERO to not panic in mtx_init when INVARIANTS enabled.
Submitted by: simokawa MFC after: 1 week
|
137730 |
15-Nov-2004 |
le |
Move RAID5 offset calculation into a separate function to avoid code duplication.
|
137727 |
15-Nov-2004 |
le |
Share gv_roughlength() between kernel and userland, as we will need it there later.
|
137490 |
09-Nov-2004 |
pjd |
Before trying to update metadata (and so open the consumer for writing), be sure that the events queue is empty. Otherwise we can hit a race where, for example, da0s1 is tasted by some other class, which means that da0 is open with the exclusive bit set, which means that we can't open da0 for writing if it is our component.
Reported by: Attila Nagy <bra@fsn.hu> (and somebody else sometime ago, but I cannot find who it was)
|
137489 |
09-Nov-2004 |
pjd |
Introduce g_waitidlelock() function which is similar to g_waitidle(), but should be called with the topology lock held and returns with the topology lock held and an empty event queue.
Approved by: phk (sometime ago)
|
137487 |
09-Nov-2004 |
pjd |
Don't rely on the DIRTY flag to be sure that the consumer is open, because the DIRTY flag can be removed in idle process. Use the consumer's acw field instead to avoid opening the consumer twice.
|
137485 |
09-Nov-2004 |
pjd |
For BIO_READ check if provider is open for reading and for BIO_WRITE, check if provider is open for writing. This fixes panic when device is open only for writing and we send write request.
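A rough sketch of the per-command access check described above, assuming a provider that tracks read and write open counts. The struct, the enum and the function name are stand-ins invented for this illustration, and the choice of EPERM as the error is an assumption, not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the BIO_READ and BIO_WRITE commands. */
enum sk_cmd { SK_READ, SK_WRITE };

/* Stand-in provider: counts of current read and write opens. */
struct sk_provider {
	int acr;
	int acw;
};

/*
 * Allow a request only if the provider is open in the matching mode.
 * Returns 0 if the request may proceed, an errno value otherwise.
 */
int
sk_check_access(const struct sk_provider *pp, enum sk_cmd cmd)
{
	assert(pp != NULL);
	switch (cmd) {
	case SK_READ:
		return (pp->acr > 0 ? 0 : EPERM);
	case SK_WRITE:
		return (pp->acw > 0 ? 0 : EPERM);
	}
	return (EINVAL);
}
```

The point of checking each command against its own counter is that a device opened only for writing passes the write check without needing to be open for reading as well.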
|
137421 |
09-Nov-2004 |
pjd |
Drop Giant lock before grabbing the topology lock.
|
137412 |
08-Nov-2004 |
pjd |
If device is marked as being destroyed, deny all access requests.
|
137259 |
05-Nov-2004 |
pjd |
Don't forget to make sure that there are no not-finished requests before marking components as clean.
Pointed out by: scottl
|
137258 |
05-Nov-2004 |
pjd |
- Mark all raid3 components as clean after kern.geom.raid3.idletime seconds.
- Make kern.geom.raid3.timeout variable tunable.
|
137257 |
05-Nov-2004 |
pjd |
Mark raid3 devices as clean on shutdown (after all file systems are unmounted).
Suggested by: scottl
|
137256 |
05-Nov-2004 |
pjd |
- Use ->index consumer's field to track number of in-flight requests.
- Remove unused #include.
|
137254 |
05-Nov-2004 |
pjd |
Use shutdown hooks to mark mirrors as clean after all file systems are unmounted.
Suggested by: scottl
|
137253 |
05-Nov-2004 |
pjd |
Remove unused #include.
|
137251 |
05-Nov-2004 |
pjd |
- Add a sysctl kern.geom.mirror.idletime, so one can specify after how many seconds of idling DIRTY flags are removed.
- If mirror is in idle state or is not open for writing, sleep without timeout when waiting for I/O requests.
- Don't use atomic operations, for now sysctls are protected by Giant.
- Update debugs.
|
137248 |
05-Nov-2004 |
pjd |
MFp4:
- Fix for good (I hope) force-stopping mirrors and some failure cases (e.g. the last good component dies when synchronization is in progress). Don't use the consumer's ->nstart/->nend fields, as this could be racy because those fields are used in g_down/g_up; use the consumer's ->index field instead for tracking the number of unfinished requests.
Reported by: marcel
- After 5 seconds of idle time (this should be configurable) mark all dirty providers as clean, so when the mirror has not been used for 5 seconds and there is a power failure, no synchronization on boot is needed.
Idea from: sorry, I can't find who suggested this
- When there are no ACTIVE components and no NEW components destroy whole mirror, not only provider.
- Fix one debug to show information about I/O request, before we change its command.
|
137184 |
04-Nov-2004 |
phk |
Finish cut&paste adjustments.
Spotted by: tegge
|
137150 |
03-Nov-2004 |
phk |
Stop dumping the MBR entries under bootverbose
|
137149 |
03-Nov-2004 |
phk |
Stop wasting a bootverbose line on all geom slices.
|
137048 |
29-Oct-2004 |
phk |
Don't set si_bsize_phys, nobody cares.
|
137034 |
29-Oct-2004 |
phk |
Add GEOM class "VFS" for filesystems and other buffer cache users of GEOM devices.
There is nothing magic about this, it just gives a bufobj interface to GEOM.
|
137032 |
29-Oct-2004 |
phk |
Add g_wither_geom_close() function.
|
137029 |
29-Oct-2004 |
phk |
Give dev_strategy() an explicit cdev argument in preparation for removing buf->b_dev.
Put a bio between the buf passed to dev_strategy() and the device driver strategy routine in order to not clobber fields in the buf.
Assert copyright on vfs_bio.c and update copyright message to canonical text. There is no legal difference between John Dyson's two-clause abbreviated BSD license and the canonical text.
|
136983 |
26-Oct-2004 |
le |
Give each plex a separate queue where held back bios are put on. This lowers the CPU usage of the worker thread and prevents a possible live lock on non-SMP machines.
MFC candidate.
|
136946 |
25-Oct-2004 |
phk |
Use unit number allocation functions for GEOM minor numbers.
|
136940 |
25-Oct-2004 |
phk |
Retire si_stripesize and si_stripeoffset; they will not be needed in cdev in the future.
|
136839 |
23-Oct-2004 |
phk |
Don't call g_waitidle(), it happens automagically now.
|
136837 |
23-Oct-2004 |
phk |
Add a new per-thread private flag: TDP_GEOM.
This flag gets set whenever the thread posts an event on the GEOM event queue, and if the flag is set when the thread is prepared to return to userland from the kernel, g_waitidle() will be called to make sure that the posted events have completed.
This can replace an insufficient number of g_waitidle() calls in various other places, and has the advantage of being failsafe: Any system call which does a VOP_OPEN()/VOP_CLOSE will now correctly wait for any geom events it posted as part of spoils or tastes.
Assert that topology and Giant are not held in g_waitidle().
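The flag-on-post, wait-on-return pattern described in this entry can be modeled in a few lines. This is a simplified userland model, not kernel code: all sk_ names are hypothetical stand-ins, the event queue is reduced to a counter, and clearing the flag inside the wait routine is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SK_TDP_GEOM 0x1		/* stand-in for the TDP_GEOM bit */

struct sk_thread {
	int pflags;		/* per-thread private flags */
	int pending_events;	/* events this model pretends are queued */
	int waits;		/* how often the thread waited for idle */
};

/* Posting an event marks the thread, so the wait cannot be forgotten. */
void
sk_post_event(struct sk_thread *td)
{
	td->pending_events++;
	td->pflags |= SK_TDP_GEOM;
}

/* Stand-in for g_waitidle(): drain the queue, then clear the flag. */
void
sk_waitidle(struct sk_thread *td)
{
	td->pending_events = 0;
	td->pflags &= ~SK_TDP_GEOM;
	td->waits++;
}

/* Called on the way back to userland: wait only if events were posted. */
void
sk_userret(struct sk_thread *td)
{
	assert(td != NULL);
	if (td->pflags & SK_TDP_GEOM)
		sk_waitidle(td);
}
```

A thread that never posted an event pays only for one flag test on return, while a thread that posted several events waits exactly once.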
|
136836 |
23-Oct-2004 |
phk |
Move the prototype for g_waitidle() to a more visible place.
|
136797 |
22-Oct-2004 |
arr |
- Turn KASSERT()s into warning printf()'s in the g_class_load() routine. This removes a panic that will occur if you build with GENERIC and attempt to kldload a GEOM module that is already in the kernel.
Reviewed by: phk
|
136755 |
21-Oct-2004 |
rwatson |
Add KTR_GEOM, which allows tracing of basic GEOM I/O events occurring in the g_up and g_down threads. Each time a bio is propelled up and down the stack, an event is generated showing the provider, offset, and length, as well as thread wakeup and work status information.
|
136504 |
14-Oct-2004 |
pjd |
Ehh. Introduce a hack: Wait for 3 seconds, so GEOM is able to give us providers for tasting. Before this hack, race below is possible:
SI_SUB_RAID (no not-fully-configured geoms, so don't block)
GEOM tasting (now geoms are created)
SI_SUB_MOUNT_ROOT (if root file system is placed on a mirror, it is possible that this mirror is not fully configured yet)
There is a lot of work to do to avoid such hacks and I need a working solution before 5.3, sorry.
Reported by: John Hay <jhay@icomtek.csir.co.za>
|
136503 |
14-Oct-2004 |
pjd |
Only allow for unloading when there are no geoms in LABEL GEOM class. We have to use our own destroy_geom method, because default one, which is a part of geom_slice is broken. MT5 candidate.
PR: kern/72467 Submitted by: Vladimir Novoseltsev
|
136414 |
12-Oct-2004 |
green |
When loading GEOM modules, we expect the actual load process to be done by the time that kldload(8) returns. Satisfy that by making the GEOM module load event -- only when the kernel is !cold -- wait until the GEOM module init function has finished instead of returning immediately.
This is the other half of fixing md(8) (actually, "mfs" in fstab(5)) that is similar to r1.128 of src/sys/dev/md/md.c. This bug would be why RAM disks would often fail on boot and the first call to mdconfig(8) would probably fail.
pjd has ideas for not requiring kldload(8) to work synchronously for control devices that could make this obsolete.
Silence on: -arch
|
136399 |
11-Oct-2004 |
ups |
Trace information about a buffer while we still control it.
Reviewed by: phk Approved by: sam (mentor)
|
136284 |
08-Oct-2004 |
sos |
Only do the geometry translations on ad* devices; other devices seem to have their own way of life. Those other devices' translations should be moved here as well.
|
136236 |
07-Oct-2004 |
pjd |
Be sure to always return 0 for negative access requests.
Reported by: Maciej Kucharz <qk@comp.waw.pl>
|
136233 |
07-Oct-2004 |
sos |
Move the PC98 specific geometry "gunk" to geom_pc98.c where it belongs. This also adds support for bigger disks on the controller I have access to, and maybe others if I understood the adhoc methods used on those.
Those with more PC98 bigdrive controllers are hereby invited to add/fix support for them in geom_pc98.c instead of using #ifdef PC98 all over the place.
|
136201 |
06-Oct-2004 |
phk |
Don't set the BIO_ONQUEUE debugging flag until we actually put the bio onto a queue. This made the ENOMEM handling an instant panic.
|
136197 |
06-Oct-2004 |
pjd |
Geoms without softc are geoms which are initialized, so wait for them.
|
136191 |
06-Oct-2004 |
pjd |
Look out for geoms without softc.
Reported by: tegge
|
136143 |
05-Oct-2004 |
pjd |
Before root file system is mounted, wait for mirrors in degraded state.
|
136065 |
02-Oct-2004 |
le |
Don't allow creating a drive that already exists.
|
136064 |
02-Oct-2004 |
le |
Correctly skip the '/dev/' part when creating new drives and prefix a drive's provider with '/dev/' when printing the config.
Reported by: will@
|
136056 |
02-Oct-2004 |
pjd |
Unlock g_gate_list_mtx mutex when we cannot allocate unit number. MT5 candidate.
PR: kern/72253 Submitted by: Ivan Voras <ivoras@fer.hr>
|
135966 |
30-Sep-2004 |
le |
Make it possible to rebuild degraded RAID5 plexes. Note that it is currently not possible to do this while the volume is mounted.
MFC in: 1 week
|
135876 |
28-Sep-2004 |
phk |
Protect the start/end counts on consumers and providers with the up/down mutexes.
Make it possible to also protect the disk statistics (at a minor cost in performance) by setting bit 2 of kern.geom.collectstats.
|
135873 |
28-Sep-2004 |
pjd |
- Set maximum request size to MAXPHYS (128kB), instead of DFLPHYS (64kB).
- Set minimum request size to sectorsize, instead of 512 bytes.
Approved by: phk (some time ago)
|
135872 |
28-Sep-2004 |
pjd |
Just use MAXPHYS as maximum I/O request size, instead of using my own #define for this purpose. No functional change.
|
135866 |
27-Sep-2004 |
pjd |
Decrease kern.geom.raid3.timeout to 4, so it is smaller than vfs.root.mountdelay by default.
|
135865 |
27-Sep-2004 |
pjd |
Deny invalid I/O requests which come from userland here, because otherwise we'll get a panic later. MT5 candidate.
Reviewed by: phk
|
135863 |
27-Sep-2004 |
pjd |
Avoid race while synchronizing components. It is very hard to bump into, but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.
Found by: tegge (for gmirror)
|
135859 |
27-Sep-2004 |
pjd |
Minor, but very important condition fix. The current one can never be true.
|
135854 |
27-Sep-2004 |
pjd |
Decrease kern.geom.mirror.timeout to 4, so it is smaller than vfs.root.mountdelay by default.
|
135834 |
26-Sep-2004 |
pjd |
Forgot to commit addition of ds_resync field.
|
135833 |
26-Sep-2004 |
pjd |
Avoid race while synchronizing components. It is very hard to bump into, but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.
Found by: tegge
|
135831 |
26-Sep-2004 |
pjd |
Simplify code a bit.
|
135716 |
24-Sep-2004 |
phk |
Assert topology is held in g_dev_getprovider().
Don't call devsw(). It is not necessary, and we do not need to hold dev_lock to compare the devsw pointer to our own since we do not dereference it.
|
135522 |
20-Sep-2004 |
pjd |
This is not needed anymore, it is forced in GEOM now. Actually, it can even cause some problems, because GEOM requires sectorsize to be more than 0 on first access, not on provider creation, so we can skip valid providers by doing this check here.
Reported by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Sven Willenberger <sven@dmv.com>
|
135461 |
19-Sep-2004 |
fjoe |
Use correct malloc type when freeing memory allocated by g_read_data.
PR: 71431 Submitted by: daichi
|
135434 |
18-Sep-2004 |
le |
Single concat or striped plexes don't need no special initialization if their subdisks are all available, so let them be brought up.
|
135426 |
18-Sep-2004 |
le |
Re-vamp how I/O is handled in volumes and plexes.
Analogous to the drive level, give each volume and plex a worker thread that picks up and processes incoming and completed BIOs.
This should fix the data corruption issues that have come up a few weeks ago and improve performance, especially of RAID5 plexes.
The volume level needs a little work, though.
|
135302 |
16-Sep-2004 |
fjoe |
g_nop_create: destroy newly created provider in case of errors.
|
135173 |
13-Sep-2004 |
le |
Give the DRIVE geom a worker thread that picks up incoming bios, sends them down, and takes care of the finished bios. This makes it easier to handle I/O errors at drive level.
|
135164 |
13-Sep-2004 |
le |
Rename gv_kill_thread() to gv_kill_plex_thread(), since there are more threads to come.
|
135162 |
13-Sep-2004 |
le |
Save the config back to disk when a drive goes down.
|
135161 |
13-Sep-2004 |
le |
Read a whole sector instead of GV_HDR_LEN, since a sector might be bigger (i.e. on CD-ROMs).
|
135151 |
13-Sep-2004 |
pjd |
Make kern.geom.debugflags sysctl tunable from /boot/loader.conf. It will help to debug problems when booting.
Approved by: phk
|
135085 |
11-Sep-2004 |
phk |
Fix a problem that shows up if less than the full complement of lock sectors are defined ("number_of_keys" argument to gbde init being less than 4 in the default compile).
|
135084 |
11-Sep-2004 |
phk |
Respect that G_BDE_MAXKEYS is a compile time variable.
|
134958 |
08-Sep-2004 |
fjoe |
Do not compile in zlib.c. Add a dependency on module instead.
|
134957 |
08-Sep-2004 |
pjd |
Show current status of mirror device directly.
Suggested by: Krzysztof Ciepłucha <kris@home.pl>
|
134824 |
05-Sep-2004 |
phk |
For removable devices without media we set a zero mediasize but a non-zero sectorsize in order to avoid a lot of checks around various divisions etc.
Enforce the sectorsize being > 0 with a KASSERT on successful open.
Fix scsi_cd.c to return 2k sectors when no media inserted.
|
134528 |
30-Aug-2004 |
pjd |
Allow to configure debug level from /boot/loader.conf.
|
134519 |
30-Aug-2004 |
phk |
Add more KASSERTS and checks.
|
134486 |
29-Aug-2004 |
pjd |
GCC, ehh.
|
134421 |
28-Aug-2004 |
pjd |
Use sc->sc_mediasize instead of sc->sc_provider->mediasize which contains exactly the same value, but is shorter.
|
134420 |
28-Aug-2004 |
pjd |
Warn the user if we are not going to use whole provider space.
Requested by: Michael Handler <handler@grendel.net>
|
134418 |
28-Aug-2004 |
pjd |
Don't allow inserting providers that are too small.
Reported by: Michael Handler <handler@grendel.net>
|
134407 |
27-Aug-2004 |
le |
Move config_new_drive() to the correct place and rename it to gv_config_new_drive().
|
134379 |
27-Aug-2004 |
phk |
Introduce g_alloc_bio() as a waiting variant of g_new_bio().
Use in places where we can sleep and where we previously failed to check for a NULL pointer.
MT5 candidate.
|
134356 |
26-Aug-2004 |
le |
When attaching a consumer from a volume to a plex, check if the volume already has a plex attached and adjust the access counts of the new consumer accordingly.
|
134344 |
26-Aug-2004 |
pjd |
Skip providers with an undefined sector size.
Reported by: kuriyama
|
134303 |
25-Aug-2004 |
pjd |
Log verification errors at level 1.
|
134292 |
25-Aug-2004 |
pjd |
Dump disk number.
|
134226 |
23-Aug-2004 |
pjd |
Allow to set kern.geom.mirror.timeout from /boot/loader.conf.
|
134221 |
23-Aug-2004 |
le |
Compare the addresses of two RAID5 work packets directly instead of the addresses of their related bios when locking one out, since they could share a bio and this could lead to parity corruption.
|
134176 |
22-Aug-2004 |
le |
Implement the possibility to remove drives.
|
134168 |
22-Aug-2004 |
pjd |
Implementation of 'verify reading' algorithm, which uses parity data for verification of regular data when device is in complete state. On verification error, EIO error is returned for the bio and sysctl kern.geom.raid3.stat.parity_mismatch is increased.
Suggested by: phk
|
134155 |
22-Aug-2004 |
le |
Add forgotten format specifier in a KASSERT and shut up the compiler.
Submitted by: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
|
134136 |
21-Aug-2004 |
pjd |
Add version history.
|
134124 |
21-Aug-2004 |
pjd |
Implement a new reading algorithm, which will use the parity component for reading as well, even if the device is in complete state. I observe a 40% speed-up with this option for random read operations, but a slowdown for sequential reads. Basically, without this option reading from a RAID3 device built from 5 components (c0-c4) looks like this:
Request no.  Used components
1            c0+c1+c2+c3
2            c0+c1+c2+c3
3            c0+c1+c2+c3
With the new feature:
Request no.  Used components
1            c0+c1+c2+c3
2            (c1^c2^c3^c4)+c1+c2+c3
3            c0+(c0^c2^c3^c4)+c2+c3
4            c0+c1+(c0^c1^c3^c4)+c3
5            c0+c1+c2+(c0^c1^c2^c4)
6            c0+c1+c2+c3
[...]
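The rotation listed in this entry can be sketched as follows. This is a minimal illustration, not the graid3 source: it assumes four data components (c0-c3) plus one parity component (c4), as in the five-component example, and the names skipped_component() and rebuild_byte() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NDATA 4		/* data components c0..c3; c4 holds parity */

/*
 * For a 1-based request number, return the data component that is
 * rebuilt from parity instead of being read, or -1 if all data
 * components are read directly.  The cycle has NDATA + 1 steps.
 */
int
skipped_component(unsigned int req_no)
{
	unsigned int slot;

	assert(req_no >= 1);
	slot = (req_no - 1) % (NDATA + 1);
	return (slot == 0 ? -1 : (int)slot - 1);
}

/*
 * Rebuild one byte of component 'skip' as the XOR of the parity byte
 * and the bytes of all other data components.
 */
uint8_t
rebuild_byte(const uint8_t data[NDATA], uint8_t parity, int skip)
{
	uint8_t v = parity;
	int i;

	for (i = 0; i < NDATA; i++)
		if (i != skip)
			v ^= data[i];
	return (v);
}
```

Because the parity is the XOR of all data components, XOR-ing it with every component except one yields the missing one, which is why a request can substitute c4 for any single data component and spread reads across all five disks.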
|
134014 |
19-Aug-2004 |
le |
A volume can be up if it has a degraded RAID5 plex.
|
133991 |
18-Aug-2004 |
pjd |
We really don't want to receive spoil events for synchronization consumers.
|
133986 |
18-Aug-2004 |
phk |
Do not override the class provided dumpconf function.
|
133984 |
18-Aug-2004 |
le |
Pretty print some informational messages.
|
133983 |
18-Aug-2004 |
le |
Fix a stupid bug in the drive taste function: when checking if a drive is known to the configuration, check also if it already has a geom. Without this check several needless geoms were created and valid configuration data was overwritten.
This change obsoletes the need for a separate geom to taste an offered provider and the consumer doesn't need to be opened with the exclusive bit set.
|
133981 |
18-Aug-2004 |
pjd |
NOP class doesn't operate on metadata, so the spoil event can be safely ignored.
|
133979 |
18-Aug-2004 |
pjd |
Dump device status on 'list' command.
|
133946 |
18-Aug-2004 |
pjd |
Bump synchronization ID if we are sure that we have ACTIVE components.
|
133839 |
16-Aug-2004 |
obrien |
Minor style.9 cleanup.
|
133825 |
16-Aug-2004 |
pjd |
Decrease debug level to 0.
|
133823 |
16-Aug-2004 |
pjd |
Fix warning.
|
133808 |
16-Aug-2004 |
pjd |
Introduce GEOM RAID3 class, i.e. kernel module, which implements RAID3 transformation and graid3(8) userland utility, which can be used for configuration. No manual page yet, sorry.
Hardware provided by: Daniel Seuffert
|
133752 |
15-Aug-2004 |
pjd |
Avoid code duplication by introducing g_mirror_write_metadata() function, which is used now by g_mirror_clear_metadata() function and g_mirror_update_metadata() function.
|
133717 |
14-Aug-2004 |
le |
Make informational output look less like an accident.
|
133640 |
13-Aug-2004 |
fjoe |
Add geom_uzip -- geom class that implements read-only compressed disks. Currently supports cloop V2.0 disk compression format. May support more formats in future.
|
133530 |
11-Aug-2004 |
pjd |
MFp4: Simplify code a bit:
- Remove kern.geom.mirror.sync_block_size sysctl. It is quite obvious that we want to use the biggest size possible.
- Do not use UMA zone for sync data allocations. There could be only one synchronization request per synchronized disk at a time, so allocate memory for one request for the whole synchronization process related to one disk.
Tested by synchronizing one component (out of three) and by synchronizing two components (out of three) in parallel.
|
133528 |
11-Aug-2004 |
pjd |
Actually, HARDCODED flag isn't stored in metadata, so don't bother dumping it.
|
133527 |
11-Aug-2004 |
pjd |
- Fix typo.
- Dump HARDCODED flag.
|
133498 |
11-Aug-2004 |
pjd |
Increase default kern.geom.stripe.maxmem to 50 elements.
|
133487 |
11-Aug-2004 |
pjd |
When sending a request once again because of ENOMEM, reset the bio_children and bio_inbed fields to 0. Without this change we can end up with I/O leakage in some rare situations. I tested this change by putting a failure probability mechanism similar to the one used in the NOP class into the g_clone_bio(9) function, so it was able to return NULL with the given probability.
Discussed with: phk
|
133484 |
11-Aug-2004 |
pjd |
Try harder to not panic on 'stop -f'. After the commit, this command should be really safe to use.
|
133450 |
10-Aug-2004 |
le |
If we kill the worklist thread of a RAID5 plex we can destroy the worklist mutex at the same time, so move the mtx_destroy() call to gv_kill_thread().
|
133449 |
10-Aug-2004 |
le |
Lock the topology before calling gv_parse_config, not afterwards.
|
133448 |
10-Aug-2004 |
pjd |
- Recognize HARDCODED flag when dumping consumer configuration.
- Improve code readability a bit.
|
133447 |
10-Aug-2004 |
pjd |
Forgot to commit those: introduce hardcoded provider functionality, which allows storing the provider's name in the metadata and avoids problems when several providers share the same last sector.
|
133444 |
10-Aug-2004 |
pjd |
Fix one of the latest commits. This bio_caller1 should also be changed to bio_driver1 (as all the rest). This introduced a small memory leak, but it wasn't really critical, because the maximum memory for g_stripe_zone is always set, so after a few requests gstripe was working in "economic" mode.
|
133373 |
09-Aug-2004 |
pjd |
- Introduce option for hardcoding providers' names into metadata. It allows fixing problems when the last sector of a provider is shared between several providers.
- Bump version number for CONCAT and STRIPE and add code for backward compatibility.
- Do not bump version number of MIRROR, as it wasn't officially introduced yet. Even if someone started to play with it, there is no big deal, because wrong MD5 sum of metadata will deny those providers.
- Update manual pages.
- Add version history to g_(stripe|concat).h files.
|
133371 |
09-Aug-2004 |
pjd |
Do not use g_wither_geom(9). It doesn't work in the way which is expected here anymore (after g_wither_washer() was introduced), i.e. geom and consumer will not be immediately destroyed if possible.
|
133356 |
09-Aug-2004 |
phk |
Too many versions.
Spotted by: pjd
|
133319 |
08-Aug-2004 |
phk |
OK, now check geom class version numbers.
|
133318 |
08-Aug-2004 |
phk |
Tag all geom classes in the tree with a version number.
|
133316 |
08-Aug-2004 |
phk |
Oops, that check was a bit premature. Allow zero versions as well.
|
133314 |
08-Aug-2004 |
phk |
Use default method initialization on geoms.
|
133312 |
08-Aug-2004 |
phk |
Give classes a version number and refuse to touch classes which are not understood. This makes room for additional binary compatibility in the future.
Put fields in the class for the geom's methods and initialize the methods of a new geom from these fields. This saves some code in all classes.
|
133205 |
06-Aug-2004 |
pjd |
Add and document kern.geom.stripe.fast_failed sysctl, which shows how many times "fast" mode failed.
|
133204 |
06-Aug-2004 |
pjd |
Fields bio_caller[12] should be used by the consumer and fields bio_driver[12] should be used by the provider!
|
133201 |
06-Aug-2004 |
pjd |
Fix I/O leakage. We're cloning bios in g_stripe_start_fast(), but when something goes wrong while running in "fast" mode, we free all bios and fall back to "economic" mode. Freeing bios does not decrease bio_children, so bio_inbed could never be equal to bio_children and the request was never finished. Decrease bio_children manually when destroying bios.
Reported by: Sam Lawrance <boris@brooknet.com.au>, simon
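The accounting problem described in this entry can be modelled in a few lines. This is a toy sketch, not the GEOM code: it assumes a parent request is considered finished only when its completed-clone count equals its created-clone count, and `ParentBio` and `run_request` are hypothetical names.

```python
class ParentBio:
    """Toy model of the two counters named in the commit message."""

    def __init__(self):
        self.bio_children = 0  # clones created so far
        self.bio_inbed = 0     # clones that have completed

    def clone(self):
        self.bio_children += 1

    def complete_child(self):
        self.bio_inbed += 1

    def finished(self):
        return self.bio_inbed == self.bio_children


def run_request(decrement_on_destroy):
    """Simulate a failed fast attempt followed by the fallback path."""
    parent = ParentBio()
    # Fast attempt: three clones are created, then all are destroyed.
    for _ in range(3):
        parent.clone()
    for _ in range(3):
        if decrement_on_destroy:
            parent.bio_children -= 1
    # Fallback: two clones are created and both complete.
    for _ in range(2):
        parent.clone()
    for _ in range(2):
        parent.complete_child()
    return parent.finished()
```

With `decrement_on_destroy=False` the parent never reaches the finished state, which corresponds to the leak; with `True` it does.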
|
133173 |
05-Aug-2004 |
pjd |
Don't use 'bp' after its destruction!
|
133170 |
05-Aug-2004 |
pjd |
Simplify a bit - we could use 'sc' here as it was initialized properly.
|
133142 |
04-Aug-2004 |
pjd |
- Add two fields to bio structure: 'bio_cflags' which can be used by consumer and 'bio_pflags' which can be used by provider.
- Remove BIO_FLAG1 and BIO_FLAG2 flags. From now on new fields should be used for internal flags.
- Update g_bio(9) manual page.
- Update some comments.
- Update GEOM_MIRROR, which was the only one using BIO_FLAGs.
Idea from: phk
Reviewed by: phk
|
133115 |
04-Aug-2004 |
pjd |
- Add "prefer" balance algorithm. When used, only disk with the biggest priority will be used for reading.
- Bump version number.
|
133114 |
04-Aug-2004 |
pjd |
MFp4: We don't really need g_mirror_free_disk() function.
|
133079 |
03-Aug-2004 |
pjd |
Fix comment.
|
132988 |
02-Aug-2004 |
pjd |
- Fix unloading in the same way it is done in my other classes: set gp->softc to NULL and return ENXIO when it is NULL, so GEOM will not panic or hang, but will unload one device on every 'unload'. This makes the 'unload' command usable, but it has to be executed <number of devices> + 1 times.
- Made use of 'pp' variable.
|
132976 |
01-Aug-2004 |
pjd |
Typo.
|
132954 |
01-Aug-2004 |
pjd |
- Launch main provider when there are no more disks in NEW state.
- Log syncid bump at debug level 1.
|
132941 |
31-Jul-2004 |
pjd |
If there are no valid components after the timeout, just destroy device. There is probably nothing to wait for.
|
132940 |
31-Jul-2004 |
le |
Propagate size changes upwards.
|
132938 |
31-Jul-2004 |
pjd |
Handle spoil event in dedicated function: g_mirror_spoiled(). The difference between the new function and g_mirror_orphan() (which was used previously) is that syncid is bumped immediately, instead of on first write, because when a consumer is spoiled, it means that its provider was opened for writing, so we can't trust that its data will be valid when it is connected again.
|
132923 |
31-Jul-2004 |
pjd |
Remove unused field.
|
132922 |
31-Jul-2004 |
pjd |
Destroy synchronization geom immediately. This should fix unloading without stopping all mirrors.
|
132911 |
31-Jul-2004 |
pjd |
Allow slice creation on providers from MIRROR class. This should allow mounting root file system from a mirror.
|
132909 |
31-Jul-2004 |
pjd |
Add '-p' option for 'insert' command which allows to specify priority of the new component. Version number wasn't bumped (it should be), because I think there are no geom_mirror users yet.
|
132908 |
31-Jul-2004 |
pjd |
- Check if 'slice' argument was given.
- Check if disk isn't already the mirror component.
|
132907 |
31-Jul-2004 |
pjd |
Dump correct field.
|
132906 |
30-Jul-2004 |
le |
Set the access counts of a subdisk correctly when attaching it to a plex that already has subdisks.
|
132904 |
30-Jul-2004 |
pjd |
Add GEOM_MIRROR class which provides RAID1 functionality and has many useful features. The gmirror(8) utility should be used for control of this class. There is no manual page yet, but I'm working on it with keramida@.
Many useful tests provided by: simon (thank you!)
Some ideas from: scottl, simon, phk
|
132896 |
30-Jul-2004 |
pjd |
Nuke geom_mirror class. New geom_mirror class is in the way.
Approved by: phk
|
132895 |
30-Jul-2004 |
pjd |
Allow to create slices on providers from class LABEL and class NOP. This is really ugly way to do this, but there is no other way for now. It allows to mount root file system from providers which belong to those classes.
Approved by: phk
|
132877 |
30-Jul-2004 |
pjd |
- Add '-S' option, which allows specifying the sector size for the transparent provider.
- Bump version number.
This allows for a quite interesting trick. One can setup a stripe with stripe size of 512 bytes and create transparent provider on top of it with sector size equal to <ndisks> * 512. The result will be something like RAID3 without parity disk (every access will touch all disks).
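The claim that every access touches all disks follows from simple arithmetic, sketched here. The model assumes plain round-robin striping (stripe number modulo number of disks) and sector-aligned accesses; `disks_touched` is a hypothetical helper, not part of any GEOM class.

```python
def disks_touched(offset, length, ndisks, stripe_size=512):
    """Set of disk indexes hit by a request on a round-robin stripe."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return {stripe % ndisks for stripe in range(first, last + 1)}


def sector_access(sector_no, ndisks, stripe_size=512):
    """Disks hit by one sector when the sector size is ndisks * stripe_size."""
    sector_size = ndisks * stripe_size
    return disks_touched(sector_no * sector_size, sector_size, ndisks, stripe_size)
```

Because one such sector spans exactly `ndisks` consecutive stripes, each disk contributes one stripe to every sector.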
|
132833 |
29-Jul-2004 |
le |
Shut up the compiler and temporarily '#if 0' gv_destroy_geom(), until we need it again.
|
132665 |
26-Jul-2004 |
pjd |
Improve geom(8)'s 'list' command to show geoms and their providers and consumers. Teach STRIPE, CONCAT and NOP classes about this improvement.
|
132664 |
26-Jul-2004 |
pjd |
Change naming scheme from /dev/<name>.stripe to /dev/stripe/<name>.
|
132663 |
26-Jul-2004 |
pjd |
Change naming scheme from /dev/<name>.concat to /dev/concat/<name>.
|
132662 |
26-Jul-2004 |
pjd |
M_WAITOK is ok here, while I'm using M_WAITOK later in this function.
|
132661 |
26-Jul-2004 |
pjd |
M_WAITOK is ok here, while I'm using M_WAITOK later in this function.
|
132654 |
26-Jul-2004 |
le |
Save the vinum config back to disk after syncing two plexes.
|
132642 |
25-Jul-2004 |
le |
There's a chance that the VINUMDRIVE class tastes before the VINUM class, so let the VINUMDRIVE class parse the on-disk configuration, too.
|
132631 |
25-Jul-2004 |
le |
Check for a NULL pointer before dereferencing it.
|
132617 |
24-Jul-2004 |
le |
Use a temporary geom when tasting vinumdrives and lock the 'real' vinumdrive geom with an exclusive bit. This should fix the problem when underlying partitions overlap (i.e. the 'a' partition is at the same offset as the 'c' partition).
Ideas borrowed from pjd@, quite a bit of testing by Matthias Schuendehuette <msch@snafu.de>.
|
132607 |
24-Jul-2004 |
le |
Disable kldunloading of geom_vinum temporarily until I figured out how to do it correctly.
|
132381 |
19-Jul-2004 |
pjd |
MFp4: Add two options for gnop(8)'s 'create' command:
-o offset - specifies where to start on the original provider
-s size - specifies size of the transparent provider
|
132355 |
18-Jul-2004 |
pjd |
Fix copy&paste bug.
|
132342 |
18-Jul-2004 |
pjd |
Fix exclusive-bit leakage.
|
132199 |
15-Jul-2004 |
phk |
Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events.
A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
|
132098 |
13-Jul-2004 |
pjd |
Remove unused macro.
|
132097 |
13-Jul-2004 |
pjd |
Decrease log level of one debug message, so there is no hole (level 2 wasn't used at all).
|
132095 |
13-Jul-2004 |
pjd |
Minor sysctl description fixes.
Submitted by: simon
|
131878 |
09-Jul-2004 |
pjd |
Implement "FAST" mode for GEOM_STRIPE class and turn it on by default.
In this mode you can set up even a very small stripe size and you can be sure that only one I/O request will be sent to every disk in the stripe. It consumes some more memory, but if allocation fails, it will fall back to "ECONOMIC" mode.
It is about 10 times faster for small stripe sizes than "ECONOMIC" mode and other RAID0 implementations. It is even recommended to use this mode and a small stripe size, so our requests are always split.
One can still use "ECONOMIC" mode by setting kern.geom.stripe.fast to 0. It is also possible to setup maximum memory which "FAST" mode can consume, by setting kern.geom.stripe.maxmem from /boot/loader.conf.
|
131877 |
09-Jul-2004 |
phk |
Only detach consumers which are attached when we wither stuff away.
Pointed out by: pjd
|
131820 |
08-Jul-2004 |
phk |
Make withering water tight.
When we orphan/wither a provider, an attached geom+consumer could end up being withered as a result and it may be in front of us in the normal object scanning order, so we need to do multi-pass. On the other hand, there may be withering stuff we can't get rid of (yet), so we need to keep track of both the existence of withering stuff and if there is more we can do at this time.
|
131798 |
08-Jul-2004 |
phk |
Fail normally rather than KASSERT if attempt to open a spoiled consumer.
|
131718 |
06-Jul-2004 |
pjd |
Add missing argument.
|
131716 |
06-Jul-2004 |
pjd |
Properly free resources if g_access() fails.
|
131649 |
05-Jul-2004 |
pjd |
- Add 'stop' command, which works just like 'destroy' command, but sounds less dangerous.
- Update manual pages and extend examples.
- Bump versions.
|
131625 |
05-Jul-2004 |
pjd |
g_clone_bio() can fail, be ready for this.
Approved by: le
|
131568 |
04-Jul-2004 |
phk |
We only need to check for overlaps if we are increasing access counts.
|
131476 |
02-Jul-2004 |
pjd |
Introduce GEOM_LABEL class. This class is used for detecting volume labels on file systems: UFS, MSDOSFS (FAT12, FAT16, FAT32) and ISO9660. It also provides native labeling (there is no need for a file system).
g_label_ufs.c is based on geom_vol_ffs from Gordon Tetlow. g_label_msdos.c and g_label_iso9660.c are probably hacks, I just found where volume labels are stored and I use those offsets here, but with this class it should be easy to do it as it should be done by someone who know how. Implementing volume labels detection for other file systems also should be trivial.
New providers are created in those directories:
/dev/ufs/      (UFS1, UFS2)
/dev/msdosfs/  (FAT12, FAT16, FAT32)
/dev/iso9660/  (ISO9660)
/dev/label/    (native labels, configured with glabel(8))
Manual page cleanups and some comments inside were submitted by Simon L. Nielsen, who was, as always, very helpful. Thanks!
|
131411 |
01-Jul-2004 |
pjd |
Remove unused argument for good.
|
131408 |
01-Jul-2004 |
pjd |
Free only if pointer isn't NULL.
|
131267 |
29-Jun-2004 |
phk |
Fix regression in last commit.
|
131207 |
27-Jun-2004 |
phk |
Make sure to kill the devstat entry for disappearing disks.
PR: 68074
Submitted by: Hendrik Scholz <hscholz@raisdorf.net>
|
131188 |
27-Jun-2004 |
pjd |
Introduce a hack that will make geom_gate work with read-only mounts. Now, when trying to mount a file system in read-only mode, it tries to open a device for writing to be able to update to read-write mode later. Ehh.
Discussed with: phk
|
131160 |
26-Jun-2004 |
rwatson |
The g_up and g_down threads use a local 'mymutex' mutex to allow WITNESS to warn about attempts to sleep in the I/O path. This change pushes the definition and use of 'mymutex' behind #ifdef WITNESS to avoid the cost in non-debugging cases. This results in a clear .22% performance win for 512 byte and 1k I/O tests on my SMP test box. Not much, but every bit counts.
|
131107 |
25-Jun-2004 |
le |
Mark a plex as 'newborn' when it is created. This is used to indicate that new RAID5 plexes need to be initialized first.
|
131046 |
24-Jun-2004 |
pjd |
Don't force class to give a valid softc to g_slice_new(), it is not always needed.
Approved by: phk
|
131015 |
24-Jun-2004 |
csjp |
Currently, if the drives specified for volume creation are not active GEOM providers, it will result in a kernel panic.
If the GEOM provider or disk goes away before the volume configuration data gets written to the disk, it will result in another kernel panic.
o Make sure that the drives specified for volume creation are active GEOM providers.
o When writing out volume configuration data to associated drives, make sure that the GEOM provider is active, otherwise continue to the next drive in the volume.
Approved by: le, bmilekic (mentor)
|
131000 |
23-Jun-2004 |
le |
Add a function to clean up RAID5 packets and use it when I/O has finished or when building the complete packet fails.
|
130997 |
23-Jun-2004 |
le |
Remove two debugging printfs that are currently more disturbing than helpful.
|
130990 |
23-Jun-2004 |
le |
Accept "sd len 0" and auto-size the subdisk correctly.
Spotted by: csjp
|
130930 |
22-Jun-2004 |
le |
No need to free the softc, because it wasn't allocated.
|
130925 |
22-Jun-2004 |
le |
Don't sleep in the g_down path. More error checks to come.
|
130875 |
21-Jun-2004 |
phk |
Kill g_access_rel() already now before we send it down 5-stable
|
130836 |
21-Jun-2004 |
pjd |
Don't hold topology lock while calling g_gate_release().
Found by: KASSERT()
|
130712 |
19-Jun-2004 |
phk |
Duplicate the securelevel check from spec_vnops.c here.
|
130697 |
18-Jun-2004 |
le |
Clean up allocated resources when destroying the main vinum geom.
|
130651 |
17-Jun-2004 |
phk |
Reduce the thaumaturgical level of root filesystem mounts: Instead of using an otherwise redundant clone routine in geom_disk.c, mount a temporary DEVFS and do a proper lookup.
Submitted by: thomas
|
130640 |
17-Jun-2004 |
phk |
Second half of the dev_t cleanup.
The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()
Various minor adjustments including handling of userland access to kernel space struct cdev etc.
|
130597 |
16-Jun-2004 |
le |
Handle dead disks in a somewhat sane way.
|
130585 |
16-Jun-2004 |
phk |
Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
|
130542 |
15-Jun-2004 |
le |
Fix several bugs related to subdisk drive_offset calculation.
|
130478 |
14-Jun-2004 |
le |
Don't free a VINUMDRIVE softc when it's orphaned or spoiled. All allocated resources should be ultimately freed in gv_destroy_geom() (when unloading the module and not earlier), but I need to look at this more closely.
|
130477 |
14-Jun-2004 |
le |
Correctly calculate subdisk offset in RAID5 plexes.
|
130389 |
12-Jun-2004 |
le |
Add a first version of a GEOMified vinum.
|
130280 |
09-Jun-2004 |
phk |
Make the sysctl kern.geom.collectstats more granular.
Bit 0 controls statistics collection on GEOM providers.
Bit 1 controls statistics collection on GEOM consumers.
Default value is 1.
Prodded by: scottl
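The bit layout described in this entry can be decoded as in the sketch below. The constant and function names are hypothetical; only the meaning of bits 0 and 1 and the default value of 1 come from the commit message.

```python
COLLECT_PROVIDERS = 1 << 0  # bit 0: statistics on GEOM providers
COLLECT_CONSUMERS = 1 << 1  # bit 1: statistics on GEOM consumers


def decode_collectstats(value):
    """Report which kinds of objects have statistics collection enabled."""
    return {
        "providers": bool(value & COLLECT_PROVIDERS),
        "consumers": bool(value & COLLECT_CONSUMERS),
    }
```

Under this reading, the default value of 1 enables collection for providers only, and a value of 3 enables it for both.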
|
130193 |
07-Jun-2004 |
pjd |
Fix format string.
|
130191 |
07-Jun-2004 |
pjd |
Don't allow for duplicated entries creation.
|
129963 |
01-Jun-2004 |
joerg |
Add SVR4-compatible VTOC-style elements to the Sun label. The FreeBSD kernel doesn't use them but sunlabel(8) shortly will, and both these files are used by sunlabel(8).
|
129877 |
30-May-2004 |
phk |
Zap a redundant NULL
|
129747 |
26-May-2004 |
pjd |
Dump some more information:
- device state
- list of used providers
- total number of disks
- number of disks online
Prodded by: Alex Deiter <tiamat@komi.mts.ru>
|
129548 |
21-May-2004 |
pjd |
- Change command name from 'config' to 'configure'.
- Bump version number.
|
129478 |
20-May-2004 |
pjd |
- Teach CONCAT class how to talk with geom(8).
- Remove provider if any disk was lost.
- Dump CONCAT version.
Supported by: Wheel - Open Technologies - http://www.wheel.pl
|
129473 |
20-May-2004 |
pjd |
Introduce STRIPE GEOM class. It implements RAID0 transformation and it is intended to be fast. Just like CONCAT class it provides manual and auto configuration methods.
Supported by: Wheel - Open Technologies - http://www.wheel.pl
|
129471 |
20-May-2004 |
pjd |
Introduce NOP GEOM class. This is a totally transparent GEOM class, but it is very useful for tests. One is able to destroy its provider forcibly in order to test how other classes handle such events. One is also able to specify a failure probability to check how other classes handle I/O errors.
Supported by: Wheel - Open Technologies - http://www.wheel.pl
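The failure-probability idea can be sketched as a pass-through that occasionally returns an error. This is an illustrative model only: it assumes the probability is given as a percentage and that a failed request reports EIO, and `nop_start` is a hypothetical name rather than the class's actual entry point.

```python
import errno
import random


def nop_start(rng, failprob):
    """Return errno.EIO with roughly failprob percent probability, else 0."""
    if failprob > 0 and rng.randrange(100) < failprob:
        return errno.EIO
    return 0
```

Passing an explicitly seeded `random.Random` instance keeps a test run reproducible, which is useful when checking how an upper layer reacts to injected errors.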
|
129116 |
11-May-2004 |
sos |
Don't try to finish devstats if the disk pointer is NULL; this can happen when a disk has been destroyed but still has outstanding bios.
Reviewed by: phk
|
128957 |
05-May-2004 |
pjd |
Close some small wakeup<->msleep races.
|
128913 |
04-May-2004 |
pjd |
Fix compilation on 64-bit architectures.
Noticed by: Tinderbox
|
128889 |
03-May-2004 |
pjd |
Turn off debugging by default.
|
128887 |
03-May-2004 |
pjd |
Prefer signed type over unsigned to be able to assert negative reference count.
|
128881 |
03-May-2004 |
pjd |
- Hold g_gate_list_mtx lock while generating/checking unit number.
  Found by: mtx_assert() g_gate.c:273
- Set command before returning to userland with ENOMEM error value.
  Found by: assert() ggatel.c:108
|
128835 |
02-May-2004 |
pjd |
Make it compile on 64-bit architectures. The biggest issue was that 16-bit atomic operations aren't supported on all architectures.
|
128760 |
30-Apr-2004 |
pjd |
Kernel bits of GEOM Gate.
|
128747 |
30-Apr-2004 |
marcel |
Allow disks with a GPT to be used on big-endian machines. The GPT is little-endian by definition and needs byte-swap operations for any multi-byte field. While here fix indentation.
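Why the byte swapping matters can be shown by decoding the start of a GPT header with an explicit byte order. This is a sketch in Python rather than the kernel's C; it relies on the standard header layout (8-byte "EFI PART" signature, then 32-bit revision and header size), and `parse_gpt_header_start` is a hypothetical helper.

```python
import struct


def parse_gpt_header_start(buf):
    """Decode the first three GPT header fields, which are stored little-endian."""
    # "<" forces little-endian decoding regardless of the host byte order.
    signature, revision, header_size = struct.unpack_from("<8sII", buf, 0)
    if signature != b"EFI PART":
        raise ValueError("not a GPT header")
    return revision, header_size
```

Reading the same bytes with the host's native order on a big-endian machine would turn a header size of 92 into 0x5C000000, which is the kind of misreading the explicit conversion avoids.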
|
128486 |
20-Apr-2004 |
pjd |
- Don't check if 'gp' is non-NULL, it always is and GEOM wants to dump geom configuration when 'pp' and 'cp' are NULL.
- Use tabs instead of spaces.
|
127863 |
04-Apr-2004 |
pjd |
Calculate bio_completed properly or die!
Approved by: phk
|
127699 |
01-Apr-2004 |
grehan |
Move the name attribute to the end of the conftxt line to simplify libdisk parsing (the name may be empty, or contain spaces).
Submitted by: Suleiman Souhlal <refugee@segfaulted.com>
|
127162 |
18-Mar-2004 |
pjd |
Move "is consumer attached?" check before G_VALID_PROVIDER() check, because if consumer is not attached, its provider never will be valid, so we never reach this check.
Approved by: phk
|
126832 |
11-Mar-2004 |
phk |
Be more insistent on destroying geoms at unload time. Still not perfect, but it will do (better) for now.
KASSERT that to have providers a class must have an access method.
Tag the new_provider event with the geom as well.
|
126798 |
10-Mar-2004 |
phk |
Rearrange some of the GEOM debugging tools to be more structured.
Retire g_sanity() and corresponding debugflag (0x8)
Retire g_{stall,release}_events().
Under #ifdef DIAGNOSTIC:
Make g_valid_obj() an official function and have it return a non-zero integer which indicates the kind of object when found.
Implement G_VALID_{CLASS,GEOM,CONSUMER,PROVIDER}() macros based on g_valid_obj().
Sprinkle calls to these macros liberally over the infrastructure.
Always check that we do not free a live object.
|
126773 |
09-Mar-2004 |
pjd |
- Don't take sectorsize from first disk. Calculate it by finding least common multiple of all disks sector sizes. This will allow to safely concatenate disks with different sector sizes.
- Mark unused function arguments.
- Other minor cleanups.
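The least-common-multiple calculation mentioned in the first item can be sketched as below. It is a Python illustration, not the CONCAT class code, and `concat_sectorsize` is a hypothetical name.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def concat_sectorsize(sector_sizes):
    """Smallest sector size that is a whole multiple of every disk's sector size."""
    return reduce(lcm, sector_sizes)
```

For disks with 512-byte and 2048-byte sectors the result is 2048, so every sector of the concatenated provider maps onto whole sectors of whichever disk it lands on.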
|
126772 |
09-Mar-2004 |
pjd |
Print a space character between string given as a macro argument and bio description.
|
126726 |
07-Mar-2004 |
phk |
Don't panic on providers already withered when we wither a geom.
|
126674 |
05-Mar-2004 |
jhb |
kthread_exit() no longer requires Giant, so don't force callers to acquire Giant just to call kthread_exit().
Requested by: many
|
126589 |
04-Mar-2004 |
pjd |
Correct year in copyrights.
|
126565 |
03-Mar-2004 |
pjd |
- Remove d_valid field, we can use d_consumer field to check if disk is valid. - Use SYSCTL_DECL() instead of using own, ugly extern.
|
126450 |
01-Mar-2004 |
pjd |
Removed unused fields.
|
126449 |
01-Mar-2004 |
pjd |
We don't need d_length field.
|
126315 |
27-Feb-2004 |
pjd |
Even if we're sure that we can't be orphaned here, we have to define orphan field - we're enforcing it in GEOM. This will reach KASSERT in INVARIANTS case.
Add missing space.
Approved by: scottl (mentor)
|
126314 |
27-Feb-2004 |
pjd |
Remove unused field.
Approved by: scottl (mentor)
|
126080 |
21-Feb-2004 |
phk |
Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION.
Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
|
126007 |
19-Feb-2004 |
pjd |
Introduce CONCAT GEOM class for disk concatenation. It allows manual and automatic (based on on-disk metadata) concatenation.
Reviewed by: phk, scottl
Approved by: scottl (mentor)
|
125975 |
18-Feb-2004 |
phk |
Change the disk(9) API in order to make device removal more robust.
Previously the "struct disk" was owned by the device driver and this gave us problems when the device disappeared and the users of that device were not immediately disappearing.
Now the struct disk is allocated with a new call, disk_alloc(), and owned by geom_disk and just abandoned by the device driver when disk_create() is called.
Unfortunately, this results in a ton of "s/\./->/" changes to device drivers.
Since I'm doing the sweep anyway, a couple of other API improvements have been carried out at the same time:
The Giant awareness flag has been flipped from DISKFLAG_NOGIANT to DISKFLAG_NEEDSGIANT
A version number has been added to disk_create() so that we can detect, report and ignore binary drivers with old ABI in the future.
Manual page update to follow shortly.
|
125803 |
14-Feb-2004 |
phk |
Do not check error code from closing ->access() calls, we know they succeed.
|
125802 |
14-Feb-2004 |
phk |
Add a KASSERT which checks that a class never fails a closing ->access() call.
|
125755 |
12-Feb-2004 |
phk |
Remove the absolute count g_access_abs() function since experience has shown that it is not useful.
Rename the relative count g_access_rel() function to g_access(), only the name has changed.
Change all g_access_rel() calls in our CVS tree to call g_access() instead.
Add an #ifndef BURN_BRIDGES #define of g_access_rel() for source code compatibility.
|
125743 |
12-Feb-2004 |
phk |
Give both consumers and providers a {void *private, u_int index} which the implementing class can use to hang internal info from.
|
125713 |
11-Feb-2004 |
pjd |
Added g_print_bio() function to print information about a given bio.
Approved by: phk, scottl (mentor)
|
125657 |
10-Feb-2004 |
pjd |
Now we have g_topology_assert_not(), so use it to detect deadlocks.
Approved by: phk, scottl (mentor)
|
125656 |
10-Feb-2004 |
pjd |
Added macro which will be used to assert, that the topology lock is not held.
Approved by: phk, scottl (mentor)
|
125651 |
10-Feb-2004 |
phk |
don't call sbuf_clear() right after sbuf_new(), it is not necessary.
|
125591 |
08-Feb-2004 |
phk |
Polish the work/state engine in preparation for HW-crypto support.
|
125590 |
08-Feb-2004 |
phk |
Add a missing error case return.
Problem reported by: Flemming Jacobsen <fj@batmule.dk>
|
125579 |
07-Feb-2004 |
phk |
We don't need to hold Giant to create the worker kthread.
|
125539 |
06-Feb-2004 |
pjd |
Allow decreasing access count even if there is no disk anymore. This will allow closing disks that were removed while opened.
Approved by: phk, scottl (mentor)
|
125538 |
06-Feb-2004 |
le |
Fix memory leak.
PR: kern/58634
Submitted by: le
Approved by: phk
|
125342 |
02-Feb-2004 |
phk |
Allow a GEOM class to unload if it has no geoms or a method function to get rid of them.
Prodded by: pjd
|
125332 |
02-Feb-2004 |
pjd |
- Use proper names in KASSERTs. - Typos.
Approved by: phk, scottl (mentor)
|
125325 |
02-Feb-2004 |
phk |
Check error return from g_clone_bio(). (netchild@)
Rearrange code to avoid duplication (phk@)
Submitted by: netchild@
|
125318 |
02-Feb-2004 |
phk |
Don't mingle malloc/g_event flags.
Spotted by: pjd@
|
125137 |
28-Jan-2004 |
phk |
Bring back the geom_bioqueues, they _are_ a good idea.
ATA will use these RSN.
|
124885 |
23-Jan-2004 |
phk |
Make sure to keep track of canceled events.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
124883 |
23-Jan-2004 |
phk |
Add KASSERTS.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
124881 |
23-Jan-2004 |
phk |
Plug an insignificant memory leak.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
124880 |
23-Jan-2004 |
phk |
Add missing newline in printf.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
124869 |
23-Jan-2004 |
phk |
Remove the MD5_KEY debugging tool
|
124864 |
23-Jan-2004 |
phk |
Remove no longer necessary debug printfs
|
124371 |
11-Jan-2004 |
phk |
Print the correct pointer in a KASSERT.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
124294 |
09-Jan-2004 |
phk |
KASSERT against no-op access requests.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
123761 |
23-Dec-2003 |
phk |
Prevent withering of the provider we're orphaning from happening until we do it ourselves.
Nailed by: Simon Heath <heath@cng.fr>
|
123271 |
07-Dec-2003 |
truckman |
Correct usage of mtx_init() API. This is not a functional change since the code happened to work because MTX_DEF and NULL are both defined as 0.
Reviewed by: phk
|
123233 |
07-Dec-2003 |
phk |
KASSERT against multiple orphanings of providers.
|
123215 |
07-Dec-2003 |
scottl |
Re-arrange and consolidate some random debugging stuff
|
122888 |
18-Nov-2003 |
phk |
Call class->init() and class->fini() while the class is hooked up, rather than right before and right after. This allows these routines to manipulate the mesh.
KASSERT that nobody creates a geom on an alien class.
Assert topology in g_valid_obj().
Approved by: re@
|
122880 |
18-Nov-2003 |
phk |
Fix a harmless bug and add a ')' in a debugging printf.
Submitted by: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
|
122762 |
15-Nov-2003 |
phk |
This is a crude bandaid for 5.2 to protect against providers which disappear while being tasted. I can moderately easily trigger this with atapi-cd, but I do not fully understand the circumstances.
|
122550 |
12-Nov-2003 |
phk |
Make sure to return errors if we have any.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
121476 |
24-Oct-2003 |
phk |
Close the right consumers if we run into trouble opening them all.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
121475 |
24-Oct-2003 |
phk |
Fix two old/new consumer confusions.
Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
121366 |
22-Oct-2003 |
phk |
Fix a braino memory leak.
Found by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
121323 |
22-Oct-2003 |
phk |
Forgotten commit: If a provider has zero sectorsize, it is an indication of lack of media.
Tripped up: peter
|
121253 |
19-Oct-2003 |
phk |
Remove KASSERT check for negative bio_offsets, add "normal" EIO error return for same.
|
121216 |
18-Oct-2003 |
phk |
Retire bio_blkno entirely.
bio_offset is the field drivers should use. bio_pblkno remains as a convenient place to store the number of the device drivers.
|
121030 |
12-Oct-2003 |
phk |
Assume that bp->bio_offset is correctly initialized.
This fixes non-power-of-2 blocksize GEOM I/O.
|
121029 |
12-Oct-2003 |
phk |
Destroy providers marked with G_PF_WITHER when the last consumer has detached.
|
120876 |
07-Oct-2003 |
phk |
Interior decoration changes.
|
120852 |
06-Oct-2003 |
phk |
Allow our bio tools to be used for local bio-chopping by not mandating a bio_from value. bio_to is still mandated (mostly for debugging) and shall be copied from the parent bio.
|
120851 |
06-Oct-2003 |
phk |
Introduce a per provider wither flag
|
120572 |
29-Sep-2003 |
phk |
Return ENODEV in case the driver has no dump routine.
|
120506 |
27-Sep-2003 |
phk |
The present defaults for open and close for device drivers which provide no methods do not make any sense, and are not used by any driver.
It is pretty hard to come up with even a theoretical concept of a device driver which would always fail open and close with ENODEV.
Change the defaults to be nullopen() and nullclose(), which simply do nothing.
Remove explicit initializations to these from the drivers which already used them.
|
120493 |
26-Sep-2003 |
phk |
Add more KASSERTS().
|
120374 |
23-Sep-2003 |
phk |
Be more careful in dumpconf: softc may be NULL for departing devices.
Allow drivers to initialize the d_devstat if they want magic params.
|
119973 |
11-Sep-2003 |
phk |
Reorder a couple of KASSERTS to give more sensible messages.
Found by: GEOM 101 class of '03
|
119891 |
08-Sep-2003 |
phk |
Correct bzero length so we clear the entire key structure.
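A common way this class of bug arises is passing the size of a pointer where the size of the pointed-to object was intended. The sketch below is illustrative only: `struct key`, `clear_key()` and `key_is_clear()` are hypothetical names rather than the code touched by this commit, and userland memset() stands in for the kernel's bzero().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key structure, for illustration only. */
struct key {
	unsigned char bytes[64];
};

/*
 * Clear the whole structure.  sizeof(*k) is the size of the object,
 * whereas sizeof(k) would only be the size of a pointer.
 */
static void
clear_key(struct key *k)
{
	memset(k, 0, sizeof(*k));
}

/* Return 1 if every byte of the key is zero, 0 otherwise. */
static int
key_is_clear(const struct key *k)
{
	size_t i;

	for (i = 0; i < sizeof(k->bytes); i++)
		if (k->bytes[i] != 0)
			return (0);
	return (1);
}
```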
|
119809 |
06-Sep-2003 |
phk |
Bzero the right number of bytes.
Found by: Juergen Buchmueller <pullmoll@stop1984.com>
|
119749 |
04-Sep-2003 |
phk |
Make sure to return ENOIOCTL if the ioctl is not handled.
|
119660 |
01-Sep-2003 |
phk |
Simplify the ioctl handling in GEOM.
This replaces the current ioctl processing with a direct call path from geom_dev() where the ioctl arrives (from SPECFS) to any directly connected GEOM class.
The inverse of the above is no longer supported. This is the situation where you have one or more intervening GEOM classes, for instance a BSDlabel on top of an MBR or PC98. If you want to issue MBR or PC98 specific ioctls, you will need to issue them on an MBR or PC98 provider.
This paves the way for inviting CD's, FD's and other special cases inside GEOM.
|
119652 |
01-Sep-2003 |
phk |
Try to close the race between disk_destroy() and a subsequent disk_create().
|
119593 |
30-Aug-2003 |
phk |
Add the new g_dev_getprovider() function, the swap_pager needs it now.
Spotted by: mr
|
119300 |
22-Aug-2003 |
ps |
Change the size fields to daddr_t to support ccd volumes larger than 2TB.
Reviewed by: phk
|
119299 |
22-Aug-2003 |
phk |
Make CCD unloadable.
|
119298 |
22-Aug-2003 |
phk |
Don't panic over the fact that unloading failed if we already knew that.
|
119296 |
22-Aug-2003 |
phk |
Block all GETATTR calls hitting the CCD, we wouldn't know which child device should handle them.
This prevents for instance GEOM::ioctl requests from reaching a lower BSDlabel node, which ps@ found would confuse newfs(8).
|
119295 |
22-Aug-2003 |
phk |
Check for null softc pointers; this happens when a ccd is withering.
Found by: David Schultz <dschultz@OCF.Berkeley.EDU>
|
118869 |
13-Aug-2003 |
phk |
Replace a panic with a .1Hz retry loop. Not a perfect solution, but far cheaper than one.
|
118855 |
13-Aug-2003 |
phk |
In case we encounter a zero sectorsize provider in g_io_check(), fail the request with a printf rather than a divide by zero error.
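The idea of rejecting a request before any arithmetic touches a zero sector size can be sketched as follows. This is a userland illustration under assumptions: `check_request()` is a hypothetical helper, not g_io_check() itself, and the particular errno values chosen here are assumed rather than taken from the commit.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical request check: returns 0 if the request may proceed,
 * or an errno value describing why it was refused.
 */
static int
check_request(int64_t offset, int64_t length, uint32_t sectorsize)
{
	/* A zero sector size suggests missing media: refuse before dividing. */
	if (sectorsize == 0)
		return (ENXIO);
	/* Only now is it safe to use sectorsize as a divisor. */
	if (offset % sectorsize != 0 || length % sectorsize != 0)
		return (EINVAL);
	return (0);
}
```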
|
118355 |
02-Aug-2003 |
phk |
Kick Giant compatibility one layer up.
|
118182 |
29-Jul-2003 |
phk |
Fix a memory leak in CCD's mirror code.
|
118150 |
29-Jul-2003 |
phk |
Implement DOSPTYP_EXTLBA more completely: loop until we find no more partitions.
Submitted by: Rudolf Cejka <cejkar@fit.vutbr.cz> PR: 53719
|
117342 |
08-Jul-2003 |
phk |
Handle geoms which are withering away specially in the dump functions.
|
117150 |
02-Jul-2003 |
phk |
Only dump 512 bytes of debugging.
Always wait for things to settle before returning.
|
116522 |
18-Jun-2003 |
phk |
Sleep on "-" in our normal state to simplify debugging.
|
116518 |
18-Jun-2003 |
phk |
Add "GEOM_FOX", a class which detects and selects between multiple redundant paths to the same device.
This class reacts to a label in the first sector of the device, which is created the following way:
# "0123456789abcdef012345..."
# "<----magic-----><-id-...>
echo "GEOM::FOX       someid" | dd of=/dev/da0 conv=sync
NB: Since the fact that multiple disk devices are in fact the same device is not known to GEOM, the geom taste/spoil process cannot fully catch all corner cases and this module can therefore be confused if you do the right wrong things.
NB: The disk level drivers need to do the right thing for this to be useful, and that is not by definition currently the case.
|
116196 |
11-Jun-2003 |
obrien |
Use __FBSDID().
Approved by: phk
|
116107 |
09-Jun-2003 |
phk |
Fix error handling for ENOMEM style issues.
|
115960 |
07-Jun-2003 |
phk |
Improve the root-dev prompt facility for printing devices which could possibly be a root filesystem.
|
115959 |
07-Jun-2003 |
phk |
Wait for everything to settle before we try to print the list of geom devices.
|
115958 |
07-Jun-2003 |
phk |
Make sure we return an error message if the geom parameter is not located.
|
115953 |
07-Jun-2003 |
phk |
Polishing and nitpicking.
|
115951 |
07-Jun-2003 |
phk |
Drop a memory-corruption debugging test-tool.
|
115949 |
07-Jun-2003 |
phk |
Add missing va_end() calls.
Noticed by: tmm
|
115850 |
04-Jun-2003 |
phk |
Introduce g_provider_by_name() function, and use it.
|
115849 |
04-Jun-2003 |
phk |
Make this a true GEOM class: Attach to the component devices using GEOM semantics. Create a GEOM provider instead of using disk_create(). Use the GEOM OAM api for configuration.
I saw approximately 1% speedup in throughput and 7% in latency in a simple-minded test of a two-disk striped device.
This file was repo-copied from src/sys/dev/ccd/ccd.c.
This is not yet linked into the build.
|
115845 |
04-Jun-2003 |
phk |
Add a KASSERT to prevent the same GEOM class from being loaded twice.
Enforce that classes should have different names while we are here.
|
115731 |
02-Jun-2003 |
phk |
Further devilification of CCD:
Change the list interface to simplify things. Remove old list ioctls which bogusly exported the softc to userland. Move the softc and associated structures from the public header to the source file.
|
115729 |
02-Jun-2003 |
phk |
Begin deevilification of CCD:
Make CCD a GEOM class.
For now only use this for implementing an OAM config method which can return a list of configured CCD devices in the format which "ccdconfig -g[v]" would normally output.
|
115726 |
02-Jun-2003 |
phk |
Return an indicative error message.
|
115624 |
01-Jun-2003 |
phk |
Simplify the GEOM OAM api: Drop the request type, and let everything hinge on the "verb" parameter which the class gets to interpret as it sees fit.
Move the entire request into the kernel and move changed parameters back when done.
|
115623 |
01-Jun-2003 |
phk |
constify g_sanity()
|
115611 |
01-Jun-2003 |
phk |
Use bcmp() to compare hash strings.
|
115517 |
31-May-2003 |
phk |
Remove unused variable. Remove unneeded return;
Found by: FlexeLint
|
115515 |
31-May-2003 |
phk |
Remove unused variables.
Found by: FlexeLint
|
115512 |
31-May-2003 |
phk |
Remove unused variables. Rename struct h0h0 to g_hh01 in order to make it unique over files.
Found by: FlexeLint
|
115509 |
31-May-2003 |
phk |
Remove unused variables. Remove #ifdef notyet which will never become.
Found by: FlexeLint
|
115508 |
31-May-2003 |
phk |
Remove unused variable. Remove unneeded return.
Found by: FlexeLint
|
115507 |
31-May-2003 |
phk |
Remove unused variable.
Found by: FlexeLint
|
115506 |
31-May-2003 |
phk |
Add a destroy_geom method to the slice "library". If a slice class has no destroy_geom method, use this one.
This should allow all slicers to kldload.
|
115505 |
31-May-2003 |
phk |
Don't use & in front of arrays.
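For context, an array name already decays to a pointer to its first element, so the extra `&` yields the same address with a different type (pointer to the whole array), which is what lint tools complain about. A minimal sketch, using a hypothetical `struct label` and `set_magic()` helper that are not part of the tree:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical on-disk label with a fixed-size magic field. */
struct label {
	char magic[16];
};

static void
set_magic(struct label *l, const char *src)
{
	size_t n = strlen(src);

	if (n > sizeof(l->magic) - 1)
		n = sizeof(l->magic) - 1;
	/*
	 * l->magic decays to char *.  Writing &l->magic would give a
	 * char (*)[16]: same address, different type.
	 */
	memset(l->magic, 0, sizeof(l->magic));
	memcpy(l->magic, src, n);
}
```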
Found by: FlexeLint
|
115504 |
31-May-2003 |
phk |
Remove unused variable.
Found by: FlexeLint
|
115492 |
31-May-2003 |
phk |
Remove unused variable.
Found by: FlexeLint
|
115473 |
31-May-2003 |
phk |
Introduce init and fini member functions on a class.
Use ->init() and ->fini() to handle the mutex in geom_disk.c
Remove the g_add_class() function and replace it with a standardized g_modevent() function.
This adds the basic infrastructure for loading/unloading GEOM classes
|
115468 |
31-May-2003 |
phk |
Remove the G_CLASS_INITIALIZER, we do not need it anymore.
|
115460 |
31-May-2003 |
phk |
Use le_uuid_dec() since GPT UUID's are always in LE format.
Tested by: Marcel
|
115309 |
25-May-2003 |
phk |
Don't do silly things if the disk_create() event gets canceled.
Approved by: re/scottl
|
115214 |
21-May-2003 |
phk |
Return ENXIO if the softc pointer is NULL; in all likelihood the disk is in the process of disappearing.
Approved by: re/rwats*
|
114958 |
12-May-2003 |
phk |
When a disk disappears, destroy the class from the event thread to avoid a race condition.
Approved by: re/rwatson
|
114864 |
09-May-2003 |
phk |
When a GEOM (/dev-)device is closed and we find that I/O requests are still outstanding, give them a chance to complete.
If after 10 seconds we still find outstanding I/O requests, complete the close with a console warning that the system is likely to panic later on.
This is a workaround for umount -f not quite doing the right thing.
Approved by: re/scottl
|
114795 |
07-May-2003 |
phk |
Hide the "ENOMEM" notice messages behind bootverbose. They are still a valuable debugging tool for certain kinds of problems.
Approved by: re/scottl
|
114785 |
06-May-2003 |
phk |
Fix the WARNING for wrong rawoffset, I tested incompatible units.
Approved by: re/jhb
|
114736 |
05-May-2003 |
phk |
Avoid double-free panic.
Tripped up: DougB
|
114720 |
05-May-2003 |
phk |
Re-order the initialization slightly to improve structure.
|
114715 |
05-May-2003 |
phk |
Use a dedicated malloc(9) bucket for sector storage.
|
114712 |
05-May-2003 |
phk |
Don't warn if the rawoffset is zero, that is actually the best value it could have.
|
114705 |
05-May-2003 |
phk |
Turn the check that rawoffset == mbroffset into a warning instead.
|
114672 |
04-May-2003 |
phk |
Only accept a rawoffset if it is identical to the mbroffset.
|
114671 |
04-May-2003 |
phk |
Add a way to read the current mbroffset from a BSD label class.
|
114670 |
04-May-2003 |
phk |
Add gctl_set_param() function.
|
114668 |
04-May-2003 |
phk |
Remove debugging printfs which should not have been committed.
|
114568 |
03-May-2003 |
phk |
Add an OAM interface for changing the label and writing the boot code.
|
114566 |
03-May-2003 |
phk |
remove unused variables.
Spotted by: dougb
|
114556 |
02-May-2003 |
phk |
Make bsd_disklabel_le_enc calculate the checksum and fill it in. (If there is a legitimate need to correctly encode and pack a disklabel with an invalid checksum custom tools can be built for that.)
Make bsd_disklabel_le_dec() validate the magics, number of partitions (against a new parameter) and the checksum.
Vastly simplify the logic of the GEOM::BSD class implementation:
Let g_bsd_modify() always take a byte-stream label.
This simplifies all users, except the ioctl's which now have to convert to a byte-stream first. Their loss.
g_bsd_modify() is called with topology held now, and it returns with it held.
Always update the md5sum in g_bsd_modify(), otherwise the check is no use after the first modification of the label. Make the MD5 over the bytestream version of the label.
Move the rawoffset hack to g_bsd_modify() and remove all the inram/ondisk conversions.
Don't configure hotspots in g_bsd_modify(), do it in taste instead, we do not support moving the label to a different location on the fly anyway.
This passes all current regression tests.
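The encode-fills-checksum / decode-validates-checksum split described above can be sketched with an XOR-of-16-bit-words checksum. This is an assumption-laden illustration: the helper names, the flat byte-buffer layout and the checksum offset are hypothetical, and the real functions operate on struct disklabel.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR together all 16-bit little-endian words in buf (len must be even). */
static uint16_t
xor16_le(const unsigned char *buf, size_t len)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum ^= (uint16_t)(buf[i] | buf[i + 1] << 8);
	return (sum);
}

/*
 * Encoder side: zero the 2-byte checksum field at cksum_off (which must
 * be even), then store the value that makes the XOR over the whole
 * label come out as zero.
 */
static void
label_seal(unsigned char *buf, size_t len, size_t cksum_off)
{
	uint16_t sum;

	buf[cksum_off] = buf[cksum_off + 1] = 0;
	sum = xor16_le(buf, len);
	buf[cksum_off] = sum & 0xff;
	buf[cksum_off + 1] = sum >> 8;
}

/* Decoder side: accept the label only if the XOR over all words is zero. */
static int
label_valid(const unsigned char *buf, size_t len)
{
	return (xor16_le(buf, len) == 0);
}
```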
|
114548 |
02-May-2003 |
phk |
Pull in bcopy() prototype from <string.h> when compiled in userland.
|
114543 |
02-May-2003 |
phk |
Considering that I did cast the arguments to (intmax_t) I must have been sleepy since I used %qd instead of %jd.
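The pairing of an (intmax_t) cast with the %jd conversion looks like this in practice. `format_offset()` is a hypothetical helper written for this note, using userland snprintf() in place of the kernel printf.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit offset portably into a caller-supplied buffer. */
static int
format_offset(char *buf, size_t len, int64_t offset)
{
	/* %jd expects exactly intmax_t, which is what the cast provides. */
	return (snprintf(buf, len, "%jd", (intmax_t)offset));
}
```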
|
114533 |
02-May-2003 |
phk |
Style improvement.
|
114532 |
02-May-2003 |
phk |
Use g_wither_geom() and plug memory leaks.
|
114531 |
02-May-2003 |
phk |
Plug memory leaks.
|
114526 |
02-May-2003 |
phk |
Use a uma-zone for allocating bio requests.
|
114519 |
02-May-2003 |
phk |
Use g_slice_spoiled() instead of g_std_spoiled().
Add XXX comment about minor memory leak until I can fix it.
|
114518 |
02-May-2003 |
phk |
Use g_slice_spoiled() instead of g_std_spoiled().
|
114517 |
02-May-2003 |
phk |
Use g_slice_spoiled(). Free buffer from g_read_data().
|
114511 |
02-May-2003 |
phk |
Back out all the stuff that didn't belong in the last commit.
|
114508 |
02-May-2003 |
phk |
Use g_slice_spoiled() rather than g_std_spoiled().
Remember to free the buffer we got from g_read_data().
|
114507 |
02-May-2003 |
phk |
Use g_slice_spoiled() not g_std_spoiled()
|
114506 |
02-May-2003 |
phk |
Use g_slice_spoiled() rather than g_std_spoiled()
|
114505 |
02-May-2003 |
phk |
Use g_slice_spoiled() rather than g_std_spoiled().
|
114504 |
02-May-2003 |
phk |
Use a more tailored spoil routine for slices, and take advantage of g_wither_geom() to do most of the work for us.
|
114499 |
02-May-2003 |
phk |
Style improvement.
|
114498 |
02-May-2003 |
phk |
Use g_wither_geom() for cleanup.
|
114495 |
02-May-2003 |
phk |
Rework the "withering" mechanism:
Introduce g_wither_geom() to do the work in one single place.
|
114493 |
02-May-2003 |
phk |
Rename g_slice_init() to the more appropriate g_slice_alloc() and give it a g_slice_free() partner function.
|
114491 |
02-May-2003 |
phk |
style improvement.
|
114490 |
02-May-2003 |
phk |
Get rid of trivial function g_destroy_event().
|
114459 |
01-May-2003 |
phk |
Plug some memory-leaks.
|
114455 |
01-May-2003 |
phk |
Remove the now obsolete geomidorname hack.
|
114450 |
01-May-2003 |
phk |
Add a new flag, EV_CANCELED, and use it to make g_waitfor_event() return EAGAIN if an event got canceled.
|
114447 |
01-May-2003 |
phk |
When events on a reference are cancelled, check our doorstep first, it might be an orphan.
|
114440 |
01-May-2003 |
phk |
Remove now unneeded special case for "geom.ctl".
|
114421 |
01-May-2003 |
nyan |
Remove DIOCGPC98 ioctl.
|
114414 |
01-May-2003 |
nyan |
- Move decoding pc98_partition function into geom_pc98_enc.c.
- Add encoding pc98_partition function.
|
114367 |
01-May-2003 |
marcel |
Don't emulate a MBR by handling the MBR::type attribute. It is not needed at all. The BSD class will attach to a GPT class without it.
|
114293 |
30-Apr-2003 |
markm |
Fix some easy, global, lint warnings. In most cases, this means making some local variables static. In a couple of cases, this means removing an unused variable.
|
114251 |
29-Apr-2003 |
phk |
Fix an obscure fencepost error in GBDE's sector mapping code:
For certain combinations of sectorsize, mediasize and random numbers (used to define the mapping), a multisector read or write would ignore some subset of the sectors past the first sector in the request because those sectors would be mapped past the end of the parent device, and normal "end of media" truncation would zap that part of the request.
Rev 1.19+1.20 of g_bde_work.c added the check which should have alerted me to this happening. This commit maps the request correctly and adds KASSERTS to make sure things stay inside the parent device.
This does not change the on-disk layout of GBDE, there is no need to backup/restore.
|
114250 |
29-Apr-2003 |
phk |
Typo in last commit: Do not press xZZ to leave vi(1).
|
114249 |
29-Apr-2003 |
phk |
When a bio comes back from below with a zero error code, check that it wrote the full length. The only case where this should be able to happen is if we try to read/write past the end and the request is truncated. We obviously should never try to do that, so this code should never activate.
|
114216 |
29-Apr-2003 |
kan |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h>
Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
114167 |
28-Apr-2003 |
phk |
I accidentally leaked this debugging tool in with my last commit.
Disable it with a direct warning.
|
114153 |
28-Apr-2003 |
phk |
Rename g_bde_get_sector() to g_bde_get_keysector() and pick up the offset from the work packet.
|
114152 |
28-Apr-2003 |
phk |
Only attempt total cache-purge once in case of failure.
|
114150 |
28-Apr-2003 |
phk |
Better criteria for skipping disk reading BIO_READ work packets.
|
114148 |
28-Apr-2003 |
phk |
Explicitly set the sector state to JUNK if we encounter a read-error.
|
114088 |
26-Apr-2003 |
phk |
Bail as soon as the first write request has failed, there is no point in trying the second write if the first one went nowhere.
|
114087 |
26-Apr-2003 |
phk |
Apparently UFS no longer issues BIO_DELETE requests correctly, and consequently trashes data. Disable BIO_DELETE handling in gbde for now.
|
114041 |
25-Apr-2003 |
phk |
Do an explicit retry after we have dumped the cache, rather than a (potential) tail recursion.
|
114040 |
25-Apr-2003 |
phk |
If on a BIO_READ request, we failed to allocate the bio for reading our key-sector, we would end up returning the read without an error, despite the fact that the data was not correctly decrypted.
This would result in data corruption on read, but intact data still on the media.
|
114038 |
25-Apr-2003 |
phk |
Fix a problem and slightly improve the ENOMEM handling:
Give up the entire bio as soon as we detect a problem.
When we detect a problem, give up the bio by contributing the remainder with ENOMEM, rather than kicking the bio back right away.
If we failed on a non-first iteration we previously could end up modifying fields in the bio after we delivered it. This could account for memory corruption (none directly reported) on machines with GBDE.
|
114035 |
25-Apr-2003 |
phk |
Don't count a sector in the cache unless we manage to create it.
|
114034 |
25-Apr-2003 |
phk |
Rename g_bde_release_sector() to g_bde_release_keysector() and pick up the sector from the work item.
|
114033 |
25-Apr-2003 |
phk |
Rename g_bde_read_sector() to g_bde_read_keysector() and pick up the offset from the work structure.
|
113940 |
23-Apr-2003 |
phk |
Introduce a g_waitfor_event() function which posts an event and waits for it to be run (or cancelled) and use this instead of home-rolled versions.
|
113938 |
23-Apr-2003 |
phk |
More of the event stuff can now be private to geom_event.c
|
113937 |
23-Apr-2003 |
phk |
Rename g_call_me() to g_post_event(), and give it a flag argument to determine if we can M_WAITOK in malloc.
|
113934 |
23-Apr-2003 |
phk |
Remove the now unused hardcoded g_post_event() event support.
|
113930 |
23-Apr-2003 |
phk |
Turn EV_NEW_PROVIDER into a g_call_me() event.
|
113929 |
23-Apr-2003 |
phk |
Convert EV_SPOILED event to use g_call_me().
|
113927 |
23-Apr-2003 |
phk |
Turn the hardwired NEW_CLASS event into a g_call_me() event.
|
113926 |
23-Apr-2003 |
phk |
Move the shutdown eventhandler stuff to a more logical place.
|
113895 |
23-Apr-2003 |
phk |
Implement CONFIG_GEOM verbs "write label" and "write bootcode".
|
113893 |
23-Apr-2003 |
phk |
Introduce gctl_get_paraml() which gets a parameter only if it has the right length.
|
113892 |
23-Apr-2003 |
phk |
Make gctl_error() take printf-like varargs.
|
113889 |
23-Apr-2003 |
phk |
Remove unused event pointers in object structures. Remove KASSERTS which checked that they were unused.
|
113880 |
22-Apr-2003 |
phk |
Change the locking so that the _modify function is called with topology held.
The only place where we want to not hold topology is when we read (or write) the label to disk: in the case of a disk error with a long recovery time, holding topology would prevent open/close of any disk device.
|
113879 |
22-Apr-2003 |
phk |
We don't need to have a slice->start() function.
|
113878 |
22-Apr-2003 |
phk |
Do not mandate that slicers have a private ->start(), they may not need one. KASSERT() that they have one if G_SLICE_HOT_START is used.
|
113876 |
22-Apr-2003 |
phk |
Implement handling of CONFIG_GEOM OAM request.
|
113875 |
22-Apr-2003 |
phk |
Add "CONFIG_GEOM" operation to the OAM API.
|
113862 |
22-Apr-2003 |
phk |
Collapse meta arguments into regular arguments, the distinction is more trouble than it is worth.
|
113821 |
21-Apr-2003 |
phk |
Implement a hotspot for the sunlabel.
This means that you can no longer trash your opened partitions by writing to the sunlabel through another partition. This is similar to the semantics implemented for BSD labels.
|
113819 |
21-Apr-2003 |
phk |
Update GEOM::SUN to use the decoding functions in geom_sunlabel_enc.c and #defines from sys/sun_disklabel.h.
|
113818 |
21-Apr-2003 |
phk |
Use #defines from <sys/sun_disklabel.h> instead of private ones.
|
113813 |
21-Apr-2003 |
phk |
Functions to encode and decode Sun Microsystems disk partitioning data structures.
Mostly by: jake
|
113713 |
19-Apr-2003 |
phk |
Make more of the "hotspot" stuff generic:
Give the class a way to specify the necessary action for read/delete/write: ALLOW, DENY, START or CALL.
Update geom_bsd to use this.
|
113712 |
19-Apr-2003 |
phk |
Create a dedicated structure for holding hotspot information rather than using slice structures for it.
|
113593 |
17-Apr-2003 |
phk |
These two files fell off during my previous commit: put the encoding and decoding functions for struct disklabel in a separate .c file.
|
113464 |
14-Apr-2003 |
phk |
More correct patch: Only call biofinish if we have not already sent any children down the mesh.
|
113462 |
14-Apr-2003 |
phk |
Call biofinish() also when we get a malloc() failure.
|
113432 |
13-Apr-2003 |
phk |
Time has run out for the "run GEOM in userland" harness, and the new regression test is built to test GEOM as running in the kernel.
This commit is basically "unifdef -D_KERNEL" to remove the mainly #include related code to support the userland-harness.
|
113411 |
12-Apr-2003 |
phk |
If we hit access ahead of a spoil event, we should have negative delta access-counts and proceed.
|
113408 |
12-Apr-2003 |
phk |
Fix a bug which resulted in orphanization getting confused every now and then.
|
113392 |
12-Apr-2003 |
phk |
Retire the experimental bio_taskqueue(), it was not quite as usable as hoped. It can be revived from here, should other drivers be able to use it.
|
113390 |
12-Apr-2003 |
phk |
Retire the "frontstuff" record keeping, it was no match for the in-band meta-data of BSD labels and a more complex solution will be needed.
|
113389 |
12-Apr-2003 |
phk |
Move the functions for encoding and decoding struct dos_partition into a separate .c file so they can be used from userland as well.
|
113294 |
09-Apr-2003 |
phk |
Only be verbose if (bootverbose)
|
113292 |
09-Apr-2003 |
phk |
With the magic sequence checks removed this class is downright dangerous to have in your kernel since it indiscriminately attaches to anything it is offered with a range of bogus partitions.
Stop this from happening by rejecting any label with negative numbers in it.
|
113286 |
09-Apr-2003 |
phk |
Correctly split cyl/sects bytes when we print them.
|
113285 |
09-Apr-2003 |
phk |
Style issue: use do {...} while(0); for multi-exit section.
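The do { ... } while (0) idiom gives a block several early exits via `break` while keeping a single return path. A minimal sketch, with a hypothetical `validate()` function and made-up checks that are not the code changed here:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical validator: each failed check breaks out of the block,
 * and all paths share the same exit below it.
 */
static int
validate(const unsigned char *buf, size_t len)
{
	int error = EINVAL;

	do {
		if (buf == NULL)
			break;
		if (len < 2)
			break;
		if (buf[0] != 0x55 || buf[1] != 0xaa)
			break;
		error = 0;
	} while (0);
	/* Common exit: any cleanup would go here. */
	return (error);
}
```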
|
113034 |
03-Apr-2003 |
phk |
Retire the DIOCGMBR ioctl before anybody starts to use it.
|
113032 |
03-Apr-2003 |
phk |
Remove all references to BIO_SETATTR. We will not be using it.
|
113031 |
03-Apr-2003 |
phk |
Update the initializer for GEOM_MBREXT, I overlooked it previously.
|
113030 |
03-Apr-2003 |
phk |
Add #define for DOSPTYP_PMBR, and use it.
|
113013 |
03-Apr-2003 |
phk |
#include <sys/endian.h> as needed.
|
113012 |
03-Apr-2003 |
phk |
Remove geom_enc.c, a superset of these functions is now available in <sys/endian.h>.
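For readers unfamiliar with this style of helper, byte-order-independent decoding and encoding can be sketched as below. The `demo_` functions are hypothetical stand-ins written for this note, modelled on the le32dec()/le32enc() style of interface; they are not copied from <sys/endian.h>.

```c
#include <stdint.h>

/*
 * Decode a 32-bit little-endian value from a byte buffer, independent
 * of host byte order and alignment.
 */
static uint32_t
demo_le32dec(const void *buf)
{
	const unsigned char *p = buf;

	return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Encode a 32-bit value into a byte buffer in little-endian order. */
static void
demo_le32enc(void *buf, uint32_t v)
{
	unsigned char *p = buf;

	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}
```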
|
113011 |
03-Apr-2003 |
phk |
Use <sys/endian.h> instead of geom_enc.c for endianness-agnostification.
|
113010 |
03-Apr-2003 |
phk |
Use sys/endian.h instead of geom_enc.c for endian-agnostification.
|
113008 |
03-Apr-2003 |
phk |
Make sure we don't ignore error codes.
|
112989 |
02-Apr-2003 |
phk |
Add handling for cancelled events in the g_call_me() methods.
|
112988 |
02-Apr-2003 |
phk |
Change events to have an array of "void *" references, and give the event posting functions varargs to fill these.
Attribute g_call_me() to appropriate g_geom's where necessary.
Add a flag argument to g_call_me() methods which will be used to signal cancellation of events in the future.
This commit should be a no-op.
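An event carrying an array of "void *" references that is filled from a variable argument list might look roughly like this. The struct, the EV_NREFS limit and `event_set_refs()` are hypothetical; the NULL-terminated calling convention is an assumption made for the sketch.

```c
#include <stdarg.h>
#include <stddef.h>

#define EV_NREFS 4	/* hypothetical limit on references per event */

struct event {
	void *ref[EV_NREFS];
	int nref;
};

/*
 * Fill ev->ref[] from a NULL-terminated argument list and return the
 * number of references stored.  Callers should terminate the list
 * with (void *)NULL.
 */
static int
event_set_refs(struct event *ev, ...)
{
	va_list ap;
	void *p;

	ev->nref = 0;
	va_start(ap, ev);
	while (ev->nref < EV_NREFS && (p = va_arg(ap, void *)) != NULL)
		ev->ref[ev->nref++] = p;
	va_end(ap);	/* every va_start() needs a matching va_end() */
	return (ev->nref);
}
```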
|
112979 |
02-Apr-2003 |
phk |
Only orphan things if the open/close actually succeeded.
|
112978 |
02-Apr-2003 |
phk |
Properly handle races between open/close and orphan.
KASSERT the race between close and strategy, it is an error in the upper echelons if this happens.
Add XXX: comment explaining why the ioctl/orphan race is not closed.
|
112952 |
01-Apr-2003 |
phk |
Include <geom/geom_disk.h> not <sys/disk.h>
|
112946 |
01-Apr-2003 |
phk |
Use bioq_flush() to drain a bio queue with a specific error code. Retain the mistake of not updating the devstat API for now.
Spell bioq_disksort() consistently with the remaining bioq_*().
#include <geom/geom_disk.h> where this is more appropriate.
|
112943 |
01-Apr-2003 |
phk |
Start to split the GEOM/diskdriver specific bits into geom/geom_disk.h
|
112927 |
01-Apr-2003 |
phk |
Remove the old config interface, the new OAM is sufficiently functional now.
|
112926 |
01-Apr-2003 |
phk |
Remove the old config interface now that the new OAM is functional.
|
112876 |
31-Mar-2003 |
phk |
Remove some debugging in the new OAM[*] and add a debug flag for other parts of it.
[*] I've been asked what "OAM" means: It's an acronym used in the telecom industry, "Operations And Maintenance", and there it covers anything from a single unlabeled LED on the front panel to the full nightmare of CMIP for SS7.
|
112830 |
29-Mar-2003 |
phk |
Fix a bug in the ENOMEM pacing code which probably made it panic systems after a lot of ENOMEM errors.
|
112828 |
29-Mar-2003 |
phk |
Add create_geom and destroy_geom methods.
|
112709 |
27-Mar-2003 |
phk |
Run a revision on the OAM api.
Use prefix gctl_ systematically. Add flag with access perms for each argument. Add ro/rw versions of argument building functions. General cleanup.
|
112708 |
27-Mar-2003 |
phk |
Check return value of g_call_me()
|
112596 |
25-Mar-2003 |
phk |
g_class_by_name() was unused too.
|
112595 |
25-Mar-2003 |
phk |
Remove unused g_insert_geom().
|
112594 |
25-Mar-2003 |
phk |
Forward compatibility: NULL check the passed in meta argument.
|
112552 |
24-Mar-2003 |
phk |
Preemptively change initializations of struct g_class to use C99 sparse struct initializations before we extend the struct with new OAM related member functions.
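The benefit of C99 designated ("sparse") initializers is that existing initializers keep working when members are added to the struct. A minimal sketch, using a hypothetical trimmed-down `struct demo_class` and not the real struct g_class:

```c
#include <stddef.h>

/* Hypothetical class descriptor, for illustration only. */
struct demo_class {
	const char *name;
	int (*taste)(void);
	int (*config)(void);	/* member added at a later date */
};

static int
demo_taste(void)
{
	return (1);
}

/*
 * Members are named, so adding or reordering fields in the struct does
 * not break this initializer.  Members not mentioned are zeroed.
 */
static struct demo_class demo = {
	.name = "DEMO",
	.taste = demo_taste,
};
```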
|
112534 |
24-Mar-2003 |
phk |
Turn /dev/geom.ctl from a GEOM class into a plain character device driver instead, it will never see a disk-I/O transaction, so this is a lot simpler.
|
112533 |
24-Mar-2003 |
phk |
Save a lock: Grab the stall_events SX lock exclusively so it also serializes OAM requests.
|
112518 |
23-Mar-2003 |
phk |
Introduce g_cancel_events() and use it a couple of places where it makes sense.
|
112517 |
23-Mar-2003 |
phk |
Introduce an SX lock which allows us to stall event processing during OAM operations.
|
112512 |
23-Mar-2003 |
phk |
I forgot the evil ioctl census scripts: #include <geom/geom_ctl.h>
|
112511 |
23-Mar-2003 |
phk |
Marshalling stuff for OAM API.
|
112509 |
23-Mar-2003 |
phk |
A note about which #include files may be used where.
|
112508 |
23-Mar-2003 |
phk |
Start leaking the OAM API into the tree.
|
112476 |
21-Mar-2003 |
phk |
Mitigate deadlock situation pending a more complete solution.
|
112370 |
18-Mar-2003 |
phk |
Retire the GEOM private statistics code and use devstat instead.
|
112367 |
18-Mar-2003 |
phk |
Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
|
112322 |
16-Mar-2003 |
phk |
#ifdef notyet a bit of code which needs not yet committed refcounting to work correctly.
|
112259 |
15-Mar-2003 |
phk |
Use devstat_{start,end}_transaction_bio(). Remember to set bio_resid correctly first.
|
112070 |
10-Mar-2003 |
phk |
If we run out of consumers while orphaning them, and the provider's geom is withering, destroy the provider when done.
This was exposed by the recent change to geom_dev's orphaning logic.
|
112069 |
10-Mar-2003 |
phk |
Fix yet another fallout of our M_* song and dance.
|
112030 |
09-Mar-2003 |
phk |
Remove unneeded #include of geom_stats.h
|
112029 |
09-Mar-2003 |
phk |
Stamp out Danglish.
|
112028 |
09-Mar-2003 |
phk |
Don't use statistics counters to detect outstanding I/O.
|
112027 |
09-Mar-2003 |
phk |
Don't abuse the statistics counters for detecting if we have outstanding I/O requests, instead use the new dedicated fields in the consumer and provider to track this.
|
112026 |
09-Mar-2003 |
phk |
Add u_int nstart, nend counters to consumer and providers so we will not have to examine the stats structure to tell if we have outstanding I/O requests.
Making them u_int improves the chance of atomic updates to them, but risks roll-over. Since the only interesting property is if they are equal or not, this is not an issue.
|
112024 |
09-Mar-2003 |
phk |
When a DEV class consumer is orphan'ed we need to wait for all the outstanding requests to return before we unravel the mesh.
It is very important that the stuff below us plays nice and doesn't overlook a couple of outstanding bio's, because until they remember the geom event thread is blocked. At an expense in code here this could be made more robust, but I actually _want_ a robust failure in this case so any offending drivers can be fixed.
|
112002 |
08-Mar-2003 |
phk |
Allocate devstat structure with devstat_new_entry().
|
111979 |
08-Mar-2003 |
phk |
Centralize the devstat handling for all GEOM disk device drivers in geom_disk.c.
As a side effect this makes a lot of #include <sys/devicestat.h> lines not needed and some biofinish() calls can be reduced to biodone() again.
|
111964 |
07-Mar-2003 |
phk |
Limit our requests to DFLTPHYS, this is generally a good idea for memory-allocation purposes. Right now it is also a very good idea because we hit a Giant assertion in the free(9) processing if we free something larger than 64k.
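Capping request size amounts to cutting a large transfer into pieces no bigger than the limit. A userland sketch, assuming a 64k limit as mentioned above; `MAX_CHUNK`, `next_chunk()` and `count_chunks()` are hypothetical names:

```c
#include <stddef.h>

#define MAX_CHUNK (64 * 1024)	/* assumed to match the 64k limit */

/* Number of bytes to issue in the next sub-request. */
static size_t
next_chunk(size_t remaining)
{
	return (remaining > MAX_CHUNK ? MAX_CHUNK : remaining);
}

/* Count how many sub-requests a transfer of `length` bytes needs. */
static unsigned int
count_chunks(size_t length)
{
	unsigned int n = 0;

	while (length > 0) {
		length -= next_chunk(length);
		n++;
	}
	return (n);
}
```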
|
111863 |
04-Mar-2003 |
phk |
Initialize the second buffer for mirroring to point to itself and not its partner.
|
111815 |
03-Mar-2003 |
phk |
Gigacommit to improve device-driver source compatibility between branches:
Initialize struct cdevsw using C99 sparse initialization and remove all initializations to default values.
This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386.
Approved by: re(scottl)
|
111733 |
02-Mar-2003 |
phk |
NO_GEOM cleanup:
Remove cdevsw->d_psize() implementation, we don't need it any more.
|
111668 |
28-Feb-2003 |
phk |
NO_GEOM cleanup:
Retire the "dev_t" centric version of the disk mini-layer. Remove now unneeded linkage field in dev_t and struct disk.
|
111462 |
25-Feb-2003 |
mux |
Cleanup of the d_mmap_t interface.
- Get rid of the useless atop() / pmap_phys_address() detour. The device mmap handlers must now give back the physical address without atop()'ing it. - Don't borrow the physical address of the mapping in the returned int. Now we properly pass a vm_offset_t * and expect it to be filled by the mmap handler when the mapping was successful. The mmap handler must now return 0 when successful, any other value is considered as an error. Previously, returning -1 was the only way to fail. This change thus accidentally fixes some devices which were bogusly returning errno constants which would have been considered as addresses by the device pager. - Garbage collect the poorly named pmap_phys_address() now that it's no longer used. - Convert all the d_mmap_t consumers to the new API.
I'm still not sure whether we need a __FreeBSD_version bump for this, since we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake Reviewed by: peter Compile-tested on: LINT (i386), GENERIC (alpha and sparc64) Runtime-tested on: i386
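The calling convention described in this commit can be sketched roughly as follows. The handler name, its argument types, and the DEMO_* constants are assumptions made for illustration; the real d_mmap_t prototype is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_BASE	0xd0000000u	/* hypothetical physical base address */
#define DEMO_SIZE	0x00010000u	/* hypothetical size of the window */

/*
 * New-style convention: return 0 on success with *paddr filled in,
 * or an errno value on failure. The address is handed back as-is,
 * without converting it to a page number first.
 */
int
demo_mmap(uint64_t offset, uint64_t *paddr)
{
	if (offset >= DEMO_SIZE)
		return (EINVAL);
	*paddr = DEMO_BASE + offset;
	return (0);
}
```

Keeping the status and the address in separate channels means an errno constant can no longer be mistaken for an address by the caller.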
|
111277 |
23-Feb-2003 |
grehan |
Drop down Apple Partition Map code that has been in use by some ppc developers for a while.
OK'd by: phk
|
111232 |
21-Feb-2003 |
phk |
NO_GEOM cleanup: Convert CCD(4) to use "struct disk*" instead of "dev_t" as "this" handle.
|
111220 |
21-Feb-2003 |
phk |
NO_GEOM cleanup:
Retire the "d_dump_t" and use the "dumper_t" type instead.
Dumper_t takes a void * as first arg which is more general than the dev_t taken by d_dump_t. (Remember: we could have net-dumpers if somebody wrote us one!)
Define the convention for GEOM controlled disk devices to be that the first argument to the dumper function is the struct disk pointer.
Change device drivers accordingly.
|
111216 |
21-Feb-2003 |
phk |
NO_GEOM cleanup:
Change the argument to disk_destroy() to be the same struct disk * as disk_create() takes.
This enables drivers to ignore the (now) bogus dev_t which disk_create() returns.
|
111146 |
19-Feb-2003 |
phk |
Add M_WAITOK
|
111119 |
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
110766 |
12-Feb-2003 |
tegge |
Correctly set bio_data in cloned children when cutting up large requests.
|
110759 |
12-Feb-2003 |
phk |
Implement a handle for efficient implementation of perforations in lower extremities.
Setting bit 4 in debugflags (sysctl kern.geom.debugflags=16) will allow any open to succeed on rank#1 providers. This will generally correspond to the physical disk devices: ad0, da0, md0 etc.
This fundamentally violates the mechanics of GEOMs autoconfiguration, and is only provided as a debugging facility, so obviously error reports on GEOM where this bit is or has been set will not be accepted.
|
110736 |
11-Feb-2003 |
phk |
Implement a bio-taskqueue to reduce number of context switches in disk I/O processing.
The intent is that the disk driver in its hardware interrupt routine will simply schedule the bio on the task queue with a routine to finish off whatever needs done.
The g_up thread will then schedule this routine, the likely outcome of which is a biodone() which queues the bio on g_up's regular queue where it will be picked up and processed.
Compared to using the regular taskqueue, this saves one context switch.
Change our scheduling of the g_up and g_down queues to be water-tight, at the cost of breaking the userland regression test-shims.
Input and ideas from: scottl
|
110729 |
11-Feb-2003 |
phk |
Announce our ability to do MAXPHYS transfers.
|
110728 |
11-Feb-2003 |
phk |
Advertise MAXPHYS upwards, we will split as necessary before we get to the bottom of things.
|
110727 |
11-Feb-2003 |
phk |
Check disk->d_maxsize/dev->si_iosize_max at open time rather than in strategy.
Printf a warning and use DFLTPHYS if the drive has not set a size.
|
110720 |
11-Feb-2003 |
phk |
Make a mutex to stop the race coming into geom_disk's done routine.
Cut up requests into smaller bits if they are longer than the driver's disk->d_maxsize or dev->si_iosize_max.
Properly handle the race condition when g_clone_bio() is used without having the single-threadedness of g_down/g_up to secure locking.
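The request cutting can be illustrated with a small sketch. This is not the geom_disk code; the function name and the callback type are hypothetical, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: receives the offset and length of one piece. */
typedef void (*chunk_fn)(size_t offset, size_t length, void *arg);

/*
 * Split a request of 'length' bytes into pieces no longer than
 * 'maxsize' and hand each piece to 'fn'. Returns the piece count.
 */
size_t
split_request(size_t length, size_t maxsize, chunk_fn fn, void *arg)
{
	size_t off = 0, n = 0;

	assert(maxsize > 0);
	while (off < length) {
		size_t len = length - off;

		if (len > maxsize)
			len = maxsize;
		if (fn != NULL)
			fn(off, len, arg);
		off += len;
		n++;
	}
	return (n);
}
```

With a 64k limit, a request of 128k plus one byte yields three pieces: two full ones and a one-byte tail.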
|
110713 |
11-Feb-2003 |
phk |
Don't divide by zero if there is no stripewidth specified.
|
110712 |
11-Feb-2003 |
phk |
Typo in last commit.
|
110710 |
11-Feb-2003 |
phk |
Better names for struct disk elements: d_maxsize, d_stripeoffset and d_stripesize.
Introduce si_stripesize and si_stripeoffset in struct cdev so we can make them visible to clustering code.
Add stripesize and stripeoffset to providers.
DTRT with stripesize and stripeoffset in various places in GEOM.
|
110708 |
11-Feb-2003 |
phk |
Propagate DISKFLAG_CANDELETE from struct disk to G_PF_CANDELETE on the provider.
|
110706 |
11-Feb-2003 |
phk |
Wrap a long line.
|
110703 |
11-Feb-2003 |
phk |
Don't short-circuit zero-length requests if they are BIO_[SG]ETATTR.
|
110700 |
11-Feb-2003 |
phk |
Use the SI_CANDELETE flag on the dev_t rather than the D_CANFREE flag on the cdevsw to determine ability to handle the BIO_DELETE request.
|
110697 |
11-Feb-2003 |
phk |
Unconditionally mark our provider with G_PF_CANDELETE.
|
110696 |
11-Feb-2003 |
phk |
Propagate G_PF_CANDELETE to our own providers from the provider we attach to.
|
110690 |
11-Feb-2003 |
phk |
Introduce flag field and G_PF_CANDELETE field on providers.
|
110686 |
11-Feb-2003 |
phk |
Remove another printf which does not say anything we didn't already know.
|
110685 |
11-Feb-2003 |
phk |
Turn the "updating" flag (back) into two sequence number fields at either ends of the structure so we have a way to determine if a snapshot is consistent.
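The idea of bracketing a structure with two sequence numbers can be sketched as below. The struct and function names are hypothetical, and the sketch ignores memory barriers and compiler reordering, which a real lock-free implementation would have to address.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record with a sequence number at either end. */
struct demo_stat {
	unsigned	seq0;	/* bumped first by the writer */
	unsigned long	nops;
	unsigned long	nbytes;
	unsigned	seq1;	/* made equal to seq0 last */
};

void
demo_stat_update(struct demo_stat *s, unsigned long bytes)
{
	s->seq0++;
	s->nops++;
	s->nbytes += bytes;
	s->seq1 = s->seq0;
}

/* Copy the record; returns 1 if the copy is consistent, else 0. */
int
demo_stat_snapshot(const struct demo_stat *s, struct demo_stat *out)
{
	memcpy(out, s, sizeof(*out));
	return (out->seq0 == out->seq1);
}
```

A reader that sees the two numbers disagree knows the copy was taken in the middle of an update and can simply try again.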
|
110684 |
11-Feb-2003 |
phk |
Remove a debugging printf.
|
110592 |
09-Feb-2003 |
phk |
Update the statistics collection code to track busy time instead of idle time.
Statistics now default to "on" and can be turned off with sysctl kern.geom.collectstats=0
Performance impact of statistics collection is on the order of 800 nsec per consumer/provider set on a 700MHz Athlon.
|
110543 |
08-Feb-2003 |
phk |
Put the name of the /dev entry in the .h file, userland will need it.
|
110541 |
08-Feb-2003 |
phk |
Move the g_stat struct to its own .h file, we will export it to other code.
Instead of embedding a struct g_stat in consumers and providers, merely include a pointer.
Remove a couple of <sys/time.h> includes now unneeded.
Add a special allocator for struct g_stat. This allocator will allocate entire pages and hand out g_stat structures from there. The "id" field indicates free/used status.
Add "/dev/geom.stats" device driver which exports the pages from the allocator to userland with mmap(2) in read-only mode.
This mmap(2) interface should be considered a non-public interface and the functions in libgeom (not yet committed) should be used to access the statistics data.
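A rough userland sketch of a page-at-a-time allocator in which the "id" field doubles as the free/used marker. All names are hypothetical, the page size is assumed to be 4096, and calloc() stands in for whatever page allocation the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE	4096	/* assumed page size */

struct demo_gstat {
	void		*id;	/* NULL means the slot is free */
	unsigned long	nops;
};

#define SLOTS_PER_PAGE	(DEMO_PAGE_SIZE / sizeof(struct demo_gstat))

struct demo_page {
	struct demo_page	*next;
	struct demo_gstat	*slots;	/* one page worth of slots */
};

struct demo_page *demo_pages;

struct demo_gstat *
demo_gstat_alloc(void *id)
{
	struct demo_page *p;
	size_t i;

	assert(id != NULL);
	/* Reuse the first free slot on any existing page. */
	for (p = demo_pages; p != NULL; p = p->next)
		for (i = 0; i < SLOTS_PER_PAGE; i++)
			if (p->slots[i].id == NULL) {
				p->slots[i].id = id;
				return (&p->slots[i]);
			}
	/* No free slot: add a whole zeroed page. */
	p = malloc(sizeof(*p));
	if (p == NULL)
		return (NULL);
	p->slots = calloc(1, DEMO_PAGE_SIZE);
	if (p->slots == NULL) {
		free(p);
		return (NULL);
	}
	p->next = demo_pages;
	demo_pages = p;
	p->slots[0].id = id;
	return (&p->slots[0]);
}

void
demo_gstat_free(struct demo_gstat *gs)
{
	gs->nops = 0;
	gs->id = NULL;	/* marks the slot free again */
}
```

Keeping all records on whole pages is what makes it practical to export them read-only with mmap(2), since a reader can tell used slots from free ones by the "id" field alone.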
|
110540 |
08-Feb-2003 |
phk |
Move #defines of major/minor to internal header file so other bits can share and coordinate with geom_dev.
|
110523 |
07-Feb-2003 |
phk |
Commit the correct copy of the g_stat structure.
Add debug.sizeof.g_stat sysctl.
Set the id field of the g_stat when we create consumers and providers.
Remove biocount from consumer, we will use the counters in the g_stat structure instead. Replace one field which will need to be atomically manipulated with two fields which will not (stat.nop and stat.nend).
Add companion field to bio_children: bio_inbed for the exact same reason.
Don't output the biocount in the confdot output.
Fix KASSERT in g_io_request().
Add sysctl kern.geom.collectstats defaulting to off.
Collect the following raw statistics conditioned on this sysctl:
for each consumer and provider {
    total number of operations started.
    total number of operations completed.
    time last operation completed.
    sum of idle-time.
    for each of BIO_READ, BIO_WRITE and BIO_DELETE {
        number of operations completed.
        number of bytes completed.
        number of ENOMEM errors.
        number of other errors.
        sum of transaction time.
    }
}
API for getting hold of these statistics data not included yet.
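The replacement of one atomically updated count with two plain counters might look like the sketch below. The names are hypothetical, and the assumption that each counter has a single writer (the start path and the completion path respectively) is an inference, not something the commit states.

```c
#include <assert.h>

/* Hypothetical pair of counters replacing a single in-flight count. */
struct demo_counters {
	unsigned	nop;	/* operations started */
	unsigned	nend;	/* operations completed */
};

/* Called only from the path that starts operations. */
void
demo_start(struct demo_counters *c)
{
	c->nop++;
}

/* Called only from the path that completes operations. */
void
demo_end(struct demo_counters *c)
{
	c->nend++;
}

/* Unsigned subtraction stays correct even after the counters wrap. */
unsigned
demo_inflight(const struct demo_counters *c)
{
	return (c->nop - c->nend);
}
```

Since neither counter is ever decremented, no read-modify-write has to be shared between the two paths.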
|
110520 |
07-Feb-2003 |
phk |
Fix some sleep strings to make more sense.
|
110518 |
07-Feb-2003 |
phk |
Add the new statistics structure, put one in consumers and providers. include <sys/time.h> as necessary.
|
110517 |
07-Feb-2003 |
phk |
Rename bio_linkage to the more obvious bio_parent. Add bio_t0 timestamp, and include <sys/time.h> where needed
|
110513 |
07-Feb-2003 |
gordon |
Add some comments about the deficiencies of this module. I had hoped to get around to addressing them some more, but Real Life (tm) has gotten in the way.
|
110477 |
06-Feb-2003 |
phk |
Check return value of g_clone_bio().
|
110475 |
06-Feb-2003 |
phk |
Experimentally don't let go of Giant in geom_disk's done. We may actually be increasing Giant contention doing so because the actual stuff we do is very cheap.
Also I am not convinced there is not a tiny window for a race here.
|
110471 |
06-Feb-2003 |
phk |
Put the checks we perform on a bio before calling ::start in their own function, handle all validation and truncation at the time we process the bio instead of when it gets scheduled.
|
110419 |
05-Feb-2003 |
phk |
Implement the new "struct disk" centered API for device drivers.
This commit should not change anything as no device drivers use the new API yet.
|
110317 |
04-Feb-2003 |
phk |
Pave the road to removing the fixed size limit on device nodes:
Change the si_name of dev_t's to be a char * and put a private buffer for holding the name at the end of the struct.
Initialize si_name to point to the private buffer.
Put a KASSERT in geom_disk to prevent overrun on the fake dev_t we still have to generate for the disk_drivers.
|
110291 |
03-Feb-2003 |
gordon |
Correct a comment. GEOM modules do not create /dev entries. They create providers.
Pointed out by: phk
|
110290 |
03-Feb-2003 |
gordon |
Add the GEOM module that makes volume labels useful. A kernel compiled with this will cause volume labels to be exposed in /dev/vol/<volname>. Currently, there is no conflict resolution if more than one FS has the same volume name.
Reviewed by: phk
|
110230 |
02-Feb-2003 |
phk |
Add a bio_disk pointer for use between geom_disk and the device drivers.
|
110188 |
01-Feb-2003 |
phk |
Eliminate the sc_openmask, ccdopen() and ccdclose() functions, we can use the flag maintained by geom_disk.c
Having only a strategy method to initialize, don't waste space using a cdevsw structure to do so.
|
110183 |
01-Feb-2003 |
phk |
Move configuration of geom/providers into its own function in preparation for adding on-the-fly config interface.
|
110157 |
31-Jan-2003 |
phk |
Remove commented out g_enc_dos_partition(). We won't be needing it.
|
110150 |
31-Jan-2003 |
phk |
Add a rudimentary class for slicing Apple partitioned disks.
More work is needed on this, stakeholders please contact me.
Not quite asked for by: rwatson
|
110119 |
30-Jan-2003 |
phk |
Add some agility to the disk_create() API:
Make passing the methods in a cdevsw structure optional.
Move "CANFREE" and "NOGIANT" flags into struct disk instead of the cdevsw which may or may not be there.
Rename CANFREE to CANDELETE to match BIO_DELETE operation.
Add "OPEN" flag so drivers don't have to provide open/close methods just to maintain such a flag.
Add temporary stopgap include of <sys/conf.h> to <sys/disk.h> until the files which have them in the other order are fixed.
Add KASSERTS to make sure we don't get fed too many NULL pointers.
Clear our geom's softc pointer before we wither.
|
110118 |
30-Jan-2003 |
phk |
NO_GEOM cleanup: Remove sys/disklabel.h include.
|
110116 |
30-Jan-2003 |
phk |
NO_GEOM cleanup: retire disk_invalidate()
|
110081 |
30-Jan-2003 |
phk |
NO_GEOM cleanup: Mark the last arg to disk_create() as unused.
|
110052 |
29-Jan-2003 |
phk |
Add code to respect the D_NOGIANT flag, should the disk device driver set it. NO_GEOM cleanup: remove ifdefs.
Still untested.
|
110050 |
29-Jan-2003 |
phk |
Sort these functions as the author instructed.
|
109973 |
28-Jan-2003 |
phk |
Mark some args unused so this compiles in userland.
|
109972 |
28-Jan-2003 |
phk |
Use a void * to carry the private data for return-call'ed ioctl requests. Amongst other things this avoids a complex workaround in the userland regression bits.
|
109900 |
26-Jan-2003 |
phk |
Implement DIOCBSDBB ioctl which overwrites first BBSIZE bytes of BSD labeled disk.
This is complicated by the fact that BBSIZE is greater than the PAGE_SIZE limit ioctl inflicts on arguments which are automatically copied in.
As long as we don't need access to userland memory (copyin/out) we can deal with the ioctl using g_callme() which executes it from the GEOM event thread.
Once we need copyin/out, we need to return the bio with EDIRIOCTL in order to make geom_dev call us back in the original process context where copyin will work.
Unfortunately, that results in us getting called with Giant, so we have to DROP_GIANT/PICKUP_GIANT around the code where we diddle GEOMs internals.
Sometimes you just can't win...
... But it does make geom_bsd.c an almost complete example of the GEOM beastiarium.
|
109623 |
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
109563 |
20-Jan-2003 |
phk |
disk_dev_synth() is a NO_GEOM hack.
|
109560 |
20-Jan-2003 |
phk |
Remove need for <sys/diskslice.h> but retain numerical compatibility just in case.
|
109535 |
19-Jan-2003 |
phk |
Finally give CCD the disk mini-layer treatment:
CAUTION:
Previously CCD would be different from all other disks in the system in that there was no "ccd0" device, only a "ccd0c" device.
This is no longer so after this commit. If you access a ccd device through the "/dev/ccd0c" device _and_ have not actually put a BSD disklabel on the device, you will have to use the name "/dev/ccd0". If your CCD device contains a BSD disklabel there should be no difference.
You need to recompile ccdconfig(8) using the changed src/sys/sys/ccdvar.h for the -g "show me" option to work.
I have run the regression test I created before I started overhauling CCD and it flags no problems, but this code is mildly evil, so take care. If you would cry if you lost what's on CCD, make a backup before you upgrade.
Create separate cdevsw for the /dev/ccd.ctl device.
Remove the cloning function, the disk-minilayer will do all naming for us.
Remove the ccdunit and ccdpart functions and carry the softc pointer in the relevant dev_t's and structures.
Release all memory when a CCD device is unconfigured, previously the softc would linger behind.
Remove all traces of BSD disklabel fiddling code.
Remove ccdpsize, the disk mini-layer does this for us.
Don't allocate memory with M_WAITOK in ccdstrategy().
Remove boundary checks which the disk mini-layer does for us.
Don't allocate space for more than 2 ccdbuf, RAID was never implemented.
NB: I have not tried to address any of the preexisting ailments of CCD.
|
109534 |
19-Jan-2003 |
phk |
Unifdef -UDEBUG on the CCD driver. The debugging is mostly useless and can be added back selectively, should anybody start to interest themselves for the internal workings of ccd.
This commit will make the diffs for the following commits much more readable.
|
109486 |
18-Jan-2003 |
phk |
Inline now trivial functions getccdbuf() and putccdbuf(). Fix another trivial memory-leak.
|
109482 |
18-Jan-2003 |
phk |
Fix minor memory-leak.
|
109474 |
18-Jan-2003 |
phk |
Use the M_CCD malloc bucket instead of M_DEVBUF. Don't keep a private freelist of a low number of trivially small structures.
|
109473 |
18-Jan-2003 |
phk |
Inline trivial function ccdintr() into its one caller ccdiodone(). Only call ccdfind() once in ccdiodone() and cache the result.
|
109471 |
18-Jan-2003 |
phk |
Sanitize the copyright section a bit: We do not need two copies of the four-clause BSD license in the file, one will do.
|
109421 |
17-Jan-2003 |
phk |
Find places to store the previously implicitly passed unit number in the three configuration ioctls which need a unit number.
Add a "ccd.ctl" device for config operations.
Implement ioctls on ccd.ctl which rely on the explicitly passed unit numbers.
Update ccdconfig to use the new ccd.ctl interface.
Add code to the kernel to detect old ccdconfig binaries, and whine about it.
Add code to ccdconfig to detect old kernels, and whine about it.
These two compatibility measures will be retained only for a limited period since they are in the way of GEOM'ification of ccd.
|
109256 |
14-Jan-2003 |
phk |
Add a very simple but functional GEOM mirror class.
This is committed more as an instructive tool than as a production facility, but this will change over time.
|
109253 |
14-Jan-2003 |
phk |
Now that we have non-geom_disk based drivers, we need to cover for those, in case they return EOPNOTSUPP on an ioctl.
Found by: jhb
|
109176 |
13-Jan-2003 |
phk |
Always issue ioctls as BIO_GETATTR requests. The direction of data copies on ioctls is no reliable indication of the ioctl's "set" or "get" nature or if such simplistic categories can even be applied.
MFC candidate: boot0cfg issue.
|
109170 |
13-Jan-2003 |
phk |
Remove g_silence(). It does not do anything anymore.
|
109169 |
13-Jan-2003 |
phk |
Fix typo.
|
109101 |
11-Jan-2003 |
phk |
Don't restrict MBR sectorsize to 512 bytes.
Test data provided by: Andrey Koklin <aka@veco.ru>
|
109081 |
10-Jan-2003 |
jhb |
Output the fstype of each partition in a disklabel in the configuration text similar to the way that the MBR module dumps its slice types.
|
108819 |
06-Jan-2003 |
phk |
BSD disklabels expose the controlling label through the 'c' partition, and some trick is necessary to prevent further BSD geoms from attaching to that. Our old trick was to make sure we don't attach to a geom from the "BSD" class, but this doesn't work if an intermediary geom obscures this fact. Instead, calculate the MD5 checksum of the label we target and ask if anybody below us loves that label. If they do we don't.
Coded by: gordon.
|
108817 |
06-Jan-2003 |
phk |
In userland case include <errno.h>, not <err.h>. This is needed to make the src/tools/regression/geom stuff compile.
|
108650 |
04-Jan-2003 |
nyan |
Rename the dos_partition structure for pc98 to pc98_partition.
|
108593 |
03-Jan-2003 |
phk |
Remove CCDF_SWAP and CCDF_PARITY, they have never been implemented.
|
108591 |
03-Jan-2003 |
nyan |
MFMBR: Add ioctls for writing an IPL and a boot menu.
|
108584 |
03-Jan-2003 |
phk |
Remove unused second argument from BIO_STRATEGY()
|
108558 |
02-Jan-2003 |
phk |
Optimize the size of the work-items by letting the mapping function decide the largest size which stays inside the zone and does not collide with a lock sector.
|
108552 |
02-Jan-2003 |
phk |
Update si_bsize_phys on open.
MFC candidate.
|
108470 |
30-Dec-2002 |
schweikh |
Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/ Add FreeBSD Id tag where missing.
|
108393 |
29-Dec-2002 |
phk |
Implement ioctls for tampering with sector0.
|
108308 |
27-Dec-2002 |
phk |
Remove the "ascii" attribute from the sysctls so that "sysctl -a" will skip them.
|
108297 |
26-Dec-2002 |
phk |
white-space changes
|
108296 |
26-Dec-2002 |
phk |
Use a mutex assert to document our locking circumstances.
|
108295 |
26-Dec-2002 |
phk |
We should not need to hold Giant for sbuf operations any more.
|
108294 |
26-Dec-2002 |
phk |
Add an XXX comment to explain the predicament.
|
108093 |
19-Dec-2002 |
phk |
Don't forget our topology lock in the MBREXT case.
|
108060 |
18-Dec-2002 |
phk |
Solve another bug in the mapping code: correctly skip lock sectors. Make sure sector zero is protected if it contains metadata.
Lower WARNS for gbde to 3 on non-i386 archs. rijndael-fst is evil but apparently does the right thing and passes the test-vectors.
MFC Candidate.
|
108052 |
18-Dec-2002 |
phk |
Fix two blunders in the mapping functions which can lead to corrupt data, for request sizes larger than the sectorsize or for multi-key setups.
See warning mailed to current@ for details of recovery.
Found by: Marcus Reid <marcus@blazingdot.com>
|
108051 |
18-Dec-2002 |
phk |
Balk at unaligned requests.
MFC candidate.
|
108003 |
17-Dec-2002 |
phk |
Add a check for negative offset locations and return EINVAL for them.
|
107970 |
17-Dec-2002 |
phk |
Don't mangle geometry for pc98, this will happen in the ata driver.
|
107968 |
17-Dec-2002 |
phk |
Remember to hold topology lock when we change things.
Spotted by: kuriyama
|
107967 |
17-Dec-2002 |
phk |
Constify the dumpconf() function.
|
107956 |
16-Dec-2002 |
phk |
Get rid of g_slice_addslice() and use g_slice_config() instead.
Tested with: i386 + src/tools/regression/geom
|
107953 |
16-Dec-2002 |
phk |
Constification and some s/int/u_int/ changes.
|
107834 |
13-Dec-2002 |
phk |
Add a couple of KASSERTS, just in case.
|
107832 |
13-Dec-2002 |
phk |
Don't interpret the hotspots relative to all slices on a slicer, but relative to the parent device.
|
107831 |
13-Dec-2002 |
phk |
Fix spelling in comment.
|
107562 |
03-Dec-2002 |
sos |
Add support for the PC98 platform to the ATA driver. This mostly consists of functionality to serialize accesses to the two ATA channels (which can also be used to "fix" certain PCI based controllers). Add support for Acard controllers. Enable the ATA driver in PC98 GENERIC, and add device hints. Update man page with latest support.
The PC98 core team has kindly provided me with a PC98 machine that made this all possible, thanks to all that contributed to that effort, without that this would probably never have been possible.
Approved by: re@
|
107526 |
02-Dec-2002 |
phk |
Use the hotspot code to prevent people from overwriting their disklabel with stuff which would ruin the day for any open partitions.
Approved by: re
|
107522 |
02-Dec-2002 |
phk |
Add a simplified version of the hot-spot code to enable us to protect in-band disklabels from in-band vandalism.
Approved by: re
|
107453 |
01-Dec-2002 |
phk |
Use more mnemonic argument names in the access functions.
Sponsored by: DARPA & NAI Labs Approved by: re (blanket)
|
107452 |
01-Dec-2002 |
phk |
Fix a cut&past-o.
Spotted by: yar Approved by: re (blanket)
|
107451 |
01-Dec-2002 |
phk |
Conceivably, there may exist an algorithm which can tell if a sequence of bytes are the output of AES/128/CBC or ARC4RANDOM. Encrypt the random data with which we wipe when we get a BIO_DELETE to make such an algorithm useful.
Sponsored by: DARPA & NAI Labs Approved by: re (blanket)
|
107450 |
01-Dec-2002 |
phk |
Use unsigned for an index.
Sponsored by: DARPA & NAI Labs. Approved by: re (blanket).
|
107116 |
20-Nov-2002 |
phk |
Remember to update the providers idea of its size when we reconfigure a slice child.
Approved by: re
|
107111 |
20-Nov-2002 |
phk |
Do not call the dumpconf method unless there is one. Compare pointers with NULL.
Partially submitted by: Christian Carstensen <cc@gate5.de> Approved by: re
|
107012 |
17-Nov-2002 |
nyan |
Save a slice name on the disk and print it at g_pc98_dumpconf().
|
106635 |
08-Nov-2002 |
phk |
Remove harmless but irritating printf.
|
106634 |
08-Nov-2002 |
phk |
Always recalculate the SRM checksum if the label is at 64 bytes offset.
Tested by: jhb
|
106559 |
07-Nov-2002 |
nyan |
Fix to support pc98. It is mostly merged from MBR specific part.
Reviewed by: phk
|
106518 |
06-Nov-2002 |
phk |
Straighten up the geom.ctl config interface definitions.
Sponsored by: DARPA & NAI Labs
|
106408 |
04-Nov-2002 |
phk |
Polish a bit here and there. Reenable the geom.ctl device so people can play with gbde.
Sponsored by: DARPA & NAI Labs
|
106407 |
04-Nov-2002 |
phk |
Run a revision on the GBDE encryption facility.
Replace ARC4 with SHA2-512. Change lock-structure encoding to use random ordering rather for obscurity. Encrypt lock-structure with AES/256 instead of AES/128. Change kkey derivation to be MD5 hash based. Watch for malloc(M_NOWAIT) failures and ditch our cache when they happen. Remove clause 3 of the license with NAI Labs consent.
Many thanks to "Lucky Green" <shamrock@cypherpunks.to> and "David Wagner" <daw@cs.berkeley.edu>, for code reading, inputs and suggestions.
This code has still not been stared at for 10 years by a gang of hard-core cryptographers. Discretion advised.
NB: These changes result in the on-disk format changing: dump/restore needed.
Sponsored by: DARPA & NAI Labs.
|
106398 |
04-Nov-2002 |
phk |
Reject slices where begin == end. Remove clause 3 from the license with NAI Labs consent.
Sponsored by: DARPA & NAI Labs
|
106397 |
04-Nov-2002 |
phk |
Remove clause 3 in the license with NAI's consent. Reject slices with type==0. Diddle the bootverbose printfs.
Sponsored by: DARPA & NAI Labs
|
106341 |
02-Nov-2002 |
marcel |
Remove the GEOM_GPT hack. We now check for partition type 0xEE and skip those. This handles the Protective MBR (PMBR) which consists of a single partition of type 0xEE that covers the whole disk and as such protects the GPT partitioning. We allow other partitions to be present besides partitions of type 0xEE and as such interpret partition type 0xEE as a "hands-off" partition only.
While here, fix g_mbrext_dumpconf to test if indent is NULL and dump the data in a form that libdisk can grok. Change the logic in g_mbr_dumpconf to match that of g_mbrext_dumpconf. This does not change the output, but prevents a NULL-pointer dereference when indent == NULL && pp == NULL.
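The "hands-off" treatment of type 0xEE can be sketched as a filter over the four MBR slice types. The function name and the bitmask return value are illustrative assumptions; skipping empty (type 0) entries is also an assumption added here, not something this commit states.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSLICES	4
#define DEMO_PTYPE_PMBR	0xEE	/* GPT protective partition type */

/*
 * Given the four partition type bytes of an MBR, return a bitmask
 * of the slice indexes that may be exposed. Type 0xEE entries are
 * skipped, but other entries next to them are still accepted.
 */
unsigned
demo_usable_slices(const uint8_t types[DEMO_NSLICES])
{
	unsigned mask = 0;

	for (int i = 0; i < DEMO_NSLICES; i++) {
		if (types[i] == DEMO_PTYPE_PMBR)
			continue;	/* hands-off partition */
		if (types[i] == 0)
			continue;	/* assumed: empty entry */
		mask |= 1u << i;
	}
	return (mask);
}
```

For a table holding the types 0xEE, 0xA5, 0x00 and 0x0B, only the second and fourth entries are exposed.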
|
106340 |
02-Nov-2002 |
marcel |
Fix dumpconf so libdisk can grok its output. We weren't checking if indent was NULL. Consequently we always emitted the XML format.
|
106338 |
02-Nov-2002 |
phk |
malloc(9) with M_NOWAIT seems to return NULL a lot more than I would have expected under -current. This is a problem for GEOM because the up/down threads cannot sleep waiting for memory to become free. The reason they cannot sleep is that paging things out to disk may be the only way we can clear up some RAM. Nice catch-22 there.
Implement a rudimentary ENOMEM recovery strategy: If an I/O request fails with an error code of ENOMEM, schedule it for a retry, and tell the down-thread to sleep hz/10 to give other parts of the system a chance to free up some memory, in particular the up-path in GEOM.
All caches should probably start to monitor malloc(9) failures using the new malloc_last_fail() function, and release when it indicates congestion.
Sponsored by: DARPA & NAI Labs.
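A simplified sketch of the retry-on-ENOMEM idea. The callback types, the function names, and the max_retries bound are assumptions for illustration; the commit describes rescheduling the request and pacing the down-thread, not a loop like this one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*io_fn)(void *arg);	/* returns 0 or an errno value */
typedef void (*pause_fn)(void);		/* stands in for sleeping hz/10 */

/*
 * Issue the request; on ENOMEM, pause and try again. Any other
 * result, success included, is returned to the caller at once.
 */
int
demo_io_with_retry(io_fn io, void *arg, pause_fn pause, int max_retries)
{
	int error, tries = 0;

	for (;;) {
		error = io(arg);
		if (error != ENOMEM || tries++ >= max_retries)
			return (error);
		if (pause != NULL)
			pause();
	}
}

/* Example backend: fails with ENOMEM a number of times, then works. */
int
demo_flaky_io(void *arg)
{
	int *failures_left = arg;

	if (*failures_left > 0) {
		(*failures_left)--;
		return (ENOMEM);
	}
	return (0);
}
```

The pause matters more than the retry itself: without it the retry would compete for memory with the very code paths that could free some.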
|
106301 |
01-Nov-2002 |
phk |
Make this compile in the userland shims again.
Sponsored by: DARPA & NAI Labs
|
106300 |
01-Nov-2002 |
phk |
Add KASSERT for bio_cmd validity here as well. Various hacks still bypass specfs.
|
106263 |
31-Oct-2002 |
phk |
Spruce up bootverbose output a bit.
Allow extended partitions to have flag=0x80
|
106226 |
30-Oct-2002 |
phk |
Change the kkey generation cherry-picker to use MD5.
Sponsored by: DARPA & NAI Labs
|
106101 |
28-Oct-2002 |
phk |
Add the remaining part of the new libdisk interaction.
WARNING: This is not a published interface, it is a stopgap measure for
WARNING: libdisk so we can get 5.0-R out of the door.
Sponsored by: DARPA & NAI Labs
|
106100 |
28-Oct-2002 |
phk |
Add support for the new libdisk interaction.
Sponsored by: DARPA & NAI Labs.
|
106085 |
28-Oct-2002 |
phk |
Fix a bug in the cherry-picker kkey generator routine.
WARNING: You need to backup and restore the _unencrypted_ contents
WARNING: of your GBDE disks when you take this update!
Sponsored by: DARPA & NAI Labs.
|
106076 |
28-Oct-2002 |
phk |
Add more compatibility junk.
|
106030 |
27-Oct-2002 |
phk |
Don't truncate on large disks.
|
106001 |
26-Oct-2002 |
phk |
Make geom_mbr.c optional on PC98, use GEOM_MBR option to include it.
Disable check for supposedly magic "IPL1" string for PC98 labels, its thaumaturgical power is in doubt.
|
105957 |
25-Oct-2002 |
phk |
Reduce the GEOM verbosity under bootverbose to something more sufferable. This is not quite the set of information I would want, but the tree where I have the "correct" version is messed up with conflicts.
Sponsored by: DARPA & NAI Labs.
|
105947 |
25-Oct-2002 |
phk |
Add a g_dev_print() function which prints all the /dev entries GEOM knows about.
|
105941 |
25-Oct-2002 |
phk |
Lose the g_dev_clone() noise.
|
105897 |
24-Oct-2002 |
phk |
Use a better test to prevent tasting geom.ctl so we don't screw the regression tests.
|
105892 |
24-Oct-2002 |
phk |
Don't taste the first provider, it's /dev/geom.ctl and it's not going to taste like anything we like anyway.
|
105581 |
20-Oct-2002 |
phk |
No need to specify CTLTYPE_INT when we use SYSCTL_INT.
|
105551 |
20-Oct-2002 |
phk |
Now that the sectorsize and mediasize are properties of the provider, don't take the detour over the I/O path to discover them using getattr(), we can just pick them out directly.
Do note though, that for now they are only valid after the first open of the underlying disk device due to compatibility with the old disk_create() API. This will change in the future so they will always be valid.
Sponsored by: DARPA & NAI Labs.
|
105550 |
20-Oct-2002 |
phk |
The g_id*() functions are not needed in the userland test-suite so #ifdef _KERNEL them rather than deal with a copyin simulation.
Sponsored by: DARPA & NAI Labs
|
105542 |
20-Oct-2002 |
phk |
Make the sectorsize a property of providers so we can include it in the XML output.
Sponsored by: DARPA & NAI Labs
|
105540 |
20-Oct-2002 |
phk |
Use %jd instead of %lld now that we have it.
|
105539 |
20-Oct-2002 |
phk |
It makes more sense for the fwheads and fwsectors properties to be in the provider stanza rather than the geom stanza.
|
105537 |
20-Oct-2002 |
phk |
Include fwsectors and fwheads in the XML output for the disks we know.
Sponsored by: DARPA & NAI Labs.
|
105520 |
20-Oct-2002 |
phk |
Be consistent about functions being static.
Spotted by: FlexeLint
|
105512 |
20-Oct-2002 |
phk |
Constify input to the arc4 seed function. Implement the lockfile hunting in sector zero.
Sponsored by: DARPA & NAI Labs.
|
105506 |
20-Oct-2002 |
phk |
Don't track bio allocation in debug output.
Sponsored by: DARPA & NAI Labs.
|
105505 |
20-Oct-2002 |
phk |
Style(9) and english(9) fixes.
Submitted by: schweikh
|
105504 |
20-Oct-2002 |
phk |
Make it possible to specify also via geom_t ID in the geom.ctl config ioctl.
Sponsored by: DARPA & NAI Labs.
|
105465 |
19-Oct-2002 |
phk |
Fix a missing initialization.
|
105464 |
19-Oct-2002 |
phk |
Add Geom Based Disk Encryption to the tree.
This is an encryption module designed to secure denial of access to the contents of "cold disks" with or without destruction activation.
Major features:
* Based on AES, MD5 and ARC4 algorithms.
* Four cryptographic barriers:
  1) Pass-phrase encrypts the master key.
  2) Pass-phrase + Lock data locates master key.
  3) 128 bit key derived from 2048 bit master key protects sector key.
  4) 128 bit random single-use sector keys protect data payload.
* Up to four different changeable pass-phrases.
* Blackening feature for provable destruction of master key material.
* Isotropic disk contents offers no information about sector contents.
* Configurable destination sector range allows steganographic deployment.
This commit adds the kernel part, separate commits will follow for the userland utility and documentation.
This software was developed for the FreeBSD Project by Poul-Henning Kamp and NAI Labs, the Security Research Division of Network Associates, Inc. under DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA CHATS research program.
Many thanks to Robert Watson, CBOSS Principal Investigator for making this possible.
Sponsored by: DARPA & NAI Labs.
|
105452 |
19-Oct-2002 |
tmm |
The argument to the DIOCGMEDIASIZE ioctl() is an off_t, not a u_int.
Reviewed by: phk
|
105358 |
17-Oct-2002 |
phk |
Be consistent and return the NUL at the end of kern.geom.conf{xml,dot}.
Spotted by: sam
|
105350 |
17-Oct-2002 |
phk |
NUL terminate sysctl kern.disks
|
105180 |
15-Oct-2002 |
njl |
Return an error if the drive reports heads/sectors that do not make sense. This fixes a divide by zero in fdisk(8)
Reviewed by: phk
|
105163 |
15-Oct-2002 |
phk |
Constification ? Yes, out that door, row on the left, one patch each.
Sponsored by: DARPA & NAI Labs
|
105133 |
14-Oct-2002 |
phk |
Remove a bogus local variable.
Sponsored by: DARPA & NAI Labs.
|
105124 |
14-Oct-2002 |
jake |
Moved geom class initialization to SI_SUB_DRIVERS from SI_SUB_PSEUDO. This fixes mounting root from md(4) which calls disk_create() early.
|
105092 |
14-Oct-2002 |
phk |
Implement the GEOMCONFIGGEOM ioctl which can be used to manually create and configure an instance of a class on a given provider.
Sponsored by: DARPA & NAI Labs
|
105091 |
14-Oct-2002 |
phk |
Add more KASSERTS.
Sponsored by: DARPA & NAI Labs.
|
105068 |
13-Oct-2002 |
phk |
Add the outline of the "/dev/geom.ctl" handling code.
Sponsored by: DARPA & NAI Labs.
|
105061 |
13-Oct-2002 |
phk |
Give GEOM modules a chance to specify their own init routine, in case they have special requirements.
Sponsored by: DARPA & NAI Labs.
|
104936 |
11-Oct-2002 |
phk |
The CAM system has its own ideas of what locks are to be held by whom. So does GEOM. Not a pretty sight.
Take all the interesting stuff out of GEOM::disk_create(), and leave just the creation of the fake dev_t. Schedule the topology munging to happen in the g_event thread with g_call_me().
This makes disk_create() pretty lock-agnostic, almost lock-atheist.
Tripped over by: peter Sponsored by: DARPA & NAI Labs
|
104701 |
09-Oct-2002 |
phk |
Add support for g_clone_bio() and g_std_done() to spawn multiple children of a bio and correctly gather status when done.
Sponsored by: DARPA & NAI Labs.
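The parent/child bookkeeping described above can be illustrated with a minimal userland sketch. It is not the kernel code: `struct sketch_req`, `sketch_clone()` and `sketch_child_done()` are hypothetical names invented for this example, and the sketch assumes all children are cloned before any of them completes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O request; NOT the kernel's struct bio. */
struct sketch_req {
	struct sketch_req *parent;	/* NULL for a top-level request */
	int children;			/* clones handed out so far */
	int completed;			/* clones that have reported back */
	int error;			/* first non-zero child error, if any */
	int done;			/* set once every child has completed */
};

/* Register 'child' as a new clone of 'parent'; the caller owns the storage. */
void
sketch_clone(struct sketch_req *parent, struct sketch_req *child)
{
	assert(parent != NULL && child != NULL);
	child->parent = parent;
	child->children = 0;
	child->completed = 0;
	child->error = 0;
	child->done = 0;
	parent->children++;
}

/*
 * Child completion: fold the child's status into the parent and mark the
 * parent done when the last outstanding child reports back.
 */
void
sketch_child_done(struct sketch_req *child, int error)
{
	struct sketch_req *parent = child->parent;

	assert(parent != NULL);
	child->error = error;
	child->done = 1;
	if (parent->error == 0)
		parent->error = error;	/* keep the first error seen */
	parent->completed++;
	if (parent->completed == parent->children)
		parent->done = 1;
}
```

Keeping only the first error is a design choice made for this sketch; a real implementation also has to cope with children completing while further clones are still being created.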
|
104665 |
08-Oct-2002 |
phk |
For now, don't wait for drives to stop returning EBUSY. There is too much broken hardware around, it seems.
Sponsored by: DARPA & NAI Labs.
|
104609 |
07-Oct-2002 |
phk |
Correctly deal with non-DEVBSIZE drives. Allow BIO_DELETE through too.
This fixes swap-backed md(4) devices.
Sponsored by: DARPA & NAI Labs.
|
104606 |
07-Oct-2002 |
phk |
Put a printf under #ifdef DIAGNOSTIC.
Sponsored by: DARPA & NAI Labs.
|
104602 |
07-Oct-2002 |
phk |
Copyin and copyout are only possible from a process-native thread, and therefore we need a way for ioctl handlers to run in that thread in GEOM. Rather than invent a complicated registration system to recognize which ioctl handler to use for a given ioctl, we still schedule all ioctls down the tree as bio transactions but add a special return code that means "call me directly" and have the geom_dev layer do that.
Use this for all ioctls that make it as far as a diskdriver to avoid any backwards compatibility problems.
Requested by: scottl Sponsored by: DARPA & NAI Labs
|
104542 |
05-Oct-2002 |
phk |
This patch got lost in my trees: Pass setattr down to device drivers as well.
Detected by: scottl Sponsored by: DARPA & NAI Labs.
|
104534 |
05-Oct-2002 |
phk |
Fix argument order mistake when decoding disklabels from on-disk format.
Detected by: jhay Sponsored by: DARPA & NAI Labs.
|
104519 |
05-Oct-2002 |
phk |
NB: This commit does *NOT* make GEOM the default in FreeBSD
NB: But it will enable it in all kernels not having options "NO_GEOM"
Put the GEOM related options into the intended order.
Add "options NO_GEOM" to all kernel configs apart from NOTES.
In some sort of controlled fashion, the NO_GEOM options will be removed, architecture by architecture, in the coming days.
There are currently three known issues which may force people to need the NO_GEOM option:
boot0cfg/fdisk: Tries to update the MBR while it is being used to control slices. GEOM does not allow this as a direct operation.
SCSI floppy drives: Apparently the scsi-da driver returns "EBUSY" if no media is inserted. This is wrong, it should return ENXIO.
PC98: It is unclear if GEOM correctly recognizes all variants of PC98 disklabels. (Help Wanted! I have neither docs nor HW)
These issues are all being worked.
Sponsored by: DARPA & NAI Labs.
|
104452 |
04-Oct-2002 |
phk |
Properly isolate the locking domains of sysctl from the topology lock for the sysctls which report the configuration.
Sponsored by: DARPA & NAI Labs.
|
104451 |
04-Oct-2002 |
phk |
Implement the "kern.disks" sysctl in GEOM.
This makes "mdconfig -l" work again.
Sponsored by: DARPA & NAI Labs.
|
104450 |
04-Oct-2002 |
phk |
Properly conditionalize a debugging printf.
Sponsored by: DARPA & NAI Labs.
|
104359 |
02-Oct-2002 |
phk |
Move GEOM's sysctls under kern.geom.
Sponsored by: DARPA & NAI Labs.
|
104357 |
02-Oct-2002 |
phk |
Put some failing ioctl related printfs under a suitable debug flag.
Sponsored by: DARPA & NAI Labs.
|
104316 |
01-Oct-2002 |
phk |
Use the canonical root:operator 0640 for GEOM disk devices.
Spotted by: brooks Sponsored by: DARPA & NAI Labs.
|
104312 |
01-Oct-2002 |
phk |
Don't restrict device drivers' ability to sleep in the ioctl method; this is actually entirely legal.
Do bio's with ioctls in them in a g_call_me() function.
Sponsored by: DARPA & NAI Labs
|
104292 |
01-Oct-2002 |
phk |
Include <sys/diskmbr.h> instead of <sys/disklabel.h>
Sponsored by: DARPA & NAI Labs.
|
104197 |
30-Sep-2002 |
phk |
Don the asbestos underwear and add the code which lets DIOCWDINFO write modified disklabels back to disk.
Sponsored by: DARPA & NAI Labs.
|
104195 |
30-Sep-2002 |
phk |
Retire g_io_fail() and let g_io_deliver() take an error argument instead.
Sponsored by: DARPA & NAI Labs.
|
104194 |
30-Sep-2002 |
phk |
Introduce g_write_data() function.
Sponsored by: DARPA & NAI Labs
|
104193 |
30-Sep-2002 |
phk |
Add missing g_enc_le2().
Sponsored by: DARPA & NAI Labs.
|
104191 |
30-Sep-2002 |
phk |
Disable the g_sanity() check unless people ask for it in the debugflags.
Sponsored by: DARPA & NAI Labs.
|
104184 |
30-Sep-2002 |
phk |
Make sure we don't lose our topology lock in a call_me() handler.
Sponsored by: DARPA & NAI Labs.
|
104107 |
28-Sep-2002 |
phk |
Zero the local-variable mutexes before we call mtx_init() on them; failing to do this may lead mtx_init() to believe they have already been initialized.
Detected by: Marc Recht <marc@informatik.uni-bremen.de>
|
104087 |
28-Sep-2002 |
phk |
Style, whitespace and lint fixes.
Sponsored by: DARPA & NAI Labs.
|
104086 |
28-Sep-2002 |
phk |
Void functions cannot use return(foo) even if foo is also returning void.
Sponsored by: DARPA & NAI Labs.
|
104081 |
28-Sep-2002 |
phk |
First confirmed kill from my Flexelint license: Check return value of g_clone_bio().
Detected by: http://www.gimpel.com/html/flex.htm Sponsored by: DARPA & NAI Labs.
|
104065 |
27-Sep-2002 |
phk |
Extensively rework the geom_bsd method, put a lot of comments in, betting that this will make people use this for their future copy&paste operations.
Rework the detection of raw-disk offsets in disklabels. This actually unearthed a number of bugs in the (now) previous version.
Also accept labels which don't have a magic RAW_PART, provided they don't confuse us too much.
Change the order of our sanity-checks on labels found on disks to be more robust.
Check against MAXPARTITIONS in our sanity-check and reject disklabels we cannot cope with.
Create new g_bsd_modify() function to implement disklabel modifying ioctls.
Implement DIOCSDINFO and DIOCWDINFO with the provision that the latter still does not write your change back to disk. I didn't have the nerves for that yet.
In the start routine, use g_call_me() for complex ioctls to prevent sleeping.
Sponsored by: DARPA & NAI Labs.
|
104064 |
27-Sep-2002 |
phk |
Add the new g_slice_config() call, which can add/delete/change a slice, with support for trying, doing and forcing.
This will eventually replace g_slice_addslice() which gets changed from grabbing topology to requiring it in this commit as well.
Sponsored by: DARPA & NAI Labs.
|
104063 |
27-Sep-2002 |
phk |
Make the UP/DOWN threads hold on to their own private mutex while doing work.
This prevents people from sleeping in the UP/DOWN I/O path by mistake or design (doing so almost invariably results in deadlocks since it stalls all I/O processing in the given direction).
Sponsored by: DARPA & NAI Labs.
|
104062 |
27-Sep-2002 |
phk |
Correctly en/decode MAXPARTITIONS partitions.
Sponsored by: DARPA & NAI Labs.
|
104061 |
27-Sep-2002 |
phk |
Setattr should not retry on EBUSY; we could get EBUSY back because a disklabel modification tries to change an open device, and no counter-examples exist.
Be less fascist about when we can do Setattr; the openmodes of devices are so loosely managed that the "exclusive" count is almost useless.
Sponsored by: DARPA & NAI Labs.
|
104060 |
27-Sep-2002 |
phk |
Various no-ops:
Add a __unused.
Make the 2byte decoder functions return 16 bits for the benefits of picky lints.
No need to grab giant around a tsleep() when we have a timeout.
Sponsored by: DARPA & NAI Labs.
|
104059 |
27-Sep-2002 |
phk |
Correctly calculate size of PC98 slices.
Sponsored by: DARPA & NAI Labs.
|
104058 |
27-Sep-2002 |
phk |
Allocate bio's with M_NOWAIT and let the caller deal with the problems.
Sponsored by: DARPA & NAI Labs.
|
104057 |
27-Sep-2002 |
phk |
Add checks for g_clone_bio() returning NULL, it will be possible RSN.
Sponsored by: DARPA & NAI Labs.
|
104056 |
27-Sep-2002 |
phk |
Implement g_call_me() as a way for geom methods to schedule operations to be performed in the event-thread.
To do this, we need to lock the eventlist with g_eventlock (nee g_doorlock), since g_call_me() being called from the UP/DOWN paths will not be able to acquire g_topology_lock.
This also means that for now these events are not referenced on any particular consumer/provider/geom.
For UP/DOWN path use, this will not become a problem since the access() function will make sure we drain any bio's before we dismantle.
Sponsored by: DARPA & NAI Labs.
|
104055 |
27-Sep-2002 |
phk |
Ok, include also the two tests which actually do effect the claims of the last commit message.
Sponsored by: DARPA & NAI Labs.
|
104054 |
27-Sep-2002 |
phk |
Hook into the shutdown EVENTHANDLER and stop tasting things after we get notified to make things settle a bit faster.
Sponsored by: DARPA & NAI Labs.
|
104053 |
27-Sep-2002 |
phk |
Rename the doorlock to eventlock, it gets to protect a bit more in the future.
Sponsored by: DARPA & NAI Labs.
|
103942 |
25-Sep-2002 |
jeff |
- Use vrefcnt() instead of v_usecount.
|
103714 |
20-Sep-2002 |
phk |
(This commit touches about 15 disk device drivers in a very consistent and predictable way, and I apologize if I have gotten it wrong anywhere, getting prior review on a patch like this is not feasible, considering the number of people involved and hardware availability etc.)
If struct disklabel is the messenger: kill the messenger.
Inside struct disk we had a struct disklabel which disk drivers used to communicate certain metrics to the disklayer above (GEOM or the disk mini-layer). This commit changes this communication to use four explicit fields instead.
Amongst the benefits is that the fields do not get overwritten by wrong or bogus on-disk disklabels.
Once that is clear, <sys/disk.h>, which is included in the drivers, no longer needs to pull <sys/disklabel.h> and <sys/diskslice.h> in; the few places that need them have gotten explicit #includes for them.
The disklabel inside struct disk is now only for internal use in the disk mini-layer, so instead of embedding it, we malloc it as we need it.
This concludes (modulo any mistakes) the series of disklabel related commits.
I believe it all amounts to a NOP for all the rest of you :-)
Sponsored by: DARPA & NAI Labs.
|
103695 |
20-Sep-2002 |
phk |
Remove unneeded #include <sys/disklabel.h>
Sponsored by: DARPA & NAI Labs.
|
103670 |
20-Sep-2002 |
phk |
Retire now unused DIOCGDVIRGIN kludge.
Sponsored by: DARPA & NAI Labs.
|
103284 |
13-Sep-2002 |
phk |
"Fix" printf format issues by using %j
Sponsored by: DARPA & NAI Labs.
|
103283 |
13-Sep-2002 |
phk |
Use biowait() rather than DIY.
Sponsored by: DARPA & NAI Labs
|
103279 |
13-Sep-2002 |
phk |
Add a couple more of the big/little-endian conversion routines and make them visible from userland, if need be.
I wish that the C language contained this as part of struct definitions, but failing that, I would settle for an agreed upon set of functions for packing/unpacking integers in various sizes from byte-streams which may have unfriendly alignment.
This really belongs in <sys/endian.h> I guess.
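As an illustration of the kind of packing/unpacking helpers the message asks for, here is a minimal sketch in portable C. The `sketch_*` names are hypothetical and are not the routines added by this commit; the point is only that assembling values byte by byte works regardless of host endianness or buffer alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit little-endian value from a possibly unaligned buffer. */
uint16_t
sketch_dec_le2(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint16_t)(p[0] | (p[1] << 8)));
}

/* Decode a 16-bit big-endian value from a possibly unaligned buffer. */
uint16_t
sketch_dec_be2(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/* Decode a 32-bit little-endian value; casts avoid signed-shift overflow. */
uint32_t
sketch_dec_le4(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Encode a 16-bit value as little-endian bytes. */
void
sketch_enc_le2(unsigned char *p, uint16_t v)
{
	assert(p != NULL);
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
}
```

Because nothing here casts the buffer to a wider pointer type, the same code can read an on-disk structure at any byte offset on alignment-strict architectures.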
|
103278 |
13-Sep-2002 |
mux |
Fix another two printf() format errors which weren't warned about because the bio_blknos were bogusly cast to long long.
|
103276 |
13-Sep-2002 |
mux |
Fix another printf() format error which wasn't warned about because the bio_blkno was bogusly cast to an int.
|
103275 |
13-Sep-2002 |
mux |
Fix a printf() format error on 64-bit architectures. Also fix some style bugs on the same line.
|
103100 |
08-Sep-2002 |
phk |
Deal with a new extended MBR partition type.
Submitted by: Michal Mertl <mime@traveller.cz>
|
103009 |
06-Sep-2002 |
phk |
Remove "magicspace". It looks good on paper, it doesn't work in practice.
Sponsored by: DARPA & NAI Labs.
|
103004 |
06-Sep-2002 |
phk |
Don't respect the O_EXCL flag, we don't get it back on close so we cannot correctly track it.
Spotted by: peter Sponsored by: DARPA & NAI Labs.
|
102380 |
24-Aug-2002 |
marcel |
Use 'p' as the partition specifier instead of 's'. We continue to use 's' for compatibility partitions (i.e. partitions with a BSD disklabel). Partition numbers continue to start with 1.
Example /etc/fstab:
# Device      Mountpoint  FStype  Options ...
/dev/da0p1    /efi        msdos   rw      ...
/dev/da0p2    /           ufs     rw      ...
/dev/da0p3    none        swap    sw      ...
|
99028 |
29-Jun-2002 |
julian |
Don't use the static thread.. it is going away.
|
98987 |
28-Jun-2002 |
phk |
Add two new submodes to the AES encryption method.
This method is now suitable for encrypting swap spaces.
Sponsored by: DARPA & NAI Labs.
|
98099 |
10-Jun-2002 |
phk |
Put geom_gpt.c under the GEOM option instead of having a special GEOM_GPT option for it.
|
98066 |
09-Jun-2002 |
phk |
Improve some on the naming.
Submitted by: iedowse
|
97887 |
05-Jun-2002 |
phk |
Change the registration of magic spaces so it does its own memory management.
Sponsored by: DARPA & NAI Labs.
|
97547 |
30-May-2002 |
marcel |
Add compile time asserts for the size of struct gpt_hdr and struct gpt_ent. Use offsetof() for struct gpt_hdr to exclude padding.
|
97512 |
29-May-2002 |
phk |
Add one copy of crc32() and crc32_tab[] in libkern, and remove it from two other places.
Comment out crc32 related definitions in zlib.h, we don't seem to have the corresponding code in our kernel.
|
97392 |
28-May-2002 |
marcel |
Add support to GEOM for GUID Partition Tables (GPTs). The support is currently conditional on both the GEOM and GEOM_GPT options to avoid getting GPT by default and having the MBR and GPT classes clash. The correct behaviour of the MBR class would be to back off (reject) a MBR if it's a Protective MBR (a MBR with a single partition of type 0xEE that spans the whole disk, as far as the MBR is concerned). The correct behaviour of the GPT class would be to back off (reject) a GPT if there's a MBR that's not a Protective MBR.
At this stage it's so inconvenient to destroy a good MBR when working with GPTs that it's more convenient to have the MBR class back off when it detects the GPT signature on disk and have the GPT class ignore the MBR.
In sys/gpt.h UUIDs (GUIDs) for the following FreeBSD partitions have been defined:
GPT_ENT_TYPE_FREEBSD
  FreeBSD slice with disklabel. This is the equivalent of the well-known FreeBSD MBR partition type.
GPT_ENT_TYPE_FREEBSD_{SWAP|UFS|UFS2|VINUM}
  FreeBSD partitions in the context of disklabel. This is speculating on the idea to use the GPT to hold partitions instead of slices and removing the fixed (and low) limits we have on the number of partitions.
This commit lacks a GPT image for the regression suite.
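The Protective MBR test described in this message can be sketched as follows. This is an illustrative userland check, not the code from this commit: `sketch_is_protective_mbr()` and the `SKETCH_*` constants are hypothetical names, the offsets are the standard PC MBR layout (partition table at byte 446, four 16-byte entries, type byte at offset 4 of each entry, 0x55 0xAA signature at bytes 510 and 511), and the "spans the whole disk" condition is deliberately not checked.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MBR_PT_OFFSET	446	/* start of the partition table */
#define SKETCH_MBR_PT_ENTRIES	4	/* number of primary entries */
#define SKETCH_MBR_PT_ENTSIZE	16	/* bytes per entry */
#define SKETCH_MBR_TYPE_OFFSET	4	/* type byte within an entry */
#define SKETCH_MBR_TYPE_PMBR	0xEE	/* GPT protective partition type */

/*
 * Return 1 if the 512-byte sector 'sec' carries a valid MBR signature and
 * exactly one in-use partition entry, of type 0xEE; return 0 otherwise.
 * Assumption: an entry with type 0 is treated as unused.
 */
int
sketch_is_protective_mbr(const unsigned char *sec)
{
	int i, used = 0, pmbr = 0;

	assert(sec != NULL);
	if (sec[510] != 0x55 || sec[511] != 0xAA)
		return (0);
	for (i = 0; i < SKETCH_MBR_PT_ENTRIES; i++) {
		unsigned char type = sec[SKETCH_MBR_PT_OFFSET +
		    i * SKETCH_MBR_PT_ENTSIZE + SKETCH_MBR_TYPE_OFFSET];

		if (type == 0)
			continue;	/* unused slot */
		used++;
		if (type == SKETCH_MBR_TYPE_PMBR)
			pmbr++;
	}
	return (used == 1 && pmbr == 1);
}
```

A class following the "correct behaviour" above would reject the MBR when this returns 1; the interim arrangement in this commit instead keys off the GPT signature.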
|
97318 |
26-May-2002 |
phk |
Add a proof-of-concept encryption class.
"The only hard problem in cryptography is key-management."
All sectors are encrypted with AES in CBC mode using a constant key, currently compiled in and all zero.
To activate this module, write the magic header on the partition:
echo "<<FreeBSD-GEOM-AES>>" | dd conv=sync of=/dev/md98
The encrypted device will be one sector shorter and have ".aes" appended to its name.
Sponsored by: DARPA & NAI Labs.
|
97317 |
26-May-2002 |
phk |
Give the closet-dev_t we hand to the diskdrivers a name.
|
97316 |
26-May-2002 |
phk |
Only clear the spoiled flag if the class had no spoiled method, the spoiled method may have deallocated the consumer already and modifying free()'ed memory is bad style.
Sponsored by: DARPA & NAI Labs.
|
97272 |
25-May-2002 |
bde |
Fixed printf format errors. Most of them are 64-bit daddr_t casualties. Printing daddr_t's using %d format was always an error, but gcc's warning about it was ignored for supported 64-bit arches and not printed for supported 32-bit arches. Hundreds if not thousands of previously "fixed" daddr_t printings are now broken on 32-bit machines by casting daddr_t's to longs. daddr_t's should be printed using %jd format, but this fix uses %lld since %j is not implemented in the kernel yet.
Fixed some nearby format printf errors (style bugs).
|
97078 |
21-May-2002 |
phk |
Introduce the concept of "magic spaces", and implement them in most of the relevant classes.
Some methods may implement various "magic spaces"; these are reserved or magic areas on the disk, set aside for various and sundry purposes. A good example is the BSD disklabel and boot code on i386 which occupies a total of four magic spaces: boot1, the disklabel, the padding behind the disklabel and boot2. The reason we don't simply tell people to write the appropriate stuff on the underlying device is that (some of) the magic spaces might be real-time modifiable. It is for instance possible to change a disklabel while partitions are open, provided the open partitions do not get trampled in the process.
Sponsored by: DARPA & NAI Labs.
|
97075 |
21-May-2002 |
phk |
Remove the "-class" suffix from classes, they will not be ambiguous.
Sponsored by: DARPA & NAI Labs.
|
96987 |
20-May-2002 |
phk |
Don't grab Giant around malloc(9) and free(9). Don't grab Giant around wakeup(9). Don't print verbose messages about each device found in geom_dev. Various cleanups.
Sponsored by: DARPA & NAI Labs.
|
96953 |
19-May-2002 |
phk |
Generalize a bit: we don't need separate functions to find the i386 and alpha disklabels, just one function which is told where to look.
Sponsored by: DARPA & NAI Labs.
|
96952 |
19-May-2002 |
phk |
Include needed #include for regression tests.
Sponsored by: DARPA & NAI Labs.
|
96475 |
12-May-2002 |
phk |
Retire the bogus uses of the disklabel field d_sbsize and begin to initialize it to zero so we don't have to have everybody and their aunt including FFS specific header files.
Sponsored by: DARPA & NAI Labs.
|
95550 |
27-Apr-2002 |
phk |
Fix a {} bug which doesn't have any effect yet.
Spotted by: jake
|
95405 |
24-Apr-2002 |
phk |
Improve the cross-references in the XML output.
Explained by: des Sponsored by: DARPA & NAI Labs.
|
95362 |
24-Apr-2002 |
phk |
Make specific provisions for the kernel simulator used in the regression tests, other userland programs may need to include <geom/geom.h>.
Sponsored by: DARPA & NAI Labs.
|
95323 |
23-Apr-2002 |
phk |
Implement the GEOMGETCONF ioctl which returns vital stats for the current device in XML in an sbuf.
Sponsored by: DARPA & NAI Labs
|
95321 |
23-Apr-2002 |
phk |
All in a day's work: make a function static.
|
95310 |
23-Apr-2002 |
phk |
Introduce some serious paranoia to try to catch a memory overwrite problem as early as possible.
Sponsored by: DARPA & NAI Labs
|
95276 |
22-Apr-2002 |
phk |
Protect against multiple #includes of this file.
|
95038 |
19-Apr-2002 |
phk |
Make kernel dumps work with GEOM.
Notice that if the device on which the dump is set is destroyed for any reason, the dump setting is lost. This in particular will happen in the case of spoilage. For instance if you set dump on ad0s1b and open ad0 for writing, ad0s* will be spoilt and the dump setting lost. See geom(4) for more about spoiling.
Sponsored by: DARPA & NAI Labs.
|
95037 |
19-Apr-2002 |
phk |
Make life easier for reference-vector generators in tools/regression/geom by including a FreeBSD friendly CVS identifier in the XML output.
Sponsored by: DARPA & NAI Labs.
|
94287 |
09-Apr-2002 |
phk |
Implement DIOCGFRONTSTUFF ioctl which reports how many bytes from the start of the device magic stuff might occupy.
Sponsored by: DARPA & NAI Labs.
|
94285 |
09-Apr-2002 |
phk |
Various stylistic nit picking.
Sponsored by: DARPA & NAI Labs.
|
94284 |
09-Apr-2002 |
phk |
Introduce the convenience function g_getattr() and make it DWIM.
Sponsored by: DARPA & NAI Labs.
|
94283 |
09-Apr-2002 |
phk |
Constifixation of attribute argument to g_io_[gs]etattr()
Sponsored by: DARPA & NAI Labs
|
94182 |
08-Apr-2002 |
phk |
Move generic disk ioctls from <sys/disklabel.h> to <sys/disk.h>.
Sponsored by: DARPA & NAI Labs
|
94175 |
08-Apr-2002 |
phk |
In reverence of the 3rd X11 development rule:
3. The only thing worse than generalizing from one example is generalizing from no examples at all.
Remove the fwcylinders attribute before anybody gets the idea that we alone have squared the circle.
Sponsored by: DARPA & NAI Labs.
|
93818 |
04-Apr-2002 |
jhb |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
|
93778 |
04-Apr-2002 |
phk |
Centralize EOF handling and improve access controls for bio scheduling.
Sponsored by: DARPA & NAI Labs
|
93776 |
04-Apr-2002 |
phk |
Move access and orphan member functions from class to geom.
Sponsored by: DARPA & NAI Labs
|
93774 |
04-Apr-2002 |
phk |
s/classs/classes/ to fix up grammar after the previous global renaming.
Sponsored by: DARPA & NAI Labs
|
93657 |
02-Apr-2002 |
phk |
Retire the bogus ioctl DIOCGPART in toto.
Once again we can notice that badly thought out hacks ferment and infect far more code than initially expected.
Sponsored by: DARPA and NAI Labs.
|
93653 |
02-Apr-2002 |
phk |
One less user of the bogus DIOCGPART ioctl.
|
93642 |
02-Apr-2002 |
phk |
Initialize a field to cater for ata-raid
|
93496 |
31-Mar-2002 |
phk |
Here follows the new kernel dumping infrastructure.
Caveats:
The new savecore program is not complete in the sense that it emulates enough of the old savecore's features to do the job, but implements none of the options yet.
I would appreciate if a userland hacker could help me out getting savecore to do what we want it to do from a user's point of view, compression, email-notification, space reservation etc etc. (send me email if you are interested).
Currently, savecore will scan all devices marked as "swap" or "dump" in /etc/fstab _or_ any devices specified on the command-line.
All architectures but i386 lack an implementation of dumpsys(), but looking at the i386 version it should be trivial for anybody familiar with the platform(s) to provide this function.
Documentation is quite sparse at this time, more to come.
Details:
ATA and SCSI drivers should work as the dump formatting code has been removed. The IDA, TWE and AAC have not yet been converted.
Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set the device as dumpdev. To implement the "off" argument, /dev/null is used as the device.
Savecore will fail if handed any options since they are not (yet) implemented. All devices marked "dump" or "swap" in /etc/fstab will be scanned and dumps found will be saved to diskfiles named from the MD5 hash of the header record. The header record is dumped in readable format in the .info file. The kernel is not saved. Only complete dumps will be saved.
All maintainer rights for this code are disclaimed: feel free to improve and extend.
Sponsored by: DARPA, NAI Labs
|
93395 |
29-Mar-2002 |
phk |
Remove bogus ccddump() function in favour of the standard nodump.
|
93358 |
28-Mar-2002 |
phk |
Complete an incomplete cut&paste operation.
|
93354 |
28-Mar-2002 |
phk |
Add preliminary PC98 class to GEOM.
I have not been able to find very much information about the PC98 extended partition layout so this is gleaned from the source in our pc98 architecture. Corrections and patches very welcome.
Sponsored by: DARPA and NAI Labs.
|
93326 |
28-Mar-2002 |
phk |
In the absence of any smarter way to do this, cast various printf arguments to silence printf format warnings.
|
93292 |
27-Mar-2002 |
phk |
Calculate the checksum in the right place for alpha. The fact that this worked for the beast disklabel only goes to show how weak a simple parity really is.
|
93250 |
26-Mar-2002 |
phk |
Eliminate some thread pointers which do not make sense anymore.
Split private parts of geom.h into geom_int.h. The latter should never be included in class implementations.
|
93248 |
26-Mar-2002 |
phk |
Cave in to tradition and rename "methods" to "classes".
|
93238 |
26-Mar-2002 |
phk |
Push BIO_FORMAT into a local hack inside the floppy drivers where it belongs.
|
93097 |
24-Mar-2002 |
phk |
Make the BSD method width/endian agnostic and support alpha architecture labels as well.
Sponsored by: DARPA, NAI Labs.
|
93090 |
24-Mar-2002 |
phk |
Be more systematic about conversion of on-disk formats in an endian/width agnostic way.
Collapse the MBR and MBREXT methods into one file and make them endian/width agnostic.
Sponsored by: DARPA & NAI Labs.
|
92718 |
19-Mar-2002 |
alfred |
Fix bio->bio_blkno format warning.
|
92698 |
19-Mar-2002 |
phk |
Add five GEOM oriented ioctls to get basic information about a geom device.
|
92514 |
17-Mar-2002 |
phk |
Need a different #include for the userland regression test.
|
92513 |
17-Mar-2002 |
phk |
Make this compile in the userland-regression testsuite again.
|
92479 |
17-Mar-2002 |
phk |
Change the giant-dropping method a fair bit to keep WITNESS more happy.
|
92474 |
17-Mar-2002 |
phk |
Forgot to remove the old g_malloc() call when I split it.
Spotted by: dima
|
92408 |
16-Mar-2002 |
phk |
Hmm, talk about optimizer-fodder. Make the DIOCGDVIRGIN hack work again.
|
92403 |
16-Mar-2002 |
phk |
Add a generic and general ioctl pass-through mechanism.
It should now be possible to issue ioctls to SCSI CD drives.
|
92372 |
15-Mar-2002 |
phk |
Teach GEOM about Sun disklabel formats.
The detection code in this method is written so that it should work on all architectures, which means that you can plug a Sun disk into an i386 now and access the partitions.
We still need an endian-agnostic ufs/ffs before this is really interesting, but the main focus was to get sparc64 onto the GEOM trail.
|
92371 |
15-Mar-2002 |
phk |
Try to get used to architectures which are picky about alignment.
|
92363 |
15-Mar-2002 |
mckusick |
Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.
|
92108 |
11-Mar-2002 |
phk |
First commit of the GEOM subsystem to make it easier for people to test and play with this.
This is not yet production quality and should be run only on dedicated test boxes.
For people who want to develop transformations for GEOM there exists a set of shims to run geom in userland (ask phk@freebsd.org).
Reports of all kinds to: phk@freebsd.org
Please include in report:
  dmesg
  sysctl debug.geomdot
  sysctl debug.geomconf
Known significant limitations: no kernel dump facility. ioctls severely restricted.
Sponsored by: DARPA, NAI Labs
|
91406 |
27-Feb-2002 |
jhb |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
88707 |
30-Dec-2001 |
phk |
Reduce kernel stack usage of ccdinit() by MAXPATHLEN by using MALLOC(9).
Submitted by: Maxim Konovalov <maxim@macomnet.ru> MFC after: 1 week
|
86479 |
17-Nov-2001 |
iedowse |
Return EOPNOTSUPP for unknown module events.
PR: kern/18473 Submitted by: "Jeroen C. van Gelderen" <gelderen@systemics.com>
|
83366 |
12-Sep-2001 |
julian |
KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
83291 |
10-Sep-2001 |
kris |
Fix some signed/unsigned integer confusion, and add bounds checking of arguments to some functions.
Obtained from: NetBSD Reviewed by: peter MFC after: 2 weeks
|
82937 |
04-Sep-2001 |
phk |
Kill the NCCD constant by modernizing the ccd driver.
Submitted by: sobomax Reviewed by: phk
|
76366 |
08-May-2001 |
phk |
Polish error handling with biofinish().
|
76322 |
06-May-2001 |
phk |
Actually biofinish(struct bio *, struct devstat *, int error) is more general than the bioerror().
Most of this patch is generated by scripts.
|
74993 |
29-Mar-2001 |
gallatin |
fix a number of printf format string warnings inside DEBUG ifdefs
|
74810 |
26-Mar-2001 |
phk |
Send the remains (such as I have located) of "block major numbers" to the bit-bucket.
|
71773 |
29-Jan-2001 |
phk |
Fix a braino in ccd's clone routine.
Submitted by: tegge
|
71699 |
27-Jan-2001 |
jhb |
Back out proc locking to protect p_ucred for obtaining additional references along with the actual obtaining of additional references.
|
71463 |
23-Jan-2001 |
jhb |
Proc locking in the form of using the proc lock to protect p_ucred while we obtain another reference to it for vnode operations.
|
69781 |
08-Dec-2000 |
dwmalone |
Convert more malloc+bzero to malloc+M_ZERO.
Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
|
65374 |
02-Sep-2000 |
phk |
Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support.
If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present".
This happily removes an ugly hack from kern/vfs_conf.c.
This forces a rename of the eventhandler and the standard clone helper function.
Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h>
Remove all #includes of opt_devfs.h they no longer matter.
|
65208 |
29-Aug-2000 |
phk |
Give ccd a cloning function.
|
62550 |
04-Jul-2000 |
mckusick |
Move the truncation code out of vn_open and into the open system call after the acquisition of any advisory locks. This fix corrects a case in which a process tries to open a file with a non-blocking exclusive lock. Even if it fails to get the lock it would still truncate the file even though its open failed. With this change, the truncation is done only after the lock is successfully acquired.
Obtained from: BSD/OS
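The ordering this fix enforces (acquire the advisory lock first, truncate only on success) can be mimicked from userland. The sketch below is an analogy rather than the kernel change itself: `sketch_open_lock_truncate()` is a hypothetical helper, and it uses flock(2) plus ftruncate(2) in place of the open-time lock and truncate flags.

```c
#define _DEFAULT_SOURCE		/* exposes flock() under strict -std= on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Open 'path' for writing, take a non-blocking exclusive advisory lock,
 * and only then truncate.  Returns the descriptor on success, or -1 if the
 * open, the lock attempt, or the truncation failed.  When the lock cannot
 * be obtained the file contents are left untouched.
 */
int
sketch_open_lock_truncate(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY);	/* deliberately no O_TRUNC here */
	if (fd == -1)
		return (-1);
	if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
		close(fd);		/* somebody else holds the lock */
		return (-1);
	}
	if (ftruncate(fd, 0) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Passing O_TRUNC to open() in this helper would reintroduce the bug described above, since the file would be emptied before the lock attempt had a chance to fail.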
|
60041 |
05-May-2000 |
phk |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes.
Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
|
59841 |
01-May-2000 |
phk |
Convert to struct bio instead of struct buf.
|
59794 |
30-Apr-2000 |
phk |
Remove unneeded #include <vm/vm_zone.h>
Generated by: src/tools/tools/kerninclude
|
59249 |
15-Apr-2000 |
phk |
Complete the bio/buf divorce for all code below devfs::strategy
Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case.
CCD not converted yet, casts to struct buf (still safe)
atapi-cd casts to struct buf to examine B_PHYS
|
58934 |
02-Apr-2000 |
phk |
Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)
Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.
Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack.
Add bio_queue field for struct bio aware disksort.
Address a lot of stylistic issues brought up by bde.
|
58349 |
20-Mar-2000 |
phk |
Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
This patch is machine generated except for the ccd.c and buf.h parts.
|
58345 |
20-Mar-2000 |
phk |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch was machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
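The "exactly one bit set" rule for b_iocmd, and the kind of mistake a zero-valued B_WRITE invites, can be sketched in userland C. This is an illustration only: the DEMO_ constants, their values, and the helper names are assumptions made for this sketch and are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command values; the real kernel constants may differ. */
#define DEMO_BIO_READ    0x01u
#define DEMO_BIO_WRITE   0x02u
#define DEMO_BIO_DELETE  0x04u

/* Old-style encoding (illustrative values): "write" is just "not read". */
#define DEMO_OLD_B_READ  0x00100000u
#define DEMO_OLD_B_WRITE 0x00000000u

/* True if exactly one bit is set: nonzero and a power of two. */
static bool
demo_iocmd_is_valid(unsigned int cmd)
{
	return (cmd != 0 && (cmd & (cmd - 1)) == 0);
}

/* Enforce the rule the way a kernel assertion might. */
static void
demo_iocmd_enforce(unsigned int cmd)
{
	assert(demo_iocmd_is_valid(cmd));
}

/*
 * The coding mistake a zero-valued flag invites: masking with zero
 * always yields zero, so this test can never be true.
 */
static bool
demo_old_is_write_buggy(unsigned int flags)
{
	return ((flags & DEMO_OLD_B_WRITE) != 0);
}
```

With a separate command field, a request that is neither a read nor a write (value 0) or that claims to be both is caught immediately instead of being silently treated as a write.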
|
56825 |
29-Jan-2000 |
peter |
Remove #if NCCD > 0 - it's guaranteed to be true by config if ccd.c is being compiled. (NCCD is used elsewhere though :-( )
|
56098 |
16-Jan-2000 |
phk |
Cleanup some remaining bdev fluff.
|
55756 |
10-Jan-2000 |
phk |
Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by: bde
|
54934 |
21-Dec-1999 |
eivind |
Remove unused variable
|
54655 |
15-Dec-1999 |
eivind |
Introduce NDFREE (and remove VOP_ABORTOP)
|
54279 |
08-Dec-1999 |
ken |
Revamp the devstat priority system. All disks now have the same priority. The same goes for CD drivers and tape drivers. In systems with mixed IDE and SCSI, devices in the same priority class will be sorted in attach order.
Also, the 'CCD' priority is now the 'ARRAY' priority, and a number of drivers have been modified to use that priority.
This includes the necessary changes to all drivers, except the ATA drivers. Soren will modify those separately.
This does not include and does not require any change in the devstat version number, since no known userland applications use the priority enumerations.
Reviewed by: msmith, sos, phk, jlemon, mjacob, bde
|
53577 |
22-Nov-1999 |
phk |
Convert various pieces of code to use vn_isdisk() rather than checking for vp->v_type == VBLK.
In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.
Reviewed by: sos
|
52965 |
07-Nov-1999 |
phk |
Remove the devsw magic from DEV_MODULE()
|
51957 |
05-Oct-1999 |
n_hibma |
Removal of sys/device.h
- Move intrhook stuff into kernel.h
- Remove all occurrences of #device <device.h>
- Add kernel.h where necessary (nowhere)
- Delete device.h
This file contained the structures for cfdata (old style config) and is no longer used. It was included by most drivers.
It confuses the remote debugger as the definition of 'struct device' in device.h is found before the one in bus_private.h.
|
51714 |
27-Sep-1999 |
grog |
Correct typo in comment. putccdbuf() releases a buffer, it doesn't allocate one.
|
51701 |
27-Sep-1999 |
dillon |
Buffer locking code failed to use BUF_KERNPROC and BUF_UNLOCK and BUF_LOCKFREE a buffer prior to physically freeing it. While these bugs did not cause a crash, they might in the future.
Added eof handling for unlabeled partitions.
Submitted by: Tor.Egge@fast.no
|
51658 |
25-Sep-1999 |
phk |
Remove five now unused fields from struct cdevsw. They should never have been there in the first place. A GENERIC kernel shrinks almost 1k.
Add a slightly different safetybelt under nostop for tty drivers.
Add some missing FreeBSD tags
|
51601 |
23-Sep-1999 |
dillon |
Cleanup CCD quite a bit, including adding clarifying comments.
Enhance MIRROR code. Add a few more sanity checks and implement a zone-based disk selector to make use of both disks when reading.
Also implement a read fail-over. If a read error occurs on one disk, the I/O is retried on the other.
NOTE: CCD's mirroring support cannot deal with write errors properly in regards to recovery, meaning that 'old' data under a write error may be read non-deterministically if you reboot after a write error, and CCD certainly cannot deal with a disk changeout. And it still can't. Use vinum if you are really serious about mirroring. CCD basically just implements a poor-man's mirror.
|
51600 |
23-Sep-1999 |
dillon |
Fix ccdiodone code. The code was using cbp->cb_buf.b_bcount to sum the total amount of I/O issued to determine when all the I/O has completed. This fails when the EOF boundary occurs in the middle of an I/O. Using cbp->cb_buf.b_bufsize works better.
|
51580 |
23-Sep-1999 |
dillon |
Fix bug in pseudo-geometry calculation code that assumed a sector size smaller than 1024 bytes.
|
51376 |
18-Sep-1999 |
phk |
Use devstat_end_transaction_buf() rather than devstat_end_transaction()
|
51111 |
09-Sep-1999 |
julian |
Changes to centralise the default blocksize behaviour. More likely to follow.
Submitted by: phk@freebsd.org
|
50830 |
03-Sep-1999 |
julian |
Revert a bunch of controversial changes by PHK. After a quick think and discussion among various people some form of some of these changes will probably be recommitted.
The reversion was requested by dg while discussions proceed. PHK has indicated that he can live with this, and it has been agreed that some form of some of these changes may return shortly after further discussion.
|
50623 |
30-Aug-1999 |
phk |
Make bdev userland access work like cdev userland access unless the highly non-recommended option ALLOW_BDEV_ACCESS is used.
(bdev access is evil because you don't get write errors reported.)
Kill si_bsize_best before it kills Matt :-)
Use the specfs routines rather than having cloned copies in devfs.
|
50477 |
28-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
50403 |
26-Aug-1999 |
phk |
Initialize the dev->si_bsize fields.
Submitted by: tegge Reviewed by: phk
|
49771 |
14-Aug-1999 |
phk |
Spring cleaning around strategy and disklabels/slices:
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout. Please see the comment in sys/conf.h about the flag argument.
Remove strategy argument from all the diskslice/label/bad144 implementations, it should be found from the dev_t.
Remove bogus and unused strategy1 routines.
Remove open/close arguments from dssize(). Pick them up from dev_t.
Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
|
48885 |
18-Jul-1999 |
phk |
Use the vn_todev() function, rather than VOP_GETATTR
|
48865 |
17-Jul-1999 |
phk |
Fix 2nd arg to udev2dev() call in ccd.c
|
48268 |
27-Jun-1999 |
peter |
Initialize and hold locks for ccd generated bufs.
Obtained from: Matt Dillon <dillon@backplane.com>
|
47625 |
30-May-1999 |
phk |
This commit should be an extensive NO-OP:
Reformat and initialize correctly all "struct cdevsw".
Initialize the d_maj and d_bmaj fields.
The d_reset field was not removed, although it is never used.
I used a program to do most of this, so all the files now use the same consistent format. Please keep it that way.
Vinum and i4b not modified, patches emailed to respective authors.
|
47028 |
11-May-1999 |
phk |
Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland.
Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev()
For now they're functions, they will become in-line functions after one of the next two steps in this process.
Return major/minor/makedev to macro-hood for userland.
Register a name in cdevsw[] for the "filedescriptor" driver.
In the kernel the udev_t appears in places where we have the major/minor number combination (i.e. a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to an actual device.
In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few housekeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang).
A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that.
Without DEVT_FASCIST I believe this patch is a no-op.
Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result.
Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).
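The split between the two types can be sketched as follows: a udev_t is only a packed major/minor number, while a dev_t refers to a device that actually exists. The sketch assumes the traditional BSD layout in which the major number occupies bits 8 to 15 and the minor number uses the remaining bits. All demo_ names, the device table, and the major number 74 are hypothetical choices for illustration; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packed major/minor number: may name a device that does not exist. */
typedef uint32_t demo_udev_t;

/* Assumed layout: major in bits 8-15, minor in all the other bits. */
#define DEMO_UMAKEDEV(maj, min) \
	((demo_udev_t)((((maj) & 0xffu) << 8) | ((min) & 0xffff00ffu)))

static unsigned int
demo_umajor(demo_udev_t u)
{
	return ((u >> 8) & 0xffu);
}

static unsigned int
demo_uminor(demo_udev_t u)
{
	return (u & 0xffff00ffu);
}

/* Reference to an actual device, with room for per-device state. */
struct demo_dev {
	demo_udev_t udev;
	const char *name;
};
typedef struct demo_dev *demo_dev_t;

/* Hypothetical table of devices that are really present. */
static struct demo_dev demo_devices[] = {
	{ DEMO_UMAKEDEV(74, 0), "ccd0" },
	{ DEMO_UMAKEDEV(74, 1), "ccd1" },
};

/* Number -> device reference; NULL when no such device is attached. */
static demo_dev_t
demo_udev2dev(demo_udev_t u)
{
	size_t i;

	for (i = 0; i < sizeof(demo_devices) / sizeof(demo_devices[0]); i++)
		if (demo_devices[i].udev == u)
			return (&demo_devices[i]);
	return (NULL);
}

/* Device reference -> number; always succeeds for a valid reference. */
static demo_udev_t
demo_dev2udev(demo_dev_t d)
{
	assert(d != NULL);
	return (d->udev);
}
```

The asymmetry is the point of the sketch: converting a reference to a number cannot fail, but converting a number to a reference can, which is why the two types should not be mixed.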
|
46635 |
07-May-1999 |
phk |
Continue where Julian left off in July 1998:
Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline) function.
Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention to the order of the cmaj/bmaj arguments!)
Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE (ditto!)
(Next step will be to convert all bdev dev_t's to cdev dev_t's before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)
|
46625 |
07-May-1999 |
phk |
Introduce two functions: physread() and physwrite() and use these directly in *devsw[] rather than the 46 local copies of the same functions.
(grog will do the same for vinum when he has time)
|
46576 |
06-May-1999 |
phk |
Don't use <sys/disk.h>
|
44671 |
11-Mar-1999 |
dg |
Fixed variable overflow problem.
Obtained from: NetBSD via Mark J. Taylor <mtaylor@cybernet.com>
|
44617 |
10-Mar-1999 |
mjacob |
Don't forget to remove devstat entries when taking down the CCD device.
|
44126 |
18-Feb-1999 |
ken |
Set the devstat priority for ccd devices to DEVSTAT_PRIORITY_CCD instead of DEVSTAT_PRIORITY_OTHER.
|
43819 |
10-Feb-1999 |
ken |
Add a prioritization field to the devstat_add_entry() call so that peripheral drivers can determine where in the devstat(9) list they are inserted.
This requires recompilation of libdevstat, systat, vmstat, rpc.rstatd, and any ports that depend on the devstat code, since the size of the devstat structure has changed. The devstat version number has been incremented as well to reflect the change.
This sorts devices in the devstat list in "more interesting" to "less interesting" order. So, for instance, da devices are now more important than floppy drives, and so will appear before floppy drives in the default output from systat, iostat, vmstat, etc.
The order of devices is, for now, kept in a central table in devicestat.h. If individual drivers were able to make a meaningful decision on what priority they should be at attach time, we could consider splitting the priority information out into the various drivers. For now, though, they have no way of knowing that, so it's easier to put them in an easy to find table.
Also, move the checkversion() call in vmstat(8) to a more logical place.
Thanks to Bruce and David O'Brien for suggestions, for reviewing this, and for putting up with the long time it has taken me to commit it. Bruce did object somewhat to the central priority table (he would rather the priorities be distributed in each driver), so his objection is duly noted here.
Reviewed by: bde, obrien
|
43295 |
27-Jan-1999 |
dillon |
Fix warnings preparing for -Wall -Wcast-qual
Also disable one usb module in LINT due to fatal compilation errors, temporary.
|
43076 |
22-Jan-1999 |
peter |
Convert ccd to a proper module vs. something started by PSEUDO_SET().
|
39228 |
15-Sep-1998 |
gibbs |
Update system to new device statistics code.
Submitted by: "Kenneth D. Merry" <ken@plutotech.com> mike@smith.net.au (Mike Smith)
|
38438 |
19-Aug-1998 |
sos |
Make struct buf->b_offset reflect the real byte offset which got in via the uio struct. This enables device drivers to use != DEV_BSIZE blocking on devices with weird sector/block sizes (i.e. CDROMs).
|
37389 |
04-Jul-1998 |
julian |
There is no such thing any more as "struct bdevsw".
There is only cdevsw (which should be renamed in a later edit to deventry or something). cdevsw contains the union of what was in both the bdevsw and cdevsw entries. The bdevsw[] table still exists and is a second pointer to the cdevsw entry of the device. Its major is in d_bmaj rather than d_maj. Some cleanup is still to happen (e.g. dsopen now gets two pointers to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw).
rawread()/rawwrite() went away as part of this though it's not strictly the same patch, just that it involves all the same lines in the drivers.
cdroms no longer have write() entries (they did have rawwrite (?)). tapes no longer have support for bdev operations.
Reviewed by: Eivind Eklund and Mike Smith Changes suggested by eivind.
|
37384 |
04-Jul-1998 |
julian |
VOP_STRATEGY grows a (struct vnode *) argument, as the value in b_vp is often not really what you want (and needs to be frobbed). More cleanups will follow this.
Reviewed by: Bruce Evans <bde@freebsd.org>
|
36735 |
07-Jun-1998 |
dfr |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
34437 |
09-Mar-1998 |
julian |
Slightly more correct initialisation of the new buf struct for soft-updates. Submitted by: Chris Csanady <ccsanady@friley585.res.iastate.edu> Suggested by: Kirk McKusick
|
33740 |
22-Feb-1998 |
jkh |
Properly bzero() structures after they're returned from getccdbuf(). Submitted by: Chris Csanady <ccsanady@friley585.res.iastate.edu>
|
33365 |
15-Feb-1998 |
jkh |
Revert part of my previous patch - I don't see the *need* to call splbio() from within an interrupt handler here. :-)
|
33363 |
15-Feb-1998 |
jkh |
Add a missing spl() call and fix an off-by-one error in the handling of the partitions.
Submitted by: Chris Csanady <ccsanady@friley585.res.iastate.edu>
Obtained from: OpenBSD
|
32921 |
31-Jan-1998 |
eivind |
Remove unused devfs include. (Julian or Satoshi might want to add proper DEVFS support here; just including the header file doesn't do any good, and would make this depend on opt_devfs.h)
|
31270 |
18-Nov-1997 |
phk |
There is no ccdread() nor ccdwrite().
|
30688 |
24-Oct-1997 |
phk |
Staticize.
|
30294 |
11-Oct-1997 |
phk |
Remove a #ifndef __FreeBSD__ chunk.
|
26640 |
14-Jun-1997 |
bde |
Removed unused #includes.
|
25360 |
01-May-1997 |
sos |
Make ccd use the maxsecsize sector size as denominator; this fixes ccd on != 512 byte devices.
|
24203 |
24-Mar-1997 |
bde |
Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include it when it is not used. In most cases, the reasons for including it went away when the special ioctl headers became self-sufficient.
|
22975 |
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
22538 |
10-Feb-1997 |
mpp |
Make ccd compile again after the Lite2 merge.
VOP_UNLOCK was being called with the wrong number of arguments.
|
21673 |
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
21470 |
10-Jan-1997 |
dyson |
Fix CCD for bounced devices.
|
18084 |
06-Sep-1996 |
phk |
Remove devconf, it never grew up to be of any use.
|
17275 |
24-Jul-1996 |
asami |
Fail when odd number of disks are specified with mirror flag. Memory leak fixes. Miscellaneous cleanup.
Partially submitted by: Matt White <mwhite+@CMU.EDU>
|
17264 |
23-Jul-1996 |
phk |
Make a "DWIM" function for adding [bc]devsw entries for bdev drivers.
Saves about 280 bytes of source per driver, 56 bytes in object size and another 56 bytes moves from data to bss.
No functional change intended nor expected.
GENERIC should be about one k smaller now :-)
|
17237 |
21-Jul-1996 |
phk |
Substitute raw{read|write} for ccd{read|write}
|
16322 |
12-Jun-1996 |
gpalmer |
Clean up -Wunused warnings.
Reviewed by: bde
|
15765 |
13-May-1996 |
asami |
Add #ifndef/#endif around the "#define CCD_OFFSET 16", so you can override it in your kernel config file.
Requested (in essence) by: phk
|
15763 |
13-May-1996 |
asami |
Leave 16 sectors in front of each component partition. It's now safe to use sd87a or sd237e even if they start at the beginning of the slice.
You can also use sd85c if you prefer, although you need to change the type field in the disklabel to "4.2BSD".
|
15369 |
24-Apr-1996 |
asami |
Add missing "int" to static var.
|
14821 |
26-Mar-1996 |
asami |
Change how mirror writes are handled, according to the discussion on the mailing list.
When initiating a write, ccdbuffer() returns two "struct ccdbuf *"s linked together by the cb_mirror field. "cb_pflags & CCDPF_MIRROR_DONE" is set to 0 on both of them.
When a component returns to ccdiodone(), it checks if "cb_pflags & CCDPF_MIRROR_DONE" is set or not. If not, it sets the partner's flag and returns. If it is, it means its partner has already returned, so it will go to the regular cleanup (which is in the fallthrough code).
There should be no performance or functionality changes unless the higher-level scsi driver does something with the resid value. The change is purely aesthetical and prepares us for the parity implementation.
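The completion handshake described above can be modelled in a few lines of C. This is a minimal sketch under stated assumptions: the struct is reduced to the two fields the message names, the flag value is made up, the demo_ function names are hypothetical, and completions are assumed to be serialized (as they would be in an interrupt-protected section), so no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CCDPF_MIRROR_DONE 0x01	/* hypothetical flag value */

/* Reduced stand-in for struct ccdbuf: only the fields named above. */
struct demo_ccdbuf {
	int cb_pflags;
	struct demo_ccdbuf *cb_mirror;
};

/* Link the two halves of a mirrored write and clear the done flag on both. */
static void
demo_link_mirror(struct demo_ccdbuf *a, struct demo_ccdbuf *b)
{
	a->cb_pflags = 0;
	b->cb_pflags = 0;
	a->cb_mirror = b;
	b->cb_mirror = a;
}

/*
 * Called once per completed component write.
 * Returns 0 if this is the first half to finish (the partner is marked
 * and nothing else happens), or 1 if the partner already finished and
 * the caller should fall through to the regular cleanup.
 */
static int
demo_mirror_iodone(struct demo_ccdbuf *cbp)
{
	assert(cbp->cb_mirror != NULL);
	if ((cbp->cb_pflags & DEMO_CCDPF_MIRROR_DONE) == 0) {
		cbp->cb_mirror->cb_pflags |= DEMO_CCDPF_MIRROR_DONE;
		return (0);
	}
	return (1);
}
```

Whichever component returns first only marks its partner; the second one finds its own flag set and performs the cleanup, so the cleanup runs exactly once regardless of completion order.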
|
14730 |
21-Mar-1996 |
asami |
Ported to 2.2-current. Uses [bc]devsw_add(), and is also now a proper pseudo-device.
Doesn't use devfs correctly yet.
|
13784 |
31-Jan-1996 |
asami |
Fix one warning and fix one bug found while looking at another warning (but caused by a different reason):
. #ifndef __FreeBSD__ around check for negative size, FreeBSD size_t is unsigned
. Disable mirror/parity if interleave size is 0 (i.e., serial concatenation).
|
13775 |
31-Jan-1996 |
asami |
Mirror support. When CCDF_MIRROR is set:
(1) The reads are always done from the first n/2 disks.
(2) Each write is done twice, to the "data" disk (in the first half) and the "mirror" disk (in the second half).
ccdbuffer() now takes an extra argument (struct ccdbuf **) and stores the pointer to ccdbuf in there. In case of a mirrored write, it allocates and stores two pointers. The "residual" is also doubled for mirrored writes so that ccdiodone() can correctly tell when all the writes are done.
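The layout and the doubled residual can be illustrated with a small sketch. The message only says that data disks are in the first half and mirror disks in the second half; pairing data disk i with disk i + n/2 is an assumption made here, as are the demo_ function names and the requirement of an even disk count.

```c
#include <assert.h>

/*
 * Index of the mirror disk for a given data disk, assuming data disk i
 * is paired with disk i + n/2 (an assumption of this sketch).
 */
static int
demo_mirror_of(int data_disk, int ndisks)
{
	assert(ndisks > 0 && ndisks % 2 == 0);
	assert(data_disk >= 0 && data_disk < ndisks / 2);
	return (data_disk + ndisks / 2);
}

/*
 * Bytes that must complete before the whole request is done.
 * A mirrored write is issued twice, so its residual is doubled;
 * reads go to the first half only and are counted once.
 */
static long
demo_initial_residual(long bcount, int is_write, int mirrored)
{
	return ((is_write && mirrored) ? 2 * bcount : bcount);
}
```

For example, with four disks a write aimed at data disk 1 would also go to disk 3, and an 8192-byte mirrored write would start with a residual of 16384 so that the request is only considered finished after both copies complete.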
|
13764 |
30-Jan-1996 |
asami |
Prepare for adding mirroring. Check for flags (mirror forces uniform), reduce the size to half, etc. Right now it only uses the first n/2 disks for both read and write.
|
13173 |
02-Jan-1996 |
asami |
Prepare to add support for parity. Report the post-parity size, allocate space around parity blocks.
|
13070 |
28-Dec-1995 |
asami |
Added $Id$.
|
13046 |
27-Dec-1995 |
asami |
Changes to make it work on FreeBSD-2.1.
|
13041 |
27-Dec-1995 |
asami |
ccd.c and ccd.4 from NetBSD-current circa 12/25/95.
|