267961 |
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
255219 |
05-Sep-2013 |
pjd |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; };
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements.
The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine few rights, but the rights have to belong to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition.
This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
|
254263 |
12-Aug-2013 |
scottl |
Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI command register. The lazy BAR allocation code in FreeBSD sometimes disables this bit when it detects a range conflict, and will re-enable it on demand when a driver allocates the BAR. Thus, the bit is no longer a reliable indication of capability, and should not be checked. This results in the elimination of a lot of code from drivers, and also gives the opportunity to simplify a lot of drivers to use a helper API to set the busmaster enable bit.
This changes fixes some recent reports of disk controllers and their associated drives/enclosures disappearing during boot.
Submitted by: jhb Reviewed by: jfv, marius, achadd, achim MFC after: 1 day
|
246713 |
12-Feb-2013 |
kib |
Reform the busdma API so that new types may be added without modifying every architecture's busdma_machdep.c. It is done by unifying the bus_dmamap_load_buffer() routines so that they may be called from MI code. The MD busdma is then given a chance to do any final processing in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual addresses for sync(). Previously this was done in a type specific way. Now it is done in a generic way by recording the list of virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon) Reviewed by: kan (previous version), scottl, mjacob (isp(4), no objections for target mode changes) Discussed with: ian (arm changes) Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris), amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
|
234501 |
20-Apr-2012 |
jhb |
The amr(4) firmware contains a rather dubious "feature" where it assumes for small buffers (< 64k) that the OS driver is actually using a buffer rounded up to the next power of 2. It also assumes that the buffer is at least 4k in size. Furthermore, there is at least one known instance of megarc sending a request with a 12k buffer where the firmware writes out a 24k-ish reply.
To workaround the data corruption triggered by this "feature", ensure that buffers for user commands use a minimum size of 32k, and that buffers between 32k and 64k use a 64k buffer.
PR: kern/155658 Submitted by: Andreas Longwitz longwitz incore de Reviewed by: scottl MFC after: 1 week
|
227843 |
22-Nov-2011 |
marius |
- There's no need to overwrite the default device method with the default one. Interestingly, these are actually the default for quite some time (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9) since r52045) but even recently added device drivers do this unnecessarily. Discussed with: jhb, marcel - While at it, use DEVMETHOD_END. Discussed with: jhb - Also while at it, use __FBSDID.
|
224778 |
11-Aug-2011 |
rwatson |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
196037 |
02-Aug-2009 |
attilio |
Make the newbus subsystem Giant free by adding the new newbus sxlock. The newbus lock is responsible for protecting newbus internIal structures, device states and devclass flags. It is necessary to hold it when all such datas are accessed. For the other operations, softc locking should ensure enough protection to avoid races.
Newbus lock is automatically held when virtual operations on the device and bus are invoked when loading the driver or when the suspend/resume take place. For other 'spourious' operations trying to access/modify the newbus topology, newbus lock needs to be automatically acquired and dropped.
For the moment Giant is also acquired in some key point (modules subsystem) in order to avoid problems before the 8.0 release as module handlers could make assumptions about it. This Giant locking should go just after the release happens.
Please keep in mind that the public interface can be expanded in order to provide more support, if there are really necessities at some point and also some bugs could arise as long as the patch needs a bit of further testing.
Bump __FreeBSD_version in order to reflect the newbus lock introduction.
Reviewed by: ed, hps, jhb, imp, mav, scottl No answer by: ariff, thompsa, yongari Tested by: pho, G. Trematerra <giovanni dot trematerra at gmail dot com>, Brandon Gooch <jamesbrandongooch at gmail dot com> Sponsored by: Yahoo! Incorporated Approved by: re (ksmith)
|
183397 |
27-Sep-2008 |
ed |
Replace all calls to minor() with dev2unit().
After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere.
This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware.
Reviewed by: kib
|
168752 |
15-Apr-2007 |
scottl |
Remove Giant from CAM. Drivers (SIMs) now register a mutex that CAM will use to synchornize and protect all data objects that are used for that SIM. Drivers that are not yet MPSAFE register Giant and operate as usual. RIght now, no drivers are MPSAFE, though a few will be changed in the coming week as this work settles down.
The driver API has changed, so all CAM drivers will need to be recompiled. The userland API has not changed, so tools like camcontrol do not need to be recompiled.
|
163816 |
31-Oct-2006 |
mjacob |
The first of 3 major steps to move the CAM layer forward to using the CAM_NEW_TRAN_CODE that has been in the tree for some years now.
This first step consists solely of adding to or correcting CAM_NEW_TRAN_CODE pieces in the kernel source tree such that a both a GENERIC (at least on i386) and a LINT build with CAM_NEW_TRAN_CODE as an option will compile correctly and run (at least with some the h/w I have).
After a short settle time, the other pieces (making CAM_NEW_TRAN_CODE the default and updating libcam and camcontrol) will be brought in.
This will be an incompatible change in that the size of structures related to XPT_PATH_INQ and XPT_{GET,SET}_TRAN_SETTINGS change in both size and content. However, basic system operation and basic system utilities work well enough with this change.
Reviewed by: freebsd-scsi and specific stakeholders
|
153409 |
14-Dec-2005 |
scottl |
Mega update to the LSI MegaRAID driver:
1. Implement a large set of ioctl shims so that the Linux management apps from LSI will work. This includes infrastructure to support adding, deleting and rescanning arrays at runtime. This is based on work from Doug Ambrosko, heavily augmented by LSI and Yahoo.
2. Implement full 64-bit DMA support. Systems with more than 4GB of RAM can now operate without the cost of bounce buffers. Cards that cannot do 64-bit DMA will automatically revert to using bounce buffers. This option can be forced off by setting the 'hw.amr.force_sg32" tunable in the loader. It should only be turned off for debugging purposes. This work was sponsored by Yahoo.
3. Streamline the command delivery and interrupt handler paths after much discussion with Dell and LSI. The logic now closely matches the intended design, making it both more robust and much faster. Certain i/o failures under heavy load should be fixed with this.
4. Optimize the locking. In the interrupt handler, the card can be checked for completed commands without any locks held, due to the handler being implicitely serialized and there being no need to look at any shared data. Only grab the lock to return the command structure to the free pool. A small optimization can still be made to collect all of the completions together and then free them together under a single lock.
Items 3 and 4 significantly increase the performance of the driver. On an LSI 320-2X card, transactions per second went from 13,000 to 31,000 in my testing with these changes. However, these changes are still fairly experimental and shouldn't be merged to 6.x until there is more testing.
Thanks to Doug Ambrosko, LSI, Dell, and Yahoo for contributing towards this.
|
140340 |
16-Jan-2005 |
scottl |
Lock the AMR driver: - Introduce the amr_io_lock to control access to command queues, bio queues, and the hardware. - Eliminate the taskqueue and do all completion processing in the ithread. - Assign a static slot number to each command instead of doing a linear search for free slots each time a command is needed. - Modify the interrupt handler to more closely match what Linux does, for safety.
|
138422 |
05-Dec-2004 |
scottl |
Fix a number of bugs and significantly alter the command execution path to properly support bounce buffers and resource shortages. This allows the driver to work properly and reliably with more than 4GB of RAM. Of the three data paths that exist in the driver, (block, CAM, ioctl), the ioctl path has not been well tested with these changes due to difficulty with finding an application that uses it that actually works.
Sponsored by: The FreeBSD Foundation and FreeBSD Systems, Inc.
|
125975 |
18-Feb-2004 |
phk |
Change the disk(9) API in order to make device removal more robust.
Previously the "struct disk" were owned by the device driver and this gave us problems when the device disappared and the users of that device were not immediately disappearing.
Now the struct disk is allocate with a new call, disk_alloc() and owned by geom_disk and just abandonned by the device driver when disk_create() is called.
Unfortunately, this results in a ton of "s/\./->/" changes to device drivers.
Since I'm doing the sweep anyway, a couple of other API improvements have been carried out at the same time:
The Giant awareness flag has been flipped from DISKFLAG_NOGIANT to DISKFLAG_NEEDSGIANT
A version number have been added to disk_create() so that we can detect, report and ignore binary drivers with old ABI in the future.
Manual page update to follow shortly.
|
117126 |
01-Jul-2003 |
scottl |
Mega busdma API commit.
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg. Lockfunc allows a driver to provide a function for managing its locking semantics while using busdma. At the moment, this is used for the asynchronous busdma_swi and callback mechanism. Two lockfunc implementations are provided: busdma_lock_mutex() performs standard mutex operations on the mutex that is specified from lockfuncarg. dftl_lock() is a panic implementation and is defaulted to when NULL, NULL are passed to bus_dma_tag_create(). The only time that NULL, NULL should ever be used is when the driver ensures that bus_dmamap_load() will not be deferred. Drivers that do not provide their own locking can pass busdma_lock_mutex,&Giant args in order to preserve the former behaviour.
sparc64 and powerpc do not provide real busdma_swi functions, so this is largely a noop on those platforms. The busdma_swi on is64 is not properly locked yet, so warnings will be emitted on this platform when busdma callback deferrals happen.
If anyone gets panics or warnings from dflt_lock() being called, please let me know right away.
Reviewed by: tmm, gibbs
|
111528 |
26-Feb-2003 |
scottl |
Introduce a new taskqueue that runs completely free of Giant, and in turns runs its tasks free of Giant too. It is intended that as drivers become locked down, they will move out of the old, Giant-bound taskqueue and into this new one. The old taskqueue has been renamed to taskqueue_swi_giant, and the new one keeps the name taskqueue_swi.
|
106225 |
30-Oct-2002 |
emoore |
amr.c, amr_cam.c, amrreg.h, amrvar.h: - added support for 12/16 byte cdb's, effecting CAM branch only ( non-disk support )
amrreg.h: - increased number of scatter gather elements from 16 to 26.
amr_pci.c: - amr_pci_free(), incorrect bus tag meant for 'amr_mailbox_dmat' was being freed
all: - copyright change requested by scottl
Reviewed by: ps,scottl MFC after: 1 week
|
105419 |
18-Oct-2002 |
emoore |
(1) added LSI Logic copyright, and legal line 3 in license, and string changes for "LSILogic" (2) enabled non-disk support through CAM interface (3) HA_INQ (a) enabled tagged queuing (b) disable reset during driver loading (b) renamed BSDi string to LSI (4) disabled detecting disk devices during SCSI INQUIRY (5) changed dcdb single element sglist to send one entire buffer chunk (6) nsgelem not set in sglist (7) ap_data_transfer_length not set for dcdb (8) changed "struct thread" to "d_thread_t" for compatibliity { xxx_open, xxx_close, xxx_ioctl } (9) miscellaneous compatiblity fixes (10) bug fix for 0x0409/0x1000 card (11) added compiling amr_cam.c in sys/conf/files (12) added compiling amr_cam.c in sys/modules/amr/Makefile
Reviewed by:ps MFC after:1 week 1 week
|
103714 |
20-Sep-2002 |
phk |
(This commit touches about 15 disk device drivers in a very consistent and predictable way, and I apologize if I have gotten it wrong anywhere, getting prior review on a patch like this is not feasible, considering the number of people involved and hardware availability etc.)
If struct disklabel is the messenger: kill the messenger.
Inside struct disk we had a struct disklabel which disk drivers used to communicate certain metrics to the disklayer above (GEOM or the disk mini-layer). This commit changes this communication to use four explicit fields instead.
Amongst the benefits is that the fields do not get overwritten by wrong or bogus on-disk disklabels.
Once that is clear, <sys/disk.h> which is included in the drivers no longer need to pull <sys/disklabel.h> and <sys/diskslice.h> in, the few places that needs them, have gotten explicit #includes for them.
The disklabel inside struct disk is now only for internal use in the disk mini-layer, so instead of embedding it, we malloc it as we need it.
This concludes (modulus any mistakes) the series of disklabel related commits.
I belive it all amounts to a NOP for all the rest of you :-)
Sponsored by: DARPA & NAI Labs.
|
103675 |
20-Sep-2002 |
phk |
Make FreeBSD "struct disklabel" agnostic, step 311 of 723:
Rename diskerr() to disk_err() for naming consistency.
Drop the by now entirely useless struct disklabel argument.
Add a flag argument for new-line termination.
Fix a couple of printf-format-casts to %j instead of %l.
Correctly print the name of all bio commands.
Move the function from subr_disklabel.c to subr_disk.c, and from <sys/disklabel.h> to <sys/disk.h>.
Use the new disk_err() throughout, #include <sys/disk.h> as needed.
Bump __FreeBSD_version for the sake of the aac disk drivers #ifdefs.
Remove unused disklabel members of softc for aac, amr and mlx, which seem to originally have been intended for diskerr() use, but which only rotted and got Copy&Pasted at least two times to many.
Sponsored by: DARPA & NAI Labs.
|
58883 |
01-Apr-2000 |
msmith |
Update to latest working version.
- Add periodic status monitoring routine. Currently just detects lost commands, further functionality pending data from AMI. Add some new commands states; WEDGED (never coming back) and LATE (for when a command that wasmarked as WEDGED comes bacj,
- Remove a number of redundant efforts to poll the card for completed commands. This is what interrupt handlers are for.
- Limit the maximum number of outstanding I/O transactions. It seems that some controllers report more than they can really handle, and exceding this limit can cause the controller to lock up.
- Don't use 'wait' mode for anything where the controller might not be able to generate interrupts. (Keep the 'wait' mode though sa it will become useful when we start taking userspace commands.
- Use a similar atomic locking trategy to the Mylex driver to prevent some reentrancy problems.
- Correctly calculate the block count for non-whoile-bloch transfers (actually illegal).
- Use the dsik device's si_drv1 field instead of b_driver1 in the buf struct to pass the driver identifier arond.
- Rewrite amr_start and amr_done() along the lines of the Mylex driver in order to improve robustnes.
- Always force the PCI busmaster bit on.
|
58345 |
20-Mar-2000 |
phk |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch were machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
|
54279 |
08-Dec-1999 |
ken |
Revamp the devstat priority system. All disks now have the same priority. The same goes for CD drivers and tape drivers. In systems with mixed IDE and SCSI, devices in the same priority class will be sorted in attach order.
Also, the 'CCD' priority is now the 'ARRAY' priority, and a number of drivers have been modified to use that priority.
This includes the necessary changes to all drivers, except the ATA drivers. Soren will modify those separately.
This does not include and does not require any change in the devstat version number, since no known userland applications use the priority enumerations.
Reviewed by: msmith, sos, phk, jlemon, mjacob, bde
|