#
355341 |
|
03-Dec-2019 |
mav |
MFC r355023: Do not retry long ready waits if previous gave nothing.
I have some disks reporting "Logical unit is in process of becoming ready" for about half an hour before finally reporting failure. During that time CAM waits for the readiness during ~2 minutes for each request, that makes system boot take very long time.
This change reduces wait times for the following requests to ~1 second if previously long wait for that device has timed out.
|
#
346758 |
|
26-Apr-2019 |
mav |
MFC r345656: Do not map small IOCTL buffers to KVA, but copy.
CAM IOCTL interfaces traditionally mapped user-space data buffers to KVA. It was nice originally, but now it takes too much to handle respective TLB shootdowns, while small kernel memory allocations up to 64KB backed by UMA and accompanied by copyin()/copyout() can be much cheaper.
For large buffers mapping still may have sense, and unmapped I/O would be even better, but the last unfortunately is more tricky, since unmapped I/O API is too specific to struct bio now.
Sponsored by: iXsystems, Inc.
|
#
316139 |
|
29-Mar-2017 |
mav |
MFC r314870: Add mechanism to unload CAM periph drivers.
For now it allows to unload CTL kernel module if there are no target-capable SIMs in CAM. As next step full teardown of CAM targets can be implemented.
|
#
308351 |
|
05-Nov-2016 |
markj |
MFC r306529: cam_periph_ccbwait could return while ccb in progress
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
288420 |
|
30-Sep-2015 |
mav |
Make pass, sg and targ drivers respect HBA's maxio.
Previous limitation of 64K (DFLTPHYS) is quite annoying.
|
#
260541 |
|
11-Jan-2014 |
mav |
Take additional reference on SCSI probe periph to cover its freeze count.
Otherwise periph may be invalidated and freed before single-stepping freeze is dropped, causing use after free panic.
|
#
256843 |
|
21-Oct-2013 |
mav |
Merge CAM locking changes from the projects/camlock branch to radically reduce lock congestion and improve SMP scalability of the SCSI/ATA stack, preparing the ground for the coming next GEOM direct dispatch support.
Replace big per-SIM locks with bunch of smaller ones: - per-LUN locks to protect device and peripheral drivers state; - per-target locks to protect list of LUNs on target; - per-bus locks to protect reference counting; - per-send queue locks to protect queue of CCBs to be sent; - per-done queue locks to protect queue of completed CCBs; - remaining per-SIM locks now protect only HBA driver internals.
While holding LUN lock it is allowed (while not recommended for performance reasons) to take SIM lock. The opposite acquisition order is forbidden. All the other locks are leaf locks, that can be taken anywhere, but should not be cascaded. Many functions, such as: xpt_action(), xpt_done(), xpt_async(), xpt_create_path(), etc. are no longer require (but allow) SIM lock to be held.
To keep compatibility and solve cases where SIM lock can't be dropped, all xpt_async() calls in addition to xpt_done() calls are queued to completion threads for async processing in clean environment without SIM lock held.
Instead of single CAM SWI thread, used for commands completion processing before, use multiple (depending on number of CPUs) threads. Load balanced between them using "hash" of the device B:T:L address.
HBA drivers that can drop SIM lock during completion processing and have sufficient number of completion threads to efficiently scale to multiple CPUs can use new function xpt_done_direct() to avoid extra context switch. Make ahci(4) driver to use this mechanism depending on hardware setup.
Sponsored by: iXsystems, Inc. MFC after: 2 months
|
#
256552 |
|
15-Oct-2013 |
mav |
Unify periph invalidation and destruction reporting. Print message containing device model and serial number on invalidation.
Requested by: glebius MFC after: 1 week
|
#
249585 |
|
17-Apr-2013 |
gabor |
- Corrrect mispellings of word useful
Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)
|
#
249466 |
|
14-Apr-2013 |
mav |
MFprojects/camlock r248890, r248897, r248898, r248900, r248903, r248905, r248917, r248918, r248978, r249001, r249014, r249030:
Remove multilevel freezing mechanism, implemented to handle specifics of the ATA/SATA error recovery, when post-reset recovery commands should be allocated when queues are already full of payload requests. Instead of removing frozen CCBs with specified range of priorities from the queue to provide free openings, use simple hack, allowing explicit CCBs over- allocation for requests with priority higher (numerically lower) then CAM_PRIORITY_OOB threshold.
Simplify CCB allocation logic by removing SIM-level allocation queue. After that SIM-level queue manages only CCBs execution, while allocation logic is localized within each single device.
Suggested by: gibbs
|
#
248874 |
|
29-Mar-2013 |
marius |
Unbreak compilation after r248868.
|
#
248868 |
|
29-Mar-2013 |
mav |
Implement CAM_PERIPH_FOREACH() macro, safely iterating over the list of driver's periphs, acquiring and releaseing periph references while doing it.
Use it to iterate over the lists of ada and da periphs when flushing caches and putting devices to sleep on shutdown and suspend. Previous code could panic in theory if some device disappear in the middle of the process.
|
#
236814 |
|
09-Jun-2012 |
mav |
One more major cam_periph_error() rewrite to improve error handling and reporting. It includes: - removing of error messages controlled by bootverbose, replacing them with more universal and informative debugging on CAM_DEBUG_INFO level, that is now built into the kernel by default; - more close following to the arguments submitted by caller, such as SF_PRINT_ALWAYS, SF_QUIET_IR and SF_NO_PRINT; consumer knows better which errors are usual/expected at this point and which are really informative; - adding two new flags SF_NO_RECOVERY and SF_NO_RETRY to allow caller specify how much assistance it needs at this point; previously consumers controlled that by not calling cam_periph_error() at all, but that made behavior inconsistent and debugging complicated; - tuning debug messages and taken actions order to make debugging output more readable and cause-effect relationships visible; - making camperiphdone() (common device recovery completion handler) to also use cam_periph_error() in most cases, instead of own dumb code; - removing manual sense fetching code from cam_periph_error(); I was told by number of people that it is SIM obligation to fetch sense data, so this code is useless and only significantly complicates recovery logic; - making ada, da and pass driver to use cam_periph_error() with new limited recovery options to handle error recovery and debugging in common way; as one of results, CAM_REQUEUE_REQ and other retrying statuses are now working fine with pass driver, that caused many problems before. - reverting r186891 by raj@ to avoid burning few seconds in tight DELAY() loops on device probe, while device simply loads media; I think that problem may already be fixed in other way, and even if it is not, solution must be different.
Sponsored by: iXsystems, Inc. MFC after: 2 weeks
|
#
230000 |
|
11-Jan-2012 |
ken |
Fix a race condition in CAM peripheral free handling, locking in the CAM XPT bus traversal code, and a number of other periph level issues.
cam_periph.h, cam_periph.c: Modify cam_periph_acquire() to test the CAM_PERIPH_INVALID flag prior to allowing a reference count to be gained on a peripheral. Callers of this function will receive CAM_REQ_CMP_ERR status in the situation of attempting to reference an invalidated periph. This guarantees that a peripheral scheduled for a deferred free will not be accessed during its wait for destruction.
Panic during attempts to drop a reference count on a peripheral that already has a zero reference count.
In cam_periph_list(), use a local sbuf with SBUF_FIXEDLEN set so that mallocs do not occur while the xpt topology lock is held, regardless of the allocation policy of the passed in sbuf.
Add a new routine, cam_periph_release_locked_buses(), that can be called when the caller already holds the CAM topology lock.
Add some extra debugging for duplicate peripheral allocations in cam_periph_alloc().
Treat CAM_DEV_NOT_THERE much the same as a selection timeout (AC_LOST_DEVICE is emitted), but forgo retries.
cam_xpt.c: Revamp the way the EDT traversal code does locking and reference counting. This was broken, since it assumed that the EDT would not change during traversal, but that assumption is no longer valid.
So, to prevent devices from going away while we traverse the EDT, make sure we properly lock everything and hold references on devices that we are using.
The two peripheral driver traversal routines should be examined. xptpdperiphtraverse() holds the topology lock for the entire time it runs. xptperiphtraverse() is now locked properly, but only holds the topology lock while it is traversing the list, and not while the traversal function is running.
The bus locking code in xptbustraverse() should also be revisited at a later time, since it is complex and should probably be simplified.
scsi_da.c: Pay attention to the return value from cam_periph_acquire().
Return 0 always from daclose() even if the disk is now gone.
Add some rudimentary error injection support.
scsi_sg.c: Fix reference counting in the sg(4) driver.
The sg driver was calling cam_periph_release() on close, but never called cam_periph_acquire() (which increments the reference count) on open.
The periph code correctly complained that the sg(4) driver was trying to decrement the refcount when it was already 0.
Sponsored by: Spectra Logic MFC after: 2 weeks
|
#
223081 |
|
14-Jun-2011 |
gibbs |
Lay groundwork in CAM for recording and reporting physical path and other device attributes stored in the CAM Existing Device Table (EDT). This includes some infrastructure requried by the enclosure services driver to export physical path information.
Make the CAM device advanced info interface accept store requests.
sys/cam/scsi/scsi_all.c: sys/cam/scsi/scsi_all.h: - Replace scsi_get_sas_addr() with a scsi_get_devid() which takes a callback that decides whether to accept a particular descriptor. Provide callbacks for NAA IEEE Registered addresses and for SAS addresses, replacing the old function. This is needed because the old function doesn't work for an enclosure address for a SAS device, which is not flagged as a SAS address, but is NAA IEEE Registered. It may be worthwhile merging this interface with the devid match interface. - Add a few more defines for some device ID fields.
sbin/camcontrol/camcontrol.c: - Update for the CCB_DEV_ADVINFO interface change.
cam/cam_xpt_internal.h: - Add the new fields for the physical path string to the CAM EDT. cam/cam_ccb.h: - Rename CCB_GDEV_ADVINFO to simply CCB_DEV_ADVINFO, and the ccb structure to ccb_dev_advinfo. - Add a flag that changes this CCB's action to store, rather than the default, retrieve. - Add a new buffer type, CDAI_TYPE_PHYS_PATH, for the new CAM EDT physpath field. - Remove the never-implemented transport & proto flags. cam/cam_xpt.c: cam/cam_xpt.h: - Add xpt_getattr(), which provides a wrapper for fetching a device's attribute using the GEOM strings as key. This method currently supports "GEOM::ident" and "GEOM::physpath".
Submitted by: will Reviewed by : gibbs
Extend the XPT_DEV_MATCH api to allow a device search by device ID. As far as the API is concerned, device ID is a binary blob to be interpreted by the transport layer. The SCSI implementation assumes it is an array of VPD device ID descriptors.
sys/cam/cam_ccb.h: Create a new structure, device_id_match_pattern, and update the XPT_DEV_MATCH datastructures and flags so that this pattern type can be used.
sys/cam/cam_xpt.c: - A single pattern matching on both inquiry data and device ID is invalid. Report any violators. - Pass device ID match requests through to the new routine scsi_devid_match(). The direct call of a SCSI routine is a layering violation, but no worse than the one a few lines up that checks inquiry data. Defer cleaning this up until our future, larger, rototilling of CAM. - Zero out cam_ed and cam_et nodes on allocation. Prior to this change, device_id_len and device_id were not inialized, preventing proper detection of the presence of this information.
sys/cam/scsi/scsi_all.c: sys/cam/scsi/scsi_all.h: Add the scsi_match_devid() routine.
Add a helper function for extracting peripherial driver names
sys/cam/cam_periph.c: sys/cam/cam_periph.h: Add the cam_periph_list() method which fills an sbuf with a comma delimited list of the peripheral instances associated with a given CAM path.
Add a helper functions for SCSI commands used by the SES driver.
sys/cam/scsi/scsi_all.c: sys/cam/scsi/scsi_all.h: Add structure definitions and csio filling functions for the receive diagnostic results and send diagnostic commands.
Misc CAM XPT cleanups.
sys/cam/cam_xpt.c: Broadcast AC_FOUND_DEVICE and AC_PATH_REGISTERED events at the time async event handlers are attached even when registering just for events on a partitular SIM. Previously, you had to register for these events on all SIMs in the system in order to get the initial broadcast even though subsequent device and path arrivals would be delivered.
sys/cam/cam_xpt.c: Remove SIM mutex held asserts from path accessors. CAM paths are reference counted and it is this reference count, not the sim mutex, that garantees they are stable.
Sponsored by: Spectra Logic Corporation
|
#
203108 |
|
28-Jan-2010 |
mav |
MFp4: Large set of CAM inprovements.
- Unify bus reset/probe sequence. Whenever bus attached at boot or later, CAM will automatically reset and scan it. It allows to remove duplicate code from many drivers. - Any bus, attached before CAM completed it's boot-time initialization, will equally join to the process, delaying boot if needed. - New kern.cam.boot_delay loader tunable should help controllers that are still unable to register their buses in time (such as slow USB/ PCCard/ CardBus devices), by adding one more event to wait on boot. - To allow synchronization between different CAM levels, concept of requests priorities was extended. Priorities now split between several "run levels". Device can be freezed at specified level, allowing higher priority requests to pass. For example, no payload requests allowed, until PMP driver enable port. ATA XPT negotiate transfer parameters, periph driver configure caching and so on. - Frozen requests are no more counted by request allocation scheduler. It fixes deadlocks, when frozen low priority payload requests occupying slots, required by higher levels to manage theit execution. - Two last changes were holding proper ATA reinitialization and error recovery implementation. Now it is done: SATA controllers and Port Multipliers now implement automatic hot-plug and should correctly recover from timeouts and bus resets. - Improve SCSI error recovery for devices on buses without automatic sense reporting, such as ATAPI or USB. For example, it allows CAM to wait, while CD drive loads disk, instead of immediately return error status. - Decapitalize diagnostic messages and make them more readable and sensible. - Teach PMP driver to limit maximum speed on fan-out ports. - Make boot wait for PMP scan completes, and make rescan more reliable. - Fix pass driver, to return CCB to user level in case of error. - Increase number of retries in cd driver, as device may return several UAs.
|
#
200180 |
|
06-Dec-2009 |
mav |
MFp4: If we panicked with SIM lock held, do not try to flush caches. Extra lock recursing will not make debugging easier.
|
#
198899 |
|
04-Nov-2009 |
mav |
MFp4: - Remove CAM_PERIPH_POLLED flag. It is broken by design. Polling can't be periph flag. May be SIM, may be CCB, but now it works fine just without it. - Remove check unused for at least five years. If we will ever have non-BIO devices in CAM, this check is smallest of what we will need. - If several controllers complete requests same time, call swi_sched() only once.
|
#
198708 |
|
31-Oct-2009 |
mav |
MFp4: - Reduce code duplication in ATA XPT and PMP driver. - Move PIO size setting from ada driver to ATA XPT. It is XPT business to negotiate transfer details. ada driver is now stateless. - Report PIO size to SIM. It is required for correct PATA SIM operation. - Tune PMP scan timings. It workarounds some problems with SiI. - If reset hapens during PMP initialization - restart it. - Introduce early-initialized periph drivers, which are used during initial scan process. Use it for xpt, probe, aprobe and pmp. It gives pmp chance to finish scan before mountroot and numerate devices in right order.
|
#
194627 |
|
22-Jun-2009 |
scottl |
Change cam_periph_ioctl() to take 'cmd' and a u_long instead of an int. All of its callers pass in cmd as a u_long, so this has always been a dangerous type demotion. It was spooted by clang/llvm trying to do a type promotion and sign extension within cam_periph_ioctl.
Submitted by: rdivacky
|
#
186319 |
|
19-Dec-2008 |
trasz |
Periph driver fixes, second try.
Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
#
168876 |
|
19-Apr-2007 |
scottl |
Inline cam_periph_lock|unlock to make debugging easier. Use more CAM_SIM_LOCK() more uniformly.
|
#
168752 |
|
15-Apr-2007 |
scottl |
Remove Giant from CAM. Drivers (SIMs) now register a mutex that CAM will use to synchornize and protect all data objects that are used for that SIM. Drivers that are not yet MPSAFE register Giant and operate as usual. RIght now, no drivers are MPSAFE, though a few will be changed in the coming week as this work settles down.
The driver API has changed, so all CAM drivers will need to be recompiled. The userland API has not changed, so tools like camcontrol do not need to be recompiled.
|
#
139743 |
|
05-Jan-2005 |
imp |
Start each of the license/copyright comments with /*-
|
#
136132 |
|
05-Oct-2004 |
scottl |
Remove the camnet swi and CAM_PERIPH_NET. It has never been used, and given that netowrk-over-scsi never really took off, there is little chance that it will ever be needed.
|
#
132199 |
|
15-Jul-2004 |
phk |
Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events.
A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
|
#
111979 |
|
08-Mar-2003 |
phk |
Centralize the devstat handling for all GEOM disk device drivers in geom_disk.c.
As a side effect this makes a lot of #include <sys/devicestat.h> lines not needed and some biofinish() calls can be reduced to biodone() again.
|
#
72119 |
|
07-Feb-2001 |
peter |
Change the peripheral driver list from a linker set to module driven driver registration. This should allow things like da, sa, cd etc to be in seperate KLD's to the cam core and make them preloadable.
|
#
60938 |
|
26-May-2000 |
jake |
Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen.
Requested by: msmith and others
|
#
60833 |
|
23-May-2000 |
jake |
Change the way that the queue(3) structures are declared; don't assume that the type argument to *_HEAD and *_ENTRY is a struct.
Suggested by: phk Reviewed by: phk Approved by: mdodd
|
#
60168 |
|
07-May-2000 |
n_hibma |
*sigh* I must have been on something that night. Make xpt_periph an extern with the original in cam_xpt.c instead of replicating xpt_periph in all the sources using it (and hence not initialising it)
|
#
58972 |
|
03-Apr-2000 |
n_hibma |
Add a hack to cam that makes the cam_xpt available to the rest of the kernel. Justin agress that there is no other reasonable alternative to do automatic rescans on connect.
The problem is that when a new device attaches to a SIM (SCSI host controller) we need to send a XPT_SCAN_BUS command to the SIM using xpt_action. This requires however that there is a peripheral available to take the command (otherwise xpt_done and later bomb). The RESCAN ioctl uses the same periph.
This enables a USB mass storage drive to do an automatic rescan on connection of the drive.
The automatic dropping of a CAM entry on disconnection was already working (asynchronous event).
The next thing to do is find someone to commit a change to vpo to do the same thing. Just port umass_cam_rescan and friends across to that driver.
Approved by: gibbs
|
#
58111 |
|
15-Mar-2000 |
n_hibma |
Various typo's.
One minor nit. The speed was displayed wrong when below 1Mb/s.
|
#
55206 |
|
29-Dec-1999 |
peter |
Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistant with the other BSD's who made this change quite some time ago. More commits to come.
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
47412 |
|
22-May-1999 |
gibbs |
Add the XPT_PATH_STATS and XPT_GDEV_STATS function codes. These ccb types allow the reporting of error counts and other statistics. Currently we provide information on the last BDR or bus reset as well as active transaction inforamtion, but this will be expanded as more information is added to aid in error recovery.
Use the 'last reset' information to better handle bus settle delays. Peripheral drivers now control whether a bus settle delay occurs and for how long. This allows target mode peripheral drivers to avoid having their device queue frozen by the XPT for what shoudl only be initiator type behavior.
Don't perform a bus reset if the target device is incapable of performing transfer negotiation (e.g. Fiber Channel).
If we don't perform a bus reset but the controller is capable of transfer negotiations, force negotiations on the first transaction to go to the device. This ensures that we aren't tripped up by a left over negotiation from the prom, BIOS, loader, etc.
Add a default async handler funstion to cam_periph.c to remove duplicated code in all initiator type peripheral drivers.
Allow mapping of XPT_CONT_TARGET_IO ccbs from userland. They are itentical to XPT_SCSI_IO ccbs as far as data mapping is concerned.
|
#
40603 |
|
22-Oct-1998 |
ken |
Fix a problem with the way we handled device invalidation when attaching to a device failed.
In theory, the same steps that happen when we get an AC_LOST_DEVICE async notification should have been taken when a driver fails to attach. In practice, that wasn't the case.
This only affected the da, cd and ch drivers, but the fix affects all peripheral drivers.
There were several possible problems: - In the da driver, we didn't remove the peripheral's softc from the da driver's linked list of softcs. Once the peripheral and softc got removed, we'd get a kernel panic the next time the timeout routine called dasendorderedtag(). - In the da, cd and possibly ch drivers, we didn't remove the peripheral's devstat structure from the devstat queue. Once the peripheral and softc were removed, this could cause a panic if anyone tried to access device statistics. (one component of the linked list wouldn't exist anymore) - In the cd driver, we didn't take the peripheral off the changer run queue if it was scheduled to run. In practice, it's highly unlikely, and maybe impossible that the peripheral would have been on the changer run queue at that stage of the probe process.
The fix is: - Add a new peripheral callback function (the "oninvalidate" function) that is called the first time cam_periph_invalidate() is called for a peripheral.
- Create new foooninvalidate() routines for each peripheral driver. This routine is always called at splsoftcam(), and contains all the stuff that used to be in the AC_LOST_DEVICE case of the async callback handler.
- Move the devstat cleanup call to the destructor/cleanup routines, since some of the drivers do I/O in their close routines.
- Make sure that when we're flushing the buffer queue, we traverse it at splbio().
- Add a check for the invalid flag in the pt driver's open routine.
Reviewed by: gibbs
|
#
40318 |
|
13-Oct-1998 |
ken |
Fix a bug in the error recovery code. It was possible to have more than one error recovery action oustanding for a given peripheral.
This is bad for several reasons. The first problem is that the error recovery actions would likely be to fix the same problem. (e.g., we queue 5 CCBs to a disk, and the first one comes back with 0x04,0x02. We start error recovery, and the second one comes back with the same status. Then the third one comes back, and so on. Each one causes the drive to get nailed with a start unit, when we really only need one.)
The other problem is that we only have space to store one CCB while we're doing error recovery. The subsequent error recovery actions that got started were over-writing the CCBs from previous error recovery actions, but we still tried to call the done routine N times for N error recovery actions. Each call to dadone() was done with the same CCB, though. So on the second one, we got a "biodone: buffer not busy" panic, since the buffer in question had already been through biodone().
In any case, this fixes things so that any any given time, there's only one error recovery action outstanding for any given peripheral driver.
Reviewed by: gibbs Reported by: Philippe Regnauld <regnauld@deepo.prosa.dk> [ Philippe wins the "bug finder of the week" award ]
|
#
39212 |
|
15-Sep-1998 |
gibbs |
CAM Transport Layer (XPT).
Submitted by: The CAM Team
|