History log of /freebsd-10.3-release/sys/dev/isp/isp.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 296373 04-Mar-2016 marius

- Copy stable/10@296371 to releng/10.3 in preparation for 10.3-RC1
builds.
- Update newvers.sh to reflect RC1.
- Update __FreeBSD_version to reflect 10.3.
- Update default pkg(8) configuration to use the quarterly branch.

Approved by: re (implicit)

# 292931 30-Dec-2015 mav

MFC r292765: Allocate separate scratch space for scanner purposes.

This space does not require DMA syncing. It reduces lock scope of the DMA
scratch space. It allows whole DMA scratch space to be used to I/O, so now
we can fetch up to ~1000 ports from SNS.

Due to the last fact, increase maximal number of ports from 256 to 1024.


# 292924 30-Dec-2015 mav

MFC r292741: Make port logins asynchronous, following r292739 logic.

This is even more important since it involves more network operations and
more prone to delays and timeouts.


# 292922 30-Dec-2015 mav

MFC r292739: Make virtual ports control asynchronous.

Before this change virtual ports control IOCBs were executed synchronously
via Execute IOCB mailbox command. It required exclusive use of scratch
space of driver and mailbox registers of the hardware. Because of that
shared resources use this code could not really sleep, having to spin for
completion, blocking any other operation.

This change introduces new asynchronous design, sending the IOCBs directly
on request queue and gracefully waiting for their return on response queue.
Returned IOCBs are identified with unified handle space from r292725.


# 292921 30-Dec-2015 mav

MFC r292725: Unify handles allocation for initiator and target IOCBs.

I am not sure why this was split long ago, but I see no reason for it.
At this point this unification just slightly reduces memory usage, but
as next step I plan to reuse shared handle space for other IOCB types.


# 292919 30-Dec-2015 mav

MFC r292715: Clear virtual port's port database when disabling it.

Previously it was done only on full chip reinit, that caused old ports
resurrect in case of virtual port reenabling.


# 292918 30-Dec-2015 mav

MFC r292690: Some polishing for command timeouts handling.


# 292916 30-Dec-2015 mav

MFC r292610: Fix speed setting by NVRAM for 24xx and above chips.


# 292598 22-Dec-2015 mav

MFC r291654, r291727, r291821, r291872, r292034, r292041, r292249, r292042:
Add initial support for 16Gbps FC QLogic chips.


# 291532 30-Nov-2015 mav

MFC r291365, r291369: One more round of port scanner rewrite.

- Make scan aborted by event restart immediately and infinitely.
- Improve handling of some loop events from firmware.
- Remove loop down timer, adding its functionality to scanner thread.
- Some more unification and simplification.


# 291531 30-Nov-2015 mav

MFC r291265: Rename ASYNC_LIP_F8 to ASYNC_LIP_NOS_OLS_RECV.

New name better repsents its meaning for modern chips.


# 291529 30-Nov-2015 mav

MFC r291209: Fix target mode support for Qlogic 2200 FC adapters.

Now target mode works for all supported FC adapters except ancient 2100,
which is not tested.


# 291528 30-Nov-2015 mav

MFC r291188: Rip off target mode support for parallel SCSI QLogic adapters.

Hacks to enable target mode there complicated code, while didn't really
work. And for outdated hardware fixing it is not really interesting.

Initiator mode tested with Qlogic 1080 adapter is still working fine.


# 291523 30-Nov-2015 mav

MFC r291163:
Explicitly call SEND CHANGE REQUEST for pre-24xx chips in target mode.

While later firmware always registers for RSCN requests, older one does
it only in initiator mode. But in target mode there RSCN can be the only
way to detect gone intiator.


# 291521 30-Nov-2015 mav

MFC r291161: Gracefully stop firmware before resetting chip when changing role.


# 291520 30-Nov-2015 mav

MFC r291160: Add some more asynchronous event status codes.


# 291519 30-Nov-2015 mav

MFC r291159: Add more mailbox command codes.


# 291517 30-Nov-2015 mav

MFC r291144: Fix target mode with fabric for pre-24xx chips.

For those chips we are not receiving login events, adding initiators
based on ATIO requests. But there is no port ID in that structure, so
in fabric mode we have to explicitly fetch it from firmware to be able
to do normal scan after that.


# 291516 30-Nov-2015 mav

MFC r291099: Some cosmetics for ancient cards.


# 291515 30-Nov-2015 mav

MFC r291092: Optimize SNS_GID_FT request scratch memory usage.

Now with present 4K of scratch we can fetch up to 508 ports (16 more).


# 291514 30-Nov-2015 mav

MFC r291080: Another round of port scanner rewrite.

This change simplifies and unifies port adding/updating for loop and
fabric scanners. It also fixes problems with scanning restarts due to
concurrent port databases changes. It also fixes many cosmetic issues.


# 291513 30-Nov-2015 mav

MFC r291014: Simplify fabric tasting code.

Except cosmetic changes this removes fabric ports from our port database.
It is always firmware duty to manage them, so driver don't need to worry.


# 291512 30-Nov-2015 mav

MFC r291013: Remove some confusions between loopid and nphdl.

Modern cards in most cases operate abstract port handles, that have no
any relation to real loop IDs. Leave loopid used only where it really
goes about local loop IDs.

While there, fix few more cases where LUNs were still printed in decimal.


# 291511 30-Nov-2015 mav

MFC r291000: Register our FC4 Features in SNS.


# 291510 30-Nov-2015 mav

MFC r290993, r290994: Unify and cleanup FC ports scan.


# 291509 30-Nov-2015 mav

MFC r290981: Off-by-one correctiont to r290980.


# 291508 30-Nov-2015 mav

MFC r290980: Make firmware handle virtual ports SNS logins for us.


# 291507 30-Nov-2015 mav

MFC r290978: Add real initial support for RQSTYPE_RPT_ID_ACQ.


# 291505 30-Nov-2015 mav

MFC r290507: Rework r290504.


# 291504 30-Nov-2015 mav

MFC r290506: Specify VP when sending a marker.


# 291503 30-Nov-2015 mav

MFC r290504: Make ISP_SLEEP() really sleep instead of spinning.

While there, simplify the wait logic.


# 291502 30-Nov-2015 mav

MFC r290160: Remove some unneeded code.


# 291501 30-Nov-2015 mav

MFC r290159: Remove reset delays for which I see neither explanation nor need.


# 291500 30-Nov-2015 mav

MFC r290147: Fix and improve error masking and reporting.


# 291499 30-Nov-2015 mav

MFC r290118: Change the way how target mode is enabled on 23xx chips.

Without docs I am not completely sure about this, but on my tests new
method works better then previous, at least with our latest firmware.


# 291498 30-Nov-2015 mav

MFC r290104: Improve/fix loop scanning routine.

For the most of chips (except anscient ones) port handlers have no relation
to port IDs. In such situation old code scanning first 125 handlers was
quite naive. Instead of doing that, send to chip single request to get full
list of port handlers available on specific virtual port and scan only them.

Old code had problems with case of several virtual ports enabled, when port
handlers allocated from global address space could easily go above 125.
This change was successfully tested on 23xx, 24xx and 25xx chips in loop
mode with 4 virtual initiator ports, each seing 50 virtual target ports.


# 290800 13-Nov-2015 mav

MFC r290054: Reimplement next port handle generation.

For some reason port handles should be allocated from HBA-global space,
while old code was not very specific, mixing per-HBA and per-VP logic.


# 290798 13-Nov-2015 mav

MFC r290018: Reimplement enable and implement disable of virtual ports.

Now on 24xx and above chips it is really possible to simulate several
virtual FC ports with single physical one. For example, it allows to
configure several targets in ctl.conf, assign each of them to separate
virtual port, and let user to control access to them with switch zoning.

I still doubt that all problems are solved there, but at now it passes
at least basic tests.


# 290796 13-Nov-2015 mav

MFC r289937: Try to keep Loop IDs persistent across chip reinits.


# 290795 13-Nov-2015 mav

MFC r289933, r289939: Improve Port Database Changed handling and reporting.


# 290794 13-Nov-2015 mav

MFC r289930: Formalize/unify chip (re-)inits.


# 290793 13-Nov-2015 mav

MFC r289890: Skip reserved IP Broadcast handle from using.


# 290791 13-Nov-2015 mav

MFC r289882: Add PIM_EXTLUNS support to isp(4) driver.

Now 24xx and above chips support full 8-byte LUN address space.
Older FC chips may support up to 16K LUNs when firmware allows.
Tested in both initiator and target modes for 23xx, 24xx and 25xx.


# 290789 13-Nov-2015 mav

MFC r289875: Decode few more response info codes.

Though CAM still does not send any requests that would require those.


# 290785 13-Nov-2015 mav

MFC r289812, r289852: Some polishing and unification in ISR code.


# 290784 13-Nov-2015 mav

MFC r289681: Some more defines and polishing for INIT_FIRMWARE.


# 290781 13-Nov-2015 mav

MFC r289622: Zero mbox[1] for INIT_FIRMWARE to fix version 7.3 firmware.

While there, add new fields to isp_icb_2400_t structure.


# 290780 13-Nov-2015 mav

MFC r289620: Decode more firmware attributes.


# 288717 05-Oct-2015 mav

MFC r285600: MULTI_ID supported does not mean it is used.


# 288714 05-Oct-2015 mav

MFC r285459: Unify port database use for target and initiator roles.

Aside from cleaner and more consistent code, this allows ports to be both
target and initiator same time, and easily switch from any role to any.


# 288712 05-Oct-2015 mav

MFC r285154: Remove extra level of target ID indirection (isp_dev_map).

FreeBSD never had limitation on number of target IDs, and there is no
any other requirement to allocate them densely. Since slots of port
database already populated just sequentially, there is no much need
for another indirection to allocate sequentially too.


# 288709 05-Oct-2015 mav

MFC r285146: Drop discovered targets when initiator role is disabled.


# 284907 28-Jun-2015 mav

MFC r284808: Remove limitations on setting WWNNs starting from 2.

It is odd that driver first tries to generate synthetic WWNN based on
WWPN starting from 2, but then refuses to use it. If we don't trust
generated WWNN, we should probably not generate it. Same time this
limitation prevents potentially valid WWNN setting by user.


# 284802 25-Jun-2015 mav

MFC r284698: Dump additional config bytes for INIT_FIRMWARE_MULTI_ID.


# 284801 25-Jun-2015 mav

MFC r284697: Add logging of executed mailbox command names.

Previously those commands were logged only as part of register dump,
that is not very readable.


# 278171 03-Feb-2015 ken

MFC isp(4) driver changes:

r276839, r276842, r277513, r277514, r277515

------------------------------------------------------------------------
r276839 | ken | 2015-01-08 10:41:28 -0700 (Thu, 08 Jan 2015) | 49 lines

Fix Fibre Channel Command Reference Number handling in the isp(4) driver.

The Command Reference Number is used for precise delivery of
commands, and is part of the FC-Tape functionality set. (This is
only enabled for devices that support precise delivery of commands.)
It is an 8-bit unsigned number that increments from 1 to 255. The
commands sent by the initiator must be processed by the target in
CRN order if the CRN is non-zero.

There are certain scenarios where the Command Reference Number
sequence needs to be reset. When the target is power cycled, for
instance, the initiator needs to reset the CRN to 1. The initiator
will know this because it will see a LIP (when directly connected)
or get a logout/login event (when connected to a switch).

The isp(4) driver was not resetting the CRN when a target
went away and came back. When it saw the target again after a
power cycle, it would continue the CRN sequence where it left off.
The target would ignore the command because the CRN sequence is
supposed to be reset to 1 after a power cycle or other similar
event.

The symptom that the user would see is that there would be lots of
aborted INQUIRY commands after a tape library was power cycled, and
the library would fail to probe. The INQUIRY commands were being
ignored by the tape drive due to the CRN issue mentioned above.

isp_freebsd.c:
Add a new function, isp_fcp_reset_crn(). This will reset
all of the CRNs for a given port, or the CRNs for all LUNs
on a target.

Reset the CRNs for all targets on a port when we get a LIP,
loop reset, or loop down event.

Reset the CRN for a particular target when it arrives, is changed
or departs. This is less precise behavior than the
clearing behavior specified in the FCP-4 spec (which says
that it should be reset for PRLI, PRLO, PLOGI and LOGO),
but this is the level of information we have here. If this
is insufficient, then we will need to add more precise
notification from the lower level isp(4) code.

isp_freebsd.h:
Add a prototype for isp_fcp_reset_crn().

Sponsored by: Spectra Logic
MFC after: 1 week

------------------------------------------------------------------------
r276842 | ken | 2015-01-08 10:51:12 -0700 (Thu, 08 Jan 2015) | 44 lines

Close a race in the isp(4) driver that caused devices to disappear
and not automatically come back if they were gone for a short
period of time.

The isp(4) driver has a 30 second gone device timer that gets
activated whenever a device goes away. If the device comes back
before the timer expires, we don't send a notification to CAM that
it has gone away. If, however, there is a command sent to the
device while it is gone and before it comes back, the isp(4) driver
sends the command back with CAM_SEL_TIMEOUT status.

CAM responds to the CAM_SEL_TIMEOUT status by removing the device.
In the case where a device comes back within the 30 second gone
device timer window, though, we weren't telling CAM the device
came back.

So, fix this by tracking whether we have told CAM the device is
gone, and if we have, send a rescan if it comes back within the 30
second window.

ispvar.h:
In the fcportdb_t structure, add a new bitfield,
reported_gone. This gets set whenever we return a command
with CAM_SEL_TIMEOUT status on a Fibre Channel device.

isp_freebsd.c:
In isp_done(), if we're sending CAM_SEL_TIMEOUT for for a
command sent to a FC device, set the reported_gone bit.

In isp_async(), in the ISPASYNC_DEV_STAYED case, rescan the
device in question if it is mapped to a target ID and has
been reported gone.

In isp_make_here(), take a port database entry argument,
and clear the reported_gone bit when we send a rescan to
CAM.

In isp_make_gone(), take a port database entry as an
argument, and set the reported_gone bit when we send an
async event telling CAM consumers that the device is gone.

Sponsored by: Spectra Logic
MFC after: 1 week

------------------------------------------------------------------------
r277514 | will | 2015-01-21 13:27:11 -0700 (Wed, 21 Jan 2015) | 18 lines

Force commit to record the correct log for r277513.

If the user sends an XPT_RESET_DEV CCB, make sure to reset the
Fibre Channel Command Reference Number if we're running on a FC
controller.

We send a SCSI Target Reset when we get this CCB, and as a result
need to reset the CRN to 1 on the next command.

isp_freebsd.c:
In the XPT_RESET_DEV implementation in isp_action(), reset
the CRN if we're on a FC controller.

Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112787 on 2015/01/15

------------------------------------------------------------------------
r277515 | will | 2015-01-21 13:32:36 -0700 (Wed, 21 Jan 2015) | 25 lines

Fix SCSI status byte reporting on 4Gb and 8Gb Qlogic boards.

The newer boards don't have the response field that indicates
whether the SCSI status byte is present. You have to just look to
see whether it is non-zero.

The code was looking to see whether the sense length was valid
before propagating the SCSI status byte (and sense information) up
the stack. With a status like Reservation Conflict, there is no
sense information, only the SCSI status byte. So it wasn't getting
correctly returned.

isp.c:
In isp_intr(), if we are on a 2400 or 2500 type board and
get a response, look at the actual contents of the
SCSI status value and set the RQSF_GOT_STATUS flag
accordingly so that return any SCSI status value we get. The
RQSF_GOT_SENSE flag will get set later on if there is
actual sense information returned.

Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112791 on 2015/01/15

------------------------------------------------------------------------

Sponsored by: Spectra Logic


# 275443 03-Dec-2014 mav

MFC r275123: Fix incorrect check, blocking MULTIID functionality.


# 263384 19-Mar-2014 dim

MFC r259860 (by mjacob):

Harvest one no longer used constant string.

Remove another and place it into play in the
normally ifdef protected zone it would be used
int.

Noticed by: dim


# 260346 05-Jan-2014 mav

MFC r257930:
Some more registers access optimizations:
- Process ATIO queue only if interrupt status tells so;
- Do not update queue out pointers after each processed command, do it
only once at the end of the loop.


# 260341 05-Jan-2014 mav

MFC r256705:
Optimize isp(4) to reduce CPU usage, especially in target mode:
- Remove two excessive and slow register reads from isp_intr(). Instead
of rereading value every time, assume that registers contain what we have
written there.
- Avoid sequential search through 4096 array elements when looking for
command tag. Use hash of lists to store active tags separately from free
ones and so greatly speedup the searches.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 253330 13-Jul-2013 mjacob

When fiddling with options of which registers to copy out for
a mailbox command and which registers to copy back in when
the command completes, the bits being set need to not only
specify what bits you want to add from the default from the
table but also what bits you want *subtract* (mask) from the
default from the table.

A failing ISP2200 command pointed this out.

Much appreciation to: marius, who persisted and narrowed down what
the failure delta was, and shamed me into actually fixing it.
MFC after: 1 week


# 247264 25-Feb-2013 mjacob

Turn off fast posting for the ISP2100- I'd forgotten that it actually
might have been enabled for them- now that we use all 32 bits of handle.
Fast Posting doesn't pass the full 32 bits.

Noticed by: Bugs in NetBSD. Only a NetBSD user might actually still use such old hardware.
MFC after: 1 week


# 239219 12-Aug-2012 mjacob

Remove extraneous newline.

MFC after: 1 month


# 238869 28-Jul-2012 mjacob

-----------
MISC CHANGES

Add a new async event- ISP_TARGET_NOTIFY_ACK, that will guarantee
eventual delivery of a NOTIFY ACK. This is tons better than just
ignoring the return from isp_notify_ack and hoping for the best.

Clean up the lower level lun enable code to be a bit more sensible.

Fix a botch in isp_endcmd which was messing up the sense data.

Fix notify ack for SRR to use a sensible error code in the case
of a reject.

Clean up and make clear what kind of firmware we've loaded and
what capabilities it has.
-----------
FULL (252 byte) SENSE DATA

In CTIOs for the ISP, there's only a limimted amount of space
to load SENSE DATA for associated CHECK CONDITIONS (24 or 26
bytes). This makes it difficult to send full SENSE DATA that can
be up to 252 bytes.

Implement MODE 2 responses which have us build the FCP Response
in system memory which the ISP will put onto the wire directly.

On the initiator side, the same problem occurs in that a command
status response only has a limited amount of space for SENSE DATA.
This data is supplemented by status continuation responses that
the ISP pushes onto the response queue after the status response.
We now pull them all together so that full sense data can be
returned to the periph driver.

This is supported on 23XX, 24XX and 25XX cards.

This is also preparation for doing >16 byte CDBs.

-----------
FC TAPE

Implement full FC-TAPE on both initiator and target mode side. This
capability is driven by firmware loaded, board type, board NVRAM
settings, or hint configuration options to enable or disable. This
is supported for 23XX, 24XX and 25XX cards.

On the initiator side, we pretty much just have to generate a command
reference number for each command we send out. This is FCP-4 compliant
in that we do this per ITL nexus to generate the allowed 1 thru 255
CRN.

In order to support the target side of FC-TAPE, we now pay attention
to more of the PRLI word 3 parameters which will tell us whether
an initiator wants confirmed responses. While we're at it, we'll
pay attention to the initiator view too and report it.

On sending back CTIOs, we will notice whether the initiator wants
confirmed responses and we'll set up flags to do so.

If a response or data frame is lost the initiator sends us an SRR
(Sequence Retransmit Request) ELS which shows up as an SRR notify
and all outstanding CTIOs are nuked with SRR Received status. The
SRR notify contains the offset that the initiator wants us to restart
the data transfer from or to retransmit the response frame.

If the ISP driver still has the CCB around for which the data segment
or response applies, it will retransmit.

However, we typically don't know about a lost data frame until we
send the FCP Response and the initiator totes up counters for data
moved and notices missing segments. In this case we've already
completed the data CCBs already and sent themn back up to the periph
driver. Because there's no really clean mechanism yet in CAM to
handle this, a hack has been put into place to complete the CTIO
CCB with the CAM_MESSAGE_RECV status which will have a MODIFY DATA
POINTER extended message in it. The internal ISP target groks this
and ctl(8) will be modified to deal with this as well.

At any rate, the data is retransmitted and an an FCP response is
sent. The whole point here is to successfully complete a command
so that you don't have to depend on ULP (SCSI) to have to recover,
which in the case of tape is not really possible (hence the name
FC-TAPE).

Sponsored by: Spectralogic
MFC after: 1 month


# 238486 15-Jul-2012 brueffer

Fix typo in a message.

Obtained from: DragonFly BSD (change 7a817ab191e4898404a9037c55850e47d177308c)
MFC after: 3 days


# 237544 25-Jun-2012 mjacob

Unbreak register tests for parallel SCSI.
You can't overwrite registers 7 and 8.
MFC after: 3 days


# 237537 24-Jun-2012 mjacob

Clean up multi-id mode so it's driven by the f/w loaded,
not by some hint setting. Do more preparations for FC-Tape.
Clean up resource counting for 24XX or later chipsets so
we find out after EXEC_FIRMWARE what is actually supported.
Set target mode exchange count based upon whether or not
we are supporting simultaneous target/initiator mode. Clean
up some old (pre-24XX) xfwoption and zfwoption issues.

Sponsored by: Spectralogic
MFC after: 3 days


# 237210 17-Jun-2012 mjacob

Prepare for FC-Tape support. This involved doing a lot of little cleanups
and crosschecks against firmware documentation. We now check and report
FC firmware attributes and at least are now prepared for the upper 48 bits
of f/w attributes (which are probably for the 8100 or later cards). This
involed changing how inbits and outbits are calculated for varios commands,
hopefully clearer and cleaner. This also caused me to clean up the actual
mailbox register usage. Finally, we are now unconditionally using a CRN
for initiator mode.

A longstanding issue with the 2400/2500 is that they do *not* support
a "Prefer PTP followed by loop", which explains why enabling that
caused the f/w to crash.

A slightly more invasive change is to let the firmware load entirely
drive whether multi_id support is enabled or not.

Sponsored by: Spectralogic
MFC after: 1 week


# 236427 01-Jun-2012 mjacob

Clean up and complete the incomplete deferred enable code.
Make the default role NONE if target mode is selected. This
allows ctl(8) to switch to/from target mode via knob settings.
If we default to role 'none', this causes a reset of the
24XX f/w which then causes initiators to wake up and notice
when we come online.

Reviewed by: kdm
MFC after: 2 weeks
Sponsored by: Spectralogic


# 227548 16-Nov-2011 mjacob

Was chasing down a failure to load f/w on a 2400. It turns out that the card
is actually broken, or needs a BIOS upgrade for 64 bit loads, but this uncovered
a couple of misplaced opcode definitions and some missing continual mbox command
cases, so might as well update them here.


# 224856 13-Aug-2011 mjacob

Most of these changes to isp are to allow for isp.ko unloading.
We also revive loop down freezes. We also externaliz within isp
isp_prt_endcmd so something outside the core module can print
something about a command completing. Also some work in progress to
assist in handling timed out commands better.

Partially Sponsored by: Panasas
Approved by: re (kib)
MFC after: 1 month


# 219098 28-Feb-2011 mjacob

Sync FreeBSD ISP with mercurial tree. Minor changes having to do with
a macro for minima.


# 218691 14-Feb-2011 marius

- Use the correct DMA tag/map pair for synchronize the FC scratch area.
- Allocate coherent DMA memory for the request/response queue area and
and the FC scratch area.

These changes allow isp(4) to work properly on sparc64 with usage of the
IOMMU streaming buffers enabled.

Approved by: mjacob
MFC after: 2 weeks


# 208849 05-Jun-2010 mjacob

Be more specific about which CDB length we're going to use. Not really a likely
bug but we might as well be clearer.

Found with: Coverity Prevent(tm)
CID: 3981

MFC after: 2 weeks


# 208761 02-Jun-2010 mjacob

Various minor and not so minor fixes suggested by Coverity.
In at least one case, it's amazing that target mode worked at all.

Found by: Coverity.
MFC after: 2 weeks


# 205698 26-Mar-2010 mjacob

Clean up some printing stuff so that we can have a bit finer control
on debug output. Add a new platform function requirement to allow
for printing based upon the ITL nexus instead of the isp unit plus
channel, target and lun. This allows some printouts and error messages
from the core code to appear in the same format as the platform's
subsystem (in FreeBSD's case, CAM path).

MFC after: 1 week


# 204397 27-Feb-2010 mjacob

Revamp the pieces of some of the stuff I forgot to do when shifting to
32 bit handles. The RIO (reduced interrupt operation) and fast posting
for the parallel SCSI cards were all 16 bit handles. Furthermore,
target mode parallel SCSI only can have 16 bit handles.

Use part of a supplied patch to switch over to using 32 bit handles.
Be a bit more conservative here and only do this for parallel SCSI
for the 12160 (Ultra3) cards. There were a lot of marginal Ultra2
cards, and, frankly, few are findable now for testing.

Fix the target handle routine to only do 16 bit handles for parallel
SCSI cards. This is okay because the upper sixteen bits of the new
32 bit handles is a sequence number to help protect against duplicate
completions. This would be very unlikely to happen with parallel
SCSI target mode, and wasn't present before, so we're no worse off
than we used to be.

While we're at it, finally split the async mailbox completion handlers
into FC and parallel SCSI functions. This makes it much cleaner and
easier to figure out what is or isn't a legal async mailbox completion
code for different card classes.

PR: kern/144250
Submitted partially by: Charles D
MFC after: 1 week


# 204050 18-Feb-2010 mjacob

Don't try and re-use a handle, even if the firmware tells you that's what is logged in.

PR: kern/144026
MFC after: 1 week


# 203444 03-Feb-2010 mjacob

Redo how commands handles are created and managed and implement sequence
numbers and handle types in rational way. This will better protect from
(unwittingly) dealing with stale handles/commands.

Fix the watchdog timeout code to better protect itself from mistakes.

If we run an abort on a putatively timed out command, the command
may in fact get completed, so check to make sure the command we're
timing it out is still around. If the abort succeeds, btw, the command
should get returned via a different path.


# 202418 15-Jan-2010 mjacob

Amazingly we've been freeing a handle and using that which it refers to
for years. Bad!

MFC after: 1 week


# 201758 07-Jan-2010 mbr

Remove extraneous semicolons, no functional changes.

Submitted by: Marc Balmer <marc@msys.ch>
MFC after: 1 week


# 201408 03-Jan-2010 mjacob

Make sure that the WWNN is also created for 2100..2300 cards.
MFC after: 1 day


# 201325 31-Dec-2009 mjacob

Create a Node WWN from the *Port* WWN, not vice versa, for 2400s.

If the NAA is type 2, the Node WWN is the Port WWN with the 12 bits
of port (48..60) cleared. This iff a wwn fetched from NVRAM is zero.

MFC after: 1 week


# 197373 21-Sep-2009 mjacob

(semiforced commit to add comment missed in last delta)
Add a maximum response length for FCP RSPNS IUs.

Clarify some of the FC option words for setting parameters
and try and disable automatic PRLI when in target mode- this
should correct some cases of N-port topologies with 23XX cards
where we put out an illegal PRLI (in target mode only we're
not supposed to put out a PRLI).


# 197372 21-Sep-2009 mjacob

Remove file unused in freebsd.


# 197214 15-Sep-2009 mjacob

Accomodate old style XPT_IMMED_NOTIFY and XPT_NOTIFY_ACK so that
we at least don't panic.

We don't really support dual role mode (INITIATOR/TARGET) any more. We
should but it's broken and will take a fair amount of effort to fix
and correctly manage both initiator and target roles sharing the port
database. So, for now, disallow it.


# 196008 01-Aug-2009 mjacob

Add 8Gb support (isp_2500). Fix a fair number of configuration and
firmware loading bugs.

Target mode support has received some serious attention to make it
more usable and stable.

Some backward compatible additions to CAM have been made that make
target mode async events easier to deal with have also been put
into place.

Further refinement and better support for NP-IV (N-port Virtualization)
is now in place.

Code for release prior to RELENG_7 has been stripped away for code clarity.

Sponsored by: Copan Systems

Reviewed by: scottl, ken, jung-uk kim
Approved by: re


# 186140 15-Dec-2008 marius

Don't try reading the SXP_PINS_DIFF on the 10160 and 12160 SCSI
controllers. Reading this register, for which there are indications
that it doesn't really exist, returns 0 on at least some 12160
and doing so on Sun Fire V880 causes a data access error exception.

Reported and tested by: Beat Gaetzi
Approved by: mjacob
Obtained from: OpenBSD (modulo setting isp_lvdmode)


# 171336 10-Jul-2007 mjacob

Be more conservative- turn off fast posting and RIO for 22XX cards.

Approved by: re (ken)
MFC after: 3 days


# 171159 02-Jul-2007 mjacob

Recover from some major omissions/problems with the 24XX port.
First, we were never correctly checking for a 24XX Status Type 0
response- that cased us to fall through to evaluate status for
commands as if this were a 2100/2200/2300 Status Type 0 response.
This is *close*, but not quite the same. This has been reported
to be apparent with some wierd lun configuration problems with
some arrays. It became glaringly apparent on sparc64 where none
of the correct byte swap things were done.

Fixing this omission then caused a whole universe shifting debug
cycle of endian issues for the 2400. The manual for 24XX f/w turns
out to be wrong about the endianness of a couple of entities. The
lun and cdb fields for the type 7 request are *not* unconditionally
big endian- they happen to be opposite of whatever the endian of
the current machine type is. Same with the sense data for the
24XX type 0 response.

While we're at it investigate and resolve some NVRAM endian
issues.

Approved by: re (ken)
MFC after: 3 days


# 171014 24-Jun-2007 mjacob

If we're going to (for 23XX and 24XX cards) DMA firmware from the
request queues rather than shove it down a word at a time, we have
to remember to put it into little endian format. Use the macros
ISP_IOXPUT_{16,32} for this purpose. Otherwise, on sparc the firmware
is loaded garbled and we get a (not surprisingly) firmware checksum
failure and the card won't start and we don't attach it.

Approved by: re (bruce)
MFC after: 3 days


# 169292 05-May-2007 mjacob

Make this an MP safe driver but also still be multi-release.
Seems to work on RELENG_4 through -current and also on sparc64
now. There may still be some issues with the auto attach/detach
code to sort out.

MFC after: 3 days


# 168030 29-Mar-2007 mjacob

some minor error message cleanups


# 167821 22-Mar-2007 mjacob

MFP4: a) Some constification from NetBSD (gcc 4.1.2)
b) Split default param fetching/setting into scsi and fibre functions
and retry the fibre fetch more than once.

MFC after: 1 week


# 167521 14-Mar-2007 mjacob

Don't call isp_intr from isp_start- this seems to, in rare cases,
cause confusion with at least the 23XX chipsets where the output
queue index pointer just gets a bit whacko.

MFC after: 1 day


# 167500 13-Mar-2007 mjacob

Restore optr if you trash it for 24XX target mode.

MFC after: 3 days


# 167473 12-Mar-2007 mjacob

Fix compilation issues found in RELENG_4 port and merge the
diffs back to -current to keep versions identical.


# 167403 10-Mar-2007 mjacob

Fix some stupid copyright mistakes that have been there for quite some time.


# 166929 23-Feb-2007 mjacob

Don't attempt to load illegal hard loop addresses into
an ICB. This shows up on card restarts, and usually for
2200-2300 cards. What happens is that we start up,
attempting to acquire a hard address. We end up instead
being an F-port topology, which reports out a loop id
of 0xff (or 0xffff for 2K Login f/w). Then, if we restart,
we end up telling the card to go off an acquire this loop
address, which the card then rejects. Bah.

Compilation fixes from Solaris port.


# 166894 23-Feb-2007 mjacob

Be a bit more restrictive about printing out 'bad' pdb entries
during loop rescans. They're not bad so much as unstable, so
don't print this stuff out unless ISP_LOGSANCFG is set.


# 166120 20-Jan-2007 mjacob

MFP4: Move default setting to the end of isp_reset instead of the
front of isp_init so we can read NVRAM even if we're role ISP_NONE.
Prepare for reintroduction of channels (for FC) for N-Port
Virtualization.

Fix a botch in handle assignment that caused us to nuke one device
when a new one arrives and end up with two devices with the same
identity in the virtual target mapping table.


# 165816 05-Jan-2007 mjacob

Check the return from registering FC4 types with the fabric name
server.

Don't complain about a hard loop id of 0xffff- we get this in
point-to-point topologies with the 2300 and 2K Login firmware.

Up the timeout on register FC4 types commands.


# 165308 17-Dec-2006 mjacob

Try an experiment with using DMA to load firmware into a 2200- VERIFY
CHECKSUM fails. Oh well, but keep a couple of the changes.

Avoid overflow in usec counters when waiting for mailbox completion.


# 165269 16-Dec-2006 mjacob

Implement ISP_RESET0 for PCI and SBUS attachments- isp_reset has
been modified to call ISP_RESET0 if it fails to do a reset. This
gives us a chance to disable interrupts.


# 164909 05-Dec-2006 mjacob

Make ISPCTL_PLOGX find a handle to log into the management server
with- not hope for the best. Change some things which were gated
off of 24XX to be gated off of 2K login support. Convert some
isp_prt calls to xpt_print calls.


# 164370 18-Nov-2006 mjacob

Make the SAN login/logout stuff more common between different chipsets
and provied an isp_control entry point so that the outer layers can
do PLOGI/LOGO explicitly. Add MS IOCB support. This completes the cycle
for base support for SMI-S.


# 164318 16-Nov-2006 mjacob

Increase the timeout for some SAN commands.

Only complain about FC Reponse errors if they're nonzero.

Shorten some PortID printouts for local loop.

Add an internal isp_xcmd_t data structure which we'll use for some
CT-Passthru support as part of adding SMI-S.


# 164317 16-Nov-2006 mjacob

minor change to reduce some diff noise


# 164272 14-Nov-2006 mjacob

Push things closer to path failover by implementing loop down and
gone device timers and zombie state entries. There are tunables
that can be used to select a number of parameters.

loop_down_limit - how long to wait for loop to come back up before
declaring
all devices dead (default 300 seconds)

gone_device_time- how long to wait for a device that has appeared
to leave the loop or fabric to reappear (default 30 seconds)

Internal tunables include (which should be externalized):

quick_boot_time- how long to wait when booting for loop to come up

change_is_bad- whether or not to accept devices with the same
WWNN/WWPN that reappear at a different PortID as being the 'same'
device.

Keen students of some of the subtle issues here will ask how
one can keep devices from being re-accepted at all (the answer
is to set a gone_device_time to zero- that effectively would
be the same thing).


# 163899 02-Nov-2006 mjacob

Add 4Gb (24XX) support and lay the foundation for a lot of new stuff.


# 161790 01-Sep-2006 mjacob

fix bug in 2322 receive sequencer f/w load


# 161271 14-Aug-2006 mjacob

Fix 2KLOGIN code to specify *ibits* (not *obits*) so that the
options field in register 10 will be deterministic, not random.

Correct the number of input bits for EXECUTE_FIRMWARE 0..1 to
0..2- the 2322 and 24XX cards use mailbox register 2 to specify
whether the f/w being executed is freshly loaded or not.

Correct the number of input bits for {READ,WRITE}_RAM_WORD_EXTENDED
so that register 8 gets picked up.

Fix the indexing and offset for the 2322 f/w download so that it
correctly puts the different code segments where they belong.

Move VERIFY_CHECKSUM to be the 'else' clause to 2322 f/w downloads-
the EXECUTE_FIRMWARE command for 2322 and 24XX cards will tell you
if the f/w checksum is incorrect and VERIFY_CHECKSUM only works for
RISC SRAM address < 64K so you can only do a VERIFY_CHECKSUM on the
first of the 3 f/w segments for the 2322.

Shorten the delay for the continuation mailbox commands- 1ms is
ridiculous (100us is more likely).

All of the more or less is really only for the 2322/6322 cards.


# 160990 05-Aug-2006 mjacob

Remove reference to PTI cards. They haven't been functioning
or around for probably at least 5 years.


# 160977 04-Aug-2006 mjacob

Initialize 2300 request/response pointers in isp_reset- not in
isp_fibre_init.


# 160410 16-Jul-2006 mjacob

Some rearrangement of headers to minimize diffs with outside of
FreeBSD repository and to clean up the license header so as to
not pollute the license with file function.

Zero all mailbox structures prior to use (just in case). Change
the outgoing mailbox count for INIT_FIRMWARE to be correct.


# 160337 14-Jul-2006 mjacob

Add some missing braces.

Add MEMORY_BARRIER for the few scratch dma ops that were missing
them plus add a couple of hi 32 bit dma ops (we could probably
allow 64 bit scratch and request/response queue dma now).


# 160088 03-Jul-2006 mjacob

What the heck - make the last (most recent) 2200 f/w also do
Hard Loop acquisition.


# 160080 03-Jul-2006 mjacob

Do various fixes to support firmware loading for the 2322
(and by extension, the 2422).

One peculiar thing I've found with the 2322 is that if you
don't force it to do Hard LoopID acquisition, the firmware
crashes. This took a while to figure out.

While we're at it, fix various bugs having to do with NVRAM
reading and option setting with respect to pieces of NVRAM.


# 157945 21-Apr-2006 mjacob

Redo some code based upon issues found by Coverity.


# 157943 21-Apr-2006 mjacob

Some more gratuitous format and name changes.

Pull in some target mode changes from a private branch.
Pull in some more RELENG_4 compilation changes.

A lot of lines changed, but not much content change yet.


# 155704 15-Feb-2006 mjacob

a) clean up some declaration stuff (i.e., make more modern with respect
to getting rid u_int for uint and so on).

b) Turn back on 64 bit DAC support. Cheeze it a bit in that we have two
DMA callback functions- one when we have bus_addr_t > 4 bits in width and
the other which should be normal. Even Cheezier in that we turn off setting
up DMA maps to be BUS_SPACE_MAXADDR if we're in ISP_TARGET_MODE. More work
on this in a week or so.

c) Tested under amd64 and 1MB DFLTPHYS, sparc64, i386 (PAE, but insufficient
memory to really test > 4GB). LINT check under amd64.

MFC after: 1 month


# 155206 02-Feb-2006 mjacob

Make sure we don't pick up a loopid that's larger than our
current portdb max (MAX_FC_TARG == 256) now that we support
2K Login f/w.

MFC after: 3 days


# 154704 23-Jan-2006 mjacob

First of several commits as this driver is dusted off and maybe brought
up to date. Principle changes for this reelase is to support 2K Port Login
firmware. This allows us to support the 2322 (and 2422 4Gb) cards which only
come with the 2K Port Login firmware. The 2322 should now work- but we don't
have firmware sets for it in ispfw (as the change to load 2K Port Login f/w
hasn't been made- that f/w is so big it has to be loaded in more than one
chunk).

Other changes are the beginnings of cleaning up some long standing target
mode issues. The next changes here will incorporate a lot of bug fixes
from others.

Finally, some copyright cleanup and attempts to make the parts of the
driver that are FreeBSD specific start conforming more to FreeBSD style.

MFC after: 1 month


# 151834 29-Oct-2005 mjacob

Add an ioctl framework for doing FC task management functions from
a user space tool- useful for doing FC target mode certification.


# 140649 23-Jan-2005 mjacob

Don't set ZIO for 23XX for target mode (use fast posting instead).
Use the correct number of handles for multihandle returns.

Very, very, rarely on some SMP systems we've seen an 'unstable' type
in the response queue. I dunno whether or not it's a bug in our
handling, or whether there's a cache incoherency issue, but
try to guard against it.

MFC after: 2 weeks


# 139749 06-Jan-2005 imp

Start each of the license/copyright comments with /*-, minor shuffle of lines


# 129643 24-May-2004 njl

Store the target handles in a separate list from normal commands. Add a
CTIO fast post routine to handle CTIO completions.

Submitted by: mjacob


# 125546 07-Feb-2004 mjacob

Add case to handle ISPCTL_GET_PDB.

MFC after: 1 week


# 124894 23-Jan-2004 mjacob

If we have ISP_ROLE_INITIATOR set, make sure that we clear ICBOPT_INI_DISABLE
from the fwoptions. Likewise, we *set* ICBOPT_INI_DISABLE if we don't have
initiator role.


# 120012 13-Sep-2003 mjacob

On reset, make sure that we have some parameters set correctly. This
fixes a longstanding issue WRT resetting the chip after startup- it
would fail if we were connected as an F-port to a switch. If we
were connected as an F-port, we got assigned a hard loop ID of 255,
which is really a bogus loop id. Then when we turned around to
reset ourselves, the firmware would reject the ICB_INIT request
because the loop id was bogus. *sputter*

Minor fixlet from somebody in NetBSD with too much time on their
hands (dma -> DMA).


# 119459 25-Aug-2003 mjacob

Revert previous commit. Violates Maintainer (O'Brien knows how to
reach me directly), but more importantly, breaks compiles on
non-FreeBSD platforms.


# 119418 24-Aug-2003 obrien

Use __FBSDID().
Also some minor style cleanups.


# 115630 01-Jun-2003 mjacob

Restore parentheses removed inappropriately in last commit.


# 115521 31-May-2003 phk

Remove unused variables
Add /* FALLTHROUGH */

Found by: FlexeLint


# 110972 16-Feb-2003 mjacob

Pick up some compilation warning fixes from NetBSD.

If we don't have ISP_FW_CRASH_DUMP defined, we have to do
a isp_reinit in the core code- not the platform code- so
fix the ISP_CONN_FATAL case.


# 108533 01-Jan-2003 schweikh

Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.


# 108470 30-Dec-2002 schweikh

Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/
Add FreeBSD Id tag where missing.


# 104916 11-Oct-2002 mjacob

This should enable 10160 support. As best as I can tell, the same
f/w as 12160 is used, and otherwise, this is just a single channel
variant of the 10160.

MFC after: 0 days


# 103819 23-Sep-2002 mjacob

If we have a 1240 or an ULTRA2 or better card, use MBOX_INIT_RES_QUEUE_A64
(preparation for DAC/A64 support)


# 103074 07-Sep-2002 mjacob

The size argument to snprintf does not have to be backed off by one
to account for a NULL byte.

Submitted by: Jacques A. Vidrine <nectar@celabo.org>


# 103035 06-Sep-2002 mjacob

Remove STRNCAT (==>strncat) usage. Apparently I never read the man
page correctly and it wasn't doing what I thought it was.

Noticed by: Brooks Davis <brooks@one-eyed-alien.net>


# 102016 17-Aug-2002 mjacob

If we're using ancient (pre 1.17.0) 2100 f/w (for the cards that cannot
load f/w images > 0x7fff words), set ISP_FW_ATTR_SCCLUN. We explicitly
don't believe we can find attributes if f/w is < 1.17.0, so we have to
set SCCLUN for the 1.15.37 f/w we're using manually- otherwise every
target will replicate itself across all 16 supported luns for non-SCCLUN
f/w.

Correctly set things up for 23XX and either fast posting or ZIO. The
23XX, it turns out, does not support RIO. If you put a non-zero value
in xfwoptions, this will disable fast posting. If you put ICBXOPT_ZIO
in xfwoptions, then the 23XX will do interrupt delays but post to the
response queue- apparently QLogic *now* believes that reading multiple
handles from registers is less of a win than writing (and delaying)
multiple 64 byte responses to the response queue.

At the end of taking a a good f/w crash dump, send the ISPASYNC_FW_DUMPED
event to the outer layers (who can then do things like wake a user
daemon to *fetch* the crash image, etc.).


# 99595 08-Jul-2002 mjacob

Remove the 'bogus registrant' hack for fabric searches. It really
turns out that there's something of a hole in our new fabric name
server stuff. We ask the name server for entities that have
registered as a specific type. That type is FC-SCSI. If the entity
hasn't performed a REGISTER FC4 TYPES, the fabric nameserver won't
return it.

This brings this driver to a bit of a fork in the road as to what
the right thing to do is. For servicing the needs of accessing
FC-SCSI devices, this method is fine, and to be preferred. It is
extremely unlikely we're interested in fabric devices that *don't*
register correctly. If I ever get around to adding an FC-IP stack,
then asking for devices that have registers as FC-IP types is also
the right thing to do.

So- asking the fabric nameserver for a specific type is fine, *as
long as you are only interested in specific types*. If, on the other
hand, you want to create (as for management tool support) a picture
of everything on the fabric, this is *not* so fine. There are a
large class of FC-SCSI *initiators* who *don't* correctly register,
so we never will *see* them.

Is this a problem? Yes, but only a little one. If we want to do such
management tool support, we should probably run a *different* fabric
nameserver query algorithm. Better yet, we should talk to the management
nameserver in Brocade switches instead of the standard FC-GS-2 fabric
nameserver (which can be unwieldy).

Other changes: if we've overrrides marked, don't set some default
values from reading NVRAM. This allows us to override things like
EXEC throttle without having to ignore NVRAM entirely.

MFC after: 1 week


# 98290 16-Jun-2002 mjacob

If the HBA is already 'touched', still set maxluns. Othewise for
CAM_QUIRK_HILUN devices we loop thru 32bits of lun. Oops.

Switch to using USEC_DELAY rather than USEC_SLEEP at isp_reset time.

Try to paper around a defect in clients that don't correctly registers
themeselves with the fabric nameserver.

Minor updates for Mirapoint support- they still use code that is not
HANDLE_LOOPSTATE_IN_OUTER_LAYERS, and, surprise surprise, this old
stuff had some bugs in it.

Clean up some target mode stuff.

MFC after: 1 week


# 95891 01-May-2002 mjacob

If we get a DATA UNDERRUN error from QLogic FC cards, but the RQCS_RU bit
is not set in the scsi completion status, or if the residual is clearly
nonsense, then this was a command that suffered the loss of one or more
FC frames in the middle of the exchange.

Set HBA_BOTCH and hope it will get retried. It's the only thing we can do.

MFC after: 1 day


# 94867 16-Apr-2002 mjacob

Scale back # of luns supported for SCC to 16384- oops- top 3 bits are a
lun address modifier of sorts. Only an HP XP-512 seems to have cared.

Fix a few misplaced pointers for the new fabric goop, which has been
demonstrated to work on newer Brocades and McData switches now.
Put in commented out code which would run GFF_ID if the QLogic f/w
allowed it.

Don't whine about not being able to find a handle for a command if it
was a command aborted (by us).


# 93837 04-Apr-2002 mjacob

Fix bus dma segment count to be based off of MAXPHYS, not BUS_SPACE_MAXSIZE.
Grumble. I've seen better documented architectures out of Redmond.

Redo fabric evaluation to not use GET ALL NEXT (GA_NXT). Switches seem
to be trying to wriggle out of supporting this well. Instead, use
GID_FT to get a list of Port IDs and then use GPN_ID/GNN_ID to find the
port and node wwn. This should make working on fabrics a bit cleaner and
more stable.

This also caused some cleanup of SNS subcommand canonicalization so that
we can actually check for FS_ACC and FS_RJT, and if we get an FS_RJT,
print out the reason and explanation codes.

We'll keep the old GA_NXT method around if people want to uncomment a
controlling definition in ispvar.h.

This also had us clean up ISPASYNC_FABRICDEV to use a local lportdb argument
and to have the caller explicitly say that a device is at the end of the
fabric list.

MFC after: 1 week


# 92893 21-Mar-2002 mjacob

Limit fabric search to a default 256 entries. This will all go away
soon because it's just getting harder and harder to find switches
that correctly implement the GET ALL NEXT subcommands for the SNS
protocol.

Latch up result out pointer and set a busy flag when we're looking
at the response queue. This allows for a cleaner way to make sure
we don't get multiple CPUs trying to read the same response queue
entries.

Change how isp_handle_other_response returns values (clarity).

Make PORT UNAVAILABLE the same as PORT LOGOUT (force a LIP).

Do some formatting changes.

MFC after: 0 days


# 91823 07-Mar-2002 mjacob

Disable RIO (reduced interrupt operation) for 2200 boards- it seemed like
it worked- but I ran into a case with a 2204 where commands were being lost
right and left. Best be safe.

For target mode, or things called if we call isp_handle_other response- note
that we might have dropped locks by changing the output pointer so we bail
from the loop. It's the responsibility of the entity dropping the lock to
make sure that we let the f/w know we've read thus far into the response
queue (else we begin processing the same entries again- blech!).

MFC after: 1 day


# 91003 21-Feb-2002 mjacob

Fix a problem where a local loop disk logs out- and we get a PORT LOGGED
OUT status. We are, apparently, required to force the f/w to log back in
if we want to try and talk to that disk again. This means either issuing
a LOGIN LOCAL LOOP PORT mailbox command, or by issuing a LIP. I've elected
to issue a LIP because this has a better chance of waking up the disk which
clearly just crashed and burned.

These should not occur at all. If they do, they should be darned rare.

MFC after: 1 week


# 90813 18-Feb-2002 mjacob

More for f/w crash dumps (bug fixing and adding ioctl entry points
and hints to enable for specific units)

MFC after: 1 week


# 90754 17-Feb-2002 mjacob

Support for f/w crash dumps (2200 && 23XX).

If you want QLogic to look at a potential f/w problem for FC cards, you really
have to provide them info in the format they expect. This involves dumping
a lot of hardware registers (> 300 16 bit registers) and a lot of SRAM
(> 128KB minimum). Thus all of this code is #ifdef protected which will
become an option so that the memory allocation of where to dump the crash
image is pretty expensive. It's worth it if you have a reproducible problem
because they have some tools that can tell them, given the f/w version,
the precise state of everything.

MFC after: 1 week


# 90224 04-Feb-2002 mjacob

+ A variety of 23XX changes:
disable MWI on 2300

based on function code, set an 'isp_port' for the 2312- it's a
separate instance, but the NVRAM is shared, and the second port's
NVRAM is at offset 256.

+ Enable RIO operation for LVD SCSI cards. This makes a *big* difference
as even under reasonable load we get batched completions of about 30
commands at a time on, say, an ISP1080.

+ Do 'continuation' mailbox commands- this allows us to specify a work
area within the softc and 'continue' repeated mailbox commands. This is
more or less on an ad hoc basis and is currently only used for firmware
loading (which f/w now loads substantially faster becuase the calling
thread is only woken when all the f/w words are loaded- not for each
one of the 40000 f/w words that gets loaded).

+ If we're about to return from isp_intr with a 'bogus interrupt' indication,
and we're not a 23XX card, check to see whether the semaphore register is
currently *2* (not *1* as it should be) and whether there's an async completion
sitting in outgoing mailbox0. This seems to capture cases of lost fast posting
and RIO interrupts that the 12160 && 1080 have been known to pump out under
extreme load (extreme, as in > 250 active commands).

+ FC_SCRATCH_ACQUIRE/FC_SCRATCH_RELEASE macros.

+ Endian correct swizzle/unswizzle of an ATIO2 that has a WWPN in it.

MFC after: 1 week


# 88855 03-Jan-2002 mjacob

Implement REDUCED INTERRUPT OPERATION usage form FC cards- this allows the
firmware to delay completion of commands so that it can attempt to batch
a bunch of completions at once- either returning 16 bit handles in mailbox
registers, or in a resposne queue entry that has a whole wad of 16 bit handles.

Distinguish between 2300 and 2312 chipsets- if only because the revisions
on the chips have different meanings.

Add more instrumentation plus ISP_GET_STATS and ISP_CLR_STATS ioctls.
Run up the maximum number of response queue entities we'll look at
per interrupt.

If we haven't set HBA role yet, always return success from isp_fc_runstate.

MFC after: 2 weeks


# 87671 11-Dec-2001 mjacob

Explicitly decode GetAllNext SNS Response back *as*
a GetAllNext response. Otherwise, we won't unswizzle
it correctly. This was found on linux/PPC.

This mandated creating another inline: isp_get_gan_response.


# 87635 11-Dec-2001 mjacob

Major restructuring for swizzling to the request queue and unswizzling from
the response queue. Instead of the ad hoc ISP_SWIZZLE_REQUEST, we now have
a complete set of inline functions in isp_inline.h. Each platform is
responsible for providing just one of a set of ISP_IOX_{GET,PUT}{8,16,32}
macros.

The reason this needs to be done is that we need to have a single set of
functions that will work correctly on multiple architectures for both little
and big endian machines. It also needs to work correctly in the case that
we have the request or response queues in memory that has to be treated
specially (e.g., have ddi_dma_sync called on it for Solaris after we update
it or before we read from it). It also has to handle the SBus cards (for
platforms that have them) which, while on a Big Endian machine, do *not*
require *most* of the request/response queue entry fields to be swizzled
or unswizzled.

One thing that falls out of this is that we no longer build requests in the
request queue itself. Instead, we build the request locally (e.g., on the
stack) and then as part of the swizzling operation, copy it to the request
queue entry we've allocated. I thought long and hard about whether this was
too expensive a change to make as it in a lot of cases requires an extra
copy. On balance, the flexbility is worth it. With any luck, the entry that
we build locally stays in a processor writeback cache (after all, it's only
64 bytes) so that the cost of actually flushing it to the memory area that is
the shared queue with the PCI device is not all that expensive. We may examine
this again and try to get clever in the future to try and avoid copies.

Another change that falls out of this is that MEMORYBARRIER should be taken
a lot more seriously. The macro ISP_ADD_REQUEST does a MEMORYBARRIER on the
entry being added. But there had been many other places this had been missing.
It's now very important that it be done.

Additional changes:

Fix a longstanding buglet of sorts. When we get an entry via isp_getrqentry,
the iptr value that gets returned is the value we intend to eventually plug
into the ISP registers as the entry *one past* the last one we've written-
*not* the current entry we're updating. All along we've been calling sync
functions on the wrong index value. Argh. The 'fix' here is to rename all
'iptr' variables as 'nxti' to remember that this is the 'next' pointer-
not the current pointer.

Devote a single bit to mboxbsy- and set aside bits for output mbox registers
that we need to pick up- we can have at least one command which does not
have any defined output registers (MBOX_EXECUTE_FIRMWARE).

MFC after: 2 weeks


# 85395 23-Oct-2001 mjacob

Tra-La, another QLogic f/w funny- this time with the 2300.
If we get a completion status of RQCS_QUEUE_FULL, it means
that the internal queues are full. Other QLogic boards set
the QFULL SCSI status. But *nooooooooooo*, not the 2300.

MFC after: 1 day


# 85112 18-Oct-2001 mjacob

Protect against deranged fabric nameservers that spit out 10000 identical
port numbers.

MFC after: 1 day


# 84598 06-Oct-2001 mjacob

Misunderstanding documentation caused me to try and set 1Gbps/2Gps/Auto
connection speed for the 2300 in the wrong offset in the ICB. Oops.

Respect some QLogic errat wrt PCI errors on certain shared host/RISC registers.


# 84241 01-Oct-2001 mjacob

Implement a call to get the actual link data rate (if 23XX) so we can
set whether it's a 2Gps or 1Gps link.

MFC after: 1 week


# 84149 29-Sep-2001 mjacob

When calling isp_reset, set the request/response in/out pointers all at
once so there isn't a window with the ones for the 23XX cards being wrong.

When being verbose, print out some more FC NVRAM values (like framesize).

MFC after: 1 week


# 82842 03-Sep-2001 mjacob

Clarify issues about whether we have SCCLUN (65535 luns) or non-SCCLUN (16
luns) firmware for the Fibre Channel cards.

We used to assume that if we didn't download firmware, we couldn't know
what the firmware capability with respect to SCCLUNs is- and it's important
because the lun field changes in the request queue entry based upon which
firmware it is.

At any rate, we *do* get back firmware attributes in mailbox register 6
when we do ABOUT FIRMWARE for all 2200/2300 cards- and for 2100 cards
with at least 1.17.0 firmware. So- we now assume non-SCCLUN behaviour
for 2100 cards with firmware < 1.17.0- and we check the firmware attributes
for other cards (loaded firmware or not).

This also allows us to get rid of the crappy test of isp_maxluns > 16-
we simply can check firmware attributes for SCCLUN behaviour.

This required an 'oops' fix to the outgoing mailbox count field for
ABOUT FIRMWARE for FC cards.

Also- while here, hardwire firmware revisions for loaded code for SBus
cards. Apparently the 1.35 or 1.37 f/w we've been loading into isp1000
just doesn't report firmware revisions out to mailbox regs 1, 2 and 3
like everyone else. Grumble. Not that this fix hardly matters for FreeBSD.

MFC after: 4 weeks


# 82689 31-Aug-2001 mjacob

Add 2 Gigabit Fibre Channel support (2300 && 2312 cards). This required
some reworking (and consequent cleanup) of the interrupt service code.

Also begin to start a cleanup of target mode support that will (eventually)
not require more inforamtion routed with the ATIO to come back with the
CTIO other than tag.

MFC after: 4 weeks


# 81988 20-Aug-2001 mjacob

Clean up some ways in which we set defaults for SCSI cards
that do not have valid NVRAM. In particular, we were leaving
a retry count set (to retry selection timeouts) when thats
not really what we want. Do some constant string additions
so that LOGDEBUG0 info is useful across all cards.

MFC after: 2 weeks


# 81795 16-Aug-2001 mjacob

oops- typo in a previous commit


# 81790 16-Aug-2001 mjacob

Enable LIP F8, LIP Reset async events.
Be more chatty about SNS failures. Fix
typo for skipped phase mesage. Correct
MBOX_GET_PORT_QUEUE_PARAMS options in
table.

MFC after: 2 weeks


# 80995 02-Aug-2001 mjacob

Oops- don't set 'goal' twice when you mean to set 'nvrm' as well.
This breaks bogus NVRAM boards.

MFC after: 1 day


# 80581 30-Jul-2001 mjacob

Redo how we manage SCSI device settings- have a 3rd flags (nvram) that records
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).


# 79572 11-Jul-2001 mjacob

Hmm. Let's try this on for size...

We originally had it such that if the connection topology was FL-loop
(public loop), we never looked at any local loop addresses. The reason
for not doing that was fear or concern that we'd see the same local
loop disks reflected from the name server and we'd attach them twice.

However, when I recently hooked up a JBOD and a system to an ANCOR SA-8
switch, the disks did *not* show up on the fabric. So at least the
ANCOR is screening those disks from appearing on the fabric. Now, it's
possible this is a 'feature' of the ANCOR. When I get a chance, I'll
check the Brocade (it's hard to do this on a low budget).

In any case, if they *do* also show up on the fabric, we should
simply elect to not log into them because we already have an
entry for the local loop. There is relatively unexercised code
just for this case.

MFC after: 2 weeks


# 79234 04-Jul-2001 mjacob

More 2300 support prep- the Request/Response in/out pointers are
part of the PCI block for the 2300- not software convention usage
of the mailbox registers- so we macrosize in/out pointer usage.

Only report that a LIP destroyed commands if it actually destroyed
commands. Get the chan/tgt/lun order correct. Fix a longstanding
stupid bug that caused us to try and issue a command with a tag on
Channel B because we were checking the tagged capability for the
target against Channel A.

A firmware crash is now vectored out to platform specific code
as an async event.

Some minor formatting tweaks.


# 78221 14-Jun-2001 mjacob

We've had problems with data corruption occuring on
commands that complete (with no apparent error) after
we receive a LIP. This has been observed mostly on
Local Loop topologies. To be safe, let's just mark
all active commands as dead if we get a LIP and we're
on a private or public loop.

MFC after: 4 weeks


# 77776 05-Jun-2001 mjacob

Fix botch for state levels. Role minor release. Start adding code for a
'force logout' path.

MFC after: 4 weeks


# 77365 28-May-2001 mjacob

Spring MegaChange #1.

----

Make a device for each ISP- really usable only with devfs and add an ioctl
entry point (this can be used to (re)set debug levels, reset the HBA,
rescan the fabric, issue lips, etc).

----

Add in a kernel thread for Fibre Channel cards. The purpose of this
thread is to be woken up to clean up after Fibre Channel events
block things. Basically, any FC event that casts doubt on the
location or identify of FC devices blocks the queues. When, and
if, we get the PORT DATABASE CHANGED or NAME SERVER DATABASE CHANGED
async event, we activate the kthread which will then, in full thread
context, re-evaluate the local loop and/or the fabric. When it's
satisfied that things are stable, it can then release the blocked
queues and let commands flow again.

The prior mechanism was a lazy evaluation. That is, the next command
to come down the pipe after change events would pay the full price
for re-evaluation. And if this was done off of a softcall, it really
could hang up the system.

These changes brings the FreeBSD port more in line with the Solaris,
Linux and NetBSD ports. It also, more importantly, gets us being
more proactive about topology changes which could then be reflected
upwards to CAM so that the periph driver can be informed sooner
rather than later when things arrive or depart.

---

Add in the (correct) usage of locking macros- we now have lock transition
macros which allow us to transition from holding the CAM lock (Giant)
and grabbing the softc lock and vice versa. Switch over to having this
HBA do real locking. Some folks claim this won't be a win. They're right.
But you have to start somewhere, and this will begin to teach us how
to DTRT for HBAs, etc.

--

Start putting in prototype 2300 support. Add back in LIP
and Loop Reset as async events that each platform will handle.
Add in another int_bogus instrumentation point.

Do some more substantial target mode cleanups.

MFC after: 8 weeks


# 75192 04-Apr-2001 mjacob

After loading f/w, for FC cards print out Firmware Attributes.

Redo establishment of default SCSI parameters whether or not
we've been compiled for target mode. Unfortunately, the Qlogic
f/w is confused so that if we set all targets to be 'safe' (i.e.,
narrow/async), it will also then report narrow, async if we're
contacted in target mode from that target (acting in initiator
role). D'oh!

Fix ISPCTL_TOGGLE_TMODE to correctly enable the right channel for
dual channel cards. Add some more opcodes. Fix a stupid NULL
pointer bug.


# 74229 14-Mar-2001 mjacob

In order to save ourselves grief with the SUNPRO compiler under
Solaris (which, for reasons unknown to me, chokes on u_int16_t
as a typedef of unsigned short if used in a transitional (mixed K&R
and ANSI) way), we'll go the extra mile and fully ANSIfy things.


# 73529 04-Mar-2001 mjacob

Remove a superfluous newline in a string (isp_prt adds this).
Fix a missed conversion of 32 to 16 bit handles.


# 73319 02-Mar-2001 mjacob

Switch to using 16 bit handles instead of 32 bit handles.
This is a pretty invasive change, but there are three good
reasons to do this:

1. We'll never have > 16 bits of handle.
2. We can (eventually) enable the RIO (Reduced Interrupt Operation)
bits which return multiple completing 16 bit handles in mailbox
registers.
3. The !)$*)$*~)@$*~)$* Qlogic target mode for parallel SCSI spec
changed such that at_reserved (which was 32 bits) was split into
two pieces- and one of which was a 16 bit handle id that functions
like the at_rxid for Fibre Channel (a tag for the f/w to correlate
CTIOs with a particular command). Since we had to muck with that
and this changed the whole handler architecture, we might as well...

Propagate new at_handle on through int ct_fwhandle. Follow
implications of changing to 16 bit handles.

These above changes at least get Qlogic 1040 cards working in target
mode again. 1080/12160 cards don't work yet.

In isp.c:
Prepare for doing all loop management in outer layers.


# 72938 23-Feb-2001 mjacob

Fix a longstanding bug- we had the sense of what bit 14
for the ICB firmware options meant- *I* had taken it to
mean that if you set it, Node Name would be ignored and
derived from Port Name. Actually, it meant the opposite.
As a consequence- change ICBOPT_USE_PORTNAME to the
define ICBOPT_BOTH_WWNS- makes more sense.

Fix wrong input bitmap for MBOX_DUMP_RAM command. Call
ISP_DUMPREGS if we get a f/w crash. Add ISPCTL_RUN_MBOXCMD
control command (so outer layers can run a mailbox command
directly) and add a ISPASYNC_UNHANDLED_RESPONSE hook so
outer layers can understand response queue entries we
might not know about.


# 72346 11-Feb-2001 mjacob

Minor stuff:

Remove ISP2100_FABRIC defines- we always handle fabric now. Insert
isp_getmap helper function (for getting Loop Position map). Make
sure we (for our own benefit) mark req_state_flags with RQSF_GOT_SENSE
for Fibre Channel if we got sense data- the !*$)!*$)~*$)*$ Qlogic
f/w doesn't do so. Add ISPCTL_SCAN_FABRIC, ISPCTL_SCAN_LOOP, ISPCTL_SEND_LIP,
and ISPCTL_GET_POSMAP isp_control functions. Correctly send async notifications
upstream for changes in the name server, changes in the port database, and
f/w crashes. Correctly set topology when we get a ASYNC_PTPMODE event.

Major stuff:
Quite massively redo how we handle Loop events- we've now added several
intermediate states between LOOP_PDB_RCVD and LOOP_READY. This allows us
a lot finer control about how we scan fabric, whether we go further
than scanning fabric, how we look at the local loop, and whether we
merge entries at the level or not. This is the next to last step for
moving managing loop state out of the core module entirely (whereupon
loop && fabric events will simply freeze the command queue and a thread
will run to figure out what's changed and *it* will re-enable the queu).
This fine amount of control also gets us closer to having an external
policy engine decide which fabric devices we really want to log into.


# 71074 15-Jan-2001 mjacob

When resetting the Qlogic 2X00 units, reset the FPM (Fibre Protocol
Module) and FBM (Fibre Buffer Modules). Also remember to clear the
semaphore registers. Tell the RISC processor to not halt on FPM
parity errors.

Throw out the ISP_CFG_NOINIT silliness and instead go to the use of
adapter 'roles' to see whether one completes initialization or not
(mostly for Fibre Channel). The ultimate intent, btw, of all of this
is to have a warm standby adapter for failover reasons. Because
we do roles now, setting of Target Capable Class 3 service parameters
in the ICB for the 2x00 cards reflects from role. Also, in isp_start,
if we're not supporting an initiator role, we bounce outgoing commands
with a Selection Timeout error. Also clean out the TOGGLE_TMODE
goop for FC- there is no toggling of target mode like there is
for parallel SCSI cards.

Do more cleanup with respect to using target ids 0..125 in F-port
topologies. Also keep track of things which *were* fabric devices
so that when you rescan the fabric you can notify the outer layers
when fabric devices go away.

Only force a LOGOUT for fabric devices if they're still logged in
(i.e., you cat their Port Database entry. Clean up the Get All Next
scanning.

Finally, use a new tag in the softc to store the opcode for the
last mailbox command used so we can report which opcode timed
out.


# 70821 09-Jan-2001 mjacob

Add a isp_register_fc4_type function so that we work with McData switches
that require us to register our FC4 types of interest. Allow ourselves, in
F-port topologies, to start logging in fabric devices in the target 0..125
range. Change ISPASYNC_PDB_CHANGED (misnamed) to ISPASYNC_LOGGED_INOUT.
Fix (*SMACK*) again some default WWN stuff. This is *really* hard to get
right across all the range of platforms.


# 70516 30-Dec-2000 mjacob

Change the modification of what could be a const string. Apparently the
construct:

char *foo;
...
foo = "XXX";
...
foo[1] = 'Y';

is wrong. IT blew up on NetBSD-sparc64 because that platform write-protects
constant strings.


# 70489 29-Dec-2000 mjacob

Add in Bill Sommerfelds -Wformat changes. Set up default node && port
WWNs correctly (Again!) - this time for the case that we're not going
to fully init the adapter if isp_init is called (with ISP_CFG_NOINIT
set in options). The pupose for this is to bring the adapter up to
almost ready to go, get info out of NVRAM, but to not start it up- leaving
it until later to actually start things up if wanted (and possibly with
different roles selected).


# 69523 02-Dec-2000 mjacob

Make the Not RESPONSE in RESPONSE QUEUE message have a bit more info
(specifically, how many entries we've looked at so far). Maintain
interrupt instrumentation. Use USEC_SLEEP instead of USEC_DELAY in
a number of places (this allows us to drop locks and sleep instead
of spin). Track changes to configuration options for topology preference.
Fix botched order of printout for Channel, Target, Lun.


# 67048 12-Oct-2000 mjacob

Redo how default Node and Port WWNs are determined (again!). This is so
we don't stomp on the differences between ports for a Qlogic 2202.


# 66189 21-Sep-2000 mjacob

some copyright cleanups


# 66173 21-Sep-2000 mjacob

Inintialize the queue index stuff from what the f/w sends back- just
in case it's insane enough to not do what you tell it to.

Print out (LOGINFO level) initiator ID.


# 65140 27-Aug-2000 mjacob

various fixes


# 64095 01-Aug-2000 mjacob

Major whacking for core version 2.0. A major motivator for 2.0 and these
changes is that there's now a Solaris port of this driver, so some things
in the core version had to change (not much, but some).

In order, from the top.....:

A lot of error strings are gathered in one place at the head of the file.
This caused me to rewrite them to look consistent (with respect to
things like 'Port 0x%' and 'Target %d' and 'Loop ID 0x%x'.

The major mailbox function, isp_mboxcmd, now takes a third argument,
which is a mask that selectively says whether mailbox command failures
will be logged. This will substantially reduce a lot of spurious noise
from the driver.

At the first run through isp_reset we used to try and get the current
running firmware's revision by issuing a mailbox command. This would
invariably fail on alpha's with anything but a Qlogic 1040 since SRM
doesn't *start* the f/w on these cards. Instead, we now see whether we're
sitting ROM state before trying to get a running BIOS loaded f/w version.

All CFGPRINTF/PRINTF/IDPRINTF macros have been replaced with calls to
isp_prt. There are seperate print levels that can be independently
set (see ispvar.h), which include debugging, etc.

All SYS_DELAY macros are now USEC_DELAY macros. RQUEST_QUEUE_LEN and
RESULT_QUEUE_LEN now take ispsoftc as a parameter- the Fibre Channel
cards and the Ultra2/Ultra3 cards can have 16 bit request queue entry
indices, so we can make a 1024 entry index for them instead of the
256 entries we've had until now.

A major change it to fix isp_fclink_test to actually only wait the
delay of time specified in the microsecond argument being passed.
The problem has always been that a call to isp_mboxcmd to get he
current firmware state takes an unknown (sometimes long) amount of
time- this is if the firmware is busy doing PLOGIs while we ask
it what's up. So, up until now, the usdelay argument has been
a joke. The net effect has been that if you boot without being plugged
into a good loop or into a switch, you hang. Massively annonying, and
hard to fix because the actual time delta was impossible to know
from just guessing. Now, using the new GET_NANOTIME macros, a precise
and measured amount of USEC_DELAY calls are done so that only the
specified usecdelay is allowed to pass. This means that if the initial
startup of the firmware if followed by a call from isp_freebsd.c:isp_attach
to isp_control(isp, ISP_FCLINK_TEST, &tdelay) where tdelay is 2 * 1000000,
no more than two seconds will actually elapse before we leave concluding
that the cable is unhooked. Jeez. About time....

Change the ispscsicmd entry point to isp_start, and the XS_CMD_DONE
macro to a call to the platform supplied isp_done (sane naming).

Limit our size of request queue completions we'll look at at interrupt
time. Since we've increased the size of the Request Queue (and the
size of the Response Queue proportionally), let's not create an
interrupt stack overflow by having to keep a max completion list
(forw links are not an option because this is common code with
some platforms that don't have link space in their XS_T structures).
A limit of 32 is not unreasonable- I doubt there'd be even this many
request queue completions at a time- remember, most boards now use
fast posting for normal command completion instead of filling out
response queue entries.

In the isp_mboxcmd cleanup, also create an array of command
names so that "ABOUT FIRMWARE" can be printed instead of "CMD #8".

Remove the isp_lostcmd function- it's been deprecated for a while.
Remove isp_dumpregs- the ISP_DUMPREGS goes to the specific bus
register dump fucntion.

Various other cleanups.


# 63383 18-Jul-2000 mjacob

Raise debug level for some messages. Fix botched inversion
about MBOX_COMMAND_ERROR vs. MBOX_COMMAND_PARAM_ERROR.


# 62618 05-Jul-2000 mjacob

Clean up ISPCTL_ABORT_CMD function to not be too chatty if it succeeds,
or even if it fails with INVALID_PARM (which just means that the handle
doesn't refer to an active commane).


# 62495 04-Jul-2000 mjacob

Change delay loop in new isp_mboxcmd to the use of the new MBOX_WAIT_COMPLETE
macro. Change notification of completion of a mailbox command in isp_intr
to MBOX_NOTIFY_COMPLETE macro.


# 62174 27-Jun-2000 mjacob

Fix usage of DELAY (SYS_DELAY is the platform independent local
define). Fix stupidity wrt checking whether we've gone to
LOOP_PDB_RCVD loopstate- it's okay to be greater than this state.
D'oh! Protect calls to isp_pdb_sync and isp_fclink_state with IS_FC
macros.

Completely redo mailbox command routine (in preparation to make this
possibly wait rather than poll for completion).

Make a major attempt to solve the 'lost interrupt' problem

1. Problem

The Qlogic cards would appear to 'lose' interrupts, i.e., a legitimate
regular SCSI command placed on the request queue would never complete
and the watchdog routine in the driver would eventually wakeup and
catch it. This would typically only happen on Alphas, although a
couple folks with 700MHz Intel platforms have also seen this.

For a long time I thought it was a foulup with f/w negotiations of
SYNC and/or WIDE as it always seemed to happen right after the
platform it was running on had done a SET TARGET PARAMETERS mailbox
command to (re)enable sync && wide (after initially forcing
ASYNC/NARROW at startup). However, occasionally, the same thing
would also occur for the Fibre Channel cards as well (which, ahem,
have no SET TARGET PARAMETERS for transfer mode).

After finally putting in a better set of watchdog routines for the
platforms for this driver, it seemed to be the case that the command
in question (usually a READ CAPACITY) just had up and died- the
watchdog routine would catch it after ~10 seconds. For some platforms
(NetBSD/OpenBSD)- an ABORT COMMAND mailbox command was sent (which
would always fail- indicating that the f/w denied knowledge of this
command, i.e., the f/w thought it was a done command). In any case,
retrying the command worked. But this whole problem needed to be
really fixed.

2. A False Step That Went in The Right Direction

The mailbox code was completely rewritten to no longer try and grab
the mailbox semaphore register and to try and 'by hand' complete
async fast posting completions. It was also rewritten to now have
separate in && out bitpatterns for registers to load to start and
retrieve to complete. This means that isp_intr now handles mailbox
completions.

This substantially simplifies the mailbox handling code, and carries
things 90% toward getting this to be a non-polled routine for this
driver.

This did not solve the problem, though.

3. Register Debouncing

I saw some comments in some errata sheets and some notes in a Qlogic
produced Linux driver (for the Qlogic 2100) that seemed to indicate
that debouncing of reads of the mailbox registers might be needed,
so I added this. This did not affect the problem. In fact, it made
the problem worse for non-2100 cards.

5. Interrupt masking/unmasking

The driver *used* to do a substantial amount of masking/unmasking
of the interrupt control register. This was done to make sure that
the core common code could just assume it would never get pre-empted.

This apparently substantially contributed to the lost interrupt
problem. The rewrite of the ICR (Interrupt Control Register),
which is a separate register from the ISR (Interrupt Status Register)
should not have caused any change to interrupt assertions pending.
The manual does not state that it will, and the register layout
seems to imply that the ICR is just an active route gate. We only
enable PCI Interrupts and RISC Interrupts- this should mean that
when the f/w asserts a RISC interrupt and (and the ICR allows RISC
Interrupts) and we have PCI Interrupts enabled, we should get a
PCI interrupt. Apparently this is a latch- not a signal route.

Removing this got rid of *most* but not all, lost interrupts.

5. Watchdog Smartening

I made sure that the watchdog routine would catch cases where the
Qlogic's ISR showed an interrupt assertion. The watchdog routine
now calls the interrupt service routine if it sees this. Some
additional internal state flags were added so that the watchdog
routine could then know whether the command it was in the middle
of burying (because we had time it out) was in fact completed by
the interrupt service routine.

6. Occasional Constipation Of Commands..

In running some very strenous high IOPs tests (generating about
11000 interrupts/second across one Qlogic 1040, one Qlogic 1080
and one Qlogic 2200 on an Alpha PC164), I found that I would get
occasional but regular 'watchdog timeouts' on both the 1080 and
the 2100 cards. This is under FreeBSD, and the watchdog timeout
routine just marks the command in error and retries it.

Invariably, right after this 'watchdog timeout' error, I'd get a
command completion for the command that I had thought timed out.
That is, I'd get a command completion, but the handle returned by
the firmware mapped to no current command. The frequency of this
problem is low under such a load- it would usually take an 30
minutes per 'lost' interrupt.

I doubled the timeout for commands to see if it just was an edge
case of waiting too short a period. This has no effect.

I gathered and printed out microtimes for the watchdog completed
command and the completion that couldn't find a command- it was
always the case that the order of occurrence was "timeout, completion"
separated by a time on the order of 100 to 150 ms.

This caused me to consider 'firmware constipation' as to be a
possible culprit. That is, resubmission of a command to the device
that had suffered a watchdog timeout seemed to cause the presumed
dead command to show back up.

I added code in the watchdog routine that, when first entered for
the command, marks the command with a flag, reissues a local timeout
call for one second later, but also then issues a MARKER Request
Queue entry to the Qlogic f/w. A MARKER entry is used typically
after a Bus Reset to cause the f/w to get synchronized with respect
to either a Bus, a Nexus or a Target.

Since I've added this code, I always now see the occasional watchdog
timeout, but the command that was about to be terminated always
now seems to be completed after the MARKER entry is issued (and
before the timeout extension fires, which would come back and
*really* terminate the command).


# 61776 18-Jun-2000 mjacob

Once we have firmware running (if isp_reset) and this is the first time
through, establish what our LUN width is. Unfortunately, we can't ask
the f/w. If we loaded the f/w, we'll now assume we have expanded LUNs
(SCCLUN for fibre channel, just plain 32 LUN for SCSI). If we didn't
load firmware, assume 8 LUNs for SCSI and 1 LUN for Fibre Channel. We
have to assume only one LUN for Fibre Channel because the LUN setting
in Request Queue entries is in different places whether we have SCCLUN
firmware or not, so the only LUN guaranteed to work for both is LUN 0.

Clean up the rest of isp.c so that ISP2100_SCCLUN defines aren't used-
instead use run time determinants based upon isp->isp_maxluns.

After starting firmware, delay 500us to give it a chance to get rolling.

Fix the interrupt service routine to check for both isr && sema being zero
before thinking this was a spurious interrupt. Following the manuals,
allow for both Mailbox as well as Queue Reponse type interrupts for regular
SCSI.


# 60224 09-May-2000 mjacob

Fix some breakage about how we build WWNs. Do some other fabric related
changes: consider a new PDB entry different if Class 3 service parameter
roles change (!!!). Do some checking as we're getting a port database
that traps whether things change while we're doing so. Handle N-port
and F-ports correctly. Fix the fabric login loop to retain a login/binding
if things haven't changed (I mean, why logout a device only to log it back
in). No longer accept, after fabric logins, garbage if we can't get a PDB
entry that matches the device we've just logged into- if it doesn't, log
it out as it is very unlikely to still be what we thought it was. Get rid
of some of the debounce loops because we could get stuck there.


# 59451 21-Apr-2000 mjacob

Pick up topology more sanely at f/w startup. Change the restrictions of
where we can have targets (based on topology).

Much more importantly, make sure all mods to isp_sendmarker or |= so
we don't lose the marking of a bus that needs to have a marker sent for it.


# 57584 29-Feb-2000 mjacob

Slightly cleaner fabric support (whiter whites! redder reds!).. No,
seriously- only attempt to logout a previously logged in fabric device.

Fix a longstanding bug for aborting overtime commands- handle halves
have always been reversed.

Clean up some error messages to indicate channel number.

Approved:jkh


# 57213 15-Feb-2000 mjacob

Clean out residual bogosity for fast posting stuff- ISP_NO_FASTPOST_SCSI
is gone as a define. We just don't support fast posting for anything less
than the 1240/1080/1280/12160 or Fibre Channel cards.

Put in support for CDB's larger than 12 bytes for parallel SCSI (up to 44
bytes are allowed).

Approved: jkh


# 57146 11-Feb-2000 mjacob

Restructure nvram reading routine to split out to separate functions
for 1020/1X80/12160/2X00- for readability. Add in 12160 (Ultra3)
support- but not with PPR just yet. Fix and clarify fetching of
return parameter for getting firmware rev which for the 2200 contains
the connection topology (Private Loop (NL-port), N-port, FL-port,
F-port). Synthesize the connection topology for the 2100 which can
only be Private Loop or FL-port. Handle a couple of new async
mailbox commands which signify connection in Point-to-Point mode
(N-port or F-port) or indicate various toe stubbing getting to same.

Approved: jkh@freebsd.org


# 56008 15-Jan-2000 mjacob

clean up for SBus Ultra (yes, we do not do that here yet)


# 55689 09-Jan-2000 mjacob

change debug printout lefvels for a couple of places


# 55386 04-Jan-2000 mjacob

Make Fibre Channel cards correctly note the presence/absence
of ARQ data and punt the dealing with its presence/absence
to the platform layers.


# 55370 03-Jan-2000 mjacob

Raise default FCP logintime to 60 seconds. Move the position
of where we could have seen the loop up at least once so it
makes sense. Change some stuff in ispscsicmd so we don't get
stuck there if the loop has never come up yet. Add in some
target mode support code.


# 54859 20-Dec-1999 mjacob

Clean up some f/w revision checking wrt enabling fast posting.
Make sure we set defaults sanely for dual-bus adapters.


# 54671 16-Dec-1999 mjacob

Add Dual LVD bus (1280) support


# 54057 03-Dec-1999 mjacob

turn some messages into CFGPRINT messages


# 53490 21-Nov-1999 mjacob

Clean up stupidity in the isp_handle_other_response function- indexes
of queue entries have to be at least 16 bits now! If we're running
a 2100 less than rev 5, turn off loop fairness (per Qlogic errata). Fix
typo in checking against 2200 F/W revision. Slightly fix/reorder fabric
login stuff. Change to usage of isp_getrqentry for code clarity. Add some
defensive dual bus assumptions. Various cleanups, etc...


# 52733 01-Nov-1999 mjacob

correct moronic typo


# 52682 30-Oct-1999 mjacob

Use pointer to f/w in md structure as to whether f/w exists or not.
If firmware length isn't specified, extract from the 4th short into
the firmware.


# 52579 28-Oct-1999 mjacob

I was misinformed. I cannot get away from specifying tags for FC. Some devices
are happy w/o them- some are unhappy (IBM drives).


# 52537 26-Oct-1999 mjacob

nuke a debug printout I thought I had already nuked


# 52437 22-Oct-1999 mjacob

remember to initialize mailbox 2 for FC isp bus resets


# 52350 17-Oct-1999 mjacob

Remove some target mode stuff. It will get re-introduced in a different
file later. Do some pencil-sharpening types of minor changes. Change
how active commands are remembered (using new inline functions to get
handles, etc..). Now do a GET FIRMWARE STATUS after firing up the f/w as
outgoing mailbox 2 will tell you the f/w's notion of the max commands
that can be supported. Attempt to retrieve loop topology. Add in the
appropriate SWIZZLE/UNSWIZZLE macros calls (this is a no-op on Little
Endian machines but is needed for sparc (on other platforms)). Move
the temp port database we use to find out where things have moved to
after a LIP to the softc and off the kernel stack. Follow Qlogic's
hint and don't bother setting a tag for commands that don't have
this enabled (presumably the f/w will do it's own selection then).
Use an INT_PENDING macro to check for an interrupt. The call to
ISP_DMAFREE now just takes the handle- not the 'handle-1' which was
a layering violation. Use CFGPRINTF in a couple of places to make
things less chatty if not booting verbose, or CAMDEBUG compiles, etc..


# 50477 28-Aug-1999 peter

$Id$ -> $FreeBSD$


# 49907 16-Aug-1999 mjacob

More code cleanup. Go back to using FULL_LOGIN Fibre Chan if f/w is less than
1.17.0 level. Change where we do the loop database init. Add in the CMD_RQLATER
return. Add some register debounce.


# 48602 05-Jul-1999 mjacob

add 2200 f/w; fix botched define


# 48486 02-Jul-1999 mjacob

Roll revision levels. Add support for the Qlogic 2200 (warn about
not having SCSI_ISP_SCCLUN config defined if we don't have f/w for
the 2200- it's resident firmware uses SCCLUN (65535 luns)). Change
the way the default LoopID is gathered (it's now a platform specific
define so that some attempt at a synthetic WWN can be made in case
NVRAM isn't readable).

Change initialization of options a bit- don't use ADISC. Set
FullDuplex mode if config options tells us to do so. Do not use
FULL_LOGIN after LIP- it's the right thing to do but it causes too
much loop disruption (Loop Resets). Sanity check some default
values. Redo construction of port and node WWNs based upon what we
have- if we have 2 in the top nibble, we can have distinct port
and node WWNs. Clean up some SCCLUN related code that we obviously
had never compiled (:-(). Audit commands coming int ispscsicmd and
don't throw commands at Fibre devices that do not have Class 3
service parameters TARGET ROLE defined.

Clean up f/w initialization a bit. Add Fabric support (or at least
the first blush of it). Whew - way too much to describe here.
Basically, after a LIP, hang out until we see a Loop Up or a Port
DataBase Change async event, then see if we're on a Fabric
(GET_PORT_NAME of FL_PORT_ID). If we are, try and scan the fabric
controller for fabric devices using the GetAllNext SNS subcommand.
As we find devices, announce them to the outer layer. Try and do
some guard code for broken (Brocade) SNS servers (that get stuck
in loops- gotta maybe do this a different way using the GP_ID3 cmd
instead). Then do a scan of the lower (local loop) ids using a
GET_PORT_NAME to see if the f/w has logged into anything at that
loop id. If so, then do a GET_PORT_DATABASE command. Do this scan
into a local database. At this point we can say the loop is 'Ready'.
After this, we merge our local loop port database with our stored
port database- in a as yet to be really fully exercised fashion we
try and follow the logic of something having moved around. The
first time we see something at a Loop ID, we fix it, for the purpose
of this system instance, at that Loop ID. If things shift around
so it ends up somewhere else, we still keep it at this Loop ID (our
'Target') but use the new (moved) Loop ID when we actually throw
commands at it. Check for insane cases of different Loop IDs both
claiming to have the same WWN- if that happens, invalidate both.
Notify the outer layer of devices that have arrived and devices
that have gone away. *Finally*, when this is done, search the
softc's database of Fabric devices and perform logout/login actions.
The Qlogic f/w maintains logout/login for all local loop devices.
We have to maintain logout/login for fabric devices- total PITA.
Expect to see this area undergo more change over time.


# 47072 12-May-1999 mjacob

be a bit more chatty about some speed negotiations


# 46971 11-May-1999 mjacob

Some massive thwunking in initialization to handle dual bus adapters. More
massive thwunking to include an XS_CHANNEL value. Some changes of how
parameters are reported to outer layers (including bus, e.g.). Yet more
stirring around in isp_mboxcmd to try and get it right. Decode of 1080/1240
NVRAM.


# 45682 14-Apr-1999 mjacob

temp fix for internal queue overflow problem


# 45287 04-Apr-1999 mjacob

Make firmware revision a triple. Clean up some FC init stuff for
board versions with no BIOS. Separate mailbox interrupts from
IOCB interrupts. Read OUTMAILBOX5 while RISC_INT is active- not
after you clear it (potential race condition). Clear out older broken
BIG_ENDIAN goop. Don't negotiate narrow/async for LVD busses at startup
if already in LVD mode. Note usage of presumptive 1040C revision. For
all the LIP, PDB Changed, Loop UP/DOWN async events, mark fw state
as unknown as well as marking the need to do a getpdb on targets- after
a LIP for certain the f/w has to do PRLI/PLOGI for all targets again
and marking f/w state as unknown gives us a fighting chance to (start
to) hold up for that to complete.


# 45045 26-Mar-1999 mjacob

Annoying little nigglet- apparently *some* Qlogic temporarily ignore
settings you've just sent them and return random values if you follow
the set by a get. This causes problems when you latter run a Tag-enabled
command when you've command tagged mode off.


# 45040 25-Mar-1999 mjacob

Add in 1080 LVD support and some basis also for the 1240. The port database
printout is now enabled.


# 44819 17-Mar-1999 mjacob

A wad of changes- prepping for 1080/1240 support (which caused a massive
thwank in register layout goop). A different mboxcmd approach. Some PDB change
infrastructure. Some better management of loopdown/loopup events (keep them
distinct from resource starvation for simq freeze/unfreeze actions).


# 43789 09-Feb-1999 mjacob

Roll internal release tag. Print out if we're in a 64 bit PCI slot.
Use fast memory timing NVRAM parameter. Clean up and fix establishment
of default target parameters. Don't use NVRAM if are flagged as not to
do so (I had a busted NVRAM setup which I couldn't edit that enabled SYNC
mode but disabled disconnect/reconnect and wide!!). Fix delays after
resets. BUS resets not done in isp_init anymore- relegated to OS
specific outer layers. Fix a buglet where you can get in a loop for
a NULL xs in the completion list in isp_intr. Add in some defines that
can disable fast posting. Add in code for Loop Up/Loop Down events that
call into the outer layers as to what to do.


# 43420 30-Jan-1999 mjacob

Implement and use Fast Posting for both parallel && fibre. Redo a bit of
the startup code. Implement a call to outer framework function so that
asynchronous events can be handled (e.g., speed negotiation, target mode).

Roll internal release tags.


# 42472 10-Jan-1999 mjacob

Suggested by bde@freebsd.org- memcpy not necessarily good to use. D'oh- not in
the BSD DKI. Stop being lazy and finish the defines so MEMCPY becomes bzero
for FreeBSD.


# 42462 10-Jan-1999 mjacob

Add some prototype deadchip detection. Set FIFO bursting (1XX0 only-
it's already on for the 2XX0) and detect the broken 1040A FIFO. Change
bzero to MEMZERO (portability with **nux). Use memcpy for same reason.

Finally detect QUEUE FULL conditions and return this as an error that
will get cam_periph_error to do it's 'tagged openings now XXX' dance.


# 42131 28-Dec-1998 mjacob

clarify headers;move uninit to outer layer;remove watchdog


# 41525 05-Dec-1998 mjacob

oops on last


# 41524 05-Dec-1998 mjacob

Remove the Target mode functions until they're in better shape. Implement some
suggested compilation cleanups from Eklund. Wire down a hard loop id if we are
not on a platform that has the ability to get to a PCI BIOS (it still will
float to the ID it gets after a LIP but at least we can try). Clarify that the
expanded lun is based upon SCCLUN defines (in f/w).


# 39440 17-Sep-1998 mjacob

per bde (who is right about this) that an inlined fucntion with const
char * strings being returned defined in a header file included several
places but only used in one module, is, uh, silly.


# 39439 17-Sep-1998 mjacob

Cleanliness. Don't leave defined a const char array that's only used
if target mode is defined (which it isn't, yet).


# 39432 17-Sep-1998 mjacob

ISP_DMASETUP now returns a value to be possibly punted to outer layers.
Turn request queue overflow messages into debug messages. Ensure on
isp_restarts that we nullify the xflist array.


# 39315 15-Sep-1998 mjacob

fix reported compile error flying blind- I do not have the new compiler yet


# 39235 15-Sep-1998 gibbs

Update QLogic ISP support for CAM. Add preliminary target mode support.

Submitted by: Matthew Jacob <mjacob@feral.com>


# 35388 22-Apr-1998 mjacob

Add support for the Qlogic ISP SCSI && FC/AL Adapters