Random thoughts, tips & tricks about Slackware-Linux, Lego and Star Wars

VHBA and RocketRaid 2320 (rr232x) modules with Kernel 2.6.37.x

March 20th, 2011 by Niels Horn

Only a few days ago a user informed me that the VHBA kernel module, for which I maintain the SlackBuild script, stopped working with the new 2.6.37.3 and 2.6.37.4 kernels in Slackware 13.37 (actually, it's still officially ).
When I say "stopped working", I'm actually being nice… It caused a fatal crash that required rebooting the box it ran on :(

The user sent me a patch that worked, but hacking kernel modules is not to be taken lightly, so I wanted to understand what was going on…

Read the documentation!

It's something most people don't like, but it actually is a good idea to read the documentation once in a while :)
In /usr/src/linux/Documentation/scsi/scsi_mid_low_api.txt (if you have the 2.6.37.x kernel installed), I found this interesting comment:

Locks: up to and including 2.6.36, struct Scsi_Host::host_lock
held on entry (with "irqsave") and is expected to be
held on return. From 2.6.37 onwards, queuecommand is
called without any locks held.

Well, that seems to be the culprit then… So I headed for the kernel git repository and found the commit that made this change. Its message reads:

Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.

The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.

Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)

Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.

Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.

I checked some of the patches in the kernel tree and indeed they seemed quite simple. This is the patch for the aha1542 driver:

--- a/drivers/scsi/aha1542.c
+++ b/drivers/scsi/aha1542.c
@@ -558,7 +558,7 @@ static void aha1542_intr_handle(struct Scsi_Host *shost)
 };
 }
-static int aha1542_queuecommand(Scsi_Cmnd * SCpnt, void (*done) (Scsi_Cmnd *))
+static int aha1542_queuecommand_lck(Scsi_Cmnd * SCpnt, void (*done) (Scsi_Cmnd *))
 {
 unchar ahacmd = CMD_START_SCSI;
 unchar direction;
@@ -718,6 +718,8 @@ static int aha1542_queuecommand(Scsi_Cmnd * SCpnt, void (*done) (Scsi_Cmnd *))
 return 0;
 }
+static DEF_SCSI_QCMD(aha1542_queuecommand)
+
 /* Initialize mailboxes */
 static void setup_mailboxes(int bse, struct Scsi_Host *shpnt)
 {
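To get a feeling for what that one-line DEF_SCSI_QCMD addition does, here is a small userspace model of the idea. It is only a sketch under my own assumptions: the struct definitions, the mock_lock / mock_unlock helpers, the MOCK_DEF_SCSI_QCMD macro and the demo_* functions are all made up for illustration and are not the kernel's code (the real macro lives in the kernel's SCSI headers and works on the host spinlock). The point it tries to show is that the macro generates a wrapper with the new (host, command) signature, which takes the host lock, calls the untouched old-style _lck body, and releases the lock again.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel types (hypothetical, minimal). */
struct Scsi_Host {
    bool host_lock_held;              /* models the state of host_lock */
};

struct scsi_cmnd {
    struct Scsi_Host *host;           /* host this command belongs to */
    void (*scsi_done)(struct scsi_cmnd *);
    bool saw_lock;                    /* did the driver body run locked? */
    bool completed;                   /* was the completion callback called? */
};

/* Mock lock helpers standing in for the spinlock calls. */
static void mock_lock(struct Scsi_Host *h)
{
    assert(!h->host_lock_held);
    h->host_lock_held = true;
}

static void mock_unlock(struct Scsi_Host *h)
{
    assert(h->host_lock_held);
    h->host_lock_held = false;
}

/* Model of the wrapper generator: new signature outside, old body inside. */
#define MOCK_DEF_SCSI_QCMD(func_name)                                  \
    int func_name(struct Scsi_Host *shost, struct scsi_cmnd *cmd)      \
    {                                                                  \
        int rc;                                                        \
        mock_lock(shost);                                              \
        rc = func_name##_lck(cmd, cmd->scsi_done);                     \
        mock_unlock(shost);                                            \
        return rc;                                                     \
    }

static void demo_done(struct scsi_cmnd *cmd)
{
    cmd->completed = true;
}

/* The "old" driver body: only renamed, otherwise unchanged. */
static int demo_queuecommand_lck(struct scsi_cmnd *cmd,
                                 void (*done)(struct scsi_cmnd *))
{
    cmd->saw_lock = cmd->host->host_lock_held;
    done(cmd);
    return 0;
}

static MOCK_DEF_SCSI_QCMD(demo_queuecommand)

/* Calls the wrapper the way the newer mid-layer would: no lock held. */
int run_demo(void)
{
    struct Scsi_Host host = { .host_lock_held = false };
    struct scsi_cmnd cmd = { .host = &host, .scsi_done = demo_done };
    int rc = demo_queuecommand(&host, &cmd);

    return rc == 0 && cmd.saw_lock && cmd.completed && !host.host_lock_held;
}
```

run_demo() returns 1 when the old-style body ran with the (mock) lock held and the lock was released afterwards. That is the "equivalent transformation" the commit message talks about: the locking just moves from the caller into the generated wrapper.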

Patching the VHBA module

In the SVN repository for VHBA I found this change:

--- trunk/vhba-module/vhba.c 2010/08/15 20:11:18 691
+++ trunk/vhba-module/vhba.c 2011/02/27 15:56:27 730
@@ -363,7 +363,7 @@
 spin_unlock_irqrestore(&vhost->cmd_lock, flags);
 }
-static int vhba_queuecommand(struct scsi_cmnd *cmd, void (*done)(struct scsi_cmnd *))
+static int vhba_queuecommand_lck(struct scsi_cmnd *cmd, void (*done)(struct scsi_cmnd *))
 {
 struct vhba_device *vdev;
 int retval;
@@ -388,6 +388,12 @@
 return retval;
 }
+#ifdef DEF_SCSI_QCMD
+DEF_SCSI_QCMD(vhba_queuecommand)
+#else
+#define vhba_queuecommand vhba_queuecommand_lck
+#endif
+
 static int vhba_abort(struct scsi_cmnd *cmd)
 {
 struct vhba_device *vdev;

(there's actually a bit more, but that's just a cosmetic change…)

The #ifdef / #else / #endif structure was included so that the module still builds on "older" kernels (before 2.6.37), where the DEF_SCSI_QCMD macro does not exist.
It looked similar enough to the official kernel changes, so I included that patch in my SlackBuild script and the VHBA module started working again without crashing my system!
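Here is how I picture that compatibility trick, again as a userspace sketch and not as real kernel code. MOCK_NEW_KERNEL is a made-up compile-time switch that stands in for "building against 2.6.37 or newer headers", and the lock_depth counter, the queuecommand_fn typedef, mock_dispatch and the demo_* names are assumptions of mine. With the switch defined, the macro exists and generates a locking wrapper; without it, the plain name is simply an alias for the _lck function and the caller does the locking, as the old mid-layer did.

```c
/* Userspace stand-ins for the kernel types (hypothetical, minimal). */
struct Scsi_Host {
    int lock_depth;                   /* 1 while the mock host lock is held */
};

struct scsi_cmnd {
    struct Scsi_Host *host;
    void (*scsi_done)(struct scsi_cmnd *);
    int lock_depth_seen;              /* lock depth observed by the driver */
};

/* Define MOCK_NEW_KERNEL (made-up switch) to model the newer headers. */
#ifdef MOCK_NEW_KERNEL
#define DEF_SCSI_QCMD(func_name)                                       \
    int func_name(struct Scsi_Host *shost, struct scsi_cmnd *cmd)      \
    {                                                                  \
        int rc;                                                        \
        shost->lock_depth++;                                           \
        rc = func_name##_lck(cmd, cmd->scsi_done);                     \
        shost->lock_depth--;                                           \
        return rc;                                                     \
    }
typedef int (*queuecommand_fn)(struct Scsi_Host *, struct scsi_cmnd *);
#else
typedef int (*queuecommand_fn)(struct scsi_cmnd *,
                               void (*)(struct scsi_cmnd *));
#endif

struct scsi_host_template {
    queuecommand_fn queuecommand;
};

static void demo_done(struct scsi_cmnd *cmd)
{
    (void)cmd;                        /* nothing to do in this sketch */
}

/* Driver body with the old signature; expects the lock to be held. */
static int demo_queuecommand_lck(struct scsi_cmnd *cmd,
                                 void (*done)(struct scsi_cmnd *))
{
    cmd->lock_depth_seen = cmd->host->lock_depth;
    done(cmd);
    return 0;
}

/* Same pattern as in the patches: wrapper if available, alias if not. */
#ifdef DEF_SCSI_QCMD
DEF_SCSI_QCMD(demo_queuecommand)
#else
#define demo_queuecommand demo_queuecommand_lck
#endif

static struct scsi_host_template demo_template = {
    .queuecommand = demo_queuecommand,
};

/* Mock mid-layer: locks around the call only in the "old kernel" case. */
int mock_dispatch(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
{
#ifdef MOCK_NEW_KERNEL
    return demo_template.queuecommand(shost, cmd);
#else
    int rc;

    shost->lock_depth++;
    rc = demo_template.queuecommand(cmd, cmd->scsi_done);
    shost->lock_depth--;
    return rc;
#endif
}

int run_compat_demo(void)
{
    struct Scsi_Host host = { 0 };
    struct scsi_cmnd cmd = { &host, demo_done, -1 };
    int rc = mock_dispatch(&host, &cmd);

    return rc == 0 && cmd.lock_depth_seen == 1 && host.lock_depth == 0;
}
```

Built with or without -DMOCK_NEW_KERNEL, run_compat_demo() should return 1: in both cases the driver body sees the lock held exactly once, which is why the same source can serve both kernel generations.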

RocketRaid module with similar behavior…

And then yesterday I saw a thread on LinuxQuestions about similar problems with the RocketRaid rr232x driver. Since it's also a "SCSI" driver and it also crashed, I suspected the same cause.

I created a patch based on my little research and posted it in the forum:

--- rr232x-linux-src-v1.10/osm/linux/osm_linux.c 2009-07-15 22:28:28.000000000 -0300
+++ rr232x-linux-src-v1.10_patched/osm/linux/osm_linux.c 2011-03-19 20:27:49.000000000 -0300
@@ -874,7 +874,7 @@
 }
 }
-static int hpt_queuecommand (Scsi_Cmnd * SCpnt, void (*done) (Scsi_Cmnd *))
+static int hpt_queuecommand_lck (Scsi_Cmnd * SCpnt, void (*done) (Scsi_Cmnd *))
 {
 struct Scsi_Host *phost = sc_host(SCpnt);
 PVBUS_EXT vbus_ext = get_vbus_ext(phost);
@@ -1408,6 +1408,12 @@
 return 0;
 }
+#ifdef DEF_SCSI_QCMD
+DEF_SCSI_QCMD(hpt_queuecommand)
+#else
+#define hpt_queuecommand hpt_queuecommand_lck
+#endif
+
 static int hpt_reset (Scsi_Cmnd *SCpnt)
 {
 PVBUS_EXT vbus_ext = get_vbus_ext(sc_host(SCpnt));

Today the original poster answered that it worked "like a charm" :)
This patch can be downloaded from .