[openib-general] Loading / unloading IB modules

makia at llnl.gov
Fri Mar 26 13:11:07 PST 2004


Since I'm new here, and have recently put openib onto my RHEL3 cluster, 
I figured I'd post in this thread because I've seen the issues of 
unloading as well.  First off, though, I've taken a look at the 
modprobe.conf.infiniband (which doesn't work for me because I use a 
standard modules.conf) and re-arranged it to look like the following:

:::<START>:::
# Infiniband
alias infiniband_hca ib_tavor

alias ib0 ib_ipoib
alias ib1 ib_ipoib

alias net-pf-26 ib_sdp

#infiniband chain
add above ib_ipoib ib_useraccess ib_ip2pr
add above ib_ip2pr ib_udapl
add above ib_udapl ib_useraccess_cm

post-install ib_ipoib /sbin/update_ipoib_intfs
post-install ib_ip2pr /sbin/create_ip2pr_devs
post-install ib_udapl /sbin/create_udapl_devs
post-install ib_useraccess /sbin/create_useraccess_devs
post-install ib_useraccess_cm /sbin/create_useraccess_cm_devs

post-install mod_vipkl /sbin/create_vipkl_devs
post-install mosal /sbin/create_mosal_devs
post-install mst_pci /sbin/create_mst_pci_devs
post-install mst_pciconf /sbin/create_mst_pciconf_devs
:::<END>:::
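
For anyone trying to follow the "infiniband chain" lines: here is a minimal sketch of how I read them, assuming the usual modutils meaning of `add above` (the listed modules get pulled in on top of the named one). The helper names `parse_above` and `pulled_in` are made up for illustration, and the order modprobe really uses may differ from this simple depth-first walk.

```python
from collections import defaultdict

# The chain-related lines from the modules.conf fragment above.
MODULES_CONF = """\
alias ib0 ib_ipoib
add above ib_ipoib ib_useraccess ib_ip2pr
add above ib_ip2pr ib_udapl
add above ib_udapl ib_useraccess_cm
"""


def parse_above(conf_text):
    """Collect 'add above <module> <list...>' directives into a dict."""
    above = defaultdict(list)
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if fields[:2] == ["add", "above"] and len(fields) >= 4:
            above[fields[2]].extend(fields[3:])
    return dict(above)


def pulled_in(module, above):
    """Depth-first list of modules stacked on top of `module` (assumed order)."""
    order = []
    seen = set()

    def visit(name):
        for upper in above.get(name, []):
            if upper not in seen:
                seen.add(upper)
                order.append(upper)
                visit(upper)

    visit(module)
    return order


above = parse_above(MODULES_CONF)
print(pulled_in("ib_ipoib", above))
```

Under that reading, loading ib_ipoib (through the ib0 alias) also brings in ib_useraccess, ib_ip2pr, ib_udapl and ib_useraccess_cm, which matches the module list further down.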

Doing a modprobe ib0 seems to get everything loaded in a nice fashion 
(of course, there are a lot of modules):

:::<START>:::
ib_useraccess          10180   0  (unused)
ib_useraccess_cm       15808   0  (unused)
ib_cm                  46680   0  [ib_useraccess_cm]
ib_udapl               40920   0  (unused)
ib_ip2pr               29500   0  [ib_useraccess_cm ib_udapl]
ib_ipoib               58380   1  [ib_udapl ib_ip2pr]
ib_sa_client           29076   0  [ib_udapl ib_ip2pr ib_ipoib]
ib_client_query        12736   0  [ib_udapl ib_ip2pr ib_ipoib ib_sa_client]
ib_tavor               23940   0  (autoclean) [ib_useraccess_cm]
mod_vapi              129892   0  (autoclean) [ib_useraccess_cm ib_udapl ib_tavor]
mod_vipkl             219360   0  (autoclean) [mod_vapi]
mod_thh               257920   0  (autoclean) [mod_vapi]
mod_hh                 15608   0  (autoclean) [mod_vipkl mod_thh]
mod_mpga               23584   0  (autoclean) [mod_vapi]
mod_vapi_common        65276   0  (autoclean) [ib_useraccess_cm ib_udapl ib_tavor mod_vapi mod_vipkl mod_thh]
mosal                 110053   0  (autoclean) [mod_vapi mod_vipkl mod_thh mod_mpga mod_vapi_common]
ib_mad                 21068   0  [ib_useraccess ib_cm ib_client_query]
ib_poll                14616   0  [ib_cm ib_ip2pr ib_client_query]
ib_core                47636   0  [ib_useraccess ib_useraccess_cm ib_cm ib_udapl ib_ip2pr ib_ipoib ib_sa_client ib_tavor ib_mad]
ib_packet_lib         147024   0  [ib_mad ib_core]
ib_services            16932   0  [ib_useraccess ib_useraccess_cm ib_cm ib_udapl ib_ip2pr ib_ipoib ib_sa_client ib_client_query ib_tavor ib_mad ib_poll ib_core ib_packet_lib]
:::<END>:::
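
The bracketed "used by" column is what decides a safe removal order: a module can only go once everything listed in its brackets is gone. A rough sketch of working that order out from the listing, using the first eight lines above as sample input (the function names `parse_lsmod` and `unload_order` are hypothetical, and this ignores use counts and autoclean):

```python
import re

# First eight lines of the listing above; every "used by" entry is in this subset.
LSMOD_SAMPLE = """\
ib_useraccess          10180   0  (unused)
ib_useraccess_cm       15808   0  (unused)
ib_cm                  46680   0  [ib_useraccess_cm]
ib_udapl               40920   0  (unused)
ib_ip2pr               29500   0  [ib_useraccess_cm ib_udapl]
ib_ipoib               58380   1  [ib_udapl ib_ip2pr]
ib_sa_client           29076   0  [ib_udapl ib_ip2pr ib_ipoib]
ib_client_query        12736   0  [ib_udapl ib_ip2pr ib_ipoib ib_sa_client]
"""


def parse_lsmod(text):
    """Map each module name to the list of modules that use it."""
    modules = {}
    for line in text.splitlines():
        m = re.match(r"(\S+)\s+(\d+)\s+(-?\d+)(.*)", line)
        if not m:
            continue  # header or blank line
        users = re.search(r"\[([^\]]*)\]", m.group(4))
        modules[m.group(1)] = users.group(1).split() if users else []
    return modules


def unload_order(modules):
    """Return an order in which every module comes after all of its users."""
    remaining = dict(modules)
    order = []
    while remaining:
        ready = [name for name, users in remaining.items()
                 if not any(u in remaining for u in users)]
        if not ready:
            raise ValueError("circular references among: %s" % sorted(remaining))
        for name in ready:
            order.append(name)
            del remaining[name]
    return order


for name in unload_order(parse_lsmod(LSMOD_SAMPLE)):
    print(name)
```

Note that ib_ipoib shows a use count of 1 in the listing, so I would expect it to refuse removal until whatever holds that reference (presumably the configured ib0 interface) is taken down; the sketch does not model that.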

Unloading the modules, I see the following on the console (paired up 
with the usual system hang/crash):

:::<START>:::
[KERNEL_IB][tsIbTavorQpDestroy][tavor_qp.c:562]InfiniHost0: VAPI_destroy_qp failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorQpDestroy][tavor_qp.c:562]InfiniHost0: VAPI_destroy_qp failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorQpDestroy][tavor_qp.c:562]InfiniHost0: VAPI_destroy_qp failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorQpDestroy][tavor_qp.c:562]InfiniHost0: VAPI_destroy_qp failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorMemoryDeregister][tavor_mr.c:126]InfiniHost0: VAPI_deregister_mr failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorCqDestroy][tavor_cq.c:124]InfiniHost0: EVAPI_clear_comp_eventh failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorCqDestroy][tavor_cq.c:131]InfiniHost0: VAPI_destroy_cq failed, return code = -244 (Invalid HCA Handle.)
[KERNEL_IB][tsIbTavorPdDestroy][tavor_pd.c:76]InfiniHost0: VAPI_dealloc_pd failed, return code = -244 (Invalid HCA Handle.)
 VIPKL(1): em.c[88]: EM delete:found unreleased async object
 VIPKL(1): qpm.c[156]: QPM delete: found unreleased qp in array

 VIPKL(1): qpm.c[156]: QPM delete: found unreleased qp in array

 VIPKL(1): qpm.c[156]: QPM delete: found unreleased qp in array

 VIPKL(1): qpm.c[156]: QPM delete: found unreleased qp in array

 VIPKL(1): cqm.c[62]: CQM delete:found unreleased cq
 VIPKL(1): cqm.c[62]: CQM delete:found unreleased cq
 VIPKL(1): cqm.c[62]: CQM delete:found unreleased cq
 VIPKL(1): mmu.c[97]: MM delete:found unreleased mr
 VIPKL(1): mmu.c[97]: MM delete:found unreleased mr
 VIPKL(1): mmu.c[97]: MM delete:found unreleased mr
 VIPKL(1): pdm.c[44]: PDM delete:found unreleased pd
 VIPKL(1): pdm.c[44]: PDM delete:found unreleased pd
 VIPKL(1): pdm.c[44]: PDM delete:found unreleased pd
 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1384]: found unreleased internal mr!!!!

 THH(1): tmrwm.c[1409]: found unreleased mr!!!!

 THH(1): tmrwm.c[1409]: found unreleased mr!!!!

 THH(1): tmrwm.c[1409]: found unreleased mr!!!!

 THH(1): thh_mod_obj.c[378]: cleanup_module: destroying InfiniHost0
THH kernel module removed successfully
Unable to handle kernel paging request at virtual address f8d16183
 printing eip:
f8d16183
*pde = 36cd8067
*pte = 00000000
Oops: 0000
ib_ipoib ib_sa_client ib_client_query ib_mad ib_poll ib_core 
ib_packet_lib ib_services nfs lockd sunrpc e1000 microcode
CPU:    0
EIP:    0060:[<f8d16183>]    Not tainted
EFLAGS: 00010282

EIP is at __insmod_ib_mad_S.data_L288 [ib_mad] 0x111063 (2.4.21-9.EL-IB_patches/i686)
eax: f73a2d80   ebx: f65e9d00   ecx: f65ec800   edx: 00000000
esi: f8d16183   edi: f661f9b8   ebp: f6495edc   esp: f6495ea8
ds: 0068   es: 0068 00 f90bb12a f661f808 f661f808 f661f808
Call Trace:   [<c010cd9d>
:::<END>:::

(Sorry about formatting... hopefully this will help, though.)

        Ignaz> Is there a script available to load / unload
        Ignaz> the necessary InfiniBand modules by hand?

I am using the attached script for unloading the modules.
Please note that you need to kill programs using the stack
before unloading the stack modules.
The script takes care of shutting down the ipoib interface,
but if you mount a filesystem over srp you would need to bring
this down as well.

Chanan 
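
The attached script did not survive in the archive, so the following is only a guess at the shape of the sequence described: interfaces down first, then modules removed users-first. It just builds and prints the commands rather than running them; `build_unload_commands` is a hypothetical helper, and the interface and module names are taken from the listing earlier in the thread.

```python
def build_unload_commands(interfaces, modules):
    """Return the commands to run, in order: interfaces down, then rmmod each module.

    `modules` must already be ordered so that each module comes after
    every module that uses it.
    """
    commands = [["/sbin/ifconfig", ifname, "down"] for ifname in interfaces]
    commands += [["/sbin/rmmod", name] for name in modules]
    return commands


# Dry run: print the commands instead of executing them.
plan = build_unload_commands(
    ["ib0", "ib1"],
    ["ib_useraccess_cm", "ib_udapl", "ib_ip2pr", "ib_ipoib"],
)
for cmd in plan:
    print(" ".join(cmd))
```

Killing programs that use the stack, and unmounting anything mounted over srp, still has to happen before any of this and is not covered by the sketch.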




