[openib-general] udapl:al:query_req_cb() !ERROR!: query failed:

Sears, Steven
Mon Apr 26 05:33:18 PDT 2004


I didn't see a reply to this, but I can offer a couple of insights.

The only significant error is:

> 
> and opensm.when i run a client on score2.ustb.cn,it has these error: 
> > > > 
> > > >  [root@score2 dapltest]# ./bw.sh 

> > > > WARNING: <score2-ib0> not registered in DNS, using dummy IP value 

DAPL needs to know the IP address of the initiating side. Current DAPL implementations get it from the HCA driver. Older implementations, such as the OpenIB stack you are using, obtain it by appending "-ib0" to the hostname and then invoking getaddrinfo() on the result to get the IP address.

You are running on a machine named "score2", so this simple algorithm will create the local name "score2-ib0", which is what you see in the error string above.

It appears that you don't have an entry for 'score2-ib0' in your DNS or in your /etc/hosts file. Add an entry with the correct IP address and things will probably work. Or you'll get to the next bug ;-).
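
If you want to check the lookup by hand, here is a rough sketch of the name derivation described above. It is not the actual DAPL source; the function names ib_name and resolve_ib_address are made up for this example, and stripping the domain part of the hostname is my assumption, based only on the "score2-ib0" name in the warning.

```python
import socket


def ib_name(hostname, suffix="-ib0"):
    """Build the interface name the older lookup is described as using."""
    # Assumption: only the short host name is used,
    # e.g. "score2.ustb.cn" -> "score2" -> "score2-ib0".
    short = hostname.split(".", 1)[0]
    return short + suffix


def resolve_ib_address(hostname, resolver=socket.getaddrinfo):
    """Return the first IPv4 address for <hostname>-ib0, or None if unresolved."""
    name = ib_name(hostname)
    try:
        results = resolver(name, None, socket.AF_INET)
    except socket.gaierror:
        # The name is in neither DNS nor /etc/hosts.
        return None
    # Each result is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is (address, port).
    return results[0][4][0] if results else None


if __name__ == "__main__":
    host = socket.gethostname()
    addr = resolve_ib_address(host)
    if addr is None:
        print(f"{ib_name(host)} does not resolve; add it to DNS or /etc/hosts")
    else:
        print(f"{ib_name(host)} -> {addr}")
```

Run it on the client machine. If it reports that the name does not resolve, that matches the WARNING above, and adding the entry should make it print the address of the ib0 interface instead.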


<<snip>>

> > > > cl_open_device: opening device /dev/mvdapl8cd30a0 
> > > > cl_open_dev: error opening /dev/mvdapl8cd30a0 (No such file or directory) 

I don't know what this is all about, but things keep going so perhaps you can ignore it for now?

> > > > --> DsMI: Init MRDB failed = 0x1 
> > > > DT_cs_Client: IA IbalHca0 opened 

<<snip>>

> > > > DT_cs_Client: Connect Endpoint 
> > > > al:query_req_cb() !ERROR!: query failed: IB_REMOTE_ERROR 
> > > > --> DiISQC: SA query callback failed status IB_REMOTE_ERROR 
> > > > --> DsINMG: query SA found no record 
> > > > --> DsIC: fail to map remote_ia_addr (sa_family 2) to gid 

This error is because the local IP address was not determined, as I outlined above.


> > > > DT_cs_Client: Cannot connect Endpoint DAT_INVALID_PARAMETER 
> > > > DT_cs_Client: Cleaning Up ... 
> > > > DT_cs_Client: dat_ep_disconnect (abrupt) error: DAT_INVALID_STATE 
> > > > mlnx_poll_cq() [ 
> > > > mlnx_poll_cq() ] 

<<snip>>

> > > > al:sync_destroy_obj() !ERROR!: Error waiting for references to be released. 
> > > >  Forcing shutdown now.  Ref_cnt = 1 
> > > > al:sync_destroy_obj() !ERROR!: 0x8077200(AL_OBJ_TYPE_H_AL) 
> > > > dapltest: al_common.c:504: async_destroy_cb: Assertion `!p_obj->ref_cnt' failed. 

... and finally, this set of errors is because the DAPL implementation is out of date; there are still some bugs with cleanup/close.


 -Steve
