[openib-general] IB: I don't like what I'm seeing.

Roland Dreier
Wed Mar 31 12:07:03 PST 2004


    ron> we don't believe in HCA reliability here. It has not worked
    ron> once in all the years of delivered networks. We're going to
    ron> assume, unless we can see BER of 10^-21 app-to-app, that the
    ron> network is unreliable. So, yes, toss and start over is not
    ron> inconceivable.

IB's reliable transports do not assume an error-free network.
The reliable connection (RC) transport operates similarly to TCP in
that it detects and retransmits dropped or corrupted packets.
However, the transport is implemented in the HCA hardware, so it is
possible to achieve transfer rates limited only by the host bus or the
IB link (8 Gb/sec) with essentially zero CPU use.
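
As a rough illustration of the detect-and-retransmit idea, here is a
minimal stop-and-wait sketch in Python. It is not the IB wire protocol
and not how an HCA implements RC: the packet layout, the lossy_link
channel, and every function name are hypothetical, acknowledgements are
assumed to arrive reliably, and zlib's CRC32 merely stands in for the
link's real checksums.

```python
import random
import zlib


def make_packet(seq, payload):
    """Prefix a 4-byte sequence number and append a CRC32 over both."""
    body = seq.to_bytes(4, "big") + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def check_packet(packet):
    """Return (seq, payload) if the CRC matches, otherwise None."""
    if len(packet) < 8:
        return None
    body, crc = packet[:-4], packet[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != crc:
        return None
    return int.from_bytes(body[:4], "big"), body[4:]


def lossy_link(packet, rng, drop_rate, corrupt_rate):
    """Hypothetical channel: may drop the packet or flip one bit in it."""
    r = rng.random()
    if r < drop_rate:
        return None
    if r < drop_rate + corrupt_rate:
        damaged = bytearray(packet)
        damaged[rng.randrange(len(damaged))] ^= 1 << rng.randrange(8)
        return bytes(damaged)
    return packet


def reliable_send(payloads, rng, drop_rate=0.1, corrupt_rate=0.1,
                  max_tries=1000):
    """Resend each packet until it arrives intact (acks assumed reliable)."""
    received = []
    sends = 0
    for seq, payload in enumerate(payloads):
        for _ in range(max_tries):
            sends += 1
            arrived = lossy_link(make_packet(seq, payload), rng,
                                 drop_rate, corrupt_rate)
            if arrived is None:
                continue  # dropped: sender times out and retransmits
            result = check_packet(arrived)
            if result is None or result[0] != seq:
                continue  # corrupted: receiver discards, sender retransmits
            received.append(result[1])
            break
        else:
            raise RuntimeError("gave up on packet %d" % seq)
    return received, sends


if __name__ == "__main__":
    data = [b"alpha", b"beta", b"gamma"]
    delivered, sends = reliable_send(data, random.Random(0))
    print(delivered == data, "sends:", sends)
```

The application only ever sees intact, in-order payloads; errors on the
link show up as extra sends, not as corrupted data.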

I don't think chip bugs that corrupt data with RC are any more likely
than bugs that corrupt data with UD, or for that matter much more
likely than data corruption bugs in the host chipset or CPU.

(IB networks do tend to have low bit error rates, but they're probably
only in the 10^-15 to 10^-18 range.)
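
To put those exponents in perspective, the sketch below converts a bit
error rate into the mean time between bit errors on a single link. It
assumes the link runs flat out at the 8 Gb/sec figure mentioned above
and that errors are independent; the function name is made up for this
example.

```python
def seconds_between_bit_errors(ber, bits_per_second):
    """Mean time between bit errors on a fully loaded link."""
    return 1.0 / (ber * bits_per_second)


LINK_RATE = 8e9  # 8 Gb/sec, assumed to be saturated the whole time

if __name__ == "__main__":
    for exponent in (15, 18, 21):
        secs = seconds_between_bit_errors(10.0 ** -exponent, LINK_RATE)
        print("BER 10^-%d: one bit error every %.3g hours"
              % (exponent, secs / 3600.0))
```

Under these assumptions a BER of 10^-15 works out to roughly one bit
error every 35 hours per saturated link, and each factor of 1000 in BER
stretches that interval by a factor of 1000.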

 - Roland
