[openib-general] IB: I don't like what I'm seeing.

Troy Benjegerdes
Wed Mar 31 08:37:47 PST 2004


On Wed, Mar 31, 2004 at 07:12:26AM -0800, Roland Dreier wrote:
>     ron> I'd like to see if we can put a simple non-connection-based
>     ron> unreliable datagram stack for this thing and chunk most of
>     ron> this crap out. My faith in it working at scale is basically
>     ron> 0.
> 
> Not necessarily a bad idea if we want to start from scratch.  However,
> using unreliable datagrams (UD) means ignoring most of the performance
> features of IB, since you're limited to single packet messages and
> have to take care of all the reliable delivery in software.  You'd
> probably be lucky to get 1/3 the performance you can get by using
> reliable connected transport.

Maybe so, but if the hardware 'reliable delivery' works intermittently 
with 896 nodes in a network, nobody cares about performance.

Has *anyone* heard of a 1000+ node InfiniBand network that has managed
to stay up and run compute jobs that last more than a day?

Unreliable datagram allows you to:

1) determine network congestion/packet drops/corruption/retransmits on
an end-to-end basis, making it easier to evaluate how well the hardware
and cabling are actually doing. Taking advantage of all the 'reliable
transport' features means it just goes REAL SLOW and nobody quite knows
why.

2) work around (some) network hardware/scaling problems in software
that YOU control and can tune. That software is generally much simpler
than "vendor provided" solutions, which necessarily try to solve the
general market's problems first.
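
A rough sketch of the kind of end-to-end accounting I mean in (1), in
Python for brevity. Everything here is made up for illustration: the
class name, the counters, and the (seq, payload, crc) datagram layout
are assumptions, not any existing IB verbs or access-layer API. The
only point is that with sequence numbers and a checksum carried in each
datagram, the receiver can see drops, duplicates and corruption for
itself.

```python
import zlib


class UdReceiverStats:
    """Per-peer, end-to-end counters for a datagram stream (hypothetical)."""

    def __init__(self):
        self.next_seq = 0      # lowest sequence number not yet seen
        self.missing = set()   # sequence numbers skipped so far
        self.received = 0
        self.duplicates = 0
        self.corrupt = 0

    def on_datagram(self, seq, payload, crc):
        if zlib.crc32(payload) != crc:
            self.corrupt += 1              # damaged somewhere in flight
        elif seq >= self.next_seq:
            # Anything between next_seq and seq was dropped or is late.
            self.missing.update(range(self.next_seq, seq))
            self.next_seq = seq + 1
            self.received += 1
        elif seq in self.missing:
            self.missing.discard(seq)      # late arrival or retransmit
            self.received += 1
        else:
            self.duplicates += 1           # already delivered once

    def report(self):
        return {
            "received": self.received,
            "missing": len(self.missing),
            "duplicates": self.duplicates,
            "corrupt": self.corrupt,
        }


def make_datagram(seq, payload):
    """Build the assumed (seq, payload, crc) tuple for one datagram."""
    return seq, payload, zlib.crc32(payload)


rx = UdReceiverStats()
for seq in (0, 1, 3):                            # 2 is "dropped"
    rx.on_datagram(*make_datagram(seq, b"data"))
rx.on_datagram(*make_datagram(1, b"data"))       # duplicate of 1
rx.on_datagram(4, b"dbta", zlib.crc32(b"data"))  # corrupted payload
print(rx.report())
```

The same counters, kept per peer, are what a software retransmit policy
for (2) would key off of. The retransmit side is left out here.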

I don't think we have to start from scratch, but we do need to
concentrate an effort on scalability that throws away all the ULPs and
connection management, etc., and just makes sure the driver and *USER*
level access layer can scale to large node counts, and verifies that
the hardware works reliably.

In the interest of parallelism ;) , we should also have another effort
that works on the different ULPs and the 'whole stack', but Ron
shouldn't have to know anything about it, or the complexity involved.





