jauntywundrkind 2 days ago

That so much of the document is about how Ultra Ethernet maps to libfabric rather confirms that premise! There's also a credit-based flow control system, and connection manager roles, which are also easily identifiable InfiniBand concepts.

I'm interested to see what the optional hardware features are, and how they relate to the different UE profiles (AI Base, AI Full, and HPC).

> The Ultra Ethernet Transport (UET) layer is designed to handle the most challenging application scale, deliver packets reliably and securely, manage and avoid congestion within the network, and react to contention at the endpoints. Its goals are minimal tail latency and highest network utilization. At the same time, UET is designed to enable simple hardware and software implementations – such as what might be required for accelerator-integrated endpoints. UET can be programmed through the OFI libfabric standard interface. It sets out to address the shortcomings of RoCEv2, specifically its semantics, transport layer, wire operations, implementation complexities, and scale limits

deaddodo 2 days ago

Self-admittedly less knowledgeable about this subject, but how does this differ from IBoE?

jauntywundrkind 2 days ago

The end of the quote I provided does some very high-level contrasting with the well-known RoCE v2 (RDMA over Converged Ethernet), which is the most popular IBoE-like thing.

I'm sure there's a lot more nuance to it all. But I think predictability/utilization/latency in RoCE are worse, and that it relies more on Explicit Congestion Notification for flow control. Whereas UE is using InfiniBand-style credit-based flow control, which should ensure that any data sent has sufficient throughput allocated to it to be received.
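
To make the credit idea concrete, here's a toy sketch in Python of generic credit-based flow control. It is not taken from the UE spec or from InfiniBand's actual link-level mechanism; every name in it (`Receiver`, `Sender`, `simulate`, the one-credit-per-buffer-slot rule) is my own assumption for illustration. The point it shows is just that a sender which only transmits against granted credits can never overrun the receiver's buffer, so nothing needs to be dropped or marked after the fact.

```python
from collections import deque


class Receiver:
    """Toy receiver: one credit corresponds to one free buffer slot (assumption)."""

    def __init__(self, buffer_slots):
        self.free_slots = buffer_slots
        self.queue = deque()

    def grant_credits(self):
        # Advertise every currently free slot as a credit; those slots are now reserved.
        granted = self.free_slots
        self.free_slots = 0
        return granted

    def receive(self, packet):
        # A credited packet always has a reserved slot waiting for it.
        self.queue.append(packet)

    def drain(self, n):
        # The application consumes up to n packets, which frees their slots.
        consumed = min(n, len(self.queue))
        for _ in range(consumed):
            self.queue.popleft()
        self.free_slots += consumed
        return consumed


class Sender:
    """Toy sender: may only transmit while it holds at least one credit."""

    def __init__(self):
        self.credits = 0
        self.sent = 0

    def add_credits(self, n):
        self.credits += n

    def try_send(self, receiver, packet):
        if self.credits == 0:
            return False  # no credit, so hold the packet instead of sending
        self.credits -= 1
        receiver.receive(packet)
        self.sent += 1
        return True


def simulate(total_packets, buffer_slots, drain_per_round):
    """Run rounds of grant -> send -> drain until every packet is delivered."""
    if buffer_slots <= 0 or drain_per_round <= 0:
        raise ValueError("buffer_slots and drain_per_round must be positive")
    rx, tx = Receiver(buffer_slots), Sender()
    next_packet = rounds = max_queue = 0
    while next_packet < total_packets:
        tx.add_credits(rx.grant_credits())
        while next_packet < total_packets and tx.try_send(rx, next_packet):
            next_packet += 1
        max_queue = max(max_queue, len(rx.queue))
        rx.drain(drain_per_round)
        rounds += 1
    return {"sent": tx.sent, "max_queue": max_queue, "rounds": rounds}
```

Calling `simulate(10, 4, 2)` delivers all 10 packets while the receive queue never grows past its 4 slots. An ECN-style scheme, by contrast, lets the sender transmit first and only slows it down once congestion has already been observed.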

UE seems to be a more direct creation of an InfiniBand-like network, atop Ethernet but where all players are agreeing to behave in an InfiniBand-like way with InfiniBand predictability, whereas RoCE encapsulates InfiniBand data but still behaves more like an Ethernet network lacking the coordination of InfiniBand. I'm far from certain; it'd be so fun to have some extensive material to go over to really find out.