Working Notes: a commonplace notebook for recording & exploring ideas.
Home. Site Map. Subscribe. More at expLog.
cpu is not the center anymore, even nics are intelligent
compute outstrips improvements in memory & network bandwidth exponentially
eg. 36k improvement in compute for tensor cores, vs 6x improvement in data movement speed
gpu centric == reduce cpu involement in multi gpu comms to minimize overhead
comms options
most modern: gpus directly talk to the nic and bypass gpu entirely
NVLink is not self-routed; nv switch provides a2a routing
gpu aware mpi
Follow up
— Kunal