They're solving the same "high level problem", but with very different approaches.
TensorRT is proprietary to Nvidia and Nvidia hardware. You'd take a {PyTorch, TensorFlow, <insert some other ML framework>} model and "export / convert" it into essentially a binary. Assuming all goes well (and in practice it rarely does, at least on the first try - more on this later), you now automatically leverage other Nvidia card features such as Tensor cores and can serve a model that runs significantly faster.
The problem is that TensorRT is exclusive to Nvidia. The APIs for more advanced ML techniques like deep learning optimization require significant lock-in, if they are even available in the first place. And all of this assumes they work as documented.
OpenXLA (and other players in the ecosystem like TVM) aim to "democratize" this so there is broader support both upstream (# of supported ML frameworks) and downstream (# of hardware accelerators other than Nvidia). It's yet another layer or two that ML compiler engineers need to stitch together, but once implemented, it can in theory apply a lot of optimization techniques largely independent of the hardware targets underneath.
Note that further down in the article they mention other compiler frameworks like MLIR. You could then hypothetically lower (compiler terminology) a model to a TensorRT MLIR dialect that in turn runs on the Nvidia GPU.
>You can then hypothetically lower (compiler terminology) it to a TensorRT MLIR dialect that then in turn runs on the Nvidia GPU.
there's no tensorrt dialect (there are nvgpu and nvvm dialects), nor would there be, as tensorrt is primarily a runtime (although arguably dialects like omp and spirv basically model runtime calls).
TensorFlow is also a runtime, yet we model its dataflow graph (the input to the runtime) as a dialect, same for ONNX. TensorRT isn't that different actually.