PyTorch 2.12 brings 100x faster CUDA linalg.eigh and unified graph API

PyTorch 2.12 delivers up to 100x faster batched eigendecomposition on CUDA by overhauling the linalg.eigh backend.
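For readers unfamiliar with the call, here is a minimal sketch of a batched eigendecomposition with torch.linalg.eigh. The batch size, matrix size, and variable names are illustrative assumptions, and the snippet falls back to CPU when no CUDA device is present. It does not measure the speedup claimed above; it only shows the call whose backend was reworked.

```python
import torch

# Use CUDA when available; fall back to CPU so the sketch runs anywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)

# A batch of 64 symmetric 8x8 matrices (sizes chosen only for illustration).
a = torch.randn(64, 8, 8, dtype=torch.float64, device=device)
sym = (a + a.transpose(-1, -2)) / 2

# eigh handles the whole batch in one call and returns eigenvalues in
# ascending order together with the corresponding eigenvectors.
evals, evecs = torch.linalg.eigh(sym)

# Sanity check: V diag(w) V^T should reconstruct the input matrices.
recon = evecs @ torch.diag_embed(evals) @ evecs.transpose(-1, -2)
max_err = (recon - sym).abs().max().item()
print(evals.shape, evecs.shape)
```

The leading dimension is treated as the batch, so a single call covers every matrix; this batched path is the case the release summary describes as faster on CUDA.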

The release introduces the torch.accelerator.Graph API, enabling unified graph capture and replay across CUDA, XPU, and custom backends.
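The summary does not give the signature of torch.accelerator.Graph, so the sketch below does not use it. As a point of reference, it shows the capture-and-replay pattern with the existing CUDA-only torch.cuda.CUDAGraph API, which is presumably what the new interface generalizes across backends. The function name capture_and_replay is hypothetical, and the function returns None on machines without CUDA.

```python
import torch

def capture_and_replay():
    """Capture a tiny computation in a CUDA graph, then replay it on new input."""
    if not torch.cuda.is_available():
        return None  # graph capture needs a CUDA device

    # Graphs replay on fixed memory, so inputs live in a static buffer.
    static_in = torch.zeros(4, device="cuda")

    # Warm up on a side stream before capture, as the PyTorch docs recommend.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        static_out = static_in * 2 + 1
    torch.cuda.current_stream().wait_stream(side)

    # Capture: the kernels are recorded into the graph, not executed eagerly.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph):
        static_out = static_in * 2 + 1

    # Replay: copy new data into the static input, then rerun the recorded work.
    static_in.copy_(torch.arange(4.0, device="cuda"))
    graph.replay()
    return static_out.cpu()

result = capture_and_replay()
```

Replay reuses the recorded kernels and memory addresses, which is why new inputs are copied into the static buffer rather than passed as fresh tensors.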

torch.export.save now supports Microscaling quantization formats, allowing export of more aggressively compressed models.

by Fla

PyTorch enables fast, flexible experimentation and efficient production through a hybrid front-end, distributed training, and ecosystem of tools and libraries.
