PyTorch 2.4 introduces Intel Data Center GPU Max Series support, enhancing AI processing on Intel hardware
PyTorch 2.4 now includes initial support for the Intel Data Center GPU Max Series, improving AI workload processing on Intel hardware. The integration offers the same programming experience as other back ends, requiring minimal code changes, and extends PyTorch’s capabilities to stream-based devices. With Intel GPU support integrated into PyTorch, both eager mode and graph mode can fully run the Dynamo Hugging Face benchmarks.
Performance-critical graphs and operators are optimized using the oneAPI Deep Neural Network Library (oneDNN) and the oneAPI Math Kernel Library (oneMKL). In graph mode (torch.compile), the Intel GPU back end integrates Triton and applies Intel-specific optimizations. PyTorch 2.4 supports multiple data types on Intel GPUs, including FP32, BF16, FP16, and automatic mixed precision (AMP).
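The data-type support above can be sketched with standard PyTorch APIs. The snippet below runs a small model under automatic mixed precision; as an illustrative assumption, it falls back to the CPU when the current build has no XPU support, so on a Max Series machine the ‘xpu’ branch would be taken instead (graph mode would additionally wrap the model with torch.compile to route it through the Triton-based back end).

```python
import torch

# Select the Intel GPU when this build exposes one; otherwise fall back to
# CPU so the sketch runs anywhere (assumption for illustration only).
device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu"

model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
).to(device)

x = torch.randn(32, 64, device=device)

# Automatic mixed precision: matmul-heavy ops run in BF16, while
# numerically sensitive ops stay in FP32.
with torch.autocast(device_type=device, dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)   # torch.bfloat16
print(out.shape)   # torch.Size([32, 10])
```

The same autocast context works unchanged across back ends; only the device string differs.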
A new PyTorch Profiler based on Kineto and oneMKL is under development for PyTorch 2.5, aiming to enhance profiling capabilities for Intel GPUs. Users can migrate from CUDA to Intel GPUs with minimal changes, updating the device name from ‘cuda’ to ‘xpu’. Planned enhancements in PyTorch 2.5 include more ATen operators, full Dynamo Torchbench and TIMM benchmark support, and Intel GPU support in torch.profiler.
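The migration path described above amounts to swapping the device string. A minimal sketch, again assuming a CPU fallback so it runs on machines without an Intel GPU:

```python
import torch

# Migrating existing CUDA code mostly means changing the device string.
# Before: device = torch.device("cuda")
# After:
device = (
    torch.device("xpu")
    if hasattr(torch, "xpu") and torch.xpu.is_available()
    else torch.device("cpu")  # fallback so the sketch runs without an Intel GPU
)

# Model and tensors move to the new device exactly as they would with CUDA.
model = torch.nn.Linear(16, 4).to(device)
x = torch.randn(8, 16, device=device)
y = model(x)

print(y.shape)        # torch.Size([8, 4])
print(y.device.type)  # "xpu" on a Max Series GPU, "cpu" on the fallback
```

No other changes to the model definition, data movement, or training loop are required for this basic case.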
