DirectML Update Expands AI Model Support with Phi 3 and Mistral v0.2, Boosts Windows Device Scalability Through Advanced Quantization

DirectML now supports Phi 3 mini and medium, plus Mistral v0.2, widening the range of Windows devices that can run them. With Activation-Aware Quantization, developers can run models on more devices with minimal accuracy loss.

DirectML’s Leap Forward with Quantization

DirectML now offers quantized versions of Phi 3 mini and adds support for Phi 3 medium and Mistral v0.2, a significant step for the scalability of AI models on Windows. The update improves performance and makes advanced models accessible to more developers.

What’s New?

The update adds quantized versions of Phi 3 mini and extends support to Phi 3 medium and Mistral v0.2. A new Gradio interface, built on the new ONNX Runtime Generate() API, makes these models easier to test.

Major Updates

Developers now have access to pre-quantized models, including the 4k and 128k variants, which improves both performance and accessibility.

Importance of Quantization

Quantization addresses the challenge of memory bandwidth in running models on entry-level and older hardware. By reducing model size, it significantly widens the range of devices capable of supporting complex language models.
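To make the size reduction concrete, here is a rough back-of-the-envelope sketch. It assumes Phi 3 mini has about 3.8 billion parameters (its publicly stated size, not a figure from this post) and counts only the weights; quantization scales, activations, and the KV cache would add to both totals.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Assumption: roughly 3.8 billion parameters for Phi 3 mini.
PHI3_MINI_PARAMS = 3.8e9

fp16_gib = weight_memory_gib(PHI3_MINI_PARAMS, 16)  # 16-bit floating point
int4_gib = weight_memory_gib(PHI3_MINI_PARAMS, 4)   # 4-bit quantized

print(f"FP16 weights: {fp16_gib:.2f} GiB")
print(f"INT4 weights: {int4_gib:.2f} GiB")
print(f"Reduction:    {fp16_gib / int4_gib:.1f}x")
```

Because every generated token has to read the weights from memory, a smaller footprint helps with bandwidth as well as capacity, which is why the benefit shows up on older and integrated GPUs.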

“Our goal is to ensure scalability, while also maintaining model accuracy.” – Jacques van Rhyn

This approach not only aids in overcoming hardware limitations but also ensures minimal impact on model accuracy through Activation-Aware Quantization (AWQ).

Understanding AWQ

AWQ is a quantization technique that focuses on preserving model accuracy while achieving memory efficiency. It quantizes about 99% of the weights while protecting the roughly 1% that matter most for accuracy, which it identifies from the model's activations.
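The idea can be illustrated with a small, simplified sketch. This is not DirectML's or ONNX Runtime's implementation: it is a toy version of the description above that picks the input channels with the largest average activation magnitude, leaves those weights untouched, and rounds everything else to 4 bits. All function names are hypothetical, and each row uses a single quantization scale for brevity. Production AWQ implementations generally avoid storing mixed precision and instead rescale the important channels before quantizing.

```python
import random


def quantize_int4(values):
    """Symmetric round-to-nearest 4-bit quantization of one group of weights.

    Returns the dequantized values so the rounding error can be inspected.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / 7  # map the largest magnitude to integer level 7
    return [max(-8, min(7, round(v / scale))) * scale for v in values]


def salient_channels(activations, fraction=0.01):
    """Indices of the input channels with the largest mean |activation|."""
    num_channels = len(activations[0])
    mean_abs = [
        sum(abs(row[c]) for row in activations) / len(activations)
        for c in range(num_channels)
    ]
    keep = max(1, round(num_channels * fraction))
    ranked = sorted(range(num_channels), key=lambda c: mean_abs[c], reverse=True)
    return set(ranked[:keep])


def activation_aware_quantize(weights, activations, fraction=0.01):
    """Quantize each weight row to 4 bits, skipping the salient input channels.

    weights:     rows of shape [out_features][in_features]
    activations: calibration samples of shape [num_samples][in_features]
    """
    protected = salient_channels(activations, fraction)
    result = []
    for row in weights:
        rest = [c for c in range(len(row)) if c not in protected]
        quantized_rest = quantize_int4([row[c] for c in rest])
        new_row = list(row)  # protected channels keep their original values
        for c, q in zip(rest, quantized_rest):
            new_row[c] = q
        result.append(new_row)
    return result, protected


# Synthetic data: channels 3 and 17 see much larger activations than the rest.
random.seed(0)
in_features, out_features = 200, 4
weights = [[random.gauss(0, 1) for _ in range(in_features)]
           for _ in range(out_features)]
activations = [
    [random.gauss(0, 1) * (20 if c in (3, 17) else 1) for c in range(in_features)]
    for _ in range(32)
]

quantized, protected = activation_aware_quantize(weights, activations)
print("Protected channels:", sorted(protected))
```

With 200 input channels and a 1% budget, two channels are protected; the remaining 198 weights in each row are snapped to at most 15 distinct levels.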

“Thanks to the significant memory wins from AWQ, Phi-3-mini runs at this speed or faster on older discrete GPUs and even laptop integrated GPUs.” – Patrice Vignola

Perplexity Measurements: A Closer Look

Perplexity scores play a crucial role in evaluating model predictions. A lower score indicates a model’s higher certainty in its predictions, reflecting a closer alignment with the true data distribution.
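As a minimal sketch of how such a score is computed: perplexity is the exponential of the average negative log-probability that the model assigned to the tokens that actually occurred. The probabilities below are made up for illustration and are not measurements from any of the models discussed here.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability of the true tokens)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical probabilities two models assign to the same four reference tokens.
confident = [0.9, 0.8, 0.95, 0.85]
uncertain = [0.3, 0.2, 0.25, 0.4]

print(f"Confident model: {perplexity(confident):.2f}")
print(f"Uncertain model: {perplexity(uncertain):.2f}")
```

A useful reference point: a model that spreads its probability evenly over k choices at every step has a perplexity of exactly k. Comparing a quantized model's perplexity with the original's on the same text gives a measure of how much accuracy the quantization cost.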

With these enhancements, DirectML is set to revolutionize the way developers approach model scalability and performance on Windows. The integration of quantization techniques like AWQ not only broadens the accessibility of AI models but also ensures a balance between efficiency and accuracy.

As we continue to witness advancements in AI and machine learning, DirectML’s commitment to innovation remains a beacon for developers looking to push the boundaries of what’s possible on Windows platforms.

  • DirectML’s latest update introduces support for Phi 3 mini and medium, alongside Mistral v0.2, broadening model accessibility.
  • Activation-Aware Quantization (AWQ) significantly reduces model size while preserving accuracy, enabling performance on entry-level hardware.
  • Developers can access pre-quantized models for easier implementation, including different variants for specific needs.
  • A new Gradio interface and the ONNX Runtime Generate() API are available for streamlined model testing.
  • Quantization efforts aim to democratize AI by ensuring models run effectively on a wider range of devices, including older GPUs.

From the Windows Blog

