
Microsoft, NVIDIA Add Nemotron to Foundry

Microsoft and NVIDIA expand joint AI infrastructure: Foundry adds NVIDIA Nemotron models and agent services, Azure deploys Vera Rubin NVL72 and liquid-cooled GPUs, Foundry Local supports regulated environments, and a Physical AI toolchain links simulation, digital twins and production systems.

Microsoft announced expanded Microsoft Foundry capabilities, Azure AI infrastructure updates, and deeper Physical AI integration with NVIDIA. These changes add Nemotron models, Vera Rubin NVL72 support, and tighter Fabric–Omniverse pipelines.

Main feature/change and impact

Microsoft Foundry now integrates NVIDIA Nemotron models and an expanded Foundry Agent Service. This enables production-ready, reasoning-based agents with observability in the Foundry Control Plane. Azure will be the first hyperscale cloud to power NVIDIA Vera Rubin NVL72 systems. Combined, these updates shorten time to production for agentic and inference-heavy workloads while maintaining enterprise governance and operational consistency.

Practical implications

Organizations gain low-latency, fine-tunable open-weight models available through Foundry for edge and cloud deployment. Azure’s liquid-cooled GPU deployments reduce thermal and power constraints for large inference fleets. Foundry Local and Azure Local initial support for Vera Rubin gives regulated environments controlled, Azure-consistent operations. Integration with Fabric and Omniverse supports end-to-end Physical AI pipelines from simulation to real-world action.

Foundry Agent Service allows teams to quickly develop agents that reason, plan and act across tools, data and workflows. Microsoft’s rollout also includes a voice API preview and expanded security integrations with Prisma AIRS and Zenity. The public Azure Physical AI Toolchain GitHub repository links NVIDIA’s Physical AI Data Factory blueprint with Azure services. These changes enable repeatable robotics and digital twin workflows, and integrate live operational data for real-time decision-making.

Enterprises should assess agentic workload readiness and update capacity planning for inference scale. Next steps include testing Nemotron models in Foundry, validating Vera Rubin nodes in Azure labs, and piloting Physical AI toolchains with production telemetry.

Key points from the article:

  • Foundry now includes NVIDIA Nemotron models for low-latency deployments.
  • Foundry Agent Service and observability are generally available for production agents.
  • Azure is rolling out Vera Rubin NVL72 and liquid-cooled GPU deployments.
  • Foundry Local enables AI in sovereign and regulated customer-controlled environments.
  • Physical AI Toolchain integrates Omniverse, Fabric, simulation and real-world operations.

From The Official Microsoft Blog