Scale generative AI with new Azure AI infrastructure advancements and availability

22 December 2023

Powerful and revolutionary, generative AI has the potential to revolutionize a variety of sectors, from manufacturing to retail, financial services to healthcare. Customers are beginning to understand the effectiveness and creativity that generative AI can provide thanks to our early investments in hardware and AI infrastructure. Azure OpenAI Service, at the vanguard of this change, offers developers the systems, tools, and resources they need to develop next-generation, AI-powered applications on the Azure platform. Azure AI infrastructure is the foundation of how we scale our products. Users can produce richer user experiences with generative AI, foster creativity, and increase productivity for their companies.

Today, we’re introducing enhancements to how we’re empowering companies with Azure AI infrastructure and applications as part of our commitment to bringing the revolutionary power of AI to our clients. The most cutting-edge OpenAI models, GPT-4 and GPT-35-Turbo, are now accessible in a number of new locations thanks to the global expansion of Azure OpenAI Service, giving companies all over the world unrivaled generative AI capabilities. This scalability is enabled by our Azure AI infrastructure, which we are continuing to build and invest in. A new age of AI applications is being ushered in by the general release of the ND H100 v5 Virtual Machine series, which is outfitted with NVIDIA H100 Tensor Core graphics processing units (GPUs) and low-latency networking.

Here is how these developments broaden Microsoft’s stack-wide, holistic approach to AI.

Unprecedented AI processing and scale with the ND H100 v5 Virtual Machine series, now generally available

Our Azure ND H100 v5 Virtual Machine (VM) series, which uses the most recent NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking, became generally available today. In order to handle the workloads for cutting-edge AI that are becoming exponentially more complicated, this VM series has been painstakingly developed using Microsoft’s significant experience in offering supercomputing performance and scale. We are using a 4K GPU cluster that has been optimized for AI as part of our significant and continuous investment in generative AI, and within the next year, we plan to scale up to hundreds of thousands of the most recent GPUs.

Currently, the ND H100 v5 VMs have the following features:

GPUs for AI supercomputing: With eight NVIDIA H100 Tensor Core GPUs, these virtual machines (VMs) offer to perform AI models substantially more quickly than prior generations, giving enterprises access to unrivaled computational capacity.
Computer processing unit (CPU) of the future: Recognizing the importance of CPU performance for AI training and inference, we have selected the 4th Gen Intel Xeon Scalable processors as the basis of these VMs, assuring the fastest possible processing speed.
Networking with little delay Smooth performance across the GPUs is guaranteed by the addition of NVIDIA Quantum-2 ConnectX-7 InfiniBand with 400Gb/s per GPU and 3.2 Tb/s per VM of cross-node bandwidth, matching the capabilities of top-performing supercomputers worldwide.
Host to GPU performance optimization With 64GB/s bandwidth per GPU provided by PCIe Gen5, Azure obtains notable performance improvements over CPU and GPU.
Large scale memory and memory bandwidth: These virtual machines (VMs) are built around DDR5 memory, which offers faster data transfer rates and more efficiency, making it perfect for tasks with larger datasets.
When employing the new 8-bit FP8 floating point data type as opposed to the FP16 in earlier generations, these VMs have demonstrated their performance prowess by speeding up matrix multiplication operations by up to six times. The ND H100 v5 VMs demonstrate their ability to further improve AI applications by achieving up to a two-fold speedup in huge language models such BLOOM 175B end-to-end model inference.

Expanding innovative models globally with Azure OpenAI Service

We are excited to announce that Azure OpenAI Service will now be available globally, opening up a wider audience for OpenAI’s cutting-edge models like GPT-4 and GPT-35-Turbo. We now offer help and expanded access to enterprises looking for strong generative AI capabilities in Australia East, Canada East, East United States 2, Japan East, and United Kingdom South thanks to our new live regions there. Together with our current availability in East United States, France Central, South Central United States, and West Europe, these new regions expand the geographic reach of Azure OpenAI Service. The reception to Azure OpenAI Service has been outstanding, and since our previous disclosure, the number of our customers has almost tripled. With the addition of 100 new customers each day on average this quarter, we now proudly serve more than 11,000 clients. This astounding development is proof of the value our service offers to companies eager to take use of AI for their particular requirements.

We are expanding GPT-4, the most sophisticated generative AI model offered by Azure OpenAI, across the new regions as part of this expansion. With this improvement, more users will be able to take use of GPT-4’s capabilities for content creation, document intelligence, customer support, and more. With the help of Azure OpenAI Service, businesses may intensify their operations while fostering innovation and change in a variety of sectors.

An ethical method for creating generative AI

Microsoft’s dedication to ethical AI is at the heart of its Azure AI and Machine Learning platforms. The AI platform integrates strong safety features and makes use of human feedback mechanisms to responsibly handle dangerous inputs, ensuring the maximum protection for users and end users. Businesses can apply for access to the Azure OpenAI Service to fully realize the power of generative AI and take their operations to new heights.

As we pioneer AI innovation, we extend an invitation to companies and developers everywhere to join us on this transformative path. Azure OpenAI Service is evidence of Microsoft’s commitment to making AI usable, scalable, and valuable for organizations of all sizes. Together, let’s harness the potential of generative AI and Microsoft’s dedication to ethical AI practices to promote development and growth across the globe.

Client motivation

The production and design of content, rapid automation, personalized marketing, customer service, chatbots, product and service innovation, language translation, autonomous driving, fraud detection, and predictive analytics are just a few of the industries that are being revolutionized by generative AI. We are motivated by the generative AI innovations that our clients are making, and we eagerly anticipate how other users will expand on these technologies.

Mercedes-Benz is enhancing the driver’s in-car experience using Azure OpenAI Service. The updated “Hey Mercedes” feature is more conversational and intuitive than before. Global professional services company KPMG uses our technology to accelerate the coding lifecycle, create intelligent automation, and improve its service delivery model. Using Azure Machine Learning and Azure’s AI infrastructure, Wayve trains large scale fundamental neural networks for autonomous driving. Sensa Copilot was introduced by Microsoft partner SymphonyAI to help financial crime detectives fight the impact that unlawful behavior has on the economy and companies. Sensa Copilot helps investigators quickly and effectively identify money laundering patterns by automating data collection, collation, and summary of financial and third-party information. Find out about every Azure AI and ML customer story.