Revolutionizing Search: Bing’s Transition to Language Models for Faster, More Accurate Results

Bing is enhancing its search capabilities by transitioning from Large Language Models (LLMs) to Small Language Models (SLMs), optimizing performance with Nvidia’s TensorRT-LLM. This integration significantly reduces latency and costs while improving search accuracy and user experience. The move promises faster, more precise search results, paving the way for future innovations.

Bing’s Transition to LLM/SLM Models: A New Era in Search Technology

Bing is redefining search technology by integrating Large Language Models (LLMs) and Small Language Models (SLMs). This transition marks a significant milestone in enhancing search capabilities. As search queries become more complex, the need for more powerful models is evident.

What’s New?

While large transformer-based LLMs have served search well, they are slow and costly to run at web scale. Bing reports that its SLMs deliver roughly a 100x throughput improvement over the LLMs they replace, allowing Bing to process and understand search queries faster and with greater precision.

“We will not compromise on quality for speed.”

Major Updates: Optimizing with TensorRT-LLM

Managing latency and cost has been a challenge with larger models. To tackle this, Bing has integrated Nvidia’s TensorRT-LLM into its workflow. This optimization tool enhances SLM inference performance significantly.
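To give a feel for what integrating TensorRT-LLM looks like in practice, here is a minimal sketch using the high-level Python LLM API that recent TensorRT-LLM releases expose. The model name, prompts, and sampling settings are illustrative placeholders, and exact names can differ between library versions; this is not Bing’s production setup.

```python
# Minimal sketch of running a small language model through TensorRT-LLM's
# high-level Python API. The checkpoint and sampling settings below are
# illustrative placeholders, not Bing's configuration.
from tensorrt_llm import LLM, SamplingParams

# Point the LLM wrapper at a Hugging Face causal-LM checkpoint; TensorRT-LLM
# builds an optimized inference engine for it behind the scenes.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

sampling_params = SamplingParams(temperature=0.2, max_tokens=64)

prompts = [
    "Rewrite this search query to capture the user's intent: best laptop for video editing",
    "Rewrite this search query to capture the user's intent: how to fix a leaking tap",
]

# generate() batches the prompts and runs them through the optimized engine.
for output in llm.generate(prompts, sampling_params):
    print(output.prompt)
    print(output.outputs[0].text)
```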

One key application of TensorRT-LLM is in the ‘Deep Search’ feature. This innovative approach leverages SLMs in real-time to deliver the best possible web results. Understanding user intent and ensuring the relevance of search results are crucial steps in this process.
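Bing has not published the internals of Deep Search, but conceptually a flow of this kind might look like the sketch below: an SLM expands the query into intent-focused variants, candidates are retrieved for each variant, and the results are merged. Every function here is a hypothetical placeholder for illustration only.

```python
# Hypothetical illustration of an SLM-assisted deep search flow.
# All helpers are placeholders; Bing's actual pipeline is not public.
from typing import List

def expand_query_with_slm(query: str) -> List[str]:
    """Placeholder: an SLM would rewrite the query into intent-focused variants."""
    # In practice this call would go through an optimized SLM (e.g. via TensorRT-LLM).
    return [query, f"{query} step by step", f"{query} explained"]

def retrieve(variant: str) -> List[str]:
    """Placeholder: fetch candidate web results for one query variant."""
    return [f"result for '{variant}'"]

def deep_search(query: str) -> List[str]:
    candidates: List[str] = []
    for variant in expand_query_with_slm(query):
        candidates.extend(retrieve(variant))
    # A ranking model would normally score candidates here; we just deduplicate.
    return list(dict.fromkeys(candidates))

print(deep_search("plan a week-long trip to Japan"))
```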

Before optimization, the original Transformer model had a 95th percentile latency of 4.76 seconds per batch. After integrating TensorRT-LLM, latency was reduced to 3.03 seconds per batch, while throughput increased from 4.2 to 6.6 queries per second. This optimization not only enhances user experience but also reduces operational costs by 57%.
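To put those figures in perspective, the relative gains fall straight out of the numbers above (the 57% cost reduction is Bing’s reported figure, not derived here):

```python
# Quick arithmetic on the figures reported above.
latency_before, latency_after = 4.76, 3.03   # seconds per batch (95th percentile)
qps_before, qps_after = 4.2, 6.6             # queries per second

latency_reduction = (latency_before - latency_after) / latency_before
throughput_gain = qps_after / qps_before

print(f"Latency reduction: {latency_reduction:.0%}")   # ~36%
print(f"Throughput gain:   {throughput_gain:.2f}x")    # ~1.57x
```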

Benefits for Users

The transition to SLM models and TensorRT-LLM brings several advantages:

  • Faster Search Results: Users can enjoy quicker response times, making their search experience seamless.
  • Improved Accuracy: Enhanced SLM capabilities deliver more accurate, context-aware search results.
  • Cost Efficiency: Reduced costs allow Bing to invest in further innovations, keeping it at the forefront of search technology.

Looking Ahead

Bing is committed to refining its search technology. The transition to LLM and SLM models is just the beginning. Exciting advancements are on the horizon, and users can expect more updates as Bing continues to push the boundaries of search technology.

“We are excited about the future possibilities and look forward to sharing more advancements with you.”

  • Bing’s shift to SLMs offers a ~100x throughput improvement over LLMs.
  • TensorRT-LLM integration reduces model inference time and enhances user experience.
  • Before optimization, the original Transformer model had a latency of 4.76 seconds per batch.
  • Post-integration, latency dropped to 3.03 seconds per batch, with throughput increasing to 6.6 queries per second.
  • The transition allows Bing to invest in further innovations while ensuring cost efficiency.
Source: the Bing Blogs


