MS Ai Insider

Title: Revolutionizing Audio Editing: Microsoft Introduces AI-Powered Audio Editor for Windows

Posted by ailona on August 22, 2024

Microsoft’s latest blog post covers the development of a local AI-powered Audio Editor app sample for Windows and its smart trimming feature. Using ONNX models (Silero for voice activity detection, Whisper for transcription, and MiniLM for semantic search), the app processes audio files based on user-defined themes.


Local AI on Windows: Exploring the Audio Editor App Sample

Microsoft has published a new application sample that demonstrates local AI on Windows. The Audio Editor app, demonstrated at Build, uses on-device AI models to enhance audio editing. This post covers the app’s features, the models it uses, and what developers need to know.

What’s New in the Audio Editor App?

The Audio Editor app introduces an innovative “smart trimming” feature. Users can upload audio files containing recognizable speech and specify a theme keyword or phrase. The app then generates a trimmed audio clip that highlights the most relevant segment. This functionality is designed to streamline the editing process significantly.
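One way to picture the final step, picking the most relevant segment, is as a window search over transcript chunks that have already been scored against the theme. The sketch below is only an illustration under stated assumptions: the `Chunk` type, the `best_window` function, the timestamps, and the scores are all hypothetical, and the sample itself may select its clip differently.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    start: float   # seconds from the beginning of the audio
    end: float     # seconds from the beginning of the audio
    score: float   # hypothetical relevance of this chunk to the theme


def best_window(chunks: list[Chunk], max_seconds: float) -> tuple[float, float]:
    """Return (start, end) of the contiguous run of chunks with the highest
    total score whose overall duration does not exceed max_seconds."""
    best_total = float("-inf")
    best_span = (0.0, 0.0)  # returned unchanged if no chunk fits
    for i in range(len(chunks)):
        total = 0.0
        for j in range(i, len(chunks)):
            # Stop extending once the window would be too long.
            if chunks[j].end - chunks[i].start > max_seconds:
                break
            total += chunks[j].score
            if total > best_total:
                best_total = total
                best_span = (chunks[i].start, chunks[j].end)
    return best_span


# Made-up chunks: the middle two are the most relevant to the theme.
chunks = [
    Chunk(0.0, 4.0, 0.10),
    Chunk(4.5, 9.0, 0.80),
    Chunk(9.2, 15.0, 0.75),
    Chunk(15.5, 21.0, 0.20),
]
clip = best_window(chunks, max_seconds=12.0)
print(clip)
```

With these made-up numbers the search settles on the two middle chunks, since together they fit in the 12-second budget and carry the highest combined score.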

“Building Windows apps that leverage on-device AI models can seem like a daunting task.” – Zachary Teutsch

Major Updates: The Technology Behind Smart Trimming

Three key models power the smart trimming functionality:

Silero Voice Activity Detection (VAD): This model segments audio into manageable chunks, ensuring accurate transcription.
Whisper Tiny: This transcription model converts speech to text. The Tiny variant favors speed while maintaining reasonable accuracy.
MiniLM: This text embedding model maps sentences to a multi-dimensional vector space, facilitating semantic search.
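To make the semantic search step concrete: once each transcribed sentence and the theme phrase are mapped to vectors, relevance is commonly measured with cosine similarity. The sketch below uses tiny hand-made 3-dimensional vectors as stand-ins for real MiniLM embeddings, which have far more dimensions. The vectors, the sentences, and the `rank_by_theme` helper are illustrative assumptions, not code from the sample.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_theme(theme_vec, sentence_vecs):
    """Return (similarity, sentence) pairs, most similar to the theme first."""
    scored = [
        (cosine_similarity(theme_vec, vec), text)
        for text, vec in sentence_vecs.items()
    ]
    return sorted(scored, reverse=True)


# Toy embeddings (assumed values, not real model output).
toy_embeddings = {
    "We shipped the new audio pipeline": [0.9, 0.1, 0.0],
    "Lunch is at noon": [0.0, 0.2, 0.9],
    "The codec handles 48 kHz input": [0.8, 0.3, 0.1],
}
theme = [1.0, 0.0, 0.0]  # stand-in for the embedded theme phrase

ranking = rank_by_theme(theme, toy_embeddings)
top_sentence = ranking[0][1]
for similarity, text in ranking:
    print(f"{similarity:.3f}  {text}")
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing sentence embeddings.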

These models work in tandem to ensure that audio is processed efficiently and accurately. For instance, Silero VAD detects voice activity and cuts audio at natural breaks, preventing awkward sentence fragments.
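The idea of cutting at natural breaks can be sketched as follows. Whisper processes audio in windows of up to 30 seconds, so a chunker can pack consecutive speech segments together and split only in the silence between them. The `(start, end)` tuples below stand in for VAD output; the `group_segments` function and the numbers are hypothetical, and the sample's own chunking logic may differ.

```python
def group_segments(segments, max_chunk_seconds=30.0):
    """Pack consecutive (start, end) speech segments into chunks that are at
    most max_chunk_seconds long, splitting only in the gaps between segments.

    A single segment longer than the limit is kept as its own chunk rather
    than being cut mid-speech.
    """
    chunks = []
    chunk_start = None
    chunk_end = None
    for start, end in segments:
        if chunk_start is None:
            # First segment opens the first chunk.
            chunk_start, chunk_end = start, end
        elif end - chunk_start <= max_chunk_seconds:
            # Segment still fits: extend the current chunk across the pause.
            chunk_end = end
        else:
            # Segment would overflow: close the chunk at the previous pause.
            chunks.append((chunk_start, chunk_end))
            chunk_start, chunk_end = start, end
    if chunk_start is not None:
        chunks.append((chunk_start, chunk_end))
    return chunks


# Made-up speech segments in seconds, as a VAD model might report them.
speech = [(0.5, 12.0), (12.8, 27.5), (28.4, 41.0), (43.0, 55.0), (70.0, 74.0)]
chunks = group_segments(speech)
print(chunks)
```

Because every chunk boundary falls in a pause, no sentence is split across two transcription windows, which is the property the paragraph above describes.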

What’s Important to Know?

Developers interested in utilizing this sample can access the code repository and a detailed code walkthrough. While setting up the models requires some effort, the README provides comprehensive instructions. This resource is invaluable for anyone looking to integrate local AI capabilities into their applications.

“There’s a lot of work that goes into defining your use case, choosing and tuning the right models.” – Zachary Teutsch

Developers who want to explore local AI on Windows further can consult Microsoft’s documentation.

In conclusion, the Audio Editor app sample shows how local AI models can be integrated into a Windows application, with smart trimming as a concrete, working example.

The Audio Editor app is built using WinUI 3 and the Windows App SDK.

Smart trimming allows users to create audio clips based on specific themes.

The Silero VAD model splits audio into chunks for accurate transcription.

The Whisper Tiny model transcribes speech to text and is optimized for performance.

MiniLM generates text embeddings for semantic similarity calculations.

From the Microsoft Developer Community Blog