
Microsoft Launches Azure Logic Apps Document Indexer for AI-Powered Semantic Search in Cosmos DB

Microsoft has announced the public preview of Azure Logic Apps as a document indexer for Azure Cosmos DB, covering ingestion, parsing, embedding, and indexing of documents. The integration uses prebuilt connectors and customizable templates to make unstructured data available for AI-driven search.

Azure Logic Apps Document Indexer Now in Public Preview for Cosmos DB

Microsoft has released a notable update for Azure Logic Apps users. The new Document Indexer feature is now in public preview, and it simplifies ingesting documents into Azure Cosmos DB’s vector store. The integration is most relevant to teams running AI workloads, especially Retrieval-Augmented Generation (RAG).

What’s New?

With this update, Logic Apps can orchestrate the entire document ingestion pipeline. It fetches documents, then parses, chunks, embeds, and indexes them automatically. This means you can unlock insights from unstructured data across your enterprise systems without building and maintaining the pipeline yourself.

“This new capability orchestrates the full ingestion pipeline—from fetching documents to parsing, chunking, embedding, and indexing.”

How It Works: Step-by-Step

Connect to Source Systems

Logic Apps supports more than 1,400 prebuilt connectors. It can pull documents from sources such as Azure Blob Storage or SharePoint using ready-made templates.

Parse and Chunk Documents

AI-powered parsing extracts raw text, then chunks the content into language model-friendly pieces. This ensures better embedding quality and retrieval accuracy.
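The announcement does not say how the indexer sizes its chunks, so the following is only a minimal sketch of the general idea: a fixed-size window with overlap, measured in characters rather than tokens. The function name `chunk_text` and the `max_chars` and `overlap` parameters are hypothetical and are not part of Logic Apps.

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping windows of at most max_chars characters.

    Illustrative only: the real indexer's chunking strategy is not described
    in the announcement and is likely token-aware rather than character-based.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in the range [0, max_chars)")

    # Collapse whitespace runs so window sizes reflect actual content.
    normalized = " ".join(text.split())

    step = max_chars - overlap
    chunks: list[str] = []
    for start in range(0, len(normalized), step):
        chunks.append(normalized[start:start + max_chars])
        # Stop once a window has reached the end of the text.
        if start + max_chars >= len(normalized):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", max_chars=4, overlap=2):
        print(piece)
```

The overlap repeats the tail of each chunk at the head of the next, so a sentence that straddles a boundary still appears intact in at least one chunk.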

Generate Embeddings with Azure OpenAI

The chunks are sent to Azure OpenAI to generate semantic embeddings using models like text-embedding-3-small. These vectors capture the meaning behind your content.
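Inside Logic Apps this step is handled by a connector, but it can help to see roughly what an embeddings request looks like. The sketch below only builds the HTTP request for the Azure OpenAI embeddings REST endpoint and parses a response payload; it does not send anything. The resource name, deployment name, API key, and `api_version` value are placeholders assumed for illustration, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_embedding_request(
    chunks: list[str],
    resource: str,
    deployment: str,
    api_key: str,
    api_version: str = "2024-02-01",  # assumed version; check what your resource supports
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a batch of text chunks."""
    url = (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/embeddings?api-version={api_version}"
    )
    body = json.dumps({"input": list(chunks)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def parse_embedding_response(payload: dict) -> list[list[float]]:
    """Return one vector per input chunk, ordered by the response's index field."""
    ordered = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


if __name__ == "__main__":
    request = build_embedding_request(
        ["first chunk", "second chunk"],
        resource="my-resource",               # placeholder resource name
        deployment="text-embedding-3-small",  # placeholder deployment name
        api_key="<your-key>",
    )
    print(request.get_method(), request.full_url)
```

Sending several chunks in one request keeps the number of round trips down; sorting by `index` guards against relying on response order when matching vectors back to chunks.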

Write to Cosmos DB Vector Store

Finally, embeddings and metadata (titles, tags, timestamps) are indexed in Cosmos DB. The schema is optimized for filtering and semantic ranking, speeding up searches.

Ready-to-Use Logic Apps Templates

To help you hit the ground running, Microsoft offers customizable templates:

  • Blob Storage – Simple Text Parsing
  • Blob Storage – OCR with Azure Document Intelligence
  • SharePoint – Simple Text Parsing
  • SharePoint – OCR with Azure Document Intelligence

These templates are flexible, so you can tweak or extend them to fit your unique business needs.

Why This Matters

This update streamlines AI-driven document workflows, reducing manual overhead. It empowers developers and data teams to build smarter, faster semantic search and knowledge discovery solutions.

“We’re just getting started—and we’re building this with you. Your input shapes the future of AI-powered document indexing in Cosmos DB.”

Get Involved: Share Your Feedback

Microsoft wants to hear from you. What data sources should be supported next? Need special formats like legal docs or invoices? Your feedback will help shape future enhancements.

Jump in and share your thoughts through the official feedback form or community post.

Stay tuned for more updates as Azure Logic Apps continues to evolve as a powerhouse for AI and data integration.

  • Leverages 1400+ Logic Apps connectors to pull documents from diverse sources like Azure Blob Storage and SharePoint.
  • AI-powered parsing tokenizes and chunks documents for optimal embedding and retrieval.
  • Generates semantic embeddings using Azure OpenAI’s text-embedding-3-small model for precise search.
  • Indexes embeddings and metadata in Cosmos DB’s vector store optimized for semantic ranking and filtering.
  • Offers ready-to-use, customizable templates including OCR integration with Azure Document Intelligence.

Source: New blog articles in Microsoft Community Hub