Microsoftpublished a blog post on November 26, announcing that in its LlamaParse Integration of Azure OpenAI endpoints in the GPT-4o series of models.Enhance extraction of unstructured data and parsing of multimodal documents, and seamlessly interface with Azure AI Search vector databases to build complete retrieval augmentation generation (RAG) workflows.
Introduction to LlamaParse
Microsoft LlamaParse is a document parser designed for Generative Artificial Intelligence (GenAI), whose main goal is to parse and clean a wide variety of document data to ensure data quality before passing it on to downstream Large Language Models (LLMs).
Adding Azure OpenAI endpoints
Microsoft LlamaParse, with this integration, enables users to call Azure OpenAI's GPT-4o family of models to extract unstructured data and document transformations. This integration capitalizes on the strengths of bothLlamaParse for efficient parsing and Azure OpenAI for powerful language modeling capabilities, ultimately enabling more accurate and smarter document processing.
Citing the media report, 1AI attached the update as follows:
- Directly connect to models such as GPT-4o and GPT-4o-mini for Azure OpenAI
- Multimodal document parsing in LlamaParse, with multimodal support via Azure OpenAI
- LLM-optimized output for enhanced retrieval and semantic searching
- Seamless ingestion into Azure AI Search's vector repository via LlamaIndex
- Enterprise-grade security and compliance for sensitive workloads
Users can leverage LlamaCloud, Azure AI Search, and Azure OpenAI to build a complete RAG workflow in a number of steps:
- Parsing and Enrichment: Advanced document extraction using LlamaParse Premium and Azure OpenAI to generate LLM-optimized output in multiple formats such as Markdown, LaTeX, and Mermaid charts.
- Chunking and Embedding: Use Azure AI Search as a vector store and leverage the embedding model in the Azure AI Model Catalog to chunk, embed, and index the parsed content.
- Search & Generate: Leverage Azure AI Search's query rewriting and semantic reordering capabilities to improve retrieval quality. Finally, build generative AI applications by orchestrating Azure AI Search and Azure OpenAI with Llamaindex.