What It Does
Markdown Converter transforms virtually any file format into clean, structured Markdown using `markitdown`, invoked via `uvx` with no pre-installation needed.
It handles everything from PDFs and Office documents to images with OCR, audio with transcription, ZIP archives, and even YouTube URLs. The output preserves document structure (headings, tables, lists, links), making it ideal for feeding content into LLMs or text analysis pipelines.
Key Features
- Broad Format Support — Converts PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx/.xls), HTML, CSV, JSON, XML, images, audio, ZIP archives, YouTube URLs, and EPUB files to Markdown.
- No Installation Required — Uses `uvx markitdown` to run without a global install step. Dependencies are fetched and cached on first run; subsequent runs are significantly faster.
- Structure-Preserving Output — Converted Markdown retains document structure including headings, tables, bullet lists, and links — making downstream LLM ingestion or text analysis more accurate.
- Image OCR and Audio Transcription — Extracts EXIF metadata and runs OCR on images, and transcribes audio files, embedding the results directly in the Markdown output.
- Azure Document Intelligence Integration — For complex or scanned PDFs with poor default extraction, the `-d` flag enables Azure Document Intelligence via a configurable endpoint for higher-quality results.
- Flexible Input/Output Modes — Supports file paths, stdin piping, and stdout — with optional flags to hint file extension, MIME type, and charset for ambiguous inputs.
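The input and output modes listed above can be sketched as a small Python wrapper around the CLI. This is an illustration, not part of the tool: the helper names `build_markitdown_command` and `convert_bytes` are hypothetical, and the hint flags `-x` (extension), `-m` (MIME type), `-c` (charset), and `-o` (output file) are assumed from the `markitdown` command-line help, so confirm them with `uvx markitdown --help` before relying on them.

```python
import subprocess
from typing import List, Optional


def build_markitdown_command(
    source: Optional[str] = None,
    output: Optional[str] = None,
    extension: Optional[str] = None,
    mime_type: Optional[str] = None,
    charset: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `uvx markitdown` (hypothetical helper).

    source=None means the input is expected on stdin.
    output=None means the Markdown is written to stdout.
    """
    cmd = ["uvx", "markitdown"]
    if source is not None:
        cmd.append(source)
    if output is not None:
        cmd += ["-o", output]  # assumed flag: write to a file instead of stdout
    # Hint flags (assumed names) for inputs whose type cannot be inferred,
    # which is typically the case when reading from stdin.
    if extension is not None:
        cmd += ["-x", extension]
    if mime_type is not None:
        cmd += ["-m", mime_type]
    if charset is not None:
        cmd += ["-c", charset]
    return cmd


def convert_bytes(data: bytes, extension: str) -> str:
    """Pipe raw bytes through markitdown on stdin and return the Markdown text."""
    cmd = build_markitdown_command(extension=extension)
    result = subprocess.run(cmd, input=data, capture_output=True, check=True)
    return result.stdout.decode("utf-8")
```

For example, `build_markitdown_command("report.pdf", output="report.md")` produces the argv for a file-to-file conversion, while `convert_bytes(html_bytes, "html")` covers the stdin-to-stdout case. The first call through `uvx` may be slow while dependencies are fetched, and some formats may need optional extras depending on the `markitdown` version in use.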
Requirements
- **Azure Document Intelligence Endpoint** *(optional)* — Required only when using the `-d` flag for enhanced PDF extraction. Provide your Azure Cognitive Services endpoint via the `-e` flag.
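As a rough sketch of how the optional endpoint fits into an invocation, the helper below assembles a command that combines `-d` and `-e`. The function names, the `-o` output flag, and the environment variable `AZURE_DOCINTEL_ENDPOINT` are assumptions made for illustration; only `-d` and `-e` come from the description above.

```python
import os
from typing import List

# Hypothetical environment variable name; use whatever your setup provides.
ENDPOINT_ENV_VAR = "AZURE_DOCINTEL_ENDPOINT"


def endpoint_from_env() -> str:
    """Return the configured endpoint, or an empty string if it is unset."""
    return os.environ.get(ENDPOINT_ENV_VAR, "")


def build_docintel_command(pdf_path: str, endpoint: str, output: str) -> List[str]:
    """Build an argv list that enables Azure Document Intelligence.

    -d turns Document Intelligence on and -e supplies the endpoint;
    -o (assumed flag) writes the Markdown to a file.
    """
    if not endpoint:
        # Fail early: -d without an endpoint is not a usable configuration.
        raise ValueError("an Azure Document Intelligence endpoint is required with -d")
    return ["uvx", "markitdown", pdf_path, "-d", "-e", endpoint, "-o", output]
```

The resulting list can be passed to `subprocess.run(..., check=True)`. Reading the endpoint from the environment keeps it out of scripts and shell history, which is a design choice of this sketch rather than a requirement of the tool.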
Use Cases
- LLM document ingestion pipeline — Convert a folder of PDFs and Word documents to Markdown before feeding them into a retrieval-augmented generation (RAG) system, preserving structure so the model can reason over headings and tables.
- YouTube transcript extraction — Pass a YouTube URL directly to the converter to retrieve a structured Markdown transcript, useful for summarization or research workflows without leaving the terminal.
- Scanned PDF extraction with Azure AI — Use the `-d` flag with an Azure Document Intelligence endpoint to extract text from scanned or image-heavy PDFs that standard parsing handles poorly.
- Spreadsheet and data file normalization — Convert Excel, CSV, or JSON files to Markdown tables, making structured data human-readable and ready for analysis or inclusion in reports.
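The folder-ingestion use case above could look roughly like the following. It is a minimal sketch under stated assumptions: `plan_targets` and `convert_folder` are hypothetical helpers, the set of suffixes is only an example, and the `-o` flag is assumed from the `markitdown` command-line help.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List, Tuple

# Example selection only; extend with any other format the converter supports.
SUPPORTED_SUFFIXES = {".pdf", ".docx"}


def plan_targets(paths: Iterable[Path], out_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each supported source file with its Markdown output path.

    The full source name is kept in the target (report.pdf -> report.pdf.md)
    so that report.pdf and report.docx do not overwrite each other.
    """
    pairs = []
    for src in sorted(paths):
        if src.suffix.lower() in SUPPORTED_SUFFIXES:
            pairs.append((src, out_dir / (src.name + ".md")))
    return pairs


def convert_folder(src_dir: Path, out_dir: Path) -> List[Path]:
    """Convert every supported file directly inside src_dir, one CLI call per file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src, dst in plan_targets(src_dir.iterdir(), out_dir):
        # check=True raises CalledProcessError if a conversion fails.
        subprocess.run(["uvx", "markitdown", str(src), "-o", str(dst)], check=True)
        written.append(dst)
    return written
```

Calling `convert_folder(Path("docs"), Path("markdown"))` would leave one `.md` file per source document, ready to be chunked and indexed by a RAG system. Separating the planning step from the subprocess call makes the file selection easy to inspect before anything is converted.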