Convert files (PDF, image, Word, PPT, Excel, notebooks, code snippets) to Markdown using a multimodal LLM
Updated May 8, 2025 - Python
MinerU (PDF to Markdown converter) — portable, no-installation-required, one-click launch bundle.
LyraPDF: convert a PDF to JSON or Markdown
Marker (PDF to Markdown software): portable, no-installation-required, one-click launch bundle.
High-speed PDF → Markdown ingestion engine for multimodal RAG pipelines. Extracts structured text + isolated images so downstream chunkers, LlamaIndex, and VLM agents get usable context.
MDify is a free, open-source web app that converts documents to clean Markdown using Microsoft's MarkItDown library.
Convert PDF to Markdown using the OCR API by Mistral AI
Automates the pipeline of converting PDF documents and images into clean Markdown text fully offline. It is designed specifically for data preparation workflows where the end goal is feeding high-quality text into Large Language Models (LLMs) or indexing systems (RAG).
DotsOCR-VLLM-DB is a self-hosted, GPU-accelerated document understanding pipeline. It turns messy real-world inputs (scanned PDFs, native PDFs, DOCX files, and images) into clean, structured outputs (Markdown, JSON layout, and reconstructed DOCX).