Question 1

Does my PDF get uploaded anywhere?

Accepted Answer

No. Parsing happens entirely in your browser with pdf.js — nothing is sent to a server. That is the main reason to use this over upload-based converters for sensitive documents.

Question 2

Why is the output messy for some PDFs?

Accepted Answer

PDFs store positioned glyphs, not structure, so headings, paragraphs, and lists are reconstructed heuristically from font sizes and spacing. Multi-column layouts, footnotes, and complex tables are the hardest and may come out jumbled.
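To make the jumbling concrete, here is a minimal sketch of one common reconstruction step: grouping positioned text runs into lines by their baseline. It is not this tool's actual code. The `Glyph` shape, the `groupIntoLines` name, and the `tolerance` default are all assumptions made for illustration.

```typescript
// Hypothetical shape: one positioned text run, roughly what a PDF text layer yields.
interface Glyph {
  text: string;
  x: number; // left edge, in page units
  y: number; // baseline, in page units (PDF origin is bottom-left, so larger y is higher)
}

// Group runs whose baselines lie within `tolerance` of each other into one line,
// then read lines top to bottom and runs left to right.
function groupIntoLines(glyphs: Glyph[], tolerance = 2): string[] {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = glyphs.slice().sort((a, b) => b.y - a.y || a.x - b.x);
  const lines: Glyph[][] = [];
  for (const glyph of sorted) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(last[0].y - glyph.y) <= tolerance) {
      last.push(glyph); // same baseline: treated as the same line
    } else {
      lines.push([glyph]); // new baseline: start a new line
    }
  }
  return lines.map((line) =>
    line
      .sort((a, b) => a.x - b.x)
      .map((glyph) => glyph.text)
      .join(" ")
  );
}
```

Fed a two-column page where "Left one" and "Right one" share a baseline, this sketch returns "Left one Right one" as a single line. Nothing in the coordinates says the two runs belong to different columns, which is why such layouts tend to come out interleaved.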

Question 3

Can it convert scanned PDFs?

Accepted Answer

No. Scanned PDFs are page images with no selectable text, so extracting their content requires OCR, which is not supported here. If the output is empty, your PDF is almost certainly a scan.
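The "empty output" check can be sketched as a small function, assuming you already have the extracted text of each page as a string. The `looksScanned` name and its input format are hypothetical and not part of this tool.

```typescript
// Returns true when no page yielded any non-whitespace text.
// Input: one string per page, holding whatever text was extracted from it.
// Note: an empty array (no pages at all) also returns true.
function looksScanned(pageTexts: string[]): boolean {
  return pageTexts.every((text) => text.trim().length === 0);
}
```

This is only a hint. A PDF that mixes scanned pages with real text pages returns `false` here, even though the scanned pages will still convert to nothing.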

Question 4

How are tables handled?

Accepted Answer

Poorly, honestly. Table cells are just text at coordinates with no row/column metadata, so they usually flatten into lines. For tabular data, exporting to CSV from the source is far more reliable.
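If you take the CSV route, turning the exported rows into a Markdown table is straightforward. The sketch below assumes the CSV has already been parsed into rows of strings and that the first row is the header. `rowsToMarkdownTable` is a hypothetical helper, not something this tool provides.

```typescript
// Build a pipe-style Markdown table from already-parsed rows.
// Assumes rows[0] is the header; short rows are padded with empty cells.
// Line breaks inside cells are not handled.
function rowsToMarkdownTable(rows: string[][]): string {
  if (rows.length === 0) return "";
  const width = Math.max(...rows.map((row) => row.length));

  const formatRow = (row: string[]): string => {
    const cells: string[] = [];
    for (let i = 0; i < width; i++) {
      const cell = i < row.length ? row[i] : "";
      // Escape pipes so cell text cannot be read as a column boundary.
      cells.push(cell.replace(/\|/g, "\\|").trim());
    }
    return "| " + cells.join(" | ") + " |";
  };

  // Separator row between header and body: one "---" per column.
  const dashes: string[] = [];
  for (let i = 0; i < width; i++) dashes.push("---");

  return [formatRow(rows[0]), formatRow(dashes)]
    .concat(rows.slice(1).map(formatRow))
    .join("\n");
}
```

Because the rows come from the source data rather than from glyph positions, the row and column structure is known instead of guessed.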

Question 5

Why are some headings wrong?

Accepted Answer

Heading levels are guessed from relative font size. A document that styles headings by weight or color rather than size, or uses many sizes, can confuse the heuristic — fix the few lines by hand after converting.
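A minimal sketch of a size-based heuristic shows where the wrong guesses come from. It assumes the body size is whichever font size covers the most characters, and that every larger size is a heading level, ranked from largest down. The `Line` shape and `guessHeadingLevels` are hypothetical, and the tool's real rules may differ.

```typescript
// Hypothetical shape: one reconstructed line and its dominant font size.
interface Line {
  text: string;
  fontSize: number;
}

// Returns one level per line: 0 for body text, 1 to 6 for headings.
function guessHeadingLevels(lines: Line[]): number[] {
  // Count characters per font size.
  const charsBySize: Record<string, number> = {};
  for (const line of lines) {
    const key = String(line.fontSize);
    charsBySize[key] = (charsBySize[key] || 0) + line.text.length;
  }
  const sizes = Object.keys(charsBySize).map(Number);

  // Assumption: the size covering the most characters is body text.
  let bodySize = 0;
  let mostChars = -1;
  for (const size of sizes) {
    const count = charsBySize[String(size)];
    if (count > mostChars) {
      mostChars = count;
      bodySize = size;
    }
  }

  // Every size above body is a heading; the largest becomes level 1.
  const headingSizes = sizes.filter((size) => size > bodySize).sort((a, b) => b - a);
  return lines.map((line) => {
    const rank = headingSizes.indexOf(line.fontSize);
    return rank === -1 ? 0 : Math.min(rank + 1, 6); // Markdown stops at level 6
  });
}
```

A bold heading set at the body size has nothing to distinguish it here, so it gets level 0. A document with many distinct sizes pushes everything past the sixth into level 6.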

PDF to Markdown

About this tool
