HTML to Markdown Converter

Why Convert HTML to Markdown?

Markdown has become the de facto standard for writing documentation, README files, blog posts, and technical notes. Its clean, readable syntax makes content easy to write, easy to review in pull requests, and easy to render across hundreds of platforms — from GitHub and GitLab to Notion, Obsidian, and static site generators like Hugo and Jekyll.

But you often start with HTML. Maybe you’re migrating a WordPress blog to a Markdown-based CMS. Maybe you copied content from a web page and need to clean it up. Maybe you’re scraping documentation that only exists as HTML. In all these cases, an HTML-to-Markdown converter saves hours of manual reformatting.

How the Conversion Works

This tool uses the browser’s native DOMParser API to parse your HTML into a proper DOM tree. It then walks the tree recursively, converting each node to its Markdown equivalent. This approach is more reliable than regex-based converters because it correctly handles nested elements, malformed HTML, and edge cases like self-closing tags.
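
As a rough illustration of the parse-then-walk approach, the sketch below is written in Python and uses the standard library's html.parser in place of the browser's DOMParser, so it is not this tool's actual code. The names Node, TreeBuilder, to_markdown, and convert are hypothetical, and only a handful of element rules are covered.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """A minimal element node: tag name, attributes, and children (nodes or strings)."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a small tree from parser events (less forgiving than a browser)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def to_markdown(node):
    """Recursively convert a node (or a text string) to Markdown."""
    if isinstance(node, str):
        return node
    # Children are converted first, so nesting is handled by the recursion itself.
    inner = "".join(to_markdown(child) for child in node.children)
    if node.tag in ("strong", "b"):
        return f"**{inner}**"
    if node.tag in ("em", "i"):
        return f"*{inner}*"
    if node.tag == "a":
        return f"[{inner}]({node.attrs.get('href') or ''})"
    if len(node.tag) == 2 and node.tag[0] == "h" and node.tag[1] in "123456":
        return "#" * int(node.tag[1]) + " " + inner.strip() + "\n\n"
    if node.tag == "p":
        return inner.strip() + "\n\n"
    return inner  # Unknown elements contribute only their converted children.


def convert(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()  # Flush any buffered text and leave unclosed tags as they are.
    return to_markdown(builder.root).strip()


print(convert("<h2>Title</h2><p>Some <strong>bold <em>nested</em></strong> text.</p>"))
```

Because each element wraps the already-converted output of its children, nested markup such as <strong> containing <em> needs no special handling, which is where regex-based approaches tend to struggle.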

The conversion engine maps HTML elements to Markdown syntax using these rules:

Headings: <h1> through <h6> become # through ######.
Paragraphs: <p> tags become double-newline-separated text blocks.
Bold and italic: <strong> and <b> become **text**; <em> and <i> become *text*.
Inline code: <code> becomes `text`.
Code blocks: <pre><code> becomes fenced code blocks with triple backticks. Language classes like language-javascript are preserved as syntax hints.
Links: <a href="..."> becomes [text](url). A title attribute, when present, is carried over as [text](url "title").
Images: <img> becomes ![alt](src).
Unordered lists: <ul> with <li> becomes - item with proper indentation for nested lists.
Ordered lists: <ol> with <li> becomes numbered list items.
Blockquotes: <blockquote> becomes > text.
Horizontal rules: <hr> becomes ---.
Tables: <table> becomes a properly formatted Markdown table with header separators.
Strikethrough: <del>, <s>, and <strike> become ~~text~~.
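
The table rule can be sketched separately from parsing. The Python function below is a hypothetical helper (table_to_markdown is not part of this tool) that assumes the cell text has already been extracted into rows of strings, with the first row acting as the header.

```python
def table_to_markdown(rows):
    """Render rows of cell strings as a GFM table; the first row is the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def render(cells):
        # Pad short rows and escape pipes so cell text cannot break the layout.
        padded = list(cells) + [""] * (width - len(cells))
        escaped = (cell.strip().replace("|", "\\|") for cell in padded)
        return "| " + " | ".join(escaped) + " |"

    header, *body = rows
    lines = [render(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(render(row) for row in body)
    return "\n".join(lines)


print(table_to_markdown([["Column 1", "Column 2"], ["Cell 1", "Cell 2"]]))
```

Padding every row to the widest row keeps the column count consistent, which is one simple way to flatten tables whose rows differ in length.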

Common Use Cases

Blog Migration

When moving a blog from a CMS like WordPress, Drupal, or Ghost to a Markdown-based system (Hugo, Gatsby, Astro, Jekyll), you need to convert all existing HTML posts to Markdown. This tool handles the conversion one post at a time, preserving headings, links, images, and formatting.

Documentation Cleanup

API documentation, internal wikis, and knowledge bases often exist as HTML. Converting to Markdown makes the content version-controllable in Git, editable in any text editor, and renderable on platforms like GitHub, Confluence, or Notion.

Web Scraping Post-Processing

When scraping web content for research or archival purposes, the raw HTML is cluttered with layout divs, style attributes, and script tags. Converting to Markdown strips away the presentational noise and leaves you with clean, structured text.
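
To show what stripping presentational noise can look like, here is a minimal Python sketch that keeps only visible text and skips the contents of script and style elements. It uses the standard library's html.parser rather than this tool's browser-based code, and the names VisibleText, visible_text, and SKIPPED_TAGS are hypothetical.

```python
from html.parser import HTMLParser

# Elements whose contents are never part of the readable text.
SKIPPED_TAGS = {"script", "style", "noscript"}


class VisibleText(HTMLParser):
    """Collects text while ignoring the contents of non-content elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Attributes such as style="..." are dropped simply by never reading attrs.
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse the runs of whitespace left behind by removed markup.
    return " ".join("".join(parser.parts).split())


sample = '<div style="color:red"><script>track();</script><p>Hello <b>world</b></p></div>'
print(visible_text(sample))
```

A full converter would emit Markdown for the remaining elements instead of plain text, but the skip logic would look much the same.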

Email Content Extraction

HTML emails contain deeply nested tables and inline styles. Pasting the HTML into this converter extracts the meaningful text content as readable Markdown, making it easy to quote in documentation or issue trackers.

Understanding Markdown Syntax

Markdown was created by John Gruber in 2004 as a lightweight markup language. Its design goal was to be as readable as possible in its raw form. Gruber's original description left a number of edge cases ambiguous, which led to CommonMark, an independent standardized specification with a comprehensive test suite. This converter targets CommonMark-compatible output with GitHub Flavored Markdown extensions for tables and strikethrough.

Key Markdown syntax elements include:

# Heading 1
## Heading 2
### Heading 3

**bold** and *italic*

[Link text](https://example.com)
![Alt text](image.png)

- Unordered list item
1. Ordered list item

> Blockquote

`inline code`

```
code block
```

| Column 1 | Column 2 |
| --- | --- |
| Cell 1 | Cell 2 |

Tips for Clean Conversion

Remove layout HTML first: If your HTML contains wrapper divs, navigation bars, or footer content, remove them before converting. The converter processes everything inside the body, so extra elements produce extra output.
Check nested lists: Deeply nested lists in HTML sometimes use inconsistent indentation. Review the Markdown output to ensure the nesting level is correct.
Verify complex tables: HTML tables that use colspan or rowspan cannot be represented in standard Markdown. The converter produces a flat table, so check the output wherever the source table merges cells.
Handle inline styles manually: Markdown does not support inline CSS. Elements styled with color, font-size, or other CSS properties will lose their styling in the conversion. This is usually desirable for content migration.
Preserve code language hints: If your HTML code blocks use class="language-python" or similar, the converter preserves the language identifier in the fenced code block for syntax highlighting.
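
The language-hint tip can be illustrated with a short Python sketch. The fence_code function is a hypothetical helper, not this tool's code: it looks for a language-* entry in the class attribute and uses it as the info string of the fenced block. Lengthening the fence when the code itself contains backticks is an extra precaution assumed here, not something the tool is documented to do.

```python
import re


def fence_code(code, class_attr=""):
    """Wrap code in a fenced block, using a language-* class as the info string."""
    language = ""
    for name in class_attr.split():
        if name.startswith("language-"):
            language = name[len("language-"):]
            break
    # Use a fence longer than any backtick run inside the code itself.
    longest_run = max((len(run) for run in re.findall(r"`+", code)), default=0)
    fence = "`" * max(3, longest_run + 1)
    body = code.rstrip("\n")
    return f"{fence}{language}\n{body}\n{fence}"


print(fence_code('print("hi")\n', "hljs language-python"))
```

If no language-* class is present, the info string is left empty and the block is emitted as a plain fenced block.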

HTML Elements Not Supported in Markdown

Some HTML features have no Markdown equivalent. These include:

Forms and inputs: Form elements like <input>, <select>, and <button> are ignored.
Iframes and embeds: Embedded content cannot be represented in Markdown.
Complex table features: Merged cells (colspan, rowspan), table captions, and nested tables are simplified.
CSS classes and IDs: All styling information is stripped during conversion.

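For these elements, you may need to manually add raw HTML blocks in your Markdown file. Most Markdown renderers support inline HTML for cases where pure Markdown is insufficient, although some platforms restrict it: GitHub, for example, does not render <iframe> or <script> tags.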

How This Tool Works

This converter runs entirely in your browser. When you click “Convert to Markdown,” the tool parses your HTML using the native DOMParser API, walks the resulting DOM tree node by node, and generates the corresponding Markdown syntax. No data is sent to any server. You can verify this by loading the page, disconnecting from the internet, and confirming the tool still works.

Frequently Asked Questions

Does this tool handle malformed HTML?

Yes. The browser’s DOMParser is very forgiving with malformed HTML. It automatically closes unclosed tags, fixes nesting errors, and produces a valid DOM tree before conversion begins.

Can I convert Markdown back to HTML?

Yes — check out our Markdown Preview tool, which renders Markdown as HTML in real time and lets you copy the generated HTML.

Does it preserve HTML comments?

No. HTML comments (<!-- ... -->) are stripped during conversion. Markdown supports HTML comments, so you can add them back manually if needed.

What about HTML entities?

HTML entities like &amp;amp;, &amp;lt;, and &amp;nbsp; are decoded by the browser’s parser before conversion. The resulting Markdown contains the actual characters. If you need to work with HTML entities directly, try our HTML Entity Encoder tool.
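
The effect of entity decoding can be reproduced outside the browser. The Python snippet below uses the standard library's html.unescape as a stand-in for the browser's parser; the sample string is made up for illustration.

```python
import html

# html.unescape resolves named and numeric character references.
encoded = "Fish &amp; chips &lt; 5&nbsp;kg"
decoded = html.unescape(encoded)

# The non-breaking space is decoded to U+00A0, not to an ordinary space.
print(decoded == "Fish & chips < 5\u00a0kg")  # True
```

Note that a decoded non-breaking space looks like a regular space in most editors, so it is worth normalizing if exact whitespace matters in the Markdown output.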

Is my data sent to a server?

No. All processing happens locally in your browser using JavaScript. Your data never leaves your device.

How do I handle very large HTML files?

This tool works well with HTML content up to several hundred kilobytes. For extremely large files (multiple megabytes), consider splitting the content into smaller sections and converting each section separately.