> ## Documentation Index
> Fetch the complete documentation index at: https://docs.ensemble.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# convert Operation

> Document format conversion - HTML to Markdown, Markdown to HTML, DOCX extraction, PDF text extraction, frontmatter parsing

The convert operation transforms documents between formats without writing custom code. Convert HTML to clean Markdown, render Markdown to HTML, extract Word documents, extract text from PDFs, or parse frontmatter metadata.

The `convert` operation uses Workers-compatible libraries: **turndown** for HTML→Markdown, **marked** for Markdown→HTML, **gray-matter** for frontmatter, **mammoth** for DOCX, and **unpdf** for PDF text extraction. DOCX and PDF require `nodejs_compat`.

## Quick Start

**HTML to Markdown**:

```yaml theme={null}
agents:
  - name: clean-html
    operation: convert
    config:
      input: ${fetch-page.output.html}
      from: html
      to: markdown
```

**Markdown to HTML**:

```yaml theme={null}
agents:
  - name: render-content
    operation: convert
    config:
      input: ${input.markdown}
      from: markdown
      to: html
```

**Extract Frontmatter**:

```yaml theme={null}
agents:
  - name: parse-doc
    operation: convert
    config:
      input: ${read-file.output}
      from: markdown
      to: frontmatter
```

**PDF to Text**:

```yaml theme={null}
agents:
  - name: extract-pdf
    operation: convert
    config:
      input: ${read-pdf.output}  # ArrayBuffer
      from: pdf
      to: text
```

## Configuration

```yaml theme={null}
config:
  input: any        # Content to convert (required)
  from: string      # Source format (required)
  to: string        # Target format (required)

  # Format-specific options
  turndown: object  # HTML→Markdown options
  marked: object    # Markdown→HTML options
  mammoth: object   # DOCX conversion options
  pdf: object       # PDF extraction options
```

## Supported Conversions

| From       | To            | Description                                            |
| ---------- | ------------- | ------------------------------------------------------ |
| `html`     | `markdown`    | Convert HTML to clean Markdown using turndown with GFM |
| `html`     | `text`        | Strip HTML tags to plain text                          |
| `markdown` | `html`        | Render Markdown to HTML using marked with GFM          |
| `markdown` | `frontmatter` | Extract YAML frontmatter and content                   |
| `docx`     | `html`        | Convert Word document to HTML                          |
| `docx`     | `markdown`    | Convert Word document to Markdown                      |
| `pdf`      | `text`        | Extract text content from PDF documents                |

## HTML to Markdown

Converts HTML to clean Markdown using [turndown](https://github.com/mixmark-io/turndown) with GitHub Flavored Markdown (GFM) support.

```yaml theme={null}
agents:
  - name: convert-article
    operation: convert
    config:
      input: |
        <h1>Welcome</h1>
        <p>This is <strong>bold</strong> and <em>italic</em> text.</p>
        <ul>
          <li>Item 1</li>
          <li>Item 2</li>
        </ul>
      from: html
      to: markdown
```

**Output**:

```markdown theme={null}
# Welcome

This is **bold** and _italic_ text.

- Item 1
- Item 2
```

### Turndown Options

Customize the Markdown output:

````yaml theme={null}
config:
  input: ${html}
  from: html
  to: markdown
  turndown:
    headingStyle: atx        # atx (# heading) or setext (underlines)
    codeBlockStyle: fenced   # fenced (```) or indented
    bulletListMarker: "-"    # -, *, or +
    emDelimiter: "_"         # _ or *
    strongDelimiter: "**"    # ** or __
    linkStyle: inlined       # inlined or referenced
    gfm: true                # Enable GFM tables, strikethrough
````

### GFM Table Support

Tables are automatically converted:

```yaml theme={null}
agents:
  - name: convert-table
    operation: convert
    config:
      input: |
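        <table>
          <thead>
            <tr><th>Name</th><th>Age</th></tr>
          </thead>
          <tbody>
            <tr><td>Alice</td><td>30</td></tr>
            <tr><td>Bob</td><td>25</td></tr>
          </tbody>
        </table>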
      from: html
      to: markdown
```

**Output**:

```markdown theme={null}
| Name | Age |
|------|-----|
| Alice | 30 |
| Bob | 25 |
```

## Markdown to HTML

Renders Markdown to HTML using [marked](https://marked.js.org/) with GFM support.

```yaml theme={null}
agents:
  - name: render-post
    operation: convert
    config:
      input: |
        # Hello World

        This is a **markdown** document with:

        - Bullet points
        - [Links](https://example.com)
        - `inline code`
      from: markdown
      to: html
```

**Output**:

```html theme={null}
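<h1>Hello World</h1>
<p>This is a <strong>markdown</strong> document with:</p>
<ul>
  <li>Bullet points</li>
  <li><a href="https://example.com">Links</a></li>
  <li><code>inline code</code></li>
</ul>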
```

### Marked Options

```yaml theme={null}
config:
  input: ${markdown}
  from: markdown
  to: html
  marked:
    gfm: true      # Enable GFM (default: true)
    breaks: false  # Convert \n to <br> (default: false)
```

### Code Block Syntax Highlighting

Code blocks preserve language hints for syntax highlighting:

````yaml theme={null}
agents:
  - name: render-code
    operation: convert
    config:
      input: |
        ```javascript
        const greeting = "Hello, World!";
        console.log(greeting);
        ```
      from: markdown
      to: html
````

**Output**:

```html
<pre><code class="language-javascript">const greeting = "Hello, World!";
console.log(greeting);
</code></pre>
```

## Frontmatter Extraction

Parses YAML frontmatter from Markdown documents using [gray-matter](https://github.com/jonschlinkert/gray-matter).

```yaml theme={null}
agents:
  - name: parse-blog-post
    operation: convert
    config:
      input: |
        ---
        title: My Blog Post
        author: Alice
        date: 2024-01-15
        tags:
          - typescript
          - tutorial
        ---

        # Introduction

        Welcome to my blog post about TypeScript!
      from: markdown
      to: frontmatter
```

**Output**:

```typescript theme={null}
{
  frontmatter: {
    title: "My Blog Post",
    author: "Alice",
    date: Date("2024-01-15"),  // Parsed as Date object
    tags: ["typescript", "tutorial"]
  },
  content: "# Introduction\n\nWelcome to my blog post about TypeScript!"
}
```

### Using Extracted Data

```yaml theme={null}
agents:
  - name: parse-doc
    operation: convert
    config:
      input: ${read-file.output}
      from: markdown
      to: frontmatter

  - name: render-page
    operation: html
    config:
      template: blog-post
      data:
        title: ${parse-doc.output.frontmatter.title}
        author: ${parse-doc.output.frontmatter.author}
        content: ${parse-doc.output.content}
```

## HTML to Text

Strips all HTML tags and returns plain text. Useful for search indexing, text analysis, or email plain-text versions.

```yaml theme={null}
agents:
  - name: extract-text
    operation: convert
    config:
      input: |
        <h1>Title</h1>
        <p>This is formatted content.</p>
      from: html
      to: text
```

**Output**:

```
Title
This is formatted content.
```

Features:

* Removes `