The problem with ODT-to-HTML in JavaScript
The JavaScript ecosystem has mammoth for .docx-to-HTML conversion. For .odt files, the options have been far worse: odt2html was abandoned in 2016, odt.js is limited and unmaintained, and most solutions fall back to shelling out to LibreOffice headless — which requires LibreOffice to be installed on the server, adds significant startup latency, and doesn't work in serverless or edge environments.
odf-kit fills this gap. The odf-kit/reader module is a pure JavaScript ODT parser that reads .odt files directly, with no native dependencies, no LibreOffice, and no subprocess calls.
Installation
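Install from npm (the import specifier odf-kit/reader resolves to the odf-kit package):

    npm install odf-kit

Requires Node.js 22 or later. The package is ESM only: use import, not require.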
Quick start: ODT to HTML in one line
    import { odtToHtml } from "odf-kit/reader";
    import { readFileSync } from "fs";

    const bytes = new Uint8Array(readFileSync("document.odt"));
    const html = odtToHtml(bytes);

    // html is a complete HTML document string:
    // <!DOCTYPE html><html><head>...</head><body>...</body></html>
Fragment output (embed in an existing page)
Pass { fragment: true } to get just the body content without the HTML document wrapper — useful when embedding output into an existing page:
    import { odtToHtml } from "odf-kit/reader";
    import { readFileSync } from "fs";

    const bytes = new Uint8Array(readFileSync("document.odt"));
    const fragment = odtToHtml(bytes, { fragment: true });

    // Returns inner body content only:
    // <h1>Title</h1><p>Body text...</p><table>...</table>
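To make the embedding step concrete, here is a minimal sketch of dropping a fragment into a page shell of your own. The helper names (escapeHtml, renderPage), the odt-content class, and the sample fragment are hypothetical and not part of odf-kit; the sample string stands in for the value returned by odtToHtml(bytes, { fragment: true }).

```javascript
// Hypothetical helpers, not part of odf-kit.

// Escape text that goes into the page shell (title, lang attribute).
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap an HTML fragment in a minimal page. The fragment is inserted as-is.
function renderPage(fragment, { title = "Document", lang = "en" } = {}) {
  return [
    "<!DOCTYPE html>",
    `<html lang="${escapeHtml(lang)}">`,
    `<head><meta charset="utf-8"><title>${escapeHtml(title)}</title></head>`,
    "<body>",
    `<main class="odt-content">${fragment}</main>`,
    "</body>",
    "</html>",
  ].join("\n");
}

// Stand-in for the string returned by odtToHtml(bytes, { fragment: true }).
const sampleFragment = "<h1>Title</h1><p>Body text...</p>";
const page = renderPage(sampleFragment, { title: "Q3 <Report>" });
```

Only the shell values are escaped here. The fragment itself is inserted verbatim, so this sketch assumes the converter output is HTML you are willing to trust.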
Access the document model
Use readOdt() when you need access to the structured document model — for example to extract metadata, iterate over paragraphs, or build a custom renderer:
    import { readOdt } from "odf-kit/reader";
    import { readFileSync } from "fs";

    const bytes = new Uint8Array(readFileSync("document.odt"));
    const doc = readOdt(bytes);

    // Document metadata from meta.xml
    console.log(doc.metadata.title);
    console.log(doc.metadata.creator);
    console.log(doc.metadata.creationDate);

    // Structured body: array of paragraphs, headings, lists, tables
    for (const block of doc.body) {
      if (block.kind === "heading") {
        // Join the text of all spans to get the heading's plain text
        const text = block.spans.map((s) => s.text).join("");
        console.log(`H${block.level}: ${text}`);
      }
    }

    // Or convert to HTML via toHtml()
    const html = doc.toHtml({ fragment: true });
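As an illustration of a small custom renderer over the document model, the sketch below builds a plain-text outline from heading blocks. It relies only on the block shape visible in the example above (kind, level, and spans with a text field). The helper names and the sample data are hypothetical, and the "paragraph" kind used in the sample is an assumption.

```javascript
// Assumed block shape, taken from the readOdt() example:
//   { kind: "heading", level: number, spans: [{ text: string }] }

// Concatenate the text of all spans in a block.
function headingText(block) {
  return block.spans.map((span) => span.text).join("");
}

// Keep only heading blocks and reduce them to { level, text }.
function buildOutline(body) {
  return body
    .filter((block) => block.kind === "heading")
    .map((block) => ({ level: block.level, text: headingText(block) }));
}

// Render the outline as an indented bullet list, two spaces per level.
function formatOutline(outline) {
  return outline
    .map(({ level, text }) => `${"  ".repeat(level - 1)}- ${text}`)
    .join("\n");
}

// Hand-written stand-in for doc.body; the "paragraph" kind is an assumption.
const sampleBody = [
  { kind: "heading", level: 1, spans: [{ text: "Annual " }, { text: "Report" }] },
  { kind: "paragraph", spans: [{ text: "Intro text." }] },
  { kind: "heading", level: 2, spans: [{ text: "Revenue" }] },
];

const outlineText = formatOutline(buildOutline(sampleBody));
// outlineText is "- Annual Report\n  - Revenue"
```

With a real document, passing readOdt(bytes).body in place of sampleBody should work the same way, provided the blocks have the shape assumed here.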
What gets extracted
| Element | HTML output | Status |
|---|---|---|
| Paragraphs | <p> | ✓ |
| Headings (levels 1–6) | <h1>–<h6> | ✓ |
| Bold text | <strong> | ✓ |
| Italic text | <em> | ✓ |
| Underline | <u> | ✓ |
| Strikethrough | <s> | ✓ |
| Superscript / subscript | <sup> / <sub> | ✓ |
| Hyperlinks | <a href> | ✓ |
| Bullet lists | <ul><li> | ✓ |
| Numbered lists | <ol><li> | ✓ |
| Nested lists | Nested <ul>/<ol> | ✓ |
| Tables (including merged cells) | <table><tr><td> with colspan/rowspan | ✓ |
| Document metadata | title, creator, dates | ✓ |
| Line breaks | <br> | ✓ |
| Named styles (bold headings, etc.) | Resolved from styles.xml | ✓ |
| Images | — | Roadmap |
| Fonts, colors, font sizes | — | Roadmap |
| Footnotes / endnotes | — | Roadmap |
Why not shell out to LibreOffice?
LibreOffice headless (libreoffice --headless --convert-to html) is the common fallback for ODT-to-HTML conversion. It works, but it comes with real costs:
- Startup time: LibreOffice takes 2–5 seconds to start, even for a small document
- Deployment complexity: LibreOffice must be installed on every server, container, and CI runner
- Serverless friction: LibreOffice cannot run on edge runtimes such as Cloudflare Workers, and on function platforms such as AWS Lambda or Vercel it is impractical without custom layers or container images
- Size: LibreOffice is several hundred megabytes and tends to dominate Docker image size
- Security surface: spawning a process with filesystem access from your application adds a meaningful attack surface
odf-kit parses .odt files directly in JavaScript. No subprocess, no installed software, no startup penalty. It runs anywhere Node.js 22 or later runs.
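For a sense of what reading an .odt file directly involves: an .odt file is a ZIP package, and the OpenDocument packaging rules call for a first entry named mimetype, stored uncompressed, containing application/vnd.oasis.opendocument.text. The sketch below uses that to check bytes cheaply, for example before handing an upload to a parser. The looksLikeOdt helper is hypothetical and not part of odf-kit, and how odf-kit itself validates input is not covered here.

```javascript
const ODT_MIME = "application/vnd.oasis.opendocument.text";

// Hypothetical helper, not part of odf-kit.
// Returns true when the bytes start with a ZIP entry named "mimetype",
// stored uncompressed, whose content is exactly the ODT MIME type.
function looksLikeOdt(bytes) {
  // A ZIP local file header is 30 bytes long before the entry name starts.
  if (!(bytes instanceof Uint8Array) || bytes.length < 30) return false;

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Local file header signature "PK\x03\x04", read as little-endian.
  if (view.getUint32(0, true) !== 0x04034b50) return false;

  const method = view.getUint16(8, true);      // 0 means stored (uncompressed)
  const storedSize = view.getUint32(18, true); // size of the entry data
  const nameLength = view.getUint16(26, true);
  const extraLength = view.getUint16(28, true);

  const decoder = new TextDecoder();
  const name = decoder.decode(bytes.subarray(30, 30 + nameLength));
  if (name !== "mimetype" || method !== 0) return false;

  // Entry data follows the name and the (normally empty) extra field.
  const start = 30 + nameLength + extraLength;
  const mime = decoder.decode(bytes.subarray(start, start + storedSize));
  return mime === ODT_MIME;
}
```

The check is deliberately conservative: a package that omits the mimetype entry, compresses it, or records its size only in a trailing data descriptor is rejected, even though some tools may still open such a file.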
Comparison with other options
| Option | Pure JS | Maintained | Tables | Metadata |
|---|---|---|---|---|
| odf-kit/reader | ✓ | ✓ Active (2026) | ✓ | ✓ |
| LibreOffice headless | ✗ Subprocess | ✓ | ✓ | Partial |
| odt2html (npm) | ✓ | ✗ Abandoned (2016) | ✗ | ✗ |
| odt.js | ✓ | ✗ Unmaintained | Partial | ✗ |
odf-kit does more than read
The same library that reads .odt files can also create them from scratch and fill existing templates with data. If you're building a document pipeline — generate, fill, or convert — odf-kit handles all three without additional dependencies.
- Fill .odt templates with JavaScript — design in LibreOffice, fill from code
- Create .odt files programmatically — headings, tables, images, lists, page layout