html.md is a JavaScript library that converts HTML into valid Markdown.

It was originally based on the Make.text bookmarklet, but has cleaner, safer code and a new API that is simple and easy to understand. html.md can be used in any browser as well as in the node.js environment, where it also provides a command line interface (CLI).

Install

Install using the package manager for your desired environment(s):

# for node.js:
$ npm install html-md
# or, for the browser:
$ bower install html-md

Examples

In the browser:

<html>
  <head>
    <script src="/path/to/md.min.js"></script>
    <script>
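      // Run after the page has loaded; <body> does not exist yet while <head> is being parsed.
      window.onload = function () {
        var body = document.getElementsByTagName('body')[0];
        console.log(md(body));
      };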
    </script>
  </head>
  <body>
    <h1>Hello, World!</h1>
    <p>My tasks for today:</p>
    <ul>
      <li>Learn all about <a href="http://neocotic.com/html.md">html.md</a></li>
      <li>Tell everyone how <strong>awesome</strong> it is!</li>
    </ul>
  </body>
</html>

In node.js:

var md = require('html-md');

console.log(md('I <em>love</em> html.md!'));

In this environment the jsdom library is used to simulate a working DOM, which is then traversed and translated to Markdown (see the Windows section for important notes about support for that platform).

In the terminal:

# provide HTML to be converted and print it back out to stdout:
$ htmlmd -epi "I <b>love</b> <a href='http://neocotic.com/html.md'>html.md</a>"
I **love** [html.md](http://neocotic.com/html.md)
# convert HTML files and output them into another directory:
$ htmlmd -o ./markdown ./html/*.html
# convert all HTML files in the current directory into Markdown files:
$ htmlmd -l .

Usage

Usage: htmlmd [options] [ -e html | <file ...> ]

Options:

  -h, --help          output usage information
  -V, --version       output the version number
  -a, --absolute      always use absolute URLs for links and images
  -b, --base <url>    set base URL to resolve relative URLs from
  -d, --debug         print additional debug information
  -e, --eval          pass a string from the command line as input
  -i, --inline        generate inline style links
  -l, --long-ext      use long extension for Markdown files
  -o, --output <dir>  set the output directory for converted Markdown
  -p, --print         print out the converted Markdown

API

md(html, [options])

Parses the HTML into a valid Markdown string. The html argument can be either an HTML string or a DOM element:

console.log(md('I <strong>love</strong> html.md!')); // "I **love** html.md!"
console.log(md(document.querySelector('p')));        // "Lorem ipsum, *baby*!"

Options

The following options are recognised by this method (all of them are optional):

  Property  Description
  absolute  All links and images are parsed with absolute URLs
  base      All relative links and images are resolved from this URL
  debug     Prepends additional debug information to the Markdown output
  inline    All links are generated using the inline style

Note: The base option only works in the node.js environment.
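
As a rough illustration of how an options object reaches md, the sketch below wraps the call in a small helper that merges per-call overrides over a set of defaults. The helper name convertWith, the default values chosen here, and passing md in as an argument (so the snippet does not depend on how the library was loaded) are all assumptions made for this example and are not part of html.md.

```javascript
// Hypothetical helper; not part of html.md.
// `md` is the function provided by html.md, passed in explicitly.
function convertWith(md, html, overrides) {
  // Illustrative defaults only; every option is optional.
  var options = { absolute: true, inline: true };
  var key;

  overrides = overrides || {};
  // Copy the caller's own properties over the defaults.
  for (key in overrides) {
    if (Object.prototype.hasOwnProperty.call(overrides, key)) {
      options[key] = overrides[key];
    }
  }

  return md(html, options);
}
```

In node.js this might be called as convertWith(require('html-md'), '<a href="/docs">Docs</a>', { base: 'http://example.com' }), bearing in mind the note above that base only works in that environment.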

Miscellaneous

noConflict()

Returns md in a no-conflict state, reassigning the md global variable name to its previous owner where possible.

This is only intended for use within a browser.

<head>
  <script src="/path/to/conflicting.js"></script>
  <script src="/path/to/md.min.js"></script>
  <script>
    var mdNC = md.noConflict();
    // The conflicting library owns `md` again; use mdNC for this library from here on...
  </script>
</head>
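
For readers unfamiliar with the idea, the following is a minimal sketch of the common no-conflict pattern. It is not html.md's actual source: the install function, the root parameter (standing in for the browser's window object) and the placeholder md body are all hypothetical and exist only to show how the global name can be handed back.

```javascript
// Sketch of the common no-conflict pattern; NOT html.md's actual source.
// `root` stands in for the browser's global object (window).
function install(root) {
  // Remember whatever owned the `md` name before this library loaded.
  var previous = root.md;

  function md(html) {
    // Placeholder for the real conversion logic.
    return String(html);
  }

  md.noConflict = function () {
    // Hand the global name back to its previous owner...
    root.md = previous;
    // ...and return this library so the caller can keep a reference.
    return md;
  };

  root.md = md;
  return md;
}

// Usage with a fake global that already defines `md`.
var fakeWindow = { md: 'conflicting library' };
install(fakeWindow);
var mdNC = fakeWindow.md.noConflict();
console.log(fakeWindow.md); // conflicting library
console.log(typeof mdNC);   // function
```

If nothing owned the name beforehand, this sketch simply leaves it undefined after noConflict is called.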

version

The current version of md.

console.log(md.version); // "3.0.2"

Windows

This section is only relevant for node.js users and does not affect browsers.

A lot of care has been put in to ensure html.md runs well on Windows. Unfortunately, one of the dependencies of the jsdom library, which we depend on to emulate a DOM within the node.js environment, does not build well on Windows systems since it's built using "native modules" that are compiled during installation. Contextify, the inherited dependency in question, is used to run <script> contents safely in a sandbox environment and is required to properly parse DOM objects into valid Markdown.

Fortunately, the author has documented some techniques to get it building on your Windows system in a Windows installation guide.

Changes

Version 3.0.2

  • #36: Fix errors in Internet Explorer 8 and older
  • #37: Fix problem with running command line
  • Update versions of dependencies
  • Minor changes and tweaks
View historical changes

Bugs

If you have any problems with this library, or would like to see the changes currently in development, browse our issues.

Developers should run all tests locally and ensure they pass before submitting a pull request.

Questions?

Take a look at the documentation to get a better understanding of what the code is doing.

If that doesn't help, feel free to follow me on Twitter, @neocotic.