Skip to main content

Interpreter Pattern

Define a grammar as composable functions and evaluate expressions by walking the structure.


The Problem​

You need to parse Markdown into HTML. The naive approach: a pile of regex replacements.

function renderMarkdown(source: string): string {
let html = source;
html = html.replace(/^### (.+)$/gm, "<h3>$1</h3>");
html = html.replace(/^## (.+)$/gm, "<h2>$1</h2>");
html = html.replace(/^# (.+)$/gm, "<h1>$1</h1>");
html = html.replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>");
html = html.replace(/\*(.+?)\*/g, "<em>$1</em>");
html = html.replace(/`(.+?)`/g, "<code>$1</code>");
// ... 30 more regex, order matters, edge cases everywhere
return html;
}

This breaks the moment you have nested formatting (**bold *and italic***), code blocks that contain *asterisks*, or links inside headings. Each new feature adds more regex, more ordering bugs, more edge cases. It doesn't scale.


The Solution​

Separate the grammar from the evaluation. Define the AST as a discriminated union, then fold over it recursively.

// 1. Define the grammar as types
type MdNode =
| { type: "document"; children: MdNode[] }
| { type: "heading"; level: number; children: MdNode[] }
| { type: "paragraph"; children: MdNode[] }
| { type: "bold"; children: MdNode[] }
| { type: "italic"; children: MdNode[] }
| { type: "code-block"; lang: string; code: string }
| { type: "text"; value: string }
// ... as many node types as your grammar needs

// 2. Evaluate recursively: one switch, one function
function evalNode(node: MdNode): string {
switch (node.type) {
case "document":
return node.children.map(evalNode).join("\n");
case "heading":
return `<h${node.level}>${node.children.map(evalNode).join("")}</h${node.level}>`;
case "paragraph":
return `<p>${node.children.map(evalNode).join("")}</p>`;
case "bold":
return `<strong>${node.children.map(evalNode).join("")}</strong>`;
case "italic":
return `<em>${node.children.map(evalNode).join("")}</em>`;
case "code-block":
return `<pre><code>${escapeHtml(node.code)}</code></pre>`;
case "text":
return escapeHtml(node.value);
}
}

// 3. Pipeline: source β†’ tokens β†’ AST β†’ HTML
const tokens = tokenize(source);
const ast = parse(tokens);
const html = evalNode(ast);

Adding a new node type? Add a variant to the union, add a case to the switch. TypeScript catches missing cases at compile time. Nesting works for free: evalNode calls itself.

Absorbed by the Language

This solution doesn't use Pithos. That's the point.

In TypeScript, discriminated unions + switch are the Interpreter pattern. Eidos exports a @deprecated interpret() function that exists only to guide you here, just like Taphos marks native APIs that replaced Arkhe utilities.


Live Demo​

Type Markdown on the left, see the evaluated HTML on the right. Toggle to AST to see the tree the evaluator folds over.

Code
/**
* Markdown Interpreter β€” evaluator (the Interpreter pattern).
*
* This file IS the pattern: a recursive fold over a discriminated union AST.
* No library needed β€” just a switch on the discriminant + recursion.
*/

import { tokenizeWithRefs, parse, type MdNode } from "./parsing-pipeline";
import { escape as escapeHtml } from "@pithos/core/arkhe/string/escape";
import { evalChildren, evalListItem, evalTable } from "./evalHelpers";

export type { MdNode } from "./parsing-pipeline";

// ─── Evaluator (AST β†’ HTML) β€” the Interpreter pattern ──────────────

type EvalContext = { indent: number };

function evalNode(node: MdNode, ctx: EvalContext): string {
switch (node.type) {
case "document":
return node.children.map((c) => evalNode(c, ctx)).join("\n");
case "heading":
return `<h${node.level}>${evalChildren(node.children, ctx, evalNode)}</h${node.level}>`;
case "paragraph":
return `<p>${evalChildren(node.children, ctx, evalNode)}</p>`;
case "blockquote":
return `<blockquote>${node.children.map((c) => evalNode(c, ctx)).join("\n")}</blockquote>`;
case "code-block":
return `<pre><code class="language-${escapeHtml(node.lang)}">${escapeHtml(node.code)}</code></pre>`;
case "hr":
return "<hr />";
case "unordered-list":
return `<ul>${node.items.map((item) => evalListItem(item, ctx, evalNode)).join("")}</ul>`;
case "ordered-list":
return `<ol>${node.items.map((item) => evalListItem(item, ctx, evalNode)).join("")}</ol>`;
case "table":
return evalTable(node, ctx, evalNode);
case "text":
return escapeHtml(node.value);
case "bold":
return `<strong>${evalChildren(node.children, ctx, evalNode)}</strong>`;
case "italic":
return `<em>${evalChildren(node.children, ctx, evalNode)}</em>`;
case "bold-italic":
return `<strong><em>${evalChildren(node.children, ctx, evalNode)}</em></strong>`;
case "strikethrough":
return `<del>${evalChildren(node.children, ctx, evalNode)}</del>`;
case "inline-code":
return `<code>${escapeHtml(node.value)}</code>`;
case "link":
return `<a href="${escapeHtml(node.href)}" target="_blank" rel="noopener noreferrer">${evalChildren(node.children, ctx, evalNode)}</a>`;
case "autolink":
return `<a href="${escapeHtml(node.href)}" target="_blank" rel="noopener noreferrer">${escapeHtml(node.href)}</a>`;
case "image":
return `<img src="${escapeHtml(node.src)}" alt="${escapeHtml(node.alt)}" />`;
case "checkbox":
return node.checked
? `<input type="checkbox" checked disabled /> `
: `<input type="checkbox" disabled /> `;
case "line-break":
return "<br />";
case "html-block":
return node.content;
case "inline-html":
return node.content;
}
}

// ─── Public API ─────────────────────────────────────────────────────

/** Full pipeline: source β†’ tokens β†’ AST β†’ HTML */
export function renderMarkdown(source: string): { ast: MdNode; html: string } {
const { tokens, refLinks } = tokenizeWithRefs(source);
const ast = parse(tokens, refLinks);
const html = evalNode(ast, { indent: 0 });
return { ast, html };
}


























Result

Beyond Markdown​

The same pattern works for any grammar. Here's a query DSL:

type Query =
| { type: "select"; fields: string[] }
| { type: "where"; source: Query; predicate: (row: Row) => boolean }
| { type: "limit"; source: Query; count: number };

function execute(query: Query, data: Row[]): Row[] {
switch (query.type) {
case "select":
return data.map(row => pick(row, query.fields));
case "where":
return execute(query.source, data).filter(query.predicate);
case "limit":
return execute(query.source, data).slice(0, query.count);
}
}

Discriminated union for the grammar. Recursive function for the evaluation. That's the whole pattern.


API​

  • interpreter @deprecated β€” use discriminated unions with recursive evaluation

Related