Typography Matters

Image: type-setting character blocks. Photo by Raphael Schaller on Unsplash.

There are two kinds of quotation marks on the web. The first kind is straight — vertical lines that look the same whether they open or close a quote. They’re the ones on your keyboard, the ones that appear when you type "hello" in a text editor or a code block. They exist because typewriters had limited keys and early character encodings had limited space. They are a compromise from a time when compromise was necessary.

The second kind is curly — “smart quotes” that curve inward toward the text they embrace. Opening quotes face one way, closing quotes face another. They distinguish between the start and end of a quotation. They’re what you see in printed books, in professionally typeset documents, in any context where someone cared enough to get the details right.

Most of the web uses the first kind. Not because anyone decided straight quotes were better, but because they’re what you get by default. You type a quote, you get a straight quote. You’d have to go out of your way to get the curly one — type &ldquo; for a left double quote, &rdquo; for a right double quote, &lsquo; and &rsquo; for singles. Nobody does that. Life is too short, and the keyboard doesn’t make it easy.

So the web is full of straight quotes, straight apostrophes, double hyphens where em dashes should be, three dots where ellipsis characters should be, and a general carelessness about typographic detail that wouldn’t survive the first read of a professionally edited manuscript. This isn’t because web developers and writers don’t care. It’s because the tools don’t care for them.

I decided to make tools that do.

Why this matters

Typography isn’t decoration. It’s communication infrastructure.

Good typography makes text easier to read. Not in a vague, hand-wavy, “it just feels nicer” way, but in a concrete, measurable way. Curly quotes reduce ambiguity — you can tell at a glance where a quotation starts and ends. Proper em dashes signal parenthetical asides more clearly than double hyphens. The ellipsis character (…) spaces its dots consistently, while three periods (...) are subject to the vagaries of whatever font the browser chooses.

These are small differences. Individually, none of them will make or break a reader’s comprehension. But they accumulate. A page with correct typography reads smoothly. A page with straight quotes, hyphens for dashes, and periods for ellipses reads like it was drafted in a hurry and published without review. The content might be identical, but the signal is different. One says “I care about this.” The other says "I didn't notice."

For a small software company that publishes technical writing, the signal matters more than usual. If I’m claiming to build careful, considered software — small apps, no bloat, no telemetry, every line of code deliberate — the writing should reflect the same standard. Sloppy typography on a blog post about software craftsmanship is a contradiction. The medium undermines the message.

What the Notes Editor does

The Jorvik Notes Editor is where every blog post on this site is written. It’s a three-pane markdown editor: article list on the left, source editor in the middle, live preview on the right. The source editor is an NSTextView subclass called MarkdownTextView, and it handles typography at three levels.

Level one: as-you-type transforms.

When you type three hyphens and then press space, the editor silently replaces "---" with &mdash;. Two hyphens become &ndash;. Three dots become &hellip;. These replacements happen inline, in real time, as you write. You don’t see the entity codes appear and then get replaced — the transform fires at the word boundary, so the switch is instantaneous.

The transforms are context-aware. If you’re inside a fenced code block (delimited by triple backticks), no transforms are applied. The editor knows that code is code — you want your three dots to stay as three dots, your double hyphens to remain double hyphens. The transform engine splits the document at code block boundaries and only operates on the prose sections.

The transforms are also configurable. The defaults are the three I mentioned — em dash, en dash, and ellipsis — but you can add, remove, or disable any of them. If you want "(c)" to become ©, you can add that. If you want "->" to become →, you can add that too. Each transform is a find-and-replace pair with an enable toggle, persisted in UserDefaults.
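The transform engine can be sketched as follows. This is a minimal batch illustration in JavaScript, not the editor's Swift code: the names `defaultTransforms` and `applyTransforms` are hypothetical, the replacement strings are the entity defaults described in this post, and the real editor works as you type and stores its list in UserDefaults.

```javascript
// Hypothetical sketch: names and data shapes are assumptions, not the editor's code.
// Order matters: "---" must be tried before "--", or an em dash would never match.
const defaultTransforms = [
  { find: "---", replace: "&mdash;", enabled: true },
  { find: "--", replace: "&ndash;", enabled: true },
  { find: "...", replace: "&hellip;", enabled: true },
];

function applyTransforms(source, transforms = defaultTransforms) {
  // Split at fenced code blocks (triple backticks). The capture group makes
  // split() keep the fenced blocks, which land at the odd indexes.
  const parts = source.split(/(`{3}[\s\S]*?`{3})/);
  return parts
    .map((part, i) => {
      if (i % 2 === 1) return part; // fenced code: leave untouched
      return transforms
        .filter((t) => t.enabled)
        .reduce((text, t) => text.split(t.find).join(t.replace), part);
    })
    .join("");
}
```

A custom pair such as `{ find: "(c)", replace: "&copy;", enabled: true }` would simply be appended to the list. An unterminated fence is treated as prose in this sketch, which a real implementation would probably want to handle differently.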

Level two: curly quote conversion.

This is the one that took the most thought. When the editor publishes a post, it runs a function called applyCurlyQuotes() that walks the entire document and converts straight quotation marks to their curly HTML entity equivalents.

The logic is simple in principle and fiddly in practice.

The protected contexts are critical. The curly quote converter skips content inside fenced code blocks, inline code spans (backtick-delimited), and HTML tags. You do not want your <a href="..."> to end up with curly quotes in the href attribute. You do not want your code examples to have typographically correct but syntactically incorrect quotation marks. The engine parses the document structure and only touches prose.

There’s a subtlety with Unicode characters too. If the source text contains literal Unicode em dashes (U+2014) or en dashes (U+2013) — perhaps pasted from another application — the converter replaces those with their HTML entity equivalents as well. This ensures the published HTML is consistent regardless of whether the author typed the entity, used a text transform, or pasted from an external source.
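As a rough illustration of the approach, here is a JavaScript sketch of a curly quote pass that skips the protected contexts named above. The rules for choosing an opening or closing quote are an assumption (a quote at the start of a prose run, or after whitespace or an opening bracket, opens; anything else closes), and `curlProse` is a hypothetical helper. The real applyCurlyQuotes() is Swift and may decide differently.

```javascript
// Hypothetical sketch of a protected-context curly quote converter.
function applyCurlyQuotes(source) {
  // Protected spans: fenced code, inline code, HTML tags.
  // The capture group makes split() keep them, at the odd indexes.
  const protectedSpan = /(`{3}[\s\S]*?`{3}|`[^`\n]*`|<[^>]+>)/;
  return source
    .split(protectedSpan)
    .map((part, i) => (i % 2 === 1 ? part : curlProse(part)))
    .join("");
}

function curlProse(text) {
  return text
    // Assumed rule: a quote at the start of the run, or after whitespace
    // or an opening bracket, is an opening quote.
    .replace(/(^|[\s(\[{])"/g, "$1&ldquo;")
    // Every remaining double quote closes.
    .replace(/"/g, "&rdquo;")
    .replace(/(^|[\s(\[{])'/g, "$1&lsquo;")
    // Remaining single quotes are closing quotes or apostrophes.
    .replace(/'/g, "&rsquo;")
    // Literal Unicode dashes become entities too.
    .replace(/\u2014/g, "&mdash;")
    .replace(/\u2013/g, "&ndash;");
}
```

For example, `applyCurlyQuotes('She said "don\'t" to him.')` yields `She said &ldquo;don&rsquo;t&rdquo; to him.`, while quotes inside an href attribute or a backtick span pass through unchanged. One known weakness of this sketch: a quote that directly follows a closing tag starts a new prose run and is treated as an opening quote.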

Level three: spell checking and grammar.

The editor uses macOS’s native spell-checking infrastructure. Three lines of Swift enable it:

textView.isContinuousSpellCheckingEnabled = true
textView.isGrammarCheckingEnabled = true
textView.isAutomaticSpellingCorrectionEnabled = false

The third line is deliberate. Autocorrect is disabled because blog posts about software development contain words that the system dictionary doesn’t know — API names, command-line flags, variable names, HTML entities. Autocorrect would silently change NSTextView to something it considered more plausible, which is exactly the kind of “help” that causes more problems than it solves. Squiggly underlines for misspellings, yes. Automatic replacement, no.

The macOS smart quote and smart dash substitutions are also explicitly disabled:

isAutomaticQuoteSubstitutionEnabled = false
isAutomaticDashSubstitutionEnabled = false
isAutomaticTextReplacementEnabled = false

This might seem contradictory — I’m building my own curly quote system while disabling the one Apple provides. The reason is control. The system-level substitutions operate on raw characters and don’t understand markdown structure. They’d insert curly quotes inside code blocks, inside HTML attributes, inside places where straight quotes are correct. My implementation understands the document structure and only converts quotes in prose. Same outcome, different level of awareness.

There’s one more piece: the spell checker would flag HTML entities as misspellings. So the editor implements a delegate method — shouldSetSpellingState — that detects when the proposed squiggle range overlaps with an HTML entity (the text between & and ;) and suppresses it. The same delegate suppresses squiggles inside <abbr> tags, which brings us to acronyms.
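The overlap test at the heart of that delegate is small. A JavaScript sketch of the idea follows; the function name `overlapsHTMLEntity` and the entity pattern are assumptions for illustration, and the editor's actual check lives in Swift inside the delegate method.

```javascript
// Hypothetical sketch: does the half-open range [start, start + length)
// overlap any HTML entity (named, decimal, or hex) in the text?
function overlapsHTMLEntity(text, start, length) {
  const entity = /&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#x[0-9a-fA-F]+);/g;
  const end = start + length;
  for (const match of text.matchAll(entity)) {
    const entityStart = match.index;
    const entityEnd = entityStart + match[0].length;
    // Two half-open ranges overlap when each starts before the other ends.
    if (start < entityEnd && entityStart < end) return true;
  }
  return false;
}
```

In "Wait&hellip; done", a proposed squiggle over "hellip" (offset 5, length 6) overlaps the entity and would be suppressed, while one over "Wait" or "done" would not.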

Acronyms

The Notes Editor maintains a list of common technical acronyms: CLI, DNS, GPU, HTTPS, JSON, REST, SDK, TLS, URL, WASM, YAML, and more. Each entry pairs an abbreviation with its expansion.

When you type one of these acronyms and press space, the editor checks whether it’s the first occurrence in the document. If it is, the acronym is automatically wrapped in an <abbr> tag with the expansion as its title attribute:

<abbr title="Command Line Interface">CLI</abbr>

Subsequent occurrences of the same acronym are left alone — only the first instance gets the tag. This follows the convention used in most technical writing: define the abbreviation on first use, then use the short form thereafter.

The same logic applies to pasted text. If you paste a paragraph containing several acronyms, the editor scans the paste and wraps first occurrences. It processes matches in reverse document order to keep the character offsets stable as it inserts tags.
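The first-occurrence rule and the reverse-order insertion can be sketched like this. It is a simplified JavaScript illustration, not the editor's Swift code: `acronymTable` and `wrapFirstAcronyms` are hypothetical names, the table holds only two entries, and the sketch does not skip code blocks or detect acronyms that are already wrapped.

```javascript
// Hypothetical sketch: a tiny acronym table and a first-occurrence wrapper.
const acronymTable = new Map([
  ["CLI", "Command Line Interface"],
  ["JSON", "JavaScript Object Notation"],
]);

function wrapFirstAcronyms(text, table = acronymTable) {
  const seen = new Set();
  const firstMatches = [];
  // Collect the first occurrence of each known acronym, in document order.
  for (const match of text.matchAll(/\b[A-Z][A-Z0-9]+\b/g)) {
    const word = match[0];
    if (table.has(word) && !seen.has(word)) {
      seen.add(word);
      firstMatches.push({ word, index: match.index });
    }
  }
  // Insert tags from the end of the text backwards, so the recorded
  // offsets of earlier matches stay valid as the string grows.
  let result = text;
  for (const { word, index } of firstMatches.reverse()) {
    const tag = `<abbr title="${table.get(word)}">${word}</abbr>`;
    result = result.slice(0, index) + tag + result.slice(index + word.length);
  }
  return result;
}
```

Given "Use the CLI to emit JSON. The CLI is fast.", only the first CLI and the single JSON are wrapped; the second CLI is left alone.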

Like the text transforms, the acronym list is customisable. You can add domain-specific abbreviations, disable ones you don’t use, and the whole thing persists between sessions.

The spell checker knows about this too. The shouldSetSpellingState delegate recognises <abbr> tags and suppresses squiggles inside them, because “API” is not a misspelling and neither is the tag that wraps it.

What the Web Editor does

The Jorvik Web Editor handles the website’s HTML, CSS, and JavaScript files — everything that isn’t a blog post. It has the same three-pane layout and the same text transform engine, but tuned for markup rather than markdown.

The transforms are the same defaults — "---" to &mdash;, "--" to &ndash;, "..." to &hellip; — but the context detection is different. The Web Editor needs to avoid transforms inside HTML tags, <script> blocks, and <style> blocks. The isInsideHTMLTag() function counts opening and closing angle brackets to determine whether the cursor is inside a tag. The isInsideScriptOrStyle() function checks whether the current position falls between <script> and </script> or <style> and </style> tags.

This means you can type "..." in a paragraph and get &hellip;, but the same three dots inside a <script> block remain "...". You can type "--" in body text and get an en dash entity, but -- in a CSS comment stays as it is. The editor understands what it’s editing.
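Both checks can be approximated in a few lines. The sketch below is JavaScript and reuses the function names from the post for readability, but the bodies are assumptions based only on the description above; the Swift originals may differ in detail.

```javascript
// Sketch: true when more "<" than ">" appear before the position,
// meaning a tag has been opened and not yet closed.
function isInsideHTMLTag(text, position) {
  const before = text.slice(0, position);
  const opens = (before.match(/</g) || []).length;
  const closes = (before.match(/>/g) || []).length;
  return opens > closes;
}

// Sketch: true when the nearest preceding <script> or <style> opening
// tag has no closing tag between it and the position.
function isInsideScriptOrStyle(text, position) {
  const before = text.slice(0, position).toLowerCase();
  return ["script", "style"].some((name) => {
    const lastOpen = before.lastIndexOf(`<${name}`);
    const lastClose = before.lastIndexOf(`</${name}`);
    return lastOpen > lastClose;
  });
}
```

Bracket counting is cheap but naive: a literal ">" or "<" in body text (one that should have been written as an entity) throws the count off, which is a reasonable trade-off for an as-you-type check.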

The Web Editor also validates HTML using html-validate, a Node.js linter that checks against HTML5 standards. One toolbar button validates the current file and reports errors with line and column numbers. It catches unclosed tags, invalid nesting, missing alt attributes, and the kind of structural problems that are easy to introduce and hard to spot by eye. CSS files get a brace-balance check. JavaScript files get a syntax check via Node’s --check flag.
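A brace-balance check of the kind mentioned for CSS might look like the following. This is a hypothetical JavaScript sketch (the name `checkBraceBalance` and the result shape are assumptions), and it ignores braces inside strings and comments, which a stricter check would need to handle.

```javascript
// Hypothetical sketch: report whether "{" and "}" are balanced, and the
// line where a problem is first noticed.
function checkBraceBalance(css) {
  let depth = 0;
  let line = 1;
  for (const ch of css) {
    if (ch === "\n") {
      line += 1;
    } else if (ch === "{") {
      depth += 1;
    } else if (ch === "}") {
      depth -= 1;
      // A closing brace with nothing open is an immediate error.
      if (depth < 0) return { ok: false, line, message: "unexpected }" };
    }
  }
  if (depth > 0) return { ok: false, line, message: `${depth} unclosed {` };
  return { ok: true };
}
```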

The makeWebReady pipeline

When a blog post is published from the Notes Editor, it goes through a function called makeWebReady(). This is the final typography pass — the conversion from “what the author typed” to “what the web should display.” The pipeline runs four steps, in order:

  1. Strip existing <abbr> tags. This gives the processor a clean slate. If the document was previously published and then edited, the old acronym wrapping might be stale (perhaps an earlier occurrence was deleted, and the “first occurrence” is now different). Stripping and re-applying ensures correctness.

  2. Apply curly quotes. Straight quotes become their HTML entity equivalents, respecting code blocks, inline code, and HTML tags.

  3. Apply text transforms. Em dashes, en dashes, ellipses, and any custom transforms the user has defined.

  4. Wrap acronyms. First occurrences of known abbreviations get <abbr> tags, skipping protected contexts and checking for pre-existing wrappers.

The order matters. Curly quotes must be applied before acronym wrapping, because the acronym detector uses word boundaries and curly quote entities are valid boundary characters. Transforms must run before acronym wrapping because a transform might change the text in a way that creates or destroys an acronym match.
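The four steps amount to function composition in a fixed order. Here is a JavaScript sketch under stated assumptions: `stripAbbrTags` is a guess at step one, and the other three steps are passed in as functions because their real implementations are Swift and not shown here.

```javascript
// Hypothetical sketch of step 1: remove <abbr> wrappers, keep their content.
function stripAbbrTags(html) {
  return html.replace(/<abbr\b[^>]*>([\s\S]*?)<\/abbr>/g, "$1");
}

// The pipeline: each step takes a string and returns a new string, so the
// source passed in is never mutated.
function makeWebReady(source, { curlyQuotes, transforms, wrapAcronyms }) {
  const steps = [
    stripAbbrTags, // 1. clean slate for acronym wrapping
    curlyQuotes,   // 2. straight quotes to entities
    transforms,    // 3. dashes, ellipses, custom pairs
    wrapAcronyms,  // 4. first occurrences get <abbr>
  ];
  return steps.reduce((text, step) => step(text), source);
}
```

Keeping the order in one array makes the dependency explicit: moving `wrapAcronyms` earlier would be a one-line change, and an obviously wrong one.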

The output of makeWebReady() is the HTML that gets deployed to the website. The markdown source file is never modified by this process.

This separation is important. The source file is a working document. It should be easy to edit, easy to diff, and easy to search. The published HTML is a presentation document. These are different requirements, and they get different treatment.

Widow control

There’s one more typographic detail that the toolchain handles, and it’s the one most people have never heard of by name but have noticed without knowing it.

A widow is a single word sitting alone on the last line of a paragraph. The paragraph runs across five or six comfortable lines, and then the final word — just one word, or sometimes two short ones — wraps to a new line by itself. It dangles. It looks unfinished, like the paragraph trailed off rather than ended. In print typography, widows are considered a defect. Typesetters go to considerable lengths to avoid them — adjusting word spacing, rewriting sentences, or at minimum inserting a non-breaking space between the last two words so they wrap together.

On the web, widows are everywhere. CSS has orphans and widows properties, but they only take effect where content is fragmented, as in paged media (print) and multi-column layouts — not in the normal flow of paragraphs on a screen. The text-wrap: pretty property is a newer addition that gives the browser permission to balance line breaks more thoughtfully, but browser support is still inconsistent and the results are unpredictable. There is no reliable, cross-browser CSS solution for preventing widows in body text.

So I handle it in the build step.

The blog’s build pipeline (build-notes.js) includes a function called preventOrphans that runs on every post’s HTML after Markdown parsing. It finds the last space before a closing </p>, </li>, or heading tag (</h1> through </h6>) and replaces it with a non-breaking space character — Unicode \u00a0, the same character that the &nbsp; entity produces.

function preventOrphans(html) {
    return html.replace(/ ([^ <]+)<\/(p|li|h[1-6])>/g, '\u00a0$1</$2>');
}

One regular expression. One character substitution. The effect: the last two words of every paragraph, list item, and heading are glued together. They always wrap as a pair. You never see a single word stranded on the last line.

There’s a plain-text variant for contexts where the input isn’t HTML — post titles, navigation link text, the index page summaries:

function preventOrphansText(text) {
    return text.replace(/ (\S+)$/, '\u00a0$1');
}

Same idea, simpler pattern. Find the last space in the string, replace it with a non-breaking space. The title “The Case for Small Software” never breaks between “Small” and “Software” — they’re joined at the hip.

The regex deliberately excludes inline tags like <a>, <span>, <em>, <strong>, <code>, and <time> from the closing-tag match. This prevents the substitution from corrupting spaces inside the attributes of adjacent elements. It only fires on block-level closing tags, which is where widows actually matter.
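A quick demonstration of that behaviour, with the function copied from above so the snippet runs on its own. The sample strings are made up for illustration.

```javascript
// Copied from the post so this example is self-contained.
function preventOrphans(html) {
    return html.replace(/ ([^ <]+)<\/(p|li|h[1-6])>/g, '\u00a0$1</$2>');
}

// The last word sits directly before </p>: the final space is replaced.
const plain = preventOrphans("<p>Nothing dangles here</p>");
console.log(plain === "<p>Nothing dangles\u00a0here</p>"); // true

// The last word sits inside an inline <a>: no match, so nothing changes,
// including the space inside the opening tag's attribute list.
const linkedInput = '<p>See <a href="/notes">the notes</a></p>';
const linked = preventOrphans(linkedInput);
console.log(linked === linkedInput); // true
```

The second case also shows the limit of the approach: a paragraph that ends in a link or emphasis gets no widow protection.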

The footnote back-reference links get the same treatment — the space before a <a data-footnote-backref is replaced with a non-breaking space so the back-reference arrow doesn’t orphan onto its own line.
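The back-reference rule could be expressed as one more replacement. The sketch below is hypothetical: the function name `preventOrphanedBackrefs` and the exact pattern are assumptions, and only the data-footnote-backref attribute comes from the description above.

```javascript
// Hypothetical sketch: glue a footnote back-reference link to the word
// before it by replacing the preceding space with a non-breaking space.
function preventOrphanedBackrefs(html) {
    return html.replace(/ (<a\b[^>]*\bdata-footnote-backref)/g, '\u00a0$1');
}
```

Only the space immediately before the matching opening tag is touched; spaces elsewhere in the footnote text, and before ordinary links, are left as they are.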

This is one of those changes that’s invisible when it’s working. You don’t notice that the last line of a paragraph has two words instead of one. But you do notice — at some level below conscious awareness — that the paragraphs feel complete. That they end with the same visual weight as they begin. That nothing dangles.

It’s a small thing. It’s one regex. But it’s the kind of small thing that separates text that was published from text that was typeset.

The philosophy

I’ve been thinking about why most writing on the web has poor typography, and I don’t think it’s because people don’t care. I think it’s because the effort of caring is too high relative to the perceived benefit.

Typing &ldquo; instead of " means seven characters in place of one, every time, for a visual difference that many readers won’t consciously notice. Over the course of a 2,000-word blog post, that’s dozens of extra entities to type, remember, and get right. Missing one is easy. Getting the opening/closing direction wrong is easy. Accidentally inserting a curly quote inside a code example is easy. The effort compounds and the reward is subtle.

The solution isn’t to tell writers to try harder. The solution is to make the tools do the work. Type a straight quote; the tool produces a curly entity. Type three hyphens; the tool produces an em dash. Type an acronym; the tool wraps it in an <abbr> tag. Misspell a word; the tool underlines it. Use &hellip; in your prose; the spell checker knows it’s not a typo.

The writer focuses on the content. The tool handles the presentation. The published result has the typography of a professionally typeset document, and the source file has the simplicity of a plain text editor. You don’t have to choose between writing comfort and typographic quality. You get both.

What good typography signals

I mentioned earlier that typography is a signal. I want to expand on that, because I think it’s under-appreciated.

When you visit a website and the text uses proper typographic characters — curly quotes, em dashes, correctly spaced ellipses, acronyms with title attributes — you probably don’t consciously register any of these details. But you register something. The page looks “right.” It looks considered. It looks like someone proofread it, or at least like someone built a system that proofread it for them.

When the text uses straight quotes, double hyphens, and "...", you register that too. Maybe not consciously. But there’s a subtle impression of roughness, of draft quality, of “this went from the author’s head to the screen without passing through any editorial process.”

Neither impression is about the content. A brilliant essay in straight quotes is still brilliant. A mediocre one in curly quotes is still mediocre. But when the content is good, correct typography lets it shine without distraction. And when you’re a small operation — one person writing blog posts between building macOS utilities — the typography is one of the few signals you can control that says “this is not amateur hour.”

It’s the same principle that applies to the software itself. QuitProtect is 600 lines of Swift. It could have rough edges — an un-styled window, a janky animation, a menu bar icon that’s two pixels off-centre — and it would still work. But those rough edges would signal carelessness. The icon is centred. The window is styled. The animation is smooth. Not because anyone asked for it, but because the details are the product.

Typography is the same. The curly quotes are the centred icon. The em dashes are the smooth animation. Nobody asks for them. Most people won’t notice if they’re missing. But their presence says something about the standard to which the work is held.

The cost

The text transform engine is about 100 lines of Swift. The curly quote converter is about 70 lines. The acronym wrapper is about 80 lines. The spell-check entity suppression is about 30 lines. The makeWebReady pipeline is about 50 lines. The widow prevention is two functions totalling about 10 lines of JavaScript.

Call it 340 lines of code, total, across two applications and a build script. That’s the cost of correct typography on every page of this website. No external dependencies. No cloud service. Just string processing, applied at the right moment, with awareness of the document structure.

Three hundred and forty lines of code to convert "I don't care" into “I don’t care.” To turn -- into –. To wrap “API” in a tag that tells screen readers what it stands for. To keep the last two words of every paragraph together so nothing dangles.

That’s a trade I’ll make every time.