Reasons EPUB hasn’t taken off in academia:
The page number problem is the killer. This is the single biggest barrier. Only 25% of incoming EPUB files to EBSCO contain page numbers, and academic libraries and users will not adopt EPUB at scale until that number is much higher. The reason this matters so deeply is that the entire citation infrastructure of scholarship depends on stable, findable locations within a text. Faculty members who have to grade papers and check sources recoil at the thought of finding an ebook in a database and counting paragraphs. EPUB’s reflowable text means page numbers shift depending on device and font size. Oxford’s guidance to students notes that location numbers in EPUB and Kindle formats do not always provide a reliable method of citation, as they may change depending on font type, size, line spacing, and margins. Every major citation style — APA, Chicago, Harvard — has had to create awkward workarounds like paragraph counting and section-heading citation. This is a fundamental design failure: EPUB was built for reading comfort without considering that scholarship requires addressability.
Proposed solution: Decouple ‘page’ from the size of the reader’s screen or viewport. Make every paragraph addressable instead, perhaps via Engelbart-style purple numbers, where each paragraph carries its own identifier that doubles as a link anchor.
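To make the contrast concrete, here is a minimal Python sketch of paragraph anchors next to viewport-dependent pages. Everything in it is an assumption for illustration: the function names `assign_anchors` and `page_of`, the `p1`, `p2` ID scheme, and the crude line-wrapping model of pagination are hypothetical, not part of EPUB or of any purple-number implementation.

```python
from textwrap import wrap


def assign_anchors(text: str) -> dict[str, str]:
    """Assign each paragraph an ID (p1, p2, ...) once, at authoring time.

    Assumption: the IDs are then stored with the document and never
    regenerated, so later edits do not renumber existing paragraphs.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return {f"p{i}": p for i, p in enumerate(paragraphs, start=1)}


def page_of(anchors, anchor_id, width, lines_per_page):
    """Return the 1-based page on which a paragraph starts for one viewport."""
    lines_before = 0
    for key, paragraph in anchors.items():
        if key == anchor_id:
            return lines_before // lines_per_page + 1
        # Reflow: the number of lines depends on the viewport width.
        lines_before += len(wrap(paragraph, width))
    raise KeyError(anchor_id)


# Three dummy paragraphs standing in for a document.
sample = "\n\n".join(["alpha " * 40, "beta " * 40, "gamma " * 40])
anchors = assign_anchors(sample)

# The same paragraph, shown on a narrow and a wide screen.
narrow = page_of(anchors, "p3", width=40, lines_per_page=10)
wide = page_of(anchors, "p3", width=80, lines_per_page=10)
print(narrow, wide)
```

In this toy model the third paragraph starts on a different page for each screen width, while the anchor `p3` points at the same text in both. A citation to `p3` therefore survives reflow where a citation to a page number would not.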
PDF was there first, and it’s good enough. Scholarly journal publishers were among the first to move to digital delivery, so they built a readership that was already comfortable reading on web platforms or downloading PDFs. The entire ecosystem, from LaTeX to journal submission systems to library databases, was built around PDF. Academics aren’t resisting EPUB because they love PDF; they’re staying with PDF because switching costs are enormous and the benefits of EPUB (reflowable text, accessibility) don’t address the things academics actually need most: citation stability, visual fidelity of equations and figures, and the ability to say “see page 47.”
Small academic publishers can’t afford the conversion. A significant portion of academic content is still PDF-only (30% for publication years 2019 and later), and most of the publishers that aren’t creating EPUB for all of their titles say they can’t afford the conversion process. EPUB production requires XHTML/CSS expertise that most small university presses don’t have in-house, and outsourcing costs range from $0.60 per page at the low end to thousands of dollars for complex academic content with tables, equations, and footnotes.
EPUB readers are not where scholars read. Most readers of scholarly content do not read journal articles on dedicated e-book devices. Academics read on laptops, on library terminals, and increasingly in browsers. The Kindle, Kobo, and Nook ecosystem that drove EPUB adoption in trade publishing simply isn’t where scholarly articles live. EBSCO gives users a choice between EPUB and PDF for every title where both are available, and users select the EPUB version only 15–20% of the time.
The format tried to serve reading but ignored thinking. This is the implicit lesson across all the sources and the one most relevant to Origami Text. EPUB was designed as a delivery format: a better way to read a finished text on a screen. It was never designed as a knowledge format. It has no native concept of structured citation data, no semantic markup for claims and evidence, no mechanism for a document to declare its own addressable parts in a way that other documents can reliably point to. It solved the wrong problem: it made text comfortable to consume on devices while leaving the scholarly infrastructure of reference, verification, and interconnection entirely to PDF’s frozen pages.
This is exactly where Origami Text has an opening. The gap isn’t “EPUB but better at reflowing.” The gap is a format that treats addressability, citation, and rich data embedding as first-class citizens — not afterthoughts — while remaining simple enough that a single author can produce one without a conversion pipeline. Standoff markup maps directly onto this: if the text and its structural annotations live in the same package but as separate layers, you get PDF’s addressability and EPUB’s flexibility without sacrificing either.
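As a rough illustration of the standoff idea, the Python sketch below keeps the base text as an untouched string and stores annotations in a separate layer that points into it by character offset. This is not Origami Text's actual design: the `Annotation` class, the `annotate` and `resolve` helpers, and the `s1`/`c1` identifiers are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    anchor_id: str  # stable identifier other documents can cite
    kind: str       # e.g. "sentence" or "claim" (illustrative labels)
    start: int      # inclusive character offset into the base text
    end: int        # exclusive character offset into the base text


def annotate(text, phrase, anchor_id, kind):
    """Build an annotation for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)
    return Annotation(anchor_id, kind, start, start + len(phrase))


def resolve(text, layer, anchor_id):
    """Return the span of base text that an anchor ID points to."""
    for annotation in layer:
        if annotation.anchor_id == anchor_id:
            return text[annotation.start:annotation.end]
    raise KeyError(anchor_id)


# Layer 1: the text itself, with no inline markup.
BASE_TEXT = "Reflowable text has no fixed pages. Citations need stable targets."

# Layer 2: annotations that live beside the text, not inside it.
layer = [
    annotate(BASE_TEXT, "Reflowable text has no fixed pages.", "s1", "sentence"),
    annotate(BASE_TEXT, "Citations need stable targets.", "s2", "sentence"),
    annotate(BASE_TEXT, "stable targets", "c1", "claim"),
]

print(resolve(BASE_TEXT, layer, "c1"))
```

Because the annotations sit outside the text, they can overlap or nest freely (here `c1` falls inside `s2`) and the text can be rendered at any width without disturbing them. The trade-off, under this offset-based assumption, is that any edit to the base text requires the offsets in the layer to be updated.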
