AI: Summary
The 23 February 2026 session of the Future Text Lab brought together Frode Hegland, Tom Haymes, Peter Dimitrios, Ken Perlin, Astral_Druid, and Rob Swigart (via chat) for a wide-ranging conversation centered on how text, citations, and knowledge structures can be meaningfully represented and interacted with in spatial and extended reality environments. The session moved fluidly between the concrete design challenge of displaying quotes and citations in visionOS, the broader philosophical question of what XR genuinely offers over 2D interfaces, the role of AI as infrastructure rather than a feature to bolt on, and the historical precedents — HyperCard, VisiCalc, WordPerfect — that might help the group identify what a “killer app” for spatial text could look like.
AI: Main Topic
The primary focus was Frode Hegland’s presentation of two slides posing a specific design question: when a citation or quote is displayed as a node in XR, what should its closed state show versus its open state, and how should different types of scholarly objects (defined concepts/glossary terms; citations with author, title, and BibTeX; and quotes, meaning a citation plus text) be visually distinguished and navigated? Frode proposed that the closed state show the first three lines of a quote with author and title beneath, expanding to full scrollable content when opened, and floated the idea that a node’s depth could encode the amount of text it contains. This design question opened into a much broader discussion of context versus content in spatial environments, the nature of AI-assisted versus manual navigation, and what spatial computing can do that flat screens simply cannot.
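Frode’s proposal for closed and open node states can be sketched as a small data model. This is a hypothetical illustration only: the class and field names below, and the specific depth formula, are assumptions for the sketch, not anything specified in the session.

```python
from dataclasses import dataclass

@dataclass
class QuoteNode:
    """One spatial node holding a quote plus its citation."""
    author: str
    title: str
    text: str
    is_open: bool = False

    def depth(self) -> float:
        # Frode's idea: node depth encodes how much text the node
        # contains. The scaling here is an arbitrary placeholder.
        return 0.01 + 0.001 * (len(self.text) // 100)

    def render_lines(self) -> list[str]:
        """Closed state: first three lines of the quote, then author
        and title beneath. Open state: the full text (which the real
        UI would make scrollable), with the same citation line."""
        lines = self.text.splitlines()
        citation = f"{self.author}, {self.title}"
        if self.is_open:
            return lines + [citation]
        return lines[:3] + [citation]
```

The point of the sketch is that "closed versus open" is a rendering decision over one underlying object, so glossary terms and citations could reuse the same pattern with different closed-state summaries.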
AI: Highlights
Tom Haymes highlighted the importance of breadcrumbs and suggested color-coded layered controls at the bottom of each node — a rollover or gaze-based mechanism to shift between annotation, citation, and connection views — noting that citations are fundamentally breadcrumbs pointing outward.
Ken Perlin introduced the analogy of AI as road infrastructure: just as a car builder must design for roads that already exist and that shape every decision, anyone building spatial text tools must accept that AI is the underlying infrastructure, not an optional add-on. He explicitly stated this doesn’t mean putting AI into everything, but it does mean understanding where things are heading from 2026 onward.
Peter Dimitrios repeatedly stressed the value of specificity — “do a particular thing” — and cautioned against boiling the ocean. He noted that Frode’s authoring use case for Author is already well-defined and that overextending the spatialized version risks losing that clarity.
Ken Perlin proposed the concept of diegetic prototyping as a methodology the group should explore: looking at how science fiction films and TV shows (referencing Minority Report, John Wick, and Iron Man/Jarvis) depict spatial and AI-assisted interaction as a way of thinking through design fictions for mixed reality text environments.
Astral_Druid (via both voice and chat) raised the observation that any research study comparing XR to 2D interfaces faces an especially tricky problem: people are changed by the environments they inhabit, meaning the group is designing for a moving target. Ken affirmed this as a genuine research design challenge requiring careful controls.
Frode highlighted a moment of embodied confusion as a genuine insight: while in conversation, he instinctively tapped to trigger the “focus” function in his spatial app — despite not being in VR. He compared this to the frustration users of his Liquid tool feel when working on a machine without it, and suggested that producing this kind of “withdrawal” feeling could itself be a definition of what a killer XR app would achieve.
AI: Insights
The distinction between context and content in spatial environments emerged as a genuine conceptual reframing. Frode demonstrated that participant profiles, glossary terms, and relational metadata function best as passive “context” — background objects that can be ignored or snapped away — while quotes and citations are active “content.” This separation allows a spatial workspace to breathe without becoming a visual spaghetti of equally weighted objects.
There is a productive tension between Peter’s and Ken’s positions on 2D versus 3D. Peter argued that 2D environments like GatherTown / WorkAdventure already deliver a surprisingly effective conference-navigation experience without requiring a headset, and that 3D may be “sugar coating the information architecture.” Ken countered that the market reality — essentially nobody is using VR compared to hundreds of millions using 2D interfaces — means there can be no standards yet, and that VR today resembles hi-fi audio in the 1950s: a niche interest that won’t drive conventions. Frode played devil’s advocate for the “movie theater” position: some experiences justify a special environment even before it is widely accessible, and the group should feel free to invent for that environment without requiring it to degrade perfectly to 2D.
The shift from hardcoded Ted Nelson–style hyperlinks to AI-inferred contextual links was recognized as a fundamental transition the group is living through. Peter observed that where authors once explicitly authored the link (“go here”), AI will increasingly infer the most statistically probable destination — which raises the question of how an author preserves intentional, curated navigation against the probabilistic averaging of large language models.
Ken’s observation that he is prototyping a Wikipedia visualizer where the entire page text acts as the link — with an AI assistant deciding what to link to at runtime — is directly consistent with Peter’s framing. Both point toward documents where the link layer is no longer baked into the document structure but is generated dynamically based on context. This is a significant departure from the document model the group has been working with.
The x,y,z coordinate system for spatial layout was collectively reframed as a necessary but insufficient abstraction. Ken noted that people organize real rooms not by coordinates but by lived embodied logic — table, window, bookshelf — and that spatial information environments need to reach for a semantic layer above raw coordinates. Peter proposed layout “connection types” (stack, sphere, array, column) as a more human-meaningful vocabulary than positional values. Frode agreed that x,y,z is purely an internal representation — equivalent to a page layout engine — and that no user cares about it directly.
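Peter’s layout "connection types" can be sketched as named layout functions that resolve a semantic arrangement into raw x,y,z only at render time, keeping coordinates internal exactly as Frode described. All function names and parameters below are hypothetical illustrations, not anything from the session or from Visual Meta.

```python
import math

def layout_column(n: int, spacing: float = 0.3) -> list[tuple[float, float, float]]:
    """Stack n nodes vertically: the author asks for a 'column',
    never for coordinates."""
    return [(0.0, -i * spacing, 0.0) for i in range(n)]

def layout_sphere(n: int, radius: float = 1.0) -> list[tuple[float, float, float]]:
    """Place n nodes evenly on a circle around the viewer
    (a flat slice of Peter's 'sphere' type)."""
    return [(radius * math.cos(2 * math.pi * i / n),
             0.0,
             radius * math.sin(2 * math.pi * i / n))
            for i in range(n)]

LAYOUTS = {"column": layout_column, "sphere": layout_sphere}

def place(nodes: list[str], connection_type: str) -> dict[str, tuple[float, float, float]]:
    """Map node ids to positions from a human-meaningful layout name;
    x,y,z stays an implementation detail, like a page layout engine."""
    positions = LAYOUTS[connection_type](len(nodes))
    return dict(zip(nodes, positions))
```

Adding "stack" or "array" would mean adding one more function to the table, which is the sense in which the vocabulary sits above, rather than replaces, the coordinate system.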
The question of whether walking through information is inherently valuable — or whether “zoom” (the ability to move toward and away from information density at will, with or without physical movement) is the actual underlying affordance — remained productively unresolved. Tom introduced zoom-in/zoom-out and speed-up/slow-down as two independent but related axes of cognitive control that spatial environments could serve. Photography came up as Tom’s personal analog for deliberate deceleration — a conscious hitting of the brakes — suggesting that the design of “slow” modes in information environments may be as important as fast navigation.
The form-factor barrier to XR adoption was named clearly and practically: Frode noted that his partner Dini won’t wear the headset in the morning because it disturbs her hair before meetings. Ken stated the Meta Quest 3 form factor is “a complete showstopper for almost everybody” regardless of its functionality. AR glasses — not VR headsets — were agreed upon as the inevitable and necessary form factor for mainstream spatial text use, with Apple Vision Pro positioned as the closest current research-quality tool, analogous to the original Mac as a first-generation device pointing toward what would eventually be practical.
Peter raised a privacy concern about camera-equipped glasses that went beyond mere inconvenience: face recognition capabilities built into AR glasses could “poison the well” for the entire category, creating a “glass-holes” dynamic that society may reject regardless of the utility of the underlying technology. Ken affirmed this is already happening.
AI: Resources Mentioned
Walter Ong, Orality and Literacy — mentioned by Frode Hegland as a book he is currently reading, which he described as sparking an inspiration in every sentence and therefore slow to get through. PDF shared by Peter Dimitrios: https://monoskop.org/images/d/db/Ong_Walter_J_Orality_and_Literacy_2nd_ed.pdf
Visual Meta Spatial format — Frode Hegland’s spatial layout specification using id plus x,y,z: https://visual-meta.info/spatial-visual-meta/
GatherTown — 2D conference environment mentioned by Peter Dimitrios as a successful model for navigating conference rooms virtually. No longer open source. OSS alternative WorkAdventure: https://workadventu.re/ (GitHub: https://github.com/workadventure/workadventure)
Make It So: Interaction Design Lessons From Science Fiction — book shared by Tom Haymes, described as “sci-fi design lessons”: https://a.co/d/0aQupgEF — Ken Perlin had not read it and expressed intent to do so; Frode confirmed having read it.
Liquid — Frode Hegland’s macOS text-selection tool enabling rapid search, translation, and reference operations via keyboard shortcut and toolbar.
Author — Frode Hegland’s macOS writing application with integrated visual map and Visual Meta support.
Ward Cunningham and the Federated Wiki — mentioned by Peter Dimitrios as the context for his 3D wiki explorations, with Federated Wiki pages licensed CC-BY-SA.
Google NotebookLM — mentioned by Peter Dimitrios as an example of AI-generated views going beyond explicit citation and link authorship.
Google Gemini — mentioned by Ken Perlin as a tool he has been using, noting it does not create the illusion of being a person, which he found “refreshing.”
Claude — mentioned by Frode Hegland as his primary AI tool, used for spatial data analysis and text processing in Author, including a “paste processed” function using Apple Intelligence and ChatGPT.
HyperCard / Bill Atkinson — discussed extensively as a historical precedent; Rob Swigart noted via chat that Steve Jobs killed it and that Bill Atkinson was asked to build a VR version but declined.
VisiCalc, WordPerfect — referenced by Peter Dimitrios and Ken Perlin as paradigm-defining applications whose descendants still shape current software, and as a benchmark for what a spatial “killer app” would need to achieve.
Ted Nelson / Xanadu — referenced in the context of explicit authorial hyperlinking versus AI-inferred linking; Tom Haymes and Ken Perlin affirmed in chat that automated linking in Xanadu was indeed Nelson’s intent.
NLS / Doug Engelbart — referenced by Frode as the source of the command-with-question-mark interaction pattern that inspired Liquid.
Second Life — mentioned by Tom Haymes and Peter Dimitrios as a cautionary precedent: conferences held in it often just reproduced PowerPoint presentations, failing to use the medium distinctively.
Babylon.js and Three.js — mentioned by Peter Dimitrios as his current WebXR development tools.
Apple Vision Pro, Meta Quest 3, Meta Ray-Bans — discussed as current XR hardware; Ray-Bans noted by Tom Haymes as increasingly visible in public.
Neo4j — mentioned by Peter Dimitrios in the context of graph database work with node-layout visualizations.
Hiroshi Ishii’s ClearBoard — referenced by Ken Perlin as a 34-year-old precedent for shared spatial interaction over video, with data appearing in the same position on screen for both participants.
Minority Report, John Wick, Iron Man / Jarvis — referenced by Ken Perlin in the context of diegetic prototyping as designed fictions worth studying for spatial interaction ideas.
Munch Museum, Oslo — mentioned by Frode Hegland as a recent visit, noting one floor used physical dark-room installations to orient visitors around Edvard Munch’s life as an example of “fake spaces helping people orient.”
Jim Groom — mentioned by Tom Haymes as having delivered a memorable Second Life conference presentation involving a flamethrower that demonstrated what the medium uniquely enabled.
