‘The Future of Text 6’ in XR

We will be releasing a version of The Future of Text Volume 6 in XR. After a brief overview by Dene Grigar, Frode Hegland and Fabien Bénétou, we will discuss how the book should be presented in XR.

The premise, as discussed last week, is this: the articles will remain 2D, at least for this stage of thinking. What will be spatial is how the user can navigate a large number of articles (we hope to include all of The Future of Text Vol 1-6) by viewing them in a spatial Volume, in the form of a Cube. Please look through last week’s session for a fuller picture of what we mean, but the essence is simple:

The articles will be in a large Cube and the different sides of the Cube can help direct where the articles appear.

Frode Hegland, Tom Haymes, Tess Rafferty, Jim Strahorn, Jonathan Finn, Dene Grigar, Ayaskant Panigrahi, David De Roure, Jack Park, Jamie Blustein, Peter Dimitrios, Rob Swigart, Timur Shchoukine, Mark Anderson, Ken Pfeuffer, Fabien Bénétou, Peter Wasilko, Brandel Zachernuk, Jimmy Sixdof

AI: Summary

This meeting focused on designing interactions for viewing and organizing academic articles from the Future of Text book series within an XR cube environment. Frode Hegland presented a conceptual framework where articles would be displayed within a 3D cube with configurable “frames” representing different organizational dimensions (time, topics, authors, etc.). The discussion centered on how users could scale, rotate, and interact with this cube, as well as manipulate individual articles within it. Key themes included balancing intuitive gestures with advanced functionality, maintaining user comfort and control, enabling both novice and expert interactions, and creating spatial hypertext capabilities where users could annotate and create connections between articles. The group explored various interaction modalities including hand gestures, voice commands, eye tracking, and multi-finger controls while considering practical implementation constraints.

AI: Speaker Summary

Frode Hegland served as the primary presenter and moderator, introducing the cube concept for organizing Future of Text articles in XR. He emphasized the need for simple, intuitive interactions that could inspire novice users while providing advanced capabilities for experts. Frode was particularly interested in scaling interactions, spatial organization, and the concept of “frames” as configurable organizational dimensions. He stressed the importance of balancing complexity with usability and wanted to create a “thinking cap” environment for extended cognition.

Jim Strahorn was enthusiastic about visual text elements and immediately connected with the collapsible text feature Frode demonstrated. He brought up the important distinction between being inside versus outside the cube, suggesting that being inside would allow immersion in the data. He advocated for rich visual representations and emphasized the productivity challenges of current reading processes due to inadequate formatting.

Tom Haymes positioned himself as wanting to go “beyond text” to visually connect ideas rather than just text representations. He was interested in connecting ideas rather than documents and suggested using LLMs to create visual idea networks. He emphasized XR’s strength in providing context rather than detail and advocated for breaking free from physical world constraints while maintaining useful constraints.

Tess Rafferty approached the discussion from an entertainment and storytelling perspective, asking practical questions about navigation between different views and frames. She suggested walking into frames as rooms and was interested in research functionality like keyword search, summaries, and “show me more like this” capabilities. She brought up the importance of specificity in similarity searches and contextualizing interactions in familiar paradigms.

Jonathan Finn contributed insights from his experience creating Sibelius music software and his work on the Mind’s Eye interface concept. He suggested simple two-handed gestures for scaling and emphasized practical considerations for implementation. He was interested in annotation capabilities and suggested that LLMs could help guess relationship annotations between connected items.

Dene Grigar as Co-PI focused on user experience considerations, particularly around signaling to users which interactions follow physical world rules versus digital affordances. She emphasized the importance of consistent design language and familiar interaction patterns that users already understand from existing devices. She stressed the need to think carefully about what signals users receive about interaction possibilities.

Ayaskant Panigrahi provided extensive technical references and examples from Leap Motion and other systems. He shared multiple demonstration videos showing various interaction paradigms including sliders, direct manipulation, scaffolding concepts, and micro-gestures. He was particularly interested in design language consistency and the distinction between physics-enabled and scaffolded interaction modes.

David De Roure brought academic perspectives on digital scholarship and was interested in the social aspects of the cube as a shared artifact. He asked about multi-user capabilities and how cubes could be used both live and archived. He emphasized the importance of making the environment safe and easy for experimentation and play.

Jack Park connected the discussion to Doug Engelbart’s work on co-evolution of human and tool systems. He suggested concepts like 3D pivot browsing and temporal metaphors for navigating information. He advocated for simpler tools like spaCy for text analysis rather than climate-harmful LLMs and emphasized the connection to historical hypertext work.

Jamie Blustein brought extensive expertise in hypertext and information retrieval, and was particularly interested in finding argument threads across texts despite vocabulary variations. He emphasized the importance of Fitts’s Law for interaction design and suggested avatar-based navigation. He was concerned about the practical limitations of physical gestures and advocated for modal interface design principles.

Peter Dimitrios focused on federation and interoperability concerns, wanting to avoid closed systems. He suggested gaze-based interactions with laser pointers and emphasized the importance of undo functionality for safe experimentation. He connected the discussion to federated wiki concepts and change visualization from software development.

Rob Swigart questioned why voice commands weren’t being discussed more prominently, referencing his experience with Siri and game controllers. He brought up interface history and suggested that voice could provide natural command capabilities without requiring users to learn complex gesture vocabularies.

Timur Shchoukine introduced concepts from continental philosophy and hybrid intelligence, focusing on assemblages of human and machine intelligence. He was interested in resonant prompting and fractal text building within language models. He emphasized co-realization beyond just co-evolution and brought a philosophical perspective to human-machine interaction.

Mark Anderson provided crucial data about the corpus size (approximately 330 articles from volumes 1-5) and emphasized the complexity of working with large datasets versus small selections. He was concerned about the “cheat” of assuming pre-filtered selections and stressed the importance of serendipitous discovery. He questioned the boundaries between LLM extraction and analysis and highlighted the hidden costs of manual data cleaning.

Ken Pfeuffer brought expertise in eye tracking and human-computer interaction, particularly the combination of gaze and hand interactions. He emphasized the importance of comfortable reading experiences and suggested simple solutions like maximize buttons alongside complex gesture systems. He stressed that reading should remain a priority even within the spatial organization system.

Fabien Bénétou served as the technical reality check, having to implement whatever the group designed. He provided insights on current XR capabilities and limitations, including existing work on voice commands combined with gestures. He demonstrated practical implementations like wrist-tapping for commands and dragging code snippets to interaction points. He emphasized the need for realistic metadata and actual files to work with.

Peter Wasilko suggested advanced concepts like temporal scrubbers for navigating interaction history and semantic zooming for different levels of detail. He referenced Benedikt’s cyberspace concepts and dimensional unfolding. He emphasized the importance of being able to return to previous states and keyframe snapshots for complex exploration sessions.

Brandel Zachernuk brought expertise in advanced gesture recognition and multi-finger interactions. He emphasized the richness of hand tracking data and suggested more sophisticated gesture vocabularies beyond simple pinch interactions. He was interested in cancelable actions, progressive refinement of gestures, and the concept of hands as keys with different finger positions meaning different qualifications. He suggested the idea of “GenUI” – generated interfaces that users don’t expect to persist.

AI: Topics Discussed

WebXR was primarily discussed through Fabien Bénétou’s practical implementation perspective. He mentioned their current prototyping work in WebXR and the ability to share configurations via URLs. The discussion included existing implementations of features like wrist-tapping for commands and dragging code snippets to interaction points. There was mention of combining voice commands with gesture interactions in WebXR environments.

Extensive discussion covered various gesture types including two-handed scaling gestures (pinch-and-expand), single-hand interactions, multi-finger complex gestures, and direct manipulation through grabbing and dropping. The group explored hand menus attached to fingers, gesture cancellation capabilities, and the distinction between simple gestures for novices versus complex gesture vocabularies for experts. Specific gestures discussed included pinching for scaling, throwing backwards for deletion, and using both hands spread wide for scaling operations.

Several other significant topics emerged including spatial hypertext and annotation capabilities, the social aspects of shared cube environments, temporal navigation and history management, semantic zooming for different detail levels, voice command integration, eye tracking combined with hand interactions, the distinction between extraction versus analysis of text, federation and interoperability concerns, and the balance between following physical world rules versus creating new digital affordances.

Anecdotes? Frode’s son Edgar made a brief appearance during the call, which Jim Strahorn found amusing. Frode mentioned watching “The Rookie” TV show with his wife and how it seemed to rebrand the LA police force. There was discussion of movies like Minority Report, Time Bandits, and Superman II as reference points for cube-like interfaces. Jamie Blustein mentioned the impracticality of Minority Report gestures, noting that “even Tom Hanks would hurt after a while doing that.”

Did anyone seem to change their position during the call? There wasn’t clear evidence of major position changes, but there was notable evolution in thinking about the complexity of interactions. Several participants, particularly Mark Anderson, emphasized the overwhelming nature of dealing with large datasets versus the simplified examples being discussed. The group seemed to converge on the need for both simple and advanced interaction modes, with progressive disclosure being important.

What were the major outcomes of this session? The session established a framework for thinking about cube-based article organization with configurable frames, identified key interaction challenges around scaling and manipulation, created consensus around the need for both novice and expert interaction modes, and provided Fabien with concrete direction for implementation work leading up to the ACM Hypertext conference presentation. The group also established the importance of user annotation and trail-making capabilities within the spatial environment.

AI: Concepts Introduced

Frames – Defined by Frode Hegland as the faces of the cube that “frame” information rather than wall it off. These represent different organizational dimensions like time, topics, or authors that can be configured by users.

Spatial Hypertext – Referenced by Frode as the territory users enter when they move articles around within the cube space, creating their own organizational structures.

Collapse – Frode’s term for the text folding feature in Author software that allows hiding text sections behind visual indicators.

GenUI – Brandel Zachernuk introduced this term meaning “generated/improvised interfaces that people don’t have an expectation of persistence.”

Modal interfaces – Jamie Blustein defined these as interfaces where “the same gesture can indicate different actions” depending on the current mode or state.

AI: People Mentioned

Doug Engelbart by Jack Park (context of co-evolution of human and tool systems), Frode Hegland (historical connection), Mark Anderson (connection to hypertext work), Jamie Blustein (meeting at hypertext conferences), Tom Haymes (inspiration for thinking), Peter Wasilko (inspiration for thinking)

Ted Nelson by Mark Anderson (context of Xanadu and document visualization), Jamie Blustein (context of hypertext conferences)

Mark Bernstein by Mark Anderson (context of working together), Peter Wasilko (first meeting at Macworld), Tom Haymes (context of Information City work)

Vannevar Bush by Frode Hegland (context of trails), Tom Haymes (major inspiration)

Gordon Moore by Frode Hegland (context of Moore’s Law and Engelbart)

Erik Loyer by Dene Grigar (context of Leap Motion art)

Maxime Cordeil by Brandel Zachernuk (context of recent forensic autopsy work)

Keiichi Matsuda by Brandel Zachernuk (context of previous AR work), Ayaskant Panigrahi (shared reference link)

James Burke by Peter Dimitrios (context of “Connections” show)

Bob’s character from Reboot by Jamie Blustein (context of wrist interaction reference)

Andreas Dieberger by Mark Anderson (context of the Information City concept)

Simondon, Deleuze by Timur Shchoukine (context of continental philosophy)

Neal Stephenson by Jamie Blustein (context of Diamond Age novel)

AI: Product or Company Names Mentioned

Sibelius by Jonathan Finn (music scoring application he created)

Apple Vision Pro by Ken Pfeuffer (context of gaze and pinch interactions)

SRI International/Stanford Research Institute by Jack Park (context of Doug Engelbart’s work location)

Leap Motion by Ayaskant Panigrahi and Dene Grigar (hand tracking controller and artist experiments)

IBM by Peter Dimitrios (his employer for 40 years)

Bootstrap Institute by Rob Swigart (context of Doug Engelbart connection)

ACM Hypertext by multiple participants (upcoming conference)

Dalhousie University by Jamie Blustein (his academic affiliation)

Washington State University Vancouver by Dene Grigar (her academic affiliation)

Royal Northern College of Music by David De Roure (visiting professor role)

Southampton University by Mark Anderson (visiting professor role)

Aarhus University by Ken Pfeuffer (his academic affiliation)

OpenAI (mentioned in context of LLM discussions)

Siri by Rob Swigart (voice command example)

GitHub by Peter Dimitrios (context of change visualization)

Quest by Brandel Zachernuk (VR platform context)

macOS by Frode Hegland (context of built-in APIs)

Remarkable and Kindle by Ken Pfeuffer (e-reader context)

spaCy by Jack Park (natural language processing library)

AI: Other

The meeting revealed an interesting tension between the academic rigor of the participants and the practical constraints of implementation. Many participants brought deep expertise from hypertext, information retrieval, and human-computer interaction fields, while Fabien Bénétou had to balance their ambitious ideas with implementation realities. The discussion showed how complex even seemingly simple spatial interactions become when considered thoroughly. There was also notable diversity in perspectives, from Timur’s philosophical approach to continental philosophy and hybrid intelligence, to practical concerns about gorilla arm syndrome and user comfort. The meeting demonstrated the interdisciplinary nature of designing XR interfaces for academic work, requiring consideration of cognitive science, interaction design, information architecture, and technical implementation constraints.

Chat Log URLs

https://en.wikipedia.org/wiki/SRI_International
https://sites.google.com/view/cordeil/home
https://erikloyer.com/index.php/projects/detail/breathing_room
https://docs.ultraleap.com/ultralab/virtual-elements-in-vr.html
https://pipturner.co.uk/#casestudy-prosho

http://km.cx/projects/leap-motion

https://neal.fun/infinite-craft
https://a.co/d/8OLWQ0u

Chat Log Summary

The chat log shows active parallel discussion supporting the main conversation. Key themes included technical references shared by Ayaskant Panigrahi (Leap Motion examples, virtual elements, scaffolding concepts), discussion of interaction modalities and their implementation challenges, sharing of relevant research and artistic work, and clarification of technical terms and concepts. Participants frequently reacted to comments with emojis showing engagement. There was ongoing discussion about LLM usage versus traditional text processing approaches, with Jack Park advocating for simpler tools like spaCy instead of energy-intensive LLMs. The chat also included logistical elements like late arrivals, technical difficulties (particularly Fabien’s audio issues), and location sharing among international participants. Notable was the collaborative nature, with participants building on each other’s ideas and sharing relevant resources in real-time.

Pre-Session Information

Design Framework

The design decisions we are looking to make are all within the scope of helping people find articles to read, since Volumes 1-6 feature hundreds of articles.

Volume in the shape of a Cube

What we are designing are interactions for articles presented inside a single Cube, where each of the sides, called ‘Frames’, can automatically configure the space or assist in manual configuration.

Walkthrough Video

Default Configuration

What the default configuration should be will be part of our discussion. To have something tangible to start with, the following is suggested:

View options

Views should allow the user to:

  • Change the appearance of the articles (nodes).
  • Show/hide links/citations.

Interaction Questions

The interaction design question then becomes: how should the user be able to perform these interactions?

  • Move the Frames around.
  • Change what the Frames are/show.
  • Change the gamut of what the Frames show, or their scale.
  • Interact with the Cube to scale it.
  • Interact with the Cube to rotate it.
  • Specify what to hide/reveal.
  • Store layout configurations.
  • Recall layout configurations.

A suggestion is to have something like ‘Labels’ outside the Frames to enable interactions.
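The store and recall questions above imply a serializable layout configuration. As a minimal sketch in the spirit of the WebXR prototype’s URL-shared configurations, a layout could be a plain object encoded into a shareable URL fragment. All field names here are assumptions for illustration, not a settled schema:

```javascript
// Minimal sketch of a serializable Cube layout configuration.
// Field names are illustrative assumptions, not a settled schema.
const layout = {
  frames: {
    floor: { axis: "timeline" },              // which dimension each Frame shows
    left:  { axis: "authors" },
    back:  { axis: "topics", source: "editor" },
  },
  scale: 1.5,                                 // Cube scale factor
  rotation: [0, 45, 0],                       // degrees around x/y/z
  hidden: ["citations"],                      // currently hidden elements
};

// Store: serialize the layout into a URL fragment so it can be shared.
function layoutToUrl(baseUrl, layout) {
  return baseUrl + "#layout=" + encodeURIComponent(JSON.stringify(layout));
}

// Recall: parse a layout back out of a shared URL (null if none present).
function layoutFromUrl(url) {
  const match = url.match(/#layout=(.+)$/);
  return match ? JSON.parse(decodeURIComponent(match[1])) : null;
}
```

Recalling a stored layout is then just parsing the fragment back out, which also makes a given Cube state shareable between participants.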

Frames

The Frames can be used to configure the contents, such as by:

  • Timeline axis.
  • Map of Topics created by user.
  • Map of Topics created by editor/publisher.
  • List of Authors.
  • List of Locations.
  • List of other Keywords.
  • List of Glossary Terms.
  • All References from documents in the volume.
  • AI clustering.

Display of Articles

The articles should appear at different levels of reveal depending on placement and interaction:

  • Full, open document.
  • Preview (such as first page of text).
  • Icon (such as title and author) with color and shape based on metadata.

Metadata

The expected available metadata includes:

  • Title.
  • Date. 
  • Authors. 
  • Locations (published).
  • Locations (mentioned).
  • Text with time associated with it.
  • References. 
  • Cited by. 
  • Topics (by editor and user).
  • Keywords. 
  • Size/length. 
  • Headings. 
  • Glossary.
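To make the list above concrete for implementation, one article’s metadata could be a record like the sketch below. The key names and value shapes are assumptions, not an agreed schema:

```javascript
// Illustrative shape of one article's metadata record, following the
// metadata list above. Key names are assumptions for illustration.
const exampleArticle = {
  title: "Example Article",
  date: "2024-11-01",
  authors: ["A. Author"],
  locationsPublished: ["London"],
  locationsMentioned: ["Vancouver"],
  references: [],                       // outgoing citations
  citedBy: [],                          // incoming citations within the volumes
  topics: { editor: ["XR"], user: [] }, // topics by editor and by user
  keywords: ["spatial hypertext"],
  length: 2400,                         // e.g. word count for size/length
  headings: ["Introduction"],
  glossary: ["Frame"],
};
```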

Reading An Article (out of scope)

Reading an article will use a traditional digital page presentation, either as a single scroll or paginated (ideally the user’s choice). References should be shown visually when the user requests them, pointing to previous articles where cited, or onto a ‘back wall’ of sorts for references outside of The Future of Text books.

This is inspired by some of the interactions of Ted Nelson’s Xanadu, though it does not approach the technical architecture.
