Potluck: Dynamic Documents as Personal Software


Metadata

Highlights

  • We think a promising workflow is gradual enrichment from docs to apps: starting with regular text documents and incrementally evolving them into interactive software. In this essay, we present a research prototype called Potluck that supports this workflow. Users can create live searches that extract structured information from freeform text, write formulas that compute with that information, and then display the results as dynamic annotations in the original document.
  • These notes contain meaningful information, like quantities in a recipe or weights in a workout log, that can serve as the basis for useful computations. How might we enable people to gradually turn these text documents into custom pieces of software?
  • Extensible searches. Users can define searches: custom patterns that detect data within the text of a note. Searches are defined in a compositional pattern language which allows reusing patterns that others have written.
  • Together, these ideas form a loop. By treating text as both a source of information and a substrate for hosting a user interface, we can turn a text document into an interactive application.
  • Using freeform text data as a source of information for computations. Many tools only allow users to compute with data that’s been put into a specific structured format. In Potluck, we encourage people to write data in freeform text, and define searches to parse structure from the text.
  • Using text annotations to power an interactive interface. Even in tools that combine documents and computation, there’s often some separation between editable text and computational results. In Potluck, we deeply entangle interactive elements with the user’s text, by providing dynamic annotations that can overlay or restyle the original document. The effect is to treat the text itself as a place to host UI.
  • While we care about enabling non-programmers, and have made some design decisions with them in mind, our current prototype does expect the user to have basic knowledge of JavaScript, and our test users have mostly been skilled programmers. We’re also not yet sure exactly where the limits of this model are: what kinds of apps are possible and desirable to build in this style?
    • Note: anything where the primary mode of interaction feels best as free-form text (or even drawing, combined with OCR?)
  • apps create rigid data schemas that define the kinds of information we can record within them. We can fill out the available form fields, but we can’t add new fields or scribble in the margins. Structured data inputs struggle with ambiguity—when faced with a list of radio buttons, there’s no way we can select two options, like we might have done on a paper form.
  • In their 1998 paper Collaborative, Programmable Intelligent Agents, the Apple researchers Bonnie Nardi, James Miller, and David Wright describe data detectors: intelligent pattern recognizers built into the operating system which can detect structured data like phone numbers and street addresses contained within everyday unstructured documents, and then allow the user to take actions on that structured data. This idea was productized and lives on to this day in MacOS and iOS, although without the user extensibility envisioned by the original paper.
  • People should have the ability to encode their own knowledge and personal micro-syntax into their tools. However, defining abstract patterns over plain text can be difficult even for skilled programmers; regular expressions are notoriously hard to use. We need ergonomic tools for defining patterns.
  • programming by example (PBE): letting users provide concrete examples to specify a more general pattern. This technique has been explored by many systems, including the Flash Fill system deployed in Microsoft Excel, as well as LAPIS. There are also some interesting hybrid interaction models between PBE and code editing—in User Interaction Models for Disambiguation in Programming by Example, Mayer et al. use PBE to generate candidate programs, but also let users directly edit the resulting programs.
  • Potluck allows users to define patterns that are recognized within a text document. Patterns are created with a search interaction. Users are already familiar with searching for content in a word processor or web browser, so it’s a natural on-ramp to creating live data detectors.
  • Later on, the user will want to do arithmetic using only the number and not the unit, so they can also extract the number using a named capture group. If they edit the search to say {number:amount} g, the table of results will automatically show a column that just contains the amount. (Figure: extracting the number with a built-in pattern and named capture group.)
  • The user has computed a scaled value, but it’s not yet visible in the text document. They’d like to complete the scaler feature by covering up the original quantity with a scaled quantity. They can do this by setting up an annotation, which allows any column from a computation to show up in the text document.
  • This relationship can’t be easily expressed in a pattern or a regular expression; we need a spatial query that can look across longer distances in the document. In this case, we can write a spatial query that runs the following logic: “Start from the ingredient name in the directions, get all previous ingredient names and take the quantity of the first ingredient with a matching name.” Or, in code that we could enter into a Potluck formula: AllPrevOfType(ingredient, "ingredient").find(other => (other.data.quantity !== undefined && other.isEqualTo(ingredient)))
  • A key challenge in Potluck is accurately parsing structured information from a text document. We noticed that the difficulty of parsing depends a lot on how the text content arrives in the document: it’s much easier to parse information from a personal micro-syntax being typed into Potluck than it is to parse preexisting content being pasted in.
  • We also found it very natural to represent information using lightweight text syntaxes. In personal notes, people implicitly develop syntaxes to write down information like times, durations, or domain-specific information, and these conventions can be encoded in Potluck as patterns. For example, the plant watering document from the demo section above showed a simple syntax for recording watering dates. The figure below shows another example, where workout activities are grouped underneath dates. We found that Potluck’s combination of pattern searches and spatial queries was generally flexible enough to capture the underlying structure in many different kinds of informal syntax.
  • In Potluck, application state lives in the text. For example, you might note the last watered date for a plant using the text syntax 08/31/2022, or use [x] to indicate a completed task. There is no hidden metadata; searches are just a function of the text.
  • Imagine an operating system with the principles of Potluck deeply woven in. People could start by just organically recording information however they want. As they come across places where the computer could help them, they would gradually add structure to their data, but only as much structure as is needed for the task at hand. They would then add bits of computational behavior, borrowed from others or created from scratch, to complete the task.
  • In Potluck we’ve focused on text as the general-purpose medium for data, but the broader ideas could extend to other media as well. A recipe could be represented as a photo of a scribbled index card, processed with a combination of OCR and customized data detectors, with the annotations printed back onto the photo. (Figure: applying the principles from Potluck, but to a photo rather than a text document.) Ultimately, we want a world where people are in control of their computing experiences. People should be able to teach their computers the meaning behind their data,
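
The search, formula, and annotation loop described in the highlights above can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions, not Potluck's implementation: the helper names (compileSearch, runSearch, addColumn, annotate), the BUILT_INS table, and the sample note are all hypothetical, and the {number:amount} syntax is handled in the simplest way that reproduces the behavior the essay describes.

```javascript
// Hypothetical sketch; none of these helpers are Potluck's real API.

// Built-in sub-patterns that a search can reuse by name (assumed set).
const BUILT_INS = { number: "\\d+(?:\\.\\d+)?" };

// Compile a search such as "{number:amount} g" into a RegExp with a
// named capture group; text outside the braces is matched literally.
function compileSearch(search) {
  const source = search
    .split(/(\{\w+:\w+\})/)
    .map((part) => {
      const m = part.match(/^\{(\w+):(\w+)\}$/);
      if (m) {
        if (!(m[1] in BUILT_INS)) throw new Error(`unknown pattern: ${m[1]}`);
        return `(?<${m[2]}>${BUILT_INS[m[1]]})`;
      }
      // Escape regex metacharacters in the literal part.
      return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp(source, "g");
}

// Run a search over the document text; each match becomes a row holding
// its position in the text plus the named capture groups.
function runSearch(text, search) {
  return [...text.matchAll(compileSearch(search))].map((m) => ({
    start: m.index,
    end: m.index + m[0].length,
    match: m[0],
    data: { ...m.groups },
  }));
}

// A formula adds a computed column to every row.
function addColumn(rows, name, formula) {
  return rows.map((row) => ({
    ...row,
    data: { ...row.data, [name]: formula(row) },
  }));
}

// An annotation that covers each match with the value of one column.
// Replacements are applied right to left so earlier offsets stay valid.
function annotate(text, rows, column) {
  return [...rows]
    .sort((a, b) => b.start - a.start)
    .reduce(
      (out, row) => out.slice(0, row.start) + row.data[column] + out.slice(row.end),
      text
    );
}

// Sample note (made up for this example): scale a recipe by 2.
const note = "Flour: 200 g\nSugar: 50 g\nEggs: 2";
const scale = 2;
const quantities = runSearch(note, "{number:amount} g");
const scaled = addColumn(
  quantities,
  "scaled",
  (row) => `${Number(row.data.amount) * scale} g`
);
const rendered = annotate(note, scaled, "scaled");
console.log(rendered);
```

Running this prints the note with "400 g" and "100 g" in place of the original quantities, while the note string itself is left untouched. That mirrors the idea that state lives in the text and the annotation is a view computed from it. A literal suffix like " g" would also match the start of a longer word, so a real pattern language would need word-boundary handling that this sketch omits.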

title: "Potluck: Dynamic Documents as Personal Software"
author: "inkandswitch.com"
url: "https://www.inkandswitch.com/potluck/"
date: 2023-12-19
source: hypothesis
tags: media/articles
