Reservoir
created on 2021-08-12
development-log
Purpose
Reservoir is my project to capture the work I’m doing around creating a personal knowledge graph to power a personal search engine and other projects that involve full access to all the digital data I produce (like time capsules and waypoints). Eventually, I hope this project evolves to push the envelope forward to allow anyone to take control of their data and then use that in collaborative, fun, and whimsical ways.
Goals
Reservoir should act as a data aggregator that pushes for personal data ownership and interoperability. It should be a champion for a universally interoperable data surface that anyone can use.
Design
The design consists of an outer layer of third-party data sources that feed into a central data reservoir. The reservoir enforces a consistent schema on these different types of data, segments them by type (text, media, etc.), and exposes a simple API for querying that stream of data, which any number of clients (bring your own client) or apps can build on.
Some sample client apps that I hope to support with this data would be time capsules and waypoints.
- time capsules require date information, multi-media content and people information
- waypoints require app/browser/location information, date information, multi-media content and people information
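To make the "simple API" idea concrete, here is a minimal sketch of how a client like time capsules might query the reservoir. Everything in it is an assumption for illustration: the names `Reservoir`, `Entry`, `Query`, and `timeCapsule` are hypothetical, and the real API may look nothing like this.

```typescript
// Hypothetical sketch: none of these names exist in the project yet.
type EntryKind = "text" | "media";

interface Entry {
  kind: EntryKind;  // segment the entry belongs to
  createdAt: Date;  // date information
  people: string[]; // people associated with the entry
  content: string;  // text body, or a path/URL for media
}

interface Query {
  kind?: EntryKind;
  from?: Date;      // inclusive lower bound on createdAt
  to?: Date;        // exclusive upper bound on createdAt
  person?: string;
}

class Reservoir {
  private entries: Entry[] = [];

  add(entry: Entry): void {
    this.entries.push(entry);
  }

  // Returns every entry that matches all of the provided filters.
  query(q: Query = {}): Entry[] {
    return this.entries.filter((e) => {
      if (q.kind !== undefined && e.kind !== q.kind) return false;
      if (q.from !== undefined && e.createdAt.getTime() < q.from.getTime()) return false;
      if (q.to !== undefined && e.createdAt.getTime() >= q.to.getTime()) return false;
      if (q.person !== undefined && e.people.indexOf(q.person) === -1) return false;
      return true;
    });
  }
}

// A time-capsule client could ask for everything within one calendar year.
function timeCapsule(reservoir: Reservoir, year: number): Entry[] {
  return reservoir.query({
    from: new Date(Date.UTC(year, 0, 1)),
    to: new Date(Date.UTC(year + 1, 0, 1)),
  });
}
```

Because clients only go through `query`, any number of them can compose the same stream of data without knowing where an entry originally came from.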
Data Sources
A data source can be…
- a mobile app
- e.g. iMessage, Apple Photos
- a website
- e.g. amazon.com, news websites, etc.
- a set of files (local or remote or generated from an export flow)
- e.g. Google Takeout, Roam data, personal website
- a container / resource within an app
- e.g. different Coda docs, different tables within a data warehouse
The kinds of insights or behaviors that I want to learn about myself / my friends from these data sources include things that I was…
- listening
- reading
- building
- smelling
- eating
- feeling
- how I was moving
- sleeping
- the weather
- cooking
- drinking
- dancing to
- learning
- creating
- watching
- internalizing
- remembering
- appreciating
- practicing
- ritualizing
- rhyming
- crafting
- striving for
- lamenting
- grieving
- inspired by
- what was strenuous
- something I was moved by
- something that touched me
- who I was hanging with
- who I’m admiring or learning from
- what objects I’m appreciating
- who I’m appreciating
- what aspects of life I’m appreciating
- what you’re thinking of
- who you’re missing
- who you’re loving
- how your skin is feeling
- what are you on the fence about
- what is inspiring you aesthetically
- where you want to go
- where you were last geographically
All of this is an act of appreciation. It’s an act of active gratitude for the things that happen in your life, bringing them into the foreground. That’s the motivation behind my window.spencerchang.me project. I wanted a digital memorial to the things that happen in the physical world: a digital mirror of what is happening in my life, under my control, so that I can shape how it appears to other people, who can pass by and start conversations around it.
from https://coda.io/d/Waypoints_dTqv7dmS4zZ/Another-framing_sugpq#_lu5ep
A data source can be accessed via…
- an API
- web scraping
- a data dump/export
- a script
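One way to write down the two lists above (what a data source is, and how it can be accessed) is a small descriptor type. This is only a sketch: `SourceKind`, `AccessMethod`, `DataSource`, and `sourcesUsing` are hypothetical names, and the access methods assigned to each example source are illustrative guesses rather than facts about those services.

```typescript
// Hypothetical names throughout; this only mirrors the two lists above.
type SourceKind = "mobile-app" | "website" | "files" | "container";
type AccessMethod = "api" | "web-scraping" | "export" | "script";

interface DataSource {
  name: string;
  kind: SourceKind;
  // An array, since a single source may need more than one access method.
  access: AccessMethod[];
}

// Example sources taken from the list above; access methods are guesses.
const sources: DataSource[] = [
  { name: "iMessage", kind: "mobile-app", access: ["script"] },
  { name: "amazon.com", kind: "website", access: ["web-scraping", "export"] },
  { name: "Google Takeout", kind: "files", access: ["export"] },
];

// Lists the sources that can be reached with a given access method.
function sourcesUsing(all: DataSource[], method: AccessMethod): DataSource[] {
  return all.filter((s) => s.access.indexOf(method) !== -1);
}
```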
Note that some data sources could require multiple methods of accessing in order to
Data sources can be hooked up to the reservoir by specifying a Fetcher and a Reducer.
Fetcher: a function that is able to retrieve the raw data from a given data source if provided specific inputs. Ideally, this should be able to run automatically and functionally without any user input. Open question: how should file inputs be handled?
Reducer: a function that takes in the raw data and massages it into an accepted schema format for inputting into the reservoir.
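The Fetcher/Reducer split could be typed roughly as follows. This is a sketch under assumptions, not the project's actual interface: `DataSourceModule`, `ingest`, `RawNote`, and `notesSource` are hypothetical, the entry type is trimmed to three fields, and the example module reads from in-memory data instead of a real service.

```typescript
// Hypothetical sketch of the Fetcher/Reducer contract; not an existing API.
interface ReservoirEntry {
  content: string;
  createdAt: Date;
  source: string;
}

// Retrieves raw records from a source, given source-specific inputs.
type Fetcher<Input, Raw> = (input: Input) => Promise<Raw[]>;

// Massages one raw record into the reservoir's schema.
type Reducer<Raw> = (raw: Raw) => ReservoirEntry;

interface DataSourceModule<Input, Raw> {
  name: string;
  fetch: Fetcher<Input, Raw>;
  reduce: Reducer<Raw>;
}

// Runs a module end to end: fetch the raw records, then reduce each one.
async function ingest<Input, Raw>(
  source: DataSourceModule<Input, Raw>,
  input: Input
): Promise<ReservoirEntry[]> {
  const raw = await source.fetch(input);
  return raw.map((record) => source.reduce(record));
}

// Example module whose "fetch" just passes through records it is handed.
interface RawNote {
  text: string;
  writtenAt: string; // ISO 8601 timestamp
}

const notesSource: DataSourceModule<RawNote[], RawNote> = {
  name: "notes",
  fetch: async (notes) => notes,
  reduce: (raw) => ({
    content: raw.text,
    createdAt: new Date(raw.writtenAt),
    source: "notes",
  }),
};
```

Making the Fetcher's input a type parameter is one possible answer to the file-input question: an export-based source could take already-loaded file contents as its input, while an API-based source could take credentials or a date range.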
An example data source module
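// Fetcher for an API
interface SpotifyRawSong {
  // ...
}
async function fetchPlayedSpotifySongs(): Promise<SpotifyRawSong[]> {
  // Not implemented yet: retrieve the raw played-song data from Spotify.
  return [];
}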
Schema
Each entry in the reservoir should correspond to one entity that has been created / consumed / acted upon by the user in another app. This might include:
- a note from a note-taking app
- a song played in a music app
- an article from a read-later app
- a book read in a book-tracking app
- a post from a social media app
Some schema properties that should be common to every object in the reservoir:
- content
- insertedAt
- modifiedAt
- createdAt
- creators
- source
- people?
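The common properties above could be captured in a single type. The field names come from the list; the types, the optional `people` field (mirroring the question mark), and the checks in `validateEntry` are all assumptions made for this sketch.

```typescript
// Field names come from the list above; every type here is an assumption.
interface ReservoirEntry {
  content: string;
  insertedAt: Date;   // when the entry was added to the reservoir
  modifiedAt: Date;   // when the entity was last changed in its source
  createdAt: Date;    // when the entity was originally created
  creators: string[];
  source: string;     // which data source the entry came from
  people?: string[];  // optional, since the list marks it with "?"
}

// Returns a list of problems; an empty list means the entry looks usable.
function validateEntry(entry: ReservoirEntry): string[] {
  const problems: string[] = [];
  if (entry.source.trim() === "") problems.push("source is empty");
  if (entry.creators.length === 0) problems.push("creators is empty");

  const dates: Array<[string, Date]> = [
    ["insertedAt", entry.insertedAt],
    ["modifiedAt", entry.modifiedAt],
    ["createdAt", entry.createdAt],
  ];
  dates.forEach((pair) => {
    // An invalid Date reports NaN as its timestamp.
    if (isNaN(pair[1].getTime())) problems.push(pair[0] + " is not a valid date");
  });
  return problems;
}
```

A Reducer could run a check like this before handing an entry to the reservoir, so that schema problems surface per data source instead of at query time.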
Data Pipeline
Inspirational Work
- tanagram: a project reimagining, from the ground up, what a development system with open data could be like
- zero data apps: a project working to create apps where data is owned by the user and never leaves their control (like Obsidian)
- monocle: the personal search engine from Linus, which uses his own personal software stack to power a universal search
- Dogsheep is probably the OG flavor of this. It’s a personal analytics hub made by Simon Willison and already has a bunch of converters from X platform to Y format (except the Y is always SQLite for this tool)