Just some brief notes reflecting on the tools I’ve built and what I think is important. A lot of this came out of a recent project I just finished up and submitted to an HCI conference, which we’ll be sharing in a few months.
There are a million and one articles online about bidirectional editors, but somehow it’s still really rare to see one done well, especially in computational graphics tools. Besides being really hard to implement well, bidirectional editing implies a few things:
A good bidirectional editor will probably need a declarative model for rendering graphics. This is why most examples we see today are basically glorified WYSIWYG editors: it’s easy to map markup to output 1:1. Imperative graphics models, however, are much harder to link to the produced output, since many lines of code can contribute to a single piece of graphics output on the screen (a many:1 mapping), especially if you have dynamic stuff that changes over time.
Unfortunately even having a totally declarative rendering model doesn’t solve the issue of linking things like dynamic state to output. For example, if your code adds a new bouncing ball to the screen every time you click the screen, how do you show people what’s happening in the code? You need to show which state variables are being manipulated, which interaction / mouse handlers are being activated, and lastly which lines of code in the renderer are affected by the changes in state.
This kind of behavior implies being able to create a map between 1) what’s drawn on the screen, 2) the call stack of the program (which graphics functions are called, in which scope / contexts, and with which state and data) and 3) which AST nodes they’re linked to.
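To make the three-way map a bit more concrete, here is a minimal sketch of what one record in such a map could look like. Everything here (`DrawRecord`, `DrawLog`, `shapeAt`, `shapesFromLine`) is a hypothetical name made up for illustration, and it assumes the runtime can hand you the call stack and source location at each draw call, which is exactly the hard part.

```typescript
// Hypothetical types for illustration; none of these come from a real library.
interface SourceRange {
  startLine: number;
  endLine: number;
}

interface DrawRecord {
  shapeId: number; // 1) what was drawn on the screen
  bounds: { x: number; y: number; w: number; h: number };
  callStack: string[]; // 2) function names, outermost first
  state: Record<string, unknown>; //    state captured at draw time
  node: SourceRange; // 3) location of the AST node that issued the call
}

class DrawLog {
  private records: DrawRecord[] = [];

  record(r: DrawRecord): void {
    this.records.push(r);
  }

  // Screen -> code: the most recently drawn shape under the pointer.
  shapeAt(x: number, y: number): DrawRecord | undefined {
    for (const r of [...this.records].reverse()) {
      const b = r.bounds;
      if (x >= b.x && x <= b.x + b.w && y >= b.y && y <= b.y + b.h) return r;
    }
    return undefined;
  }

  // Code -> screen: every shape drawn by a given source line.
  shapesFromLine(line: number): DrawRecord[] {
    return this.records.filter(
      (r) => line >= r.node.startLine && line <= r.node.endLine
    );
  }
}

// Two balls drawn by the same (assumed) line 12 inside drawBall().
const drawLog = new DrawLog();
drawLog.record({
  shapeId: 1,
  bounds: { x: 10, y: 10, w: 20, h: 20 },
  callStack: ["draw", "drawBall"],
  state: { ballIndex: 0 },
  node: { startLine: 12, endLine: 12 },
});
drawLog.record({
  shapeId: 2,
  bounds: { x: 50, y: 50, w: 20, h: 20 },
  callStack: ["draw", "drawBall"],
  state: { ballIndex: 1 },
  node: { startLine: 12, endLine: 12 },
});
```

Clicking a shape looks up `shapeAt`, and placing the cursor on a line looks up `shapesFromLine`; both directions read from the same log, which is what makes the editor bidirectional.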
And being able to link these three things together is where all current tools fail, because you need to completely control the runtime and the rendering engine in parallel. If you want to understand how functions are being called, you need to inspect the call stack. And if you want to be able to do this at any point in the program, or even inspect it in real time, you need to control the interpreter. Using the browser’s built-in debugging tools won’t work at runtime, and playing tricks like inserting log statements after every variable declaration is really hard to get right and very slow. And you can’t pause eval programmatically or easily extract the entire state of the program and memory, especially in a structured way.
Unfortunately it’s very hard to beat the browser’s native implementation of eval in terms of performance, and none of the JS-in-JS interpreters are fast enough for real-time applications. So what we’ll likely need in the future is a highly customizable wasm-based JS interpreter that supports time-travel-friendly state snapshots out of the box.
When I say structure, I mean having strict rules on how to write your code. Specific naming conventions, specific locations for data, etc.
This adds some extra friction to the dev experience, but it has some unexpected benefits:
Let’s take an example where, instead of having a specific place for storing state in your sketch, you just use global variables. That’s cool. But then doing any sort of analysis suddenly becomes a parsing game. Sure, extracting / reading out global variables is easy enough, but what about editing or writing to them? Now you’re dealing with obtuse AST APIs, and if you have a custom code editor, you’re doing endless translation between ASTs and character ranges / positions.
Now instead let’s assume all state is stored in a top level object. Now writing / reading state is one call to retrieve that object, then you can read / write to it with standard functions on the object. ASTs are hard to work with. Regular objects are not.
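As a rough sketch of that convention, assume every sketch keeps its state in one plain `state` object. The names `sketch`, `addBall`, `readState` and `writeState` are made up for this example, and the JSON round-trip only works for plain, serializable state.

```typescript
// Hypothetical convention: all sketch state lives in one top-level object.
interface Ball {
  x: number;
  y: number;
  vx: number;
  vy: number;
}

const sketch = {
  state: { balls: [] as Ball[], gravity: 0.5 },
};

// Sketch side: a click handler that only ever touches sketch.state.
function addBall(x: number, y: number): void {
  sketch.state.balls.push({ x, y, vx: 0, vy: 0 });
}

// Tooling side: reading is a deep copy, so the snapshot can't change later.
// Assumes the state is plain JSON-serializable data.
function readState<T>(s: { state: T }): T {
  return JSON.parse(JSON.stringify(s.state)) as T;
}

// Tooling side: writing is a plain object patch, no AST involved.
function writeState<T extends object>(s: { state: T }, patch: Partial<T>): void {
  Object.assign(s.state, patch);
}

addBall(10, 20);
const before = readState(sketch); // frozen-in-time copy
writeState(sketch, { gravity: 1 }); // e.g. a slider in the editor UI
```

After this runs, `before.gravity` is still `0.5` while `sketch.state.gravity` is `1`, and no source text had to be parsed or rewritten to get there.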
Especially since more and more code will be read and written by AI, the details of what syntax you’re writing will become increasingly less important. But what will remain important is making it easy for LLMs to parse and understand how to edit and contribute to code they’ve written, and the terser the better.
For some really fantastic work on these concepts, please check out Szymon Kaliski’s work on Dacein.
Manually using git when you’re in a divergent creative exploration or trying out lots of different things just doesn’t cut it. Versioning has to be automatic to a large degree and support extremely lightweight bookmarking / committing.
It’s also important to have both character-level history (being able to scrub through the history of code) and larger, semantically and graphically meaningful checkpoints. You should be able to look at some timeline of your project and just point out what part of your project you wanna go back to, either by reading a description or looking at a picture (screenshot?). People zoom in and out of abstraction levels all the time when making creative work, and it’s just as important to really squint close at specific changes in code, and zoom out to see what moments in time over the course of an entire programming session or project are important, and what the project looked like at every single point. Balancing detail and noise is key.
With this in mind, immutable history is the way to go. And I think most people actually prefer this without knowing what it’s called. It’s why we take so many screenshots: they’re exact, unchangeable, frozen moments in time that capture all “visual state” of a project.
Branching should also happen automatically while trying to edit code in the past. Combining intelligent history graph pruning (getting rid of stale branches or useless checkpoints) with graph snapshots should make it trivial to get to any important state of the project, and create new versions of the graph by editing the past without needing to worry about how unexpected changes earlier in the project have to propagate and be applied downward. But if you want to do that, it can be saved as an entirely new branch.
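A minimal sketch of automatic branching on top of immutable snapshots might look like the following. `Snapshot`, `SnapshotGraph`, `commit`, `checkout` and `children` are hypothetical names, and pruning is left out entirely; the only point is that editing the past appends a sibling instead of rewriting anything.

```typescript
// Hypothetical names throughout; one possible design, not a reference one.
interface Snapshot {
  readonly id: number;
  readonly parent: number | null;
  readonly code: string;
}

class SnapshotGraph {
  private snapshots: Snapshot[] = [];
  head: number | null = null;

  // Committing never mutates an existing snapshot; it appends a child of head.
  commit(code: string): number {
    const snap: Snapshot = Object.freeze({
      id: this.snapshots.length,
      parent: this.head,
      code,
    });
    this.snapshots.push(snap);
    this.head = snap.id;
    return snap.id;
  }

  // Jumping to the past only moves head, so the next commit branches off it.
  checkout(id: number): string {
    const snap = this.snapshots[id];
    if (!snap) throw new Error(`unknown snapshot ${id}`);
    this.head = id;
    return snap.code;
  }

  // More than one child means the history branched at this snapshot.
  children(id: number): number[] {
    return this.snapshots.filter((s) => s.parent === id).map((s) => s.id);
  }
}

const graph = new SnapshotGraph();
const first = graph.commit("circle(10)");
const second = graph.commit("circle(20)");
graph.checkout(first); // scrub back in time
const third = graph.commit("square(10)"); // editing the past creates a branch
```

Here `graph.children(first)` contains both `second` and `third`: the old future is still reachable, and nothing had to be propagated forward.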
Taking immutable snapshots of your sketch/code at any time implies a few important things for the system:
- Being able to record side effects and use seeded, pseudo-random values
- Time-travel debugging becomes easy
- Playback of sketches is easy
- Can easily diff two timestamps
- Can recreate a sketch with a data seed
- Not needing video or photo to effectively share output (people can just run your code)
- Can deeplink to bugs and specific moments in time
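Most of the items above depend on the first one, so here is a small sketch of deterministic replay. It uses mulberry32, a small, widely used seeded generator, in place of `Math.random()`, and treats clicks as a recorded event log. `ClickEvent` and `replay` are hypothetical names, and the "sketch" is reduced to picking a radius per click.

```typescript
// mulberry32: a tiny seeded PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical recorded side effect: a click that happened on a given frame.
interface ClickEvent {
  frame: number;
  x: number;
  y: number;
}

// Re-run the sketch from a seed plus the recorded clicks.
// Each click spawns a ball with a pseudo-random radius in [5, 25).
function replay(seed: number, clicks: ClickEvent[], frames: number): number[] {
  const rand = mulberry32(seed);
  const radii: number[] = [];
  for (let frame = 0; frame < frames; frame++) {
    for (const click of clicks) {
      if (click.frame === frame) radii.push(5 + Math.floor(rand() * 20));
    }
  }
  return radii;
}

const recordedClicks: ClickEvent[] = [
  { frame: 2, x: 10, y: 10 },
  { frame: 7, x: 40, y: 80 },
];
const runA = replay(42, recordedClicks, 10);
const runB = replay(42, recordedClicks, 10);
```

`runA` and `runB` are identical, so sharing the seed and the click log is enough for someone else to recreate the sketch, and stopping the loop at any frame gives you that moment in time.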
Undo / redo (cmd+z et al.) are very good. They are fast and easy, and they should be used more as hooks / onramps onto more complex and sophisticated versioning tools.
It’s great to lean into people’s colloquial habits, like using undo/redo + copy+paste as a lightweight versioning tool. Will probably write a whole post on building a versioning tool on top of cmd+v. For example, let’s say you notice your code isn’t working now, but you remember it working 5 minutes ago.
A very useful (and emergent) flow that people do is they just undo (cmd+z) a bunch until they see the code that worked, they grab it with cmd+c, then they redo (cmd+y) back to the latest point, and paste the correct code snippet back into place (cmd+v). This is essentially a kid-friendly version of checking out an earlier commit, stashing some changes, and then checking out the latest commit you were on and applying the git stash (or doing some cherry-picking and applying patches).
This undo-copy-redo-paste workflow works really, really well for small things, but when you have changes that span multiple files (or even multiple locations within a single file), it breaks down. But there is something important here, and it should be formalized into a more approachable, robust system.
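One way the formalized version could look, as a very rough sketch: instead of undoing, copying, redoing and pasting, you name the regions you want back and pull them out of an older version in one step. `Files`, `Region` and `restoreRegions` are hypothetical names, and the sketch assumes line ranges still line up between the two versions, which a real tool could not take for granted.

```typescript
// Hypothetical shapes: a version is just a map from filename to contents.
type Files = Record<string, string>;

interface Region {
  file: string;
  from: number; // 1-indexed, inclusive
  to: number; // 1-indexed, inclusive
}

// Copy each region out of `past` and splice it over the same lines in
// `present`. Returns a new version; neither input is modified.
function restoreRegions(past: Files, present: Files, regions: Region[]): Files {
  const result: Files = { ...present };
  for (const { file, from, to } of regions) {
    const pastLines = (past[file] ?? "").split("\n");
    const lines = (result[file] ?? "").split("\n");
    lines.splice(from - 1, to - from + 1, ...pastLines.slice(from - 1, to));
    result[file] = lines.join("\n");
  }
  return result;
}

// The version from "5 minutes ago" and the broken latest version.
const workingVersion: Files = {
  "sketch.js": "setup()\ndrawBall(x, y)\nupdate()",
  "util.js": "const g = 0.5\nconst r = 10",
};
const latestVersion: Files = {
  "sketch.js": "setup()\ndrawBall(y, x)\nupdate()",
  "util.js": "const g = 0.5\nconst r = -10",
};

const fixedVersion = restoreRegions(workingVersion, latestVersion, [
  { file: "sketch.js", from: 2, to: 2 },
  { file: "util.js", from: 2, to: 2 },
]);
```

This is the same undo-copy-redo-paste move, but applied to two files at once and without ever leaving the latest version.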