I’ve been reading Paul Christiano’s AI alignment blog recently. His ideas about capability amplification seem like they can be directly turned into a perhaps fun game. (Thanks also to Ajeya for discussing this with me and clarifying my thoughts on it.)

Let’s call the game “relay programming”. To run the game, the game master picks a challenging programming problem. For concreteness, let’s say Project Euler problem 62:

“The cube, 41063625 (345 cubed), can be permuted to produce two other cubes: 56623104 (384 cubed) and 66430125 (405 cubed). In fact, 41063625 is the smallest cube which has exactly three permutations of its digits which are also cube. Find the smallest cube for which exactly five permutations of its digits are cube.”

(I chose this problem because I don’t know how to solve it, and I don’t know what strategy I’d use to solve it, and I don’t think I’d make that much progress in ten minutes, but I’m 75% confident I’d solve it in five hours of work, based on its difficulty as rated on the Project Euler website.)

This game is called a relay because it is played with a team of people, each of whom is only allowed ten minutes to contribute. The first player is sent an email which contains the problem; before ten minutes is up they must reply with the email that they want to be sent to the second player. The game continues with fresh players until the task is solved.

So every player has ten minutes to make as much progress as possible. One possible thing that the first player could do over those ten minutes is think about how to decompose the problem. They might identify plausible approaches A, B, and C. I imagine that the team might maintain a global stack of tasks that need to be done at the top of the giant chain email; the player might add these items to the top of the stack:

  • Investigate approach A, write a summary of how promising it is.
  • Investigate approach B, write a summary of how promising it is.
  • Investigate approach C, write a summary of how promising it is.
  • Read the summaries of approaches A, B, and C; pursue whichever seems most promising, and pass forward the instruction “If this approach doesn’t seem promising, write a summary of what happened and then come back to read the summaries of A, B, and C, then proceed as seems reasonable”.

(This approach of having a stack of tasks is like a human version of continuation-passing style, which is a style of functional programming.)

If the stack becomes empty and the current player doesn’t know a way forward, the next player should probably try to summarize the lessons learned from the previous approach and then try to start a new approach.

There are a wide variety of other subtasks I can imagine players doing, such as trying to condense memos from previous players so that future players have an easier time.

I think it’s an interesting open question how much slowdown you get from having everything broken into ten minute chunks. I think it’s also interesting to ask how the amount of slowdown varies as a function of how long each chunk is—e.g., doing this with ten seconds each seems impossible, and doing it with two weeks each sounds quite a bit easier.

I also think that it’s fun to speculate about how what kinds of tasks are possible with this chunked approach. Eg, suppose the task is to translate a complicated English sentence into idiomatic French, given access to the internet but not to translation tools. I think it is likely that there is no way for me to do a good job of that in ten minute chunks, no matter how many copies of me I can make. However, this seems like an unfair task, because it would also take a lot of time for me even without the ten-minute-chunk restriction. I’d be interested to hear examples of tasks that would normally take me only an hour that seem like they would be way harder inside this game.

And there are some tasks that I think would be pretty efficiently doable with this approach. For example, some programming tasks are sequential enough that I could mostly just work on improving the previous person’s code without very much context required. Eg building frontend apps often feels like a sequence of tasks in a very obvious order that can be easily split up into very small chunks; I think I’d have less than a 2x slowdown on that. (The slowdown would be much higher if the previous programmer had a very different programming style than me!)

I also think it might be fun to try to actually play this game. I’d be interested in starting out with a much easier problem which is intended to take 45 minutes. If you’d be interested in contributing ten minutes of programming and thinking time, let me know.

Comments on Facebook here