Thanks heaps to Ben Weinstein-Raun, without whose comments this post would have a confident and wrong conclusion. Thanks also to Chen-Yu Yang for spotting a typo.

Suppose you’ve just inherited a bunch of money, and you want to spend it all on producing as much of some good as you can, for example paperclips. You have to decide how much of your wealth you want to spend on researching the efficient production of paperclips, and how much you want to spend on manufacturing the paperclips. The right move is probably to do all the research and then manufacture all your paperclips after you’ve found the most efficient way of producing them.

Research has diminishing marginal returns. When I started thinking about this, my guess was that the proportion of your effort you’d spend on research would decrease quickly as you increased the amount of wealth you control. But then I tried making a mathematical model for it and now I am not sure.

My result: If the price of making a paperclip decreases and gets arbitrarily close to zero after sufficent research, you probably end up doing a lot of research. If the price does not get arbitrarily close to zero, you end up doing much less.

(I started wondering about this because I was wondering about the subjective experience of a paperclip maximizer, and the answer to this question informs that. The actual manufacturing work of a paperclip maximizer is probably not very computationally intensive, but the research work probably is. It seems possible that the research work of the paperclip maximizer involves subjective experience, but it seems much less likely that the manufacturing work does.)

Logarithmic returns

Let’s investigate what happens if returns to research are logarithmic and unlimited.

(Owen Cotton-Barratt has previously argued that some kinds of research have logarithmic returns. One nice argument for it is that your returns are logarithmic if on day \(n\) you’re twice as productive as you’ll be on day \(2n\), and this inverse law of progress seems plausible and natural.)

So the cost of manufacturing a paperclip after you’ve expended \(r\) effort on researching paperclip manufacturing is \(\frac{1}{\log(r)}\). Now if you have \(x\) total wealth, the total paperclip manufacture you get done is

$^$(x - r) \cdot \log(r)$^$

which you can solve to get the optimal value of \(r\) (full working here):

$^$ e^{\operatorname{LambertW}{\left (e \cdot x \right )} - 1} $^$

The Lambert W function is asymptotially equal to \(\log(x) - \log(\log(x)) + o(1)\). So overall we have

\[\begin{align} e^{\operatorname{LambertW}{\left (e \cdot x \right )} - 1} &= e^{\log(e \cdot x) - \log(\log(e \cdot x)) + o(1) - 1} \\ &= \frac{e \cdot x}{\log(e \cdot x)} \cdot o(1) \\ &\propto \frac{x}{\log(x)} \end{align}\]

So the proportion of energy dedicated to research increases like \(\frac{1}{\log(x)}\).

Other functional forms

If the cost of a paperclip after \(r\) research is \(\frac{1}{\sqrt[n]{r}}\), the proportion of your resources that you should spend on research is the constant \(\frac{1}{n + 1}\).

Returns which approach an asymptote

How about if there’s some minimum paperclip cost, and as you do more research you approach that asymptotically?

If your cost is \(1 + \frac{1}{r}\), which represents a situation where you can get more and more efficient, but there’s a fundamental cost you can’t get better than, then optimal \(r\) is \(\sqrt{x + 1} - 1\).

So in this case, \(r\) grows much slower than \(x\).

Conclusion

I have no idea what the answer to this question is! Depending on the returns to research, you might do a lot of it or not much of it.

This looks kind of like an exploitation-exploration question. I don’t know the results on that kind of tradeoff well enough to know if they apply here.

Aside: What about if the maximizer has an unbounded utility function?

If your utility function is literally the expected number of paperclips in the universe, then you might spend all of your wealth looking for a way to gain control over an infinitely large amount of energy, because if the probability of this strategy working is nonzero it will have a higher EV than any other option, and you will plausibly never be able to be confident enough that this strategy won’t work in order to stop working on it. If you think that this is rational behavior, then with probability 1 you will spend all of your wealth on researching perpetual motion machines or whatever.

Footnote: comment from Chen-Yu

Let c(r) be the unit cost given research unit r. Maximizing (x-r)/c(r) yields 1/(x-r) = -c’(r)/c(r) = - (d/dr) log(c(r)). Thus best r depends on how fast log(c(r)) drops. If c(r) = exp(-kr), r* will be very close to x. If c(r) = r^(-1/n), r* will be proportional to x. And c(r) = r^(-1/n) is even slower


comments on Facebook here