Three ways to apply AI to mathematics

If you wanted to train a machine to play Go, there are three ways you could do it, at decreasing levels of “from-scratchness.”

You could tell the machine the rules of the game, and have it play many millions of games against itself; your goal is to learn a function that does a good job of assigning a value to a game state, and you evaluate such a function by seeing how often a player guided by it wins in this repeated arena. This is an oversimplified account of what AlphaGo does.
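To make the first approach concrete, here is a minimal sketch of self-play value learning. Everything here is an illustrative assumption: the "game" is not Go but a one-pile subtraction game (players alternate removing 1–3 stones; whoever takes the last stone wins), and the value function is a lookup table updated by Monte Carlo averaging rather than a neural network. It shows the shape of the idea, not AlphaGo's method.

```python
import random

# Approach 1 in miniature: learn a value function purely from self-play.
# Toy stand-in for Go: one pile of stones, players alternate removing 1-3,
# and whoever takes the last stone wins. (Game theory says pile sizes
# divisible by 4 are losing for the player to move.)

N = 21           # starting pile size
MOVES = (1, 2, 3)

# value[s]: running estimate of P(player to move at pile size s wins)
value = {s: 0.5 for s in range(N + 1)}
value[0] = 0.0   # no stones left: the player to move has already lost
visits = {s: 0 for s in range(N + 1)}

def play_one_game(eps=0.1):
    """One self-play game with eps-greedy moves, then Monte Carlo updates."""
    s, trajectory = N, []
    while s > 0:
        legal = [m for m in MOVES if m <= s]
        if random.random() < eps:
            m = random.choice(legal)                    # explore
        else:
            m = min(legal, key=lambda m: value[s - m])  # leave opponent the worst state
        trajectory.append(s)
        s -= m
    # The player who moved last won; walking backwards, outcomes alternate.
    won = 1.0
    for s in reversed(trajectory):
        visits[s] += 1
        value[s] += (won - value[s]) / visits[s]        # running mean of outcomes
        won = 1.0 - won

random.seed(0)
for _ in range(20000):
    play_one_game()

# The losing positions (multiples of 4) should end up with low values.
print("positions the learner rates as losing:",
      sorted(s for s in range(1, N + 1) if value[s] < 0.5))
```

The evaluation loop is exactly the "repeated arena": the value table is judged only by the outcomes of games it steers, with no human knowledge in sight.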

Or you could have the machine try to learn a value function from some database of games actually played by expert human competitors — those games would be entered in some formal format, and the machine would try to learn to imitate those expert players. Of course, you could then combine this with simulated internal games as in the first approach, but you’d be starting with a leg up from accumulated human knowledge.
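Here is the same idea in miniature, again under toy assumptions of my own (a one-pile subtraction game standing in for Go, and a synthetic "expert" whose games we log): record expert games in a formal format, then fit a policy that imitates the expert's most common move in each state.

```python
import random
from collections import Counter, defaultdict

# Approach 2 in miniature: imitation learning from a database of expert games.
# Toy game: one pile of stones, remove 1-3 per turn, last stone wins.

def expert_move(s):
    """Optimal play: leave the opponent a multiple of 4 whenever possible."""
    if s % 4 != 0:
        return s % 4
    return random.choice([m for m in (1, 2, 3) if m <= s])  # lost position: anything

def record_game(start=21):
    """Play one expert game and log it as (state, move) pairs."""
    s, log = start, []
    while s > 0:
        m = expert_move(s)
        log.append((s, m))
        s -= m
    return log

random.seed(1)
database = [pair for _ in range(5000) for pair in record_game()]

# "Training": in each observed state, imitate the expert's most common move.
counts = defaultdict(Counter)
for s, m in database:
    counts[s][m] += 1
policy = {s: c.most_common(1)[0][0] for s, c in counts.items()}

print(policy[10])  # → 2: the expert's move from 10, leaving a multiple of 4
```

No rules of the game were used in the fitting step at all; the policy is purely a distillation of the recorded human (well, simulated-human) play, which is the "leg up" in question.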

The third way would be to train on every natural-language book ever written about Go and try to produce natural-language responses to natural-language questions that just tell you what to do.

I don’t actually care about Go, but I do care about math, and I think all three of these approaches have loose analogues as we ask what it might look like for machines to help mathematicians. The first, “from scratch” approach, is the side of things I’ve worked on in projects like PatternBoost and FunSearch. (OK, maybe FunSearch has aspects of both method 1 and method 2.) Here you actively try to keep prior human knowledge away from the machine, because you want to see what it can do on its own.

The second approach is where I’d put formalized proof. If we try to train a machine to get from one assertion to another using a chain of proven theorems in a formal system like Lean, we’re starting from a high platform: a huge repository of theorems and, even more importantly, definitions, which guide the machine along channels which people have already figured out are rich in meaning. AlphaProof is like this.
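For a flavor of what such a system has to produce, here is a (deliberately tiny) Lean proof: each named lemma is one of those pre-dug channels, and the machine’s job is to find a chain of them connecting hypothesis to goal.

```lean
-- Chaining existing lemmas from Lean's library:
-- from a ≤ b, conclude a ≤ b + 1 via transitivity with b ≤ b + 1.
example (a b : Nat) (h : a ≤ b) : a ≤ b + 1 :=
  Nat.le_trans h (Nat.le_succ b)
```

The repository supplies `Nat.le_trans` and `Nat.le_succ` ready-made; the search problem is over which theorems to chain, not over raw mathematics.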

The third approach is more like what OpenAI’s o1 is doing — asking whether you can circumvent the formal language entirely and just generate text which kindles mathematical insight in the mind of the human reader.

I think all of these are reasonable things to try. I guess my own mostly unjustified prejudice is that the first approach is the one that has the most to teach us about what the scope of machine learning actually is, while the second is the one that will probably end up being most useful in practice. The third? So far I think it doesn’t work. I don’t think it couldn’t work. But I also don’t think it’s on an obvious trajectory towards working, if words like “trajectory” even make sense in this context. At some point I’ll post an o1-preview dialogue which I found very illuminating in this respect.

Published on December 20, 2024 08:24


Jordan Ellenberg's Blog