How Good Can AI Get?

After spending some time on this post I realized I’d already written a different version of it which is arguably better, but I’m going to press on to finish this since I need the writing practice.

When thinking about what we can expect from AI in the future, I think it's good to analyze it in terms of what kinds of limitations exist on its abilities.

  • Compute: The compute capacity of the system both for training and evaluation.
  • Structure: How the data is organized and the model defined.
    • We need to develop good mappings from the task domain to the model, such as tokens, patches, embeddings, etc.
    • We also need the right structure for the AI systems/models to encode the knowledge (transformers, JEPA, etc.)
  • Data: The existence of enough examples of the task being performed correctly or measured accurately
    • Examples could come from humans (LLMs) or simulations 
    • More complicated tasks require more examples
  • Form: The synthetic system has the physical characteristics necessary to perform the task.
    • ChatGPT can produce text and images but it cannot throw a baseball.
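To make the encoding point concrete, here's a toy sketch of mapping text into tokens and then embeddings — the kind of task-domain-to-model mapping real LLM pipelines do at vastly larger scale. The vocabulary and dimensions here are made up for illustration:

```python
# Toy sketch: mapping raw text into model inputs (tokens -> embeddings).
# The vocabulary and embedding dimension are made up; real systems learn
# subword vocabularies and embedding tables with billions of parameters.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}

def tokenize(text):
    """Map each word to an integer token id; unknown words map to <unk>."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

# A tiny embedding table: one 4-dimensional vector per token id.
embedding_table = [[0.1 * tid + 0.01 * d for d in range(4)]
                   for tid in range(len(vocab))]

def embed(token_ids):
    """Look up the vector for each token id."""
    return [embedding_table[tid] for tid in token_ids]

tokens = tokenize("The cat sat")   # [1, 2, 3]
vectors = embed(tokens)            # three 4-dimensional vectors
```

The model never sees raw text at all — only these numeric representations, which is why choosing the mapping well matters.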

We’ve seen a lot of improvement in AI recently because of advances in compute, data, and structure. Form is something a lot of people tend to ignore these days while we focus on getting AI to pass standardized tests or make cool videos.

How well AI systems will work in the future, and how quickly they will improve, will be determined by how well we can overcome the limitations in the categories listed above. Let’s look at them one by one.

Compute


The huge advances in Compute have been driven by GPU and software innovations, and simply by the willingness to spend a lot of money. The old promise that computers keep gaining functionality while costing less has been somewhat broken by the huge requirements of these AI systems. For decades we’ve talked about how hardware gets faster yet more complex software means we need faster and faster devices just to keep the UI running, but these large models require so much computation that even Apple has to offload much of its new Apple Intelligence to a data center, despite having crazy fast chips on its devices.

People will find new tricks to speed up these algorithms. Hardware will improve, but it’s not going to get super cheap soon as the demands keep going up. We might see more work move to on-device models over time, but we still have to contend with the physics there for power, etc.

We might see the rate of growth in spending decline here and we might see a slowdown in how fast we can innovate in silicon. Even Nvidia makes mistakes. I am more optimistic for improvements in algorithms and overall, I don’t see compute as a key limiter to the rate at which AI advances over the next few years.

Structure

AI systems have shown a remarkable ability to absorb and learn from all kinds of data. For any domain you want the synthetic system to operate in, you must have a way to encode the data so you can train it well. We see this come up in things like self-driving cars, where there is a debate between the people who make the ones that work, who say you need lidar, and the people who promise it will work someday, who say you don’t.

It seems from all the advances in recent years that people are good at figuring out how to encode things in some cases, and in other cases the specific encoding doesn’t matter: with enough data, the system learns what features to care about anyway.

Beyond encodings, we’ve seen big advancements in how we organize the structure of these large neural net systems. Transformers have proven valuable for a variety of use cases, but I feel pretty confident people will continue to invent better and better versions of these things. We will see systems that are really several different kinds of models interacting with each other.

I don’t see structure as a key limiter on continued rapid advances in AI, but I do think we won’t get to AGI just by putting more stuff into the structures we have now. Researchers will have to invent new ones, and we can’t guarantee that innovation will happen on any kind of schedule.

Data

Data is a big question mark: for LLMs, some people argue they’ve already been trained on almost everything, so how much more data is there? I’m pretty convinced there’s still more to be extracted by getting better at selecting the best data to train on. These LLMs have seen enough text now to have a pretty solid facility with language; what they need is a clearer set of solid information from which to do a better job of guessing.

For image generation models, it’s less clear how many more artists’ works need to be scraped to make these systems even better artists, but there’s clearly enough photo and video data to make a real run at improving them if you’re willing to spend the fortune required on compute. And it seems people are willing.

There’s also the potential to create all kinds of synthetic data. Consider using a video game engine to create video to train on. A key advantage of this kind of data is that you already have a lot of semantic information about the video since you used structured information such as 3D models to create it. You can create all of the data you would like this way but you still need someone to decide what that data should be. It’s still a finite amount of data and somewhat limited by human creativity.

When I think about AI systems which are smarter than humans, it seems like this might only be possible if you can create almost infinite amounts of data. This data can’t be too expensive to produce or train with or the training cost will be too high. One way you might create that amount of data is with a simulation. This is essentially how AlphaGo works. It can play as many games of Go as it wants (even against itself) because that entire environment (the Go game) can easily and efficiently be simulated inside the computer. The system can just keep throwing things at the wall and keep trying until it finds a way to be a little better. That idea of better can be measured and selected for and it keeps going until at some point it is effectively superintelligent but only in the context of the universe of Go.
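The loop can be sketched in a few lines. To be clear, this is nothing like AlphaGo's actual training (which combines tree search with deep networks); it's a toy hill-climb over a simulated game, just to show the shape of "play yourself, measure, keep what's better":

```python
import random

# Toy sketch of self-play improvement in a fully simulated game.
# "Skill" is a single number; a tweaked copy challenges the champion,
# and we keep the challenger only if it clearly wins head-to-head.

def play_match(skill_a, skill_b, rng):
    """Simulated game: the higher-skill player wins more often."""
    return "a" if rng.random() < skill_a / (skill_a + skill_b) else "b"

def self_play_improve(rounds=200, seed=0):
    rng = random.Random(seed)
    champion = 1.0                                      # current "policy"
    for _ in range(rounds):
        challenger = champion + rng.uniform(-0.1, 0.2)  # a tweaked copy
        wins = sum(play_match(challenger, champion, rng) == "a"
                   for _ in range(25))
        if wins > 13:                                   # clearly better
            champion = challenger                       # keep the improvement
    return champion

final_skill = self_play_improve()
```

The key property is that the evaluation signal (who won) comes from the simulation itself, so the loop can run as long as you can afford the compute — which is why it works for Go and not for the messy real world.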

True superintelligence, the way we tend to mean it, would require that level of mastery of the world we live in, and we can’t easily model or simulate that world. We might think a superintelligent AI could learn to manage the global financial system by training on all the data for asset prices, transactions, bank accounts, etc. The problem is that a huge part of that system exists only in the heads of people: their desires and so on.

I don’t see how we are going to acquire this data, so I do think that lack of Data is going to be a challenge for building any AI that is smarter than people.

Form

The last thing to discuss from the above list is Form. A functioning AI system is embedded into the world in some way and takes on some kind of Form. It might be a chat box in a web browser, a drone, a self-driving car, or a factory robot. As humans we think a lot about intelligence as being something “in our heads,” but really, if we want synthetic systems that can do what we do, we need to consider how we are physical beings who can learn to ski as well as learn to write an email.

I think the tricky part here is that we simply don’t know how much coupling there is between the form we take in the world and the intelligence that manifests in our brains. We’ve seen computers learn to manipulate symbols by being given lots of symbols as examples, but we don’t really know how well that extends to a computer learning to surf. By comparison, table tennis is pretty easy.

I think we can eventually solve all these problems, but innovation here will be much slower. Physical systems take longer to design and iterate on than most software. Working with Atoms instead of Bits always takes more time and testing. Control theory is pretty well understood, but not so much when the controls are deep learning models.

Continuous Learning

As a closing thought, I want to consider something not directly addressed above that is mostly absent from current deep learning AI systems: the ability to autonomously continue learning. The most intelligent AI systems we have now still work by training on a large set of data and then evaluating new situations based on that training. Very few are set up to learn and integrate new information in real time. They can analyze a large context, sure, but they will analyze it the same way the next time and not learn anything from the interaction. People are of course working on things that do this, but it’s still new.
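The distinction can be shown with a deliberately tiny sketch. The "models" here are just running estimates of a number stream — nothing like a real neural net — but the loop structure is the point: one ignores everything after training, the other folds each new observation into its state:

```python
# Toy contrast: a frozen model vs. one that keeps learning after deployment.

class FrozenModel:
    """Trained once; new observations are discarded."""
    def __init__(self, trained_value):
        self.value = trained_value   # fixed at training time
    def predict(self):
        return self.value
    def observe(self, x):
        pass                         # new data is ignored

class OnlineModel:
    """Updates its estimate with every observation it sees."""
    def __init__(self, trained_value):
        self.value = trained_value
        self.count = 1
    def predict(self):
        return self.value
    def observe(self, x):
        self.count += 1
        self.value += (x - self.value) / self.count  # running mean update

frozen, online = FrozenModel(0.0), OnlineModel(0.0)
for x in [4.0, 4.0, 4.0]:
    frozen.observe(x)
    online.observe(x)
# frozen.predict() stays at 0.0; online.predict() has drifted toward 4.0
```

Today's large models mostly behave like the first class between training runs, which is exactly the gap the question below is pointing at.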

Can we get to AGI or beyond with a system that can’t learn new things?

