The AI Moment

Created

Mar 3, 2023 8:56 PM

Part I

What is the goal of AI: a mediocre human or a better machine?

When ChatGPT burst onto the scene late last year, it was a moment that some compared to the launch of the iPhone. It was the fastest-ever application to hit 100m users. A thousand generative AI market maps from VCs were born. Microsoft moved to inject a huge sum of capital into OpenAI and roll out an integration into Bing. For the first time since Larry and Sergey were tinkering in a garage, Google’s monopoly on search seemed on the precipice of breaking.

However, as people played with tools like ChatGPT and DALL-E, they quickly realized some of the flaws: the hallucinations, the lack of real-time liveness, and imprecision when it came to unusual requests.

Sydney, Bing’s alter ego powered by ChatGPT, might’ve accelerated public consciousness of AI X-risk research by several decades with her manic, otherworldly conversations and disturbing demands.

What’s perhaps more interesting than rehashing the obvious flaws of current LLMs is studying the reaction to ChatGPT and Sydney. For ChatGPT, more and more topics became taboo for it to answer as users tried to break it in various ways by asking it to write poems about Donald Trump or say slurs. Meanwhile, Sydney was effectively killed by Microsoft a few days after rollout as they tried to limit her to the most basic of use cases. In our deeply divided political climate, there’s more and more noise about how we’ll need to build AGIs for each of our political systems where users can “choose their own adventure.”

I can’t find where I originally saw this meme but credit to whoever the author was.

This brings us back to my original question of what our purpose is when we build artificial intelligence: a mediocre human or a hyper-intelligent machine assistant? If you think it’s the latter, it’s kind of wild that we’re using billions of dollars of compute to generate a model that needs to search through its entire training data set of mostly irrelevant info to try to guess the answer to a question. Oh and then we need to use more compute, filters, and hours of cheap, potentially scarring human labor feedback on top of that to make sure it’s coherent and doesn’t say anything offensive or illegal.

There’s a very real possibility that even if LLMs are the right approach to building an artificial general intelligence,

all of the interventions will devitalize the tools so much that they become neutered human intelligences, perhaps capable of mimicking or commoditizing basic work, but hardly fulfilling the promise of wealth creation and massive abundance.

The question that follows then is: what is a better approach for utilizing AI for human flourishing, a giant monolith that is lobotomized with a thousand cuts or a narrow agent that could become more generalized and powerful over time?

AlphaGo, an AI program using neural networks and a decision-making algorithm, beat the best human in the world at the ancient game of Go in 2016. AlphaGo works by accessing the entire library and history of Go, analyzing all matches including ones it plays against itself, and narrowly focusing on what is the highest-probability move that leads to winning. But what’s perhaps far more interesting is what has happened in Go since then.

This paper highlights the details, but the key takeaway is that after the widespread availability of AlphaGo, human gameplay significantly improved, with less errors and better decisions. Humans have limited processing power and could not possibly search through the entire history of Go and evaluate the probability of each move across a vast search space of millions of decision trees. In this case, the AI was an instructor to humans, finding new knowledge and then teaching it to them to help them make better decisions. The tool doesn’t generalize as well as GPT-3, but it solves one particular game better and more efficiently.

While Go is just a game, this has interesting implications for other areas like drug discovery, education, defense, etc. Wouldn’t you rather have the world’s best threat detection machine that has wargamed millions of scenarios and then teaches humans its best strategic findings versus something more generalizable, but much more bloated and expensive, making its best educated guess after taking in streams of irrelevant input? Wouldn’t you rather have AI agents making unique discoveries in math or biology versus ones that also know random facts about the Peloponnesian War that might or might not be true?

In this vein, there’s interesting novel research being done at DeepMind and other early-stage startups to combine some of the huge advances in deep learning with more traditional forms of computer science to build AI agents in a more efficient way than the brute scale approach. Even within current language model approaches, there are startups training language models on smaller samples and then building workflow tools on top of them to solve more narrow problems.

Even if GPT-12 or whatever could reach the same conclusions over time, it’s likely not the most efficient and best way to solve many real-world problems. And we still have to worry that all the interventions we have to make to make this agent "safe” may turn it into an overwrought Frankenstein machine lumbering around haphazardly rather than a useful, hyper-efficient, and intelligent assistant to humans. Additionally, we might find out the intelligent assistant can be trained to add new skills over time individually in a compounding layer cake of learning rather than try to learn everything all at once mediocrely.

The counterpoint to my view is that at a certain scale or emergent takeoff point, a giant model becomes so much more intelligent than humans that for most conceivable economic problems, it will be superior to human labor.

The thinking goes that if this is true, it's worth the resources and input because the AGI solves nearly every problem for us instantly rather than each AI agent solving just one or two individually. It’s a machine God that we hope doesn’t kill us all. This is what we see in science fiction sometimes and the origin of a million Less Wrong blog posts.

Credits: Sean Gallup, Getty Images

Even if you believe that this is the future

, the timeline of this AGI takeoff is completely unpredictable and impossible to build an investing thesis around, especially for VCs whose job is to "see the present clearly.” It also calls into question how much it makes sense to invest in companies built on top of a company that’s attempting to achieve a different goal than “purpose-built infrastructure for enterprise or consumer use cases.”

Finally, as alluded to before, the types of questions you need to ask if an AGI happens and OpenAI succeeds are far more existential, philosophical, and political in nature than what type of analytics dashboard or devtools you need for LLMs.

roon @tszzlit is a bit strange to watch ai people talk about go to market and disruption and stripe dashboards and all that mundane startup stuff when the rewriting of civilization is at hand
12:04 PM ∙ Feb 12, 2023966Likes53Retweets

In the meantime, let’s go to our second question: What applications or workflows do you want a mediocre human replacement versus an intelligent machine assistant?

The two ideas of mediocre human replacement or machine assistant are often conflated together because of how we lump all of artificial intelligence together, but a machine that plays every game of Go and pulls forward strategy improvements by hundreds of years is quite different than a machine replying to customer service requests and being fine-tuned to not gaslight users or say anything offensive along the way.

Commoditization of human labor is economically valuable: generalized LLMs could slot into some workflows as replacements entirely or early draft generators that automate hours of work before a human finishes the final edit.

Some examples include:

Customer service
Design and sketch tools for creators
Copywriting
Rough drafts for PowerPoints, financial models, etc.
Better search, particularly in enterprise, e-commerce, etc.
Summarization of information or data
Most work on Upwork and Fiverr

These are all use cases where there is high error tolerance and any mistakes are not super-frustrating to the user leveraging the tools compared to the time saved. Starting with a blank sheet of paper and instantly having a ten-page deck that I can edit quickly for errors is a huge unlock and time-saver versus doing it from scratch. We can also expect these models to continue to improve over the next few years although there will likely be diminishing returns.

Comparatively, there are other workflows where high accuracy is mandatory for user happiness or some where you might simply look to push the boundary of what’s possible in solving a problem. These are applications where purpose-built verticalized models could be much more useful.

Some examples include:

Threat detection in defense
Research assistants
Strategy games
Drug discovery
Other use cases founders will invent and are perhaps unlocked by new research

I’m personally much more interested in companies building in the latter category than the former. It’s not that the former categories won’t exist, but the question of moats is uncertain with most value likely accruing to incumbent. For example, is there a new startup building “Generative AI for design” or does Figma simply integrate with the latest models and win the category?

I’ve currently made one investment in the latter category and am eager to make more.

I’ll end with a provocative question a friend posed to me: are current implementations like ChatGPT and Sydney just a faster horse instead of a brand-new car?

These are giant monoliths continually being hacked at to prevent danger rather than something sleek and efficient built to serve humans. By putting in tons of guardrails, will the best outcome it can achieve a perfect simulacrum of a mediocre, inoffensive human or can it instead push the boundaries of what’s possible in various fields?

François Chollet @fcholletThe near future of AI is to serve as a universal assistant. Whatever you create on a computer -- slides, code, spreadsheets, docs, tunes, 3D environments, etc. -- you will be able to leverage a digital assistant to help you with boilerplate, filling in details, autocomplete, etc.
2:50 AM ∙ Feb 1, 2023987Likes179Retweets

For some use cases, a faster horse is certainly valuable but it’s hardly the complete economic game-changer that some imagine.

If we want to use these machines to unlock humans in a totally new way, maybe we should think about building a car.

In Part II, we’ll look at how large language models might deploy into enterprise. Thanks to Jungwon Byun, Ben Van Roo, John Dulin, and Blake Eastman for reading earlier versions of this article and for their immensely valuable feedback. Would love to hear thoughts and feedback and if you’re a founder building in this area, please reach out at pratyush [at] susaventures [dot] com.

To quote Peter Thiel, the most contrarian thing to do is not to oppose the crowd, but to think for yourself.

Most VC narratives should be discarded at this point. Most are narrative mirages.

A good example a founder gave me: if you ask diffusion models like DALL-E anything precise with unusual associations like “basketball player holding two tennis balls” it will completely fail, while comparatively producing amazing output for anything dream-like and highly associative like “ancient ruins in a magical forest.”

Hardly a foregone conclusion. Well-known deep learning researchers like Yann LeCun say LLMs will only ever be stochastic parrots and other approaches will be needed for true autonomous machine intelligence.

This is the definition OpenAI uses.

I don’t believe this but a conversation here ventures more into philosophy, religion, man’s search for control over his domain, and perhaps all the way back to the Garden of Eden. If you want to talk about this, reach out for a coffee but I’ll skip it here.