Of AI models and why it’s important to understand them

An intriguing trend in AI 🤖:

“Models all the way down” (aka "stacking")

Have models invoke other models, then watch as emergent intelligence develops ✨

Here’s a discussion of what, how, and why this is important to watch 👇

In its simplest form, the idea is to have AI models use other models as “tools”. For example:

👪 Let GPT spawn copies of itself to solve subtasks

🎨 Give GPT access to a vision model so it can draw portraits
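The "models as tools" idea above can be sketched in a few lines. This is a minimal illustration, not any real API: `call_llm` and `call_vision_model` are hypothetical stand-ins for actual model clients, and the `TOOL:` directive format is invented for the example.

```python
# Minimal sketch of the "models as tools" pattern.
# call_llm and call_vision_model are hypothetical stand-ins for real API clients.

def call_vision_model(prompt: str) -> str:
    # Hypothetical: in practice this would call an image model's API.
    return f"<image generated for: {prompt}>"

def call_llm(prompt: str) -> str:
    # Hypothetical: here we fake a model that delegates drawing tasks
    # by emitting a TOOL:<name>:<args> directive.
    if "portrait" in prompt.lower():
        return "TOOL:vision:a portrait in oil-paint style"
    return f"<answer to: {prompt}>"

TOOLS = {"vision": call_vision_model}

def run(prompt: str) -> str:
    """One step of a tool-using agent: the LLM either answers
    directly or emits a directive that we dispatch to a tool."""
    reply = call_llm(prompt)
    if reply.startswith("TOOL:"):
        _, name, args = reply.split(":", 2)
        return TOOLS[name](args)
    return reply
```

The key design point: the orchestrating model never touches pixels itself; it only decides *which* model to invoke and with what arguments.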


Recently, this has led to programs with surprisingly sophisticated capabilities.

The self-referential nature of it seems to hold real promise. It multiplies the capabilities of the base models.

Right now, I consider this the frontier in building things that look like AGI.


The concept has blown up in public discourse due primarily to two projects: BabyAGI and AutoGPT.

Both involve LLMs recursively calling themselves.

And it’s wild.

Watch AutoGPT incrementally build an entire react/tailwind app, step by step: https://twitter.com/SullyOmarr/status/1644160222733406214
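The recursive self-invocation at the heart of these projects can be reduced to a toy sketch. To be clear, this is not BabyAGI's or AutoGPT's actual architecture; `decompose` and `solve_directly` are hypothetical stand-ins for LLM calls, with a depth cap so the recursion terminates.

```python
# Toy sketch of recursive self-invocation: an agent either solves a
# task directly or splits it into subtasks and recurses on each.
# decompose() and solve_directly() stand in for LLM calls.

def decompose(task: str) -> list[str]:
    # Hypothetical LLM call: break a compound task into subtasks.
    if " and " in task:
        return task.split(" and ")
    return []  # atomic task, nothing to split

def solve_directly(task: str) -> str:
    # Hypothetical LLM call: solve an atomic task in one shot.
    return f"done({task})"

def agent(task: str, depth: int = 0, max_depth: int = 3) -> list[str]:
    """Recursively calls itself on subtasks, with a depth cap
    so the recursion always bottoms out."""
    subtasks = decompose(task) if depth < max_depth else []
    if not subtasks:
        return [solve_directly(task)]
    results = []
    for sub in subtasks:
        results.extend(agent(sub, depth + 1, max_depth))
    return results
```

In the real systems, both the decomposition and the solving are LLM calls against the same base model, which is what makes the "models all the way down" framing apt.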


This concept of building smarter systems by composing models has a long history, however:

Minsky’s Society of Mind (1986), for instance, describes human intelligence as a "bureaucracy" of many interacting, self-contained intelligent sub-systems.



Now, we’re seeing it rapidly pop up in several domains:

In ViperGPT, for example, you give GPT access to a Python REPL and a high-level API for manipulating CV models.

It can then perform complex CV tasks involving both perception and reasoning
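The pattern is roughly: the LLM writes a short program against a perception API, and executing that program answers the query. The sketch below illustrates the shape only; `find` and `generate_code` are hypothetical stand-ins, not ViperGPT's actual API.

```python
# Sketch of the code-generation-over-a-CV-API pattern. find() and
# generate_code() are hypothetical stand-ins for a detector and an LLM.

def find(image, label: str) -> list:
    # Hypothetical: an object detector would return detected regions.
    # Here "image" is just a list of labels, for illustration.
    return [obj for obj in image if obj == label]

def generate_code(question: str) -> str:
    # Hypothetical: the LLM would write this program from the question.
    return "result = len(find(image, 'mug'))"

def answer(image, question: str):
    """Execute the LLM-generated code in a namespace that
    exposes the image and the CV API."""
    namespace = {"image": image, "find": find}
    exec(generate_code(question), namespace)
    return namespace["result"]
```

The division of labor is the interesting part: perception lives in the CV models, while the reasoning (counting, comparing, filtering) lives in ordinary code the LLM writes.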



Likewise, approaches like SayCan are emerging in robotics:

Here, you use an LLM as the “backbone” for robotic reasoning:

Feed in textual descriptions of what the CV models perceive

→ Have GPT come up with a textual “plan”

→ “compile” that plan to robotic action
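The three arrows above can be sketched as a pipeline. This is an illustration of the shape, not SayCan's actual implementation: `llm_plan` stands in for the LLM call, and the skill table is invented.

```python
# Sketch of the SayCan-style pipeline: textual scene -> textual plan
# -> robot primitives. llm_plan() is a hypothetical stand-in for an LLM.

SKILLS = {"pick up": "PICK", "go to": "MOVE", "place on": "PLACE"}

def llm_plan(scene: str, goal: str) -> list[str]:
    # Hypothetical LLM call: plan in natural language from a
    # textual description of what the robot's cameras see.
    return ["go to counter", "pick up sponge", "place on sink"]

def compile_step(step: str) -> tuple[str, str]:
    """'Compile' one textual step to a (primitive, argument) action."""
    for phrase, primitive in SKILLS.items():
        if step.startswith(phrase):
            return (primitive, step[len(phrase):].strip())
    raise ValueError(f"no skill matches step: {step!r}")

def plan_to_actions(scene: str, goal: str) -> list[tuple[str, str]]:
    return [compile_step(s) for s in llm_plan(scene, goal)]
```

The constraint that each plan step must compile to a known primitive is what keeps the LLM's open-ended text grounded in actions the robot can actually perform.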


A recent project that I’m excited about turns this up to 11:

http://toolkit.club uses LLMs to build/deploy “tools” for other AIs.

Implies a loop where:

→ agent asks for tool

→ tool is built/deployed (by LLM)

→ agent can use tool
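That loop can be sketched as a registry with build-on-demand semantics. None of this is toolkit.club's actual API; `build_tool` is a hypothetical stand-in for an LLM that writes and deploys new tool code.

```python
# Sketch of the build-a-tool-on-demand loop. build_tool() stands in
# for an LLM that would generate and deploy tool code from a spec.

def build_tool(spec: str):
    # Hypothetical: an LLM would generate this code from the spec.
    if spec == "add two numbers":
        return lambda a, b: a + b
    raise NotImplementedError(spec)

REGISTRY: dict = {}

def get_tool(spec: str):
    """Agent-facing entry point: return the tool if it already
    exists, otherwise have the LLM build and register it first."""
    if spec not in REGISTRY:
        REGISTRY[spec] = build_tool(spec)  # tool built/deployed on demand
    return REGISTRY[spec]
```

Once registered, the tool is reusable by any agent that asks for the same spec, which is what closes the loop.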



So why is this important?

Most interesting tasks are hard enough that a single LLM query can't solve them (e.g. "build me a $10MM SaaS company").

But when you stack models like this, suddenly a lot of these more complex tasks seem within reach


And while stacking is powerful, its limitations are not well understood, especially in the LLM era.

In having AIs control AIs, we remove humans further from the model's operation.

We have less control over and insight into their outputs. This should be concerning.


While it's exciting to see AI capabilities rapidly advance (and largely driven by hackers, not researchers!), this pattern has the potential to really go off the rails if not properly monitored

We need better tech for observability around large, recursive systems like this.


But the party is just getting started. I'm seeing people drop new, mind-blowing demos leveraging some sort of stacking or recursive self-invocation on a daily basis.

If it proves robust, expect to see this in consumer-facing products soon.

In conclusion:

✨ "Stacking" AI multiplies capabilities
🚀 This is popping up all over AI subdomains
💻 Anyone with a laptop can get in on the action
🤔 This raises new questions about AI safety

Excited to see where this leads.

Originally tweeted by Jay Hack (@mathemagic1an) on April 9, 2023.
