The goal of many AI research projects is to create human-like intelligence. I’ve heard it suggested that, on the road there, we’ll arrive at cat-like intelligence first.
I love the idea of cat-like intelligence. I adore my cat and the bizarre way she thinks. But the suggestion of a continuum between cat and human intelligence is an interesting one: while we have similar brain cells, our last common ancestor was a shrew-like being around 80 million years ago.
Is mammal intelligence a fungible phenomenon, scaling linearly with the volume of available brain mass? Or does it result from a complex combination of structural and systemic attributes that are unique to the bodies of individual animals?
I’m no biologist, but I strongly suspect the second: we’re not just gooey processors, points on the trend line of a biological Moore’s Law. An animal’s brain and mind result from a utilitarian assembly of capabilities that promote its species’ survival in the wild. Humans have the wetware to read faces. Cats may have circuits for spotting mice.
A cat’s brain is part of a finely tuned mechanism for converting raw matter into further cats. Environmental factors, such as the availability of matter in the form of unfortunate birds and mice, create constraints. Natural selection produces structure and behavior that fit them. Evolution is the ultimate neural architecture search, performed over aeons and implemented in flesh and blood.
Intelligence, as we think of it, is bullshit. There’s no way to measure the intelligence of a human being, let alone an animal, or a piece of software. The metrics we use, like IQ, ARC, or the International Mathematical Olympiad, are terrible proxies, designed to measure a concept we have no way to define. Our benchmarks measure something, but not intelligence. That word has no true meaning.
Folk wisdom describes intelligence as a sort of mystical fluid. It’s a solution of enzymes, weak or strong, that can dissolve ideas and remake them into others. A big brain contains lots of it, a small brain contains a little. In some brains it’s more concentrated than others. Einstein’s was full of juice; Forrest Gump’s is mostly water.
This faulty intuition leads to odd thought experiments, pervasive in popular culture. According to this model, it might be possible to decant intelligence and experience from one brain to another: we could increase our intelligence with wonder drugs, switch bodies with another person, or inhabit the mind of an animal while remaining our capable selves. This comic-book fungibility of intelligence—the sock-puppet model of the mind—points to a dualist belief in intelligence as a variable aspect of the soul.
But the metaphor of fluid intelligence is clearly wrong. There’s no single general mechanism for “figuring stuff out”. Our brains do many things—perceiving, reacting, planning, executing—and for the most part, these capabilities are distinct. A person can be great at writing poetry but incapable of recognizing faces; they might play beautiful piano while unable to do basic maths. Think of the calculation implicit in choosing the angle and power of a tennis shot, performed entirely outside of our rational minds. Our intellect has topology, which is too much to ask of a fluid.
There are two reasons for this topology. The first one is physical. There are regions of our brain that are specialized. Damage to the occipito-temporal lobe can result in face blindness. If intelligence were fungible, this wouldn’t happen: the lesioned brain might have reduced capacity, but it wouldn’t lose specific functions. Lose a few cores and a processor can still do the same work, but lose a chunk of brain and all bets are off.
The second reason for our topological intellect is the necessity of learning. Through experience we acquire knowledge and skills which form the bedrock of our capabilities. To a great extent, we learn how to think. Our problem solving ability is based on situations we’ve encountered in the past, or things we’ve been exposed to through our cultural environments. Much of what we consider “intelligence”—the noble application of reason, logic, and creativity—resembles cart tracks in a rutted road.
All this goes to say that our intelligence, however measured, is not at all comparable with a cat’s. It has been shaped by a different set of evolutionary constraints and experiences. The idea of a continuum—amoeba to the left, cat in the middle, and human being on the right—sounds intuitive, but it’s nonsense. We can only really be evaluated with respect to our environments: how well do we operate in the conditions we inhabit?
The goal of much of AI research is to add another being to this scale: amoeba, feline, human, robot. Our assumption is that a “superior” intellect to ours would be incredibly useful: it could think faster, make greater leaps, and draw on deeper wells of knowledge. It could tackle the problems we struggle to grasp, automate our drudgery, and accelerate our progress.
But there is no continuum of minds. A “greater” intelligence is simply different: like humans and cats, it is fitted to its environment, and its environment is not the same as ours. We struggle to build systems that emulate humans; we admire their endless poetry and simulated art, but these ersatz minds are built for a different space.
Our most sophisticated AI models inhabit the simplest of possible worlds. Their homogeneous neural architectures are trained to replicate writing or images: take one input, output the same. The nature of these proxy tasks forces the evolution of internal structures, inward projections of the nuance of the data.
If you use enough data, the internal structures grow until they mimic the processes creating the data: if your data is writing, those processes are the minds of human beings. This results in some cool shit: a large language model can write. It can describe images, create software code, give instructions on how to bake, and perform some simple reasoning. It mimics knowledge as well as skills—its inner state contains encoded facts in addition to the rules of grammar.
But when your world consists of mimicry, it’s hard to make sense of context. A model asked a question will give you an answer, whether it’s right or not. It will attempt to appear statistically convincing: it will produce a reasonable sample from its training distribution. It can’t determine whether it actually makes sense.
This makes these models hard to apply to many real problems. While techniques like RLHF attempt to steer a model towards reasonable outputs, they are expensive, difficult, and not entirely effective. Making LLMs usable is a massive engineering challenge, and the focus of a growing industry. It remains unclear whether that effort will succeed.
Intelligence is unmeasurable, but that doesn’t stop people from trying. We may indeed train foundation models that beat humans on specific metrics. But if we can only be evaluated with respect to our environments—and we continue to train these models on unrepresentative proxy tasks—we’ll never escape the problem that our tools are built for worlds that in no way resemble our own.
Enter the cat.
A cat lives in a similar world to us. Their lives are quite different, but their motives and activities are in some ways alike. At the very least, the process of being a cat has more in common with the process of being human than either has with the training loop of a large language model.
Cats, like us, are highly successful in their niche. They’ve developed behaviors and tendencies that have helped them colonize the planet. Cats are smart little hunters with complex social behaviors, so their success depends more on intelligence than, say, an amoeba’s does. In addition, cats are able to thrive in direct contact with humans: they share our cities, inhabit our homes, and often rely on us to meet their needs.
With so much shared context, a cat is much more effective at navigating our world than a large language model. This shows the limits of brute force: unlike an LLM, a cat’s brain doesn’t contain the sum total of human knowledge—and they certainly aren’t capable of sophisticated reasoning, which is the thing we pursue by chasing modern AI benchmarks. By all of our dubious metrics, a cat is less intelligent than state-of-the-art AI—yet they are capable of autonomy that technologists can only dream of.
If we can’t measure intelligence, perhaps we can describe it. What characterizes the intelligence of a cat? What does it mean to approach the world in a cat-like way? And why is this approach so much more successful than the “fluid” intelligence that we’re attempting to create with modern AI?
I spent a few minutes brainstorming aspects of a domestic cat’s capabilities and behavior, in a totally unscientific way. I’ve grouped them into a few categories:
Learning
Strong associative learning: good or bad experiences quickly lead to conditioning (learn a trick in exchange for a treat, run away when you see the vacuum since it was noisy in the past).
Limited deductive and inductive reasoning, or generalization. Knowledge seems to only come from direct associative experiences.
Planning
Goals have strong dependence on chronology/timing (expect food at 5pm).
High chance of novel actions when expectations are not met (mischief at 5:10pm if food has not yet arrived). This makes sense—since they have to learn associatively, they need to try new stuff if their current action isn’t working.
Time-based recovery after intense negative experiences (hiding, reduced activity).
Behavior
Preserve energy by resting in the absence of stimuli or a chronological trigger.
Instinctive hunting behaviors. If a situation even slightly resembles a hunt, start hunting!
Instinctive burying of waste.
Curiosity
Investigation of novel stimuli below a comfortable intensity threshold (snuffling around a new item of furniture).
Avoidance of novel stimuli when above a comfortable intensity threshold (hiding from a noisy guest).
A cat’s capabilities are distinct from its intelligence, but listing them is a useful way to analyze what a cat really is—and what makes cats good at coexisting with humans. A cat can recognize individual people, communicate across the species barrier, navigate urban environments, and avoid being squished by fast-moving machines. They are smart little things.
For example, a cat’s associative learning means they can be trained to behave in convenient ways (don’t scratch the sofa!). Their ability to learn patterns in time allows them to fit within our own routines. Their natural curiosity makes them fun companions, while an aversion to intense stimuli keeps them safe from harm.
All of these things require intelligence, but it’s their organized behavior that helps cats function in our world. A cat’s behavior gives structure to its intelligence: it helps the cat perform the appropriate tasks at the appropriate time. It remains unknown how this structure is implemented in the brain, but it’s likely a product of the co-evolution of an animal with its environment.
A chatbot, like ChatGPT, is subject to a simple feedback loop between its output and input. Animals exist within feedback loops, too, but they are infinitely more elaborate. To survive within the infinite complexity of time and space, animals must be able to modify themselves, their environments, and their own inner state.
There’s a widespread and highly effective way to model the behavior of a system that interacts with feedback loops: the state machine. A state machine describes a finite number of possible states that a system may have, and the events that will cause the system to move from one state to another. State machines are used to implement all sorts of devices that interact with humans and the physical world: think traffic lights, turnstiles, and vending machines.
State machines help structure and control the behavior of a system. They reduce an infinite space of inputs to a finite number of states. This makes them ideal for handling real world events. It also makes them useful for integrating the messy and complex outputs of deep learning models.
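To make that concrete, here’s a minimal sketch of a turnstile as a state machine in Python. The state and event names are invented for illustration; the point is that a handful of explicit transitions covers every input the system will ever see.

```python
# A minimal turnstile state machine: two states ("LOCKED", "UNLOCKED") and two
# events ("coin", "push"). All names here are illustrative, not from any library.

TRANSITIONS = {
    ("LOCKED", "coin"): "UNLOCKED",    # paying unlocks the arm
    ("LOCKED", "push"): "LOCKED",      # pushing a locked arm does nothing
    ("UNLOCKED", "push"): "LOCKED",    # passing through locks it again
    ("UNLOCKED", "coin"): "UNLOCKED",  # extra coins are ignored
}

def step(state: str, event: str) -> str:
    """Return the next state for an event; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

state = "LOCKED"
for event in ["push", "coin", "push"]:
    state = step(state, event)
    print(event, "->", state)  # push -> LOCKED, coin -> UNLOCKED, push -> LOCKED
```

However many events arrive, and in whatever order, the system can only ever be in one of two well-understood states.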
Modern video games are huge and sophisticated. Their complex environments and vast numbers of possible user actions provide many degrees of freedom. Computer controlled game entities—such as enemies, and non-player characters—must respond to their environments in coherent ways for a game to be convincing and fun.
Game developers use state machines, and similar algorithms, to model this complex behavior. They provide a way to analyze a large series of inputs and decide on a reasonable response.
Until recently, AI was all about state. Recent advances in deep learning and generative AI have shifted the focus, but classical AI was based around the application of rules to symbols. The past two decades have given us shiny new tools, and we’ve lost track of the finely honed implements that we already had.
The current paradigm in AI research says that a sufficiently powerful deep learning model can do everything. If we can only find the right combination of dataset and training, we’ll create a system that will give our desired output given any set of inputs.
Unfortunately, this doesn’t really work. With an infinite input space, there’s no way to validate that the system consistently behaves as desired. In real world applications, edge cases are rife. While your system may work predictably 99% of the time, that last 1% is often the deciding factor in whether it is usable for a real world task.
State machines, along with other old-school AI concepts, let us escape from the paradigm of “one-model-to-rule-them-all”. They allow us to specify the finite set of behaviors we want, along with the conditions that should trigger them. They also provide a way to assemble the outputs of multiple models—computer vision, sentiment analysis, generative AI—and make decisions based on all of them. This combined approach is known as neurosymbolic, or hybrid, AI.
A cat is not a state machine, but a state machine provides us a convenient way to emulate the external behaviors of a cat-like mind. Simple perception models, like object detectors and audio classifiers, become the inputs to high-level, hand-written algorithms that encode the states and behaviors of a cat: hiding from the vacuum, or hunting a mouse. We fuse together many small, easy capabilities into a single sophisticated thing.
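Here’s a rough sketch of what that fusion might look like. The perception labels, intensity threshold, and state names are invented for this example; in a real system they would come from an object detector, an audio classifier, and whatever behaviors you chose to encode.

```python
# A toy "cat-like" controller: hypothetical perception outputs drive
# hand-written state transitions. Nothing here comes from a real model.

from dataclasses import dataclass

@dataclass
class Percept:
    label: str        # e.g. "vacuum", "mouse", "ambient"
    intensity: float  # 0.0 (faint) to 1.0 (overwhelming)

def next_state(state: str, percept: Percept, hour: int) -> str:
    # Aversion first: intense or frightening stimuli trigger hiding.
    if percept.intensity > 0.8 or percept.label == "vacuum":
        return "HIDING"
    # Instinct: anything that even slightly resembles prey starts a hunt.
    if percept.label == "mouse":
        return "HUNTING"
    # Chronological trigger: expect food at 5pm, demand it if it's late.
    if hour == 17:
        return "DEMANDING_FOOD"
    # Time-based recovery: a hiding cat only re-emerges once things are calm.
    if state == "HIDING" and percept.intensity >= 0.2:
        return "HIDING"
    # Mild novelty invites investigation; otherwise conserve energy.
    if 0.2 < percept.intensity <= 0.8:
        return "INVESTIGATING"
    return "RESTING"

state = "RESTING"
state = next_state(state, Percept("vacuum", 0.9), hour=14)   # -> HIDING
state = next_state(state, Percept("ambient", 0.1), hour=15)  # -> RESTING
state = next_state(state, Percept("mouse", 0.4), hour=16)    # -> HUNTING
print(state)
```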
State machines give us tons of control over behavior, and they are extremely efficient: just some if statements in a loop. They can be used to model many types of systems, from incredibly simple to extremely sophisticated. They are easier to describe, manipulate, and reason about than pure deep learning models. They are a powerful engineering tool.
AI research began as a way to try and understand intelligence from first principles. It was created as a tool for comprehending living minds: if we can build new types of intelligence, perhaps we can understand our own. Even modern AI, built with artificial neural networks, has its roots in these lofty goals.
But in 2024, AI is not just research: it’s a practical engineering discipline. The goal of AI engineering is to build intelligent systems that can act in our world in ways that have utility. The motives of an AI engineer are different to those of a researcher: we wish to build useful tools, and we don’t mind how they work. There’s no need for biological inspiration if a system does its job.
Engineers wish to build systems that exhibit certain behaviors. It’s much easier to start by describing those behaviors—and working backwards from them to an implementation—than by starting with the blank canvas of a neural network and attempting to evolve a working mind.
To the eyes of an engineer, a cat-like intelligence is a system that behaves approximately the same as a cat. Given the same inputs, it will respond in a similar way. Cat behavior can be described—superficially, and experimentally—in relatively simple terms, like the behavior of a sidekick in a video game.
If we’re looking for cat-like intelligence, we’ll find it fastest by building one from scratch: by describing the things we need, and assembling the capabilities that deliver them. Simple perceptive models, united by state machines and other basic algorithms, and running on modest hardware, can get us most of the way there.
Cats are simple, practical creatures that excel at inhabiting our world. If we want AI we can physically coexist with, building cat-like intelligence may be a good place to start. But we need to stop thinking in terms of a continuum, from amoeba to feline to human. These are not stations along the express line to AGI: that’s an entirely different track.
It seems to me that the way to square the circle is by programming desire and aversion into a more generalized neural network: then you could mimic the if-then statements of a state machine within something more complex and chaotic than a mere video game character. Some kind of reward system.
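One toy way to picture that idea: express desire and aversion as a reward signal, and let a learned policy converge on behavior that resembles hand-written if-then rules. Every state, action, and reward value below is invented for illustration, and a simple lookup table stands in for the neural network.

```python
# A bandit-style learning sketch: rewards encode desire (food) and aversion
# (the vacuum), and the learned policy ends up mimicking if-then rules.

import random

STATES = ["food_nearby", "vacuum_on", "quiet"]
ACTIONS = ["approach", "hide", "rest"]

def reward(state: str, action: str) -> float:
    if state == "food_nearby" and action == "approach":
        return 1.0    # desire: food is good
    if state == "vacuum_on" and action == "hide":
        return 0.5    # relief: getting away from the noise is rewarding
    if state == "vacuum_on":
        return -1.0   # aversion: anything else near the vacuum is bad
    if state == "quiet" and action == "rest":
        return 0.2    # conserving energy is mildly good
    return 0.0

# A lookup table stands in for the "more generalized neural network".
q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, epsilon = 0.1, 0.1  # learning rate and exploration rate

for _ in range(5000):
    s = random.choice(STATES)
    if random.random() < epsilon:
        a = random.choice(ACTIONS)                 # occasionally try new stuff
    else:
        a = max(ACTIONS, key=lambda x: q[(s, x)])  # otherwise act greedily
    # One-step update: nudge the estimate toward the observed reward.
    q[(s, a)] += alpha * (reward(s, a) - q[(s, a)])

# The learned policy mimics hand-written rules:
# food_nearby -> approach, vacuum_on -> hide, quiet -> rest
for s in STATES:
    print(s, "->", max(ACTIONS, key=lambda x: q[(s, x)]))
```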