Disconnectivity
We're addicted to building networks. What happens when AI starts to break them?
As a baby of the eighties, my life coincides with the history of the personal computer. My first experience of computing came early, when I removed and chewed the keys of my parents’ ZX Spectrum. New computers were childhood landmarks: the Amiga 1200 we got for Christmas, the 386 laptop my Dad brought home from work, the 486DX with its mysterious Turbo button, the bulbous Compaq with integrated speakers and Windows 95.
I was lucky to grow up around these machines. But one thing we didn’t have was Internet—not until the late 90s, when it finally got cheap. I spent countless hours poking around with no connection to the outside world. No web, no APIs, no updates, no email. If you had to transfer data you’d use floppies, and 1.44 MB was never very much. This was the world we lived in: little me, with my computer obsession, and my parents, who used them for work.
For a time, personal computers were disconnected, but now they are rarely offline. And not just our personal computers, but all of our devices—from phones and tablets to our watches, appliances, and toys. Industrial equipment, personal vehicles, and home thermostats are all plugged in to the global network, sending packets back and forth.
This isn’t new. The first computers, hulking behemoths that occupied entire rooms, were singular in nature. Pretty soon, though, these beasts sprouted terminals: small devices for input and output, designed to make programming easier. As terminals got more sophisticated they captured more responsibilities from the mainframes they connected to, becoming simple computers in their own right. These were the ancestors of the early PC.
From the first days until now, computer power has shifted back and forth from the center to the edge. Mainframes with terminals became servers with thin clients, until personal computers with beefy CPUs took over. Once the Internet was born, computation drifted to the cloud: today’s biggest workloads are generative AI, and are almost exclusively cloud-based.
This tug-of-war is ongoing, and will continue until the end of time. In my book AI at the Edge I describe how our newest technologies are pulling the rope away from the center, shifting more and more compute to the edges of the network. Closer to the world, where the real data lives.
But to speak of cloud and edge is to talk about connection: a network exists, with varying topology. The home connection that eluded me until 1998 is now an omnipresent web; it’s an underlying assumption we find hard to break. All our talk of cloud and edge is through the lens of connectivity, with one device reporting back to another.
In fact, there are two separate axes. The axis we just described spans from centralized to decentralized. At one end is the mainframe, and at the other is edge computing. But the second axis is distinct. At one of its poles we have full connectivity: the assumption that all devices are linked to a world wide web. And at the other, things are atomized—we’re in a disconnected world.
Connection is a compromise
Disconnectivity. What if devices are just not connected? The brief period between the death of the mainframe and the birth of the Internet—the prime years of my childhood—gave a glimpse of what this can be like. It comes with problems, risks, and inconvenience—but also benefits, security, and the enablement of applications that today seem out of reach.
Engineering can feel easier in a disconnected world. There are none of the latency or reliability challenges that result from outsourcing processing to a separate device, and the simplicity of standalone hardware pays dividends in cost and maintenance. Networking, an expensive, power-hungry[1] nightmare of complexity, ceases to be a requirement. Finally, but perhaps most significantly, standalone devices can safeguard the privacy of their users in a way that connected devices never can.
The advantages of disconnectivity are strongest for devices other than personal computers. Think household appliances, like washing machines and refrigerators. These big, expensive devices are infrequently replaced, making reliability a key feature. Connecting a fridge to the Internet opens a Pandora’s box of issues, including security vulnerabilities, bugs, and software obsolescence, and the benefits are rarely worth the risk. This is why large appliances have stayed dumb for so long.
In fact, some of the biggest challenges in product design are a direct result of connectivity. Take the recent example of Humane’s AI Pin. This intriguing device is designed to replace attention-hogging smartphones with a streamlined interface built around an AI voice assistant. Convenience, without distraction.
The AI Pin looks like a Star Trek communicator badge. It uses a microphone to hear speech and a camera to see the world. It can answer factual questions, translate multilingual conversations, and describe the nutritional content of foods that you show it. It also works as a hands-free phone.
When you speak to AI Pin, its microphones capture audio. This data is streamed to the cloud via cellular networks, converted into text by a transcription model, and fed into a large language model that generates a written response. The response, turned into speech by another large model, is streamed back to the device.
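For a sense of what that involves, here is a minimal sketch of a cloud-based voice assistant round trip. It is not Humane’s actual API; the service URL and routes are placeholders I’ve invented, but the shape of the exchange, three network hops per utterance, is the point.

```python
# A hypothetical sketch of a cloud-based voice assistant round trip.
# The service URL and routes are invented placeholders, not a real API.
import requests

CLOUD = "https://api.example-assistant.com"  # hypothetical cloud service

def handle_utterance(audio_bytes: bytes) -> bytes:
    # 1. Ship the captured audio to a cloud transcription model.
    text = requests.post(f"{CLOUD}/transcribe", data=audio_bytes).json()["text"]
    # 2. Pass the transcript to a large language model for a written reply.
    reply = requests.post(f"{CLOUD}/chat", json={"prompt": text}).json()["reply"]
    # 3. Have a speech synthesis model turn the reply into audio for playback.
    return requests.post(f"{CLOUD}/speak", json={"text": reply}).content
```

Every one of those steps crosses the cellular network, which is where the latency and battery drain described below come from.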
I love the concept of AI Pin, but its use of connectivity is a deal with the devil. The intense back-and-forth between device and cloud requires lots of power, so its onboard battery lasts just four hours. A wireless “battery booster”, connected with a magnet, takes the total to nine hours: still not enough for an entire day. The solution is a portable charging case; a third piece of hardware to carry around.
This cloud-based architecture imposes a $25 monthly fee, although this high cost also covers voice calls and texting. AI Pin also suffers from latency issues: it can be slow to reply, which—as I learned from my work on Google Assistant—is deeply frustrating for users. And when you leave cellular coverage, the Pin becomes a helpless ornament.
The truth of connectivity is that it’s often a compromise, forced onto developers by the constraints of their hardware platform. AI Pin would be much better if its core functionality ran locally, but the technology to do that is not yet available. Its use of the cloud is not a feature but a workaround, and it makes the product worse.
But models are getting smaller, embedded processors are getting faster, and soon enough the two will intersect. Embedded deep learning and generative AI are both extremely new technologies, and both fields are advancing so fast that it is hard to keep up. There’s a window of applications that require no compromise, and it’s widening every day.
Disconnecting from the cloud
What does it look like when an application leaves the cloud? There can be major benefits—and some notable drawbacks. In essence, on-device computing gives us simpler devices with more sophisticated capabilities.
A cloud-reliant system can be thought of as an iceberg: the visible tip, the device itself, is dwarfed in complexity by the enormous cloud and networking infrastructure that lies beneath. This machinery creates most of the costs, reliability issues, and application-related limitations.
AI will be the driver of disconnectivity. The development of faster processors and more efficient algorithms has enabled sophisticated decision-making to happen on-device. By distilling intelligence to a compact form, deep learning models allow applications to break connection with the cloud but still do meaningful work. And many new AI applications depend on disconnection to provide maximum utility.
The most immediate impact of on-device intelligence is a reduction in recurring costs. Imagine trading cloud infrastructure for a more capable local device. A faster processor costs more per unit, but that premium is far smaller than the expense of hosting—and staffing—the gigantic infrastructure required to make a cloud application work.
To sell a system that depends on the cloud, device manufacturers are forced into the subscription business model. All that hosting costs money, which is impossible to recoup from device sales alone. This is why so many modern gadgets come with a significant monthly fee. While subscription revenue can be useful for companies, it reduces the accessibility of products and limits their overall market.
I’m super excited about the commodification of AI. In the market for most hardware, buyers have a huge range of options. They can go for the big, expensive flagships—or for basic, affordable models. As the personal computer became popular, companies competed to lower the price, raising the accessibility of a once-niche device until it numbered in the billions.
Most consumer gadgets have been similarly commodified. In every category, there’s an endless array of cheap, generic devices that are priced as low as possible. Manufacturers are great at reducing costs—and freed from the tether of monthly subscriptions, hardware can be very cheap. It is made from rocks and oil.
Being very cheap has tremendous benefits. AI is a way to scale human insight. Once a single ML team has spent time and expertise to label a dataset and train a model, their work can be used by millions of people with no recurring cost. With inexpensive hardware, that work can be rolled out to the ends of the earth.
At one end of the spectrum, small application-specific models on commodity hardware can help monitor crop health or diagnose skin conditions. At the other end, an on-device foundation model might provide knowledge and assistance on a variety of topics—an oracle in a grain of sand. Widespread availability of these tools will lead to economic benefits across the globe.
Despite being cheap, disconnected devices can be highly durable. With one less interface to the outside world, and no dependence on infrastructure maintenance, a device can keep on ticking for a very long time. Without networking there’s no need for security patches, and it won’t stop working when its subscription service is shuttered. The device truly belongs to its owner: it isn’t rented from the business that supplies it.
With limited networking, a tiny sensor with a microcontroller capable of hosting deep learning models can run for years on a small, cheap battery. This allows designers to bring functionality into all sorts of places. Conservation researchers are building disconnected devices to log animal behavior in remote wilderness, using computer vision to observe key species with a thousand borrowed eyes.
But to me, one of the greatest wins from disconnection is the protection of privacy. Connected systems are simply unsuitable for many applications. I’ve spoken with dozens of industrial organizations who just can’t permit data to be vacuumed from their facilities and piped up to the cloud. They know they could benefit from AI—for example, for predictive maintenance—but the risk associated with letting data leave the building is just too high to take.
Many human-interaction applications are downright impossible with connected devices. We’ve all used public restrooms where the awful IR sensors make it hard to wash your hands (and there are few things worse than sitting on a hyperactive auto-flushing toilet). A computer vision system could do a much better job of recognizing when to switch the water on—but it’s absurd to install networked cameras in a bathroom. The same applies to children’s toys and smart-home sensors. Applications that share our personal space demand disconnectivity or nothing.
Despite all this excitement, there are certainly downsides to disconnected architectures. Data professionals are familiar with the idea of drift: the world tends to change over time, making models obsolete. This affects ChatGPT, whose knowledge of facts quickly becomes out of date, and it affects smaller models, too. Imagine a model designed to count passing cars, trained on vehicles available in 2023. Its accuracy might drop as new styles proliferate over the following years.
With no way to update their models, disconnected AI systems can suffer from a reduction in performance over time. This limits the lifespan of devices, and constrains their use to applications where a certain stability can be expected. There are techniques to detect drift, but there’s no way to avoid it—so the trade-off needs to be considered.
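To make that concrete, here is one simple, hedged sketch of drift detection: watch a statistic of the model’s outputs, such as its average confidence, and raise a flag when it falls well below what was seen during validation. The class, thresholds, and window size are invented for illustration; real systems might instead compare input feature distributions or apply formal statistical tests.

```python
from collections import deque

class ConfidenceDriftMonitor:
    """Flags possible drift when average model confidence drops.

    A deliberately simple heuristic for illustration, not a production design.
    """

    def __init__(self, baseline: float = 0.85, tolerance: float = 0.10,
                 window: int = 500):
        self.baseline = baseline    # mean confidence observed at validation time
        self.tolerance = tolerance  # how far the live mean may fall before we worry
        self.scores = deque(maxlen=window)

    def update(self, confidence: float) -> bool:
        """Record one prediction's confidence; return True if drift is suspected."""
        self.scores.append(confidence)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough observations yet
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline - self.tolerance
```

Detection doesn’t fix the problem, but it at least tells an owner when a device has aged out of its environment.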
Disconnection in the wild
As disconnected intelligence spreads, we’ll see it more and more in our built environments, workplaces, and homes. Disconnected devices will live alongside the connected devices we are familiar with, improving our experiences in many small ways.
In a sense, they are already here: 31 billion microcontrollers were shipped in 2021 alone, so you’re never more than a few feet from a tiny computer. With advances in edge AI, these computers have become host to machine learning models: little slices of intelligence, baked into physical matter.
We’ll never see a world with total disconnection. While connectivity is often a crutch, it still has major benefits: it helps us pool and store data from multiple sources, access compute beyond what is available on an individual device, and communicate with other human beings.
Many devices will straddle the border: they’ll do most of their thinking on-device, but report back to the cloud with results. This approach provides the best of both worlds: efficient use of resources, plus the ability to communicate with the systems around them. Private sub-networks of devices, like remote temperature sensors connected to a smart thermostat, receive the benefits of networking without the drawbacks of the cloud.
Occasional communication brings a new set of trade-offs, and there are networking technologies ready to fill the gap. Low-power radio, like LoRa, can send a few bytes every now and again—enough to report back summaries without burning through a battery. And platforms like Ditto enable asynchronous transfer of data via opportunistic mesh networks, for when connectivity is intermittent at best.
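As an illustration of how little such a device needs to send, here is a hedged sketch of a sensor that classifies everything locally and radios back only a six-byte tally once an hour. The `radio`, `classify`, and `read_sensor` arguments are hypothetical stand-ins for whatever LoRa driver, on-device model, and sensor the hardware actually provides.

```python
import struct
import time

def summarise_and_send(radio, classify, read_sensor, interval_s: int = 3600):
    """Run inference locally and transmit only a tiny periodic summary.

    `radio`, `classify`, and `read_sensor` are hypothetical callables standing
    in for the LoRa driver, the on-device model, and the sensor hardware.
    """
    counts = [0, 0, 0]                     # tallies for three example classes
    deadline = time.time() + interval_s
    while True:
        label = classify(read_sensor())    # all the heavy work happens on-device
        counts[label] += 1
        if time.time() >= deadline:
            # Three 16-bit counters: six bytes over the air, not a raw data stream.
            radio.send(struct.pack("<3H", *counts))
            counts = [0, 0, 0]
            deadline = time.time() + interval_s
```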
The economics of disconnected AI
The novel compromise of disconnectivity unlocks a million new applications. It will transform the way AI companies make money.
Disconnection destroys the justification for a subscription business model. Without subscriptions, the primary way for companies to sustain revenue is to sell more devices. This means it will be easier to build a business around cheaper devices that are replaced and upgraded frequently.
With sales so important, there’ll be massive pressure to produce commodity hardware that is cheap, fast, and can be easily integrated into other products. This is the same incentive driving the silicon industry today.
In the silicon ecosystem, IP vendors—like ARM—design general architectures for processor cores, then license them to other companies. These companies tailor the cores to suit specific applications, then manufacture the actual silicon—either in-house or through another set of companies. People building hardware products buy these chips.
It’s absurdly complex and expensive to design and make a chip. Only a handful of companies can do it all in-house. Apple only became big enough to design its own silicon relatively recently, and it may never manufacture chips itself. Few tasks in technology are as expensive as creating processors. The one that comes closest is training generative AI.
Silicon IP and AI foundation models have a great deal in common. They’re designed by small communities of expert engineers and are tremendously expensive to produce. They are developed in iterations, updated from time to time as new techniques become available. Their purpose is deployment at a global scale: the investment is recovered through an eye-watering volume of sales.
Silicon and models are designed in unison. Modern silicon is built to accelerate deep learning workloads, and deep learning algorithms are designed to suit commodity hardware. It is possible to design custom silicon to run specific model architectures extremely fast—and this will be necessary to deploy the largest models to the edge. At a certain point, the lines begin to blur: processor and model become indistinct.
As we hurtle towards commodity AI, model and hardware IP will become intermingled. AI companies will license models in the same way ARM licenses processor IP. Silicon designers will produce their own models, paired intimately with their processor cores. AI companies will produce their own processor IP, designed to run their models fast. It’s how they’ll create their competitive “moat”: the perfect combination of dataset, model, and hardware design.
This is already happening inside companies like Apple and Google, who now design their own processors based on the models they need to run. As the concept matures, it will move beyond the vertically integrated and become available to anyone designing hardware products.
You’ll purchase a chip with baked-in AI: it could be speech transcription, a conversational model, audio generation, or image transformation. You’ll invoke it from embedded code, like any other API. The model will be a black box, with well-documented behavior and parameters for customization. You’ll process the output, wrap it up, and use it in your application.
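As a sketch of what that might feel like to a developer: the `npu` module, model name, and parameters below are entirely hypothetical, imagining the kind of SDK a silicon vendor could ship alongside a chip with a transcription model baked in.

```python
# Hypothetical vendor SDK for a chip with an on-silicon transcription model.
# The `npu` module, its functions, and the model name are invented to show
# the shape of such an API; they do not describe any existing product.
import npu  # hypothetical SDK from the silicon vendor

# Load the black-box model baked into the chip, using documented parameters.
model = npu.load("speech-to-text-v2", language="en", beam_width=4)

def handle_transcript(text: str) -> None:
    print("heard:", text)                # application code: wrap the output as needed

def on_audio_frame(frame: bytes) -> None:
    result = model.process(frame)        # inference runs entirely on-device
    if result.is_final:
        handle_transcript(result.text)
```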
Like today’s silicon ecosystem, vendors will produce variants of chips that suit different use cases. Some will focus on entertainment and productivity, providing generative text, audio, and video. Others will be geared towards industry, enabling signal processing and computer vision.
You’ll always be able to run custom models on general-purpose hardware, which is getting faster and faster over time. This will be important for solving long-tail problems[2]. But the most demanding models, with the highest general performance, will run best on hardware with which they were co-designed.
There’ll be a major role for open source. I’m a big believer in open source AI, which, when paired with silicon, has a proven business model. Open source silicon IP, like RISC-V, is tremendously important in the industry: a genuine competitor to proprietary designs. The companies that support it derive major benefits, like huge flexibility and reduced costs.
While an open source foundation model can be trained and run by anyone, it will work best on cores that are designed alongside it. And open source models bring similar benefits to RISC-V: they are not black boxes, so they can be analyzed and customized by anyone who wants to use them. There’s a clear incentive for companies to sponsor open source.
As with silicon IP, there’ll be huge benefits for the first movers who build this ecosystem. Hardware moves slowly, and decisions can be sticky: it’s not easy for customers to move from one chip to another. The first companies to build a market will be locked in for the ride.
Disconnected future
Disconnection is coming: economics and pragmatism will basically force it. It simply isn’t possible to build the next generation of AI products while still relying on the cloud. You can see this in the efforts of Apple and Google, who are furiously developing chips to run proprietary models on-device.
There will be a sea change in the AI business model, as on-device compute makes subscriptions a tougher sell. The barriers to entry will fall, and AI will be embedded in everything—as cheap and pervasive as the microcontrollers we have today.
Connectivity will always be important, but as an auxiliary function: a way to tap into recent information, or share state across devices. It won’t always be responsible for core features, and devices will still be smart without it.
For a radical example of where this might go, think about the future of streaming video. Today’s TV is pre-recorded: written, filmed, and directed by gigantic teams. With photorealistic generative AI, we could create TV on the fly: custom shows that respond to your feedback and tell a personal story to every viewer. A new form of entertainment, likely possible within the next few years.
If we did this in the cloud, the costs would be absurd: you’d need enough AI compute resources to generate video for every subscriber. The more minutes people watch, the more cash it costs the company. This is exactly why cloud gaming has struggled to take off.
But if you can generate on the edge, those costs disappear. Local hardware does it all. New stories, characters, and concepts could be fetched across the network—but the real work will happen on-device. A customer pays a single price and receives entertainment for life.
The future will bring amazing ideas that we’ve barely begun to imagine—but we can see far enough ahead to guess the lay of the land. The huge demand for cloud inference is a temporary condition: peak GPU is on the way.
Embedded chips are getting faster, models are becoming more efficient, and new applications are on the horizon. Intelligence is moving to the edge, where the data and the users live.
[1] Networking hardware typically eats the lion’s share of any energy budget for connected devices. Radio transmission uses lots of energy, and Ethernet is highly inefficient.
[2] Foundation models provide general capabilities, and work well in domains where the rate of change is slow. Since the majority of real-world problems are specific and dynamic, you can usually get better performance from a model trained for a particular task. For example, a computer vision model trained on a specific production line will outperform a general model when applied to that line.
A very good piece, and I broadly agree. But I'd just add a couple of thoughts:
1. "This means it will be easier to build a business around cheaper devices that are replaced and upgraded frequently." - Precisely. This is already a huge problem, and I fear it'll keep getting worse for at least a few years. "Oh I'll just throw it away and buy a replacement".... and meanwhile global temperatures and piles of waste keep growing.
2. You state that the cloud+subscription model is a bandaid - and for many cases you're certainly correct. But there are also cases where it's a business choice to enable the vendor to keep milking the customer. Is there any technical reason why e.g. a smart lightbulb stops functioning without cloud connection, especially bearing in mind that approximately 100% (my guesstimate) of the people who buy them most definitely have a desktop or laptop, wifi router, AND smartphone? No. But if you make the smart device require a cloud connection, you can force customers to pay a subscription and/or make them throw away the device and buy a new one (which is effectively the same, just hidden). In German we have a saying, "Was lange hält bringt kein Geld", rough translation "What remains usable for a long time doesn't turn a profit". The economics term for this is "planned obsolescence".
I think those two issues will be, or to be precise 'continue to be', a big conflict area between customers and vendors. Let's hope regulators manage to find a balance that leads to positive results - not choking off innovation, but also not leaving customers at the mercy of rent-seeking by a tiny number of companies, in more than a few cases duopolies (e.g. desktop/laptop CPUs+GPUs) or even near-monopolies (e.g. desktop/laptop OSs and smartphone CPU design [though not the actual chips]).
But to finish up, I'm glad you pointed out open source and its cousin open hardware, which could conceivably limit (though not by themselves resolve) these issues.
I wonder if we’ll be able to choose disconnected devices in the near future. What if you want to opt out, for a dumb toaster or a dumb blender?
I got an air fryer that has an app and Bluetooth. It is nice to look up recipes. And it could be nice to get notifications that your food is done. But I am in a small apartment. I am not more than 15 feet away at any point. I can hear it.