Artificial intelligence has promised much, but one thing has kept it from being used successfully by billions of people: the frustrating struggle of humans and machines to understand one another in natural language.
This is now changing, thanks to the arrival of large language models powered by transformer architectures, one of the most important AI breakthroughs in the past 20 years.
Transformers are neural networks designed to model sequential data and predict what should come next in a series. Core to their success is the idea of “attention,” which allows the transformer to weigh every part of an input and “attend” most strongly to its most salient features, rather than treating everything as equally important.
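To make the idea of attention concrete, here is a minimal sketch in plain Python of scaled dot-product attention, the form most commonly used in transformers. It is an illustration only, not how any production model is implemented: the function names (`softmax`, `attend`) and the toy two-dimensional vectors are assumptions made up for this example, and real models learn their query, key, and value vectors during training.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Each key is scored against the query, the scores become weights,
    and the output is the weighted average of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy input: three positions with invented 2-d key and value vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 0.0]

weights, output = attend(query, keys, values)
print("attention weights:", weights)
print("blended output:", output)
```

In this toy case the second position, whose key has nothing in common with the query, receives the smallest weight. Every position still contributes something, which is the sense in which attention emphasizes the salient parts of an input without discarding the rest.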
These new models have delivered significant improvements to natural-language applications such as translation, summarization, information retrieval, and, most important, text generation. In the past, each required its own bespoke architecture. Now transformers are delivering state-of-the-art results across the board.
Although Google pioneered the transformer architecture, OpenAI became the first to demonstrate its power at scale, in 2020, with the launch of GPT-3 (Generative Pre-Trained Transformer 3). At the time, it was the largest language model ever created.
GPT-3’s ability to produce humanlike text generated a wave of excitement. It was only the start. Large language models are now improving at a truly impressive rate.
“Parameter count” is generally accepted as a rough proxy for a model’s capabilities. So far, we’ve seen models perform better on a wide range of tasks as the parameter count scales up. Models have been growing by almost an order of magnitude every year for the past five years, so it’s no surprise that the results have been impressive. However, these very large models are expensive to serve in production.
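A rough back-of-the-envelope sketch shows why this matters. The snippet below compounds tenfold yearly growth over five years and estimates the memory needed just to hold a model's weights. Both helper names are hypothetical, the two-bytes-per-parameter figure assumes 16-bit weights, and the 175 billion parameter count for GPT-3 is the publicly reported figure rather than something stated in this article. Actual serving costs depend on much more than weight storage.

```python
def compound_growth(factor_per_year, years):
    """Total growth after compounding a yearly factor for a number of years."""
    return factor_per_year ** years


def weight_memory_gb(parameter_count, bytes_per_parameter=2):
    """Approximate memory to store the weights alone, in gigabytes.

    Assumes 16-bit (2-byte) weights by default; ignores activations,
    caches, and any other serving overhead.
    """
    return parameter_count * bytes_per_parameter / 1e9


# A full order of magnitude per year for five years (an upper bound on
# "almost an order of magnitude").
growth = compound_growth(10, 5)

# GPT-3's publicly reported size: 175 billion parameters.
gpt3_gb = weight_memory_gb(175e9)

print(f"Five years of 10x growth: {growth:,}x")
print(f"GPT-3 weights at 16-bit precision: about {gpt3_gb:.0f} GB")
```

Under these assumptions, the weights of a GPT-3-sized model alone exceed the memory of any single commodity accelerator, which is one reason very large models are costly to run.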
What’s really remarkable is that, in the past year, capable models have also been getting smaller and dramatically more efficient. We’re now seeing impressive performance from small models that are a lot cheaper to run. Many are being open-sourced, further reducing barriers to experimenting with and deploying these new AI models. This, of course, means they’ll become more widely integrated into apps and services that you’ll use every day.
They will increasingly be able to generate very high-quality text, images, audio, and video content. This new wave of AI will redefine what computers can do for their users, unleashing a torrent of advanced capabilities into existing and radically new products.
The area I’m most excited about is language. Throughout the history of computing, humans have had to painstakingly input their thoughts using interfaces designed for technology, not humans. With this wave of breakthroughs, in 2023 we will start chatting with machines in our language—instantly and comprehensively. Eventually, we will have truly fluent, conversational interactions with all our devices. This promises to fundamentally redefine human-machine interaction.
Over the past several decades, we have rightly focused on teaching people how to code—in effect teaching the language of computers. That will remain important. But in 2023, we will start to flip that script, and computers will speak our language. That will massively broaden access to tools for creativity, learning, and playing.
As AI finally emerges into an age of utility, the opportunities for new, AI-first products are immense. Soon, we will live in a world where, regardless of your programming abilities, the main limitations are simply curiosity and imagination.