When ChatGPT crash-landed onto our computers in November 2022, you’d have been forgiven for thinking this massive leap in artificial intelligence had sprung out of nowhere. From one day to the next, tired online chatbots sending constant links to ‘Frequently Asked Questions’ pages were replaced by an AI model which could write job applications, plan a holiday, and even comfort you after a break-up (try it!).
But how did we cross this technological gap, and was the leap really as sudden as it seemed? Let’s take a peek behind the scientific scenes of the last 10 years, to explore the milestones in AI research that got us here.
Milestone 1: The invention of GANs (2014)
Generative Adversarial Networks (GANs) were the key technological innovation which would eventually enable realistic-looking AI-generated images and videos. GANs use neural networks, a popular type of machine learning model loosely inspired by the way neurons in the brain work. A GAN contains two neural networks: a generator and an adversary (usually called the discriminator). The generator creates examples that could easily be mistaken for real data – like a realistic-looking human face. The adversary then tries to work out which examples are real and which are fake. This process repeats many times, with the adversary getting better at spotting fakes and the generator gradually learning to make more realistic examples.
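For the curious, the back-and-forth between the two networks can be sketched in a few lines of Python. This is a toy illustration only, not how a real image-generating GAN is built: the "real data" is just numbers clustered around 4, the generator and adversary are each a single straight-line formula rather than a deep neural network, and every name and number here (train_toy_gan, the learning rate, the target of 4) is an assumption made up for the example.

```python
import math
import random


def sigmoid(t):
    # Clamp the input so math.exp cannot overflow.
    t = max(-60.0, min(60.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def discriminator_step(w, c, x_real, x_fake, lr):
    """One gradient-ascent step on log D(real) + log(1 - D(fake))."""
    p_real = sigmoid(w * x_real + c)  # adversary's belief that the real sample is real
    p_fake = sigmoid(w * x_fake + c)  # adversary's belief that the fake sample is real
    w += lr * ((1.0 - p_real) * x_real - p_fake * x_fake)
    c += lr * ((1.0 - p_real) - p_fake)
    return w, c


def generator_step(a, b, w, c, z, lr):
    """One gradient-ascent step on log D(fake): try to fool the adversary."""
    x_fake = a * z + b
    p_fake = sigmoid(w * x_fake + c)
    grad_x = (1.0 - p_fake) * w  # derivative of log sigmoid(w*x + c) with respect to x
    a += lr * grad_x * z
    b += lr * grad_x
    return a, b


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * noise + b
    w, c = 0.0, 0.0  # adversary: P(real) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # "real data" (assumed for this demo)
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        w, c = discriminator_step(w, c, x_real, a * z + b, lr)
        a, b = generator_step(a, b, w, c, z, lr)
    return a, b, w, c
```

Calling train_toy_gan() returns the final settings of both players. In a run like this the generator's offset b is pushed away from 0 and towards the real data, although simple GANs are known to wobble around the target rather than settle neatly on it.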
Milestone 2: Sophia the robot is activated (2016)
Sophia the robot, who in 2017 became the first robot to be granted citizenship of a country (Saudi Arabia), was activated in February 2016 (and subsequently brought to life by Gigi Goode on Drag Race in 2020). Sophia, a humanoid robot developed by Hanson Robotics, is said to combine cutting-edge technology in robotics and AI in a way that “personifies our dreams for the future of AI.” To create a human-like impression, Sophia combines vision algorithms that process visual inputs from its camera eyes with speech algorithms that use natural language processing to understand and produce speech.
Milestone 3: The AI audio generator WaveNet is launched (2016)
Creating natural-sounding computer-generated voices had long been a challenge for computer scientists. Most previous attempts relied on cutting up raw voice recordings and mashing them back together, a laborious process that produced robotic and artificial-sounding results. In 2016, this all changed when Google DeepMind launched WaveNet. WaveNet is an AI voice generator built on a generative neural network. Like the generator in a GAN, it is trained on an example dataset and can then produce new, similar-sounding examples which weren’t part of the training data. Rather than stitching recordings together, it builds up the sound wave one tiny sample at a time. AI-generated voices now have a range of good, bad and ugly real-world applications, from helping those with neurological diseases regain a voice, to being used by criminals to simulate family members in money-extracting scams.
Milestone 4: AlphaGo beats the world’s Go champion (2016)
The ancient board game Go was developed in China around 3,000 years ago, and is so ridiculously complex that the number of possible board configurations is a googol times greater than in chess. For reference: a googol is 1 followed by 100 zeros, a number greater than the number of atoms in the observable universe. And yes, it’s also the root of the differently-spelled name of your favourite internet search engine. Developing computer programs that can beat humans at logical games, a benchmark for increasingly capable algorithms, had been a goal for AI researchers since a computer first mastered noughts and crosses in 1952. Go remained out of reach until 2016, when DeepMind’s AlphaGo beat the human world champion Lee Sedol.
Milestone 5: The birth of deepfakes (2017)
The term ‘deepfakes’ was coined when the Reddit user ‘deepfakes’ began posting hyper-realistic AI videos online – mostly pornographic videos in which celebrities’ faces were superimposed onto actresses without their consent. Computer-generated special effects were nothing new, as anyone else whose childhood was haunted by Harry Potter’s dementors will attest. But deepfake videos, with a leg up from GAN technology, allow anyone to easily produce convincingly real videos, and they’re only getting better.
Milestone 6: The first ‘Transformer’ lays the technological foundation for large language models (LLMs) (2017)
Scientists at Google had been working on a new way to program Google Translate. In 2017, the scientists published the seminal paper ‘Attention is all you need’, in which they introduced the first Transformer, providing a step-change to machine translation. Instead of individually translating each word, Transformers read whole sentences at once, capturing the dependencies between words and extracting meaning based on the context. The way Transformers extract and generate meaning from patterns would become central to the technology used in subsequent AI breakthroughs like AlphaFold and large language models.
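To give a flavour of how a Transformer weighs up the context around each word, here is a minimal Python sketch of the ‘attention’ calculation the paper is named after. It is a simplification under stated assumptions: each word is represented by a short made-up list of numbers, and the learned weight matrices that real Transformers apply to create separate ‘query’, ‘key’ and ‘value’ vectors are left out, so the function name self_attention and the example numbers are illustrative only.

```python
import math


def softmax(scores):
    # Turn raw scores into positive weights that add up to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    d = len(vectors[0])
    weights = []
    for q in vectors:
        # Score how strongly this word relates to every word in the sentence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in vectors
        ]
        weights.append(softmax(scores))
    # Each output is a weighted blend of all the word vectors in the sentence.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, vectors)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical two-number "meanings" for the three words of "the cat sat".
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(sentence)
```

Each row of weights says how much attention one word pays to every word in the sentence, including itself, and each output vector is the resulting blend. Because every word is compared with every other word in one go, the whole sentence is processed at once rather than word by word.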
Milestone 7: OpenAI releases GPT-1 (2018)
The first Generative Pre-trained Transformer was released by OpenAI in 2018. Employing a Transformer architecture, the large language model GPT-1 was able to answer questions and generate blocks of text. It gained these abilities after being trained on BookCorpus, a large dataset containing the text of thousands of unpublished books. (The often-quoted dataset of around 8 million web pages came later, and was used to train its successor, GPT-2.) Although this language processor was fluent and accurate on an unprecedented scale, it was unable to coherently generate longer blocks of text and was prone to repetition.
Milestone 8: AlphaFold wins protein folding contest (2020)
After the massive advances for AI in winning games like Go, it was time for a task with real-world implications. Predicting protein folding had been the holy grail of biology for 50 years, since these molecular structures determine most biological processes. However, characterising specific protein structures traditionally required years of excruciating laboratory tests. In 2020, Google DeepMind’s AI algorithm AlphaFold, after being trained on a public database of around 170,000 protein structures, reached an accuracy comparable to the lab work at predicting protein structure. AlphaFold has revolutionised biological research and is already contributing to novel drug design. And at its algorithmic heart is – you guessed it – a Transformer.
Milestone 9: Generative AI goes mainstream (2022)
OpenAI introduced ChatGPT to the public in 2022, launching a free preview of GPT-3.5. Just one week after its release, the chatbot interface had surpassed one million users – soon becoming the fastest-growing consumer application in history. This was welcome news to OpenAI, who were using the users’ data to improve their product. But not everyone was so enchanted by the new large language model. It was blocked in several countries, including China, Iran and Italy. And a legal case was brought against OpenAI with concerns about the use of unauthorised data from artists and writers to train their original model. ChatGPT was not the only generative AI to take public interest by storm. In the same year, DALL-E 2, a text-to-image generative AI model, and GitHub Copilot, a code writing assistant, among others, were released to similarly sceptical but enthusiastic receptions.
Milestone 10: The release of GPT-4 (2023)
The newest GPT to date was launched in March 2023. Although GPT-4 retained many of its predecessors’ flaws, it had some key advancements – such as the ability to take in images, rather than just text, as input prompts, as well as to access the internet in real time. Although OpenAI were forerunners in releasing these models to the public, they are not the only company developing such models: others like DeepMind and Hugging Face are also vying to get ahead in the race to super-human artificial intelligence.
Overall, AI breakthroughs weren’t quite as sudden as they might have seemed in the news. But looking back on the past decade, the rate of progress is still pretty breath-taking. Will this be one of the last human-written Royal Institution blogs, before AI works out how to give you the content you want without me having to sit at my computer all day? Keep your eye on our website to find out.