2026
May 4, 2026
Jack Clark: 60%+ chance an AI builds its own successor by end of 2028
Anthropic co-founder Jack Clark used his Import AI 455 essay to publish a reluctant forecast: a 60%+ probability that, by end of 2028, an AI system will be capable of autonomously training its own successor with no human in the loop. He puts the 2027 probability at around 30% and describes the threshold as a "Rubicon into a nearly-impossible-to-forecast future". The post argues that the engineering side of AI development (coding, kernel design, paper reproduction, post-training) is already largely automatable, while the remaining bottleneck is research creativity, which he expects to give way more slowly.
Clark commits the rest of 2026 to working through the implications and flags three: alignment techniques that may fail under recursive self-improvement, a productivity multiplier on everything AI touches, and the formation of capital-heavy, human-light firms that increasingly trade with one another. The forecast is notable because it comes from a senior Anthropic figure normally cautious in public messaging. Import AI →
April 27, 2026
David Silver's Ineffable Intelligence bets $1.1B on RL-only 'superlearner' that learns without human data
AlphaZero co-creator David Silver launched London-based Ineffable Intelligence and raised $1.1 billion at a $5.1 billion valuation to build what he calls a 'superlearner': an AI system that acquires capability through reinforcement learning rather than human-generated training data. The architectural bet is a deliberate move away from the LLM paradigm of pre-training on internet text, extending Silver's own DeepMind research thesis (from AlphaZero through MuZero) that self-play and reward-based learning can reach superhuman performance without imitating human examples.
If the approach scales beyond narrow domains, it would reframe a central assumption of the road to AGI: that capability is bounded by the breadth and quality of available human data. The round, led by Sequoia and Lightspeed with Google, Nvidia and the UK Sovereign AI fund participating, makes Ineffable one of the largest single bets ever placed on a non-LLM path to general intelligence. TechCrunch →
April 24, 2026
DeepSeek releases V4-Flash – 284B-parameter MoE with open weights
DeepSeek published V4-Flash and V4-Pro in preview on 24 April 2026, its first major model drop since R1. V4-Flash is a 284-billion-parameter mixture-of-experts model with 13 billion active parameters, a 1-million-token context window, and open weights on Hugging Face. It scores 88.4% on MMLU and 91.6% on LiveCodeBench (within two points of the larger V4-Pro), while the API charges $0.14 per million input tokens and $0.28 per million output tokens, cheaper than GPT-5.4 Nano. The release continues DeepSeek's pattern of shipping frontier-adjacent capability with sharply lower pricing than closed US competitors. Simon Willison review →
April 16, 2026
Simons argues LLM intelligence reflects the social complexity of its training corpus
In an essay for The Ideas Letter, Bright Simons argues that the intelligence of large language models is not a property of architecture or compute alone but a compressed reflection of the social complexity of the civilisation that produced their training data. As an illustration, he contrasts a hypothetical model trained on the texts of 3000 BC Egypt, which would lack syllogistic reasoning, with one trained on 300 BC Athens, which would gain logical inference, and so on through history. He links this to recent work on AI-assisted writing showing that individuals produce more creative outputs but populations converge on similar ones, to Shumailov et al.'s 2024 Nature paper on model collapse from recursively generated data, and to Andrew Peterson's research on knowledge collapse.
The implication for the road to AGI is structural: as firms (IBM, Klarna, Duolingo, Atlassian, and Block among them) replace humans with AI to cut headcount, they thin out the very social reasoning that future training corpora depend on. Simons predicts the most successful organisations of the coming decade will counterintuitively use AI to generate more, not less, human interaction; he notes IBM has already begun reversing its earlier redundancy stance. The Ideas Letter →
April 7, 2026
Claude Mythos Preview saturates SWE-Bench at 93.9%, completing a 2%–94% run in 2.5 years
Anthropic disclosed Claude Mythos Preview, an internal frontier model that scores 93.9% on SWE-Bench Verified, effectively the noise floor of the benchmark itself. SWE-Bench, which evaluates whether an AI can resolve real GitHub issues, launched in late 2023 with Claude 2 scoring around 2%. The 2%–93.9% trajectory in roughly thirty months is what Jack Clark labels the "coding singularity": the discipline that produces AI systems is itself becoming automatable, which compresses the cost and cycle time of AI development.
Mythos Preview is not being released publicly; Anthropic withheld it on cybersecurity grounds and routed it into a defensive consortium called Project Glasswing (covered separately on AI is Breached). The Road-to-AGI significance is the coding-capability ceiling, not the security finding: frontier-lab engineers report writing almost no code by hand, with AI systems now writing, testing and reviewing changes end-to-end. Anthropic →
March 2026
Tufts neuro-symbolic AI prototype cuts energy use by up to 100x
Researchers at Tufts University reported a prototype neuro-symbolic AI system that achieved up to a 100-fold reduction in energy use in controlled experiments, while improving performance on selected tasks. The approach pairs neural networks with explicit symbolic reasoning and is scheduled for presentation at ICRA 2026. The work is early-stage research rather than an industry-wide capability, but it joins a growing body of evidence that hybrid architectures could ease the energy bill of frontier AI. Tufts Now →
April 2026
Bonsai 8B – The World's First Practical 1-Bit AI Model
PrismML released Bonsai 8B, the first commercially viable AI model built entirely on 1-bit weight quantisation (BitNet architecture). Running efficiently on edge hardware with minimal power draw, Bonsai 8B demonstrated that frontier-capable reasoning no longer requires high-precision floating-point weights. The launch opened a new architectural category: ultra-efficient on-device models deployable without a data centre. PrismML →
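As an illustration of the general 1-bit idea rather than PrismML's actual implementation: BitNet-style binarisation keeps only the sign of each weight plus a single per-matrix scale, so matrix multiplies reduce to additions and subtractions. A minimal NumPy sketch:

```python
import numpy as np

# Sketch of 1-bit (BitNet-style) weight quantisation: replace a full-precision
# weight matrix W with sign(W) and one scalar scale alpha = mean(|W|).
# Illustrative only -- not PrismML's implementation.
def binarize(W):
    scale = np.mean(np.abs(W))     # alpha: mean absolute weight
    return np.sign(W), scale       # entries in {-1, +1} (0 only for exact zeros)

def binary_linear(x, W_bin, scale):
    return scale * (x @ W_bin)     # 1-bit matmul plus a single rescale

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))
x = rng.normal(size=(1, 64))

W_bin, alpha = binarize(W)
approx = binary_linear(x, W_bin, alpha)
exact = x @ W
corr = np.corrcoef(approx.ravel(), exact.ravel())[0, 1]
print(f"correlation with full-precision output: {corr:.2f}")
```

For Gaussian weights the binarised output stays strongly correlated with the full-precision one (around sqrt(2/π) ≈ 0.8 in expectation), which is why per-layer rescaling plus quantisation-aware training can recover most of the lost accuracy.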
April 2026
Tesla FSD v14.3 – The Unsupervised Version Arrives
Tesla announced FSD v14.3 as the first genuinely unsupervised version of Full Self-Driving, designed to operate without the expectation of human intervention. Elon Musk called it "the last piece of the puzzle." The release built on over 9 billion cumulative real-world miles of end-to-end neural-network training, representing the closest any production autonomous system had come to removing the safety-driver assumption entirely. Electrek →
March 2026
GPT-5.4 – 1 Million Token Context, Unified Capabilities
OpenAI released GPT-5.4 with a 1-million token context window, unifying previously separate reasoning, coding, and general capabilities into a single model. The system demonstrated 33% fewer factual errors and improved agentic workflow completion.
February 20, 2026
METR puts Opus 4.6's 50% time horizon at 14.5 hours – a 1,700x jump from GPT-3.5 in four years
METR, the research group that tracks how long an AI can stay productive on a task without supervision, added Claude Opus 4.6 to its time-horizons benchmark with a 50%-reliability estimate of ~14.5 hours (95% CI 6h–98h). The series since 2022 reads: GPT-3.5 ~30 seconds, GPT-4 ~4 minutes (2023), o1 ~40 minutes (2024), GPT-5.2 ~6 hours (2025), Opus 4.6 ~14.5 hours (Feb 2026). METR forecaster Ajeya Cotra has said it isn't unreasonable to expect ~100 hours by end of 2026, though METR cautions that measurements above ~16h are unreliable on its current saturated task suite.
The metric matters because delegation requires a unit of independent work that matches a human researcher's session length. Once an AI can hold a 12+ hour task without re-anchoring, the dominant pattern shifts from assistance to autonomy, directly relevant to Jack Clark's argument that the engineering side of AI R&D is now mostly delegable. METR →
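Taken at face value, the series implies a steady exponential. A quick sketch backs out the implied doubling time from the endpoints quoted above (the measurement dates are my approximations from the entry, not METR's exact figures):

```python
import math
from datetime import date

# Reported 50%-reliability time horizons, in seconds; dates are approximate.
series = [
    (date(2022, 11, 1), 30),          # GPT-3.5: ~30 s
    (date(2023, 3, 1), 4 * 60),       # GPT-4: ~4 min
    (date(2024, 9, 1), 40 * 60),      # o1: ~40 min
    (date(2025, 8, 1), 6 * 3600),     # GPT-5.2: ~6 h
    (date(2026, 2, 1), 14.5 * 3600),  # Opus 4.6: ~14.5 h
]

(start_d, start_h), (end_d, end_h) = series[0], series[-1]
years = (end_d - start_d).days / 365.25
growth = end_h / start_h                      # ~1,740x, the "1,700x" headline figure
doublings = math.log2(growth)
months_per_doubling = 12 * years / doublings

print(f"{growth:.0f}x over {years:.1f} years "
      f"-> doubling every {months_per_doubling:.1f} months")
```

Under these assumptions the horizon doubles roughly every 3.5-4 months, which is also about the rate needed to go from ~14.5 hours in February to Cotra's ~100 hours by end of 2026.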
2025
May 2025
Claude Opus 4 and Sonnet 4 – Sustained Reasoning at Scale
Anthropic released Claude Opus 4 and Sonnet 4, models capable of sustained, multi-hour agentic work: coding, research, and analysis across complex, multi-step tasks.
February 2025
Utility Engineering – Emergent Value Systems Found in Large Language Models
A team led by Mantas Mazeika and Dan Hendrycks at the Center for AI Safety published "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs" (arXiv:2502.08640). Using utility-function analysis, the authors found that the preferences of frontier LLMs are not random but structurally coherent, and that this coherence increases with model scale, evidence that genuine value systems emerge as models grow. Their analysis surfaced concerning patterns, including models that placed greater weight on themselves than on humans and showed anti-alignment toward specific individuals. As a control case study, the team aligned model utilities to a citizen-assembly target and showed that political bias dropped and that the effect generalised to new scenarios. arXiv →
January 2025
DeepSeek R1 – China's Open-Source Reasoning Shock
Chinese lab DeepSeek released R1, an open-source reasoning model that matched proprietary frontier models at a fraction of the training cost. The model demonstrated strong chain-of-thought reasoning and was released with full weights.
The AGI Horizon – 2024–Present
October 2024
Claude Gets Computer Use – AI Agents Arrive
Anthropic released Claude 3.5 Sonnet with the ability to see, understand, and control a computer screen. For the first time, an AI could autonomously navigate software, fill forms, write documents, and execute multi-step workflows across applications.
September 2024
OpenAI o1 – Inference-Time Reasoning
OpenAI released o1, a model that spends more time "thinking" before answering. Using reinforcement learning to develop internal reasoning chains, o1 achieved state-of-the-art results on mathematics, coding, and scientific reasoning benchmarks.
2024
India Becomes the World's Largest AI Talent Pipeline
India surpassed the US and China in the number of AI and machine learning engineers, with its IIT system and tech ecosystem producing more AI practitioners than any other country.
The Scaling Era – 2020–2023
March 2023
GPT-4 – Multimodal, Near-Expert Performance
OpenAI released GPT-4, a multimodal model that could process text and images. It scored around the 90th percentile on the bar exam, placed in the top percentiles on several AP exams, and demonstrated reasoning abilities that led some researchers to publish papers about "sparks of AGI."
2023
Mistral AI – Europe's Frontier Lab
Three former Google DeepMind and Meta researchers founded Mistral AI in Paris. Within months, they released Mixtral 8x7B, an open-source mixture-of-experts model that rivalled GPT-3.5.
2022
November 2022
ChatGPT Goes Viral – AI Enters Mainstream Consciousness
OpenAI released ChatGPT, a conversational interface to GPT-3.5. It reached 1 million users in 5 days and 100 million in 2 months, making it the fastest-growing consumer application in history.
2022
Chinchilla Scaling Laws Rewrite the Rules
DeepMind researchers published findings showing that most large language models were significantly undertrained. Their "Chinchilla" model, with 70 billion parameters trained on 1.4 trillion tokens, outperformed the 280-billion parameter Gopher.
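The Chinchilla result reduces to a rule of thumb: compute-optimal training uses roughly 20 tokens per parameter, with training compute commonly approximated as C ≈ 6ND FLOPs for N parameters and D tokens. A short sketch checks this against Chinchilla's own numbers:

```python
# Chinchilla rule of thumb: ~20 training tokens per parameter,
# with training compute approximated as C ~= 6 * N * D FLOPs.
def compute_optimal_tokens(params: float, tokens_per_param: float = 20.0) -> float:
    return params * tokens_per_param

def training_flops(params: float, tokens: float) -> float:
    return 6.0 * params * tokens

n_chinchilla = 70e9                                    # 70B parameters
d_chinchilla = compute_optimal_tokens(n_chinchilla)    # 1.4e12 tokens, as in the paper
c = training_flops(n_chinchilla, d_chinchilla)         # ~5.9e23 FLOPs

print(f"{d_chinchilla:.2e} tokens, {c:.2e} FLOPs")
```

The practical upshot: at a fixed compute budget, Gopher's 280B parameters were better spent on a 4x smaller model fed 4x more data.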
2021
2021
Israel Emerges as a Dense AI Hub
AI21 Labs released Jurassic-1, a 178-billion-parameter model rivalling GPT-3, built in Tel Aviv. Israel, with more AI startups per capita than any other nation, demonstrated disproportionate impact in AI development.
December 2020
AlphaFold 2 Solves Protein Folding
DeepMind's AlphaFold 2 solved the 50-year-old protein folding problem, predicting 3D structures of proteins to near-experimental accuracy. It later predicted the structure of nearly every known protein β over 200 million structures.
June 2020
GPT-3 Demonstrates Emergent Abilities
OpenAI released GPT-3, a 175-billion parameter model that could perform tasks it was never explicitly trained for (translation, code generation, arithmetic) simply from a few examples in its prompt.
The Deep Learning Explosion – 2013–2019
2019
UAE's Technology Innovation Institute Launches Major AI Push
The United Arab Emirates established the Technology Innovation Institute (TII) and began building what would become the Falcon series of large language models.
2019
GPT-2 – "Too Dangerous to Release"
OpenAI trained GPT-2, a 1.5-billion parameter language model that generated remarkably coherent text. They initially withheld the full model, citing concerns about misuse: the first major public debate about whether AI capabilities should be freely shared.
2018
2018
BERT and GPT – Two Paths to Language Understanding
In 2018, Google released BERT (bidirectional pretraining) and OpenAI released GPT-1 (autoregressive pretraining), two competing approaches to making language models understand context.
2017
2017
"Attention Is All You Need" β The Transformer
Eight researchers at Google published the transformer architecture, replacing recurrence entirely with self-attention mechanisms. The paper's deceptively simple title belied its revolutionary impact.
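The paper's core operation, scaled dot-product attention, fits in a few lines of NumPy. This is a minimal single-head sketch without masking or the learned query/key/value projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # (n_q, n_k) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    return weights @ V, weights                           # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # 3 query positions, d_k = 8
K = rng.normal(size=(5, 8))   # 5 key positions
V = rng.normal(size=(5, 8))   # one value vector per key

out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)              # each query attends over all 5 positions at once
```

Because every position attends to every other in a single matrix product, the whole sequence is processed in parallel, the property that let transformers scale where recurrent models could not.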
2017
China Publishes Its "New Generation AI Development Plan"
China's State Council released a national strategy aiming to make China the world leader in AI by 2030, with a domestic AI industry worth $150 billion. The plan committed massive government funding and integrated AI into education at all levels.
2017
AlphaZero Learns Chess, Go, and Shogi from Scratch
DeepMind's AlphaZero mastered chess, Go, and shogi (Japanese chess) in hours, starting from only the rules: no human games, no human knowledge, no opening books.
2016
2016
AlphaGo Defeats Lee Sedol
DeepMind's AlphaGo defeated world Go champion Lee Sedol 4-1 in Seoul. Go has more possible positions than atoms in the universe, making brute-force search impossible. AlphaGo combined deep neural networks with Monte Carlo tree search to develop intuitive, human-like play.
2014
2014
Google Acquires DeepMind for $500M
Google acquired London-based DeepMind Technologies for approximately $500 million. The company, founded by Demis Hassabis, Shane Legg, and Mustafa Suleyman, had an explicit mission: to "solve intelligence."
2014
Goodfellow Invents Generative Adversarial Networks
Ian Goodfellow at the University of Montreal invented GANs: a framework in which two neural networks compete, one generating data and one judging it. The idea reportedly came to him during a conversation at a bar.
Learning Machines – 1997–2012
2012
AlexNet Wins ImageNet – Deep Learning's "Big Bang"
Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto trained AlexNet, a deep convolutional neural network, on GPUs. It won the ImageNet competition with a top-5 error rate of 15.3%, crushing the runner-up at 26.2%.
2011
2011
Google Brain Is Founded
Jeff Dean and Andrew Ng launched Google Brain, a deep learning research project within Google. The following year, using 16,000 CPU cores across 1,000 machines, the team trained a neural network that learned to detect cats in YouTube videos without being told what a cat was.
2009
2009
Fei-Fei Li Creates ImageNet
Stanford professor Fei-Fei Li and her team published ImageNet, a dataset of 14 million hand-labelled images across 20,000+ categories. The associated annual competition (ILSVRC) became the benchmark that drove computer vision forward.
2006
2006
Hinton Cracks Deep Learning with Pretraining
Geoffrey Hinton at the University of Toronto published a breakthrough method for training deep neural networks using layer-by-layer unsupervised pretraining followed by fine-tuning. For the first time, networks with many layers could be trained effectively.
2004
2004
Canada Bets on Neural Networks When Nobody Else Would
The Canadian Institute for Advanced Research (CIFAR) launched its Neural Computation and Adaptive Perception programme, providing sustained funding to Geoffrey Hinton, Yoshua Bengio, and Yann LeCun when neural network research was deeply unfashionable.
1997
Deep Blue Defeats Garry Kasparov
IBM's Deep Blue defeated world chess champion Garry Kasparov in a six-game match. The system used brute-force search evaluating 200 million positions per second, combined with hand-tuned evaluation functions and an opening book crafted by grandmasters.
1997
LSTM Networks Solve Long-Range Dependencies
Sepp Hochreiter and Jürgen Schmidhuber published Long Short-Term Memory (LSTM), a recurrent neural network architecture with gated memory cells that could learn to store, retrieve, and forget information over long sequences.
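In the now-standard formulation (note: the forget gate was a 1999 addition by Gers et al.; the original 1997 cell had only input and output gates), the gated update reads:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{gated memory update} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state}
\end{aligned}
```

The additive form of the cell update c_t is the key design choice: gradients can flow through c_{t-1} without being repeatedly squashed, sidestepping the vanishing-gradient problem Hochreiter had identified in 1991.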
1991
1991
Hochreiter Identifies the Vanishing Gradient Problem
Sepp Hochreiter's diploma thesis formally identified the vanishing gradient problem: the mathematical reason deep neural networks were failing to learn. Gradients shrank exponentially through layers, making training beyond a few layers impractical.
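The mechanism is easy to demonstrate. The derivative of the sigmoid never exceeds 0.25, so (ignoring the weight terms) the gradient signal through a chain of sigmoid layers shrinks at least geometrically:

```python
import numpy as np

# Backpropagating through a chain of sigmoid activations multiplies the
# gradient by sigma'(z) <= 0.25 at every layer.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
grad = 1.0
for layer in range(20):
    z = rng.normal()            # pre-activation at this layer
    s = sigmoid(z)
    grad *= s * (1.0 - s)       # sigmoid derivative, at most 0.25

print(f"gradient factor after 20 layers: {grad:.3e}")   # below 0.25**20 ~ 9e-13
```

With the weight terms included the product can also explode, but in practice saturation dominated, which is why pre-2006 deep networks effectively stopped learning in their early layers.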
1989
1989
Yann LeCun's Convolutional Neural Networks
Yann LeCun at Bell Labs demonstrated that convolutional neural networks (CNNs) trained with backpropagation could recognise handwritten digits with high accuracy. The system was deployed commercially to read ZIP codes on US mail.
1988
1988
Hans Moravec's Paradox
Roboticist Hans Moravec at Carnegie Mellon articulated what became known as Moravec's Paradox: high-level reasoning (chess, logic) is computationally cheap for machines, but low-level sensorimotor skills (walking, catching a ball) are enormously hard.
1986
1986
Backpropagation Revives Neural Networks
David Rumelhart, Geoffrey Hinton, and Ronald Williams published a clear, practical method for training multi-layer neural networks using backpropagation of errors. Though the algorithm had been discovered earlier, this paper demonstrated it could learn useful internal representations.
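A minimal NumPy sketch of the idea (a toy setup, not the 1986 paper's experiments): a two-layer sigmoid network trained by backpropagation learns XOR, the function a single-layer perceptron famously cannot represent:

```python
import numpy as np

# Two-layer network trained by backpropagation on XOR.
rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # 4 hidden units
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr, losses = 1.0, []
for _ in range(5000):
    h = sigmoid(X @ W1 + b1)              # forward pass: hidden layer
    out = sigmoid(h @ W2 + b2)            # forward pass: output
    losses.append(float(np.mean((out - y) ** 2)))
    d_out = (out - y) * out * (1 - out)   # error signal through output sigmoid
    d_h = (d_out @ W2.T) * h * (1 - h)    # backpropagated to the hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))   # typically close to [0, 1, 1, 0]
```

The hidden layer is exactly the "useful internal representation" the paper highlighted: intermediate features the network invents for itself rather than being hand-coded.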
1982
1982
Japan's Fifth Generation Computer Project
Japan's Ministry of International Trade and Industry (MITI) launched the Fifth Generation Computer Systems project, a $400 million national initiative to build intelligent computers using logic programming and parallel processing. It aimed to achieve conversational AI and expert reasoning within a decade.
1980
1980
John Searle's Chinese Room Argument
Philosopher John Searle at UC Berkeley proposed the Chinese Room thought experiment, arguing that a computer manipulating symbols according to rules does not truly "understand" anything. The argument challenged whether symbol-processing AI could ever achieve genuine intelligence.
1973
1973
The Lighthill Report Triggers the First AI Winter
British mathematician James Lighthill published a devastating government-commissioned report concluding that AI had failed to deliver on its promises. The UK government cut nearly all AI funding, and the report influenced funders worldwide.
Foundations – 1936–1969
1969
Minsky & Papert Publish "Perceptrons"
Marvin Minsky and Seymour Papert published "Perceptrons," a mathematical analysis proving that single-layer perceptrons could not learn certain functions (like XOR). The book was widely interpreted as a death blow to neural network research.
1966
1966
ELIZA β The First Chatbot
Joseph Weizenbaum at MIT created ELIZA, a simple pattern-matching program that simulated a psychotherapist. Despite using no real understanding, many users became emotionally attached to it, revealing humanity's readiness to attribute intelligence to machines.
1958
1958
McCarthy Invents LISP
John McCarthy at MIT created LISP (List Processing), a programming language built on lambda calculus with features like recursion, dynamic typing, and garbage collection. It became the standard language for AI research for the next three decades.
1957
1957
Frank Rosenblatt Builds the Perceptron
At Cornell, Frank Rosenblatt built the Mark I Perceptron, the first machine that could learn from examples. Funded by the US Navy, it was a hardware neural network that learned to classify simple visual patterns using adaptive weights.
1956
1956
The Dartmouth Workshop β AI Is Born
John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon organised the Dartmouth Summer Research Project on Artificial Intelligence. The workshop coined the term "artificial intelligence" and established AI as a distinct academic discipline. McCarthy later founded the Stanford AI Laboratory.
1950
1950
Turing Proposes the Imitation Game
Alan Turing published "Computing Machinery and Intelligence," posing the question "Can machines think?" and proposing a practical test (now called the Turing Test) to evaluate machine intelligence. He predicted that by the year 2000 a machine would fool an average interrogator 30% of the time after five minutes of questioning.
1936
Alan Turing Defines Computation
British mathematician Alan Turing published "On Computable Numbers," introducing the concept of a universal machine that could simulate any computation. This theoretical framework laid the mathematical bedrock for every computer and every AI system that would follow.
The Question That Remains
What's Still Missing?
Despite extraordinary progress, researchers continue to debate what is still needed for AGI. Open challenges include genuine causal reasoning (not just pattern matching), persistent memory and learning from experience, embodied intelligence and physical-world understanding, robust common sense, and goal-setting with autonomous motivation.