Adam Marblestone – AI is missing something fundamental about the brain

The human brain achieves extraordinary intelligence with just 20 watts—vastly more efficient than AI models. Adam Marblestone explores fundamental differences: evolution-encoded loss functions, omnidirectional inference, and specialized reward systems that AI lacks.

The human brain accomplishes extraordinary feats of learning and intelligence with just 20 watts of power and limited data—a stark contrast to large language models that consume vastly more energy and training data yet achieve only a fraction of human cognitive capabilities. This efficiency gap points to fundamental differences in how biological and artificial systems process information, learn from experience, and navigate the world.

Key Takeaways

  • The brain likely uses highly specific, evolution-encoded loss functions rather than the simple mathematical objectives favored in machine learning
  • Cortical regions may perform "omnidirectional inference"—predicting any subset of variables from any other subset, unlike LLMs' unidirectional next-token prediction
  • Evolution encodes complex reward systems through specialized cell types in subcortical "Steering Subsystem" regions, which learning areas then predict and generalize
  • Understanding brain architecture could require connectome mapping at scale, potentially costing in the low billions of dollars while informing AI development efforts worth trillions
  • Current AI may be missing crucial components like energy-based models, sophisticated memory systems, and hierarchical learning architectures found in biological intelligence

The Complexity Problem: Why Simple Loss Functions Fall Short

Machine learning gravitates toward mathematically elegant solutions: predict the next token, minimize cross-entropy loss, optimize straightforward objectives that computers handle efficiently. Yet this preference for simplicity may represent a fundamental limitation.

Evolution operated under different constraints. Over millions of years, natural selection could encode remarkably complex loss functions—essentially sophisticated "Python code" specifying what different brain regions should learn at various developmental stages. These aren't the clean mathematical functions preferred by computer scientists, but intricate, multi-layered objectives tailored to survival and reproduction.

Consider the sophistication required: evolution needed to create learning systems that could master language, navigate social hierarchies, develop tool use, and acquire countless other skills—all without knowing in advance what specific challenges organisms would face. The solution appears to involve numerous specialized loss functions, each targeting different aspects of cognition and behavior.
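
To make the contrast concrete, here is a deliberately toy sketch in PyTorch. The single cross-entropy objective is the machine-learning default; the composite, developmentally gated objective is purely hypothetical, standing in for the kind of region-specific "code" evolution might encode. Every target name and schedule in it is invented for illustration.

```python
import torch.nn.functional as F

# Standard ML objective: one simple, global loss.
def simple_loss(logits, next_tokens):
    return F.cross_entropy(logits, next_tokens)

# Hypothetical evolution-style objective: many specialized losses, each
# gating a different "region" at a different developmental stage.
# Every target and schedule here is illustrative, not biological fact.
def composite_loss(outputs, targets, age_years):
    loss = 0.0
    if age_years < 2.0:  # an early-active, face-like objective
        loss = loss + F.mse_loss(outputs["face_map"], targets["face_map"])
    ramp = min(age_years / 5.0, 1.0)  # a later-ramping, speech-like objective
    loss = loss + ramp * F.cross_entropy(outputs["phoneme_logits"], targets["phonemes"])
    # ...evolution could stack hundreds more such terms, each tuned by selection
    return loss
```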

Beyond Next-Token Prediction

Large language models excel at their training objective: given a sequence of tokens, predict what comes next. This unidirectional approach works remarkably well for text generation, but represents just one type of prediction among many possible forms.

The cortex may implement something far more flexible: omnidirectional inference. Any cortical area might learn to predict any pattern in any subset of its inputs, given any other missing subset. Rather than the rigid "context window to next token" mapping, the brain could support arbitrary combinations of known and unknown variables.

This flexibility enables remarkable generalization. Association areas might predict vision from audition, or anticipate emotional responses from conceptual understanding. The system learns to model not just external sensory data, but internal states—predicting when muscles will tense, when reflexes will trigger, or when specific emotional responses will activate.
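
One minimal way to see the difference in code is masked prediction over random subsets, in the spirit of denoising or masked autoencoders. The sketch below is an analogy for the idea, not a model of cortical circuitry:

```python
import torch
import torch.nn as nn

class OmnidirectionalPredictor(nn.Module):
    """Predict any masked subset of variables from the remaining ones."""
    def __init__(self, n_vars, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_vars, hidden),  # values plus mask indicators
            nn.ReLU(),
            nn.Linear(hidden, n_vars),      # an estimate for every variable
        )

    def forward(self, x, observed_mask):
        # Zero out hidden variables and tell the network which inputs are real.
        inp = torch.cat([x * observed_mask, observed_mask], dim=-1)
        return self.net(inp)

# Training sketch: a fresh random mask each step, so no single direction
# of inference (e.g., "past predicts next token") is privileged.
model = OmnidirectionalPredictor(n_vars=16)
x = torch.randn(32, 16)
mask = (torch.rand_like(x) > 0.5).float()
pred = model(x, mask)
loss = (((pred - x) ** 2) * (1 - mask)).mean()  # score only the missing subset
```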

The Steering and Learning Subsystem Framework

Recent neuroscience suggests the brain operates through two interconnected but distinct systems. The Learning Subsystem—primarily cortical regions—builds flexible world models and adapts to new information. The Steering Subsystem—subcortical areas like the hypothalamus and brainstem—contains more hardwired responses and generates reward signals.

This division helps solve a crucial puzzle: how can evolution create organisms that respond appropriately to situations it never directly experienced? Evolution never encountered podcasts, social media, or modern technology, yet humans develop emotional responses to these novel contexts.

The Generalization Solution

The answer lies in predictive modeling. Parts of the Learning Subsystem learn to predict what the Steering Subsystem will do. These predictions can generalize in ways the hardwired responses cannot.

Consider spider phobia as an example. The Steering Subsystem might contain innate circuitry that triggers fear responses to small, dark, fast-moving objects—a reasonable heuristic for avoiding spiders and other potentially dangerous arthropods. But the Learning Subsystem can learn to predict this response and generalize it to abstract concepts like the word "spider" or even conversations about spiders.

As Marblestone puts it: "The cortex inherently has the ability to generalize because it's just predicting based on these very abstract variables and all this integrated information that it has."

This creates sophisticated reward functions without requiring evolution to anticipate every possible scenario. The system learns which learned features should connect to which innate emotional responses, enabling appropriate reactions to novel situations that share relevant underlying patterns with ancestral challenges.
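
A toy rendering of this two-system arrangement might look like the following, with every feature and threshold invented for illustration: a hardwired trigger supplies the training signal, and the learned predictor of that trigger is what ends up generalizing to abstract cues like the word "spider".

```python
import torch
import torch.nn as nn

# Hardwired "Steering Subsystem" stand-in: fires on small, dark,
# fast-moving stimuli. All thresholds are invented for illustration.
def innate_fear(size, brightness, speed):
    return float(size < 0.1 and brightness < 0.3 and speed > 0.8)

# Learned "Learning Subsystem" predictor: maps rich cortical features
# (which could include word embeddings for "spider") to the innate
# signal it expects the steering circuitry to emit.
fear_predictor = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid()
)

# Training pairs the predictor against the innate signal whenever both
# are available; afterward the predictor alone can fire on abstract
# inputs the hardwired heuristic never sees.
features = torch.randn(1, 64)                       # abstract cortical features
target = torch.tensor([[innate_fear(0.05, 0.2, 0.9)]])
loss = nn.functional.binary_cross_entropy(fear_predictor(features), target)
```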

Cellular Diversity and Functional Specialization

Recent single-cell mapping studies reveal striking differences between brain regions. Cortical areas show relatively uniform cell types—consistent with implementing general learning algorithms. But subcortical Steering Subsystem regions contain thousands of highly specialized, diverse cell types.

This cellular diversity likely reflects functional specialization. Each distinct reward function, each specific innate response, may require dedicated neural circuitry with unique molecular properties and connectivity patterns. The "spider flinch reflex" needs different wiring than the "salt taste detection" system, necessitating distinct cell types to implement these precise connections.

Architectural Insights and AI Implications

If this framework accurately describes biological intelligence, it suggests current AI systems miss crucial architectural elements. Most significantly, they lack the sophisticated reward and attention mechanisms that guide learning in biological systems.

Energy-Based Models and Probabilistic Inference

The omnidirectional inference capabilities of the cortex align more closely with energy-based models than with standard feedforward networks. Rather than computing specific conditional probabilities, these systems model joint distributions across all variables—allowing flexible inference in any direction.

This approach requires more sophisticated inference procedures, potentially involving sampling and iterative refinement rather than simple feedforward computation. The brain's apparent ability to perform such computations in milliseconds suggests either highly optimized approximation methods or fundamental differences in how biological systems handle probabilistic reasoning.
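
A bare-bones illustration of that flexibility, using a toy quadratic energy rather than anything neural: whichever variables happen to be observed get clamped, and inference is always the same minimization loop over the rest.

```python
import torch

torch.manual_seed(0)
M = torch.randn(8, 8)
A = M @ M.T + torch.eye(8)   # positive-definite, so the energy has a minimum

def energy(z):
    # Toy joint energy E(z) = 0.5 * z^T A z over all eight variables.
    return 0.5 * z @ A @ z

def infer(obs, mask, steps=200, lr=0.05):
    """Clamp observed entries (mask == 1), minimize energy over the rest."""
    free = torch.zeros_like(obs, requires_grad=True)
    opt = torch.optim.SGD([free], lr=lr)
    for _ in range(steps):
        z = obs * mask + free * (1 - mask)
        opt.zero_grad()
        energy(z).backward()
        opt.step()
    return (obs * mask + free * (1 - mask)).detach()

# The same code predicts variables 4-7 from 0-3, or the reverse:
obs = torch.tensor([1., -2., 0.5, 3., 0., 0., 0., 0.])
mask = torch.tensor([1., 1., 1., 1., 0., 0., 0., 0.])
completed = infer(obs, mask)
```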

The Amortization Trade-off

Current AI systems heavily rely on amortized inference—baking complex reasoning processes into feedforward networks during training to avoid expensive computation at test time. The brain may strike a different balance, using more test-time compute for flexible reasoning while amortizing other capabilities into fast reflexive responses.
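
Schematically, the trade-off looks like this; `infer` below stands in for any iterative test-time procedure, such as the energy-minimization loop sketched earlier.

```python
import torch.nn as nn

# Amortized: pay the cost at training time, then answer in one cheap
# pass with fixed compute per query.
amortized_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 8))
# answer = amortized_net(query)

# Deliberative: spend compute per query, refining an answer iteratively
# (e.g., the infer() loop from the energy-based sketch, run longer for
# harder questions). Slower, but flexible about what is being asked.
# answer = infer(query_values, query_mask, steps=2000)
```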

This distinction becomes crucial for digital minds that can be copied and updated. While each biological brain must do its learning anew within a single lifetime, digital systems can share learned capabilities across copies. This asymmetry may favor different architectural choices and learning strategies than those evolution discovered.

The Connectome Project: Mapping Intelligence

Understanding these principles requires detailed knowledge of brain architecture—specifically, comprehensive maps of neural connectivity at cellular resolution. Current connectome projects aim to trace every neuron and synapse in mammalian brains, providing the data necessary to reverse-engineer biological intelligence.

Technological Challenges and Solutions

Initial estimates suggested mapping a single mouse brain would cost several billion dollars using electron microscopy. However, new optical approaches promise dramatic cost reductions—potentially bringing mouse connectomes down to tens of millions of dollars and making human brain mapping economically feasible.

The key innovation involves switching from electron to optical microscopy, allowing molecular annotation of connections rather than just structural mapping. This provides information about synapse types, cell identities, and functional properties—crucial data for understanding how connectivity patterns implement specific algorithms.

Timeline and Funding Considerations

Comprehensive connectome mapping would require sustained investment in the low billions of dollars—substantial but modest compared to current AI development costs. The timeline depends heavily on technological development and coordinated effort, but major progress appears feasible within a decade given sufficient resources.

In Marblestone's words: "If you can only do that for low billions of dollars or something to really comprehensively solve that, it seems to me, in the grand scheme of trillions of dollars of GPUs and stuff, it actually makes sense to do that investment."

The urgency depends partly on AI development timelines. If current approaches achieve human-level intelligence within a few years, neuroscience insights may have limited impact on the initial development phase. However, understanding biological intelligence remains crucial for longer-term safety and capability questions.

Formal Methods and Mathematical Reasoning

Beyond brain-inspired architectures, advances in formal verification offer another path toward more reliable AI systems. Languages like Lean enable mathematical proofs to be expressed in verifiable computer code, creating perfect training signals for AI systems learning to reason mathematically.
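
For concreteness, here is what such machine-checkable mathematics looks like in Lean 4: the kernel either certifies a proof or rejects it, which is precisely the kind of unambiguous training signal described here.

```lean
-- The statement and its proof are both code; the Lean kernel either
-- certifies the proof or rejects it, with no partial credit.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Proof search can also proceed by tactics; `simp` here stands in for
-- a step a reinforcement-learning policy might propose and the kernel
-- would then verify.
example (n : Nat) : n + 0 = n := by simp
```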

The Promise of Provable AI

Formal verification transforms mathematical reasoning from a fuzzy, intuition-based process into a precise optimization problem with clear success criteria. AI systems can search for proofs using reinforcement learning, receiving immediate feedback on correctness through automated verification.

This approach already shows promise for automating routine mathematical tasks and could extend to software verification, enabling provably secure systems resistant to hacking and manipulation. The combination of automated cleverness in proof search with rigorous verification of results offers a compelling model for trustworthy AI reasoning.

Collaborative Intelligence

Formal methods also enable new forms of collaboration between human and artificial intelligence. When both parties express ideas in verifiable formal languages, trust becomes mechanically checkable rather than relying on reputation or authority.

This capability becomes increasingly important as AI systems proliferate. In a world with numerous artificial intelligences, formal communication protocols may prove essential for reliable cooperation and knowledge sharing between different systems.

Future Directions and Open Questions

The convergence of neuroscience insights, connectome mapping, and formal methods points toward a more comprehensive understanding of intelligence. However, significant questions remain about consciousness, continual learning, and the relationship between biological and digital minds.

The Representation Problem

How does the brain represent knowledge and world models? Current evidence suggests a complex mixture of geometric, associative, and potentially symbolic representations, but the precise organizational principles remain unclear. Understanding these representations could inform more efficient AI architectures.

Scaling and Generalization

The rapid expansion of human cognitive capabilities during evolution suggests architectural changes that dramatically improved learning efficiency. Identifying these innovations could accelerate AI development while maintaining safety through better understanding of goal formation and value learning.

The ultimate question remains whether digital intelligence will recapitulate biological principles or discover entirely different approaches to general intelligence. Current trends suggest a hybrid future where AI systems incorporate biological insights while exploiting unique advantages of digital implementation—copyability, external memory access, and precise control over internal states.

This research program promises not just better AI systems, but deeper understanding of intelligence itself. Whether pursued through connectome mapping, formal methods, or hybrid approaches, the quest to understand how minds work represents one of the most profound scientific challenges of our time—one whose solution could reshape both technology and our understanding of ourselves.
