The Inner Reward: Learning from Within
The conventional account of reward is simple: an animal receives food, dopamine spikes, and learning happens. But "The Interoceptive Origin of Reinforcement Learning" (Weber et al., 2025) turns that model inside out. The authors argue that primary rewards don't come from taste or immediate pleasure - they emerge later, during digestion, when nutrients become usable energy. On this view, the brain's reward system listens to the body first: interoceptive signals tied to glucose oxidation, hydration, and fat metabolism act as delayed but decisive reinforcement.
This redefines reward not as pleasure, but as evidence of survival. Every time the body's balance is restored - through nourishment, hydration, or some other homeostatic correction - it trains the brain to repeat the behavior that restored it. The research bridges neuroscience and physiology, suggesting that our sense of value is a mirror of our internal state.
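The idea of a delayed, body-generated reward can be made concrete with a toy simulation. The sketch below is not the model from Weber et al.; it assumes a single hypothetical internal variable (`energy`), a fixed set point, and a fixed digestion delay, and it defines reward as the reduction in distance from the set point at the moment nutrients are absorbed.

```python
from collections import deque

SETPOINT = 10.0  # assumed ideal energy level (illustrative value)


def drive(energy):
    """Distance of the internal state from its set point (lower is better)."""
    return abs(SETPOINT - energy)


def simulate(meals, delay=3, energy=4.0):
    """Return the reward at each step when a meal is absorbed `delay` steps later."""
    in_transit = deque([0.0] * delay)  # food that is still being digested
    rewards = []
    for eaten in meals:
        in_transit.append(eaten)
        absorbed = in_transit.popleft()  # what finishes digesting this step
        before = drive(energy)
        energy += absorbed
        # Reward is the drop in drive, so it appears at absorption, not at eating.
        rewards.append(before - drive(energy))
    return rewards


rewards = simulate([3.0, 0.0, 0.0, 0.0, 0.0])
print(rewards)
```

In this run the meal is eaten at step 0, but the only non-zero reward appears at step 3, when the nutrients actually move the internal state toward its set point. A learner driven by this signal has to link the reward back to an action taken several steps earlier.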
From Gut to Group: How Reward Shapes Social Learning
If rewards are born in the body, how do they drive complex social behavior? A companion paper, "Reward Is Enough for Social Learning" (Schultner et al., 2025), argues that the same principles that guide animals toward food also guide humans toward each other. The researchers propose the Social Feature Learning (SFL) model, showing that strategies like "follow the majority" or "copy the successful" can emerge entirely from reward-based learning of social cues. In other words, people learn who to trust, emulate, or ignore not through fixed instincts but by associating social signals - approval, popularity, or prestige - with positive outcomes.
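To illustrate how a strategy like "copy the successful" could emerge from reward alone, here is a minimal sketch. It is not the SFL model from Schultner et al.; the two-option task, the feature names, the payoff probabilities, and the learning rule (a linear value estimate updated by a delta rule) are all assumptions chosen for illustration. In this toy world the successful demonstrator's choice predicts reward and the majority's choice does not.

```python
import random

FEATURES = ("bias", "majority", "successful")


def value(weights, feats):
    """Linear estimate of how rewarding an option is, given its social cues."""
    return sum(weights[f] * feats[f] for f in FEATURES)


def run(trials=5000, lr=0.05, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    weights = {f: 0.0 for f in FEATURES}
    for _ in range(trials):
        good = rng.randrange(2)  # which option pays off more often
        # Assumption: the successful demonstrator picks the good option 90% of the time.
        star = good if rng.random() < 0.9 else 1 - good
        crowd = rng.randrange(2)  # assumption: the majority choice is uninformative
        options = [
            {"bias": 1.0,
             "majority": float(crowd == i),
             "successful": float(star == i)}
            for i in range(2)
        ]
        if rng.random() < epsilon:  # occasional exploration
            choice = rng.randrange(2)
        else:
            choice = max(range(2), key=lambda i: value(weights, options[i]))
        p_reward = 0.8 if choice == good else 0.2
        reward = 1.0 if rng.random() < p_reward else 0.0
        # Delta rule: nudge each active cue's weight toward the observed reward.
        error = reward - value(weights, options[choice])
        for f in FEATURES:
            weights[f] += lr * error * options[choice][f]
    return weights


weights = run()
print(weights)
```

Under these assumptions the learner ends up weighting the "successful" cue more heavily than the "majority" cue, without any built-in rule telling it whom to copy. Making the majority informative instead would be expected to shift the weights the other way.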
Over time, these learned social rewards scale into cultural evolution. Norms spread, trends amplify, and civilizations evolve as social learning loops reinforce behaviors that "feel rewarding." The same feedback mechanism that helps a mouse find sugar may, at a larger scale, explain how ideas spread through societies and social media.
Extending Reward to Machines and Minds
Both studies intersect with a broader trend: the effort to make artificial agents more like natural ones. Reinforcement learning (RL) in AI typically relies on external, static reward functions - points, scores, or success metrics. But biological systems generate their own goals dynamically, using internal feedback loops tied to physiology and context.
The implication is far-reaching: to make AI agents behave more like natural ones, we would need to let them learn from internal states, not just external outcomes. Future systems could integrate self-monitoring mechanisms - "digital interoception" - to adjust their goals, motivation, and ethics in real time, much like the human brain maintains internal equilibrium while exploring the world.
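The contrast between a static external reward function and an internally generated one can be sketched in a few lines. Everything here is hypothetical: the class name `InteroceptiveAgent`, the two internal variables, the set point, and the intake amount are illustrative assumptions, not part of any cited study or library.

```python
SETPOINT = 10.0  # assumed target level for every internal variable


def external_reward(outcome):
    """Static, hand-written reward table: the same outcome always pays the same."""
    return {"food": 1.0, "water": 1.0, "nothing": 0.0}[outcome]


class InteroceptiveAgent:
    """Toy agent whose reward is computed from its own internal state."""

    def __init__(self, energy, hydration):
        # Keys name the outcome that replenishes each internal variable.
        self.levels = {"food": energy, "water": hydration}

    def deficit(self):
        """Total shortfall from the set point across internal variables."""
        return sum(SETPOINT - level for level in self.levels.values())

    def reward(self, outcome, amount=5.0):
        """Reward is how much the outcome reduced the agent's total deficit."""
        before = self.deficit()
        if outcome in self.levels:
            self.levels[outcome] = min(SETPOINT, self.levels[outcome] + amount)
        return before - self.deficit()


agent = InteroceptiveAgent(energy=2.0, hydration=9.0)
r_food = agent.reward("food")    # agent is hungry, so food is worth a lot
r_water = agent.reward("water")  # agent is barely thirsty, so water is worth little
print(r_food, r_water)
```

The external table scores food and water identically, while the internal signal makes the same outcome worth more or less depending on what the agent currently lacks. That state dependence is the property the "digital interoception" idea points at.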
The Hierarchy of Learning
A third study, "A Hierarchical Model of Early Brain Functional Networks", suggests that even the brain's earliest structures follow layered principles of reward and adaptation. From sensory integration in infants to high-level cognition in adults, the brain's functional networks build themselves hierarchically - each layer refining signals from the one below. This fits a broader idea: that both the body and brain learn by reducing uncertainty - a principle that recurs across evolution, psychology, and machine learning.
Toward a Unified Science of Reward and Consciousness
Taken together, these findings mark a shift from viewing reward as external and singular to seeing it as multi-layered, embodied, and social.
- The body provides the primary reinforcement through interoceptive feedback.
- The mind refines these signals through inference and prediction.
- The social world amplifies them into shared meaning and culture.
In this new framework, reward is not simply what feels good - it is how life learns to preserve itself. From gut chemistry to global consciousness, the same logic unfolds: systems seek stability through feedback, adjusting their structure until balance feels like truth.