โ† Back to Demo Index

๐ŸŒ Beyond Language Models

Galton Flow Everywhere: Classification, Attention, RL, and More

๐Ÿ–ผ๏ธ Use Case 1

Image Classification

๐Ÿ” Use Case 2

Attention Mechanism

๐ŸŽฎ Use Case 3

RL Policy

๐ŸŒŠ Universal

The Pattern

๐Ÿ–ผ๏ธ Image Classification via 2D Flow Fields

Instead of CNN → softmax over 10 digits, we create a 2D probability landscape where probes naturally flow toward the correct digit's region.

Watch as probes drop randomly and flow through the learned geometry toward digit classes arranged in a circle!


💡 How It Works

2D Flow Field: Each image creates a unique 2D velocity field via a learned SDF network.

Class Centers: The 10 digit classes are arranged in a circle (learnable positions).

Probe Integration: Probes start at random positions and flow toward the correct digit's region using RK2 integration.

Soft Assignment: Gaussian windows around class centers convert probe positions → class probabilities.
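The four steps above can be sketched in a few lines of NumPy. This is a toy: the hand-coded pull toward one class stands in for the learned SDF network, and the circle radius, Gaussian width, and step counts are illustrative rather than the values used in examples/mnist_classifier.py.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 class centers arranged on a unit circle (learnable in the real model)
angles = 2 * np.pi * np.arange(10) / 10
centers = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # (10, 2)

def velocity(p, target=3):
    """Stand-in field that pulls every probe toward class 3's center.
    (A learned SDF network would produce this field from the image.)"""
    return centers[target] - p

def rk2_step(p, dt=0.1):
    k1 = velocity(p)
    k2 = velocity(p + 0.5 * dt * k1)   # midpoint (RK2) integration
    return p + dt * k2

probes = rng.uniform(-1, 1, size=(256, 2))   # probes dropped at random
for _ in range(50):
    probes = rk2_step(probes)

# Soft assignment: Gaussian windows around centers -> class probabilities
d2 = ((probes[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (256, 10)
w = np.exp(-d2 / (2 * 0.1 ** 2))
probs = (w / w.sum(axis=1, keepdims=True)).mean(axis=0)
```

The spread of `probes` just before the soft assignment doubles as a confidence signal: tightly clustered probes mean a confident prediction.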

🔗 Key Benefits:
  • Uncertainty Quantification: Probe spread = classification confidence
  • Interpretability: Visualize the decision landscape
  • Smooth Gradients: Continuous flow → stable training
  • Same Loss: Standard cross-entropy still works!

💻 Run It Yourself

See the full implementation with CNN encoder, 2D SDF flow field, and training loop:

📂 examples/mnist_classifier.py
python examples/mnist_classifier.py

# Creates model with:
# - SimpleCNN encoder (image → embedding)
# - FlowField2D (embedding → 2D velocity field)
# - Learnable class centers in a circle
# - RK2 integration for probe trajectories
# - Visualization functions included!

๐Ÿ” Attention Mechanism as Geometric Routing

Attention is fundamentally about routing information. Instead of computing routing weights via softmax(QK^T), let them emerge from geometric flow.

Probes flow from a uniform distribution across the sequence toward relevant key positions!


💡 Flow-Based Attention

Standard Attention:

Q, K, V = projections(x)
scores = Q @ K.T / √d       # O(L²) dot products
weights = softmax(scores)    # Normalize
output = weights @ V

Flow Attention:

Q, K, V = projections(x)
for each query:
    field = sdf_network(query, K)  # Create routing field
    probes = integrate(field)       # Flow to relevant keys
    weights = probe_density(probes) # Where did they land?
output = weights @ V                # Same as standard!

🔗 Why This Helps:
  • Sparse Routing: Probes only visit relevant keys
  • Hierarchical: Coarse routing → fine routing
  • Adaptive: More probes for hard queries
  • Interpretable: Visualize probe trajectories!
  • Drop-in Replacement: Same API as standard attention
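The flow-attention loop above can be made concrete with a small NumPy toy. Everything here is illustrative: a hand-built field of Gaussian wells, with depths set by the Q·K scores, stands in for the learned AttentionFlowField, keys sit at integer positions on the sequence axis, and the bucket width is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_attention(Q, K, V, n_probes=64, steps=40, dt=0.05):
    """Toy probe-routed attention: probes drift toward high-scoring keys."""
    L, d = K.shape
    key_pos = np.arange(L, dtype=float)          # keys live at positions 0..L-1
    W = np.zeros((Q.shape[0], L))                # routing weights per query
    for i, q in enumerate(Q):
        attract = np.exp(K @ q / np.sqrt(d))     # well depth per key
        x = rng.uniform(0.0, L - 1.0, n_probes)  # probes uniform over the sequence

        def vel(x):                              # gradient of the Gaussian wells
            diff = key_pos[None, :] - x[:, None]
            return (attract * diff * np.exp(-diff ** 2)).sum(axis=1)

        for _ in range(steps):                   # RK2 midpoint integration
            x = x + dt * vel(x + 0.5 * dt * vel(x))

        # soft buckets around key positions: probe density -> routing weights
        w = np.exp(-(x[:, None] - key_pos) ** 2 / 0.1).sum(axis=0)
        W[i] = w / w.sum()
    return W @ V, W

L, d = 6, 8
Q = rng.normal(size=(L, d))
K = rng.normal(size=(L, d))
V = rng.normal(size=(L, d))
out, W = flow_attention(Q, K, V)
```

Note that the projections and the final `weights @ V` are unchanged; only the softmax step is replaced by probe density.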

💻 Run It Yourself

Full implementation with Q/K/V projections and probe-based routing:

📂 examples/attention_flow.py
python examples/attention_flow.py

# Creates FlowAttention module:
# - Standard Q, K, V projections
# - AttentionFlowField (SDF for routing)
# - Per-query probe integration
# - Soft bucket assignment to keys
# - Visualize attention patterns!

🎮 RL Policy as Flow Through Action Space

Policy networks map states to action distributions. Instead of state → logits → softmax, create a flow field over action space.

The agent "feels out" the action landscape: probes flow toward good actions, and uncertainty guides exploration!


💡 Policy as Flow

Standard Policy:

logits = policy_network(state)
action_probs = softmax(logits)
action = sample(action_probs)

Flow Policy:

field = sdf_network(state, action_space)
probes = integrate(field)         # Flow to good actions
action_probs = probe_density()    # Where did they land?
action = sample(action_probs)     # Same sampling!

🔗 RL-Specific Benefits:
  • Uncertainty → Exploration: Probe spread = state uncertainty → explore more
  • Smooth Gradients: Continuous flow → stable policy updates
  • Interpretable: "Why did the agent choose that action?" → visualize the flow!
  • Adaptive Compute: Simple decisions = few probes (fast), critical decisions = more probes (careful)
  • Natural Smoothing: Flow naturally spreads to nearby actions
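A runnable toy of the flow policy above (NumPy). The preference vector `pref` stands in for the state-conditioned field that `sdf_network(state, action_space)` would produce; the Gaussian wells, bucket width, and step counts are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
action_pos = np.arange(4, dtype=float)        # 4 discrete actions on a line

def flow_policy(pref, n_probes=256, steps=150, dt=0.25):
    """pref[j] = attraction of action j (stand-in for the learned flow field)."""
    x = rng.uniform(-0.5, 3.5, n_probes)      # probes dropped across action space

    def vel(x):                               # one Gaussian well per action
        diff = action_pos[None, :] - x[:, None]
        return (pref * diff * np.exp(-diff ** 2)).sum(axis=1)

    for _ in range(steps):                    # RK2 midpoint integration
        x = x + dt * vel(x + 0.5 * dt * vel(x))

    # soft buckets around action positions -> action probabilities
    w = np.exp(-(x[:, None] - action_pos) ** 2 / 0.1).sum(axis=0)
    return w / w.sum()

probs = flow_policy(np.array([0.2, 3.0, 0.2, 0.2]))   # this state favors action 1
action = rng.choice(4, p=probs)               # same sampling step as a softmax policy
```

The last line is the point: downstream sampling (and therefore REINFORCE/PPO-style losses over log-probabilities) is unchanged.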

💻 Run It Yourself

Complete RL policy implementation with GridWorld environment:

📂 examples/rl_policy_flow.py
python examples/rl_policy_flow.py

# Creates FlowPolicy with:
# - State encoder network
# - PolicyFlowField (1D action space)
# - Probe integration with RK2
# - GridWorld environment included
# - Training loop sketch for REINFORCE/PPO

🌊 The Universal Pattern

The core insight, probability as geometric flow, applies anywhere you use softmax.

All three examples follow the same template. This isn't just for LLMs; it's a fundamental rethinking of categorical probability!

🎯 The Template

# Traditional approach (everywhere)
logits = neural_network(input)
probabilities = softmax(logits)
choice = sample(probabilities)

# Galton approach (universal)
context = neural_network(input)
field = sdf_network(context, choice_space)
probes = integrate(field)
probabilities = probe_density(probes)
choice = sample(probabilities)
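The template can be exercised end-to-end with a toy field (NumPy). `galton_choice`, its Gaussian-well field, and every constant are invented stand-ins for `sdf_network` and `probe_density`; the second call shows how probe spread reports uncertainty when two choices compete.

```python
import numpy as np

rng = np.random.default_rng(0)

def galton_choice(strength, n_probes=256, steps=150, dt=0.2):
    """Toy template: strength[j] is the pull of choice j, standing in for
    the field a learned sdf_network would produce from the context."""
    pos = np.arange(len(strength), dtype=float)       # 1D choice space
    x = rng.uniform(pos[0] - 0.5, pos[-1] + 0.5, n_probes)

    def vel(x):                                       # one Gaussian well per choice
        diff = pos[None, :] - x[:, None]
        return (strength * diff * np.exp(-diff ** 2)).sum(axis=1)

    for _ in range(steps):                            # RK2 midpoint integration
        x = x + dt * vel(x + 0.5 * dt * vel(x))

    w = np.exp(-(x[:, None] - pos) ** 2 / 0.1).sum(axis=0)   # probe_density
    return w / w.sum(), x.std()                       # probabilities, probe spread

# One dominant choice: probes collapse into one region (confident, low spread)
p_uni, spread_uni = galton_choice(np.array([0.1, 5.0, 0.1, 0.1]))
# Two rival choices: probes split between regions (uncertain, high spread)
p_bi, spread_bi = galton_choice(np.array([3.0, 0.1, 0.1, 3.0]))
```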

💡 What You Gain Across All Domains:
  • 🎯 Uncertainty: Probe spread = confidence (no post-hoc entropy calculations)
  • 🔍 Interpretability: Visualize decision landscapes and understand why choices were made
  • ⚡ Adaptive Compute: Confident decisions = few probes (fast), uncertain = more probes (careful)
  • 🎨 Smooth Optimization: Continuous flow = stable gradients and better training dynamics
  • 🌊 Physical Intuition: Decisions as flow (not algebra), natural and interpretable

🚀 More Use Cases

The pattern extends to:

  • Mixture of Experts: Router network → flow → expert selection
  • Hierarchical Classification: Cascaded flow (coarse → fine)
  • Structured Prediction: Flow over structured spaces (parsing, graphs)
  • Neural Architecture Search: Flow through architecture space
  • Multi-Modal Fusion: Multiple modalities create combined flow field
  • Seq2Seq Decoding: Replace decoder softmax with flow
  • Recommendation Systems: User state → flow over item space
  • Graph Neural Networks: Message passing via geometric flow

See docs/use-cases.md for detailed patterns and code!

🧪 Try It on Your Domain

Step-by-step:

  1. Identify where you use softmax (classification, routing, sampling, etc.)
  2. Define your choice space (classes, tokens, actions, items, etc.)
  3. Create an SDF network that takes your context as input
  4. Integrate probes through the learned field (RK2, adaptive steps)
  5. Assign probes to choices via soft buckets (Gaussian windows)
  6. Train with same loss (cross-entropy, policy gradient, etc.)

The geometry handles the rest!
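Step 4's adaptive integration can be sketched by stopping as soon as the probe cloud settles, so the step count adapts to the landscape instead of being fixed. A NumPy toy (the field, tolerance, and step budget are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
pos = np.arange(4, dtype=float)                  # 4 choices on a line

def adaptive_flow(strength, n_probes=256, max_steps=400, dt=0.2, tol=1e-3):
    """Integrate until the probe cloud stops moving, then read out probabilities."""
    x = rng.uniform(-0.5, 3.5, n_probes)

    def vel(x):                                  # one Gaussian well per choice
        diff = pos[None, :] - x[:, None]
        return (strength * diff * np.exp(-diff ** 2)).sum(axis=1)

    for step in range(1, max_steps + 1):
        x_new = x + dt * vel(x + 0.5 * dt * vel(x))   # RK2 midpoint
        moved = np.abs(x_new - x).max()
        x = x_new
        if moved < tol:                          # cloud has settled: stop early
            break

    w = np.exp(-(x[:, None] - pos) ** 2 / 0.1).sum(axis=0)
    return w / w.sum(), step

p_easy, steps_easy = adaptive_flow(np.array([0.1, 5.0, 0.1, 0.1]))  # one clear winner
p_hard, steps_hard = adaptive_flow(np.array([3.0, 0.1, 0.1, 3.0]))  # two rivals
```

Both calls finish well under the step budget here; in a trained model the same early-exit test is what lets confident decisions spend less compute than ambiguous ones.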

💬 Final Thought:

"For decades, we've calculated probabilities algebraically. But in the physical world, probability flows. Water finds its level. Particles settle. Energy minimizes.

Galton Lab isn't just for language models. It's a new way of thinking about uncertainty in AI, one where the geometry does the work, the physics guides the bits, and probability emerges naturally from flow."

— Explore the code, run the examples, try it on your domain!
