Priority map
sorted by leverage
- Core
Agent Evaluation
Reliable measurement of agent capability and failure.
- Core
Agent Observability
Instrumentation for understanding agent internals and behavior.
- Core
Distribution Shift
Performance changes when deployment differs from training.
- Core
Objective Misalignment
Mismatch between optimized and intended objectives.
- Core
Reward Hacking
Policies exploiting flaws in specified rewards.
- Core
Reward Model Drift
Changes in reward model behavior over time.
- Core
Safe RL
Sequential decision making under explicit safety constraints.
- Core
Safety Instrumentation
Systems that expose leading indicators of unsafe behavior.
- Active
Causal RL
Causal reasoning for reinforcement learning.
- Active
Long Horizon Planning
Reliable reasoning over extended action horizons.
- Active
Online Adaptation
Safe learning and adjustment during deployment.
- Active
Policy Collapse
Abrupt degradation or loss of useful policy behavior.
- Active
POMDP
Decision making under partial observability.
- Active
RLHF
Reinforcement learning from human feedback.
- Active
Robust RL
Policies that remain effective under uncertainty and perturbation.
- Active
Sequential Decision Making
Decisions whose effects unfold over time.
- Explore
Embodied AI
Agents acting through bodies in physical environments.
- Explore
Geometry of Learning
Geometric structure in optimization and representation.
- Explore
World Models
Learned predictive models for decision making.