Our alignment team works to keep AI systems helpful, honest, and aligned with human values as they become more capable. We develop methods that steer AI toward beneficial outcomes while maintaining human oversight and control in healthcare applications.
As AI systems grow more autonomous and capable, aligning them with human values becomes increasingly critical, especially in healthcare. Our mission is to develop practical value-alignment techniques that keep AI systems under human control while maximizing patient benefit and safety.
Learning human preferences and values from clinical outcomes, expert decisions, and patient feedback. We develop methods to encode clinician and patient values into AI systems.
Creating reward functions that capture complex clinical objectives: not just accuracy on a single metric, but holistic patient wellbeing, quality of life, and equitable outcomes.
Applying reinforcement learning techniques so that learned policies stay aligned with human values and interpretable to clinicians.
Testing whether AI systems maintain their intended behavior under adversarial conditions, distribution shifts, and edge cases that might occur in real clinical settings.
Developing methods to incorporate patient preferences and values into AI treatment recommendations, so that care is personalized to each patient's goals.
Designing AI systems that maintain meaningful human oversight, where clinicians can easily correct and guide AI behavior.
Evaluating whether AI systems make recommendations fairly across different patient populations and detecting potential biases.
Creating comprehensive evaluation frameworks to test whether healthcare AI systems behave as intended in diverse scenarios.
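To make the fairness evaluation described above concrete, here is a minimal Python sketch of one possible check: comparing how often a system issues a positive recommendation across patient groups. The function names (`recommendation_rates`, `max_rate_gap`), the group labels, and the toy records are all hypothetical and chosen for illustration. A rate gap of this kind is only one of several fairness criteria, and this is not a description of any specific evaluation framework.

```python
from collections import defaultdict


def recommendation_rates(records):
    """Return the fraction of positive recommendations per patient group.

    `records` is an iterable of (group_label, recommended) pairs, where
    `recommended` is True when the system recommended the treatment.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, recommended in records:
        totals[group] += 1
        if recommended:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def max_rate_gap(rates):
    """Largest difference in recommendation rate between any two groups."""
    values = list(rates.values())
    return max(values) - min(values)


# Hypothetical toy data for illustration only, not real patient records.
records = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = recommendation_rates(records)
gap = max_rate_gap(rates)
print(rates, gap)
```

A large gap does not by itself establish bias, since groups may differ in clinical need. It is better read as a signal that the recommendations deserve closer review by clinicians.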
We collaborate with clinical partners, ethics experts, and patient advocacy groups to ensure our alignment research reflects real-world healthcare values and needs. Our work is informed by direct engagement with clinicians and patients who depend on these systems.
We're seeking researchers in AI safety, ethicists, healthcare professionals, and engineers passionate about building trustworthy AI systems. Help us ensure AI systems in healthcare remain aligned with human values.