This role leads cross-functional teams developing new approaches to AI assessment, including automated evaluation systems. You'll work with ML researchers, engineers, and domain experts to pioneer new methods for scalable, high-quality AI evaluation.
We are looking for an outstanding, hands-on manager who will thrive in a fast-paced environment. We believe the most exciting problems in machine learning research arise at the intersection with real-world use cases, and this is also where the most critical breakthroughs come from.
KEY RESPONSIBILITIES:
Lead R&D in automated AI evaluation, including development of LLM-based assessment systems that can reliably evaluate model outputs
Drive research and implementation of novel approaches to measure and improve AI system quality, safety, and alignment
Build and scale evaluation infrastructure that combines human expertise with ML-powered automation
Work with cross-functional partners to integrate evaluation systems into production workflows
QUALIFICATIONS:
Master's degree with 4+ years of industry experience, or PhD with 3+ years, or equivalent work experience
3+ years of experience developing ML evaluation systems
2+ years leading technical teams in ML/AI
Strong technical leadership and communication skills
Expertise in full-stack LLM development and deployment
Deep experience with LLM evaluation and automated assessments
Track record scaling ML systems in production
Strong product intuition and ability to identify high-impact opportunities
Excellence in building and leading high-performing technical teams