About Me
Research Scientist with a PhD in Computer Science, working on human-centric evaluation of LLM capabilities. Creator of multiple domain-specific benchmarks, with a focus on practical relevance. Experience as both an individual research lead and a supervisor for research projects across the broader space of LLM applications. As a top contributor on Stack Overflow, I've reached more than 1.5 million people to date with my technical insights and solutions.
Experience
09/2023 — present
Member of Technical Staff
Cohere Remote from Philadelphia, PA
- Core team member for North Mini Code, contributor to Command model family since Command R
- Research lead for Cohere's human annotation efforts, introducing internal benchmarks for text, vision, agentic coding, and audio evaluations
- Initiated multiple projects that reduced the average annotation handling time by over 30% while increasing annotation consistency
- Co-maintainer of our internal "eval-as-a-service" framework, used across the broader modeling organization
08/2021 — 12/2021
Applied Scientist Intern
Amazon Berlin, Germany
- Trained Seq2Seq networks to raise result coverage on long-tail query recommendations to 100%
09/2019 — 01/2021
Founding Engineer (Part-time)
Codefy GmbH Heidelberg, Germany
- Co-led the implementation of a legal document search solution, securing 200,000€ in seed funding
06/2018 — 09/2018
Software Engineering Intern
SAP Walldorf, Germany
- Researched and implemented randomized matrix decomposition algorithms, resulting in a 1000x speedup
Education
06/2019 — 07/2024
Ph.D. Computer Science
Heidelberg University Heidelberg, Germany
- Thesis title: "Towards a Unified Framework for Aspect-based Multi-document Text Summarization"
- Head TA for several NLP/IR-related graduate courses and seminars
09/2017 — 05/2019
M.Sc. Applied Computer Science
Heidelberg University Heidelberg, Germany
- GPA: 4.0 (with distinction), minor in Computational Linguistics
09/2017 — 04/2018
Exchange Year
University of Toronto Toronto, Canada
- GPA: 3.95, focus on algorithmic game theory and ML theory
10/2013 — 08/2017
B.Sc. Applied Computer Science
Heidelberg University Heidelberg, Germany
Select Publications
Divide, Evaluate, and Conquer: Checklists Improve AI Feedback
ACL 2026
How Does Quantization Affect Multilingual LLMs?
EMNLP 2024
EUR-Lex-Sum: A Multilingual Dataset for Long-form Text Summarization
EMNLP 2022