GenAI Zürich 2026

Testing LLM Outputs: Caging the Wind or Just Another Day in the Office?

April 2, 2026 - Tech & Startup Stage, Volkshaus Zürich

Slides PDF, GenAI Zürich 2026

About this talk

As LLM-based applications scale and teams grow, intuition is no longer enough to know whether things work. This talk covers Adobe's journey from a simple LLM app to a sophisticated skills-based system, the shift to rigorous testing with Promptfoo, and the lessons learned from managing systems that can feel unpredictable.

Key takeaways

  • Why testing LLM outputs is different from traditional software testing, and what that means for your workflow
  • How to set up evaluation-driven development with Promptfoo to catch regressions before they reach users
  • Practical patterns for scaling LLM testing as your application grows from a single prompt to a multi-skill system
  • Lessons from running this at Adobe: what worked, what surprised us, and what we'd do differently
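As a taste of the evaluation-driven workflow the takeaways describe, here is a minimal Promptfoo configuration sketch; the provider, prompt, test case, and rubric below are illustrative assumptions, not material from the talk:

```yaml
# promptfooconfig.yaml — minimal sketch; the prompt, provider,
# and assertions here are hypothetical examples
description: Regression checks for a summarization prompt
prompts:
  - "Summarize the following support ticket in one sentence: {{ticket}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      ticket: "Customer cannot export PDFs after the 9.2 update."
    assert:
      # deterministic check on the output text
      - type: icontains
        value: "PDF"
      # model-graded check for fuzzier requirements
      - type: llm-rubric
        value: "Mentions the export failure and stays under 30 words"
```

Running `npx promptfoo@latest eval` against such a file re-scores every test case whenever the prompt or model changes, which is how regressions can be caught before they reach users.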

Resources

Alex Carol, Software Development Engineer at Adobe