GenAI Zürich 2026
Testing LLM Outputs: Caging the Wind or Just Another Day in the Office?
About this talk
As LLM-based applications scale and teams grow, you can no longer rely on intuition to know whether things work. This talk covers Adobe's journey from a simple LLM app to a sophisticated skills-based system, the shift to rigorous testing with Promptfoo, and lessons learned from managing systems that feel unpredictable.
Key takeaways
- Why testing LLM outputs is different from traditional software testing, and what that means for your workflow
- How to set up evaluation-driven development with Promptfoo to catch regressions before they reach users
- Practical patterns for scaling LLM testing as your application grows from a single prompt to a multi-skill system
- Lessons from running this at Adobe: what worked, what surprised us, and what we'd do differently
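To make the evaluation-driven workflow concrete, here is a minimal Promptfoo config sketch. The prompt, provider, and test case are illustrative placeholders, not material from the talk; Promptfoo reads a `promptfooconfig.yaml` like this and runs each test's assertions against the model's output:

```yaml
# promptfooconfig.yaml — minimal sketch; prompt, provider, and test values are illustrative
prompts:
  - "Summarize the following support ticket in one sentence: {{ticket}}"

providers:
  - openai:gpt-4o-mini

tests:
  - vars:
      ticket: "My export keeps failing with a timeout error after about 30 seconds."
    assert:
      # Deterministic check: the summary must mention the key symptom
      - type: contains
        value: timeout
      # Model-graded check: an LLM judges the output against a rubric
      - type: llm-rubric
        value: The summary is a single sentence and mentions the export failure.
```

Running `npx promptfoo eval` executes the tests and reports pass/fail per assertion, which is how regressions get caught in CI before they reach users.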
Resources
- Promptfoo - LLM evaluation framework
- Claude Agent SDK plugin support - promptfoo/promptfoo#6377
- Redact exported config secrets - promptfoo/promptfoo#7974