AI Engineer Europe 2026 - day 1

By Judith van Stegeren

Banner of the AI Engineer Europe 2026 conference in London

AI Engineer Europe 2026 took place 8-10 April 2026 in London. It was ai.engineer's first event in Europe. This is a conference report of day 1.

Hallway track

The first half of day 1 was an unplanned hallway track. When it was finally my turn to pick up my badge, the venue was already at max capacity and they couldn't let more people in. Organizer swyx told us they had been surprised by the amount of interest in the workshops, and hadn't counted on so many people showing up on the first day of the conference!

After picking up my conference badge, I went for coffee with my friend Lucas Meijer, and with Joyee Cheung and Ana Rute from the Spanish free software consultancy Igalia, whom I'd met in the queue. We had a very interesting discussion about building debuggers with and for agents. Do coding agents have different UI requirements for debuggers than human users?

I've been using Langfuse for LLM observability and prompt management for the past two years now. Their interface is great for humans, but I'm not satisfied with the Langfuse APIs. Recently I experimented with the Victoria stack to make the impact of code changes more legible to coding agents. Lucas recommended that I also take a look at Aspire by David Fowler.

We also talked about Claude Code, and how it's not the be-all and end-all solution for high-performance agents. Claude Code is super accessible to beginners -- it even has a cute orange ASCII art mascot in the CLI when you start it up! I concluded that Claude Code is more like the friendly "Tamagotchi" gateway to more sophisticated harnesses. Multiple people told me that they had good experiences with the Claude Agent SDK. A good example of working with a custom harness is Anthropic's blogpost on building a C compiler with a team of agents.

Various people recommended the coding agent pi as an alternative to Claude Code, especially for people who like minimalism and hackability. I was impressed by the day 3 keynote by its creator, Mario Zechner, so I'm looking forward to trying it out.

Our Datakami collaborator Duarte O. Carmo was also at the conference. It was great to meet up IRL for the first time and talk shop. ;) Among other things, he recommended I check out the LLM Engineer's Handbook by Paul Iusztin and Maxime Labonne, both speakers at the conference, and the software engineering conference QCon.

We talked about where the generative AI field is going next, and what kind of project requests we expect in the near future:

  • Cleaning up vibecoded codebases at scale
  • Developer tooling and DX work to keep human devs and agents on track
  • Observability in bulk — although I wonder whether this has even been solved for regular software?
  • Deploying software that was created by a team of non-engineers and coding agents
  • Increased need for sandboxing and security principles for agents, as more people start using them

Afternoon talks

In the afternoon, the venue was ready for us, so we could finally go in and attend some talks.

Chris Parsons gave a workshop about Ralph loops. Although I was already familiar with Ralph loops, i.e. agentic feedback loops, the workshop was still interesting because Chris showed how he personally works with Claude. He contrasted Waterfall-style patterns (elaborate specifications, longer feedback loops) with a more iterative way of working (smaller tickets, underspecified work, shorter feedback loops). During the workshop, Chris demonstrated various features of Claude Code I hadn't used yet: experimental feature flags, audio input, claude -p for a Linux-tool experience, and the /loop command. I appreciated that Chris also spent some time on the security aspects. He encouraged people to read Simon Willison's post about the lethal trifecta, and he shared how he hardened the VPS where he runs Claude: a combination of separate keys, specific Claude permissions, and tooling permissions, among other things. He also stores agentic sessions as .json files, and lets an agent review these sessions nightly to find points of improvement.
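The core of a Ralph loop can be sketched in a few lines. This is a minimal illustration, not Chris's actual setup: `run_agent` and `passes_check` are hypothetical stand-ins for a real harness call (e.g. shelling out to `claude -p`) and a real feedback signal such as a test suite.

```python
# Minimal Ralph loop sketch: re-run an agent on the same task, feeding
# failures back into the prompt, until a check passes or we give up.

def run_agent(prompt: str) -> str:
    """Stand-in for one agent run; a real loop would invoke a CLI or SDK."""
    return "fixed the failing test"

def passes_check(result: str) -> bool:
    """Stand-in for the feedback signal, e.g. running the test suite."""
    return "fixed" in result

def ralph_loop(task: str, max_iterations: int = 5) -> bool:
    prompt = task
    for _ in range(max_iterations):
        result = run_agent(prompt)
        if passes_check(result):
            return True
        # Underspecify on purpose: just append the failure and try again.
        prompt = f"{task}\nPrevious attempt failed with: {result}"
    return False

print(ralph_loop("Make the test suite pass"))
```

The point of the pattern is the short feedback loop: the agent's output is checked mechanically, and the failure itself becomes the next round's context.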

"Trust the agents to tell you what they need" was a recurring theme throughout the conference.

Raj Navakoti (IKEA) gave a talk about demand-driven context. This is an approach inspired by Test-Driven Development, where a human gives an AI agent a task it will probably fail with its current knowledge, lets the agent ask questions, and iterates on the required context from there. There is a paper and a related GitHub repo with a toy-example software system that automates parts of this approach.

Although I think the idea has potential, I also see a few problems: LLMs are trained to do information elicitation, so one LLM can ask more questions than 1000 domain experts can answer. Just look at the follow-up questions Claude will always ask at the end of a turn... And although you can successfully create a feedback loop with the DDC approach, there are still quite a few bottlenecks:

  • the coverage of the task that the agent is meant to fail
  • the speed with which the agent can fail the task
  • the quality of the feedback for the agent
  • the quality of the information elicitation by the agent in response
  • the quality of the answers provided by the humans
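The loop itself is simple to sketch. This is my own toy rendering of the idea, not the paper's implementation: `attempt` and `answer` are hypothetical stand-ins for the agent and the human domain expert.

```python
# Sketch of demand-driven context: give the agent a task it will fail,
# collect its questions, answer them, and retry with the gathered context.

def attempt(task: str, context: list[str]) -> tuple[bool, list[str]]:
    """Stand-in agent: fails and asks a question until the context suffices."""
    if "db schema" not in " ".join(context):
        return False, ["What does the database schema look like?"]
    return True, []

def answer(question: str) -> str:
    """Stand-in for the human domain expert (the real bottleneck)."""
    return "db schema: users(id, name), orders(id, user_id)"

def demand_driven_context(task: str, max_rounds: int = 3) -> list[str]:
    context: list[str] = []
    for _ in range(max_rounds):
        done, questions = attempt(task, context)
        if done:
            break
        context += [answer(q) for q in questions]
    return context

print(demand_driven_context("Add an order-history endpoint"))
```

Every bottleneck in the list above maps to one of these stand-ins: how reliably `attempt` fails, how fast it fails, and how good the questions and the answers are.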

This was a great quote though: "The movie Memento will tell you everything you need to know about agents."

Other conference attendees told me they rather liked the Anthropic workshop "How to build agents that run for hours", the workshops on evals, and Matt Pocock's workshop "AI coding for real engineers". I'm also really curious about the "Build your own deep research agent" workshop by Louis-François Bouchard, Paul Iusztin, and Samridhi Vaid. I couldn't attend that one because the room was full. Once the talk videos are online, I will definitely watch these back!

Claude Code meet-up

In the evening I went to a Claude Code-themed side-event organised by the AI For Engineers Meetup and the Claude Code Community London. About 100 people attended this meetup, and there were 9 lightning talks and an open mic slot. Some of the talks were really good:

Jan Peer Stöcklmair from Sentry demonstrated how he used agents to investigate an OOM bug in a Node.js app with Sentry and Cloudflare Workers. Normally, human developers make memory dumps by hand in Chrome while the webapp is running and memory usage is increasing. He automated this with an agent and an MCP server for CDP, the Chrome DevTools Protocol. The agent could directly access Chrome, reproduce the error, save memory dumps, and then analyze the dumps.
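Under the hood, this kind of automation boils down to sending CDP commands to Chrome over a websocket. Here is a rough sketch of the messages involved; the transport is a stub that just records them, whereas a real setup would connect to Chrome's ws:// debugger URL, and the URL and page are made up for illustration.

```python
# Sketch of the CDP traffic behind an automated heap snapshot.
# Page.navigate and HeapProfiler.takeHeapSnapshot are real CDP methods;
# the send() transport below is a stub that only records the JSON frames.
import json

sent = []

def send(method, params=None):
    """Stub transport: a real client would write this JSON to the socket."""
    sent.append(json.dumps({"id": len(sent) + 1,
                            "method": method,
                            "params": params or {}}))

# Reproduce the leaking page, then capture a heap snapshot for analysis.
send("Page.navigate", {"url": "http://localhost:3000/leaky-page"})
send("HeapProfiler.enable")
send("HeapProfiler.takeHeapSnapshot", {"reportProgress": False})

for frame in sent:
    print(frame)
```

In a real run, the snapshot data comes back in chunks via HeapProfiler events, which the agent can then save and analyze.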

Valerii Iatsko (Tealstreet) gave a wild talk about an almost fully agentic trading platform startup that runs 1M lines of production code with a team of agents and 2 human developers.

Daniel Büchele (Figma) talked about using autonomous agents at Figma. Their objective was running Claude agents quickly and easily in parallel, at scale, with fast setup. They use coder.com dev environments, with tmux, and a terminal and VS Code in the browser. They sync .claude settings and memories to these dev environments with git, and they have a hierarchy of Claude files in a monorepo for the entire company. Figma also has an internal Claude marketplace for skills and MCPs.

Rhys Cazenove (Natural History Museum) gave a very good talk about engineering robust agentic systems that hold up in production, with an emphasis on security and observability. What stood out to me is how much can be done with Claude Code hooks.

At the open mic moment of the evening, I went on stage to ask the audience for recommendations for observability frameworks that are easy to use for both humans and bots. Thanks for shouting tool names at me; I'm looking forward to trying Arize Phoenix, Sentry, Opik, and Logfire. One participant mentioned MLflow, which was the only viable alternative for a German company where permissive licenses were a requirement.

At the end of the evening, we had a group chat with those participants who focus on robust AI engineering for production. I asked about everyone's favorite down-to-earth AI influencers. Apart from Simon Willison (of course), people mentioned Armin Ronacher, Thoughtworks (Martin Fowler, Birgitta Böckeler), Ethan Mollick, and Matt Pocock.

Continue reading about AI Engineer Europe 2026 - day 2.

AI Engineer Europe 2026 - day 3 blogpost coming soon!

More like this

Subscribe to our newsletter "Creative Bot Bulletin" to receive more of our writing in your inbox. We only write articles that we would like to read ourselves.