Inside the AI Team Weekly (May 19th 2026)_transcript

Speaker 1 (00:00): Lots of new features since last week's weekly. For example, we built a feature to instrument and visualize workflows. We also redesigned the agents feed, adding data on quality, from evaluations, and also the ability to select multiple conversations and create a new collection. There is also a lot of ongoing work, like visualizing [00:00:30] exemplars and jumping into conversations, which Ivana is going to demo, and also evaluation rule actions and experiments. I will start with the demo about workflows. When we are building AI agents, we might use frameworks like LangGraph that produce workflows. In that case, you now [00:01:00] have the ability in AI Observability to instrument these kinds of workflows and visualize them. Let me show you an example. If you instrumented a conversation as a workflow, you now have this new tab here, Workflow, and when you click on it, you will see this graph, which represents the workflow. (01:26): Each node represents a step, and each step [00:01:30] may or may not contain LLM calls. For example, for this step you can see the name of the step and the loops. In this case, we are running two LLM calls in this step. You can click here, navigate through each generation, and see on the right side all the details of each generation in the step. You can also see the loops. For example, this [00:02:00] step does not contain any LLM call. It's only logic, but you can see that we have two loops. On the right side, in the details, we can see the relevant information about the workflow step, that is, the state: input state and output state. This graph is really useful during development because, when something goes wrong, you can immediately see which step failed and the cost of each step, [00:02:30] and detect problems very easily. (02:33): And that is the demo for workflows. I think Ivana has another one.
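The step view described in the workflow demo can be pictured with a small data model. This is only a sketch to make the idea concrete: the class names, fields, and sample values below are hypothetical and are not the AI Observability schema or any SDK API. It assumes each step records its name, loop count, input and output state, any LLM generations, and an optional error.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Generation:
    """One LLM call made inside a step (hypothetical shape)."""
    model: str
    cost_usd: float


@dataclass
class WorkflowStep:
    """One node of the workflow graph (hypothetical shape)."""
    name: str
    loops: int = 1
    input_state: dict = field(default_factory=dict)
    output_state: dict = field(default_factory=dict)
    generations: list = field(default_factory=list)  # empty for logic-only steps
    error: Optional[str] = None

    def cost(self) -> float:
        # Total cost of all LLM calls in this step.
        return sum(g.cost_usd for g in self.generations)


def first_failing_step(steps):
    """Return the first step that recorded an error, or None."""
    return next((s for s in steps if s.error is not None), None)


def cost_by_step(steps):
    """Map each step name to the cost of its LLM calls."""
    return {s.name: s.cost() for s in steps}


# Made-up sample data: one step with two LLM calls, one logic-only step.
steps = [
    WorkflowStep(
        name="plan",
        generations=[Generation("model-a", 0.25), Generation("model-a", 0.5)],
    ),
    WorkflowStep(name="route", loops=2, error="missing key in state"),
]

failing = first_failing_step(steps)
costs = cost_by_step(steps)
```

With this shape, finding the failing step or the most expensive step is a single pass over the list, which is the kind of question the graph view is described as answering visually.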
Speaker 2 (02:39): Yeah, I wanted to do a very quick, very small demo. Here I would just like to show how we can sometimes take concepts from, for example, normal observability and bring them to new features and new products like AI Observability. So here I am in the [00:03:00] AI Observability app, and we are using Prometheus metrics to see how many requests there were, what the error rate was, time to first token, and so on. Prometheus has a really cool feature called exemplars. Exemplars are a way to attach a trace ID to your metrics, so you can see the actual events, the actual traces, that contributed to that metric. [00:03:30] This is very helpful when you are, for example, trying to debug a latency issue. Or, for example, here we have a spike in time to first token, so you can see the actual traces, the actual conversations, that contributed to this spike. (03:47): So we have decided that in AI Observability we are going to show these exemplars. And here I would like to say that Assistant is already instrumented, so this came for free. [00:04:00] All Prometheus histograms already have exemplars. I think 5% of metrics have exemplars. And then, with the new feature, we are able to jump to the conversation where the latency or time to first token spiked. We are able to see the trace, to see which generations happened, and then, again, to use Assistant to investigate [00:04:30] what the cause of this elevated value is. So yeah, I thought it's kind of an interesting way to show how we can, I don't know if it's dogfooding, maybe, use features and functionality that already exist all over the place and bring them to users in our new apps as well. (04:56): So this is my short demo. Yes. [00:05:00] Todd, this is amazingly cool. Do we have a Claude Code plugin that sends data to Sigil? Yeah. So if you go to the Sigil SDK, we have plugins, and here are the plugins that we currently support.
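The exemplar idea from the demo, attaching a trace ID to a metric observation so a spike can be traced back to a concrete conversation, can be sketched in a few lines. This is a simplified stand-in written for illustration, not the Prometheus client API: the `ExemplarHistogram` class, its methods, and the trace IDs are all hypothetical. It assumes only some observations carry a trace ID, and that each bucket keeps the most recent one.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Exemplar:
    trace_id: str
    value: float


class ExemplarHistogram:
    """Toy histogram that keeps one exemplar per bucket (illustrative only)."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        n = len(self.upper_bounds) + 1  # final bucket plays the role of +Inf
        self.counts = [0] * n
        self.exemplars = [None] * n

    def observe(self, value, trace_id=None):
        # First bucket whose upper bound is >= value ("le" semantics).
        i = bisect.bisect_left(self.upper_bounds, value)
        self.counts[i] += 1
        if trace_id is not None:
            # Keep the most recent sampled trace for this bucket.
            self.exemplars[i] = Exemplar(trace_id, value)

    def slowest_exemplar(self):
        """Return the exemplar from the highest bucket that has one."""
        for ex in reversed(self.exemplars):
            if ex is not None:
                return ex
        return None


# Time to first token in seconds; only some observations carry a trace ID.
ttft = ExemplarHistogram([0.5, 1.0, 2.0])
ttft.observe(0.3)
ttft.observe(0.4, trace_id="trace-a")
ttft.observe(4.2, trace_id="trace-b")  # the spike
ttft.observe(0.6)

spike = ttft.slowest_exemplar()
```

Here `spike.trace_id` is the handle a UI could use to jump from the metric to the trace or conversation behind the elevated value.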
Speaker 3 (05:22): Perfect. Thank you. So install it as a plugin, and it will track your usage in Claude Code. Right. [00:05:30] This would be cool. Should try that out. But I think we'll call it here, so we can stop and have five minutes to breathe before anything new starts for everybody. Thank you very much, everybody. Nice to see you all. I'll see you around. Take care.