Nicole (00:00:02):
Hi everyone. Welcome to a Grafana Campfire with neither of the hosts that you normally have. So let us introduce ourselves. I'm Nicole van der Hoeven. I'm a senior developer advocate that's filling in for Usman. I'm trying to help him out because he's away this month and I hope that I do him proud. I also invited my friend here, Stephanie.
Stephanie (00:00:30):
Hello.
Nicole (00:00:31):
We totally forgot how you say your last name. I'm sorry.
Stephanie (00:00:34):
It is totally fine. It is impossible to pronounce Stephanie Hingtgen. Hingtgen. Yeah. Yeah. When I changed my last name, the DMV that I went to didn't believe that that was how you spelled my last name.
Nicole (00:00:54):
Yeah, of course they would know more than you.
Stephanie (00:00:57):
Yeah. Yeah, of course.
Nicole (00:00:59):
I get that with my name. I always say H-O-E-V-E-N because people think it's Hoven without the E. I'm like, no, it's my last name. It has two Es. I know that. But anyway, today we're talking about GrafanaCON. GrafanaCON was, oh man, was it just last month? No, it was two months. It was last month. It was last month. Of course it was. It simultaneously feels like it just happened and that it was months ago. Totally. But actually, I would have fewer memories because I didn't go. Stephanie did.
Stephanie (00:01:38):
It was so great. Yeah. I love GrafanaCON. It's such a great event.
Nicole (00:01:43):
So tell us also what you do at Grafana for people who weren't at GrafanaCON because probably they didn't go if they're watching this. What do you do and what did you talk about there?
Stephanie (00:01:55):
Yeah. So for the past almost five years now, I've been working on Grafana itself. So I talked about Grafana 13 and what we released there and yeah, it was great.
Nicole (00:02:06):
Yeah. And I guess we should say that GrafanaCON is our biggest community event, Grafana community event. And it has a strong community focus, so open source all the things. And lots of people are there who just really use Grafana products across the whole spectrum of things to build real things. And I was bummed not to go, but I did have something personal on that same week so I could not do it. But I'm pretty sure that I've watched more talks than probably you have because you were no doubt in the booth.
Stephanie (00:02:48):
Yeah. The booth, and also checking out the science fair. One of the things I really love about GrafanaCON is you have the people who have, like we talked about Google using Grafana, but then also you have the Tamagotchi and you have all these fun, unique ways too. The spectrum of how you can use Grafana is so cool. And just seeing that at GrafanaCON is also really fun.
Nicole (00:03:10):
Yeah. And I really loved, in the keynote, they talked about the history of GrafanaCON and how the first GrafanaCON had 30 people and not even in our own office. It was in Squarespace's office. I didn't realize that they were an early user of ours. And then this last one was the 10th one and had a special thing because apparently Grafana, as in the Grafana app itself, was written in Barcelona.
Stephanie (00:03:39):
Yeah. It's so cool. I didn't know that until GrafanaCON this year, actually.
Nicole (00:03:44):
Yeah. There's some impressive figures, like 35 million users. That's something to be proud of.
Stephanie (00:03:52):
It's insane. It's so cool. And I just loved the community around it too. And that's probably also just why I love GrafanaCON so much is like the people who are there are just amazing. I love our community. They're all kind people and fun people to hang out with. And so it's just a great atmosphere.
Nicole (00:04:10):
Yeah. Well, I guess in this one, in this episode, I wanted to also highlight the announcements here because we're going to go off of this and also add a little bit more detail. This is a really good blog though that was released and it's like if you're wanting just a quick look at all of the highlights of everything that was announced, that's it. But if you want a little bit more flavor and some more detail, we'll go into them here. Well, I also just enjoyed the chance to get to talk to you and call it work. Because you and I, we met in person at a GrafanaCON two years ago, right?
Stephanie (00:04:57):
Yes. Yeah. That was so great.
Nicole (00:04:59):
Yeah.
Stephanie (00:05:00):
GrafanaCON Amsterdam, I think was the one, right?
Nicole (00:05:02):
Yes. Yeah. Yeah. I was there. Yeah. But tell us about Grafana 13 because that's like GrafanaCON. So let's talk about the latest release of Grafana. What are all the new things that you can tell us about?
Stephanie (00:05:17):
Oh yeah. I mean, there's like so many new things so we can dive into a lot of them. But I think one of the really cool things with Grafana 13 is we GA'd a lot of stuff that in Grafana 12 we had marked as experimental and we were super excited about. So we told people about them, but things like Git Sync and Dynamic Dashboards are both now GA and are really powerful, cool new features. We have some new data sources as well and PDC now can use Tailscale. We also have different updates for annotations. I could go on, the list goes on. Where should we start?
Nicole (00:05:55):
Yeah. Well, let's start with Dynamic Dashboards because that was announced, was it at the last one that it was first announced, but now it's GA?
Stephanie (00:06:07):
Yeah. So Dynamic Dashboards, one of the cool things about it is the backing of it. So it's backed by this new dashboard schema. So we have Dashboard Schema V2 and that enabled a whole bunch of features, but also what that means is that we also have versioned APIs on our dashboards, which is very nice. So when you upgrade Grafana, you no longer have to be worried of like, oh, this API is going to change slightly and break different scripts. Lots of people script against Grafana. And so now it's all going to be consistent versioned APIs, which also enabled Git Sync. But yeah, it's really cool.
Nicole (00:06:45):
What makes a dashboard dynamic for people who haven't tried it yet?
Stephanie (00:06:50):
Yeah. So I mean, there's a whole bunch of different features that went into Dynamic Dashboards. So one of the things, we have tabs now, which has been a pretty highly requested feature in Dynamic Dashboards or in dashboards in general. And then within those tabs too, you can now have template variables that are inside the tabs and inside the rows. And you also can have show/hide rules, which is another very highly requested feature that we're super excited to get out. That if, for example, the panel shows no data, you can not show it when a viewer is viewing it and just show them what matters to them. Or if this threshold is hit, then show this panel. So yeah, super cool.
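As a rough illustration of the show/hide rules described above, here is a minimal Python sketch of the idea: hide a panel when it has no data, or show it only once a threshold is hit. This is not Grafana's implementation or its dashboard schema; the `Panel` type, the `should_show` function, and all field names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Panel:
    """Hypothetical stand-in for a dashboard panel; not a Grafana type."""
    title: str
    values: List[float] = field(default_factory=list)  # latest query results
    hide_when_no_data: bool = False      # rule: hide the panel if it has no data
    show_above: Optional[float] = None   # rule: show only if a value exceeds this


def should_show(panel: Panel) -> bool:
    """Decide whether a viewer should see the panel."""
    if not panel.values:
        # With no data, a threshold rule cannot be met either, so hide in both cases.
        return not (panel.hide_when_no_data or panel.show_above is not None)
    if panel.show_above is not None:
        return max(panel.values) > panel.show_above
    return True


panels = [
    Panel("Errors", [], hide_when_no_data=True),
    Panel("CPU saturation", [0.42, 0.97], show_above=0.9),
    Panel("Requests", [120.0, 135.0]),
]
visible = [p.title for p in panels if should_show(p)]
print(visible)
```

In this sketch the empty "Errors" panel is filtered out, while the other two stay visible to the viewer.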
Nicole (00:07:35):
I feel like I'm kind of worried that this will lead to my current state of my browser tabs too, because initially it's like, "Oh, great. I have so many Grafana dashboards that are just like, I'm scrolling through things." And there are sections within the single page of dashboards and now I can have tabs. I'm like, great. Oh, wait, now I want 10 of them or something.
Stephanie (00:08:00):
That's fair. Yeah. I think a nice part, if you can organize tabs more around personas, that can be really helpful. So the dashboard can be more of a product view of what's going on in the product and then the tabs can have like, this is what is useful for the SREs, this is what's useful for managers to look at and different things, but then you're all on the same dashboard and only having to keep one dashboard up to date.
Nicole (00:08:27):
One of our Grafana champions, Roxana Turc, was particularly interested in this. She made this career dashboard. It's on the Grafana YouTube channel and it was like monitoring her career.
Stephanie (00:08:42):
Wow. What is she monitoring? That's cool.
Nicole (00:08:45):
Yeah. She was using the Infinity data source to pull in stuff from her GitHub and say, this is how many commits she's done to different repositories and different languages. And I think she was asking to try and improve that and have different tabs. And I just thought, what a cool idea. Why don't I have my CV on Grafana?
Stephanie (00:09:09):
Yeah. Is the dashboard JSON available? That sounds pretty cool.
Nicole (00:09:13):
Yeah. It's publicly available. Yeah. I'll find it while we're talking. But yeah, I'm excited about the Dynamic Dashboards, but also about the Graphviz panel. Can you tell us a bit about that?
Stephanie (00:09:29):
Yeah. So we introduced a new Graphviz panel. It's really neat. You can have different components that are within that panel and then show how data flows between them. It's super powerful and we've been using it internally as well and have found a lot of use out of it. So cool. Yeah.
Nicole (00:09:51):
So I've just posted the link to her dashboard, to her career monitoring dashboard. But I remember Graphviz, it was actually from a hackathon, right?
Stephanie (00:10:03):
Yeah. Yeah. I'm pretty sure.
Nicole (00:10:05):
I remember talking to them about it because it used to be called Mesh, I think.
Stephanie (00:10:11):
Okay, nice.
Nicole (00:10:13):
Yeah. It's really great because before that, what we had was Canvas.
Stephanie (00:10:20):
Yeah. The Graphviz panel is definitely very powerful and something also that has been highly requested internally as well. So that's been neat to see it dog-fooded quite a bit within Grafana.
Nicole (00:10:35):
Yeah. I haven't tried ... Well, I played with an earlier version of it, but I haven't played with an official version, I guess. But I especially like that they had chosen the DOT language, which was the first time that I'd really understood what it was good for. But Terraform outputs graphs in DOT format. And so there's that nice integration there where you can just have Terraform create those visualizations for you.
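For anyone who has not seen DOT before, here is a minimal Python sketch that builds a small directed graph as DOT text. The service names and the `to_dot` helper are made up for illustration; only the `digraph { "a" -> "b"; }` syntax comes from the DOT language itself. Terraform's `terraform graph` command emits text in this same format.

```python
from typing import List, Tuple


def to_dot(name: str, edges: List[Tuple[str, str]]) -> str:
    """Render a list of (source, target) pairs as a DOT directed graph."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        # Quote node names so that spaces or dashes in them stay valid DOT.
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical service topology, purely for illustration.
edges = [("frontend", "api"), ("api", "database"), ("api", "cache")]
dot_text = to_dot("services", edges)
print(dot_text)
```

The resulting text can be pasted into any tool that understands DOT.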
Stephanie (00:11:07):
That's really awesome. Yeah. Super powerful.
Nicole (00:11:10):
What is your favorite feature that we announced at GrafanaCON?
Stephanie (00:11:16):
I mean, I guess I'm partial to Git Sync. I have worked on Git Sync a bit too.
Nicole (00:11:21):
Oh, okay. Tell us about it.
Stephanie (00:11:22):
Yeah. So we had launched Git Sync as experimental in Grafana 12. And so then in Grafana 13, we did a lot of hardening around it to make it GA and it's really cool. So before we had an as-code story, but basically you just had to put dashboard JSON into a file and then it would sync to Grafana and then you couldn't update it in the UI, but the easiest way to create a dashboard is in the UI. And so that was a point of contention for adoption with that. So with Git Sync, you can update it in the UI, edit it, and then you can create PRs or push directly to main within the UI and then have everything backed by Git. And yeah, it's really cool. And we're currently working ... I know a lot of people are going to ask if alerting is coming. That's been the most requested feature.
Nicole (00:12:16):
Oh, really? Okay.
Stephanie (00:12:16):
Yeah. So that is coming. We're hoping by the end of this year that will be available, but stay tuned.
Nicole (00:12:26):
We have a question about Dynamic Dashboard, sorry, Git Sync from one of our Grafana champions, Vaniash Vaghela. I hope I'm pronouncing that name right. How does the dashboard's delete/restore functionality work with the Git Sync feature? Do both actions result in a PR or commit to main? I don't know if that's something you could answer.
Stephanie (00:12:51):
Yeah. Yeah. So the dashboard restore functionality is disabled for Git Sync. For Git Sync, you'd want to look instead at the commit history and use that to restore the dashboard. But they are currently working on having something that's more native to Git Sync where you can look at the Git history and say, restore to this version, but that is a work in progress.
Nicole (00:13:15):
Okay, cool. How about Interactive Learning? Wow, there's just so much that was announced. I don't know how we're going to get through all of this. Okay, next one. Have you looked at Interactive Learning?
Stephanie (00:13:28):
I haven't gotten to experiment with it too much. I'm pretty sure ... is that where you get to go through the different steps of new features?
Nicole (00:13:37):
Yeah, it's super cool. It actually came from my team. I didn't work on it, but a few of the other advocates did and it's like the question mark now, if you click on it, it has tutorials that are more than your average tutorial. It's not just like do this and then do that. It still has that, but then there's also buttons there that say like, "Show me how to do it." So it'll then go and drive the actual Grafana app and say like, "See this part?" And it highlights it. This is the part where you click and then you can also have it do it for you and it takes you to different pages and it's like, now here's where you type a query and it can type it for you. So basically it's like if you've ever followed a tutorial and you've been completely lost because you didn't do exactly what it was, you can have it do it for you and you're like, "Oh, that's how." And it's really good.
(00:14:34):
I use it to learn features that I didn't know about.
Stephanie (00:14:38):
Yeah. I mean, we launched quite a bit of things, so that is helpful to have tutorials that are actually showing how to do it.
Nicole (00:14:44):
Yeah. And real humans are generating these and so it's more of a guided experience. I'm actually working on, I've been working on it forever. I kind of set it aside, but I will get back to it. I need to put one in for LogQL, so how to write LogQL queries and building it up every step of the way. I'm really proud of that. I'm proud of the work that my colleagues have done.
Stephanie (00:15:11):
Yeah, that's awesome. That sounds really helpful.
Nicole (00:15:14):
What about saved queries?
Stephanie (00:15:17):
Yeah. Saved queries. I mean, I guess as you're learning LogQL, it can be really helpful to have people who have been using Grafana a lot be able to say, "Here is this query that we run when we're trying to figure out what's going on with your service," and then save it. And then all of the teammates, everyone in that org can go ahead and view that saved query and rerun it. So it just is a way to be able to have those Grafana experts in an organization be able to give really good queries that they're using and that have been optimized and have teams onboard to Grafana quicker.
Nicole (00:15:55):
Yeah. Anything that saves people from writing their own queries is a good thing. It's hard. No one talks about the overhead of having to know multiple query languages just to get your data. It's significant. And also for reproducibility, that's so good because someone might be like, "Oh, that's not what I got." It's like, "What query did you run? Oh, okay, you did it that way." Now it's just like, "No, that is the query that we use to find out this thing."
Stephanie (00:16:24):
Yeah, definitely. It's a lot. I mean, it's fantastic that we have all of the different data sources that you can view within Grafana. Then on the harder side of that is that there's different querying languages for all of those data sources and learning that can be a pretty big curve. So yeah, that will help a lot with saved queries.
Nicole (00:16:44):
Yeah. What about Grafana Marketplace? What's the status on that?
Stephanie (00:16:49):
Yeah, so that was announced that we're getting interest in it. There's a form, I believe. I don't know if you have the link to that form, but if you're interested in it, basically if you want to create a plugin and then sell it on Grafana Marketplace, then you can get in partnership with Grafana and we're still figuring out all of the different details on that. So there was a form that was like a QR code that was shared in the Grafana 13 deep dive talk that if you are interested in that, fill out that form and we'll get in touch.
Nicole (00:17:27):
I will link to the entire blog post and the form is further down in the getting involved part where you can apply, but here is the link in the comments here where you can apply. That's great because it's not just for us, it's also helping plugin developers get something for their money, for their work. I don't believe in the whole starving artist thing. Just because we're open source doesn't mean we don't have mouths to feed and we don't have bills. Yeah, absolutely. I'm all for that. What else are we missing from the Grafana 13 perspective? Dashboard templates.
Stephanie (00:18:09):
Yeah. So there are a few different, I guess, related pieces of functionality here. So there's dashboard templates and then there's also, what are we calling those now? Suggested dashboards. So dashboard templates are ones that we have created that follow different industry standards to try to get you up and running quicker. So when you're looking at that blank page of like, how am I supposed to monitor my system? You can use dashboard templates to try to have an industry standard way of doing that. And then we also have suggested dashboards, which go into the data source itself. So if you have a new Prometheus data source, we have ones that are from the community that have been really well used throughout the community and we'll suggest like, here's this dashboard that a community member has made for Prometheus or for Loki, again, to just not have you have to deal with the blank page of a dashboard for so long and have something to start with.
(00:19:17):
And the other neat thing that you can then do with that is it'll give you a starting point, but then there's different things you're probably going to want to tweak. And then if you go into Assistant, it can also do that for you. And so it can just get you a dashboard significantly faster.
Nicole (00:19:34):
Yeah. Honestly, I've spent some time learning the query languages and then just when I felt like, yeah, I'm functional. I can do most of the things that I want to do without looking stuff up, then it's like, oh, but actually Assistant just gets me there faster without having to learn that.
Stephanie (00:19:56):
Yeah. I mean, I've been working in Grafana for five years now and I had used a dashboard template and then had Assistant improve it for me after that just last week and it was significantly faster than even I could do after being in Grafana for so long. Yeah.
Nicole (00:20:14):
I think this is especially good for people who might not know, like they have, maybe they've already instrumented their stack but don't quite know the industry standards for things or might miss a panel. And it's like, you just want those decisions taken care of for you, just give me the best practices dashboards for this sort of thing.
Stephanie (00:20:38):
Yeah. Yeah. And then you can totally add to it as well and go from there. It's nice to have a starting point and then you start looking at it and you're like, "Okay, what data am I missing? What am I needing?" And then you can tweak it from there with Assistant and it's great.
Nicole (00:20:54):
We have a question from the same Grafana champion as the previous one, Vaniash, and he asked about the upcoming changes to the orgs or teams and how that affects provisioning and permissions. And I didn't get a chance to ask him for more context, but I believe that he's talking about the SCIM changes.
Stephanie (00:21:19):
Oh, nice.
Nicole (00:21:19):
Do you know anything about that?
Stephanie (00:21:21):
Yeah. So SCIM, which stands for System for Cross-domain Identity Management.
Nicole (00:21:27):
Apparently. That's how little I know about it.
Stephanie (00:21:32):
It's a really neat feature. So previously what you had to do to set up new users in Grafana is you had to have the user log in first and then the user would be created and then you could add them to teams and orgs and do things from there. But if you have an identity provider, that is not great, like if you're trying to onboard a new person to the company, usually you want them all set up on day one so they could just hit the ground running. So with SCIM, as soon as they're within the identity provider, they'll be synced to Grafana and exist there and you can add them to whatever teams you want. You can add them to IRM workflows, you can add them to anything. Yeah, it's super helpful.
Nicole (00:22:14):
How does that work in terms of permissions? So what do they get by default?
Stephanie (00:22:20):
So SCIM solely does the user and team life cycles and then permissions are a separate thing that you can handle with Terraform or different things like that. If you add a user to a team and the team has specific permissions, then the user will automatically get those permissions.
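The split described above (SCIM handles who exists and which teams they are in, while permissions hang off teams and are managed separately) can be sketched in a few lines of Python. This is only an illustrative model, not Grafana's data model or the SCIM API; the team names, user names, and permission strings are all hypothetical.

```python
from typing import Dict, Set

# What the identity provider sync would own: users and their team membership.
team_members: Dict[str, Set[str]] = {
    "sre": {"alice", "bob"},
    "managers": {"carol"},
}

# Managed separately (for example with Terraform): permissions attached to teams.
team_permissions: Dict[str, Set[str]] = {
    "sre": {"dashboards:edit", "alerts:edit"},
    "managers": {"dashboards:view"},
}


def effective_permissions(user: str) -> Set[str]:
    """Union of the permissions of every team the user belongs to."""
    perms: Set[str] = set()
    for team, members in team_members.items():
        if user in members:
            perms |= team_permissions.get(team, set())
    return perms


# A new hire appears in the identity provider and is placed on the SRE team.
team_members["sre"].add("dave")
dave_perms = effective_permissions("dave")
print(sorted(dave_perms))
```

Nothing in the sync step touches `team_permissions`; the new user simply picks up whatever the team already had.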
Nicole (00:22:39):
Okay. So it doesn't change the permissions at all. It just makes it easier to log in basically to Grafana. It's like already having a login.
Stephanie (00:22:49):
Yep.
Nicole (00:22:51):
Okay. All right. That's cool. Okay. Thank you for that. I had no idea how to answer that question. We have a question here in the chat from Matt Fine who asks, any idea when Grafana 13.0.2 might be released? Seems like there might be a lot of bug fixes related to Git Sync in that release.
Stephanie (00:23:12):
I am not sure on when the ... I can check. I think it's supposed to be at the end of this month. Let me see. 13.0.2.
Nicole (00:23:20):
But it's like May 28th.
Stephanie (00:23:23):
Yeah. I think
Nicole (00:23:24):
Stephanie said that she's staying as late as she can until she gets this release out tomorrow. You heard it here first.
Stephanie (00:23:37):
No one wants me to be the one to release it. Yeah. We were supposed to have it at the end of May. It looks like it's been a little bit delayed, but it'll be released in the coming weeks. But yeah, the Git Sync team has been doing a lot of bug fixes recently, so we'll be good.
Nicole (00:23:57):
Nice. Okay. There's also another question from Vaniash. I really love how active our champions are. He says lately ... Whoops, I don't know what just happened to my camera, but I'm here. I promise I'm still here.
Stephanie (00:24:13):
I can hear you at least.
Nicole (00:24:14):
My audio, right?
Stephanie (00:24:16):
Yes.
Nicole (00:24:17):
Okay. Well, while I'm sorting that out, Vaniash asks, lately there have been a lot of changes happening to the UI, like the new query panel. Do we expect this trend to continue? Does an individual user have an ability to switch between the old and new interfaces or is it going to be tied to the feature flag or Grafana instance version only? So like a system-wide change.
Stephanie (00:24:41):
Yeah. So they are system-wide changes that are ... Yeah, it's a feature flag or Grafana version when we make the feature flag defaulted on. But for dashboards in particular, the Dashboard Schema V2 was a culmination of a ton of different things that we've been wanting to change and wanting to address, but we couldn't until we had this schematized API. So this would be the big overhaul that we had. But welcome back. Good to see you.
Nicole (00:25:15):
I totally do this for a living.
Stephanie (00:25:19):
You do it well.
Nicole (00:25:20):
Yeah, just a cable had come loose. That just happens sometimes.
Stephanie (00:25:27):
For sure.
Nicole (00:25:28):
Okay. So thank you, Vaniash, for your questions and just in general for being a Grafana champion.
Stephanie (00:25:36):
Yes, definitely.
Nicole (00:25:36):
Shall we move on? Shall we move on to Grafana Alerting?
Stephanie (00:25:41):
Yeah, let's do it.
Nicole (00:25:42):
Okay. So what I understood was they're now moving because previously it had been quite segmented. There's like a Prometheus Alertmanager or Mimir Alertmanager, Loki Alertmanager and we have separate teams for each of those products at Grafana or OSS projects as well. And so they've kind of done their own thing, but on Grafana Cloud or Grafana, when you're using all of those things, then it can be kind of difficult because you're managing multiple things. So now we're doing this unified alert management system tool and it just rolls up all of those into one. So it should make it a lot easier because you shouldn't have to care whether you're using the Prometheus Alertmanager or the Mimir one. It should just be alert and they should know about each other and they should be able to integrate with each other. Have you gotten a chance to play around with this at all?
Stephanie (00:26:45):
Yeah. Internally we have migrated over to that and it's been really nice to have that unified experience in comparison to before.
Nicole (00:26:54):
Yeah. And Vaniash's question for this one is, will Mimir Alertmanager continue to be supported? And I actually reached out to the Mimir team to ask about it. So it is still available now and it will still be supported, although we're pushing users to move to Grafana Alerting, the unified one, now. There's still going to be some exceptions and we know that, so we're not just going to be like, "Bye, no, now you have to switch." And if we hear anything about it, we will announce it well in advance. The Grafana Alerting engine covers everything that the Mimir Alertmanager did, so you're not missing anything, and there's also a one-click import wizard. So if you do want to migrate, that's maybe easier than you think. So I would recommend that you give it a try. Clearly the direction is towards consolidating because we just don't want the operational burden of maintaining these multiple ones, not just for the people that are building the product, but for the people that use it too.
(00:28:02):
It's just very confusing to have disparate sources.
Stephanie (00:28:07):
Yeah, definitely.
Nicole (00:28:09):
So let's move on to one of the biggest topics I think in this GrafanaCON more than previous years for sure, which was AI. It was AI, all the things. What did you think about that?
Stephanie (00:28:22):
I was super excited. I think what we announced at GrafanaCON of the Assistant being available everywhere is a huge deal and just really cool. And I also love that we're sticking to how Grafana does things and that you can use things in open source, that we are making it actually useful. Yeah, I'm proud of that.
Nicole (00:28:48):
Yeah. I've recently moved over to the AI team and I'm still new. I feel like when I talk to people on the team, I'm like, "Oh my God, I know nothing about AI." My uses for it are so primitive compared to theirs. And then I talk to a normal person and they're like, "Wow, you are so much more advanced." Okay, okay. AI is just moving really quickly. We're all scrambling. Let's be honest. Things are happening right and left and for sure here it's like lots of exciting things that are happening all the time. Assistant Everywhere, which Stephanie mentioned: Assistant started out as just a chatbot but has gained agentic capabilities now, and it's so good that it can not just create dashboards, but also answer questions about your infrastructure. All of the data sources are now being supported one by one by Assistant.
(00:29:53):
I have a video actually where I go, this is from a talk. I went to a meetup and it was kind of last minute and I had nothing to speak about and I had no slides. Well, I always have something to speak about, but I said, "You know what? I'm going to spend the 30 minutes going from absolutely nothing on a Grafana instance, on an OSS Grafana instance, to being able to visualize data from a video game, an online public video game API, and have full dashboards." And it worked. And no one was more flabbergasted than me. This is not fake anymore. This is not marketing anymore. It's actually there. We're already past that point of being actually useful. It's like, of course, of course it's useful.
Stephanie (00:30:44):
Yeah. It's so nice. The AI team has definitely been very focused on making sure that AI slop is not a thing that we produce and it shows in the Assistant. I've used it too. And as I was saying earlier, I've done a lot of dashboarding and it still made me faster and still made it a really good dashboard. So that was really nice.
Nicole (00:31:07):
Yeah. So Assistant has been available for Grafana Cloud. And we should probably say that Assistant itself is not open source, but you can hook up your Grafana open source instance to Assistant. And it's a little bit counterintuitive because you still need a Grafana Cloud account and you have to link the two of them, your Grafana OSS one to your cloud one. And that's because we also have AI costs and we can't just be like, "Free AI for everyone, unlimited." So hopefully most people understand that it's not unlimited for us either.
Stephanie (00:31:50):
Yeah. I know the AI team has spent a lot of time for different tasks, figuring out what models are best for those tasks. And so also with bringing your own model, that's not something that we're supporting because we have really tried to optimize what agent should be used for specific tasks and making sure that it's going to do well and be actually useful.
Nicole (00:32:12):
Yeah. And I think this one was laying a lot of the groundwork for that. So we already mentioned Grafana OSS and self-hosted, but also MCP and GCX. So there's a lot of different ways to use AI. It's like, do you use it on Grafana Cloud or do you use the MCP server way or do you use the CLI way? And in Grafana typically for these questions of which way do we choose, the answer is yes.
(00:32:44):
So we're doing all of the things and there are different ... It's the same team that's working on all of these and nobody's particularly attached to one outcome. I think that's the best way to do it is just let's see what sticks and what is actually useful. Oh, we've even gotten a comment. Martine1674 says, love the MCP. Yeah. You might also want to try GCX if you haven't. GCX is like the CLI tool now. Every time I check, things are being added to it and now it's like Loki's on there and I don't know if k6 is on there yet, but you're able to programmatically control a lot of Grafana now from whatever tool you're already using.
Stephanie (00:33:40):
Yeah. Yeah. GCX is super helpful. If you're an on-call engineer, I highly recommend using it. I've been using it and it's very nice.
Nicole (00:33:51):
And also on what you were saying earlier of how proud you are that the AI team is also not just like, "Yeah, AI is always good in every situation." Actually, a lot of effort is being spent towards making sure we can trust AI. So AI Observability is also in public preview. Oh, you don't understand, people who saw it in GrafanaCON don't realize how quick the turnaround was.
Stephanie (00:34:18):
So quick. It was insane.
Nicole (00:34:20):
That was a hackathon project. I don't know, a month or two. Yeah. Yeah. So it went from nothing to something and the reason that it was able to iterate that quickly was because we needed it. That's true. The AI team needs observability into the Assistant that they developed. And so it's just something that we found ourselves needing. So if we need it, then probably other people do as well. Have you played around with this yet?
Stephanie (00:34:55):
I haven't gotten to play around with it yet. I have talked to quite a few people on the AI team of how they're using it and it is really neat. When we have our own itches that we want to scratch out of the hackathons and then it's like, oh wow, this is actually useful for us. And so we can go ahead and give that to other people. And I think that's how we can build really cool stuff that people want to use is if we also want to use it. And yeah, it's nice.
Nicole (00:35:22):
Actually on the week of GrafanaCON after that was announced, I also did another stream, The Context Window, which is like the AI sort of community call, except it's not just about ... Assistant is not open source. It's not a community call, but we did talk about AI Observability and to prepare for it, I decided to instrument my own demo AI app, which was like a chatbot. It was like a D&D chatbot. Nice. That's awesome. And it was surprisingly easy. You can still do manual instrumentation and depending on your use case, you might still have to, but it was surprising that during the setup, instead of having just like a list of things to import or whatever, it was prompts too like, here's one for Claude Code, here's one for Cursor. I was like, wow, this is a new world.
Stephanie (00:36:17):
Yeah. It's crazy how much everything changes from day to day even.
Nicole (00:36:23):
We have a question here from Martine again: "If doing your own AI inference, any plans to export your TTFT, prefill, decode, tool calling, et cetera, towards Grafana?" Yeah. Hey, that's what we're talking about. TTFT is time to first token. I'm sure AI Observability does that already natively. The others I'm not sure about, but you can still instrument them to be exposed. Tool calling, yes. There's actually a lot more. There are conversations as well and system prompt analysis is surprisingly useful and that's something that we've seen in Assistant too because our system prompts have to change. Every time we use a different model, we kind of tweak it slightly. There's so, so much to do there. If you haven't tried AI Observability, Martine, I think you're going to like it. There's a lot. There's also evaluators. Oh, just watch The Context Window. I think it's the last episode and it was all about AI Observability, full demos.
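For readers who want to see what "time to first token" means in practice, here is a minimal sketch of measuring it by hand around a streaming response. It is not how AI Observability collects the metric; `fake_token_stream` and `measure_stream` are hypothetical names, and the stream is simulated with sleeps rather than a real model.

```python
import time
from typing import Iterable, Iterator


def fake_token_stream(tokens: list[str], first_delay: float, gap: float) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    for i, tok in enumerate(tokens):
        # The first token is slow (prompt processing); later tokens arrive faster.
        time.sleep(first_delay if i == 0 else gap)
        yield tok


def measure_stream(stream: Iterable[str]) -> dict:
    """Consume a token stream and record TTFT and total time, in seconds."""
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in stream:
        if ttft is None:
            # Time from request start until the first token arrives.
            ttft = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    return {"ttft_s": ttft, "total_s": total, "tokens": count}


stats = measure_stream(fake_token_stream(["roll", " for", " initiative"], 0.05, 0.01))
print(stats)
```

The difference between `total_s` and `ttft_s` is a rough proxy for the time spent generating the remaining tokens, which is one way to start separating the phases the question asks about.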
(00:37:38):
I learned live. It was the best way to do it, to learn how to use it for sure. And also on the trust side, there's also o11y-bench, which is a new benchmark that we also came out with. Oh my God, so many.
(00:37:59):
Yeah, so many things. So that's a new benchmark that we've got. There are a bunch of general AI benchmarks and they're useful but also limited because the models at the time that they were released, it meant something because if one model got like 70% and another one got 90, like 20% is a pretty big difference. And now if they're all like 95 versus 96%, what really does 1% mean? And that can mean that a lot of these benchmarks are of limited use because they don't really tell you, they don't help you make a decision about which model to use. So our approach, and we've had to do this internally because we also want to know which model is the best, is to make it more specific. So now o11y-bench only benchmarks models on observability use cases in particular. So it's much narrower because a general benchmark will also ask models about historical facts and that's not really that useful for our use case anyway.
(00:39:14):
o11y-bench is a public thing, it's open source and also seeking contributions. One cool thing is that there's a leaderboard where you can run some tests against some models, and I'm really excited to see where the open-weight models are. So it's still like Opus at the top, frontier models first, of course. But the open-weight models like Qwen 3.6 are getting up there and Qwen 3.6 is something I run locally on my MacBook. I don't have to have a data center to run it. So I'm really excited to see those kinds of open-weight models climbing the ranks.
Stephanie (00:39:57):
That's super cool. So like at first when we had o11y-bench, was it like lower down and now it's been like getting better and better? Or how have you seen it change throughout?
Nicole (00:40:06):
Yeah. Well, Qwen 3.6 just came out like I want to say two months ago or something. So yeah, I feel like previously open-weight models were not really great unless you were doing a weird thing where you're buying a bunch of Mac Studios and then daisy-chaining them together and then you run some huge model. But Qwen 3.6 and Gemma 4 are now legitimately good and you were talking about moving towards bring your own models earlier. I think this is a step towards that because the problem with using different models for Assistant is that people might use some really tiny one and then have a very bad experience with Assistant, but it's not because of Assistant, it's because of the model. But if we have a benchmark like this, we can just run these tests and present the user with all the information that they need to decide.
(00:41:05):
They might decide like, okay, I see that this tiny model is not great, but I still want to use it, but at least they're doing it intentionally. They're not just like, "Oh, this whole thing doesn't work."
Stephanie (00:41:16):
Yeah, that is really cool. I think AI, one of the harder parts is the observability behind it and figuring out what it's doing and why it's doing that. And so with o11y-bench, and then also with AI Observability, that's really nice.
Nicole (00:41:31):
Yeah. And then there's also ChatOps stuff. So Assistant integrating with Slack and Microsoft Teams now, because internally we use Slack. So there are some channels that have emerged that are purely interacting with bots. That's really good for investigations and also just sometimes someone will go into the team channel and be like, "What is the status of this and this?" And then people will be like at whatever they've named their bot, "Could you find this out for a person?" And it just helps disseminate knowledge faster. And unlike when you are just using Assistant in Cloud or even GCX or something, only you can see that. So having it on a shared channel really changes the dynamics of the whole thing because it's like you're all prompting it and you're all part of that conversation.
Stephanie (00:42:27):
Yeah. And you had mentioned Assistant Investigations as well. That's something that I've personally found super helpful too. What you can set it up to do with Grafana alerting too is kick off the investigation. So you get paged and like before you're even on the computer, the agents are already looking at what might be going on. And then you get on and you can look and be like, "Okay, here's the summary of what it thinks it might be." And then it all has links and stuff and then you can click into that and then it gives you a chat page where you can start talking to it and be like, "Well, have you looked at this? Have you looked at that?" And then also be looking at different things. And yeah, it speeds up your mean time to resolution quite a bit. It's nice.
Nicole (00:43:07):
Yeah. So much is being done with that now. We just started a thing in the AI team now where I'm actually editing the real team meetings down where we have demos and we're releasing that every week. Oh, cool. Yeah. It's like very much building in public. Almost nothing that you see there will make it in that exact form to production. Might not even make it at all, but it's there and there's a record of what we're working on and the things that are coming for investigations. I'm so excited.
Stephanie (00:43:42):
That's awesome. Is that on this YouTube channel or where can you find that?
Nicole (00:43:45):
Yeah, it is. It's like the Inside AI Team Weekly. I condense a one hour meeting into a five minute thing of all demos.
Stephanie (00:43:57):
That's really cool. I've heard that the AI team meetings are really cool because it's just a ton of demos and I'll have to go watch those. That's exciting.
Nicole (00:44:05):
Yeah. And I use AI to do it too. I'm also experimenting with AI video editing tools.
Stephanie (00:44:10):
Oh, nice. That's awesome. What tools are you using?
Nicole (00:44:13):
Descript.
Stephanie (00:44:15):
Cool.
Nicole (00:44:17):
Also databases, all the databases. The theme for the databases is they're a little bit easier because one thing that I love about Grafana is it's very bottom up. So there are different teams and each team really gets to decide for themselves how they implement something or how they tune it. And there was no direction of like, you must do this one thing and yet magically four teams just converged on the same solution. So there isn't like an official name for the entire project because there is no one project that dictated what must be done, but I kind of call it in my notes, just for me, I call it Project Yggdrasil just because of all of the Norse stuff like that. In my notes, it's Project Yggdrasil. And Project Yggdrasil's goal is to reduce the total cost of ownership because at a certain scale it's just prohibitively expensive to have this much data ingested and query it and everything, and also to improve availability.
(00:45:30):
And those two things have been intentional focuses because these changes actually come with some trade-offs. They come with higher write and read latency, but we've kind of made the decision that the reduction in cost and the increase in availability is more than enough to make up for that. Different teams have implemented it in slightly different ways, but the idea is that whereas it used to be very ingester heavy and that became kind of a bottleneck, now they're decoupling the read and write paths, usually through something like Kafka, some sort of Kafka-like solution in there. For availability, that's really good because if the read path is down, the write one is not affected. And also to improve on ineffective replication: we used to have a replication factor of three, and that's changing slightly across the databases, and we're also reducing the multi-availability zone costs. This has meant a huge rearchitecture of every database.
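The decoupling idea can be sketched in a few lines. This is a toy model, not the architecture of Mimir, Loki, Tempo or Pyroscope: `DurableLog` is a hypothetical in-memory stand-in for a Kafka-like log, and `ReadPath` is a hypothetical consumer that builds a queryable view at its own pace.

```python
from dataclasses import dataclass, field


@dataclass
class DurableLog:
    """In-memory stand-in for a Kafka-like log (hypothetical, for illustration)."""
    records: list = field(default_factory=list)

    def append(self, record: str) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def read_from(self, offset: int) -> list:
        return self.records[offset:]


@dataclass
class ReadPath:
    """Builds a queryable view by consuming the log from its own offset."""
    log: DurableLog
    offset: int = 0
    view: list = field(default_factory=list)
    up: bool = True

    def catch_up(self) -> None:
        if not self.up:
            return  # a read-path outage does not touch the log
        new = self.log.read_from(self.offset)
        self.view.extend(new)
        self.offset += len(new)


log = DurableLog()
reader = ReadPath(log)

log.append("sample-1")
reader.catch_up()

reader.up = False        # simulate a read-path outage
log.append("sample-2")   # writes still succeed
log.append("sample-3")
reader.catch_up()        # no-op while the reader is down
stale_view = list(reader.view)

reader.up = True         # reader recovers and replays from its saved offset
reader.catch_up()
print(stale_view, reader.view)
```

The sketch also shows the trade-off mentioned above: the view is stale until the reader catches up, which is where the extra read latency comes from.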
Stephanie (00:46:53):
So much work. Amazing. The scale of it.
Nicole (00:46:53):
Yes, so much work. And it's not just that. It's not just the read and write decoupling because in order to take advantage of that, almost all of them have had to rewrite the query engine as well to some degree, just to take advantage of the gains, like the increased availability and reduced costs of that. So now we have the Mimir Query Engine, Loki Query Engine, Tempo has a streaming query engine and now per-attribute Parquet dictionaries, and then Pyroscope has stateless query backends. Pyroscope is the one that's kind of doing things a little bit differently. The others are using Kafka or WarpStream or something like that, but Pyroscope built it in so it's just part of Pyroscope now, but still the same sort of idea.
Stephanie (00:47:47):
And if I'm correct, also the other ones we can still run in like a single binary mode, right? You don't need to run Kafka in order to ... Cool.
Nicole (00:47:56):
Yeah. I'm really excited about that, that they didn't just cut and run and be like, sorry, open source users. Nope, you still can. You're still within the ecosystem, which is great.
(00:48:11):
Loki is a little bit different because it also has data objects, which is a columnar storage format now. This is something that has been asked for because Loki is good at some things and not so much for others and the analytics use case it wasn't so good at, and that use case, those kinds of queries that are aggregating across a column really required some other type of storage and that's data objects. And we also acquired a company called Logline, which is a guy named Jason Nochlin who managed to solve this needle-in-a-haystack issue that we've been plagued by, that the Loki team has been plagued by. I say we, because I used to work on the Loki team, but I'll just say we for everything that I've worked on. But the needle-in-a-haystack query is like when there are so many logs, how do you find just one log out of like terabytes of logs?
(00:49:14):
And that's a super hard thing. We've tried many things, bloom filters, but they're only great for structured metadata, and Jason Nochlin had a solution for it and it has led to an object-storage-native index just to be able to account for these queries. So happy to see that innovation there.
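To make the bloom filter part concrete, here is a minimal sketch of how a per-chunk bloom filter lets a query skip chunks that cannot contain a given structured-metadata value. It is a generic illustration, not Loki's implementation and not the Logline approach; the class, the chunk names and the sample values are all made up.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter: may return false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical chunks, each holding a few structured-metadata values.
chunks = {
    "chunk-a": ["trace_id=abc123", "user=alice"],
    "chunk-b": ["trace_id=def456", "user=bob"],
}

filters = {}
for name, values in chunks.items():
    bf = BloomFilter()
    for value in values:
        bf.add(value)
    filters[name] = bf


def candidate_chunks(needle: str) -> list:
    """Return only the chunks whose filter says the needle might be present."""
    return [name for name, bf in filters.items() if bf.might_contain(needle)]


hits = candidate_chunks("trace_id=def456")
print(hits)
```

This works for exact values such as a trace ID, which fits the point about structured metadata. It does not help with arbitrary substrings inside a log line, which is the harder version of the problem.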
Stephanie (00:49:40):
For sure. That'll be so nice to have.
Nicole (00:49:44):
Yeah. And then Pyroscope and Tempo have really been growing closer together too. I'm really enjoying seeing that now there's a span CPU heat map panel and so like you can go seamlessly from like something like a slow span and then you can see like what was taking it up and vice versa. You can go both ways. I'm loving the integrations because I think that's really the strength of Grafana, bringing everything together.
Stephanie (00:50:19):
Makes debugging so much easier. When you're trying to figure out what is going slow, that's usually one of the harder things to debug and it's a really nice integration.
Nicole (00:50:30):
Yeah. And then also like jumping ahead here to cover everything, k6 is now like being refocused towards agentic testing. So it's like AI, all the things, but in an actually useful way. So there are new subcommands for agent, MCP, docs and explorer. So it's k6 docs and you can do it right from the command line and you can also configure an agent more easily so whatever agent you're using locally, you can teach it how to k6 basically and have it write really good opinionated k6 tests for you. There's also an assertions API now, which we're modeling after Playwright's API. And oh yeah, k6 Operator is now at 1.0 for distributed testing. Have you used this because it is amazing.
Stephanie (00:51:32):
Yes, it is fantastic and that's super exciting to hit that milestone. That's awesome.
Nicole (00:51:36):
Yeah. For people who haven't used it yet, this is an extension of k6 that allows it to run kubectl commands. So do you say kube-CTL or kubectl?
Stephanie (00:51:49):
I say kube-CTL.
Nicole (00:51:51):
Okay. Okay. I do too. Yeah. So imagine a load test or just any test that can run the test and then based on the responses, delete some config and then reapply something else. It's amazing. I haven't seen anything else that can do that, like a testing tool that can change the environment that it's testing on and improve it. I love that. So I still really love k6.
Stephanie (00:52:24):
As you should.
Nicole (00:52:25):
Yeah. Yeah. OpenTelemetry. I love that there was this thing, Ted Young spoke for OpenTelemetry and he was like, the goal is to be as boring as possible. And you know what? He's so right. We don't want things to be exciting for telemetry. We want them not to change. Stop changing things, just be the same and work.
Stephanie (00:52:49):
I loved his talk. He is just very funny. If you haven't seen that talk, definitely watch it.
Nicole (00:52:55):
Yeah. And OpenTelemetry has graduated now. Actually, my colleague Imfeld and I are going to be talking about that. A bunch of other people, the guest list isn't quite firmed up yet, but we're going to talk about it on an OTel community call, just about what it took to graduate in the CNCF and where we're going next. Yeah. It feels like it's been a while.
Stephanie (00:53:27):
CNCF graduation is a really big deal. Sorry, go ahead.
Nicole (00:53:30):
Yeah. No, it's all right. Alloy is also now officially an OTel collector distribution. That's something that the Alloy team has been pushing for. And what I love is that they still didn't just give up on what makes Alloy Alloy. You can still run the .alloy files and everything. There are two engines that are side by side and now the big thing that makes it an official distribution is that it can now natively accept the OTel YAML files. So now it doesn't matter which one you're using. It can take both. And one of the talks was about fleet management with OpAMP. That's so cool because it lets you ... It's almost like there was a talk that someone did a few GrafanaCONs ago that was all about IoT devices and like managing that IoT fleet and it reminded me of that, where it can remotely push config to any OTel collector, not just Alloy, by the way, and just like any OpenTelemetry collector from one place.
(00:54:38):
There's a single dashboard, not a dashboard, single app that you go to and you can see everything, whether it's up, whether it's down, what config it's running. So great.
Stephanie (00:54:50):
That's super helpful. That's awesome.
Nicole (00:54:53):
Yeah. And then also Beyla is now called OBI, or is it OBI? I say OBI, and it was donated to OpenTelemetry. Yeah, OBI is cute, right?
Stephanie (00:55:06):
It is. I like it.
Nicole (00:55:08):
It's now the OpenTelemetry official eBPF auto-instrumentation tool. Super nice too to see that Grafana donated that. We didn't have to. We could have monetized the stuff out of that, but we didn't. I appreciate working for a company like that, that really contributes back, makes significant substantial contributions back to the open source community.
Stephanie (00:55:34):
Definitely. I totally agree.
Nicole (00:55:37):
Okay. I think we've covered ...
Stephanie (00:55:41):
A few features, right?
Nicole (00:55:43):
Yeah. I think we've covered the big features, but the thing that I got the most FOMO out of when I was watching these things and also watching the banter and the channels about it was Tamagotchi.
Stephanie (00:55:56):
It is so good. That was so nostalgic for me. I love that.
Nicole (00:56:02):
Did you have a Tamagotchi?
Stephanie (00:56:04):
I did, yes. It died a lot, but I tried.
Nicole (00:56:08):
I had several Tamagotchis and yeah, they died, but I loved them and I cried about them. And for those who weren't there, there was a big Tamagotchi that even looks like the Tamagotchi from days gone by, except it was a Grot. He's our mascot. And I love that it's not just a gimmick because the real cool thing about it was how to instrument resource-constrained devices that maybe don't have enough juice to run the full OTel SDK and it's actually harder than it sounds, but yeah, it's like a small device or something that's not very beefy that had to be ... There were some really clever things about serializing the OpenTelemetry packet and posting it over HTTP and there are actually legitimate lessons to be learned there just for any resource-constrained device, not just Tamagotchi, although that's pretty cool.
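As a rough idea of what "serializing the OpenTelemetry packet and posting it over HTTP" can look like without an SDK, here is a minimal sketch that hand-builds an OTLP/HTTP JSON metrics payload. The details of the talk's device and encoding are not in this conversation, so everything here is an assumption: the function name, the service name `grot-tamagotchi`, the metric name `pet.hunger` and the scope name are made up for illustration.

```python
import json
import time


def otlp_gauge_payload(service: str, metric: str, value: float, time_ns: int) -> bytes:
    """Hand-build an OTLP JSON body containing a single gauge data point."""
    body = {
        "resourceMetrics": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service}}
                ]
            },
            "scopeMetrics": [{
                "scope": {"name": "handwritten-exporter"},  # hypothetical scope name
                "metrics": [{
                    "name": metric,
                    "gauge": {
                        "dataPoints": [{
                            # 64-bit integers are encoded as strings in OTLP JSON.
                            "timeUnixNano": str(time_ns),
                            "asDouble": value,
                        }]
                    },
                }],
            }],
        }]
    }
    # Compact separators keep the payload small for a constrained device.
    return json.dumps(body, separators=(",", ":")).encode("utf-8")


payload = otlp_gauge_payload("grot-tamagotchi", "pet.hunger", 0.42, time.time_ns())
print(len(payload), "bytes")
```

A device would then send these bytes as an HTTP POST to a collector's `/v1/metrics` path with `Content-Type: application/json`. The sketch stops before the network call so that it runs anywhere.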
Stephanie (00:57:18):
Yeah, it was so cool. That was the first booth that I went to. I was like, "I have to see this Tamagotchi in person." When I saw it in the hackathon videos, I was very excited.
Nicole (00:57:29):
Did you have a favorite talk because you were actually there.
Stephanie (00:57:34):
I guess I was really excited about the planet-size dashboards, I think is what it's called, but the Google talk was really cool. We've been doing a lot of work internally to be able to, I guess, scale from small more monitoring gardens all the way to scaling to Google. And so getting to see, I guess, how far Grafana has come to, and that is now the industry standard that Google wants to use Grafana. That's a really huge deal and was really exciting.
Nicole (00:58:07):
Yeah. And one part of me is like, "How did we swing that?" And the second part of me is like, "Well, of course, what else would they use?" And that's kind of cocky, isn't it? But I'm very proud of what we've built here.
Stephanie (00:58:21):
Yeah, same. I mean, Google typically will create their own internal things. So for them to want to use Grafana is just a testament to what we've built and what the community has built. And it's really cool and really proud of how far Grafana and the community have taken Grafana. That's awesome.
Nicole (00:58:40):
My favorite talk was actually not any of the announcements. It was the supply chain attack, which was
(00:58:48):
a postmortem of a huge security incident in Grafana. I was so surprised when I was watching it on YouTube. By the way, all of the talks, I think all the talks are on YouTube, that's how I can talk about them without having been there. But this supply chain attack one was really, it takes bravery. It takes a lot of guts for a company to say like, "Yeah, we've been hacked. Here's how, and here's everything that could have happened and didn't happen and here's how we responded." And I noticed that that was a real theme in this year's GrafanaCON. There were several things where there's a lot of honesty about limits. I love that it's not just, "Rah, rah, Grafana is the best, nothing's wrong." And it's like, no, we are re-architecting our databases, but there are trade-offs. We think that they're worthwhile trade-offs and we stand by those decisions, but you have to know it's not all sunshine.
(00:59:50):
There are some things that you can expect, but also any talk about security is like, wow, that really helps people trust us. It really shows how intentional our security team is about what they release to people, and how, in the talk, they used the same tools that the hackers use. I thought that was ...
Stephanie (01:00:16):
Yeah, it was really interesting.
Nicole (01:00:18):
Yeah.
Stephanie (01:00:19):
That talk, I was on call the weekend that that happened. So it was interesting as well hearing back the events that had played out, but also all of the things that they had set up with canary tokens and how those alerted us immediately. They do top-notch work. It was awesome.
Nicole (01:00:39):
Yeah. Their description of having canary tokens to make it a hellscape for hackers. Oh, that's so awesome. Awesome in the best way.
Stephanie (01:00:50):
Yes, it was great.
Nicole (01:00:51):
Yeah, great talk, but also a lot of humility. It wasn't like we are the best and we're infallible and nothing gets by us. And I thought it took real honesty and humility to be like, "Yeah, we should have caught that or we missed this one thing." And that actually makes me trust. When a company does that, it makes me trust them more.
Stephanie (01:01:14):
For sure. The level of transparency that Grafana has is really nice and it's nice to be working for someone or the company that is going to be transparent.
Nicole (01:01:24):
Yeah. Well, also there's the Mimir Tempo talk where they talked about three production incidents that were pretty big, and then also the scaling Assistant talk where they're like, "Yeah, we had consistency issues with our stuff and here's what we're doing to try and combat that," which is honest.
Stephanie (01:01:45):
Yeah. I mean, I think in engineering in general, whenever you're building something, you go through things and to say that you don't would be completely dishonest. We all are trying to scale things, trying to figure things out and yeah, we get there, but it's good to learn from each other as well. When things don't go well, how can you improve that?
Nicole (01:02:07):
Do you go to every GrafanaCON?
Stephanie (01:02:10):
I've been to the last three now, so it's been really nice. Very lucky to be able to go to those.
Nicole (01:02:18):
Can you give us a pitch for people who maybe have never gone to GrafanaCON? Well, first, would you pitch it? Okay. Well, how would you recommend it to people?
Stephanie (01:02:32):
Yeah. I mean, I've been to a lot of different tech conferences and regardless of if I was working at Grafana or not, I would say this still, that it is the nicest community to go to. When we had, for example, launched Git Sync last year and it was experimental, we had demoed it on stage and someone else from the audience called out. They're like, "We still love it. We like, great job." And the community is just so kind and just wants to, I don't know, make Grafana better and do it together. And I've never been to a tech conference that is just so uplifting and kind as GrafanaCON is. I really loved it.
Nicole (01:03:20):
I also really love that when choosing speakers, we really don't pick the ones that always get into the talks. We really try to make it as accessible as possible to first time speakers. My team, the dev advocacy team, is usually pulled in on a lot of the review work for that. So if you get something accepted, we will help you with the talk. We really have multiple rehearsals and feedback and coaching. We want to be as inclusive as possible and your experience as a speaker should not come into play when you're thinking about whether or not to apply. Just apply. If you can't apply, do go because it is just a really ... The best part is the vibe. Everyone is just there to learn. No one's going to sell at you unless you say you want to buy something, but it's really just trading use cases and looking at cool science fair stuff like the Tamagotchi.
Stephanie (01:04:24):
Yeah. And even there was one with gardens at the science fair and I really love seeing what people have been tinkering with too. It's just really interesting to see what I should be doing in my spare time. So it's great.
Nicole (01:04:40):
Yeah. Well, thanks so much for joining me on this, Stephanie. We missed Matt. Oh yeah, we didn't say Matt couldn't make it last minute. So Stephanie and I decided that we were just going to make it our own thing.
Stephanie (01:04:54):
Thank you so much for having me. This was so much fun. Always great to talk to you.
Nicole (01:04:58):
Yeah, it always is. Thank you people for watching who are here, and if you're watching this after the fact and you have any other questions, you can leave them in the comments below. I always check them and I make sure you get a response. So thank you. And next month, I think Usman might be back so it won't be me. Thanks for letting me crash your community call.
Stephanie (01:05:22):
Me too.
Nicole (01:05:23):
Yeah. Bye everyone.