Grafana Loki - Fork My Brain

%%
date:: [[2023-04-23]], [[2023-08-28]], [[2023-11-30]], [[2024-04-08]], [[2024-04-24]], [[2024-05-10]]
parent::
%%

# [[Grafana Loki]]

[repo](https://github.com/grafana/loki) | [site](https://grafana.com/oss/loki/)

Loki is an [[Open-source|open-source]] [[Logs|log]] aggregation tool that takes much inspiration from [[Prometheus]]. It was created by [[Grafana Labs]].

Loki provides a centralized database that you forward logs to. This centralization is especially useful with [[Serverless computing]] or ephemeral [[Kubernetes]] pods, where the logs would otherwise disappear along with the workload. Another reason to centralize logs is to see the bigger picture: if you have multiple servers behind [[Load Balancer|load balancing]], looking at an individual server only shows you the requests that went to that particular server.

Loki is designed to work closely with Prometheus: logs use the same label sets as metrics. It also lets you turn logs into metrics to make them easier to work with.

Here's a quick intro to Loki:

<iframe width="560" height="315" src="https://www.youtube.com/embed/6Eau8k0SNvs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Here's an introduction to Loki that I made with [[Jay Clifford]]:

![](https://youtu.be/1uk8LtQqsZQ)

## Features of Loki

### "Like [[Prometheus]], for logs"

Loki prides itself on being the log counterpart of the metrics-focused [[Prometheus]]. Here are a few reasons why that could be a good thing:

- Since [[Metrics]] and [[Logs]] are often both collected as important parts of an [[Observability]] strategy, modelling Loki after Prometheus means platform engineers have less to learn to work with both.
- Loki uses [[Promtail]] as a default agent (although you can change that to something else). Promtail uses the same service discovery mechanisms as Prometheus.
- Loki works with [[Prometheus Alertmanager]] by letting you create a metric out of your logs using [[LogQL]].
- [[LogQL]] is very similar to [[PromQL]].

### Plays well with others

Loki was made to work with [[Cloud Native Computing Foundation|cloud-native]] projects, including [[Kubernetes]], [[Prometheus]], [[Grafana]], and more.

### Performant, reliable, and cost-effective

- Loki only indexes metadata instead of the full text, which makes it faster to query and cheaper to run.
- Log lines are stored and compressed.
- The grepping workload is [[Distributed computing|distributed]] in a [[Parallelism|parallelized]] way.
- [[Promtail]] uses [[Exponential backoff]] so it doesn't hammer your server.
- Loki keeps only a small index (a sort of table of contents) for fast lookup, while the compressed log lines go to object storage. Other tools save everything, including the log line, to disk and to RAM for easy access.
- Object storage is persistent: log data is not lost in case of system failures or restarts. Running Loki in a highly available setup supports this.

### Horizontally [[Scalability|scalable]]

- Loki uses a [[Microservices|microservices-based architecture]]: you can scale components independently (for example, the ones that handle queries), depending on your needs.

### Explore Logs

At [[GrafanaCON 2024]], a new feature for [[Grafana Loki|Loki]] on [[Grafana]] was announced: *Explore Logs*. Explore Logs is a way to visualize logs, narrow in on errors, and find issues without having to use [[LogQL]].

[[How to enable Explore Logs for Loki]]

## Installing Loki

Loki can be installed from a binary. On [[Kubernetes]], you can also use [[Helm for Kubernetes|Helm]] to install Loki using [this Helm chart](https://grafana.com/docs/loki/latest/installation/helm/install-scalable).

[Here's a demo app called Carnivorous Greenhouse](https://github.com/grafana/loki-fundamentals/tree/what-is-loki) that shows Loki set up with [[Grafana Alloy|Alloy]] and [[Grafana]] via [[Docker Compose]].

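
Once Loki is installed, one way to check that it accepts logs is to push a single line through its HTTP push endpoint (`/loki/api/v1/push`). The snippet below is a minimal sketch, not an official client: the URL assumes a local Loki on its default port 3100 with authentication disabled, and the function names and labels are made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: a local Loki listening on its default HTTP port, no auth.
LOKI_PUSH_URL = "http://localhost:3100/loki/api/v1/push"


def build_push_payload(labels, line, ts_ns=None):
    """Build the JSON body for Loki's push endpoint: one stream, one entry."""
    if ts_ns is None:
        ts_ns = time.time_ns()
    return {
        "streams": [
            {
                # The label set identifies the stream this line belongs to.
                "stream": dict(labels),
                # Each entry is [Unix timestamp in nanoseconds as a string, log line].
                "values": [[str(ts_ns), line]],
            }
        ]
    }


def push_log(labels, line, url=LOKI_PUSH_URL):
    """POST one log line to Loki and return the HTTP status code."""
    body = json.dumps(build_push_payload(labels, line)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With a Loki instance running, a call such as `push_log({"job": "smoke-test"}, "hello from the install check")` should return a 2xx status (Loki normally answers a successful push with 204), and the line should then show up in Grafana under the `job="smoke-test"` label.
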
## How Loki works

A Loki log consists of:

- A timestamp
- Labels/selectors (key-value pairs)
- Content of the log line

Of these, the content is unindexed.

![[loki-how-it-works.png]] [^goh]

Instead of being indexed themselves, log lines are grouped into streams, and the streams are indexed by Prometheus-style labels. So it's more like a table of contents than an index.

[[LogQL]] is the query language used to query logs in Loki. When you run a query, Loki first narrows the search by the label selector you've specified, then by time range, and only then brute-forces through the content of the log lines that remain. This process makes Loki more performant than other log aggregators.

![[loki-querying.png]] [^goh]

## Architecture of Loki

The illustration below shows the architecture of Loki for a single instance on the left, and in a [[Multi-tenant]] setup on the right.

![[loki-scaling.png]] [^goh]

Loki is most commonly run on a [[Kubernetes]] cluster, though it can run in a Docker container as well. It consists of three microservices:

- Read path
- Write path (three replicas are recommended: two to keep object storage consistent, and a third for availability)
- Administrative (`backend` component)

## Object Storage

Loki needs separate object storage. Here are some options for what to use:

- [[Minio]]
- [[AWS Simple Storage Service (S3)]]
- [[Google Cloud Storage]]

## Other resources

Here's a Grafana Office Hours livestream that I did with [[Paul Balogh]] and [[Ward Bekker]] about Loki:

![](https://www.youtube.com/watch?v=OLebNPLIJMI) [^goh]

- [[Loki Community Call 2024-04-04]]: Community Call tackling recent Loki features in [[Loki 3.0]].
- [[Loki 3.0]]

[^goh]: Bekker, W. (2023). *Getting started with Grafana Loki (Grafana Office Hours #09)*. Retrieved from: https://www.youtube.com/watch?v=OLebNPLIJMI