mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git, synced 2024-11-21 14:44:00 +00:00
docs: fix typos and format in case study (#7374)
### Describe Your Changes

- made small typo fix in case studies

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
parent 45896fb477
commit 8dc4e2b5a5
1 changed files with 46 additions and 46 deletions
|
@@ -3,11 +3,12 @@ weight: 21
|
|||
title: Case studies and talks
|
||||
menu:
|
||||
docs:
|
||||
parent: 'victoriametrics'
|
||||
parent: "victoriametrics"
|
||||
weight: 21
|
||||
aliases:
|
||||
- /CaseStudies.html
|
||||
---
|
||||
|
||||
Below please find public case studies and talks from VictoriaMetrics users. You can also join our [community Slack channel](https://slack.victoriametrics.com/)
|
||||
where you can chat with VictoriaMetrics users to get additional references, reviews and case studies.
|
||||
|
||||
|
@@ -87,7 +88,7 @@ We ended up with the following configuration:
|
|||
We learned that remote write protocol generated too much traffic and connections so after 8 months we started looking for alternatives.
|
||||
|
||||
Around the same time, VictoriaMetrics released [vmagent](https://docs.victoriametrics.com/vmagent/).
|
||||
We tried to scrape all the metrics via a single instance of vmagent but it that didn't work because vmagent wasn't able to catch up with writes
|
||||
We tried to scrape all the metrics via a single instance of vmagent but that didn't work because vmagent wasn't able to catch up with writes
|
||||
into VictoriaMetrics. We tested different options and end up with the following scheme:
|
||||
|
||||
- We removed Prometheus from our setup.
|
||||
|
@@ -260,38 +261,38 @@ We started with a Prometheus server on EKS. That worked until it didn't. We then
|
|||
|
||||
### What VictoriaMetrics means for us
|
||||
|
||||
* Easy to use and maintain
|
||||
* Cost effective
|
||||
* The ability to handle billions of time series events at any point of time
|
||||
* Multiple K8s clusters to monitor
|
||||
* Consistent monitoring infra for each cluster across multiple Regions and clouds
|
||||
* Secure communication and data storage
|
||||
* Easy Retention
|
||||
- Easy to use and maintain
|
||||
- Cost effective
|
||||
- The ability to handle billions of time series events at any point of time
|
||||
- Multiple K8s clusters to monitor
|
||||
- Consistent monitoring infra for each cluster across multiple Regions and clouds
|
||||
- Secure communication and data storage
|
||||
- Easy Retention
|
||||
|
||||
### Some of our initial challenges prior to moving to VictoriaMetrics
|
||||
|
||||
* Reducing cost by not using a managed solution of one of the clouds
|
||||
* Support HA and recover fast
|
||||
* No downtimes
|
||||
* Having our main prometheus using too much Ram and restarts.
|
||||
- Reducing cost by not using a managed solution of one of the clouds
|
||||
- Support HA and recover fast
|
||||
- No downtimes
|
||||
- Having our main prometheus using too much Ram and restarts.
|
||||
|
||||
### Some of the reasons we chose VictoriaMetrics
|
||||
|
||||
* The API is compatible with Prometheus and all standard PromQL queries work well out of the box
|
||||
* Handles storage well
|
||||
* Available to use in Grafana easily
|
||||
* Single and small executable
|
||||
* Easy and fast backups
|
||||
* Better benchmarks than all the competitors
|
||||
* Open Source and maintained with good community
|
||||
- The API is compatible with Prometheus and all standard PromQL queries work well out of the box
|
||||
- Handles storage well
|
||||
- Available to use in Grafana easily
|
||||
- Single and small executable
|
||||
- Easy and fast backups
|
||||
- Better benchmarks than all the competitors
|
||||
- Open Source and maintained with good community
|
||||
|
||||
### Some of the benefits we experienced since working with VictoriaMetrics
|
||||
|
||||
* We saved around $5K USD per month
|
||||
* It’s seamless and doesn’t cause any override complications on the Infrastructure team
|
||||
* It doesn’t use lots of storage
|
||||
* It can serve us in the future in even bigger scales
|
||||
* It has support with a great community.
|
||||
- We saved around $5K USD per month
|
||||
- It’s seamless and doesn’t cause any override complications on the Infrastructure team
|
||||
- It doesn’t use lots of storage
|
||||
- It can serve us in the future in even bigger scales
|
||||
- It has support with a great community.
|
||||
|
||||
## Fly.io
|
||||
|
||||
|
@@ -412,11 +413,12 @@ See [this video](https://www.youtube.com/watch?v=OUyXPgVcdw4) and [these slides]
|
|||
[NetEase Cloud Music](https://music.163.com/) is a Chinese freemium music streaming service developed and owned by [NetEase, Inc](https://en.wikipedia.org/wiki/NetEase). It is one of the biggest competitors in the Chinese music streaming business, primarily competing with [Tencent](https://en.wikipedia.org/wiki/Tencent)'s QQ Music.
|
||||
|
||||
The huge scale of services and the diverse monitoring requirements bring great challenges to timeseries database’s reliability, availability, and performance. With year’s evolution, we finally build a metrics system around VictoriaMetrics, aiming to solve following problems:
|
||||
* Weak observability on application layer: in the past, internal monitoring of the product mainly focused on machine level. Although it also provided monitoring plugins for common frameworks, there was still room for improvement in both performance and visualization effects.
|
||||
* Linking metrics to trace: metrics are the most intuitive way to discover problems, such as "getting 10 failed http requests in the past 30s", but sometimes traces are also needed to locate the root cause of the errors.
|
||||
* Performance and cost: storage cost of the old metric system is relatively high, since prometheus as a standalone application cannot support large scale of data.
|
||||
* aggregate queries: aggregate queries are often needed and could take several seconds or even tens of seconds, slowing down troubleshooting process seriously.
|
||||
* Weak visualization capabilities: monitoring data are often used in YoY comparison and multi-instance comparison to help locate problems. Neither Prometheus UI nor Grafana supports this feature.
|
||||
|
||||
- Weak observability on application layer: in the past, internal monitoring of the product mainly focused on machine level. Although it also provided monitoring plugins for common frameworks, there was still room for improvement in both performance and visualization effects.
|
||||
- Linking metrics to trace: metrics are the most intuitive way to discover problems, such as "getting 10 failed http requests in the past 30s", but sometimes traces are also needed to locate the root cause of the errors.
|
||||
- Performance and cost: storage cost of the old metric system is relatively high, since prometheus as a standalone application cannot support large scale of data.
|
||||
- aggregate queries: aggregate queries are often needed and could take several seconds or even tens of seconds, slowing down troubleshooting process seriously.
|
||||
- Weak visualization capabilities: monitoring data are often used in YoY comparison and multi-instance comparison to help locate problems. Neither Prometheus UI nor Grafana supports this feature.
|
||||
|
||||
See [this article](https://juejin.cn/post/7322268449409744931) for details on how NetEase Cloud Music build a metrics system base on VictoriaMetrics and give solutions to above problems.
|
||||
|
||||
|
@@ -611,7 +613,6 @@ Numbers:
|
|||
|
||||
Alex Ulstein, Head of Monitoring, Wix.com
|
||||
|
||||
|
||||
## xiaohongshu
|
||||
|
||||
With a mission to “inspire lives”, [Xiaohongshu](https://www.xiaohongshu.com) is a lifestyle platform that inspires people to discover and connect with a range of diverse lifestyles from China.
|
||||
|
@@ -620,6 +621,7 @@ Now more than thirty VictoriaMetrics storage clusters are running online, includ
|
|||
See [this article](https://mp.weixin.qq.com/s/uJ1t0B8WBBryzvbLWDfl5A) on how Xiaohongshu build metrics system base on VictoriaMetrics and the competing solutions.
|
||||
|
||||
Across our production VictoriaMetrics clusters, numbers as below:
|
||||
|
||||
- Cpu cores in all VictoriaMetrics clusters: almost 50000
|
||||
- Data size on disk: 2400 TB
|
||||
- Retention period: 1 month
|
||||
|
@@ -629,7 +631,6 @@ Across our production VictoriaMetrics clusters, numbers as below:
|
|||
- /api/v1/query_range: 2300 queries per second
|
||||
- /api/v1/query: 260 queries per second
|
||||
|
||||
|
||||
## Zerodha
|
||||
|
||||
[Zerodha](https://zerodha.com/) is India's largest stock broker. The monitoring team at Zerodha had the following requirements:
|
||||
|
@@ -665,7 +666,6 @@ Numbers:
|
|||
- The average query rate is ~3k per second (mostly alert queries).
|
||||
- Query duration: median is ~40ms, 99th percentile is ~100ms.
|
||||
|
||||
|
||||
## Zomato
|
||||
|
||||
### Who We Are
|
||||
|
@@ -679,6 +679,7 @@ As we scaled, our existing observability stack (Prometheus and Thanos) began to
|
|||
### Our Solution
|
||||
|
||||
To address these challenges, we decided to migrate to VictoriaMetrics. We were drawn to its reputation for high performance, low resource usage, and scalability. The migration process was carefully planned to ensure a smooth transition with minimal disruption. We focused on:
|
||||
|
||||
- **Data Optimization**: We reduced unnecessary metrics to minimize data ingestion and storage needs.
|
||||
- **Performance Enhancements**: VictoriaMetrics’ efficient query processing allowed us to achieve significantly faster query response times.
|
||||
- **Cost Efficiency**: The optimized storage format in VictoriaMetrics led to a noticeable reduction in our storage and operational costs.
|
||||
|
@@ -687,5 +688,4 @@ To address these challenges, we decided to migrate to VictoriaMetrics. We were d
|
|||
|
||||
Post-migration, we successfully scaled our monitoring infrastructure to handle billions of data points daily, all while experiencing faster query performance and 60% reduction in yearly infra cost. The improved observability has enhanced our ability to deliver reliable service, allowing us to troubleshoot issues more quickly and effectively.
|
||||
|
||||
|
||||
Read more about the migration journey in our blog - https://blog.zomato.com/migrating-to-victoriametrics-a-complete-overhaul-for-enhanced-observability
|