Prometheus Remote Write Configuration

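Alibaba Cloud offers TSDB instances in different specifications, each with its own maximum write TPS. The limit keeps an excessive write rate from making the TSDB instance unavailable. When the write TPS exceeds the maximum that the instance allows, the instance's rate-limiting protection is triggered and writes fail. You therefore need to adjust the Prometheus remote_write configuration to match the TSDB instance specification, so that the metrics Prometheus collects are written to TSDB smoothly and reliably.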

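The full list of remote_write options is available on the Prometheus official website. This article covers only the best practices for write configuration when connecting Prometheus to Alibaba Cloud TSDB. To improve write efficiency, Prometheus first buffers the collected samples in an in-memory queue and then sends them to remote storage in batches. The parameters of this in-memory queue have a significant effect on how efficiently Prometheus writes to remote storage. The main options are listed below.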

# Configures the queue used to write to remote storage.
queue_config:
  # Number of samples to buffer per shard before we start dropping them.
  [ capacity: <int> | default = 10000 ]
  # Maximum number of shards, i.e. amount of concurrency.
  [ max_shards: <int> | default = 1000 ]
  # Minimum number of shards, i.e. amount of concurrency.
  [ min_shards: <int> | default = 1 ]
  # Maximum number of samples per send.
  [ max_samples_per_send: <int> | default = 100]
  # Maximum time a sample will wait in buffer.
  [ batch_send_deadline: <duration> | default = 5s ]
  # Maximum number of times to retry a batch on recoverable errors.
  [ max_retries: <int> | default = 3 ]
  # Initial retry delay. Gets doubled for every retry.
  [ min_backoff: <duration> | default = 30ms ]
  # Maximum retry delay.
  [ max_backoff: <duration> | default = 100ms ]
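
The retry settings in the listing above can be illustrated with a small worked example. This is a minimal sketch, assuming the behavior is exactly what the comments describe: the delay starts at min_backoff, doubles after each retry, and is capped at max_backoff. The function name retry_delays_ms is hypothetical and is not part of Prometheus.

```python
def retry_delays_ms(min_backoff_ms=30, max_backoff_ms=100, max_retries=3):
    """Return the wait before each retry, in milliseconds.

    Assumption: doubling starts from min_backoff and is capped at
    max_backoff, as the comments in the listing describe.
    """
    delays = []
    delay = min_backoff_ms
    for _ in range(max_retries):
        delays.append(delay)
        # Double the delay for the next retry, but never exceed the cap.
        delay = min(delay * 2, max_backoff_ms)
    return delays


print(retry_delays_ms())  # [30, 60, 100]
```

With the defaults, a batch that keeps failing would wait 30 ms, 60 ms, and 100 ms across its three retries.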

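Of the options above, min_shards is supported only in Prometheus V2.6.0 and later. In earlier versions the value defaults to 1, so unless you have a specific need, you can leave this parameter unset.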

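Among the parameters above, max_shards and max_samples_per_send determine the maximum TPS at which Prometheus writes to remote storage. Assuming that sending 100 samples takes 100 ms, the default configuration above gives a maximum write TPS of 1000 * 100 / 0.1s = 1,000,000/s. If the maximum write TPS of the purchased TSDB instance is lower than 1,000,000/s, the instance's rate-limiting protection is easily triggered and writes fail. The following table gives reference remote_write configurations for each TSDB specification. Adjust them as appropriate for your scenario.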
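
The estimate above can be reproduced with a few lines of code. This is a rough sketch only: the function name max_write_tps is hypothetical, and the 100 ms send latency is the assumption stated above, not a measured value.

```python
def max_write_tps(max_shards, max_samples_per_send, send_latency_ms):
    """Estimate the upper bound on samples written per second.

    Each shard is assumed to send one batch of max_samples_per_send
    samples every send_latency_ms milliseconds.
    """
    return max_shards * max_samples_per_send * 1000 / send_latency_ms


# Default queue_config: 1000 shards, 100 samples per send, 100 ms per send.
print(max_write_tps(1000, 100, 100))  # 1000000.0
```

If a send of 500 samples also took 100 ms, a configuration with max_shards: 1 and max_samples_per_send: 500 would work out to 5000 samples per second under the same estimate.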

TSDB spec ID    Data points written/s    Reference configuration
mlarge          5000                     capacity: 10000, max_samples_per_send: 500, max_shards: 1
large           10000                    capacity: 10000, max_samples_per_send: 500, max_shards: 2
3xlarge         30000                    capacity: 10000, max_samples_per_send: 500, max_shards: 6
4xlarge         40000                    capacity: 10000, max_samples_per_send: 500, max_shards: 8
6xlarge         60000                    capacity: 10000, max_samples_per_send: 500, max_shards: 12
12xlarge        120000                   capacity: 10000, max_samples_per_send: 500, max_shards: 24
24xlarge        240000                   capacity: 10000, max_samples_per_send: 500, max_shards: 48
48xlarge        480000                   capacity: 10000, max_samples_per_send: 500, max_shards: 96
96xlarge        960000                   capacity: 10000, max_samples_per_send: 500, max_shards: 192
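
The reference values above follow a regular pattern: capacity and max_samples_per_send stay fixed, and max_shards equals the write limit divided by 5000. The sketch below encodes that pattern as a lookup. The 5000-per-shard ratio is inferred from the rows, not a documented rule, and the names WRITE_LIMITS, POINTS_PER_SHARD, and reference_queue_config are hypothetical.

```python
# Write limits (data points per second) copied from the reference values above.
WRITE_LIMITS = {
    "mlarge": 5000,
    "large": 10000,
    "3xlarge": 30000,
    "4xlarge": 40000,
    "6xlarge": 60000,
    "12xlarge": 120000,
    "24xlarge": 240000,
    "48xlarge": 480000,
    "96xlarge": 960000,
}

# Assumption: one shard per 5000 data points/s, inferred from the rows above.
POINTS_PER_SHARD = 5000


def reference_queue_config(spec_id):
    """Return the reference queue_config values for a TSDB spec ID."""
    limit = WRITE_LIMITS[spec_id]
    return {
        "capacity": 10000,
        "max_samples_per_send": 500,
        "max_shards": limit // POINTS_PER_SHARD,
    }


for spec in WRITE_LIMITS:
    print(spec, reference_queue_config(spec))
```

Treat the result as a starting point and tune it for the actual workload, as the reference values themselves are meant to be adjusted.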

Taking a TSDB instance of the mlarge specification as an example, a complete reference Prometheus configuration is as follows:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
# Remote write configuration (TSDB).
remote_write:
  - url: "http://ts-xxxxxxxxxxxx.hitsdb.rds.aliyuncs.com:3242/api/prom_write"
    # Configures the queue used to write to remote storage.
    queue_config:
      # Number of samples to buffer per shard before we start dropping them.
      capacity: 10000
      # Maximum number of shards, i.e. amount of concurrency.
      max_shards: 1
      # Maximum number of samples per send.
      max_samples_per_send: 500

# Remote read configuration (TSDB).
remote_read:
  - url: "http://ts-xxxxxxxxxxxx.hitsdb.rds.aliyuncs.com:3242/api/prom_read"
    read_recent: true