Agent
This document describes the Prometheus metrics generated by Aperture Agents.
Flux Meter
Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
flux_meter | Histogram | agent_group, instance, job, process_uuid, flux_meter_name, decision_type, http_status_code, flow_status | ms | Flow's workload duration |
invalid_flux_meter_total | Counter | agent_group, instance, job, process_uuid, flux_meter_name, decision_type, http_status_code, flow_status | count (no unit) | Count of invalid FluxMeter readings |
Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the FluxMeter belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
flux_meter_name | service1-demo-app | Name of the FluxMeter |
decision_type | DECISION_TYPE_ACCEPTED, DECISION_TYPE_REJECTED | Whether the flow was accepted or not |
http_status_code | 200, 503 | HTTP status code |
flow_status | OK, Error | Flow status. A common label to denote OK or Error across all protocols |
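The `flux_meter` histogram can be reduced to a mean workload duration by dividing the rate of its sum by the rate of its count. The sketch below builds such a PromQL expression in Python. It assumes the standard Prometheus histogram series `flux_meter_sum` and `flux_meter_count` are exposed; the helper names, the 5m window, and the choice to filter on accepted, OK flows are illustrative assumptions, not part of Aperture.

```python
def selector(labels: dict[str, str]) -> str:
    """Render a PromQL label selector such as {a="x",b="y"}.

    Label values are inserted verbatim, so this sketch assumes they contain
    no double quotes or backslashes.
    """
    parts = [f'{name}="{value}"' for name, value in sorted(labels.items())]
    return "{" + ",".join(parts) + "}"


def mean_flow_duration_query(flux_meter_name: str, window: str = "5m") -> str:
    """PromQL for the mean workload duration (ms) of accepted, OK flows."""
    sel = selector({
        "flux_meter_name": flux_meter_name,
        "decision_type": "DECISION_TYPE_ACCEPTED",
        "flow_status": "OK",
    })
    return (
        f"sum(rate(flux_meter_sum{sel}[{window}]))"
        f" / sum(rate(flux_meter_count{sel}[{window}]))"
    )


print(mean_flow_duration_query("service1-demo-app"))
```

Dropping the `decision_type` or `flow_status` entries from the selector widens the average to all flows recorded by the FluxMeter.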
Load Scheduler
Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
workload_latency_ms | Summary | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, workload_index | ms | Latency summary of workload |
incoming_tokens_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id | token | A counter measuring tokens entering the Scheduler |
accepted_tokens_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id | token | A counter measuring tokens admitted by the Scheduler |
Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the Load Scheduler belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
policy_name | service1-demo-app | Name of the policy. |
policy_hash | 5kZjjSgDAtGWmLnDT67SmQhZdHVmz0+GvKcOGTfWMVo= | Hash of the policy used for checking the integrity of the policy. |
workload_index | 0, 1, 2, default | Index of the workload in order of specification in the policy. |
component_id | 13 | Index of the component in order of specification in the policy. |
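Taken together, `incoming_tokens_total` and `accepted_tokens_total` give the fraction of tokens the Load Scheduler admitted over an interval. A minimal Python sketch, assuming two consecutive scrapes of both counters for the same label set; the function names are hypothetical, and the reset handling is a simplified stand-in for what Prometheus's `increase()` does:

```python
from typing import Optional


def counter_increase(previous: float, current: float) -> float:
    """Increase between two scrapes of a counter.

    A drop in value is treated as a counter reset (for example an Agent
    restart), in which case the counter is assumed to have restarted from zero.
    """
    if current < previous:
        return current
    return current - previous


def token_acceptance_ratio(
    incoming_prev: float,
    incoming_now: float,
    accepted_prev: float,
    accepted_now: float,
) -> Optional[float]:
    """Fraction of incoming tokens admitted between two scrapes."""
    incoming = counter_increase(incoming_prev, incoming_now)
    if incoming == 0:
        return None  # no traffic in the interval, so the ratio is undefined
    return counter_increase(accepted_prev, accepted_now) / incoming


# 200 tokens arrived and 150 were admitted between the two scrapes.
print(token_acceptance_ratio(1000.0, 1200.0, 900.0, 1050.0))  # 0.75
```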
Rate Limiter
Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
rate_limiter_counter_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, decision_type, limiter_dropped | count (no unit) | A counter measuring the number of times Rate Limiter was triggered |
Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the Rate Limiter belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
policy_name | service1-demo-app | Name of the policy. |
policy_hash | 5kZjjSgDAtGWmLnDT67SmQhZdHVmz0+GvKcOGTfWMVo= | Hash of the policy used for checking the integrity of the policy. |
component_id | 13 | Index of the component in order of specification in the policy. |
decision_type | DECISION_TYPE_ACCEPTED, DECISION_TYPE_REJECTED | Whether the flow was accepted or not |
limiter_dropped | true, false | Whether this particular limiter has dropped the request. |
Concurrency Limiter
Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
concurrency_limiter_counter_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, decision_type, limiter_dropped | count (no unit) | A counter measuring the number of times Concurrency Limiter was triggered |
Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the Concurrency Limiter belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
policy_name | service1-demo-app | Name of the policy. |
policy_hash | 5kZjjSgDAtGWmLnDT67SmQhZdHVmz0+GvKcOGTfWMVo= | Hash of the policy used for checking the integrity of the policy. |
component_id | 13 | Index of the component in order of specification in the policy. |
decision_type | DECISION_TYPE_ACCEPTED, DECISION_TYPE_REJECTED | Whether the flow was accepted or not |
limiter_dropped | true, false | Whether this particular limiter has dropped the request. |
Sampler
Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
sampler_counter_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, decision_type, sampler_dropped | count (no unit) | A counter measuring the number of times Sampler was triggered |
Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the Sampler belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
policy_name | service1-demo-app | Name of the policy. |
policy_hash | 5kZjjSgDAtGWmLnDT67SmQhZdHVmz0+GvKcOGTfWMVo= | Hash of the policy used for checking the integrity of the policy. |
component_id | 13 | Index of the component in order of specification in the policy. |
decision_type | DECISION_TYPE_ACCEPTED, DECISION_TYPE_REJECTED | Whether the flow was accepted or not |
sampler_dropped | true, false | Whether this particular sampler has dropped the request. |
Classifier
Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
classifier_counter_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, classifier_index | count (no unit) | A counter measuring the number of times Classifier was triggered |
Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the Classifier belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
policy_name | service1-demo-app | Name of the policy. |
policy_hash | 5kZjjSgDAtGWmLnDT67SmQhZdHVmz0+GvKcOGTfWMVo= | Hash of the policy used for checking the integrity of the policy. |
classifier_index | 0, 1 | Index of the classifier in order of specification in the policy. |
Flow Control Summary
Flow Control Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
flowcontrol_requests_total | Counter | control_point, agent_group, instance, job, process_uuid | count (no unit) | Total number of Aperture Check requests handled |
flowcontrol_decisions_total | Counter | control_point, agent_group, instance, job, process_uuid, decision_type | count (no unit) | Number of Aperture Check decisions |
flowcontrol_reject_reasons_total | Counter | control_point, agent_group, instance, job, process_uuid, reject_reason | count (no unit) | Number of rejections with a reason other than unspecified |
flowcontrol_ends_total | Counter | control_point, agent_group, instance, job, process_uuid | count (no unit) | Total number of flow end calls handled |
Flow Control Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the Aperture Agent |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
decision_type | DECISION_TYPE_ACCEPTED, DECISION_TYPE_REJECTED | Whether the flow was accepted or not |
reject_reason | REJECT_REASON_RATE_LIMITED, REJECT_REASON_NO_TOKENS, REJECT_REASON_NOT_SAMPLED | Reason why the FlowControl Check response rejected the flow. |
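Because `flowcontrol_decisions_total` carries a `decision_type` label, summing it across the remaining labels gives the share of rejected flows. The Python sketch below shows that aggregation on in-memory samples. The `Sample` shape, the function names, and the `control_point` values `ingress` and `egress` are assumptions made for illustration; the values stand for counter increases over some window, not raw scrapes.

```python
from collections import defaultdict

# Hypothetical sample shape: (labels, increase over the window).
Sample = tuple[dict[str, str], float]


def decisions_by_type(samples: list[Sample]) -> dict[str, float]:
    """Sum flowcontrol_decisions_total per decision_type, across other labels."""
    totals: dict[str, float] = defaultdict(float)
    for labels, value in samples:
        totals[labels["decision_type"]] += value
    return dict(totals)


def rejected_share(samples: list[Sample]) -> float:
    """Fraction of decisions that were DECISION_TYPE_REJECTED."""
    totals = decisions_by_type(samples)
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return totals.get("DECISION_TYPE_REJECTED", 0.0) / total


samples = [
    ({"control_point": "ingress", "decision_type": "DECISION_TYPE_ACCEPTED"}, 90.0),
    ({"control_point": "ingress", "decision_type": "DECISION_TYPE_REJECTED"}, 10.0),
    ({"control_point": "egress", "decision_type": "DECISION_TYPE_ACCEPTED"}, 60.0),
    ({"control_point": "egress", "decision_type": "DECISION_TYPE_REJECTED"}, 40.0),
]
print(rejected_share(samples))  # 0.25
```

The same grouping applied to `flowcontrol_reject_reasons_total` with the `reject_reason` label would break the rejections down by cause.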
Distributed Cache Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
distcache_entries_total | Gauge | agent_group, instance, job, process_uuid, distcache_member_id, distcache_member_name | count (no unit) | Total number of entries in the DMap |
distcache_delete_hits | Gauge | agent_group, instance, job, process_uuid, distcache_member_id, distcache_member_name | count (no unit) | Number of deletion requests resulting in an item being removed in the DMap |
distcache_delete_misses | Gauge | agent_group, instance, job, process_uuid, distcache_member_id, distcache_member_name | count (no unit) | Number of deletion requests for missing keys in the DMap |
distcache_get_misses | Gauge | agent_group, instance, job, process_uuid, distcache_member_id, distcache_member_name | count (no unit) | Number of entries that have been requested and not found in the DMap |
distcache_get_hits | Gauge | agent_group, instance, job, process_uuid, distcache_member_id, distcache_member_name | count (no unit) | Number of entries that have been requested and found present in the DMap |
distcache_evicted_total | Gauge | agent_group, instance, job, process_uuid, distcache_member_id, distcache_member_name | count (no unit) | Total number of entries removed from cache to free memory for new entries in the DMap |
cache_lookup_hits_total | Counter | agent_group, instance, job, process_uuid, type, control_point | count (no unit) | Cumulative number of cache lookup hits |
cache_lookup_misses_total | Counter | agent_group, instance, job, process_uuid, type, control_point | count (no unit) | Cumulative number of cache lookup misses |
cache_operation_results_total | Counter | agent_group, instance, job, process_uuid, type, control_point | count (no unit) | Cumulative number of cache operation results |
Distributed Cache Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the Aperture Agent |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
distcache_member_id | 384313659919819706 | Internal ID of distributed cache cluster member. |
distcache_member_name | 10.244.1.20:3320 | Internal unique name of distributed cache cluster member. |
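A common use of `distcache_get_hits` and `distcache_get_misses` is a per-member hit ratio. A minimal Python sketch, assuming the two gauge values have already been read for each `distcache_member_name`; the function names and the second member address are hypothetical:

```python
def hit_ratio(hits: float, misses: float) -> float:
    """Share of DMap get requests that found an entry (0.0 when idle)."""
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


def hit_ratio_by_member(
    readings: dict[str, tuple[float, float]],
) -> dict[str, float]:
    """Map each distcache_member_name to its get hit ratio.

    `readings` maps member name -> (distcache_get_hits, distcache_get_misses).
    """
    return {member: hit_ratio(hits, misses)
            for member, (hits, misses) in readings.items()}


readings = {
    "10.244.1.20:3320": (300.0, 100.0),
    "10.244.1.21:3320": (0.0, 0.0),  # hypothetical idle member
}
print(hit_ratio_by_member(readings))
```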
Scheduler Metrics
Name | Type | Labels | Unit | Description |
---|---|---|---|---|
token_bucket_lm_ratio | Gauge | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id | percentage | A gauge that tracks the load multiplier |
token_bucket_fill_rate | Gauge | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id | tokens/s | A gauge that tracks the fill rate of token bucket |
token_bucket_capacity_total | Gauge | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id | count (no unit) | A gauge that tracks the capacity of token bucket |
token_bucket_available_tokens_total | Gauge | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id | count (no unit) | A gauge that tracks the number of tokens available in token bucket |
workload_requests_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, workload_index, decision_type, limiter_dropped | count (no unit) | A counter of workload requests |
request_in_queue_duration_ms | Summary | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, workload_index | ms | Time that requests spend in the Scheduler queue, grouped by workload |
workload_preempted_tokens | Summary | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, workload_index | token | Tokens preempted per request, measured end-to-end in the Scheduler across all workloads |
workload_delayed_tokens | Summary | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, workload_index | token | Tokens delayed per request, measured end-to-end in the Scheduler across all workloads |
workload_on_time_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, workload_index, decision_type, limiter_dropped | count (no unit) | Requests that are on time (neither preempted nor delayed), measured end-to-end in the Scheduler across all workloads |
fairness_preempted_tokens | Summary | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, fairness_index | token | Tokens preempted per request, measured at fairness queues within the same workload |
fairness_delayed_tokens | Summary | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, fairness_index | token | Tokens delayed per request, measured at fairness queues within the same workload |
fairness_on_time_total | Counter | agent_group, instance, job, process_uuid, policy_name, policy_hash, component_id, fairness_index, decision_type, limiter_dropped | count (no unit) | Requests that are on time (neither preempted nor delayed), measured at fairness queues within the same workload |
Scheduler Labels
Name | Example | Description |
---|---|---|
agent_group | default | Agent Group of the policy that the Scheduler belongs to |
instance | aperture-agent-cbfnp | Host instance of the Aperture Agent |
job | aperture-self | The configured job name that the target belongs to |
process_uuid | dc0e82af-6730-4f70-8228-ee91da53ac5f | Host instance's UUID |
policy_name | service1-demo-app | Name of the policy. |
policy_hash | 5kZjjSgDAtGWmLnDT67SmQhZdHVmz0+GvKcOGTfWMVo= | Hash of the policy used for checking the integrity of the policy. |
component_id | 13, root.13 | Index of the component in order of specification in the policy. |
workload_index | 0, 1, 2, default | Index of the workload in order of specification in the policy. |
decision_type | DECISION_TYPE_ACCEPTED, DECISION_TYPE_REJECTED | Whether the flow was accepted or not |
limiter_dropped | true, false | Whether this particular limiter has dropped the request. |
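The token bucket gauges can be combined to see how full a bucket is and roughly how long it would take to refill. The Python sketch below assumes single readings of `token_bucket_available_tokens_total`, `token_bucket_capacity_total`, and `token_bucket_fill_rate` for one `component_id`, and that no tokens are consumed while refilling; the function names are hypothetical.

```python
from typing import Optional


def bucket_fill_fraction(available: float, capacity: float) -> Optional[float]:
    """Available tokens as a fraction of capacity (None if capacity is unset)."""
    if capacity <= 0:
        return None
    return available / capacity


def seconds_until_full(
    available: float, capacity: float, fill_rate: float
) -> Optional[float]:
    """Idle time needed to refill the bucket at `fill_rate` tokens/s."""
    if fill_rate <= 0:
        return None  # the bucket is not being refilled
    return max(capacity - available, 0.0) / fill_rate


# 25 of 100 tokens available, refilling at 15 tokens/s.
print(bucket_fill_fraction(25.0, 100.0))      # 0.25
print(seconds_until_full(25.0, 100.0, 15.0))  # 5.0
```

A fill fraction that stays near zero alongside a rising `request_in_queue_duration_ms` would suggest that requests are waiting on tokens, though that interpretation depends on the policy in use.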