Version: 1.12

Metrics

Another important aspect of observability and monitoring is the collection of metrics data, for example, to identify patterns and understand trends, and to make sure SLAs are met.

ADS collects the following key metrics for requests:

  • Rate - the number of access requests ADS is serving

  • Errors - the number of failed access requests

  • Duration - distribution of the amount of time each access request takes

  • Rate of successful evaluations - the rate of successful access requests that evaluated to Permit, Deny, Indeterminate, or NotApplicable, respectively

Pull or push

There are two main models for how the communication of metrics data is orchestrated between the monitored application and the metrics backend, based on either a client pulling or a server pushing the metrics data.

ADS is compatible with both models and currently supports the following monitoring systems:

  • Prometheus (pull)
  • Azure Monitor (push)
  • InfluxDB (push)
  • Axiomatics Services Manager (push)

Several monitoring systems can be used concurrently with ADS.

Pull

In this model, the metrics backend pulls data from the application every T time units (usually the time unit is "seconds"). This action is also referred to as polling or scraping. This is done by having the application expose an HTTP endpoint, which returns the current value of each metric without doing any calculation.

The polling period is configured in the metrics backend by the operator. It should be set to a value that yields enough data to satisfy monitoring needs and the ability to draw conclusions, while not negatively affecting the performance of the primary operations of the application. This may require some tuning and testing by the operator.

Push

In this model, the metrics backend waits for metrics data to be pushed (or sent) to it at a time set by the application. That is, the monitored application is configured to send metrics data to the metrics backend every T time units (usually the time unit is "seconds").

This means that the operator configures the metrics library and the push interval in the monitored application. As with the polling period in the pull model, the push interval should be set to a value that yields enough data to satisfy monitoring needs and the ability to draw conclusions, while not negatively affecting the performance of the primary operations of the application. This may require some tuning and testing by the operator.

Metrics data

The metrics data provided by ADS is presented in the form of counters and timers collated into an output, ready to be accessed by the metrics backend.

ADS tags metrics with the following parameters:

  • ADS instance identity, as described in Configuring ADS instance identity.

  • Authorization domain identity, as described in The Identity section.

  • Authorization domain (namespace and domain name).

  • Domain sequence, a counter that represents how many domain changes the current instance of ADS has gone through since its startup.

Note: The domain tag (namespace and domain name) is only available when ADS is configured to retrieve the authorization domain from ASM/ADM using the RetrieveByName endpoint.

This functionality enables data to be filtered by their respective ADS instance id, domain id and/or domain values.

The following example shows the plain text format used for Prometheus:

# HELP decisions_total
# TYPE decisions_total counter
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", type="Indeterminate",} 5.0
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="1", type="Deny",} 1.0
# HELP successful_requests_total
# TYPE successful_requests_total counter
successful_requests_total 8.0
# HELP error_requests_total
# TYPE error_requests_total counter
error_requests_total 1.0
# HELP duration_info_seconds_max
# TYPE duration_info_seconds_max gauge
duration_info_seconds_max 0.014
# HELP duration_info_seconds
# TYPE duration_info_seconds summary
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.5",} 0.005210112
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.75",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.9",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.99",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.999",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="1.0",} 0.014123008
duration_info_seconds_count 9.0
duration_info_seconds_sum 0.244
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", le="0.001",} 182.0
...
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="1", le="30.0",} 1200.0
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="1", le="+Inf",} 1200.0

Sample metrics output for Prometheus
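Because every series carries the ADS tags as labels, a consumer can filter the plain text output by instance, domain, or domain sequence. The following Python sketch parses a few lines of the sample above and keeps only the series for one domain sequence. The function names parse_samples and filter_samples are hypothetical helpers written for this illustration, and the regular expressions are a simplification of the Prometheus text format rather than a complete parser.

```python
import re

# Matches `name{labels} value` or `name value` (simplified text format).
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# Matches one `key="value"` pair inside the braces.
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a sample, such as "..."
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def filter_samples(samples, **wanted):
    """Keep samples whose labels contain every wanted key/value pair."""
    return [s for s in samples if all(s[1].get(k) == v for k, v in wanted.items())]


# A few lines copied from the sample output above.
SAMPLE = '''\
# HELP decisions_total
# TYPE decisions_total counter
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", type="Indeterminate",} 5.0
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="1", type="Deny",} 1.0
successful_requests_total 8.0
'''

samples = parse_samples(SAMPLE)
for name, labels, value in filter_samples(samples, domain_sequence="1"):
    print(name, labels["type"], value)
```

In practice this filtering is usually done in the metrics backend's own query language; the sketch only shows what the tags make possible.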

Looking at the output per type, it can be broken down into the following sections:

# HELP successful_requests_total
# TYPE successful_requests_total counter
successful_requests_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0",} 8.0

The number of successful access requests.

# HELP error_requests_total
# TYPE error_requests_total counter
error_requests_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0",} 1.0

The number of errors.

# HELP duration_info_seconds_max
# TYPE duration_info_seconds_max gauge
duration_info_seconds_max 0.014
# HELP duration_info_seconds
# TYPE duration_info_seconds summary
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.5",} 0.005210112
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.75",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.9",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.99",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="0.999",} 0.014123008
duration_info_seconds{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", quantile="1.0",} 0.014123008
duration_info_seconds_count 9.0
duration_info_seconds_sum 0.244
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", le="0.001",} 182.0
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", le="0.001048576",} 182.0
...
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", le="30.0",} 1200.0
duration_info_seconds_bucket{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", le="+Inf",} 1200.0

The duration distribution for access requests. (Prometheus uses several buckets with histogram percentiles and for reasons of space the list is abbreviated.)
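As a rough illustration of how these series can be read, the _sum and _count values give the mean duration, and the cumulative le buckets give the share of requests at or below a threshold. The Python sketch below uses numbers copied from the abbreviated sample above, so the figures are illustrative only; the function names are hypothetical.

```python
def mean_duration(duration_sum, duration_count):
    """Average request duration in seconds, from the _sum and _count series."""
    if duration_count == 0:
        return 0.0
    return duration_sum / duration_count


def fraction_within(buckets, threshold):
    """Share of observations at or below `threshold` seconds.

    `buckets` maps each `le` upper bound to its cumulative count and must
    contain the threshold itself and the +Inf bucket (the total).
    """
    total = buckets[float("inf")]
    if total == 0:
        return 0.0
    return buckets[threshold] / total


# duration_info_seconds_sum and duration_info_seconds_count from the sample.
print(round(mean_duration(0.244, 9.0), 4))  # 0.0271

# Three of the cumulative buckets from the sample (the list is abbreviated).
buckets = {0.001: 182.0, 30.0: 1200.0, float("inf"): 1200.0}
print(round(fraction_within(buckets, 0.001), 4))  # 0.1517
```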

# HELP decisions_total
# TYPE decisions_total counter
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", type="Permit",} 5.0
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", type="Indeterminate",} 0.0
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", type="Deny",} 1.0
decisions_total{ads_id="default-1603e8a2",domain="namespace0:domain1", domain_id="80bbc5fa-0647-4f22-804f-949056787c6b",domain_sequence="0", type="NotApplicable",} 0.0

The rate of successful access requests that evaluated to Permit, Deny, Indeterminate, or NotApplicable, respectively.

Note: Only requests that result in single decisions are included in the data for decision values.
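Counters such as decisions_total only ever increase, so a rate is derived from the difference between two readings. The Python sketch below shows that arithmetic, plus the share of each decision type, using the counts from the sample above. The second counter reading and the 60-second interval are made-up values for illustration, and the function names are hypothetical.

```python
def per_second_rate(previous, current, interval_seconds):
    """Average per-second increase of a counter between two readings."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current - previous
    if delta < 0:
        # The counter restarted from zero (for example after a process
        # restart), so everything counted so far happened after the reset.
        delta = current
    return delta / interval_seconds


def decision_shares(counts):
    """Turn per-type decision counts into fractions of all decisions."""
    total = sum(counts.values())
    if total == 0:
        return {decision: 0.0 for decision in counts}
    return {decision: count / total for decision, count in counts.items()}


# Decision counts from the sample output above.
counts = {"Permit": 5.0, "Indeterminate": 0.0, "Deny": 1.0, "NotApplicable": 0.0}
shares = decision_shares(counts)
print(round(shares["Permit"], 4))  # 0.8333

# Hypothetical: the total grows from 6 to 66 decisions between two readings
# taken 60 seconds apart.
print(per_second_rate(6.0, 66.0, 60.0))  # 1.0
```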

To access the metrics data, you need to add a configuration property to the deployment configuration file. This property configures the metrics backend that, according to the chosen model, receives the data for visualization and/or further processing.

Note: In this implementation, metrics are collected for requests served from the REST endpoint of ADS. No extra features are implemented for the legacy SOAP endpoint. Any metrics data shown for the SOAP endpoint would be information produced by default by Dropwizard's Metrics library.

Shared configuration parameters for metrics

Statistics emanating from a timer like max, percentiles, and histogram counts decay over time to give greater weight to recent samples. To manage the decay behavior, ADS internally uses a ring buffer (an array with a pointer to a particular element) to keep track of the values for maximums and percentiles.

This functionality can be controlled by two optional parameters that are shared by all monitoring systems used for metrics processing. (As mentioned above, several monitoring systems can be used concurrently.)

  1. Open the deployment configuration file in a text editor.
  2. Add the property metricsBackends:.
  3. Add the subproperty parameters:.
  4. Add the parameters decay: and bufferLength: with suitable values.
metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3

Shared configuration parameters for metrics

| Parameter | Default value | Possible values | Description |
| --- | --- | --- | --- |
| decay | 2 minutes | <a valid duration>, minimum value 1 second | The decay parameter refers to how often the array pointer is moved to the next element of the ring buffer. The duration must be expressed as a positive integer and a time unit. NOTE: To avoid undersampling, regardless of using a single or multiple backends, Axiomatics strongly recommends using a minimum decay value that is at least two times higher than the highest value used for the step parameter in the configuration of all enabled metrics backends. For example, if the step is equal to 1 minute, the decay should be equal to 2 minutes. Axiomatics also recommends determining the value for the step parameter first and then adjusting the decay value accordingly. |
| bufferLength | 3 | <a positive integer> | The bufferLength parameter refers to the size of the ring buffer. NOTE: Axiomatics does not recommend using a value as low as 1. |

Shared parameters for metrics

Important: These parameters are optional. However, if added, the value for a parameter cannot be empty or null. That would be an invalid entry and the system would not be initialized. If the parameters are not included in the configuration, their default values will be used.
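To picture how decay and bufferLength interact, the following Python sketch implements a small ring buffer that tracks a decaying maximum. It follows a common time-window design and is an assumption made for illustration, not the actual ADS implementation. The class name DecayingMax is hypothetical, and rotate() is called by hand where a real system would call it once per decay interval.

```python
class DecayingMax:
    """Track a maximum that forgets old samples, using a ring buffer."""

    def __init__(self, buffer_length=3):
        if buffer_length < 1:
            raise ValueError("buffer_length must be a positive integer")
        self._slots = [0.0] * buffer_length  # plays the role of bufferLength
        self._current = 0                    # the array pointer

    def record(self, value):
        # Every slot sees every sample, so each slot holds the maximum
        # observed since that slot was last reset.
        self._slots = [max(slot, value) for slot in self._slots]

    def rotate(self):
        # Called once per decay interval: forget the current slot and move
        # the pointer to the next element of the ring buffer.
        self._slots[self._current] = 0.0
        self._current = (self._current + 1) % len(self._slots)

    def value(self):
        # The slot under the pointer is the one reset longest ago, so it
        # covers the widest window of recent samples.
        return self._slots[self._current]


tracker = DecayingMax(buffer_length=3)
tracker.record(0.014)      # an early, slow request
tracker.rotate()
tracker.record(0.005)      # a later, faster request
print(tracker.value())     # 0.014: the old maximum is still in the window
tracker.rotate()
tracker.rotate()
print(tracker.value())     # 0.005: the old maximum has decayed out
```

In this sketch a sample stays visible for at most bufferLength rotations. With a buffer length of 1, the reported maximum would drop to zero at every rotation.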

Configuration for ASM

ASM can be used as a metrics backend. This configuration is used to publish key metrics for the graph displays of the Dashboard feature of ASM.

Note: ASM can be used as a metrics backend only in the case where ADS is started with an authorization domain retrieved from ASM. In order to publish the metrics to ASM, you have to configure the authentication to ASM as described in the Authentication using an authorization server section. Furthermore, ASM must be running with the Dashboard functionality enabled, as described in the Installation section of the ASM documentation.

To set up ADS for use with ASM, you need to add a property for ASM and two parameters in the deployment configuration file.

  1. Open the deployment configuration file in a text editor.
  2. Add the property metricsBackends: with shared parameters as described in the previous section (if not already present).
  3. Add the property asm: and the parameter enabled: with the value true, and the parameter uri: with a string that represents the ASM endpoint.
metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  asm:
    enabled: true
    uri: https://localhost/metrics/push
| Parameter | Man./Opt. | Default value | Possible value | Description |
| --- | --- | --- | --- | --- |
| enabled | mandatory | - | true, false | This parameter enables/disables the collection of data for the ASM metrics backend. |
| uri | mandatory | - | https://localhost/metrics/push | The URI for the ASM backend. |

ASM configuration parameters

Configuration for Prometheus

To set up ADS for use with Prometheus, you need to add a property for Prometheus and at least one parameter in the deployment configuration file.

  1. Open the deployment configuration file in a text editor.
  2. Add the property metricsBackends: with shared parameters as described in the previous section (if not already present).
  3. For a minimal Prometheus configuration, where only the mandatory parameter is used, add the property prometheus: and the parameter enabled: with the value true.
metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  prometheus:
    enabled: true

Minimal Prometheus configuration

Other configuration parameters are available for Prometheus. These are optional and can be added to suit your configuration needs. The following example shows an instance using all available parameters. The table provides a description of the parameters and their possible values.

metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  prometheus:
    enabled: true
    descriptions: false
    histogramFlavor: VictoriaMetrics
    prefix: prom
    step: 1 minute

Complete Prometheus configuration

| Parameter | Man./Opt. | Default value | Possible value | Description |
| --- | --- | --- | --- | --- |
| enabled | mandatory | - | true, false | This parameter enables/disables the collection of data for the Prometheus metrics backend via an administration endpoint. The endpoint is GET /admin/metrics/prometheus under the administration endpoint. |
| descriptions | optional | true | true, false | Turn this parameter on if meter descriptions should be sent to Prometheus. Turn it off to minimize the amount of data sent on each scrape. |
| histogramFlavor | optional | Prometheus | Prometheus, VictoriaMetrics | The Histogram type to use for the meters DistributionSummary and Timer. |
| prefix | optional | prometheus | <string> | This is the prefix used by the metrics library employed internally by ADS. |
| step | optional | 1 minute | <a valid duration>, minimum value 1 second | This parameter refers to how often data is sampled from gauges and percentiles. The duration must be expressed as a positive integer and a time unit. NOTE: Axiomatics strongly recommends that the step interval is the same as the pull (or scrape) interval set for Prometheus, and half that of the value of the decay parameter mentioned above. Axiomatics also recommends determining the value for the step parameter first and then adjusting the decay value accordingly. |

Prometheus configuration parameters

Important: If an optional parameter is added, its value cannot be empty or null. That would be an invalid entry and the system would not be initialized. If a parameter is not included in the configuration, its default value will be used.
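The recommended relationship between step and decay (decay at least twice the highest step across all enabled backends) can be checked before deployment. The Python sketch below is one way to do that. The helper names are hypothetical, and the set of accepted time units (second, minute, hour) is an assumption made for this example; the units that ADS actually accepts may differ.

```python
import re

# Assumed units for this sketch only.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_duration(text):
    """Convert a string such as '2 minutes' or '1 second' to seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+?)s?\s*", text.lower())
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if amount < 1 or unit not in UNIT_SECONDS:
        raise ValueError(f"not a valid duration: {text!r}")
    return amount * UNIT_SECONDS[unit]


def decay_is_sufficient(decay, steps):
    """True if decay is at least twice the longest step of all backends."""
    longest_step = max(parse_duration(step) for step in steps)
    return parse_duration(decay) >= 2 * longest_step


# One backend with the default step of 1 minute.
print(decay_is_sufficient("2 minutes", ["1 minute"]))                # True
# A second backend with a 90-second step makes 2 minutes too short.
print(decay_is_sufficient("2 minutes", ["1 minute", "90 seconds"]))  # False
```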

These instructions cover only the steps needed to configure ADS for use with Prometheus. For other questions regarding Prometheus, please refer to the Prometheus documentation.

Prometheus endpoint

This is the administration endpoint at which ADS exposes the current values of the metrics for Prometheus to pull (or scrape). The endpoint is only available when there is a valid Prometheus configuration enabled.

GET /admin/metrics/prometheus
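For a quick manual check that the endpoint responds, a small client along the following lines can be used. This is a minimal Python sketch using only the standard library. The base URL http://localhost:8081 is a placeholder, not a documented default, and any authentication or TLS settings that the administration endpoint may require in your deployment are omitted.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Path of the Prometheus endpoint as given in this section.
METRICS_PATH = "/admin/metrics/prometheus"


def metrics_url(admin_base_url):
    """Build the full endpoint URL from the base URL of the admin endpoint."""
    return urljoin(admin_base_url, METRICS_PATH)


def fetch_metrics(admin_base_url, timeout=5.0):
    """Return the current metrics as plain text (performs an HTTP GET)."""
    with urlopen(metrics_url(admin_base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call, with a placeholder host and port:
# print(fetch_metrics("http://localhost:8081"))
print(metrics_url("http://localhost:8081"))
```

If the Prometheus backend is not enabled in the configuration, the request is expected to fail, since the endpoint is only available with a valid Prometheus configuration.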

Configuration for Azure Monitor

Important: The option of configuring Azure Monitor as a metrics backend in ADS as described below (using the OpenTelemetry Java agent) is no longer recommended, and its use is deprecated. This custom configuration option will be removed in a future release. When Azure Monitor Application Insights is used, the Application Insights Java agent should be used for both tracing and metrics information. See Running ADS with the Application Insights Java agent.

Azure Monitor Application Insights is a feature of Azure Monitor that is used to monitor live applications. To set up ADS for use with Azure Monitor Application Insights, you need to provide a property for Azure Monitor and at least two parameters in the deployment configuration file.

Note: If ADS is launched with the Application Insights Java agent, Axiomatics recommends disabling the azureMonitor configuration in the deployment configuration file. Conversely, do not use the Application Insights Java agent when the azureMonitor configuration is enabled.

  1. Open the deployment configuration file in a text editor.
  2. Add the property metricsBackends: with shared parameters as described in the previous section (if not already present).
  3. For a minimal Azure Monitor configuration, where only the mandatory parameters are used, add the subproperty azureMonitor: and the parameter enabled: with the value true, and the parameter instrumentationKey: with the value of a string that represents the Instrumentation Key.
metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  azureMonitor:
    enabled: true
    instrumentationKey: <String that represents the Instrumentation Key>

Minimal Azure Monitor Application Insights configuration

Other configuration parameters are available for Azure Monitor. These are optional and can be added to suit your configuration needs. The following example shows an instance using all available parameters. The table provides a description of the parameters and their possible values.

metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  azureMonitor:
    enabled: true
    instrumentationKey: <String that represents the Instrumentation Key>
    prefix: azuremonitor
    step: 1 minute

Complete Azure Monitor Application Insights configuration

| Parameter | Man./Opt. | Default value | Possible values | Description |
| --- | --- | --- | --- | --- |
| enabled | mandatory | - | true, false | This parameter enables/disables the collection of metrics data for the Azure Monitor metrics backend (Application Insights) via the Instrumentation Key. |
| instrumentationKey | mandatory | - | <a string that represents the Instrumentation Key> | The Instrumentation Key identifies the Application Insights resource that should be associated with the metrics data sent by ADS. It is the key integration point between ADS and the Application Insights monitoring service. |
| prefix | optional | azuremonitor | <string> | This is the prefix used by the metrics library employed internally by ADS. |
| step | optional | 1 minute | <a valid duration>, minimum value 1 minute | This parameter refers to how often data is sampled from gauges and percentiles. It also governs the push interval, that is, the reporting frequency. The duration must be expressed as a positive integer and a time unit. NOTE: Axiomatics strongly recommends that the step interval is half that of the value of the decay parameter mentioned above. Axiomatics also recommends determining the value for the step parameter first and then adjusting the decay value accordingly. |

Azure Monitor Application Insights configuration parameters

Important: If an optional parameter is added, its value cannot be empty or null. That would be an invalid entry and the system would not be initialized. If a parameter is not included in the configuration, its default value will be used.

These instructions cover only the steps needed to configure ADS for use with Azure Monitor Application Insights. For other questions regarding Azure Monitor Application Insights, please refer to the Azure Monitor Application Insights documentation.

Configuration for InfluxDB

To set up ADS for use with InfluxDB, you first need to create an InfluxDB account, which provides some of the parameters needed in the deployment configuration file.

  1. Open the deployment configuration file in a text editor.
  2. Add the property metricsBackends: with shared parameters as described in the previous section (if not already present).
  3. For a minimal InfluxDB configuration, where only the mandatory parameters are used, add the subproperty influx: and the parameter enabled: with the value true, the parameter org: with the value of a string that represents the destination organization for writes, the parameter bucket: with the value of a string that represents destination bucket for writes, and the parameter token: with the value of a string that represents the authentication token for the InfluxDB API.

Note: The parameters bucket, org, and token are created by the action of creating the InfluxDB account and the corresponding values can then be entered/copied into the configuration file. (The values for org and bucket are manually defined, while the token is automatically generated.)

metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  influx:
    enabled: true
    org: my-organization
    bucket: my-bucket
    token: <String that represents the authentication token>

Minimal InfluxDB configuration

Other configuration parameters are available for InfluxDB. These are optional and can be added to suit your configuration needs. The following example shows an instance using all available parameters. The table provides a description of the parameters and their possible values.

metricsBackends:
  parameters:
    decay: 2 minutes
    bufferLength: 3
  influx:
    enabled: true
    org: my-organization
    prefix: influx
    step: 1 minute
    bucket: my-bucket
    token: <String that represents the authentication token>
    uri: http://localhost:8086

Complete InfluxDB configuration

| Parameter | Man./Opt. | Default value | Possible values | Description |
| --- | --- | --- | --- | --- |
| enabled | mandatory | - | true, false | This enables/disables the InfluxDB time series platform that collects, stores, processes and visualizes metrics and events. |
| prefix | optional | influx | <string> | This is the prefix used by the metrics library employed internally by ADS. |
| org | mandatory | - | <string> | Specifies the destination organization for writes. Takes either the ID or Name interchangeably. This needs to be an org that exists in InfluxDB. |
| bucket | mandatory | - | <string> | Specifies the destination bucket for writes. Takes either the ID or Name interchangeably. This needs to be a bucket that exists in InfluxDB. |
| token | mandatory | - | <a string that represents the authentication token> | Authentication token for the InfluxDB API, to authorize API requests. This token is automatically generated in InfluxDB. |
| step | optional | 1 minute | <a valid duration>, minimum value 1 second | This parameter refers to how often data is sampled from gauges and percentiles. It also governs the push interval, that is, the reporting frequency. The duration must be expressed as a positive integer and a time unit. NOTE: Axiomatics strongly recommends that the step interval is half that of the value of the decay parameter mentioned above. Axiomatics also recommends determining the value for the step parameter first and then adjusting the decay value accordingly. |
| uri | optional | http://localhost:8086 | <string> | The URI for the Influx backend. This parameter can be set to allow writes only to a specific instance of InfluxDB. |

InfluxDB configuration parameters

ADS pushes metrics data to InfluxDB according to the step setting. For example, if step is set to 1 minute and ADS serves 1 request per second, the data for all 60 requests is written to InfluxDB once every minute.

Important: If an optional parameter is added, its value cannot be empty or null. That would be an invalid entry and the system would not be initialized. If a parameter is not included in the configuration, its default value will be used.

These instructions cover only the steps needed to configure ADS for use with InfluxDB. For other questions regarding InfluxDB, please refer to the InfluxDB documentation.

Enabling TLS encryption

InfluxData strongly recommends enabling TLS, especially if you plan on sending requests to InfluxDB over a network.

Proper TLS certificates must be provided to InfluxDB, and to the JVM used to run ADS.

Refer to Enable TLS encryption in the InfluxDB documentation for information on how to enable and configure TLS encryption.