Sample
Sample events from an event stream based on supplied criteria and at a configurable rate.
Configuration
Example configurations
{
  "transforms": {
    "my_transform_id": {
      "type": "sample",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "rate": 1500
    }
  }
}

[transforms.my_transform_id]
type = "sample"
inputs = [ "my-source-or-transform-id" ]
rate = 1_500

transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    rate: 1500
{
  "transforms": {
    "my_transform_id": {
      "type": "sample",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "group_by": "{{ service }}",
      "key_field": "message",
      "rate": 1500,
      "sample_rate_key": "sample_rate"
    }
  }
}

[transforms.my_transform_id]
type = "sample"
inputs = [ "my-source-or-transform-id" ]
group_by = "{{ service }}"
key_field = "message"
rate = 1_500
sample_rate_key = "sample_rate"

transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    group_by: "{{ service }}"
    key_field: message
    rate: 1500
    sample_rate_key: sample_rate
exclude
optional condition
Available syntaxes
Syntax | Description | Example |
---|---|---|
vrl | A Vector Remap Language (VRL) Boolean expression. | .status_code != 200 && !includes(["info", "debug"], .severity) |
datadog_search | A Datadog Search query string. | *stack |
is_log | Whether the incoming event is a log. | |
is_metric | Whether the incoming event is a metric. | |
is_trace | Whether the incoming event is a trace. | |
Shorthand for VRL
If you opt for the vrl syntax for this condition, you can set the condition as a string via the condition parameter, without needing to specify both a source and a type. The table below shows some examples:
Config format | Example |
---|---|
YAML | condition: .status == 200 |
TOML | condition = ".status == 200" |
JSON | "condition": ".status == 200" |
Condition config examples
Standard VRL
exclude:
  type: "vrl"
  source: ".status == 500"

exclude = { type = "vrl", source = ".status == 500" }

"exclude": {
  "type": "vrl",
  "source": ".status == 500"
}
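To show where the condition sits in a complete transform definition, here is a minimal sketch that reuses the standard VRL form above. The component ID sample_non_errors and the rate of 10 are illustrative only. The comments assume that events matching exclude are left out of sampling and passed through unchanged, which this page does not spell out.

```yaml
transforms:
  sample_non_errors:              # hypothetical component ID
    type: sample
    inputs:
      - my-source-or-transform-id
    rate: 10                      # illustrative: forward roughly 1 in 10 events
    exclude:                      # assumed: matching events bypass sampling
      type: "vrl"
      source: ".status == 500"
```

With this layout, the sampling rate would apply only to events whose status is not 500.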
graph
optional object
Extra graph configuration
Configure output for this component when it is generated with the graph command.
graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
graph.node_attributes.*
required string literal
group_by
optional string template
The value to group events into separate buckets to be sampled independently.
If left unspecified, or if the event doesn’t have group_by, then the event is not sampled separately.
inputs
required [string]
A list of upstream source or transform IDs.
Wildcards (*) are supported.
See configuration for more info.
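As a small illustration of the wildcard support mentioned above, the sketch below samples from every upstream component whose ID starts with a given prefix. The prefix app- is a made-up example and does not come from this page.

```yaml
transforms:
  my_transform_id:
    type: sample
    inputs:
      - "app-*"                   # hypothetical prefix; quoted so * stays a literal string
    rate: 1500
```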
key_field
optional string literal
The name of the field whose value is hashed to determine if the event should be sampled.
Each unique value for the key creates a bucket of related events to be sampled together, and the rate is applied to the buckets themselves to sample 1/N buckets. The overall rate of sampling may differ from the configured one if values in the field are not uniformly distributed. If left unspecified, or if the event doesn’t have key_field, then the event is sampled independently.
This can be useful to, for example, ensure that all logs for a given transaction are sampled together, but that overall 1/N transactions are sampled.
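The transaction use case described above could be configured roughly as follows. This is a sketch: the field name transaction_id, the component ID sample_transactions, and the rate of 10 are assumptions chosen for illustration.

```yaml
transforms:
  sample_transactions:            # hypothetical component ID
    type: sample
    inputs:
      - my-source-or-transform-id
    key_field: transaction_id     # hypothetical field; events sharing a value are kept or dropped together
    rate: 10                      # illustrative: about 1 in 10 transactions is forwarded
```

Because the rate applies to buckets rather than to individual events, a few very large transactions can push the share of forwarded events above or below 1/10.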
rate
required uint
The rate at which events are forwarded, expressed as 1/N.
For example, rate = 1500 means 1 out of every 1500 events is forwarded and the rest are dropped.
sample_rate_key
optional string literal
default: sample_rate
Outputs
<component_id>
Telemetry
Metrics
component_discarded_events_total
counter
filter transform, or false if due to an error.
component_errors_total
counter
component_received_event_bytes_total
counter
component_received_events_count
histogram
A histogram of the number of events passed in each internal batch in Vector’s internal topology.
Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector due to small internal batches.