The Vector team is pleased to announce version 0.20.0!
In addition to the new features, enhancements, and fixes listed below, this release includes a new opt-in disk buffer implementation that we hope will give users faster, more consistent buffer performance with lower resource usage. We encourage you to opt in during this beta period and give us feedback. See the beta disk buffer highlight article for more details, including how to opt in.
We also made additional performance improvements in this release, increasing average throughput by 10-20% for common topologies (see our soak test framework).
Be sure to check out the upgrade guide for breaking changes in this release.
0.20.1.
vector test will panic. Will be fixed in 0.20.1.
The route transform was refactored to rely on Vector’s new concept of named outputs for components. The behavior is the same, but the metrics emitted by the transform have been updated as noted in the highlight article for named output metrics.
A multiple_outputs option was added to the datadog_agent source. When set to true, rather than emitting both logs and metrics to any components using the component id of the source in inputs, the events will be split into two separate streams, <component_id>.logs and <component_id>.metrics, so that logs and metrics from the Datadog agent can be processed separately. When set to true, the internal metrics from this source are also updated as mentioned in the highlight article for named output metrics. multiple_outputs defaults to false for compatibility with existing configurations.
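As a rough illustration, a configuration using the new option might look like the sketch below. Only multiple_outputs and the .logs / .metrics output names come from the notes above; the component names (dd_agent, agent_logs, agent_metrics), the listen address, and the downstream remap and console components are assumptions chosen for the example.

```toml
# Hypothetical source name; the address is an assumed listen address.
[sources.dd_agent]
type = "datadog_agent"
address = "0.0.0.0:8080"
multiple_outputs = true            # split events into .logs and .metrics

# Consumes only the log stream of the source.
[transforms.agent_logs]
type = "remap"
inputs = ["dd_agent.logs"]
source = '''
.pipeline = "logs"
'''

# Consumes only the metric stream of the source.
[sinks.agent_metrics]
type = "console"
inputs = ["dd_agent.metrics"]
encoding.codec = "json"
```

With multiple_outputs left at its default of false, downstream components would instead reference the source as plain "dd_agent" in inputs.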
The telemetry for a few components was updated as a result of expanding the use of named outputs in Vector for components with multiple output streams:
route
datadog_agent (when multiple_outputs is true)
remap (when reroute_dropped is true)
These components now add an output tag to their metrics. See the highlight article for named output metrics.
The abort function can now take an optional string message to include in logs and the metadata for dropped events. For example:
if .foo == 5 {
  # .foo is not a string, so convert it before concatenating
  abort "foo is " + to_string!(.foo) + "!"
}
The "foo is 5!" message will appear in Vector’s internal logs and will also be used as metadata.message for rerouted dropped events (if reroute_dropped is true).
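To show where the abort message ends up, here is a minimal configuration sketch. It assumes the dropped events are exposed on an output named <component_id>.dropped, as described in the remap transform documentation; the component names (check_foo, dropped_events, my_source) are hypothetical and the message is a fixed string to keep the example small.

```toml
[transforms.check_foo]
type = "remap"
inputs = ["my_source"]             # hypothetical upstream component
drop_on_abort = true               # events that hit abort are dropped...
reroute_dropped = true             # ...and rerouted instead of discarded
source = '''
if .foo == 5 {
  abort "foo is five"
}
'''

# Receives the rerouted events, which carry the abort message in their metadata.
[sinks.dropped_events]
type = "console"
inputs = ["check_foo.dropped"]     # assumed name of the dropped output
encoding.codec = "json"
```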
When a custom message is passed to assert or assert_eq, this message is now used as metadata.message rather than the default VRL error message, which included function call error: .... This allows cleaner handling of different assertion errors if routing dropped events to another component (when reroute_dropped is true).
The vector source and vector sink now default to version 2 of the protocol for inter-Vector communication. See the upgrade guide for more details.
We are in the process of updating all Vector components with consistent instrumentation, as described in Vector’s component specification. With this release we have instrumented the following sources with these new metrics:
aws_ecs_metrics
aws_kinesis_firehose
aws_s3
datadog_agent
demo_logs
dnstap
docker_logs
eventstoredb_metrics
exec
host_metrics
internal_logs
internal_metrics
kafka
kubernetes_logs
nats
nginx_metrics
prometheus_scrape
stdin
vector
And these transforms:
aws_ec2_metadata
The loki sink now supports the compression option. For now, the only available compression is gzip.
The internal telemetry for the loki sink has been improved by:
Adding a component_discarded_events_total metric for discarded events
Adding a rewritten_timestamp_events_total metric to count events whose timestamp was rewritten
Lowering the log level for out-of-order events from warn to debug, since out-of-order events are a common occurrence, something the sink is designed to handle, and the handling behavior is explicitly configured by the user
See the upgrade guide for more details.
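For reference, enabling the loki sink's compression option mentioned above might look like the following sketch. Only compression = "gzip" comes from these notes; the sink name, upstream component, endpoint, label, and codec are assumptions for illustration.

```toml
[sinks.loki_out]
type = "loki"
inputs = ["my_source"]                 # hypothetical upstream component
endpoint = "http://localhost:3100"     # assumed Loki address
compression = "gzip"                   # the only compression available for now
encoding.codec = "json"
labels.source = "vector"               # illustrative label
```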
When a condition (for example, in the route transform) that was written with VRL fails to execute, the reason for failure is now output in the logs.
vector top has been updated to show the event metrics per output when a component has multiple outputs.
The fluent, syslog, and socket sources now handle back-pressure better. Previously, back-pressure to these sources from downstream components would cause runaway resource growth, as the source would continue to accept connections even though it couldn’t handle them right away. A new algorithm has been introduced to start applying back-pressure to clients once ~100k events are buffered in the source. The source will then start closing new connections, rather than accepting them, until it has flushed events downstream.
The vector tap command has had two enhancements:
A logfmt option that can be used to format the tapped output as logfmt data
A new new_relic sink was added that can handle both logs and metrics. It can also send logs to New Relic as New Relic events. It replaces the existing new_relic_logs sink, which has been deprecated.
Thanks to @Andreu for this contribution!
Fixed runaway memory growth when the prometheus_exporter
sink was used with distributions due to the
sink holding onto all of the samples it had seen.
This required a breaking change as documented in the upgrade guide.
The splunk_hec source now accepts invalid UTF-8 bytes, matching Splunk’s HEC behavior. It replaces them with the UTF-8 replacement character: �
A few fixes were made to parse_groks:
The array filter now supports arrays without brackets, if an empty string is provided as the brackets. For example: %{data:field:array("", "-")} will parse "a-b" into ["a", "b"].
The keyvalue filter now supports nested paths as keys. For example:
parse_groks!("db.name=my_db,db.operation=insert",
  patterns: ["%{data::keyvalue}"]
)
will yield
{
  "db": {
    "name": "my_db",
    "operation": "insert"
  }
}
The date filter now supports the d and y shorthands. For example:
parse_groks!("Nov 16 2020 13:41:29 GMT", patterns: ["%{date(\"MMM d y HH:mm:ss z\"):field}"])
will yield:
1605534089000
Numeric values are now returned as integers when there is no loss in precision; otherwise they continue to be returned as floats.
Aliases that match multiple fields now correctly extract as an array of values.
Aliases with a filter now correctly extract.
The keyvalue filter now correctly ignores keys without values.
To match the beginning and the end of lines, \A and \Z must be used rather than ^ and $.
The keyvalue filter now correctly handles parsing keys that start with a number.
The azure_blob sink now correctly parses connection strings including SAS tokens.
The buffer_events_total metric no longer counts discarded events when drop_newest is used for buffers. Previously, this was counting discarded events.
Proxy configuration is now respected by the aws_sqs source. This can be configured the usual way, through the proxy configuration field or environment variables.
encoding.only_fields now correctly deserializes again for sinks that use fixed encodings (i.e. those that don’t have encoding.codec). This was a regression in v0.18.0.
if conditions now correctly check for fallibility. This is a breaking change. See the upgrade guide for more details.
decoding.codec.
VRL is no longer fallible. See the upgrade guide for more details.
component-id). This was a regression in v0.19.0.
The syslog decoder (decoding.codec on sources) now correctly errors if the incoming data is not actually syslog data. Previously it would pass through the invalid data. In the future, we will likely add a way to route invalid events.