0.24 Upgrade Guide

An upgrade guide that addresses breaking changes in 0.24.0

Vector’s 0.24.0 release includes breaking changes:

  1. VRL rejects querying non-collection types on assignment
  2. Metric bucket counts are now u64
  3. codec field on sink encoding must be specified explicitly
  4. ndjson on sink encoding is now json encoding + newline_delimited framing
  5. VRL type definition updates
  6. Removal of deprecated transforms

and deprecations:

  1. internal_metrics source cardinality metric change
  2. Imminent removal of deprecated names for a select set of sources, transforms, and sinks
  3. Deprecation of modulus operator
  4. Deprecation of the geoip transform

We cover them below to help you upgrade quickly:

Upgrade guide

Breaking changes

VRL rejects querying non-collection types on assignment

Previously, the following would work:

foo = 42
foo.bar = 3.14

This is now rejected, and instead returns a compiler error:

error[E642]: parent path segment rejects this mutation
  ┌─ :1:5
1 │ foo.bar = 3.14
  │ --- ^^^ querying a field of a non-object type is unsupported
  │ │
  │ this path resolves to a value of type integer
  = try: change parent value to object, before assignment
  =
  =     foo = {}
  =     foo.bar = 3.14
  =
  = see documentation about error handling at https://errors.vrl.dev/#handling
  = see language documentation at https://vrl.dev

This change was made to prevent accidentally overwriting non-collection types. As the diagnostic message suggests, you can still achieve the desired result by first re-writing the non-collection type to a collection type (foo = {}), and then mutating the collection itself.

This change applies to both objects and arrays, so this example is also disallowed:

foo = 42
foo[0] = 3.14

Metric bucket counts are now u64

The field storing metric bucket counts for Histogram metrics has now been upgraded to use 64 bits from 32 bits. This allows for much larger bucket sizes to be used. To facilitate this we have updated the proto files that determine how an event is persisted. Newer versions of Vector will be able to read older versions of metrics, but older versions of Vector may not be able to read newer versions of metrics.

This has two potential implications that you should consider.

  1. Disk buffers should be backed up if you want to be able to roll back to an older Vector version since new disk buffer entries may not be readable by older Vector versions. The disk buffers location can be found under the Vector data directory.

  2. When upgrading Vector to Vector communication (the vector source and sink or the native codec) make sure you upgrade the consumers first followed by the producers to ensure newer versions of Vector aren’t sending data to older versions, which may not be able to be read.

codec field on sink encoding must be specified explicitly

Setting the encoding value on a sink by a string has been deprecated in 0.23.0 and is removed in this release. To migrate have a look at the upgrade guide for deprecated shorthand values for encoding options.

ndjson on sink encoding is now json encoding + newline_delimited framing

The ndjson sink encoding value has been deprecated in 0.23.0 and is removed in this release. To migrate have a look at the upgrade guide for the ndjson sink encoding value.

VRL type definition updates

There were many situations where VRL didn’t calculate the correct type definition. These are now fixed. In some cases this can cause compilation errors when upgrading if the code relied on the previous (incorrect) behavior.

The best way to fix these issues is to let the compiler guide you through the problems, it will usually provide suggestions on how to fix the issue. Please give us feedback if you think any error diagnostics could be improved, we are continually trying to improve them.

The most common error you will probably see is the fallibility of a function changed because the type of one of the parameters changed.

For example, if you are trying to split a string, but the input could now be null, the error would look like this

error[E110]: invalid argument type
  ┌─ :1:7
1 │ split(msg, " ")
  │       ^^^
  │       │
  │       this expression resolves to one of string or null
  │       but the parameter "value" expects the exact type string
  = try: ensuring an appropriate type at runtime
  =
  =     msg = string!(msg)
  =     split(msg, " ")
  =
  = try: coercing to an appropriate type and specifying a default value as a fallback in case coercion fails
  =
  =     msg = to_string(msg) ?? "default"
  =     split(msg, " ")
  =
  = see documentation about error handling at https://errors.vrl.dev/#handling
  = learn more about error code 110 at https://errors.vrl.dev/110
  = see language documentation at https://vrl.dev
  = try your code in the VRL REPL, learn more at https://vrl.dev/examples

As suggested, you have a few options to solve errors like this.

  1. Abort if the arguments aren’t the right type by appending the function name with !, such as to_string!(msg)
  2. Force the type to be a string, using the string function. This function will error at runtime if the value isn’t the expected type. You can call it as string! to abort if it’s not the right type.
  3. Provide a default value if the function fails using the “error coalescing” operator (??), such as to_string(msg) ?? "default"
  4. Handle the error manually by capturing both the return value and possible error, such as result, err = to_string(msg)

Removal of deprecated transforms

The transforms that had been deprecated in-lieu of the remap transform have finally been dropped:

  • add_fields
  • add_tags
  • ansi_stripper
  • aws_cloudwatch_logs_subscription_parser
  • coercer
  • concat
  • grok_parser
  • json_parser
  • key_value_parser
  • logfmt_parser
  • merge
  • regex_parser
  • remove_fields
  • remove_tags
  • rename_fields
  • split
  • tokenizer

See the official deprecation notice in the v0.22.0 release for migration instructions.

Deprecation Notices

internal_metrics source cardinality metric change

The internal_metrics source has been emitting a metric indicating the number of distinct metrics in its internal registry. This currently has the name internal_metrics_cardinality_total and is a counter type metric.

With the introduction of the capability of expiring metrics from the internal registry, it is no longer accurate to call this a counter as its value may fluctuate during the running of vector. As such, a new metric called internal_metrics_cardinality of type gauge has been added. The existing metric described above is now deprecated and will be removed in a future version.

See the documentation for additional details.

Deprecation of modulus operator

The modulus operator (%) is being deprecated in 0.24.0 in favor of the newly introduced mod function. This operator is fairly infrequently used, and moving this to a function allows the % to be repurposed for an upcoming feature, allowing you to query event metadata similar to event fields. This also now additionally allows the option of simply aborting the mod function on an error instead of either capturing the error or using the “error coalescing” operator. The modulus operator will be completely removed as early as 0.25.0.

Imminent removal of deprecated names for a select set of sources, transforms, and sinks

In 0.22.0, we had announced the removal of certain deprecated transforms. Continuing that work, we are deprecating the original names for a select set of sources, transforms, and sinks. These components have since been renamed to better align with the actual name of the product/technology, or to better differentiate them from related components.

The following component type names will be *removed in 0.25.0:

Sources:

  • generator (now demo_logs)
  • docker (now docker_logs)
  • logplex (now heroku_logs)
  • prometheus (now prometheus_scrape)

Transforms:

  • swimlanes (now route)
  • sampler (now sample)

Sinks:

  • new_relic_logs (now new_relic)
  • prometheus (now prometheus_exporter)
  • splunk_hec (now spelunk_hec_logs)

Deprecation of the geoip transform

The geoip transform has been deprecated in-lieu of new support for geoip enrichment tables. These can be used with VRL’s enrichment table functions to enrich events using a GeoIP database.

For example,

transforms:
  geoip:
    type: geoip
    inputs:
      - with_ip_info
    database: /etc/vector/GeoLite2-City.mmdb
    source: ip_address
    target: geoip

can be migrated as:

enrichment_tables:
  geoip_table:
    path: /etc/vector/GeoLite2-City.mmdb
    type: geoip

transforms:
  geoip:
    type: remap
    inputs:
      - with_ip_info
    source: |-
      .geoip = get_enrichment_table_record!("geoip_table",
        {
          "ip": .ip_address
        }      

(credit to Jacq for this migration example)

Additionally the geoip enrichment table has new support for Connection-Type databases.