Working with Transforms

Overview

Nimbus Transforms are high-level NTL functions with specialized logic for specific optimizations.

Once an optimization is applied, you can find its corresponding transform in the Transforms section of the console.

Click Edit to update or delete an existing transform.

Global Options

The following properties are available on all transforms:

process_when

status: required
type: NTL

Determines when a transform should be applied. Takes one or more NTL predicates as input.

Example:

process_when: 
  - {key: service, op: EQUAL, val: foo}
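To make the predicate semantics concrete, here is a minimal Python sketch of how a list of predicates could be evaluated against a log event. It is not the NTL implementation: the `matches` helper is hypothetical, it assumes all predicates must hold (AND), it only looks at top-level keys, and it covers only the three operators that appear on this page.

```python
import re

def matches(event, predicates):
    """Return True only if every predicate holds (assumed AND semantics)."""
    for pred in predicates:
        key, op, val = pred["key"], pred["op"].upper(), pred["val"]
        if op == "EQUAL":
            ok = event.get(key) == val
        elif op == "EXISTS":
            # val: true means the key must be present, val: false means absent
            ok = (key in event) == bool(val)
        elif op == "MATCH":
            # assumption: val is treated as a regular expression
            value = event.get(key)
            ok = isinstance(value, str) and re.search(val, value) is not None
        else:
            raise ValueError(f"unsupported op: {pred['op']}")
        if not ok:
            return False
    return True

predicates = [{"key": "service", "op": "EQUAL", "val": "foo"}]
print(matches({"service": "foo", "message": "hi"}, predicates))  # True
print(matches({"service": "bar"}, predicates))                   # False
```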

include_errors

status: optional
type: boolean
default: false

When set to true, indicates that the current transform can also apply to error logs. By default, error logs are not transformed; they are proxied downstream immediately.

msg_field

status: optional
type: string
default: message

The key where the log body is located.

Example:

# logs sent by dd lambda extension have the message field nested inside the message key
# eg:
# {message: { message: "START ...", lambda: {arn: arn:aws:lambda:us-east-1:33333333:function:test-lambda, ...}}}
msg_field: message.message
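The dotted path in msg_field can be read as a walk through nested keys. The sketch below illustrates that reading in Python; `resolve_path` is a hypothetical helper, not part of Nimbus, and it assumes each path segment is a plain object key.

```python
def resolve_path(event, path):
    """Follow a dotted path through nested dicts; return None if a segment is missing."""
    current = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

event = {
    "message": {
        "message": "START ...",
        "lambda": {"arn": "arn:aws:lambda:us-east-1:33333333:function:test-lambda"},
    }
}
print(resolve_path(event, "message.message"))  # START ...
```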

pull_up

status: optional
type: string[]

A list of paths to promote to top-level keys.

Example:

pull_up:
  - message.transactionId

Before:

{
  "message": {
    "transactionId": 1,
    ...
  }
}

After:

{
  "transactionId": 1,
  "message": {
    ...
  }
}
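The before/after pair above can be reproduced with a few lines of Python. This is only a sketch of the behavior as described here: `pull_up` is a hypothetical function, the `text` field stands in for the elided content, and what happens when the top-level key already exists is an assumption (the pulled-up value wins).

```python
import copy

def pull_up(event, paths):
    """Move the value at each dotted path to a top-level key named after the last segment."""
    result = copy.deepcopy(event)
    for path in paths:
        *parents, leaf = path.split(".")
        container = result
        for part in parents:
            container = container.get(part) if isinstance(container, dict) else None
            if container is None:
                break
        if isinstance(container, dict) and leaf in container:
            result[leaf] = container.pop(leaf)
    return result

before = {"message": {"transactionId": 1, "text": "hello"}}
after = pull_up(before, ["message.transactionId"])
print(after)  # {'message': {'text': 'hello'}, 'transactionId': 1}
```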

remove

status: optional
type: string[]

A list of paths to remove from the event.

Example:

remove:
  - message.id
  - message.source
  - message.timeout
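As an illustration of the effect, the sketch below deletes the three paths from a sample event. `remove_paths` is a hypothetical helper and the sample values are invented; it assumes missing paths are silently ignored.

```python
import copy

def remove_paths(event, paths):
    """Return a copy of the event with each dotted path deleted, ignoring missing paths."""
    result = copy.deepcopy(event)
    for path in paths:
        *parents, leaf = path.split(".")
        container = result
        for part in parents:
            container = container.get(part) if isinstance(container, dict) else None
            if container is None:
                break
        if isinstance(container, dict):
            container.pop(leaf, None)
    return result

event = {"message": {"id": 7, "source": "app", "timeout": 30, "text": "hello"}}
trimmed = remove_paths(event, ["message.id", "message.source", "message.timeout"])
print(trimmed)  # {'message': {'text': 'hello'}}
```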

remove_from_nimdata

status: optional
type: string[]

If set, removes the selected paths from each entry in nimdata.

Example:

remove_from_nimdata: 
  - status
  - hostname
  - ...
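A rough Python picture of this option, assuming nimdata is a list of the original events (as in the reduce example on this page) and that the listed paths are top-level keys of each entry. The function name and sample values are hypothetical.

```python
def remove_from_nimdata(reduced_event, keys):
    """Return a copy of a reduced event with the given keys dropped from every nimdata entry."""
    trimmed = dict(reduced_event)
    trimmed["nimdata"] = [
        {k: v for k, v in entry.items() if k not in keys}
        for entry in reduced_event.get("nimdata", [])
    ]
    return trimmed

reduced = {
    "host": "host1",
    "nimdata": [
        {"host": "host1", "status": "info", "hostname": "ip-10-0-0-1", "fooatt": "one"},
        {"host": "host1", "status": "info", "hostname": "ip-10-0-0-1", "fooatt": "three"},
    ],
    "nimsize": 2,
}
slim = remove_from_nimdata(reduced, ["status", "hostname"])
print(slim["nimdata"])
# [{'host': 'host1', 'fooatt': 'one'}, {'host': 'host1', 'fooatt': 'three'}]
```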

remove_nimdata

status: optional
type: boolean

If set, removes the nimdata attribute. This can significantly reduce data size.

Example:

remove_nimdata: true
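To see why dropping nimdata shrinks the payload, the sketch below compares the serialized size of a reduced event with and without it. The event is invented for illustration and `remove_nimdata` here is a plain Python stand-in for the option, not the Nimbus implementation.

```python
import json

def remove_nimdata(event):
    """Return a copy of the event without the nimdata attribute."""
    return {k: v for k, v in event.items() if k != "nimdata"}

event = {
    "host": "host1",
    "nimdata": [{"host": "host1", "fooatt": str(i)} for i in range(100)],
    "nimsize": 100,
    "nimkind": "opt",
}
slim = remove_nimdata(event)
size_before = len(json.dumps(event))
size_after = len(json.dumps(slim))
print(size_before > size_after)  # True
```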

Reduce Transform

The Nimbus reduce transform is a superset of the Vector reduce transform.

When using reduce, remember that group_by only works on top-level keys.

If the key you need is nested, pull it up first with the pull_up directive.

Options

merge_strategies

status: optional
type: enum

The default behavior is as follows:

The first value of a string field is kept and subsequent values are discarded.
For timestamp fields the first is kept and a new field [field-name]_end is added with the last received timestamp value.
Numeric values are summed.
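The default rules can be sketched in Python as follows. This is an approximation for illustration only: `merge_defaults` is a hypothetical function, timestamps are modeled as `datetime` objects, and the sketch assumes the `_end` field is written even when only one event is seen.

```python
from datetime import datetime

def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def merge_defaults(events):
    """Merge events field by field using the default rules described above."""
    merged = {}
    for event in events:
        for key, value in event.items():
            if isinstance(value, datetime):
                merged.setdefault(key, value)   # keep the first timestamp
                merged[f"{key}_end"] = value    # track the last timestamp
            elif key not in merged:
                merged[key] = value             # first value of any field
            elif is_number(value) and is_number(merged[key]):
                merged[key] += value            # numeric values are summed
            # other repeated values (e.g. strings) are discarded
    return merged

events = [
    {"msg": "a", "count": 1, "ts": datetime(2024, 1, 1, 0, 0, 0)},
    {"msg": "b", "count": 2, "ts": datetime(2024, 1, 1, 0, 0, 5)},
]
merged = merge_defaults(events)
print(merged["msg"], merged["count"])  # a 3
```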

Strategies:

Option          Description
array           Append each value to an array.
concat          Concatenate each string value, delimited with a space.
concat_newline  Concatenate each string value, delimited with a newline.
concat_raw      Concatenate each string, without a delimiter.
discard         Discard all but the first value found.
flat_unique     Create a flattened array of all unique values.
longest_array   Keep the longest array seen.
max             Keep the maximum numeric value seen.
min             Keep the minimum numeric value seen.
retain          Discard all but the last value found.
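One way to read the strategies listed above is as functions from "all values seen for a field" to a single merged value. The Python sketch below follows the descriptions literally; the `STRATEGIES` table is hypothetical, and edge cases such as ties in longest_array or mixed types are assumptions.

```python
def flatten(value):
    """Yield the leaves of arbitrarily nested lists."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield value

def flat_unique(values):
    """Flatten all values and keep each distinct item once, in first-seen order."""
    seen = []
    for value in values:
        for item in flatten(value):
            if item not in seen:
                seen.append(item)
    return seen

STRATEGIES = {
    "array": lambda values: list(values),
    "concat": lambda values: " ".join(values),
    "concat_newline": lambda values: "\n".join(values),
    "concat_raw": lambda values: "".join(values),
    "discard": lambda values: values[0],
    "flat_unique": flat_unique,
    "longest_array": lambda values: max(values, key=len),
    "max": max,
    "min": min,
    "retain": lambda values: values[-1],
}

print(STRATEGIES["concat"](["a", "b", "c"]))        # a b c
print(STRATEGIES["flat_unique"]([1, [2, 1], 3]))    # [1, 2, 3]
print(STRATEGIES["retain"](["a", "b", "c"]))        # c
```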

starts_when

status: optional
type: NTL

A condition used to distinguish the first event of a transaction. If this condition resolves to true for an event, the previous transaction is flushed (without this event) and a new transaction is started.

Example:

starts_when:
  - {key: message, op: MATCH, val: '\n\{'}
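The flush-then-start behavior can be sketched as a simple loop. In this Python illustration, `split_transactions` and the sample messages are hypothetical, and the start condition is passed in as an ordinary function rather than an NTL predicate.

```python
import re

def split_transactions(events, is_start):
    """Group events into transactions, starting a new one whenever is_start(event) is true."""
    transactions, current = [], []
    for event in events:
        if is_start(event) and current:
            # flush the previous transaction without the current event
            transactions.append(current)
            current = []
        current.append(event)
    if current:
        transactions.append(current)
    return transactions

events = [
    {"message": "START a"},
    {"message": "step 1"},
    {"message": "START b"},
    {"message": "step 2"},
]
groups = split_transactions(events, lambda e: re.match(r"START", e["message"]) is not None)
print([len(g) for g in groups])  # [2, 2]
```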

max_events

status: optional
type: integer

The maximum number of events to group together.

Example:

max_events: 200
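As a rough illustration, the sketch below flushes a group as soon as it holds max_events events. Whether Nimbus flushes exactly at the limit or on the next event is not stated here, so treat that detail as an assumption; `flush_by_max_events` is a hypothetical helper.

```python
def flush_by_max_events(events, max_events):
    """Split a stream of events into batches of at most max_events."""
    batches, current = [], []
    for event in events:
        current.append(event)
        if len(current) == max_events:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches

batches = flush_by_max_events(list(range(5)), max_events=2)
print(batches)  # [[0, 1], [2, 3], [4]]
```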

expire_after_ms

status: optional
type: integer
default: 30000

The maximum period of time to wait after the last event is received, in milliseconds, before a combined event should be considered complete.
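The idle timeout can be pictured by replaying event arrival times: a group is closed once the gap since its last event exceeds expire_after_ms. The Python sketch below is a simplified offline model (a real pipeline would use timers), and `split_by_idle_gap` is a hypothetical name.

```python
def split_by_idle_gap(timestamps_ms, expire_after_ms=30000):
    """Split sorted arrival times into groups separated by gaps longer than expire_after_ms."""
    groups, current = [], []
    for ts in timestamps_ms:
        if current and ts - current[-1] > expire_after_ms:
            groups.append(current)
            current = []
        current.append(ts)
    if current:
        groups.append(current)
    return groups

# gaps are 10 s, 25 s, and 35 s; only the last exceeds the 30 s default
print(split_by_idle_gap([0, 10000, 35000, 70000]))  # [[0, 10000, 35000], [70000]]
```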

Example

Suppose you have the following logs:

[
  {
    "host": "host1",
    "fooatt": "one",
    "baratt": "alpha"
  },
  {
    "host": "host2",
    "fooatt": "two",
    "baratt": "beta"
  },
  {
    "host": "host1",
    "fooatt": "three",
    "baratt": "gamma"
  },
  {
    "host": "host1",
    "baratt": "gamma"
  }
]

And you have the following reduce transform:

name: hostreducer
# only apply this reducer when the log event has both the `host` and `fooatt` keys
process_when: 
  - {key: host, op: exists, val: true}
  - {key: fooatt, op: exists, val: true}
group_by:
  - host

Your processed logs would look like the following:

[
  // this log was processed and grouped correctly
  {
    "host": "host1",
    "nimdata": [
      {
        "host": "host1",
        "fooatt": "one",
        "baratt": "alpha"
      },
      {
        "host": "host1",
        "fooatt": "three",
        "baratt": "gamma"
      }
    ],
    "nimsize": 2,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  {
    "host": "host2",
    "nimdata": [
      {
        "host": "host2",
        "fooatt": "two",
        "baratt": "beta"
      }
    ],
    "nimsize": 1,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  // this log did not get processed as it did not have a `fooatt` key
  {
    "host": "host1",
    "baratt": "gamma",
    "nimkind": "noopt"
  }
]
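The example can be replayed with a short Python sketch that mimics the hostreducer configuration: events with both `host` and `fooatt` are grouped by `host`, and everything else passes through tagged as noopt. This only reproduces the output shape shown above; `reduce_by_host` is a hypothetical function, not Nimbus code.

```python
def reduce_by_host(logs, name="hostreducer"):
    """Group logs that have both host and fooatt by host; pass the rest through as noopt."""
    groups = {}
    passthrough = []
    for log in logs:
        if "host" in log and "fooatt" in log:
            groups.setdefault(log["host"], []).append(log)
        else:
            passthrough.append({**log, "nimkind": "noopt"})
    reduced = [
        {
            "host": host,
            "nimdata": members,
            "nimsize": len(members),
            "nimkind": "opt",
            "nimmatch": name,
        }
        for host, members in groups.items()
    ]
    return reduced + passthrough

logs = [
    {"host": "host1", "fooatt": "one", "baratt": "alpha"},
    {"host": "host2", "fooatt": "two", "baratt": "beta"},
    {"host": "host1", "fooatt": "three", "baratt": "gamma"},
    {"host": "host1", "baratt": "gamma"},
]
processed = reduce_by_host(logs)
print([(p["host"], p["nimkind"], p.get("nimsize")) for p in processed])
# [('host1', 'opt', 2), ('host2', 'opt', 1), ('host1', 'noopt', None)]
```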

