Working with Transforms

Overview

Nimbus Transforms are high-level NTL functions with specialized logic for specific optimizations.

Once an optimization is applied, you can find its corresponding transformation in the transforms section of the console.

You can click on Edit to either update or delete an existing transform.

Global Options

The following properties are available on all transforms.

process_when

  • status: required

  • type: NTL

Determines when a transform should be applied. Takes one or more NTL predicates as input.

Example:

process_when: 
  - {key: service, op: EQUAL, val: foo}

include_errors

  • status: optional

  • type: boolean

  • default: false

When set to true, designates that the current transform can apply to error logs. By default, error logs are not transformed but are immediately proxied downstream for processing.
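Example:

include_errors: true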

msg_field

  • status: optional

  • type: string

  • default: message

The key where the log body is located.

Example:

# logs sent by dd lambda extension have the message field nested inside the message key
# eg:
# {message: { message: "START ...", lambda: {arn: arn:aws:lambda:us-east-1:33333333:function:test-lambda, ...}}}
msg_field:
  - message.message

pull_up

  • status: optional

  • type: string[]

When specified, a list of paths that should be made into top-level keys.

Example:

pull_up:
  - message.transactionId

Before:

{
  "message": {
    "transactionId": 1,
    ...
  }
}

After:

{
  "transactionId": 1,
  "message": {
    ...
  }
}

remove

  • status: optional

  • type: string[]

When specified, a list of paths that should be removed.

Example:

remove:
  - message.id
  - message.source
  - message.timeout

remove_from_nimdata

  • status: optional

  • type: string[]

If set, removes the selected paths from nimdata.

Example:

remove_from_nimdata: 
  - status
  - hostname
  - ...

remove_nimdata

  • status: optional

  • type: boolean

If set, removes the nimdata attribute. This can significantly reduce data size.

Example:

remove_nimdata: true

Reduce Transform

The Nimbus reduce transform is a superset of the vector reduce transform.

When using reduce, remember that group_by only works on top-level keys.

If the key you need is nested, make sure to pull it up using the pull_up directive.
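For example, to group on a nested key, pull it up first and then reference the new top-level key. A sketch reusing the message.transactionId path from the pull_up example above:

pull_up:
  - message.transactionId
group_by:
  - transactionId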

Options

merge_strategies

  • status: optional

  • type: enum

The default behavior is as follows:

  • The first value of a string field is kept and subsequent values are discarded.

  • For timestamp fields, the first value is kept and a new field [field-name]_end is added with the last received timestamp value.

  • Numeric values are summed.
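The default behavior can be illustrated with a before/after pair (the field names here are illustrative, not required by the transform):

Before:

[
  {
    "status": "started",
    "duration": 10,
    "timestamp": "2023-01-01T00:00:00Z"
  },
  {
    "status": "finished",
    "duration": 5,
    "timestamp": "2023-01-01T00:00:30Z"
  }
]

After:

{
  "status": "started",
  "duration": 15,
  "timestamp": "2023-01-01T00:00:00Z",
  "timestamp_end": "2023-01-01T00:00:30Z"
}

The string field status keeps its first value, the numeric field duration is summed, and timestamp keeps its first value while timestamp_end records the last.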

Strategies:

  • array: Append each value to an array.

  • concat: Concatenate each string value, delimited with a space.

  • concat_newline: Concatenate each string value, delimited with a newline.

  • concat_raw: Concatenate each string, without a delimiter.

  • discard: Discard all but the first value found.

  • flat_unique: Create a flattened array of all unique values.

  • longest_array: Keep the longest array seen.

  • max: Keep the maximum numeric value seen.

  • min: Keep the minimum numeric value seen.

  • retain: Discard all but the last value found.
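Because the Nimbus reduce transform is a superset of the vector reduce transform, per-field strategies presumably follow vector's field-to-strategy map syntax. The following is a sketch under that assumption, with illustrative field names:

merge_strategies:
  message: concat_newline
  duration: max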

starts_when

  • status: optional

  • type: NTL

A condition used to distinguish the first event of a transaction. If this condition resolves to true for an event, the previous transaction is flushed (without this event) and a new transaction is started.

Example:

starts_when:
  - {key: message, op: MATCH, val: "\n\{"}

max_events

  • status: optional

  • type: integer

The maximum number of events to group together.

Example:

max_events: 200

expire_after_ms

  • status: optional

  • type: integer

  • default: 30000

The maximum period of time to wait after the last event is received, in milliseconds, before a combined event should be considered complete.
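Example:

expire_after_ms: 60000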

Full Example

Suppose you have the following logs:

[
  {
    "host": "host1",
    "fooatt": "one",
    "baratt": "alpha"
  },
  {
    "host": "host2",
    "fooatt": "two",
    "baratt": "beta"
  },
  {
    "host": "host1",
    "fooatt": "three",
    "baratt": "gamma"
  },
  {
    "host": "host1",
    "baratt": "gamma"
  }
]

And you have the following reduce transform:

name: hostreducer
# only apply this reducer when the log event has both `host` and `fooatt` keys
process_when:
  - {key: host, op: EXISTS, val: true}
  - {key: fooatt, op: EXISTS, val: true}
group_by:
  - host

Your processed logs would look like the following:

[
  // this log was processed and grouped correctly
  {
    "host": "host1",
    "nimdata": [
      {
        "host": "host1",
        "fooatt": "one",
        "baratt": "alpha"
      },
      {
        "host": "host1",
        "fooatt": "three",
        "baratt": "gamma"
      }
    ],
    "nimsize": 2,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  {
    "host": "host2",
    "nimdata": [
      {
        "host": "host2",
        "fooatt": "two",
        "baratt": "beta"
      }
    ],
    "nimsize": 1,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  // this log did not get processed as it did not have a `fooatt` key
  {
    "host": "host1",
    "baratt": "gamma",
    "nimkind": "noopt"
  }
]
