# Working with Transforms

## Overview

**Nimbus Transforms** are high-level [NTL](https://docs.nimbus.dev/overview/ntl) functions with specialized logic for specific optimizations.

Once an optimization is applied, you can find its corresponding transform in the [transforms](https://hub.nimbus.dev/transforms) section of the console.

![Transforms Console](https://ik.imagekit.io/fpjzhqpv1/Cursor_and_Nimbus_EqzhLWZ3tX.png?updatedAt=1707939313920)

Click `Edit` to update or delete an existing transform.

## Global Options

The following properties are available on all transforms:

### process\_when

* status: required
* type: [NTL](https://docs.nimbus.dev/overview/ntl/..#evaluate)

Determines when a transform should be applied. Takes one or more [NTL](https://docs.nimbus.dev/overview/ntl) predicates as input.

Example:

```yml
process_when: 
  - {key: service, op: EQUAL, val: foo}
```
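
Several predicates can be listed together. Going by the reduce example later on this page, an event must satisfy all of them for the transform to apply. The sketch below reuses the `EQUAL` and `MATCH` operators shown elsewhere in this document; the values and the regular expression are hypothetical:

```yaml
process_when:
  # both predicates must hold for the transform to apply
  - {key: service, op: EQUAL, val: foo}
  # hypothetical pattern: only events whose message begins with START
  - {key: message, op: MATCH, val: '^START'}
```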

### include\_errors

* status: optional
* type: boolean
* default: `false`

When set to `true`, the transform also applies to error logs. By default, error logs are not transformed; they are proxied downstream immediately for processing.
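
Example (a minimal fragment, shown on its own; in practice it sits alongside the transform's other options):

```yaml
# opt in to transforming error logs (the default is false)
include_errors: true
```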

### msg\_field

* status: optional
* type: string
* default: `message`

The key where the [log body](https://github.com/nimbushq/public-docs/blob/main/concepts.md#log-body) is located.

Example:

```yaml
# Logs sent by the Datadog Lambda extension nest the message field inside the `message` key,
# e.g.:
# {message: { message: "START ...", lambda: {arn: arn:aws:lambda:us-east-1:33333333:function:test-lambda, ...}}}
msg_field: message.message
```

### pull\_up

* status: optional
* type: string\[]

A list of paths to promote to [top level keys](https://github.com/nimbushq/public-docs/blob/main/concepts.md#top-level-keys).

Example:

```yaml
pull_up:
  - message.transactionId
```

Before:

```json
{
  "message": {
    "transactionId": 1,
    ...
  }
}
```

After:

```json
{
  "transactionId": 1,
  "message": {
    ...
  }
}
```

### remove

* status: optional
* type: string\[]

A list of paths to remove from the event.

Example:

```yaml
remove:
  - message.id
  - message.source
  - message.timeout
```

### remove\_from\_nimdata

* status: optional
* type: string\[]

If set, removes the selected paths from [nimdata](https://docs.nimbus.dev/resources/ref.attributes#nimdata).

Example:

```yaml
remove_from_nimdata: 
  - status
  - hostname
  - ...
```

### remove\_nimdata

* status: optional
* type: boolean

If set, removes the [nimdata attribute](https://docs.nimbus.dev/resources/ref.attributes#nimdata). This can significantly reduce data size.

Example:

```yaml
remove_nimdata: true
```

## Reduce Transform

The Nimbus reduce transform is a superset of the [Vector reduce](https://vector.dev/docs/reference/configuration/transforms/reduce/) transform.

{% hint style="warning" %}
When using reduce, remember that `group_by` only works on [top level keys](https://github.com/nimbushq/public-docs/blob/main/concepts.md#top-level-keys).

If the key you need is nested, make sure to pull it up using the `pull_up` directive.
{% endhint %}
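
As a sketch of that advice, suppose your logs carry a nested `message.transactionId` (the same path used in the `pull_up` example above). The transform name `txnreducer` and the `process_when` predicate are made up for illustration:

```yaml
name: txnreducer  # hypothetical name
process_when:
  - {key: service, op: EQUAL, val: foo}
# promote the nested key so it becomes a top level key ...
pull_up:
  - message.transactionId
# ... then group on the promoted key
group_by:
  - transactionId
```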

### Options

#### merge\_strategies

* status: optional
* type: enum

When no strategy is specified for a field, the default behavior is as follows:

* The first value of a string field is kept and subsequent values are discarded.
* For timestamp fields, the first value is kept and a new field **`[field-name]_end`** is added with the last received timestamp value.
* Numeric values are summed.

Strategies:

| Option          | Description                                              |
| --------------- | -------------------------------------------------------- |
| array           | Append each value to an array.                           |
| concat          | Concatenate each string value, delimited with a space.   |
| concat\_newline | Concatenate each string value, delimited with a newline. |
| concat\_raw     | Concatenate each string, without a delimiter.            |
| discard         | Discard all but the first value found.                   |
| flat\_unique    | Create a flattened array of all unique values.           |
| longest\_array  | Keep the longest array seen.                             |
| max             | Keep the maximum numeric value seen.                     |
| min             | Keep the minimum numeric value seen.                     |
| retain          | Discard all but the last value found.                    |
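
The fragment below sketches how strategies might be assigned per field. It assumes `merge_strategies` takes the same shape as in Vector's reduce transform, a map from field name to strategy; the field names are hypothetical:

```yaml
# assumed shape: <field name>: <strategy>, as in Vector's reduce transform
merge_strategies:
  message: concat_newline  # join message values, one per line
  duration_ms: max         # hypothetical numeric field; keep the largest value
  tags: flat_unique        # hypothetical field; flattened array of unique values
```

Fields not listed here would fall back to the default behavior described above.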

#### starts\_when

* status: optional
* type: NTL

A condition used to distinguish the first event of a transaction. If this condition resolves to true for an event, the previous transaction is flushed (without this event) and a new transaction is started.

Example:

```yml
starts_when:
  - {key: message, op: MATCH, val: '\n\{'}
```

#### max\_events

* status: optional
* type: integer

The maximum number of events to group together.

Example:

```yml
max_events: 200
```

#### expire\_after\_ms

* status: optional
* type: integer
* default: 30000

The maximum period of time to wait after the last event is received, in milliseconds, before a combined event should be considered complete.
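
Example (the value is illustrative, not a recommendation):

```yaml
# consider a group complete once no new event has arrived for 5 seconds
expire_after_ms: 5000
```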

### Example

Suppose you have the following logs:

```json
[
  {
    "host": "host1",
    "fooatt": "one",
    "baratt": "alpha"
  },
  {
    "host": "host2",
    "fooatt": "two",
    "baratt": "beta"
  },
  {
    "host": "host1",
    "fooatt": "three",
    "baratt": "gamma"
  },
  {
    "host": "host1",
    "baratt": "gamma"
  }
]
```

And you have the following reduce transform:

```yaml
name: hostreducer
# only apply this reducer when the log event has both `host` and `fooatt` keys
process_when: 
  - {key: host, op: EXISTS, val: true}
  - {key: fooatt, op: EXISTS, val: true}
group_by:
  - host
```

Your processed logs would look like the following:

```json
[
  // this log was processed and grouped correctly
  {
    "host": "host1",
    "nimdata": [
      {
        "host": "host1",
        "fooatt": "one",
        "baratt": "alpha"
      },
      {
        "host": "host1",
        "fooatt": "three",
        "baratt": "gamma"
      }
    ],
    "nimsize": 2,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  {
    "host": "host2",
    "nimdata": [
      {
        "host": "host2",
        "fooatt": "two",
        "baratt": "beta"
      }
    ],
    "nimsize": 1,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  // this log did not get processed as it did not have a `fooatt` key
  {
    "host": "host1",
    "baratt": "gamma",
    "nimkind": "noopt"
  }
]
```
