# Working with Transforms

## Overview

**Nimbus Transforms** are high-level [NTL](/overview/ntl.md) functions with specialized logic for specific optimizations.

Once an optimization is applied, you can find its corresponding transform in the [transforms](https://hub.nimbus.dev/transforms) section of the console.

![Transforms Console](https://ik.imagekit.io/fpjzhqpv1/Cursor_and_Nimbus_EqzhLWZ3tX.png?updatedAt=1707939313920)

Click `Edit` to update or delete an existing transform.

## Global Options

The following properties are available on all transforms:

### process\_when

* status: required
* type: [NTL](/overview/ntl.md#evaluate)

Determines when a transform should be applied. Takes one or more [NTL](/overview/ntl.md) predicates as input.

Example:

```yml
process_when: 
  - {key: service, op: EQUAL, val: foo}
```
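
Since `process_when` takes one or more predicates, a transform can list several. The sketch below is illustrative only: the `service` value and the `timeout` pattern are hypothetical, and it assumes that all listed predicates must match, which is how the reduce example later on this page describes its two predicates.

```yaml
# Hypothetical values; assumes every predicate in the list must match.
process_when:
  - {key: service, op: EQUAL, val: foo}
  - {key: message, op: MATCH, val: "timeout"}
```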

### include\_errors

* status: optional
* type: boolean
* default: `false`

When set to `true`, allows the current transform to apply to error logs. By default, error logs are not transformed; they are proxied downstream immediately for processing.
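
As a minimal sketch, a transform that should also cover error logs might set the flag next to its `process_when` predicates. The `service` value here is hypothetical.

```yaml
# Hypothetical service name; opts error logs into this transform.
process_when:
  - {key: service, op: EQUAL, val: foo}
include_errors: true
```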

### msg\_field

* status: optional
* type: string
* default: `message`

The key where the [log body](https://github.com/nimbushq/public-docs/blob/main/concepts.md#log-body) is located.

Example:

```yaml
# logs sent by dd lambda extension have the message field nested inside the message key
# eg:
# {message: { message: "START ...", lambda: {arn: arn:aws:lambda:us-east-1:33333333:function:test-lambda, ...}}}
msg_field:
  - message.message
```

### pull\_up

* status: optional
* type: string\[]

A list of paths to promote to [top level keys](https://github.com/nimbushq/public-docs/blob/main/concepts.md#top-level-keys). Each path is moved out of its parent object, as the before and after snippets show.

Example:

```yaml
pull_up:
  - message.transactionId
```

Before:

```json
{
  "message": {
    "transactionId": 1,
    ...
  }
}
```

After:

```json
{
  "transactionId": 1,
  "message": {
    ...
  }
}
```

### remove

* status: optional
* type: string\[]

A list of paths to remove from the log event.

Example:

```yaml
remove:
  - message.id
  - message.source
  - message.timeout
```

### remove\_from\_nimdata

* status: optional
* type: string\[]

If set, removes the selected paths from [nimdata](/resources/ref.attributes.md#nimdata).

Example:

```yaml
remove_from_nimdata: 
  - status
  - hostname
  - ...
```

### remove\_nimdata

* status: optional
* type: boolean

If set, removes the [nimdata attribute](/resources/ref.attributes.md#nimdata). This can significantly reduce data size.

Example:

```yaml
remove_nimdata: true
```
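
The global options above can be combined in a single transform. The following is a sketch only: the transform name, service, and paths are hypothetical, any options specific to the transform type are omitted, and the order in which Nimbus applies `pull_up`, `remove`, and `remove_from_nimdata` is not specified on this page.

```yaml
# Hypothetical transform; name, service, and paths are illustrative.
name: checkout-cleanup
process_when:
  - {key: service, op: EQUAL, val: checkout}
include_errors: true
pull_up:
  - message.transactionId
remove:
  - message.source
remove_from_nimdata:
  - hostname
```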

## Reduce Transform

The Nimbus reduce transform is a superset of the [Vector reduce](https://vector.dev/docs/reference/configuration/transforms/reduce/) transform.

{% hint style="warning" %}
When using reduce, remember that `group_by` only works on [top level keys](https://github.com/nimbushq/public-docs/blob/main/concepts.md#top-level-keys).

If the key you need is nested, make sure to pull it up using the `pull_up` directive.
{% endhint %}

### Options

#### merge\_strategies

* status: optional
* type: enum

The default behavior is as follows:

* The first value of a string field is kept and subsequent values are discarded.
* For timestamp fields the first is kept and a new field **`[field-name]_end`** is added with the last received timestamp value.
* Numeric values are summed.

Strategies:

| Option          | Description                                              |
| --------------- | -------------------------------------------------------- |
| array           | Append each value to an array.                           |
| concat          | Concatenate each string value, delimited with a space.   |
| concat\_newline | Concatenate each string value, delimited with a newline. |
| concat\_raw     | Concatenate each string, without a delimiter.            |
| discard         | Discard all but the first value found.                   |
| flat\_unique    | Create a flattened array of all unique values.           |
| longest\_array  | Keep the longest array seen.                             |
| max             | Keep the maximum numeric value seen.                     |
| min             | Keep the minimum numeric value seen.                     |
| retain          | Discard all but the last value found.                    |
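
Example (a sketch, not a confirmed Nimbus schema): in Vector's reduce transform, `merge_strategies` is a map from field name to strategy, and this snippet assumes Nimbus accepts the same shape. The field names are hypothetical.

```yaml
# Assumed shape: field name -> strategy, as in Vector's reduce transform.
# Field names are hypothetical.
merge_strategies:
  message: concat_newline   # join message lines with newlines
  duration_ms: max          # keep the largest value seen instead of summing
  tags: flat_unique         # collect unique values into one flat array
```

Fields that are not listed fall back to the default behavior described above.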

#### starts\_when

* status: optional
* type: NTL

A condition used to distinguish the first event of a transaction. If this condition resolves to true for an event, the previous transaction is flushed (without this event) and a new transaction is started.

Example:

```yml
starts_when:
  - {key: message, op: MATCH, val: '\n\{'}
```

#### max\_events

* status: optional
* type: integer

The maximum number of events to group together.

Example:

```yml
max_events: 200
```

#### expire\_after\_ms

* status: optional
* type: integer
* default: 30000

The maximum period of time to wait after the last event is received, in milliseconds, before a combined event should be considered complete.
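
Example (illustrative value): flushing a group 5 seconds after its last event instead of waiting for the 30 second default.

```yaml
# Illustrative value: flush 5 seconds after the last event in a group.
expire_after_ms: 5000
```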

### Example

Suppose you have the following logs:

```json
[
  {
    "host": "host1",
    "fooatt": "one",
    "baratt": "alpha"
  },
  {
    "host": "host2",
    "fooatt": "two",
    "baratt": "beta"
  },
  {
    "host": "host1",
    "fooatt": "three",
    "baratt": "gamma"
  },
  {
    "host": "host1",
    "baratt": "gamma"
  }
]
```

And you have the following reduce transform:

```yaml
name: hostreducer
# only apply this reducer when the log event has both the `host` and `fooatt` keys
process_when: 
  - {key: host, op: exists, val: true}
  - {key: fooatt, op: exists, val: true}
group_by:
  - host
```

Your processed logs would look like the following:

```json
[
  // this log was processed and grouped correctly
  {
    "host": "host1",
    "nimdata": [
      {
        "host": "host1",
        "fooatt": "one",
        "baratt": "alpha"
      },
      {
        "host": "host1",
        "fooatt": "three",
        "baratt": "gamma"
      }
    ],
    "nimsize": 2,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  {
    "host": "host2",
    "nimdata": [
      {
        "host": "host2",
        "fooatt": "two",
        "baratt": "beta"
      }
    ],
    "nimsize": 1,
    "nimkind": "opt",
    "nimmatch": "hostreducer"
  },
  // this log did not get processed as it did not have a `fooatt` key
  {
    "host": "host1",
    "baratt": "gamma",
    "nimkind": "noopt"
  }
]
```

