# Lint Optimizations

Lint optimizations scan logs for common hygiene issues, such as sensitive data (e.g. API tokens) and redundant data (e.g. a timestamp repeated in the log message).

## Optimization Triggers

Nimbus can automatically optimize logs when it detects the following situations:

1. Logs with a timestamp in the message body
2. Common kinds of secrets (AWS tokens, GitHub and GitLab tokens, etc.)
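
To make the second trigger concrete, here is a minimal Python sketch of pattern-based secret detection. The patterns are widely published token shapes and the names `SECRET_PATTERNS` and `find_secrets` are hypothetical; this is an illustration under those assumptions, not the rule set Nimbus actually applies.

```python
import re

# Illustrative patterns only (assumption): commonly published token shapes,
# not the detection rules Nimbus uses internally.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}"),
    "gitlab_token": re.compile(r"\bglpat-[A-Za-z0-9_-]{20}"),
}


def find_secrets(message: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the message."""
    return [
        name
        for name, pattern in SECRET_PATTERNS.items()
        if pattern.search(message)
    ]


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
print(find_secrets("auth failed for key=AKIAIOSFODNN7EXAMPLE"))
```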

## Example

Take the following log:

```yaml
message: '2024/01/23 01:33:12 {"method": "process_checkout", "retry_count": 3}'
timestamp: 2024/01/23 01:33:16
service: checkout
...
```

There are two issues:

* the timestamp is emitted in front of the JSON payload, which prevents Datadog from parsing the log as `json`
* Datadog adds its own timestamp at the **time of ingestion** (when Datadog processed the log), which is not the same as the **time of emission** (when the log was originally emitted)

Nimbus recognizes this class of issues and applies a **lint optimization** to fix it. In this case, Nimbus would generate the following optimization:

```yaml
process_when:
- key: service
  op: EQUAL
  value: 'checkout'
- key: message
  op: MATCH
  value: '^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} .+'
vrl: |
  groups = parse_regex!(.message, r'^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?<data>.+)') 
  .message = groups.data
  .timestamp = parse_timestamp!(groups.time + "+00:00", format:"%Y/%m/%d %H:%M:%S%:z")
```

After the lint optimization is applied, the log looks like the following:

```yaml
method: "process_checkout"
retry_count: 3
timestamp: 2024/01/23 01:33:12
service: checkout
...
```

This applies the correct timestamp and lets Datadog parse the JSON payload as a structured log. It also enables queries like `@retry_count > 0`, which were not possible while the data was embedded in a plain string.
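
The transformation can be reproduced outside Nimbus to check what a given log would look like afterwards. The Python sketch below mirrors the VRL program's regex and its UTC assumption; the function name `lint_timestamp` and the dict-based log shape are hypothetical, and this is not how Nimbus itself executes the optimization.

```python
import json
import re
from datetime import datetime, timezone

# Same shape as the VRL regex: a leading timestamp, a space, then the payload.
LEADING_TIMESTAMP = re.compile(
    r"^(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?P<data>.+)"
)


def lint_timestamp(log: dict) -> dict:
    """Move a leading timestamp out of `message` and into `timestamp`."""
    match = LEADING_TIMESTAMP.match(log["message"])
    if match is None:
        return log  # nothing to fix, leave the log untouched
    # Like the VRL program, assume the emitted time is in UTC.
    emitted_at = datetime.strptime(match["time"], "%Y/%m/%d %H:%M:%S")
    emitted_at = emitted_at.replace(tzinfo=timezone.utc)
    return {**log, "message": match["data"], "timestamp": emitted_at}


sample = {
    "message": '2024/01/23 01:33:12 {"method": "process_checkout", "retry_count": 3}',
    "service": "checkout",
}
fixed = lint_timestamp(sample)

# With the prefix removed, the message is valid JSON and can be queried by field.
payload = json.loads(fixed["message"])
print(payload["retry_count"])  # 3
```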

## Interaction with Existing Optimizations

In rare cases, lint optimizations can interfere with existing [reduce optimizations](https://docs.nimbus.dev/overview/optimization.log/optimization.log.reduce).

For example, if an existing reduce optimization relies on a timestamp being present in the log body and a lint optimization extracts it into a log attribute, those logs will no longer be aggregated.

Say you have the following log:

```yaml
message: 2024/01/23 01:33:12 foo did bar
...
```

You also have the following reduce optimization:

```yaml
process_when:
- key: message
  op: MATCH
  value: '^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} foo.+'
...
```

You might get a lint optimization that extracts the timestamp into a separate attribute:

```yaml
message: foo did bar
timestamp: 2024/01/23 01:33:12
```

This means that your existing reduce optimization would no longer apply, because it was using the date as part of its activation filter.

Today, you can either manually adjust the [predicate](https://docs.nimbus.dev/overview/ntl/ntl.transforms) in the `process_when` clause yourself or wait for Nimbus to re-analyze your logs and provide updated recommendations.
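
Before editing a predicate, it can help to test the old and new patterns against the message as it looks before and after linting. The Python sketch below does that for the example above. The adjusted pattern `^foo.+` is only one possible fix and is an assumption, as is treating `MATCH` as a regex search.

```python
import re

# The reduce filter from the example, which expects a leading timestamp.
REDUCE_FILTER = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} foo.+")

# One possible manual adjustment (assumption): drop the timestamp prefix.
ADJUSTED_FILTER = re.compile(r"^foo.+")

before_lint = "2024/01/23 01:33:12 foo did bar"
after_lint = "foo did bar"


def reduce_applies(pattern: re.Pattern, message: str) -> bool:
    """Return True if the filter would select this message."""
    return pattern.search(message) is not None


print(reduce_applies(REDUCE_FILTER, before_lint))   # True
print(reduce_applies(REDUCE_FILTER, after_lint))    # False
print(reduce_applies(ADJUSTED_FILTER, after_lint))  # True
```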
