OTEL-053: Retry max_elapsed_time set to 0 (infinite retries)

Severity: warn (advisory)

Rule Details

Setting retry_on_failure.max_elapsed_time to 0 disables the retry cutoff, so the exporter keeps retrying the same batch forever. During a real backend outage, the sending queue fills with stuck batches that the exporter holds indefinitely, and either memory_limiter starts refusing new work or the process runs out of memory (OOM). Set a finite upper bound (5 to 10 minutes is typical) so the exporter eventually gives up on a batch and lets the queue drain.

This rule fires when retry_on_failure.max_elapsed_time is a zero duration.
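
The condition can be sketched in a few lines of Python. This is only an illustration of the idea, not augur's actual implementation: the helper names is_zero_duration and check_exporters are hypothetical, the config is assumed to be already loaded into a dict (as a YAML parser would produce), and only simple Go-style duration strings such as 0s or 1m30s are handled.

```python
import re

# Matches one number+unit pair of a Go-style duration, e.g. "30s" or "1.5h".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def is_zero_duration(value):
    """Return True if value denotes a zero duration (0, "0", "0s", "0m0s", ...)."""
    if isinstance(value, bool):
        return False  # a boolean is not a duration; do not treat False as 0
    if isinstance(value, (int, float)):
        return value == 0
    text = str(value).strip()
    if text == "0":
        return True
    tokens = _TOKEN.findall(text)
    # Reject anything that is not made up entirely of number+unit pairs.
    if not tokens or "".join(num + unit for num, unit in tokens) != text:
        raise ValueError(f"unrecognised duration: {value!r}")
    return all(float(num) == 0 for num, _ in tokens)


def check_exporters(config):
    """Collect one finding per exporter whose retry cutoff is explicitly zero."""
    findings = []
    for name, exporter in (config.get("exporters") or {}).items():
        retry = (exporter or {}).get("retry_on_failure") or {}
        if "max_elapsed_time" in retry and is_zero_duration(retry["max_elapsed_time"]):
            findings.append(
                f"OTEL-053: exporters.{name}.retry_on_failure.max_elapsed_time "
                "is zero (infinite retries)"
            )
    return findings


# Hypothetical input mirroring the "Avoid" example on this page.
AVOID = {
    "exporters": {
        "otlp/backend": {
            "endpoint": "backend:4317",
            "retry_on_failure": {"enabled": True, "max_elapsed_time": "0s"},
        }
    }
}

for finding in check_exporters(AVOID):
    print(finding)
```

An exporter that omits max_elapsed_time is deliberately not flagged in this sketch, since the rule text only mentions an explicit zero duration.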

Options

Field                                Constraint
retry_on_failure.max_elapsed_time    Must be > 0

Examples

Avoid

exporters:
  otlp/backend:
    endpoint: backend:4317
    retry_on_failure:
      enabled: true
      max_elapsed_time: 0s # never gives up

Prefer

exporters:
  otlp/backend:
    endpoint: backend:4317
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
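
To get a feel for what the finite cutoff buys, the retry schedule can be approximated with a small simulation. This is a simplified sketch, not the exporter's real backoff code: it assumes each wait grows by a multiplier of 1.5 (an assumption, not taken from this page), ignores jitter, and stops once the next wait would exceed max_elapsed_time. The function name retry_schedule is hypothetical.

```python
from itertools import islice


def retry_schedule(initial_interval, max_interval, max_elapsed_time, multiplier=1.5):
    """Yield the wait in seconds before each retry, without jitter.

    max_elapsed_time == 0 means "no cutoff", so the generator never ends.
    The multiplier default is an assumption made for this sketch.
    """
    elapsed = 0.0
    interval = float(initial_interval)
    while max_elapsed_time == 0 or elapsed + interval <= max_elapsed_time:
        yield interval
        elapsed += interval
        interval = min(interval * multiplier, max_interval)


# Values from the "Prefer" example: the batch is dropped after a bounded number of retries.
bounded = list(retry_schedule(5, 30, 300))
print(len(bounded), sum(bounded))  # prints: 12 275.9375

# Values from the "Avoid" example: the schedule is endless, so only a slice can be taken.
unbounded_sample = list(islice(retry_schedule(5, 30, 0), 1000))
print(len(unbounded_sample))  # prints: 1000
```

Under these assumptions the bounded configuration gives up on a batch after 12 retries, while the zero cutoff keeps producing retries for as long as the outage lasts.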

When Not To Use It

Never. Always set a finite cutoff.

Related Rules

  • OTEL-017: exporter missing retry_on_failure/sending_queue
  • OTEL-050: sending_queue.queue_size above 50000

Version

Available since augur v0.1.0.
