Loki duplicates
In Loki, deduplication happens at ingest time and again at query time. In both cases it’s driven by the combination of:
- Stream identity (tenant + label set, e.g. {job="myjob"}),
- Timestamp (the epoch you push), and
- Log line content
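To make those three components concrete, here is a minimal sketch of the JSON body that Loki's push endpoint (`/loki/api/v1/push`) accepts. The helper name `build_push_payload` and the sample values are hypothetical. The tenant is not part of the body; in multi-tenant setups it is normally carried in the `X-Scope-OrgID` header.

```python
import json


def build_push_payload(labels, entries):
    """Build a JSON-serialisable body in the shape of Loki's push API.

    labels:  dict of stream labels, e.g. {"job": "myjob"}
    entries: list of (timestamp_ns, line) tuples
    """
    return {
        "streams": [
            {
                # The label set identifies the stream (together with the tenant).
                "stream": labels,
                # Each value is [timestamp, line]; the timestamp is a string
                # holding nanoseconds since the Unix epoch.
                "values": [[str(ts), line] for ts, line in entries],
            }
        ]
    }


# Two entries with the same labels, same timestamp, and same text.
payload = build_push_payload(
    {"job": "myjob"},
    [(1700000000000000000, "hello"), (1700000000000000000, "hello")],
)
print(json.dumps(payload, indent=2))
```

Both entries here share stream, timestamp, and content, which is exactly the case the rules below treat as a duplicate.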
1. Ingest-time dropping of exact duplicates
When you push two identical log lines (same timestamp, same labels, same exact text) into the same stream, the second one is silently dropped. Loki’s ingester treats an incoming entry that matches the previous entry in the stream on both timestamp and content as an exact duplicate, so only the first one survives (community.grafana.com).
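The rule can be illustrated with a toy model. This is not Loki's code: `ToyStream` and `ToyIngester` are hypothetical names, and the sketch assumes the comparison is made only against the most recently accepted entry in the stream.

```python
class ToyStream:
    """Toy model of a single stream."""

    def __init__(self):
        self.entries = []  # accepted (timestamp_ns, line) pairs, in arrival order

    def append(self, timestamp_ns, line):
        entry = (timestamp_ns, line)
        # Exact duplicate of the previous entry: ignore it silently.
        if self.entries and self.entries[-1] == entry:
            return False
        self.entries.append(entry)
        return True


class ToyIngester:
    """Toy model that routes entries to streams by tenant + label set."""

    def __init__(self):
        self.streams = {}

    def push(self, tenant, labels, timestamp_ns, line):
        # Stream identity is the tenant plus the full label set.
        key = (tenant, frozenset(labels.items()))
        stream = self.streams.setdefault(key, ToyStream())
        return stream.append(timestamp_ns, line)


ingester = ToyIngester()
ts = 1700000000000000000
first = ingester.push("tenant-a", {"job": "myjob"}, ts, "hello")
second = ingester.push("tenant-a", {"job": "myjob"}, ts, "hello")  # exact duplicate
third = ingester.push("tenant-a", {"job": "myjob"}, ts, "world")   # same ts, new text
print(first, second, third)
```

In this model the second push is rejected, while the third is accepted because its text differs even though the timestamp is the same.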
2. increment_duplicate_timestamp for near-duplicates
If you enable this option under limits_config in your loki.yaml:

    limits_config:
      increment_duplicate_timestamp: true

then on ingest, whenever a new line arrives with the same timestamp as the previous line in the stream but different content, Loki nudges its timestamp forward by one nanosecond to preserve the received order. This only applies when the log text differs; truly identical lines still collide and get deduped as above (community.grafana.com).
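A simplified sketch of the nudging behaviour, under the assumption that entries are compared only with their predecessor inside a single push request. The function name `bump_duplicate_timestamps` is hypothetical, and this models the described behaviour rather than reproducing Loki's implementation.

```python
def bump_duplicate_timestamps(entries):
    """Return a copy of entries with colliding timestamps nudged forward.

    entries: list of (timestamp_ns, line) tuples for one stream, in the
             order they were received.
    """
    out = []
    prev_original_ts = None  # last timestamp seen before any nudging
    for ts, line in entries:
        if out:
            stored_ts, stored_line = out[-1]
            collides = ts in (prev_original_ts, stored_ts)
            # Only nudge when the content differs; identical lines are left
            # alone so that exact-duplicate dropping still applies to them.
            if line != stored_line and collides:
                out.append((stored_ts + 1, line))
                continue
        prev_original_ts = ts
        out.append((ts, line))
    return out


nudged = bump_duplicate_timestamps([(100, "a"), (100, "b"), (100, "c")])
untouched = bump_duplicate_timestamps([(100, "a"), (100, "a")])
print(nudged)
print(untouched)
```

With three different lines sharing timestamp 100, the stored timestamps become 100, 101, and 102, so query results keep the received order. The pair of identical lines is returned unchanged.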
3. Query-time deduplication in HA setups
In a multi-replica deployment (i.e. you’ve set a replication factor > 1), each ingester that receives a stream holds its own copy of it. When you run a query, the querier automatically deduplicates entries that share the same timestamp, label set, and content across ingesters, so you don’t see each replica’s copy twice (community.grafana.com, saikiranpikili.medium.com).
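As a rough illustration of that merge step, the sketch below combines results from several replicas and keeps one copy of each entry. The function name `merge_replica_results` and the tuple layout are hypothetical, and a plain set is used here for clarity; this is not a description of how the querier is actually implemented.

```python
def merge_replica_results(replica_results):
    """Merge per-replica results, keeping one copy of each entry.

    replica_results: list of lists of (labels, timestamp_ns, line) tuples,
                     one inner list per ingester replica.
    """
    seen = set()
    merged = []
    for entries in replica_results:
        for labels, timestamp_ns, line in entries:
            # Two entries are the same event if labels, timestamp and text match.
            key = (frozenset(labels.items()), timestamp_ns, line)
            if key in seen:
                continue
            seen.add(key)
            merged.append((labels, timestamp_ns, line))
    # Return entries in timestamp order (the sort is stable for ties).
    merged.sort(key=lambda entry: entry[1])
    return merged


labels = {"job": "myjob"}
replica_a = [(labels, 100, "a"), (labels, 101, "b")]
replica_b = [(labels, 100, "a"), (labels, 101, "b"), (labels, 102, "c")]
merged = merge_replica_results([replica_a, replica_b])
print([(ts, line) for _, ts, line in merged])
```

Each event appears once in the merged result even though two replicas returned it.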
So:
- Two identical messages (same epoch, same job label, same text) → only one stored.
- Two messages with same epoch and job but different text → both can be stored (and ordering preserved if you’ve enabled increment_duplicate_timestamp).
- Multiple replicas → query-time dedupe prevents you from seeing the same event again.