1. Use Case: Counting Unique Article Views
Suppose we want to track how many unique users have viewed a particular blog article. Exact precision is not critical here — an approximation of ±1% is perfectly acceptable for analytics purposes. What matters is keeping the cost low and the implementation simple.
A naive approach would be to store every unique visitor ID in a Redis Set. Let's walk through what that would look like.
2. Naive Approach: Using a Redis Set
Redis provides SADD to add members to a set and SCARD to count the total number of distinct members:
The output of SCARD would be 2 since user:1001 was added twice but sets only store unique values.
This works fine at a small scale. However, as an article gains popularity, the set grows proportionally — one stored entry per unique visitor. A single Redis string can be up to 512 MB, and storing millions of user IDs (each potentially being a UUID string of 36 characters) can consume hundreds of megabytes per article. For a platform hosting thousands of articles, this becomes a serious memory concern.
3. The Problem at Scale
Suppose an article has 10 million unique views. A UUID like 550e8400-e29b-41d4-a716-446655440000 occupies roughly 36 bytes. Storing 10 million such entries in a single Set would require approximately:
That is just one article. Across a platform, this approach simply does not scale.
4. HyperLogLog: Approximate Counting at Low Memory Cost
HyperLogLog (HLL) is a probabilistic data structure that estimates the cardinality (i.e., the number of distinct elements) of a set with very low memory usage — at most 12 KB, regardless of how many elements are tracked. The trade-off is a small margin of error, typically around 0.81%.
For use cases like page view counters, where exact counts are not required, HyperLogLog is an ideal fit.
5. Commands
5.1.PFADD — Add Elements
PFADD — Add Elements
PFADD works similarly to SADD — it registers elements into the HyperLogLog structure. Duplicate insertions are handled internally and do not inflate the count.
One important distinction from a plain Set: PFADD does not allocate the full 12 KB upfront. Memory usage starts small and grows incrementally as more unique elements are added — only reaching 12 KB when the cardinality is very large.
5.2.PFCOUNT — Estimate Cardinality
PFCOUNT — Estimate Cardinality
This returns the estimated number of unique elements added to the HyperLogLog. For the above example, the result would be approximately 2.
PFCOUNT can also accept multiple keys at once, returning the estimated cardinality of the union of all specified HyperLogLogs:
This is useful for computing site-wide unique visitor counts across multiple articles in a single call.
5.3.PFMERGE — Merge Multiple HyperLogLogs
PFMERGE — Merge Multiple HyperLogLogs
PFMERGE creates a new HyperLogLog (combined:hll) that represents the union of all source HyperLogLogs. We can then run PFCOUNT on the merged key to get the total estimated unique visitor count across all merged sources.
This is particularly useful when aggregating daily counters into a monthly summary, or combining per-article stats into a category-level or site-level metric.
6. Memory Comparison
| Approach | 10M unique users | Notes |
|---|---|---|
| Redis Set | ~360 MB | Exact count, high memory |
| HyperLogLog | ≤ 12 KB | ~0.81% error, fixed upper bound |
For analytics workloads where approximate counts are acceptable, HyperLogLog offers an orders-of-magnitude improvement in memory efficiency.





