Bloom filter:修订间差异

来自WHY42
Riguz留言 | 贡献
建立內容為「RedisBloom provides Redis with support for additional probabilistic data structures. These structures allow for constant memory space and extremely fast processin…」的新頁面
 
Riguz留言 | 贡献
无编辑摘要
第21行: 第21行:
(integer) 0
(integer) 0
</syntaxhighlight>
</syntaxhighlight>
As expected, fred_is_funny yields a 0. A response of zero means we can be sure that this username has not been used. A response of 1 means it might have been used. We can’t say for certain as it might a case of overlapping bits between multiple items.
Generally, the chances of false positives are low, but non-zero. As the Bloom filter “fills up” the chances increase. You can tweak the error rate and size with the BF.RESERVE command.
[[Category:Database]]
[[Category:Database]]
[[Category:Redis]]
[[Category:Redis]]

2021年6月28日 (一) 13:48的版本

RedisBloom provides Redis with support for additional probabilistic data structures. These structures allow for constant memory space and extremely fast processing while still maintaining a low error rate. It supports scalable Bloom and Cuckoo filters to determine whether an item is present or absent from a collection with a given degree of certainty, Count-min sketch to count the frequency of the different items in sub-linear space, and Top-K to count top k events in a near deterministic manner.

A good use case for a Bloom filter is to check for an already used username. On a small scale, this is no problem, but as a service grows, this can be very taxing on a database. It is very simple to implement this with a ReBloom.


> BF.ADD usernames funnyfred
(integer) 1
> BF.ADD usernames fredisfunny
(integer) 1
> BF.ADD usernames fred
(integer) 1
> BF.ADD usernames funfred
(integer) 1
> BF.EXISTS usernames fred
(integer) 1
> BF.EXISTS usernames fred_is_funny
(integer) 0


As expected, fred_is_funny yields a 0. A response of zero means we can be sure that this username has not been used. A response of 1 means it might have been used. We can’t say for certain as it might a case of overlapping bits between multiple items.

Generally, the chances of false positives are low, but non-zero. As the Bloom filter “fills up” the chances increase. You can tweak the error rate and size with the BF.RESERVE command.