
Cardinality Estimation | 一根笨茄子

 5 years ago
source link: http://blog.guoyb.com/2018/05/19/cardinality-estimation/?
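In the previous article, Cardinality Counting, every method we introduced counts cardinality exactly. But now that data volumes routinely reach the TB or PB scale, both BTree and bitmap have many drawbacks, and the advantage of exactness is cancelled out by the sheer size of the data (imagine counting UV: is there any real difference between 100000000 and 100000001?). Instead, we can use probabilistic methods to produce a reasonable estimate of the cardinality while keeping the error under control.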
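To make "a probabilistic estimate with controllable error" concrete, here is a minimal sketch of Linear Counting, one well-known probabilistic estimator. It is chosen here only for illustration, since this excerpt does not name a specific method. The function name `linear_counting_estimate`, the `num_bits` parameter, and the use of SHA-1 as the hash are all assumptions of this sketch.

```python
import hashlib
import math


def linear_counting_estimate(items, num_bits=1 << 16):
    """Estimate the number of distinct items with a fixed-size bitmap.

    Illustrative sketch only: num_bits is assumed to be a multiple of 8,
    and SHA-1 is used simply as a convenient, deterministic hash.
    """
    bitmap = bytearray(num_bits // 8)
    for item in items:
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        pos = int.from_bytes(digest[:8], "big") % num_bits
        bitmap[pos >> 3] |= 1 << (pos & 7)  # mark the hashed slot as seen

    set_bits = sum(bin(byte).count("1") for byte in bitmap)
    zero_bits = num_bits - set_bits
    if zero_bits == 0:
        raise ValueError("bitmap is saturated; use a larger num_bits")

    # Linear Counting estimator: n ~= -m * ln(fraction of slots still zero)
    return -num_bits * math.log(zero_bits / num_bits)


if __name__ == "__main__":
    # 100000 events drawn from 20000 distinct values
    stream = (i % 20000 for i in range(100000))
    print(round(linear_counting_estimate(stream)))
```

Memory stays fixed at `num_bits / 8` bytes no matter how many items pass through, and a larger bitmap gives a smaller expected error. That trade of memory for accuracy is what exact structures cannot offer.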
