当前位置: 首页 > 科技观察

什么是一致性好的Hash实现

时间:2023-03-14 11:24:52 科技观察

本文转载自微信公众号《董泽润的技术笔记》,作者董泽润。转载本文请联系董泽润技术笔记公众号。看到微信群里有人问prd用的是什么一致性哈希算法,我想起了这篇文章。这是之前的测试。那么什么是好的一致性哈希算法呢?除了性能还应该考虑哪些因素?我根据自己的经历简单说一下。对数据状态有一定要求的缓存场景,如websession数据节点,根据算法映射到环上。当用户请求时,根据id计算hash值,顺时针(逆时针)找到的第一个节点为数据存储节点。一般为了更加统一,防止一个节点下线,造成数据移动过多,节点会根据replica(权重)复制多个虚拟节点映射到环上。数据不均匀的问题了解LB的都知道,lvs等负载均衡设备可以做到请求统一,但是经典实现的一致性hash做不到。具体情况可能会有所不同。测试脚本放在consistenthashbalancetestgithub[1]中。如果您有兴趣,可以查看一下。测试场景如下:节点数:10虚拟系数:3、10、50、100、200、400、600、800、1000测试算法:MurMurHash、Crc32、Fnv1、CityHash请求数:50w方差可以是数据分布的一个很好的衡量标准。可以看到Crc32最差,Fnv1在数量少的时候也很差,而MurMurHash和CityHash在100多个虚拟节点后收敛很快。diffratio是指分布最多的数据与分布最少的数据之间的差异除以平均值。从数据分布可以看出,MurMurHash、CityHash是首选的哈希算法。可以增加副本虚拟节点的数量。毕竟节点的动态增减是不正常的。Bubbler提到的另一个解决方案是一个实现。节点的顺序是预先计算好的,可以减少分布差异率。他在生产环境中只用了20个。副本减少到5%。当然,前提是节点的加减必须符合一致性哈希的思想,尽量少移动数据。测试数据MurMurreplica:3var:43930max:72370min:35411diff:36959ratio:73.918000MurMurreplica:10var:69441max:85238min:19257diff:65981ratio:131.962000MurMurreplica:50var:20520max:59708min:40669diff:19039ratio:38.078000MurMurreplica:100var:12556max:57515min:42045diff:15470ratio:30.940000MurMurreplica:200var:11539max:55858min:41998diff:13860ratio:27.720000MurMurreplica:400var:7161max:53385min:45323diff:8062ratio:16.124000MurMurreplica:600var:5586max:51697min:45913diff:5784ratio:11.568000MurMurreplica:800var:4101max:52118min:48041diff:4077ratio:8.154000MurMurreplica:1000var:4369max:52888min:48356diff:4532ratio:9.064000Crc32replica:3var:76353max:97097min:23872diff:73225ratio:146.450000Crc32replica:10var:30219max:61334min:31891diff:29443ratio:58.886000Crc32replica:50var:12231max:55475min:43926diff:11549ratio:23.098000Crc32replica:100var:22105max:58828min:37776diff:21052ratio:42.104000Crc32replica:200var:28465max:59481min:35807diffC4replic3rapli3rapli3:28203474ca:400var:25440max:58781min:37059diff:21722ratio:43.444000Crc32replica:600var:33926max:61694min:35887diff:25807ratio:51.614000Crc32replica:800var:33978max:61327min:36914diff:24413ratio:48.826000Crc32replica:1000var:27789max:60515min:37709diff:22806ratio:45.612000Fnv1replica:3var:348461max:377986min:5997diff:371989ratio:743.978000Fnv1replica:10var:200673max:231641min:14451diff:217190ratio:434.380000Fnv1replica:50var:41072max:84924min:38135diff:46789ratio:93.578000Fnv1replica:100var:11567max:59547min:47254diff:12293ratio:24.586000Fnv1replica:200var:6465max:53451min:46127diff:7324ratio:14.648000Fnv1replica:400var:5126max:52454min:47757diff:4697ratio:9.394000Fnv1replica:600var:3748max:52520min:48394diff:4126ratio:8.252000Fnv1replica:800var:6089max:52003min:46211diff:5792ratio:11.584000Fnv1replica:1000var:5881max:53028min:46902diff:6126ratio:12.252000Cityhashreplica:3var:77979max:99378min:11576diff:87802ratio:170:61plicamaxvarsh04044045min:25002diff:59043ratio:118.086000Cityhashreplica:50var:26496max:67141min:38121diff:29020ratio:58.040000Cityhashreplica:100var:11862max:56008min:44075diff:11933ratio:23.866000Cityhashreplica:200var:9921max:54711min:44592diff:10119ratio:20.238000Cityhashreplica:400var:5542max:53629min:47391diff:6238ratio:12.476000Cityhashreplica:600var:6823max:53417min:46261diff:7156ratio:14.312000Cityhashreplica:800var:5100max:51615min:46577diff:5038ratio:10.076000Cityhashreplica:1000var:4884max:52151min:47166diff:4985ratio:9.970000小结这第一次分享到此为止,后续会分享更多内容。感兴趣的可以关注并点击左下角的分享转发(:参考[1]测试脚本:https://github.com/dongzerun/consistenthash_balance_test,