Avoiding BigKeys

What is a bigkey?

In Redis, a single string value can be at most 512 MB, and each of the secondary data structures (hash, list, set, zset) can hold up to 2^32 - 1 elements.

  1. A string value larger than 10 KB is generally considered a bigkey
  2. A secondary data structure with too many elements is considered a bigkey
  3. Secondary data structures can also hold strings, so a member string larger than 10 KB likewise makes the key a bigkey (rules 1 and 2 are sketched in code after this list)
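
A minimal sketch of rules 1 and 2 above, assuming redis-py. The 10 KB threshold comes from rule 1; the 10,000-element cutoff and the is_bigkey name are illustrative choices, not fixed Redis limits. Rule 3 would additionally require scanning individual members (e.g. with HSCAN) and is omitted for brevity.

import redis

STRING_THRESHOLD = 10 * 1024   # 10 KB, per rule 1
ELEMENT_THRESHOLD = 10_000     # example element-count cutoff for rule 2

def is_bigkey(r, key):
    key_type = r.type(key)
    if key_type == "string":
        return r.strlen(key) > STRING_THRESHOLD   # rule 1: byte length
    if key_type == "hash":
        return r.hlen(key) > ELEMENT_THRESHOLD    # rule 2: element count
    if key_type == "list":
        return r.llen(key) > ELEMENT_THRESHOLD
    if key_type == "set":
        return r.scard(key) > ELEMENT_THRESHOLD
    if key_type == "zset":
        return r.zcard(key) > ELEMENT_THRESHOLD
    return False

r = redis.Redis(decode_responses=True)
print(is_bigkey(r, "a"))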

What harm do bigkeys cause?

  1. In a cluster, bigkeys make memory usage differ widely across nodes even when every node holds the same number of keys
  2. Redis is single-threaded, so operating on a bigkey can block the event loop
  3. Queries against a bigkey can destabilize the network: if a 10 MB bigkey receives 100 requests in one second, that second generates 1000 MB of traffic, enough to congest a server with an ordinary gigabit NIC (~128 MB/s)
  4. Deleting or migrating a bigkey can also block the server (see the sketch after this list)
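
Point 4 deserves a short illustration. DEL on a huge collection frees all of its memory synchronously on the event loop; on Redis 4.0+, UNLINK reclaims the memory in a background thread instead, and another common mitigation is to shrink the key incrementally before deleting it. The sketch below, assuming redis-py, drains a big hash batch by batch with HSCAN + HDEL; the function name and batch size are illustrative.

import redis

def delete_big_hash(r, key, batch=100):
    cursor = 0
    while True:
        # HSCAN hands back a small batch of fields per call, so each
        # HDEL touches only ~batch fields instead of the whole hash.
        cursor, fields = r.hscan(key, cursor, count=batch)
        if fields:
            r.hdel(key, *fields)
        if cursor == 0:
            break
    # Redis removes a hash automatically once its last field is gone;
    # this DEL is just a safety net.
    r.delete(key)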

How do you find bigkeys?

To see the biggest key of each type in the current keyspace, run redis-cli --bigkeys; running it on a replica is recommended to reduce pressure on the primary.

It reports the biggest key in each of the five major types (string, list, set, hash, zset), measured by string length or element count.

First set two string keys, a=abc and b=abcde:

127.0.0.1:6379> set a abc
127.0.0.1:6379> set b abcde

> redis-cli --bigkeys

# Scanning the entire keyspace to find biggest keys as well as
# average sizes per key type.  You can use -i 0.1 to sleep 0.1 sec
# per 100 SCAN commands (not usually needed).

[00.00%] Biggest string found so far 'b' with 5 bytes

-------- summary -------

Sampled 2 keys in the keyspace!
Total key length in bytes is 2 (avg len 1.00)

Biggest string found 'b' has 5 bytes

2 strings with 8 bytes (100.00% of keys, avg size 4.00)
0 lists with 0 items (00.00% of keys, avg size 0.00)
0 sets with 0 members (00.00% of keys, avg size 0.00)
0 hashs with 0 fields (00.00% of keys, avg size 0.00)
0 zsets with 0 members (00.00% of keys, avg size 0.00)

The summary counts two string keys totaling 8 bytes, and identifies the biggest string key as b, at 5 bytes.

Use debug object <key> to inspect the details of a specific key:

> debug object a

Value at:000007D74F412EC0 refcount:1 encoding:embstr serializedlength:4 lru:14004909 lru_seconds_idle:337

The output shows the memory address (000007D74F412EC0), the encoding (embstr), and a serialized length of 4 bytes. Note that serializedlength is the size of the RDB-serialized form, which carries a length header, so it comes out one byte longer than the raw value length returned by STRLEN. lru is the object's LRU clock value, and lru_seconds_idle is how many seconds the key has been idle.
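
The same fields can be read programmatically; a small sketch assuming redis-py follows. debug_object() wraps DEBUG OBJECT (note that the DEBUG command is disabled on many managed Redis services), and memory_usage() wraps MEMORY USAGE (Redis 4.0+), which is usually the more useful size estimate when hunting bigkeys.

import redis

r = redis.Redis(decode_responses=True)
r.set("a", "abc")

info = r.debug_object("a")
print(info["encoding"], info["serializedlength"])  # e.g. embstr 4
print(r.strlen("a"))                               # 3: raw length, no header
print(r.memory_usage("a"))                         # total allocator bytes for the key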

Scanning for all bigkeys with a script

It does not have to be Python; any language with a Redis client will do.
The script below is written in Python: find_big_key_normal walks the keyspace with the SCAN command in batches of roughly 1000 keys at a time, and check_big_key tests whether each key's length (or element count) exceeds 102400 to decide whether it is a bigkey.

import sys
import redis

def check_big_key(r, k, db):
    # Measure the key by type: byte length for strings, element count
    # for the collection types.
    length = 0
    try:
        key_type = r.type(k)
        if key_type == "string":
            length = r.strlen(k)
        elif key_type == "hash":
            length = r.hlen(k)
        elif key_type == "list":
            length = r.llen(k)
        elif key_type == "set":
            length = r.scard(k)
        elif key_type == "zset":
            length = r.zcard(k)
    except redis.RedisError:
        return
    if length > 102400:  # bigkey threshold; adjust to taste
        print(db, k, key_type, length)

def find_big_key_normal(db_host, db_port, db_password, db_num):
    # Iterate the keyspace with SCAN; count is a batch-size hint, so each
    # round trip fetches roughly 1000 keys without blocking the server.
    r = redis.StrictRedis(host=db_host, port=db_port, password=db_password,
                          db=db_num, decode_responses=True)
    for k in r.scan_iter(count=1000):
        check_big_key(r, k, db_num)

def find_big_key_sharding(db_host, db_port, db_password, db_num, nodecount):
    # For sharded deployments that expose the non-standard ISCAN command:
    # scan each node separately, resetting the cursor per node.
    r = redis.StrictRedis(host=db_host, port=db_port, password=db_password,
                          db=db_num, decode_responses=True)
    for node in range(nodecount):
        cursor = 0
        while True:
            iscan = r.execute_command("iscan", str(node), str(cursor),
                                      "count", "1000")
            for k in iscan[1]:
                check_big_key(r, k, db_num)
            cursor = iscan[0]
            print(cursor, db_num, node, len(iscan[1]))
            if cursor == "0":
                break

if __name__ == '__main__':
    if len(sys.argv) != 4:
        print('Usage: python', sys.argv[0], 'host port password')
        sys.exit(1)
    db_host = sys.argv[1]
    db_port = int(sys.argv[2])
    db_password = sys.argv[3]
    r = redis.StrictRedis(host=db_host, port=db_port, password=db_password,
                          decode_responses=True)
    nodecount = 1
    keyspace_info = r.info("keyspace")
    for db in keyspace_info:  # e.g. 'db0', 'db1', ...
        print('check', db, keyspace_info[db])
        db_num = int(db.replace("db", ""))
        if nodecount > 1:
            find_big_key_sharding(db_host, db_port, db_password, db_num, nodecount)
        else:
            find_big_key_normal(db_host, db_port, db_password, db_num)
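
Assuming the script above is saved as find_bigkeys.py (a filename chosen here for illustration), it can be run as python find_bigkeys.py 127.0.0.1 6379 mypassword. As with redis-cli --bigkeys, pointing it at a replica keeps the scan load off the primary.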

Scanning with a third-party tool

https://github.com/weiyanwei412/rdb_bigkeys

Tools like this work by taking an asynchronous snapshot with bgsave to produce an RDB file and then analyzing the RDB file offline, so the running service is not affected.
There are plenty of other approaches to dealing with bigkeys online; they are not enumerated one by one here.
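
A minimal sketch of the snapshot step, assuming redis-py: trigger BGSAVE and wait for the dump to complete before handing dump.rdb to an offline analyzer such as rdb_bigkeys. The polling loop is an illustrative convention, not part of any tool's API.

import time
import redis

r = redis.Redis(decode_responses=True)

last = r.lastsave()   # time of the previous completed save
r.bgsave()            # the server forks; the parent keeps serving traffic
while r.lastsave() == last:
    time.sleep(1)     # LASTSAVE advances once the RDB file is fully written
print("RDB snapshot ready for offline analysis")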


Reference: how to prevent bigkeys
Reference: https://www.cnblogs.com/yqzc/p/12425533.html
