PostgreSQL的hash索引是否有用?

根据PostgreSQL的手册,PostgreSQL中hash索引有很大的缺陷,不推荐使用。

http://58.58.27.50:8079/doc/html/9.3.1_zh/indexes-types.html
-----------------------------------------------------------------------------
Hash 索引操作目前没有记录 WAL 日志,因此如果数据库崩溃有未写入的改变, 我们可能需要用REINDEX重建 Hash 索引。另外, 对hash索引的改变在初始的基础备份后不是基于流复制或者基于文件复制的, 所以对于随后使用它们的查询会给出错误的回复。因为这些原因, 我们并不鼓励使用 Hash 索引。
-----------------------------------------------------------------------------

并且hash索引只支持"等于"查询,而PostgreSQL中作为默认索引类型的btree能支持包括“等于”在内的各种比较操作,功能上覆盖了hash索引。因此,在PostgreSQL中,对常规数据类型,btree索引是不二之选。(相比Oracle中有过于丰富的索引类型可选择,在PostgreSQL中建索引是不是更轻松一点?)

那么,既然不推荐使用hash索引,PostgreSQL为什么还要实现hash索引呢?
PostgreSQL的hash索引使用的线性hash算法,可以动态扩展桶数,理论上的查询复杂度接近O(1),这是btree无法比拟的。下面做个实测,也证明了在大数量的情况下,hash索引的性能确实明显优于btree。所以在某些极少的特殊场景下,hash索引也还是可以考虑的。

环境

测试环境在Linux虚拟机上
宿主机
OS:Win7 64bit
CPU:AMD Athlon II X4 640 3.0G
内存:6G
硬盘:Apacer A S510S 128G SATA

虚拟机
OS:CentOS 6.5
CPU:x4核
MEM:2G
PostgreSQL:9.3(shared_buffers=128M)

测试方法

使用pgbench的TCP-B和Select Only测试对比性能。
分别建2个测试数据库db1和db2。db1是原始的pgbench测试数据库,即btree索引。db2则通过下面的方法修改为hash索引。

  1. [chenhj@hanode1 ~]$ psql db2
  2. psql (9.3.4)
  3. Type "help" for help.

  4. db2=# alter table pgbench_accounts DROP CONSTRAINT pgbench_accounts_pkey;
  5. ALTER TABLE
  6. db2=# \d pgbench_accounts
  7.    Table "public.pgbench_accounts"
  8.   Column | Type | Modifiers
  9. ----------+---------------+-----------
  10.  aid | integer | not null
  11.  bid | integer |
  12.  abalance | integer |
  13.  filler | character(84) |

  14. db2=# create index on pgbench_accounts using hash(aid);
  15. CREATE INDEX

测试结果

sclae=1时,差异不大;sclae=100时,hash几乎是btree的两倍性能。可见数据量越大,hash的性能优势越明显。


sclae=1时,无太大差异
数据略


sclae=10时,大概快了20%

点击(此处)折叠或打开

  1. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db1
  2. transaction type: TPC-B (sort of)
  3. scaling factor: 10
  4. query mode: simple
  5. number of clients: 1
  6. number of threads: 1
  7. duration: 3 s
  8. number of transactions actually processed: 641
  9. tps = 213.418248 (including connections establishing)
  10. tps = 213.716676 (excluding connections establishing)
  11. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db2
  12. transaction type: TPC-B (sort of)
  13. scaling factor: 10
  14. query mode: simple
  15. number of clients: 1
  16. number of threads: 1
  17. duration: 3 s
  18. number of transactions actually processed: 653
  19. tps = 217.440022 (including connections establishing)
  20. tps = 217.770471 (excluding connections establishing)

  21. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1
  22. transaction type: TPC-B (sort of)
  23. scaling factor: 10
  24. query mode: simple
  25. number of clients: 16
  26. number of threads: 1
  27. duration: 3 s
  28. number of transactions actually processed: 2064
  29. tps = 683.721499 (including connections establishing)
  30. tps = 701.078622 (excluding connections establishing)
  31. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2
  32. transaction type: TPC-B (sort of)
  33. scaling factor: 10
  34. query mode: simple
  35. number of clients: 16
  36. number of threads: 1
  37. duration: 3 s
  38. number of transactions actually processed: 3278
  39. tps = 1085.589348 (including connections establishing)
  40. tps = 1110.521633 (excluding connections establishing)

  41. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
  42. transaction type: SELECT only
  43. scaling factor: 10
  44. query mode: simple
  45. number of clients: 1
  46. number of threads: 1
  47. duration: 3 s
  48. number of transactions actually processed: 4922
  49. tps = 1640.242391 (including connections establishing)
  50. tps = 1642.450916 (excluding connections establishing)
  51. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
  52. transaction type: SELECT only
  53. scaling factor: 10
  54. query mode: simple
  55. number of clients: 1
  56. number of threads: 1
  57. duration: 3 s
  58. number of transactions actually processed: 4771
  59. tps = 1590.028578 (including connections establishing)
  60. tps = 1592.541658 (excluding connections establishing)

  61. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db1
  62. transaction type: SELECT only
  63. scaling factor: 10
  64. query mode: simple
  65. number of clients: 16
  66. number of threads: 1
  67. duration: 3 s
  68. number of transactions actually processed: 32188
  69. tps = 10724.978992 (including connections establishing)
  70. tps = 10969.036021 (excluding connections establishing)

  71. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db2
  72. transaction type: SELECT only
  73. scaling factor: 10
  74. query mode: simple
  75. number of clients: 16
  76. number of threads: 1
  77. duration: 3 s
  78. number of transactions actually processed: 33367
  79. tps = 11092.571963 (including connections establishing)
  80. tps = 11339.707975 (excluding connections establishing)

sclae=100时,大概快了1倍

点击(此处)折叠或打开

  1. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
  2. transaction type: SELECT only
  3. scaling factor: 100
  4. query mode: simple
  5. number of clients: 1
  6. number of threads: 1
  7. duration: 3 s
  8. number of transactions actually processed: 1324
  9. tps = 441.307443 (including connections establishing)
  10. tps = 442.905056 (excluding connections establishing)
  11. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
  12. transaction type: SELECT only
  13. scaling factor: 100
  14. query mode: simple
  15. number of clients: 1
  16. number of threads: 1
  17. duration: 3 s
  18. number of transactions actually processed: 3226
  19. tps = 1075.038414 (including connections establishing)
  20. tps = 1076.599774 (excluding connections establishing)

  21. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1
  22. transaction type: TPC-B (sort of)
  23. scaling factor: 100
  24. query mode: simple
  25. number of clients: 16
  26. number of threads: 1
  27. duration: 3 s
  28. number of transactions actually processed: 2194
  29. tps = 725.580208 (including connections establishing)
  30. tps = 741.318899 (excluding connections establishing)
  31. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2
  32. transaction type: TPC-B (sort of)
  33. scaling factor: 100
  34. query mode: simple
  35. number of clients: 16
  36. number of threads: 1
  37. duration: 3 s
  38. number of transactions actually processed: 3174
  39. tps = 1049.927110 (including connections establishing)
  40. tps = 1073.139131 (excluding connections establishing)

  41. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
  42. transaction type: SELECT only
  43. scaling factor: 100
  44. query mode: simple
  45. number of clients: 1
  46. number of threads: 1
  47. duration: 3 s
  48. number of transactions actually processed: 1385
  49. tps = 461.481151 (including connections establishing)
  50. tps = 462.109984 (excluding connections establishing)
  51. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
  52. transaction type: SELECT only
  53. scaling factor: 100
  54. query mode: simple
  55. number of clients: 1
  56. number of threads: 1
  57. duration: 3 s
  58. number of transactions actually processed: 3527
  59. tps = 1175.124543 (including connections establishing)
  60. tps = 1177.144768 (excluding connections establishing)

  61. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db1
  62. transaction type: SELECT only
  63. scaling factor: 100
  64. query mode: simple
  65. number of clients: 16
  66. number of threads: 1
  67. duration: 3 s
  68. number of transactions actually processed: 6313
  69. tps = 2092.075168 (including connections establishing)
  70. tps = 2137.468368 (excluding connections establishing)
  71. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db2
  72. transaction type: SELECT only
  73. scaling factor: 100
  74. query mode: simple
  75. number of clients: 16
  76. number of threads: 1
  77. duration: 3 s
  78. number of transactions actually processed: 14414
  79. tps = 4775.127727 (including connections establishing)
  80. tps = 4886.910106 (excluding connections establishing)

2种索引的大小差不多

  1. db1=# select pg_table_size(‘pgbench_accounts_pkey‘::regclass);
  2.  pg_table_size
  3. ---------------
  4.      224641024
  5. (1 row)

  6. db2=# select pg_table_size(‘pgbench_accounts_aid_idx‘::regclass);
  7.  pg_table_size
  8. ---------------
  9.      268451840
  10. (1 row)


郑重声明:本站内容如果来自互联网及其他传播媒体,其版权均属原媒体及文章作者所有。转载目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,也不构成任何其他建议。