“MySQL combat 45 stresses” the study notes 3 – InnoDB Why a B + tree index structure to achieve

The role of the index is to improve query performance, there are many ways to achieve their common index model has a hash table, ordered lists, search trees.

Hash table

    Structure of a key in key-value pairs stored data can be found by specifying the value of the corresponding key.

    The hash values ​​in the array, a hash function with the key to determine the position in terms of a, then the value of the array in this position. However, multiple key value after conversion hash function may occur with a value that hash collision, a common solution is to address the chain law, soon all the same Hash key value in a linked list, so, No matter how many conflicts are just at the current location to the increase in single-node list.

    Only applies to the equivalent scene query, the query range will be very slow.

Ordered list

    Support equivalence queries and range queries, but the relatively high cost of updating the data.

    Storage for static index, such as saving a city that in 2017 the population of such information does not modify the data.


1. binary tree:

    Left son of each node is less than the parent, the parent node less than the right son.

    Find, update time complexity of a node is O (log (N)), the maximum search efficiency.

2.B tree (multi-tree):

There are at least two child nodes of the root node, each node among child nodes, the size is incremented from left to right.

3.B + tree:

    B + tree leaf node of the parent node saved all keys and key data corresponding to each key of the leaf node links from small to large, but the non-leaf nodes corresponding to the key data is not saved, so that each of the B + Tree key nodes can be saved greatly increased;

    Since B + non-leaf nodes of the tree were only indexed data, not the actual existence of key value data, all data must be in order to get to the leaf node, so every time the number of data queries are the same.

Because the index is more than the presence of memory, but also to write on the disk, disk read and write as little as possible in order to reduce the number of IO, so despite the high efficiency of the binary tree, most databases will not choose a binary tree.

Imagine a one million nodes balanced binary tree, tree height 20. A query may require access to 20 data blocks. In the era of mechanical hard disk, a random data block read from the disk required addressing time of about 10 ms. That is, for a 1 million row table, if you use a binary tree to store, access a single line may take 20 time of 10 ms.

InnoDB B + tree index using the model, all data is stored in B + trees, each corresponding to a B + tree index.

InnoDB index to an integer field for example, the N almost 1200. This tree is high time 4, you can save 3 power values ​​1200, which has a 1.7 billion. Taking into account the indexes on the table root data blocks are always in memory, a 1 billion line of an integer field and look for a value of up to 3 times only need to access the disk. In fact, the tree is also a great probability that the second layer in memory, the average number of disk accesses even less.

Conclusion: InnoDB uses B + tree structure, because the B + tree can read and write well with the characteristics of the disk, reducing the number of disk accesses a single query, reduce IO, improve performance.

Leave a Reply