Categories
Uncategorized

The principle step by step analysis Dictionary

table of Contents

    Key fields and Entry structures

    Add key (Add)

    Take key (the Find)

    Remove the key (Remove)

    Then insert key

This article is an internal c # Dictionary in principle to achieve a simple analysis. If the expression wrong, please correct me.

The main source control to resolve the current source code version control is .Net Framwork 4.8, the source address.

1. The key field structure and Entry

        struct Entry
        {
            public int hashCode;    //

The key hashCode & 0x7FFFFFFF

public int next; //

Elements in the address points to a list (actually the index entries), the last element of -1

public TKey key; public TValue value; } Entry[] entries; //

Storage key

int[] buckets; //

The latest index storage element entries, is determined by its storage position modulo result. Example: Suppose the key entries stored in the position of the first element, and the result modulo the length of hashCode and 2, the buckets [2] = 1

int count = 0; //

The stored key number

int version; //

Record version, the iterative process to prevent the collection is changed

IEqualityComparer _comparer; int freeList; //

The latest index entries empty elements

int freeCount; //

Number of entries Hollow Elements

2. Add the key (Add)

        public void Add(TKey key, TValue value) {
            Insert(key, value, true);
        }


        private void Insert(TKey key, TValue value, bool add) {
        
            if( key == null ) {
                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
            }
            if (buckets == null) Initialize(0);
            int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
            //

Modulo

int targetBucket = hashCode % buckets.Length; #if FEATURE_RANDOMIZED_STRING_HASHING int collisionCount = 0; #endif for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) { if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) { if (add) { ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate); } //

Re-assign a key to an existing

entries[i].value = value; version++; return; } #if FEATURE_RANDOMIZED_STRING_HASHING collisionCount++; #endif } int index; if (freeCount > 0) { //

There is an empty element is present in entries

index = freeList; freeList = entries[index].next; freeCount--; } else { if (count == entries.Length) { //

Expansion: take the minimum prime number greater than count 2 as new capacity for entries and bucket (i.e. array length.Length)

Resize(); targetBucket = hashCode % buckets.Length; } index = count; count++; } entries[index].hashCode = hashCode; entries[index].next = buckets[targetBucket]; entries[index].key = key; entries[index].value = value; //存取链表的头元素的索引(即entries最后存入的元素的在enties中的索引) //

Easy to take the key, each time you traverse from the header element of the linked list, see the FindEntry (TKey) function

buckets[targetBucket] = index; version++; #if FEATURE_RANDOMIZED_STRING_HASHING #if FEATURE_CORECLR // In case we hit the collision threshold we'll need to switch to the comparer which is using randomized string hashing // in this case will be EqualityComparer.Default. // Note, randomized string hashing is turned on by default on coreclr so EqualityComparer.Default will // be using randomized string hashing if (collisionCount > HashHelpers.HashCollisionThreshold && comparer == NonRandomizedStringEqualityComparer.Default) { comparer = (IEqualityComparer) EqualityComparer<string>.Default; Resize(entries.Length, true); } #else if(collisionCount > HashHelpers.HashCollisionThreshold && HashHelpers.IsWellKnownEqualityComparer(comparer)) { //

Scaling is required if the number of collisions (single-linked list length) is greater than the maximum collision threshold set

comparer = (IEqualityComparer) HashHelpers.GetRandomizedEqualityComparer(comparer); Resize(entries.Length, true); } #endif // FEATURE_CORECLR #endif } ****************************************************************************************************************************************** static void Foo() { var dicData = new Dictionary<int, int>(); //

Add key

new List<int> { 1, 2, 4 }.ForEach(item => Add(item, dicData)); new List<int> { 22, 29, 36, 20 }.ForEach(item => Add(item, dicData)); } static void Add(int key, Dictionary<int, int> dicData) { dicData.Add(key, key); }

2.1 buckets and initialize an array of entries

      private void Initialize(int capacity) {
            //

Taking the smallest prime number greater than the capacity of the (prime)

int size = HashHelpers.GetPrime(capacity); buckets = new int[size]; for (int i = 0; i < buckets.Length; i++) buckets[i] = -1; entries = new Entry[size]; freeList = -1; } **************************************************** internal static class HashHelpers { ...... public const int HashCollisionThreshold = 100; //

Collision threshold

...... public static readonly int[] primes = { 3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919, 1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591, 17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437, 187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263, 1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369}; //

Prime (prime) group

...... public static int GetPrime(int min) { if (min < 0) throw new ArgumentException(Environment.GetResourceString("Arg_HTCapacityOverflow")); Contract.EndContractBlock(); //

Look for the number of primes satisfying (prime) quality

for (int i = 0; i < primes.Length; i++) { int prime = primes[i]; if (prime >= min) return prime; } //outside of our predefined table. //compute the hard way. //

primes the number is not found satisfied (prime) quality, own calculations

for (int i = (min | 1); i < Int32.MaxValue;i+=2) { if (IsPrime(i) && ((i - 1) % Hashtable.HashPrime != 0)) return i; } return min; } }

 

2.2 Add {1,1} key, then the

    hashCode = 1;
  targetBucket = hasCode % buckets.Length;         //targetBucket = 1
    next = buckets[targetBucket];                               //next = -1
    buckets[targetBucket] = index;                             //buckets[1] = 0 

2.3 Add {2,2} key, then the

    hashCode = 2;
  targetBucket = hasCode % buckets.Length;         //targetBucket = 2
    next = buckets[targetBucket];                               //next = -1
    buckets[targetBucket] = index;                              //buckets[2] = 1

2.4 Add {4,4} key, then the

    hashCode = 4;
    targetBucket = hasCode % buckets.Length;         //targetBucket = 1
    next = buckets[targetBucket];                               //next = 0
    buckets[targetBucket] = index;                              //buckets[1] = 2

Next, the array entries will be presented in the form of a single linked list (i.e. enteries lateral array);

2.5 before continuing add the key, the operation capacity is needed, since the array entries and both have a length of 3 elements. Need need to re-assignment Next buckets and entries for each element of the expansion;

       private void Resize() {
            //扩容的大小:取大于(当前容量*2)的最小素数
            //

example:

Resize(HashHelpers.ExpandPrime(count), false); } private void Resize(int newSize, bool forceNewHashCodes) { Contract.Assert(newSize >= entries.Length); //

Examples of the buckets, and each element is set to -1

int[] newBuckets = new int[newSize]; for (int i = 0; i < newBuckets.Length; i++) newBuckets[i] = -1; Entry[] newEntries = new Entry[newSize]; Array.Copy(entries, 0, newEntries, 0, count); //

If Hash collision expansion, with new HashCode function recalculates the Hash value

if(forceNewHashCodes) { for (int i = 0; i < count; i++) { if(newEntries[i].hashCode != -1) { newEntries[i].hashCode = (comparer.GetHashCode(newEntries[i].key) & 0x7FFFFFFF); } } } //

Reconstruction singly linked list

for (int i = 0; i < count; i++) { if (newEntries[i].hashCode >= 0) { //

Resetting the modulo value and next buckets

int bucket = newEntries[i].hashCode % newSize; newEntries[i].next = newBuckets[bucket]; newBuckets[bucket] = i; } } buckets = newBuckets; entries = newEntries; } ******************************************************************* internal static class HashHelpers { ...... public static readonly int[] primes = { 3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919, 1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591, 17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437, 187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263, 1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369}; //

Prime (prime) group

...... // This is the maximum prime smaller than Array.MaxArrayLength public const int MaxPrimeArrayLength = 0x7FEFFFFD; //

The maximum length of the smallest prime number array

public static int ExpandPrime(int oldSize) { //

Doubling

int newSize = 2 * oldSize; // Allow the hashtables to grow to maximum possible size (~2G elements) before encoutering capacity overflow. // Note that this check works even when _items.Length overflowed thanks to the (uint) cast //

[Doubled size can not exceed the maximum length of the smallest prime number array]

if ((uint)newSize > MaxPrimeArrayLength && MaxPrimeArrayLength > oldSize) { Contract.Assert( MaxPrimeArrayLength == GetPrime(MaxPrimeArrayLength), "Invalid MaxPrimeArrayLength"); return MaxPrimeArrayLength; } //

Take the smallest prime number (prime)

return GetPrime(newSize); } public static int GetPrime(int min) { if (min < 0) throw new ArgumentException(Environment.GetResourceString("Arg_HTCapacityOverflow")); Contract.EndContractBlock(); //

Look for the number of primes satisfying (prime) quality

for (int i = 0; i < primes.Length; i++) { int prime = primes[i]; if (prime >= min) return prime; } //outside of our predefined table. //compute the hard way. //

primes do not find the prime number (prime number), calculate it itself

for (int i = (min | 1); i < Int32.MaxValue;i+=2) { if (IsPrime(i) && ((i - 1) % Hashtable.HashPrime != 0)) return i; } return min; } }

2.6 Continue adding key values {22, 22}, {29, 29}, {36, 36}, {40, 40}, after adding the internal storage result is as follows

3. Take key (the Find)

     public TValue this[TKey key] {
            get {
                //

Key take the corresponding value of index entries

int i = FindEntry(key); if (i >= 0) return entries[i].value; ThrowHelper.ThrowKeyNotFoundException(); return default(TValue); } set { //

Update the value for the Key

Insert(key, value, false); } } private int FindEntry(TKey key) { if( key == null) { ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key); } if (buckets != null) { int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF; //

Singly linked list traversal

for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) { if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i; } } return -1; } ********************************************************************************************* static void Foo() { ...... //

Take Key = 22

var val =dicData[22];
}

Simplify the code for taking the corresponding value of the key

    var hashCode =comparer.GetHashCode(key) & 0x7FFFFFFF;   // 22
    var targetBuget = hashCode % buckets.Length;            //

Modular taking operation 1

var i = bucket[targetBuget]; //链表头元素的索引 bucket[1] = 5 //

traverse a single linked list

for (; i >= 0; i = entries[i].next) { if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i; }

4. Remove the key (the Remove)

        public bool Remove(TKey key) {
            if(key == null) {
                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
            }
            if (buckets != null) {
                int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
                int bucket = hashCode % buckets.Length;
                int last = -1;
                //

The principle is to remove the key value, then record the entries free index (FreeList) and the number of idle (FreeChount)

for (int i = buckets[bucket]; i >= 0; last = i, i = entries[i].next) { if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) { if (last < 0) { buckets[bucket] = entries[i].next; } else { entries[last].next = entries[i].next; } entries[i].hashCode = -1; //

Create an idle linked list

entries[i].next = freeList; entries[i].key = default(TKey); entries[i].value = default(TValue); //保存entryies中空元素的索引 //

Easy to insert new key values in the position of the current index, reducing waste on entryies space

freeList = i; //

Number of empty elements plus 1

freeCount++; version++; return true; } } } return false; } ******************************************************************* static void Foo() { ...... //

Remove

new List<int> { 22, 29 }.ForEach(item => dicData.Remove(item)); }

4.1 After removing Key=22, Freelist = 3, FreeMont = 1,

4.2 After removing Key=36, Freelist = 5, FreeMont = 2,

5. Re-insert key value

As shown above, when {36, 36} is removed off, and the birth will find two elements, a “new list” (gray boxes on the figure). This effect is to insert a new key when, in accordance with the order index “new list” is inserted into the record array entries.
Example: Add {keys 22, 22}, {25, 25}, then freeList = 5, freeCount = 2;
  1. To entries [5] assignment, freeList = 3, freeCount = 1;
  2. assign values to entries [3], Freelist = -1, FreeCoount = 0;

 

Hopefully, this article will give you an understanding of the internal implementation of Dictionary.

Leave a Reply