The mighty python dictionary

How can Python lists access every one of their items with equal speed?

Answer:

Python lists use segments of RAM and RAM acts like a python list (!)

  • RAM is a vast array
  • Addressed by sequential integers
  • Its first address is Zero!
  • Easy to implement a list atop memory

The Dictionary

Uses Keys instead of indexes and keys can be almost anything.

dict

Still, they efficiently find the object stored in RAM.

The problem with the dictionary.

How do turn the keys of the dictionaries used in indexes that reach memory quickly?

The Initial maintainers come up with The Three Rules.

  1. A dictionary is really a list

when you start out with an empty dictionary, actually behind the scene it creates 8 element list of tables in a contiguous block of RAM, we might say this “list” of “items” but the correct term used is “hash table” containing “slots”.

2. Keys are hashed to produce indexes

Python lets you see hashing in action through the hash() built-in.

Hash is a mathematical algorithm that takes a string like ‘Monty’, a value of pi, or a tuple and reduces it to 32 bits of 1 and 0 and does it in such a way that it can kind of scatters information of data all across the 1 and 0.

Quite similar values often have very different hashes

‘Monty’ and ‘Money’ only differ by a single letter you can see that hash values have multiple differences. Hash is a big binary number, even slight differences can scatter across the entire hash.

Hashes look crazy, but the same value always returns the same hash!

Keys and Indexes

To build an index, Python uses the bottom n bits of the hash

Let’s start with an empty dictionary and insert a key value into it.

It will compute the hash of the key, its big binary number pull off the last 3 bits ‘010’ check that slot, and then decide to store the key value there.

Then try to insert more key values into that same dictionary, 3 bits of hash check whether the slot is empty or not, if the slot is empty then store the key values on that slot.

Inserting key values in the dictionary

Visual representation of an updated hash table

Lookup for key values

when you search for a particular key in the dictionary then it follows the same 3 steps.

  • Compute the hash
  • Truncate big hash into 3 bits
  • Look in that slot

If you try to search for ‘Bipul’ then the dictionary computes the hash, truncates the hash into 3 bits, and jumps to the right slot, even though the dictionary has 8 slots it immediately jumps to the right one.

Consequences of hashing

Dictionaries tend to return their contents in a crazy order
but the dictionary does not return values in a crazy order, the dictionary printing value is in the same order as the item stored in the hash table.

3. If at first, you don’t succeed, try, try again

collision: When two keys in a dictionary want the same slot.

we are trying to store a key-value but the required slot is already occupied by a key that has the same hash, I am not going into the mathematics of Python but Python uses a function that checks a secondary backup slot to store this particular key-value.

‘website’ chose slot ‘001’ because it was free, the ‘website ’ is now not in the place it wanted it is one step away.

Conclusion

Explaining complete internal functionality or implementation of the dictionary is not possible in one article but I tried to explain How basic Insertion, creation of the hash table, and searching of keys in Python dictionary work.

In the next article, I will try to explain the auto-resizing of the hash tables, how dictionaries trade space for time, and other stuff.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top