undefined

by Bruce B.

What is the basic use of a Hash Table?

A hash table organizes data so you can quickly look up values for a given key.

Think of a hash map as a "hack" on top of an array to let us use flexible keys instead of being stuck with sequential integer "indices."

What are the basic strengths and weaknesses of a Hash Table?

Strengths:

Fast lookups. Lookups take time on average.
Flexible keys. Most data types can be used for keys, as long as they're hashable.

Weaknesses:

Slow worst-case lookups. Lookups take time in the worst case.
Unordered. Keys aren't stored in a special order. If you're looking for the smallest key, the largest key, or all the keys in a range, you'll need to look through every key to find it.
Single-directional lookups. While you can look up the value for a given key in time, looking up the keys for a given value requires looping through the whole dataset— time.
Not cache-friendly. Many hash table implementations use linked lists, which don't put data next to each other in memory.

What are some of the common uses for Hash Tables?

Associative arrays: Hash tables are commonly used to implement many types of in-memory tables. They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects).
Database indexing: Hash tables may also be used as disk-based data structures and database indices (such as in dbm).
Caches: Hash tables can be used to implement caches i.e. auxiliary data tables that are used to speed up the access to data, which is primarily stored in slower media.
Object representation: Several dynamic languages, such as Perl, Python, JavaScript, and Ruby use hash tables to implement objects.
Hash Functions are used in various algorithms to make their computing faster

What are basic components of a Hash Map/Table?

1) A zero-initialized array of stored keys, where the indices will represent the keys calculated by a hashing function

2) A hashing function that converts the arbitrary key data type to an integer index (if the user wants a key to be a string for example, we could convert the characters of the string into individual ASCII integers and add them together, then modulo by some large enough prime number to produce a likely-unique key.

3) Collision reduction/proofing - some extra handling in case it is still possible for collisions to occur. This could be done by having a linked list to store the actual keys at the collision index (to iterate through as a secondary step) or by incrementing the index by some set amount repeatedly until we find an empty slot.

What is the difference between a Hash Map (Dictionary) and a Hash Table?

Essentially similar to a Hash Table with two main differences:

A HashTable doesn't allow null keys or values; a HashMap does.
A HashTable is synchronized to prevent multiple threads from accessing it at once; a HashMap isn't.

What is another common name for a Hash Map?

Dictionary

What is another common name for a Dictionary?

Hash Map

(not Hash Table!)

What is the average time complexity for searching a Hash Map/Table?

What is the WCS time complexity for searching a Hash Map/Table?

Average:

O(1)

Worst-Case Scenario:

O(n)

(usually due to implementation issue)

What is the average time complexity for inserting into a Hash Map/Table?

What is the WCS time complexity for inserting into a Hash Map/Table?

Average:

O(1)

Worst-Case Scenario:

O(log n)

What is the average time complexity for searching a Hash Map/Table?

What is the WCS time complexity for searching a Hash Map/Table?

Average:

O(1)

Worst-Case Scenario:

O(n)

(usually due to implementation issue)

What is the average time complexity for removing from a Hash Map/Table?

What is the WCS time complexity for removing from a Hash Map/Table?

Average:

O(1)

Worst-Case Scenario:

O(n)

(usually due to implementation issue)

What is the average time complexity to traverse a Hash Map/Table in order?

O(n log n)

This is not a data structure optimized for this use. It is best for optimizing add, remove, and lookup.

What is the space consumption of a Hash Map/Table?

O(n)

Because it is a constant size that is then proportional to the number of unique stored key/value pairs.

What is a simple hash function for keys that have an int data type?

key % capacity

Where capacity is the currently allotted size of the array used to store the key/value pairs.

To handle hash collisions, use another structure (such as a Linked List of key/value/next nodes) at each index of the Hash Map/Table’s array.

What is a simple hash function for keys that have a char data type?

ord(key) % capacity

Convert the char to an int and do the basic hashing by capacity.

Where capacity is the currently allotted size of the array used to store the key/value pairs.

To handle hash collisions, use another structure (such as a Linked List of key/value/next nodes) at each index of the Hash Map/Table’s array.

What is a simple hash function for keys that have a string data type?

char_sum = 0

for c in key:

char_sum += ord(c)

return char_sum % capacity

Sum the integer values for all of the characters in the string to get an integer value, then hash that.

Where capacity is the currently allotted size of the array used to store the key/value pairs.

To handle hash collisions, use another structure (such as a Linked List of key/value/next nodes) at each index of the Hash Map/Table’s array.

What are two of the most common ways to handle hash collisions?

Open Addressing (Next Empty)

If we hash a key and find that the resulting index is already occupied, we just iterate through the list by some amount (sometimes simple index++, sometimes index², etc.) until we find an empty position.

This also impacts search - if we get the wrong key at the expected hash, we iterate until we find it or an empty index - we can stop at empty because we know that’s where the key would have existed if it was in the table at all.

Chaining (Buckets/Linked Lists/etc.)

Store values as data structures that can chain to iterate through (e.g. a Linked List nodes with key/value/next), and store them in that single index.

What is the basic optimization for Hash Map/Table capacity sizing and resizing?

Prime numbers for initial size, then resize to next prime number that is at least twice as much as the current capacity.

Actually implementing “find next 2x -> prime” is not practical to do in an interview. If asked, simply say it’s some stubbed helper method.

Join Course

Preview

Author

Bruce B.

Information

Last changed
8 months ago

Report course