essentially bucketing: S -> index -> value
1. Approximate operation on an array
hash function: conflict and conflict resolution
1. link list (chaining)
2. find next open one (open addressing)
1. How the next slot is determined? (Different probing approach)
3. Implement hash for a custom data-structure -> goal: avoid frequent conflict
resizing (rehash)
1. Think about GC
一致性哈希 Consistent hashing
Used for distributed caches -> split the hash table on different servers
The simplest way is to take the hash modulo of the number of servers
What happens if one of the servers crashes or becomes unavailable? Keys need to be redistributed to account for the missing server, of course
our simple modulo distribution is that when the number of servers changes, most hashes modulo N will change, so most keys will need to be moved to a different server.
We need a distribution scheme that does not depend directly on the number of servers, so that, when adding or removing servers, the number of keys that need to be relocated is minimized
Consistent Hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table by assigning them a position on an abstract circle, or hash ring. This allows servers and objects to scale without affecting the overall system.
1. Each object key will belong in the server whose key is closest, in a counterclockwise direction
2. What if adding/removing servers?
3. How to ensure the even-ness of assignment?