Knowing how to create a hash table is helpful when using built-in HashTable and HashMap implementations in various languages. Questions about hash tables are commonly asked in programming interviews, and often people are asked to create an implementation from scratch. Below is an example of how to create a hash table in Java using “chaining” for collision resolution (explained below). There are a variety of common ways to implement a hash table, but the basic parts of any hash table are the array, the hash function, and handling collisions. The basic methods include get()
, put()
, and delete()
.
The Array
The underlying data structure used for storing the key/value pairs of a hash table is an array. Arrays are created with a fixed size. To determine where (the index) in the array each object should be, we use a hash function.
We use the code below to create our storage array. In this example we set the array to a size of 128. You can see here that we are creating an array of LinkedLists. LinkedLists are used for collision avoidance (explained below) in this example.
public static final int ARR_SIZE = 128;
private LinkedList<HTObject>[] arr = new LinkedList[ARR_SIZE];
The Hash Function
A hash function is a function that can take an input of any size and convert it to a fixed size. In this case, this fixed size is the size of the storage array. The goal here is to take the key of each key/value pair being added or removed from the hash table, and determine the proper location for this key within the array.
This hash function used in this implementation is simple. We use the remainder of dividing the key’s hashCode by the array size. This ensure that we always are returned a valid index value for the array.
key.hashCode() % ARR_SIZE;
Handling Collisions
It is possible that the hash function will return the same value (meaning same array location) for different keys. This is called a “collision”. A hash table must have a way to handle collisions.
In this implementation we use “chaining” with a LinkedList to handle collisions. Each location in the array contains a LinkedList. Items in the array are stored in these LinkedLists. This allows us to store multiple objects at each index in the array.
Basic Hash Table Methods
The three basic operations of a hash table include adding, removing, and getting (using a key to get the associated value). In the implementation below, keys and values are stored together using a custom HTObject. Using a custom object with references to both the key and value of each key/value pair can simplify the code written to implement these basic methods.
public static class HTObject {
public String key;
public Integer value;
}
The Put Method
The put()
method is used to add items to, or update the hash table. Our put() method first applies the hash function to the key in the provided key/value pair. The output of the hash function indicates the index/location for the key to be stored in the hash table array. We first check if there are any items already being stored at this index by checking for the existence of a LinkedList. If there is no LinkedList at this array index we create one, and then add a new object containing the provided key/value pair.
If there does already exist objects at the specified key value pair, we iterate through the LinkedList to see if the provided key already exists within an object in the LinkedList. If the key does exist, the associated value is simply updated. Otherwise a new object containing the key/value pair is added to the LinkedList.
The Delete Method
The delete()
method is used to remove key/value pair objects from the hash table. This method works similarly to the put() method. If there are no items (no LinkedList) in the array location specified by the hash function, the method simply returns. Otherwise, we iterate over the objects in the LinkedList and search for an object with the provided key. If this key is found, the associated object is removed from the LinkedList.
The Get Method
The get()
method is used to return the value of a provided key. This method works similarly to the other two. If there are no objects (no LinkedList) in the array location specified by the hash function, the method simply returns null
. Otherwise, we iterate over the objects in the LinkedList and search for an object with the provided key. If this key is found, the associated value is returned. Note that the get() method functionality is split into get()
and getObj()
in the code below.
The Code
package com.bigdatums.interview;
import java.util.LinkedList;
public class HashTable {
public static class HTObject {
public String key;
public Integer value;
}
public static final int ARR_SIZE = 128;
private LinkedList<HTObject>[] arr = new LinkedList[ARR_SIZE];
public HashTable() {
//init vals in array
for(int i=0; i<ARR_SIZE; i++) {
arr[i] = null;
}
}
private HTObject getObj(String key) {
if(key == null)
return null;
int index = key.hashCode() % ARR_SIZE;
LinkedList<HTObject> items = arr[index];
if(items == null)
return null;
for(HTObject item : items) {
if(item.key.equals(key))
return item;
}
return null;
}
public Integer get(String key) {
HTObject item = getObj(key);
if(item == null)
return null;
else
return
item.value;
}
public void put(String key, Integer value) {
int index = key.hashCode() % ARR_SIZE;
LinkedList<HTObject> items = arr[index];
if(items == null) {
items = new LinkedList<HTObject>();
HTObject item = new HTObject();
item.key = key;
item.value = value;
items.add(item);
arr[index] = items;
}
else {
for(HTObject item : items) {
if(item.key.equals(key)) {
item.value = value;
return;
}
}
HTObject item = new HTObject();
item.key = key;
item.value = value;
items.add(item);
}
}
public void delete(String key) {
int index = key.hashCode() % ARR_SIZE;
LinkedList<HTObject> items = arr[index];
if(items == null)
return;
for(HTObject item : items) {
if (item.key.equals(key)) {
items.remove(item);
return;
}
}
}
}