Introduction
As far as assessable course content goes, I don't think Hash Tables are in that category. But just to be on the safe side and learn some cool things, here are notes!
Hash tables are a data type that provide a way of pairing data with a particular 'key'. This allows for quick access, insertion and deletion.
Commonly implemented as an array, a hash table applies a hash function to a variable and then stores said variable at that index of the array. Essentially it gives you a number for your string or struct or any other data, and that tells you where to look for it.
The number is always the same for any piece of input with the same hash function. This has to be the case or storage and look-up wouldn't work.
Collisions
Depending on the algorithm used to convert data to a number (the hash function), multiple different values may hash to the same number, causing what is termed as a "collision". There are many ways to deal with this, but I think the simplest method is to create an array of linked lists, where each node stores the hash'd data.
Sizing
Hash table size can be approached in one of two main strategies;
- Make the table big enough to start with, allowing for lack of optimal memory use.
- Make the table resize when it nears full
One may be more appropriate than the other depending on the hash algorithm and strategy for dealing with collisions
Code (for storing structs)
Creation
// createHash creates a HashTable // Initialising every cell to be a NULL-pointing List HashTable createHash () { HashTable h = malloc(sizeof(struct hashTable)); if (h == NULL) { fprintf (stderr, "Out of memory, exiting\n"); exit(1); } h->array = malloc(sizeof(List)*HT_SIZE); if (h->array == NULL) { fprintf (stderr, "Out of memory, exiting\n"); exit(1); } int i; for (i = 0; i < HT_SIZE; i++) { h->array[i] = createList(); } return h; }
Deletion
void deleteHashTable (HashTable h) { if (h != NULL) { if (h->array != NULL) { int i; for (i = 0; i < HT_SIZE; i++) { deleteList(h->array[i]); } free(h->array); } free (h); } }
Insertion
// insertHashTable takes in a Data d // If d is not in the HashTable it inserts it at the start of the list void insertHashTable (HashTable h, Data d) { if (h != NULL) { if (inTable(h, d) == NULL) { int index = hash(d); Node current = h->array[index]->first; h->array[index]->first = createNode(d); h->array[index]->first->next = current; } } }
Removal
// removeNode finds the node before the matching one // It then relinks the nodes correctly, freeing the matching one Data removeHashTable (HashTable h, Data d) { if (h != NULL) { Node found = inTable(h,d); if (found != NULL) { int index = hash(d); Node next = found->next->next; deleteNode(found->next); found->next = next; } } }
Find Data
// inTable returns the node before the matching one // that way removeHashTable can quickly delete the correct node Node inTable (HashTable h, Data d) { if (h != NULL) { Node found = NULL; int index = hash(d); Node current = h->array[index]->first; Node next; while ((current->next != NULL) && (found == NULL)) { if (match(current->next->data, d)) { found = current; } else { current = current->next; } } } return found; }