Hash tables are used to speed up searches and the act of entering or deleting data sets in databases. For example, if you search a company’s employee database for a last name, this process can take a long time because each database field has to be searched one after the other (i.e. sequentially) for a match. Converting the search term into a hash value and then searching for a match in the hash table is generally much faster.
So, how is this done? Each data set is assigned a unique address. The way they are addressed is always the same within the database (e.g. 001, 002, 003 or 00A1, 00A2, 00A3, etc.). This address is calculated using the hash function.
Let us take a look at a simple example. In this example, the database can contain 11 entries from position 0 to position 10. The name ‘Lisa’ consists of four ASCII characters with their associated ASCII codes: ‘L’ is 76, ‘I’ is 105, ‘s’ is 115 and ‘a’ is 97. You can test this yourself in Windows with the numeric keypad. For example, [Alt] + 0076 will result in ‘L’. All ASCII values are added together and result in the hash value 393 for ‘Lisa’. Adding together the ASCII values is equivalent to using a hash function.
The remainder is then calculated using whole numbers: 393 % 11 (entry locations) = 35, remainder 8 (the percent sign ‘%’ is used in many programming languages as the mathematical operator to calculate the remainder). The remainder value determines where in the database Lisa and all her other information is stored – in this calculation example, the index number is 8. You might imagine that with this type of addressing, remainder values would often repeat. However, the more entry locations there are and the longer the hash value is, the less likely it is that a repeat will occur. ‘Alis Meyer’ results in a completely different location despite using the same letters since the ‘A’ is now an uppercase letter and the ‘l’ is lowercase.