A Collision Attack is an attempt to find two input strings of a hash function that produce the same hash result. Because hash functions have infinite input length and a predefined output length, there is inevitably going to be the possibility of two different inputs that produce the same output hash. If two separate inputs produce the same hash output, it is called a collision. This collision can then be exploited by any application that compares two hashes together – such as password hashes, file integrity checks, etc.
The odds of a collision are of course very low, especially so for functions with very large output sizes. However, thanks to the probability exploitation of the birthday problem it is possible to reduce the complexity of generating a collision down to a more reasonable level.
For example, let’s say we have a hypothetical hash function called “Hesh”. A collision attack would first start with a starting input value, and hash it.
Now the attacker needs to find a collision – a different input that generates the same hash as the previous input. This would generally be done through a brute-force method (trying all possible combinations) until one was found. Let’s say we found a collision for this input in our hypothetical hash function.
The attacker now knows two inputs with the same resulting hash. As an example for a practical use of this – if the attacker was offering a file download and showed the hash to prove the file’s integrity, he could switch out the file download but the hash would remain the same. The file would appear valid as it has the same hash as the supposed real file, but he could swap out the correct file for the collision instead, without it being obvious to the file validator.
So – are hash collisions realistically feasible? Yes, depending on the hash function. Md5 and even SHA-1 have been shown to not be very collision resistant – however stronger functions such as SHA-256 seen to be safe at the current time.