Slashdot today talked about the "Whopping-Big Data Theft At U.C. Berkeley". As the database included Social Security Numbers and other information useful for identity theft, a lot of comments were made about how it could have been protected. One comment stuck out for me:
Now, why in the world they were handed a bunch of social security numbers (instead of MD5's of the numbers) to store is a mystery to me
A lot of people believe calculating a cryptographic hash is a protection method for data. It is, and it isn't. A hash is a one-way function rather than encryption in the usual sense: there is no key and no decryption step. Given the hashed version, it is computationally infeasible to work backwards to the plain-text version.
As a Brit I will freely admit to having no clue about the format of the US SSN, but the UK equivalent, the National Insurance number, goes XX999999X. Now with such a well-known format, and a reasonably small set of possible values, pre-calculating a simple MD5 hash for every one of them is just an exercise in computing.
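To make that concrete, here is a rough sketch of such a pre-calculation. It is only an illustration: the class and method names (PrecomputedTableDemo, Md5Base64, BuildTable) are made up for this post, and it uses a 4-digit number so that it runs instantly. A real identifier format has more combinations, but the approach is the same.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Illustrative names only; nothing here comes from a real attack tool.
public static class PrecomputedTableDemo
{
    // Unsalted MD5 of a string, returned as Base64 text.
    public static string Md5Base64(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            return Convert.ToBase64String(md5.ComputeHash(Encoding.UTF8.GetBytes(text)));
        }
    }

    // Hash every possible value once and remember which plain text produced each hash.
    public static Dictionary<string, string> BuildTable(int digits)
    {
        int count = 1;
        for (int d = 0; d < digits; d++)
        {
            count *= 10;
        }

        var table = new Dictionary<string, string>();
        for (int i = 0; i < count; i++)
        {
            string candidate = i.ToString().PadLeft(digits, '0');
            table[Md5Base64(candidate)] = candidate;
        }
        return table;
    }

    public static void Main()
    {
        Dictionary<string, string> table = BuildTable(4);

        // Stands in for a hash read out of a stolen database.
        string stolenHash = Md5Base64("0427");

        // One dictionary lookup recovers the original value.
        Console.WriteLine(table[stolenHash]); // prints 0427
    }
}
```

Once the table exists, "reversing" any unsalted hash in the database is a single lookup, which is why hashing alone does little for data with a small, predictable format.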
Remember that a hash has to be the same every time it is calculated. That makes it possible to precalculate the hashes for every likely value, keeping the un-hashed data alongside, then compare them against the hashes in a compromised database and look up the original data in your pre-calculated table. Worse, pre-calculated hash dictionaries already exist, perfect for cracking your password field.
So what can you do? You salt it. This is done by combining the data with a second piece of information before hashing, one that is non-changing and unique for every user, for example the username or a user ID.
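using System;
using System.Text;
using System.Security.Cryptography;

// HashHelper is just an illustrative container; the method can live in any class.
public static class HashHelper
{
    public static string ComputeHash(string plainText, string saltText)
    {
        // Prepend the per-user salt so identical data hashes differently for each user.
        string saltedText = saltText + plainText;
        UTF8Encoding encoder = new UTF8Encoding();
        using (MD5CryptoServiceProvider md5Hasher = new MD5CryptoServiceProvider())
        {
            byte[] hashedBytes = md5Hasher.ComputeHash(encoder.GetBytes(saltedText));
            // Base64 turns the 16-byte hash into text that is easy to store.
            return Convert.ToBase64String(hashedBytes);
        }
    }
}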
All you're doing is joining the salt and the data together before computing the hash. Of course you need to change your checking code to cope: it has to hash the supplied value with the same salt before comparing the result with the stored hash.
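As a minimal sketch of what that checking code might look like: VerifyHash and the SaltedHashCheck class are hypothetical names, and ComputeHash is repeated here (using MD5.Create() for brevity) only so the example runs on its own. The user names and the value "secret" are invented for the demonstration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SaltedHashCheck
{
    // Same approach as the ComputeHash method above: salt first, then the data.
    public static string ComputeHash(string plainText, string saltText)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hashedBytes = md5.ComputeHash(Encoding.UTF8.GetBytes(saltText + plainText));
            return Convert.ToBase64String(hashedBytes);
        }
    }

    // Hypothetical helper: re-hash the supplied value with the user's salt
    // and compare it with the hash held in the database.
    public static bool VerifyHash(string suppliedText, string saltText, string storedHash)
    {
        return string.Equals(ComputeHash(suppliedText, saltText), storedHash, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Two users storing the same value, salted with their user names.
        string storedForAlice = ComputeHash("secret", "alice");
        string storedForBob = ComputeHash("secret", "bob");

        Console.WriteLine(VerifyHash("secret", "alice", storedForAlice)); // True
        Console.WriteLine(VerifyHash("secret", "bob", storedForAlice));   // False
        Console.WriteLine(storedForAlice == storedForBob);                // False
    }
}
```

The last line is the point of salting: the same data stored for two different users produces two different hashes, so a single pre-calculated table no longer matches every row.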