Saturday, November 24, 2007

Child Benefit leak - technical opinion

Having a bit of time on my hands, I did the sums on the two CDs that went missing between the Child Benefit Office and the National Audit Office. They supposedly hold details of 7.5 million families - including names of parents, children, dates of birth, NI numbers and, where people have the benefit paid into an account, bank details.

Two CDs can hold about 1.3 gigabytes of data if it is stored uncompressed, and maybe 2.5 GB with compression. Dividing that capacity by the 7.5 million records gives about 173 bytes per record if the data is uncompressed, or about 333 bytes per record if it is compressed. Using my family as examples, I come up with each record occupying at least 200 bytes, which is more than would fit uncompressed. This leads me to guess that the data has been compressed. The other reason for this assumption is that the data dump was probably done as a single text file.
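
For anyone who wants to check the sums, here is a rough Python sketch of the arithmetic. The capacities, the record count and the 200-byte estimate are the figures from the paragraph above; the function and constant names are mine and purely illustrative. Counting a gigabyte as 10^9 bytes gives about 173 and 333 bytes per record, while counting it as 2^30 bytes gives about 186 and 358, so treat every number here as a ballpark.

```python
# Back-of-envelope check of the per-record figures. All names are illustrative.
RECORDS = 7_500_000       # families reported in the missing data
UNCOMPRESSED_GB = 1.3     # rough raw capacity of two CDs
COMPRESSED_GB = 2.5       # guess at what two CDs could hold with compression
EST_RECORD_BYTES = 200    # lower-bound estimate for one family's record


def bytes_per_record(capacity_gb, records=RECORDS, gb=10**9):
    """Space available per record for a capacity given in gigabytes."""
    return capacity_gb * gb / records


def total_gb(record_bytes=EST_RECORD_BYTES, records=RECORDS, gb=10**9):
    """Total size of the dump if every record takes record_bytes bytes."""
    return record_bytes * records / gb


if __name__ == "__main__":
    # Show both conventions for "gigabyte", since the answer shifts slightly.
    for label, gb in (("10^9 bytes", 10**9), ("2^30 bytes", 2**30)):
        raw = bytes_per_record(UNCOMPRESSED_GB, gb=gb)
        packed = bytes_per_record(COMPRESSED_GB, gb=gb)
        print(f"GB = {label}: {raw:.0f} bytes raw, {packed:.0f} bytes compressed")

    # At 200 bytes per record the dump is bigger than two CDs can hold raw.
    print(f"Estimated dump size: {total_gb():.2f} GB")
```

At 200 bytes per record the whole dump comes to about 1.5 GB, which is more than the 1.3 GB that two CDs hold uncompressed. That is the basis for guessing that compression was used.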

Again, I guess that the employee asked to burn the data onto CD would have used a simple tool, like WinZip, to compress and split the file into CD-sized chunks - and of course WinZip offers 'encryption'. If this is the case, then anyone who gets hold of these discs will only need to spend £49 to extract the data.

I thought for quite a while before blogging this, but it only draws on the published information. It will inform the debate, and to be honest anyone with a small amount of technical knowledge would be able to work this out for themselves.

