Sunday 14 August 2011

Better protection against GPU brute force attack with bcrypt (and common sense)

Summary
For those of you who prefer to get to the point right-away instead of being lead to it, I have two points to make.

First, when it comes to password security it is better for common users to use longer (>8 characters long) and less cryptic passwords than shorter yet morecryptic passwords. By cryptic I mean passwords including lower and upper case characters combined with numbers and special charaters such as '#fK1~2'.

Why? Because a common users are more likely to be made vulnerable when their favourite site gets hacked during which the user database is stolen along with the usernames, passwords and the rest. At that point there will not be a single person trying to guess your password but instead a tireless GPU driven software doing a brute force attack by systematically going through all possible character combinations at speeds exceeding many hundreds of millions combinations per second: your sneaky 6-character cryptic password would be cracked within few seconds.

That is not to say the passwords should not have any special characters and numbers and the rest: there should as those make dictionary based attacks less likely to succeed. The point is that while having more characters in a password makes it stronger but overly cryptic words are harder to remember so if you have to choose between length and cryptic it is better make the password a little less cryptic and that much longer. The XKCD says it oh, so well, as usual :)


The second point is for software architects and developers who can significantly improve the security of their hashed passwords by adopting the BCRYPT hash function instead of common SHA-2 variants (and surely none of you are still using MD5 or SHA1 hash functions to encode passwords these days, right? Right?)

Why? Because bcrypt can be made very expensive to use so that it would take milliseconds (or even seconds, if you have paranoid tendencies) to encode a password instead of microseconds which is achieved by most of the common hash functions. It is trivial to spend about 0,6 seconds to encode user's password during registration and login as these operations happen reasonably rarely. However, those 600 milliseconds per single encoding becomes anything but trivial when a hacker attempts to brute force through all the passwords in your user database. As last line defences go, bcrypt should be preferred over other hash functions including SHA-2 variants.

Want to know more? Keep on reading.

The Problem
It has long been a common practice to store user passwords in a hashed form instead of the clear, human readable form. For years hash-algorithms such as MD5 and SHA-1 have been the preferred methods, but these days neither should be used for any security related purpose as they have well-known vulnerabilities. Instead many recommend that these days SHA-2 should be used, as it has no known exploitable vulnerabilities (apart from inept and lazy people who are using passwords that are too short and simple to stand against educated guess).

For those with a less technical background, a hash function is a one-way cryptographic algorithm that takes a variable length input and calculates a value that is unique for the specific set of data (e.g. file or string). It is not possible to reverse given hash value to reveal what the original value was. So when applied to passwords the way to verify if the given password matches with the hashed password in the database, the given password is hashed with the same algorithm and if the two hash values are identical it means that the right password was given and use may login.

For example, if the password is 'secret' the hash value (SHA-256) is
'2bb80d537b1da3e38bd30361aa855686bde0eacd7162fef6a25fe97bf527a25b'

Why passwords should be stored in their hashed form? Obviously hashing does not protect against somebody  trying to log in to service by using another user's username while attempting to guess what the password might be (to protect against this the service should temporarily lock user's account after n failed login attempts). Instead hashing the passwords is the service's last line of defence when a hacker gains unauthorised access to the user database.

User databases get hacked all too often. For example, in 2011 Sega lost 1.3 million user passwords, Sony's Playstation Network got hacked at least twice after which about a million usernames and passwords were made public by the hackers, not to mention many other known cases over the past few years (and not nearly all cases are known to public).

Hackers have been quite busy indeed but the designers and developers of those services should also take a good look at the closest mirror: after all, hackers merely exploit the existing holes in various libraries and services, many of which have been publicly documented.

Cause of Problem
While hashing is a good way to protect a password against the eyes of unauthorised mortals, they can be vulnerable against brute force attacks. Modern computer components are simply so fast and efficient that brute force attacks (where each and every possible combination is tried until the right combination of characters is found) have become quite reasonable option for attackers.

A modern brute force attack utilises GPU instead of more traditional CPU. Consider this: while a CPU based password recovery tool might take about a year to crack an eight-character password, a similar GPU based password recovery tool could do the same trick in less than a day. In another words, a 13-year old with a gamer's desktop computer and simple software tool could crack a list of typical hashed passwords within hours or days, if not in minutes. Now consider for a moment about what a well-resourced and determined professionals would do to passwords in the user database of your favourite web site should they gain access to it.

For example, a dirt cheap ATI Radeon HD 5450 can handle about 52 million SHA1 or 126 million MD5 hash computations per second. Upgrade to ATI Radeon HD 5970 and you would be doing about 2 320 million SHA1 or 5 631 million MD5 calculations per second, and that is with just a single GPU. Most desktops can have two linked GPUs, including the one I have under my desk that I have dedicated just for gaming and LAN-parties.

To put things into perspective, ATI Radeon 5770 can crack a five-character password under one second while a typical CPU might be do the same in about 24 seconds or so. A six character password would take about four seconds for 5770 to crack, and a seven character password would be sorted in about 17 minutes. The respective times for a typical CPU would be around 90 minutes and four days, or so.

The Solution?
Obviously there are no guarantees but there are ways to frustrate most attackers to a point when the reward just isn't worth the trouble.

For a common user the best protection is not to use overly cryptic and hard to remember passwords with caps, numbers and special characters (e.g. #fK1~2) but simply to use longer passwords. For example, if the system is using ASCII that has 95 printable characters, each new character in the password multiplies the number of possible combinations by 95. On the other hand, if your password is just a common word then you are wide open for dictionary based attacks and guesses of people who know you. Go for the middle ground.

At the same time software architects and developers should ditch the ye olde SHA-2 variants and similar algorithms and move to BCRYPT.

Bcrypt is a Blowfish variant that has one very important aspect that sets it apart from most other hash algorithms: it can be made very expensive to use in a world where cheap equals bad. Most hash algorithms have been optimised to calculate a hash value for large sets of data as fast as possible (for example, an AMD64 CPU can calculate MD5 hash for 335 MB of data in one second) which is great when one needs to find out if two large data sets are identical, but bad when dealing with common <10 character passwords, as shown earlier.

Bcrypt on the other hand is designed to be slower instead of faster when calculating the hash thus making it easy to increase the expense of brute force attack as a single hash calculation would take milliseconds (or even seconds) instead of microseconds. Combined with properly long unobvious passwords bcrypt can seriously frustrate GPU based brute force attacks while attackers relying on CPU should not even bother.

It is important to put this into proper context: it is perfectly fine for a password hashing to take about a second during registration and login as these happen fairly rarely: a typical user registers only once and might login to service few times a day whereas a hacker would need to go through as many individual passwords in as short time as possible. Given certain amount of time the hacker will have some of the bcrypt encoded passwords cracked, but not nearly as many as he would have if the passwords had been e.g. SHA-256 encoded, whch i in turn would be less than when dealing with *gasp* MD5 encoded passwords.

Another very nice thing about bcrypt is that it can be adapted to match Moore's Law: it has a work factor that can be freely increased as computers become faster as well as to tweak the balance between performance and security. Bcrypt is available for most programming languages and for example the Grails Spring Security Core -plugin by Burt Beckwith makes the using of bcrypt practially trivial.


Hopefully in the future I will not be seeing web sites that limit user passwords to max 8 characters while enforcing ridiculous character inclusions while at the same time I do hope to see frustrated hackers pissing their lives away by trying to brute force through bcrypt encoded passwords. It would be a nice start, but only a start as far as improving software security is concerned.