Friday, December 9, 2016

Secure Hashing of Passwords

Numerous breaches happen every day due to security vulnerabilities. On this blog, I have previously analyzed some of the biggest breaches that have been made public. All of them serve as a testament to the inadequate security many companies employ.

The Time it Takes to Crack a Hash

I often get asked: "How long does it take to crack an MD5 hash?" The question implies that the cryptographic hash algorithm is the most important factor, which is rarely the case. It actually depends on several factors, which I've ordered by descending importance below:
  1. The length of the password
  2. The charsets used in the password
  3. The amount of hardware you have
  4. The methodology you use for cracking

Secure Passwords

Password length alone is not enough; you also have to use a good charset to control the search space hackers have to go through. Let's work through an example to show how much the search space changes:

Charset = a-z, A-Z, 0-9
Length 6: (26+26+10)^6 = 56.800.235.584
Length 7: (26+26+10)^7 = 3.521.614.606.208
Length 8: (26+26+10)^8 = 218.340.105.584.896

As you can see, the number gets a lot bigger for each character we add. Now let's look at charsets:

Length = 8
Charset: a-z = (26)^8 = 208.827.064.576
Charset: a-z, A-Z = (26+26)^8 = 53.459.728.531.456
Charset: a-z, A-Z, 0-9 = (26+26+10)^8 = 218.340.105.584.896

Charsets matter as well as length. In reality, we need to increase both in order to get good passwords. For reference, an Nvidia Titan X graphics card can compute around 15.000.000.000 MD5 hashes per second. Trying every password of length 8 over a-z, A-Z, 0-9 would take (26+26+10)^8 / 15.000.000.000 = 14556 seconds (about 4 hours). That is not long enough to deter hackers, so we need to do something about the hashing algorithm as well.
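
The arithmetic above is easy to reproduce. The following is a minimal sketch; the function names are my own, and the hash rate is simply the Titan X figure quoted above. It computes the worst case of trying every candidate, so on average an attacker would need about half that time.

```python
def keyspace(charset_size: int, length: int) -> int:
    """Number of candidate passwords of exactly `length` characters."""
    return charset_size ** length


def seconds_to_exhaust(charset_size: int, length: int, hashes_per_second: int) -> float:
    """Worst-case time to try every candidate at a given hash rate."""
    return keyspace(charset_size, length) / hashes_per_second


ALNUM = 26 + 26 + 10        # a-z, A-Z, 0-9
MD5_RATE = 15_000_000_000   # MD5 hashes per second, the figure quoted above

# Search space for lengths 6, 7 and 8 with the alphanumeric charset.
for length in (6, 7, 8):
    print(length, keyspace(ALNUM, length))

# Time to exhaust all 8-character alphanumeric passwords.
seconds = seconds_to_exhaust(ALNUM, 8, MD5_RATE)
print(f"{seconds:.0f} seconds, about {seconds / 3600:.1f} hours")
```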

The Cryptographic Hash Algorithms

Over time, developers have used many different cryptographic hash algorithms to store passwords. Actually, most systems still use rather weak hash algorithms, which is why this blog post even exists. Here are the hashing algorithms, ordered from strong to weak:

1. scrypt, bcrypt, PBKDF2
2. SHA512, SHA256
3. SHA1, MD5
4. NTLM, MD4
5. LM, DES (unix crypt)

I could write 100 blog posts about each algorithm, but suffice it to say that LM and DES used to be VERY widespread in the days of Windows XP and Linux 2.x. Both platforms have since moved on to stronger algorithms, but Windows is stuck on NTLM, and Linux still uses MD5, SHA256 and SHA512 for the most part, although it is transitioning nicely to scrypt and PBKDF2.

You might be wondering what the speeds are of each algorithm, so I'll include each of them here. A larger number of hashes per second is worse, because it means an attacker can test more guesses. The speeds are from an Nvidia GeForce GTX 1080 graphics card with Hashcat:
  1. scrypt - 434.000 H/s
  2. bcrypt - 13.226 H/s
  3. PBKDF2 (SHA1) - 3.218.000 H/s
  4. SHA512 - 1.021.000.000 H/s
  5. SHA256 - 2.829.000.000 H/s
  6. SHA1 - 8.491.000.000 H/s
  7. MD5 - 24.809.000.000 H/s
  8. NTLM - 41.354.000.000 H/s
  9. MD4 - 43.145.000.000 H/s
  10. LM - 18.429.000.000 H/s
  11. DES - 909.000.000 H/s

You might say: Hold on a minute! DES is actually slower than SHA512, does that not mean it is more secure?

Nope! The traditional Unix crypt based on DES truncates every password to a maximum of 8 characters and uses only 7 bits of each, so the effective key is just 56 bits, and its salt is only 12 bits (stored as the first 2 characters of the hash). LM hashes are limited to 14 characters, but the way Microsoft designed the algorithm is flawed: the password is split into two 7-character halves that are hashed independently, so we only need to crack 2x7 character passwords.
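
To put the LM weakness into numbers, here is a rough sketch. The charset size of 69 is my assumption (uppercase letters, digits and common symbols), the rate is the LM figure from the list above, and the function names are made up for illustration.

```python
LM_CHARSET = 69            # assumption: uppercase letters, digits and common symbols
LM_RATE = 18_429_000_000   # LM hashes per second, from the list above


def lm_split_candidates(charset_size: int) -> int:
    """Two independent 7-character halves: the work adds up instead of multiplying."""
    return 2 * charset_size ** 7


def full_14_candidates(charset_size: int) -> int:
    """What a 14-character password would cost if it were hashed as a whole."""
    return charset_size ** 14


split = lm_split_candidates(LM_CHARSET)
whole = full_14_candidates(LM_CHARSET)
print(f"two halves: {split} candidates, {split / LM_RATE / 60:.1f} minutes")
print(f"hashing all 14 characters at once would be {whole // split} times more work")
```

The point is that splitting the password turns one huge search space into two small ones, which is why LM is weak despite the 14-character limit.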

The Salted Hash

Password salts are needed in order to protect against rainbow table attacks. Hackers today have access to large precomputed tables covering common character sets for many different cryptographic hash algorithms, and these tables can be quite effective even when the target is just a single hash.

The salt is just a binary blob that gets appended to the password in the hashing process. A password of '123456' becomes '123456[SaltHere]', which makes it impractical to precompute a table that contains '123456[SaltHere]'. Salts should be unique for each user in the database, so that a table built for one salt is useless against every other user.

The right way to salt the password is to use a cryptographically secure pseudo-random number generator (CSPRNG) to create a salt which is unique for each user. It needs to be at least 64 bits (8 bytes) to be effective. The salt is stored together with the hash inside the user database in plain text.
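
As a small illustration of the above, the sketch below generates a per-user salt with Python's secrets module and shows that the same password produces different hashes under different salts. The helper names are mine, and the single SHA-256 call is only there to show the effect of the salt; it is not a recommendation for storing passwords.

```python
import hashlib
import secrets


def new_salt(num_bytes: int = 16) -> bytes:
    """Return a fresh random salt from the operating system's CSPRNG."""
    return secrets.token_bytes(num_bytes)


def salted_sha256(password: str, salt: bytes) -> str:
    """Illustration only: one fast hash is NOT suitable for storing passwords."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


salt_alice = new_salt()
salt_bob = new_salt()

# Same password, different salts, different stored hashes.
print(salt_alice.hex(), salted_sha256("123456", salt_alice))
print(salt_bob.hex(), salted_sha256("123456", salt_bob))
```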

The Slow Hash

Cryptographic hash functions like MD5 and SHA1 are designed with high performance in mind. This means that they can quickly produce a hash of an arbitrary input, but it also means that they are well suited to brute-force and wordlist attacks. To slow down the hashing procedure, we use something called a key derivation function (KDF). scrypt, bcrypt and PBKDF2 are such functions. They take in a password, a salt and a number of iterations. Most implementations actually just require a password and a "workload factor", which simplifies the use of the algorithms.

They work by running a cryptographic primitive like SHA1, SHA256 or Blowfish for the configured number of iterations together with the salt. PBKDF2, for example, applies its hash through HMAC rather than calling it directly, which protects against several attacks on the bare hash function.

As you can see from the hash speed list above, scrypt, bcrypt and PBKDF2 are much, much slower than their relatives. Their cost is configurable, which makes them perfect for password hashing. Developers that implement these algorithms should aim for the number of iterations it takes to make a password hashing operation last 50 ms or more on their server.
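
One way to find such an iteration count is to measure it on the server itself. The sketch below uses hashlib.pbkdf2_hmac from the Python standard library; the doubling strategy, the starting value and the function names are my own assumptions, not a standard procedure.

```python
import hashlib
import os
import time


def time_pbkdf2(iterations: int) -> float:
    """Seconds taken by one PBKDF2-HMAC-SHA256 computation on this machine."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark password", salt, iterations)
    return time.perf_counter() - start


def calibrate_iterations(target_seconds: float = 0.050, start: int = 10_000) -> int:
    """Double the iteration count until one hash takes at least `target_seconds`."""
    iterations = start
    while time_pbkdf2(iterations) < target_seconds:
        iterations *= 2
    return iterations


if __name__ == "__main__":
    print(calibrate_iterations())
```

A single timing is noisy, so in practice you would probably average a few runs and repeat the calibration when the hardware changes.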

If we take the example of an 8-character password with the charset a-z, A-Z, 0-9 from above, which took 4 hours to crack with MD5, and try to crack it using PBKDF2, it would take around 2 years to crack!
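
The same calculation can be repeated for several entries of the speed list. This is a rough sketch that reads the PBKDF2 entry as 3.218.000 H/s and assumes the attacker has to try every candidate; the names are mine.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Hashes per second, taken from the speed list above.
RATES = {
    "MD5": 24_809_000_000,
    "SHA256": 2_829_000_000,
    "PBKDF2 (SHA1)": 3_218_000,
    "bcrypt": 13_226,
}

CANDIDATES = (26 + 26 + 10) ** 8   # 8 characters from a-z, A-Z, 0-9


def years_to_exhaust(hashes_per_second: int) -> float:
    """Worst-case years needed to try every candidate at the given rate."""
    return CANDIDATES / hashes_per_second / SECONDS_PER_YEAR


for name, rate in RATES.items():
    print(f"{name:15s} {years_to_exhaust(rate):12.4f} years")
```

With these figures the PBKDF2 line comes out at roughly two years, which matches the estimate above.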

Conclusion

Use scrypt, bcrypt or PBKDF2 with a high enough work factor that it takes at least 50 ms to compute a single hash on your server. These functions take care of hashing the correct way and applying a salt, and most implementations use a CSPRNG to generate the salt. Stay away from everything else.
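
To make the conclusion concrete, here is a minimal sketch of storing and verifying a password with PBKDF2 from the Python standard library. The record layout (fields separated by '$'), the iteration count and the function names are assumptions for illustration; a real system would more likely rely on an established password hashing library.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000   # assumption: tune so one hash takes at least 50 ms on your server


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return a record holding the scheme, work factor, salt and hash."""
    salt = secrets.token_bytes(16)   # unique per user, from a CSPRNG
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


record = hash_password("correct horse battery staple")
print(record)
print(verify_password("correct horse battery staple", record))
print(verify_password("123456", record))
```

Storing the iteration count in the record means the work factor can be raised later without invalidating existing hashes.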

Saturday, December 3, 2016

The problem with Network Address Translation

As a technology ideologist, I consider Network Address Translation (NAT) one of my biggest concerns when it comes to the future of the Internet. The Internet was built as a communications tool to facilitate the sharing of information in digital form. It has vastly improved the communication between humans around the planet and it has been one of the most important inventions we have ever made.

My concern is that NAT directly inhibits the nature of the Internet, which goes against everything we want the Internet to be.

Network Address Translation

The Internet uses routers to route data from network to network, but to do so, we need an address for each network. Today we use Internet Protocol version 4 (IPv4), whose addresses you usually see in dotted format such as 199.181.132.250. In its raw form, IPv4 is a 32-bit (4 bytes) addressing scheme which is able to express 2^32 (4.294.967.296) addresses. That was enough back when IPv4 was specified in 1981, but it is nowhere near enough for the Internet today. To solve this problem, Internet Protocol version 6 (IPv6) was created, which has 128 bits that give us 2^128 (340.282.366.920.938.463.463.374.607.431.768.211.456) addresses, which should be enough to last us for a long time.
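
The numbers above can be checked with Python's ipaddress module. This is just a small sketch to show that a dotted address is a 32-bit integer in disguise, and how large the two address spaces are.

```python
import ipaddress

addr = ipaddress.ip_address("199.181.132.250")
print(int(addr))            # the same address as a single 32-bit integer
print(addr.packed.hex())    # its 4 raw bytes in hexadecimal

print(2 ** 32)              # size of the IPv4 address space
print(2 ** 128)             # size of the IPv6 address space

# The library agrees: the networks covering "everything" have exactly that many addresses.
print(ipaddress.ip_network("0.0.0.0/0").num_addresses == 2 ** 32)
print(ipaddress.ip_network("::/0").num_addresses == 2 ** 128)
```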

Routers need a firmware upgrade in order to route 128-bit addresses, and back in the day when IPv6 was devised (1998), routers simply did not have the capacity to route the larger address space. A quick workaround was NAT, which is a simple mechanism to translate one address to another. With this capability, Internet Service Providers (ISPs) began to distribute routers pre-configured with NAT enabled. They had DHCP enabled on the LAN interface with non-routed IP addresses (such as 192.168.0.0/24), which then got translated to a routed IP address on the WAN interface. This allowed ISPs to extend the lifetime of IPv4 and keep using low-capacity consumer routers.

That does not sound too bad, right? It is a clever way to circumvent the limitations of IPv4 and keep using existing hardware which was produced for cheap in large quantities.

The Problem

A side effect of NAT is that it splits the Internet into many smaller private networks, which can't communicate directly with each other unless you do port forwarding. This side effect has been described as a "security feature" ever since NAT was conceived, as it essentially functions like a simple firewall between networks.
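
The side effect is easy to see in a toy model. The sketch below is not how any real router is implemented; the class, its methods and the addresses are made up for illustration, and real NATs also track the remote endpoint and expire old mappings.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]   # (IP address, port)


class ToyNat:
    """Toy port-translating NAT: one public IP shared by many private hosts."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, int] = {}   # private endpoint -> public port
        self.in_map: Dict[int, Endpoint] = {}    # public port -> private endpoint

    def outbound(self, private: Endpoint) -> Endpoint:
        """Rewrite the source of an outgoing packet, creating a mapping if needed."""
        if private not in self.out_map:
            self.out_map[private] = self.next_port
            self.in_map[self.next_port] = private
            self.next_port += 1
        return (self.public_ip, self.out_map[private])

    def port_forward(self, public_port: int, private: Endpoint) -> None:
        """Static mapping configured by the user."""
        self.in_map[public_port] = private

    def inbound(self, public_port: int) -> Optional[Endpoint]:
        """Return the private destination, or None if the packet is dropped."""
        return self.in_map.get(public_port)


nat = ToyNat("203.0.113.7")                          # documentation address
print(nat.outbound(("192.168.0.10", 51000)))         # outgoing traffic gets a mapping
print(nat.inbound(6881))                             # unsolicited inbound: dropped
nat.port_forward(6881, ("192.168.0.10", 6881))
print(nat.inbound(6881))                             # reachable only after port forwarding
```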

Remember what the purpose of the Internet was? That's right, to transmit digital information between machines! NAT completely contradicts the whole purpose of the Internet by segmenting the Internet, just because we saw NAT as a quick fix to our IPv4 addressing problem. We have become accustomed to the fact that NAT exists. It has served its purpose, and it is time we move on to IPv6 and make the Internet work as intended again.

The Solution

By now you should already know that IPv6 is the solution to the NAT problem, but there is still an unanswered question: What about the security NAT provides?

Let me counter with a question of my own: What is it we are trying to achieve?

I'd agree that the Internet should not just be one big monolithic security boundary, where any hosted service can be reached by everyone else. That is why we have firewalls, which NAT is not. You still have a router with two networks, and in order to publish a service from one network to the other, a firewall will have to allow it. Firewalls are a much more efficient solution to the security problem than NAT is.

Let's say we enabled IPv6, what exactly would that mean?

That is the funny part; you are already running IPv6! That's right, you are running a dual-stacked network layer capable of both IPv4 and IPv6, so you don't have to do anything. Most routers are also already dual-stacked. We just have to disable IPv4 and tadaaaa, you are IPv6 only.

For router manufacturers, developers and network engineers it is a huge advantage to completely disable IPv4, which is why many ISPs today are running IPv6-only networks internally. As a manufacturer or developer, you can remove NAT, TURN, STUN, IGDP and NAT-PMP from routers and communication software. Of course, we still need a network control protocol that publishes a service in the router's firewall, but it is much simpler now that you don't have to take NAT into account.

A Real Example

The BitTorrent protocol is one of the most commonly occurring protocols on the Internet. It is a simple protocol to transmit binary data between clients in an efficient manner using Peer to Peer (P2P). Clients find each other by using a Tracker or what's known as a Distributed Hash Table (DHT). DHT is a simple protocol, but it is severely limited by NAT, simply by the fact that it works in a very socialistic manner. Each client in the DHT network has to store a chunk of data for it to work, but since most of them now sit behind NAT-enabled routers, they can't be reached by the rest of the network, thereby making the data they contain unreachable by others. This fundamentally destroys the whole concept of a Distributed Hash Table!

All BitTorrent applications have implemented one of the many protocols to circumvent NAT, but none of them are perfect. If we switched to IPv6-only networks, DHTs could finally work again, and we would see a massive gain in performance as clients could find each other more efficiently.

Other P2P protocols like Skype would also work better and would not have to route through a third party. Back in 2011 when Microsoft bought Skype, they began transitioning the Skype P2P protocol from a highly distributed node network into a more centralized platform based on Microsoft Notification Protocol (MSNP). The centralized protocol works "better" because Microsoft's servers work as a broker between you and other clients, thereby circumventing NAT. If both clients had IPv6 and published the Skype service in their router firewall, they could reach each other directly, and instantly get a massive performance and stability gain.

Maybe this is just wishful thinking, but it does seem that IPv6 adoption is gaining speed, which is good news for the Internet as a whole.