Performance Measurement Mistakes in Academic Publications

One of the fastest ways to get me to stop reading a paper is to make incorrect assumptions in the hypothesis or rely on previous work that is not completely solid. There is no doubt that research in the academic context is hard, and writing a paper about it is even harder, but the value of all this hard work is diminished if you make a mistake.

This time, I will focus on performance measurement mistakes in computer science papers. They come in different flavours and are sometimes very subtle. Performance is paramount in algorithmics, and some researchers don't do their due diligence when trying to prove their algorithm is faster. Here are the three most common mistakes I see, in no particular order.
Performance Measurements on Different Hardware

When measuring performance, it is paramount to do it under the right circumstances. Every so often I come across a paper that states algorithm X is 4x faster than algorithm Y, but they measured it using absolute numbers between two very differe…
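To make that point concrete, here is a minimal sketch of how a fair comparison could look: both algorithms run on the same machine, under the same conditions, and only the relative difference is reported. The two workloads below are placeholders, not algorithms from any particular paper.

using System;
using System.Diagnostics;

class RelativeBenchmark
{
    // Placeholder workloads standing in for "algorithm X" and "algorithm Y".
    static double AlgorithmX()
    {
        double sum = 0;
        for (int i = 0; i < 100000; i++) sum += Math.Sqrt(i);
        return sum;
    }

    static double AlgorithmY()
    {
        double sum = 0;
        for (int i = 0; i < 100000; i++) sum += Math.Pow(i, 0.5);
        return sum;
    }

    static long Measure(Func<double> work, int iterations)
    {
        work(); // warm-up run so JIT compilation does not skew the numbers
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            work();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        long timeX = Measure(AlgorithmX, 200);
        long timeY = Measure(AlgorithmY, 200);

        // Both timings come from the same machine under the same load,
        // so the ratio is meaningful - the absolute numbers on their own are not.
        Console.WriteLine($"Algorithm X takes {(double)timeX / timeY:F2}x the time of algorithm Y");
    }
}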

Reducing the size of self-contained .NET Core applications

Just for note-keeping, I've written down some methods for reducing the size of a .NET Core application. I thought others could use them as well, so here you go.
Original Size

First off, let's see how much disk space a self-contained 'hello world' application takes up.

> dotnet new console
> dotnet publish -r win-x86 -c release

Size: 53.9 MB - yuck!
Trimming

Microsoft has built a tool that finds unused assemblies and removes them from the distribution package. This is much needed, since the 'netcoreapp2.0' profile is basically .NET Framework all over again, and it contains a truckload of assemblies our little 'hello world' application doesn't use.
> dotnet new console
> dotnet add package Microsoft.Packaging.Tools.Trimming -v 1.1.0-preview1-25818-01
> dotnet publish -r win-x86 -c release /p:TrimUnusedDependencies=true
Size: 15.8 MB - much better!
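As a side note: TrimUnusedDependencies is an ordinary MSBuild property, so instead of passing /p: on every publish it can be set once in the project file. A minimal sketch of what the .csproj could look like for this hello-world project (treat it as an illustration rather than the exact file):

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.0</TargetFramework>
    <!-- Same effect as /p:TrimUnusedDependencies=true on the command line -->
    <TrimUnusedDependencies>true</TrimUnusedDependencies>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Packaging.Tools.Trimming" Version="1.1.0-preview1-25818-01" />
  </ItemGroup>
</Project>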
That said, we have only removed unused assemblies - what about unused code?

Linking

The Mono team o…

Testing 36200 DNS servers

Introduction

DNS is an important part of the Internet, and its speed and security are paramount for a good browsing experience. I thought it would be a good idea to scan the Internet for DNS servers and test every single one of them. However, the latency of a particular DNS server depends highly on the distance and connection technology between you and the server, and since I'm geographically located in Denmark, the speed results only pertain to people located in or around Denmark.
Testing methodology

I scanned the IPv4 Internet using Nmap on port 53/UDP and stopped the scan after a few hours. The result was 36200 DNS servers, some of which are owned by ISPs, some by companies, and a whole lot by private people. Since the IP scan was randomized, it should represent a good sample.
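For reference, a scan along these lines does the trick - a sketch rather than the exact command used, and the random target count and output file name are placeholders:

> nmap -sU -p 53 --open -iR 1000000 -oG dns-servers.txt

Here -sU performs a UDP scan, -p 53 limits it to the DNS port, --open only reports hosts with the port open, and -iR picks random IPv4 targets, which is what makes the sample representative.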

Almost all DNS servers have some sort of caching mechanism that makes sure requested DNS names are kept for as long as the Time-To-Live (TTL) defined by the domain owner. To ensure we don't just tes…
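For the latency part of the test, each server essentially boils down to timing a single UDP query against it. Below is a minimal sketch of such a probe with a hand-rolled DNS packet - the server address (8.8.8.8) and query name are placeholders, not the actual test setup:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;

class DnsLatencyProbe
{
    // Builds a minimal DNS query asking for the A record of the given name.
    static byte[] BuildQuery(string name)
    {
        var packet = new List<byte>
        {
            0x12, 0x34,             // transaction id
            0x01, 0x00,             // standard query, recursion desired
            0x00, 0x01,             // one question
            0x00, 0x00, 0x00, 0x00, // no answer/authority records
            0x00, 0x00              // no additional records
        };
        foreach (var label in name.Split('.'))
        {
            packet.Add((byte)label.Length);
            foreach (char c in label) packet.Add((byte)c);
        }
        packet.Add(0x00);                                        // end of name
        packet.AddRange(new byte[] { 0x00, 0x01, 0x00, 0x01 });  // type A, class IN
        return packet.ToArray();
    }

    static void Main()
    {
        var server = new IPEndPoint(IPAddress.Parse("8.8.8.8"), 53);
        using (var udp = new UdpClient())
        {
            udp.Client.ReceiveTimeout = 2000; // give up after 2 seconds
            byte[] query = BuildQuery("example.com");

            var sw = Stopwatch.StartNew();
            udp.Send(query, query.Length, server);
            udp.Receive(ref server);
            sw.Stop();

            Console.WriteLine($"Response in {sw.ElapsedMilliseconds} ms");
        }
    }
}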

Secure Hashing of Passwords

Numerous breaches happen every day due to security vulnerabilities. On this blog, I have previously analyzed some of the biggest breaches that have been made public. All of them serve as a testament to the inadequate security many companies employ.

The Time it Takes to Crack a Hash

I often get asked: "How long does it take to crack an MD5 hash?" - implying that the cryptographic hash algorithm is the most important factor, which is rarely the case. It actually depends on a couple of factors, which I've ordered in descending importance below:
The length of the password
The charsets used in the password
The amount of hardware you have
The methodology you use for cracking

Secure Passwords

Password length alone is not enough; you also have to use a good charset to control the search space hackers have to go through. Let's look at an example to show how much the search space changes:

Charset = a-z, A-Z, 0-9
Length 6: (26+26+10)^6 = 56.800.235.584
Length 7: (26+26+10)^7 = 3.521.…
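These numbers are straightforward to reproduce: the search space is simply the charset size raised to the power of the password length. A small sketch, assuming the same 62-character charset:

using System;
using System.Numerics;

class SearchSpace
{
    static void Main()
    {
        int charsetSize = 26 + 26 + 10; // a-z, A-Z, 0-9

        // The search space grows exponentially with the password length.
        for (int length = 6; length <= 10; length++)
        {
            BigInteger combinations = BigInteger.Pow(charsetSize, length);
            Console.WriteLine($"Length {length}: {combinations:N0} combinations");
        }
    }
}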

The problem with Network Address Translation

As a technology ideologist, I consider Network Address Translation (NAT) one of my biggest concerns when it comes to the future of the Internet. The Internet was built as a communications tool to facilitate the sharing of information in digital form. It has vastly improved communication between humans around the planet, and it has been one of the most important inventions we have ever made.

My concern is that NAT is a direct inhibitor of the nature of the Internet, which goes against everything we want the Internet to be.

Network Address Translation

The Internet uses routers to route data from network to network, but to do so, we need an address for each network. Today we use Internet Protocol version 4 (IPv4), which you usually see written in dotted-decimal format. In its raw form, IPv4 is a 32-bit (4-byte) addressing scheme able to provide 2^32 (4.294.967.296) unique addresses, which was enough back when the protocol was designed in the early 1980s, but it is nowhere near…
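To illustrate the 'raw form' part: an IPv4 address is nothing more than four bytes packed into a single 32-bit value. A small sketch, using the documentation address 203.0.113.10 purely as an example:

using System;
using System.Net;

class RawIPv4
{
    static void Main()
    {
        byte[] octets = IPAddress.Parse("203.0.113.10").GetAddressBytes();

        // Pack the four dotted-decimal octets into one 32-bit value.
        uint raw = ((uint)octets[0] << 24) | ((uint)octets[1] << 16) |
                   ((uint)octets[2] << 8)  | octets[3];

        Console.WriteLine($"203.0.113.10 as a raw 32-bit value: {raw}");
        Console.WriteLine($"Total IPv4 address space: {Math.Pow(2, 32):N0}");
    }
}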

.NET Compression Libraries Benchmark

I often find myself needing a compression library in a software development project, and after a few tests with disappointing results I usually fall back to spawning 7Zip as an external process. This time however, the application specification prohibits writing any data to disk, which 7Zip needs to do in order to work correctly. I needed to do everything in memory, and so I had to test some of the C# libraries available for compression, as well as the different compression algorithms.

Here is a full list of the tested libraries:
.NET DeflateStream
ICSharpCode SharpZip
Zlib.NET
DotNetZip
Managed LZMA
LZMA SDK
QuickLZ

Compression 101

Before I get to the results of the benchmark, I think it is important to understand the different archive formats and compression algorithms in order to compare the results with other benchmarks and/or tests. Let's get to it!

A compression algorithm can be compared to a small machine that takes in some data, does some math on it and transforms it into some new data -…
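Since the whole point of this exercise is staying in memory, here is a minimal sketch of what that looks like with the built-in .NET DeflateStream - compressing a buffer through a MemoryStream and never touching the disk (the input data is just a placeholder):

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class InMemoryCompression
{
    static void Main()
    {
        // Highly repetitive placeholder data, so it compresses well.
        byte[] input = Encoding.UTF8.GetBytes(new string('a', 10000));

        byte[] compressed;
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(input, 0, input.Length);
            }
            // The MemoryStream holds the compressed bytes - nothing was written to disk.
            compressed = output.ToArray();
        }

        Console.WriteLine($"{input.Length} bytes -> {compressed.Length} bytes");
    }
}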

Analysis of the Gamigo hashes

This leak analysis is dedicated to Steve Gibson of Gibson Research Corporation.
Thanks for a great show Steve!
Back in February, a large gaming site called Gamigo was hacked, and in July, the full list of emails and hashes found their way onto the Internet. The hashes for this analysis were kindly provided by a third party, who as an added bonus also sent me the emails. A huge thanks goes out to them for their kindness.

The leak contains 9.475.226 valid MD5 hashes and 8.261.454 emails. It has 7.028.067 unique hashes and 8.244.423 unique emails.

In the time-frame of 7 days, 19 hours and 3 seconds, I was able to crack 7.731.708 hashes (81,6%).
Cracking System

The cracking was done on an ordinary GeForce 560 Ti graphics card with 1024 MB of RAM, using Hashcat-Plus v0.081 64-bit. Hashcat's settings were set to low, which resulted in only 90% cracking efficiency. Had it been optimized, the full crack would have taken 6 days and 20 hours.
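The attacks themselves are ordinary Hashcat invocations. A plain dictionary attack against raw MD5 hashes looks roughly like this - a sketch rather than the exact command line used, with hashes.txt and wordlist.txt as placeholders:

> oclHashcat-plus64 -m 0 -a 0 -o cracked.txt hashes.txt wordlist.txt

Here -m 0 selects MD5, -a 0 is a straight dictionary attack, and -o writes the cracked results to a file.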

Different cracking techniq…