The history presented in this post was very interesting, but the analysis ended up being disappointing. The article ends just after the authors have managed to narrow their sample of robots.txt files to exclude duplicate and derivative files. They don't even present any summary statistics for this filtered sample.