Breaking and Improving Protocol Obfuscation
Traffic classification techniques are used in a variety of application fields. In this technical report, we take a closer look at how statistical analysis can be used to identify network protocols. We show that even obfuscated application-layer protocols, such as BitTorrent's Message Stream Encryption (MSE) protocol and Skype, can be identified by fingerprinting statistically measurable properties of TCP and UDP sessions. We also examine the properties that our protocol identification algorithm exploits to identify these obfuscated protocols, which are designed to be undetectable and are therefore considered very hard to classify. Many of the analyzed protocols are shown to have statistically measurable properties in payload data, flow behavior, or both. Based on these insights, we propose techniques that can improve future versions of obfuscated protocols by inhibiting identification through this type of statistical analysis. These techniques include better obfuscation of payload data and flow features, as well as hiding inside tunnels of well-known protocols. This report is intended to provide feedback and suggestions for improvement to creators of obfuscated network protocols, and should thus help sustain network neutrality on the Internet.