USENIX

PhishTime: Continuous Longitudinal Measurement of the Effectiveness of Anti-phishing Blacklists

Abstract: Due to their ubiquity in modern web browsers, anti-phishing blacklists are a key defense against large-scale phishing attacks. However, sophistication in phishing websites, such as evasion techniques that seek to defeat these blacklists, continues to grow. Yet, the effectiveness of blacklists against evasive websites is difficult to measure, and there have been no methodical efforts to make and track such measurements, at the ecosystem level, over time.

We propose a framework for continuously identifying unmitigated phishing websites in the wild, replicating key aspects of their configuration in a controlled setting, and generating longitudinal experiments to measure the ecosystem’s protection. In six experiment deployments over nine months, we systematically launch and report 2,862 new (innocuous) phishing websites to evaluate the performance (speed and coverage) and consistency of blacklists, with the goal of improving them.

We show that methodical long-term empirical measurements are an effective strategy for proactively detecting weaknesses in the anti-phishing ecosystem. Through our experiments, we identify and disclose several such weaknesses, including a class of behavior-based JavaScript evasion that blacklists were unable to detect. We find that enhanced protections on mobile devices and the expansion of evidence-based reporting protocols are critical ecosystem improvements that could better protect users against modern phishing attacks, which routinely seek to evade detection infrastructure.
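
As an illustration of the two performance metrics named above (speed and coverage), the following minimal sketch shows one way they could be computed from per-URL monitoring records. It is not the paper's implementation: the `Observation` record, its field names, the use of minutes as the time unit, and the choice of the median as the speed summary are all assumptions made for this example. Coverage is taken here to mean the fraction of reported URLs that were ever blacklisted, and speed the delay between reporting and blacklisting.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Observation:
    """Hypothetical monitoring record for one reported test URL."""
    url: str
    reported_at: float                # minutes since experiment start (assumed unit)
    blacklisted_at: Optional[float]   # None if never blacklisted while monitored


def coverage(observations: List[Observation]) -> float:
    """Fraction of reported URLs that were eventually blacklisted."""
    if not observations:
        return 0.0
    detected = sum(1 for o in observations if o.blacklisted_at is not None)
    return detected / len(observations)


def median_speed(observations: List[Observation]) -> Optional[float]:
    """Median delay (minutes) from report to blacklisting, over detected URLs only."""
    delays = [
        o.blacklisted_at - o.reported_at
        for o in observations
        if o.blacklisted_at is not None
    ]
    return median(delays) if delays else None


# Illustrative data only; these values are made up, not measurements.
sample = [
    Observation("https://example.test/a", 0.0, 30.0),
    Observation("https://example.test/b", 0.0, 60.0),
    Observation("https://example.test/c", 10.0, 130.0),
    Observation("https://example.test/d", 0.0, None),
]
sample_coverage = coverage(sample)
sample_speed = median_speed(sample)
```

Computing speed only over detected URLs keeps the two metrics separate: a blacklist that is fast but misses evasive websites shows a low delay alongside low coverage, rather than one blended number.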