Clicking a link in a spam email often leads the user to a black-market storefront Web site selling knock-off merchandise, such as fake pharmaceuticals, replica luxury goods, or counterfeit software. These online storefronts are run by illegitimate businesses called affiliate programs, which engage spammers as independent contractors. The affiliate program is responsible for managing the online storefronts, contracting for payment services (e.g., to accept credit cards), and providing customer support and product fulfillment. This illegal activity is big business, with prominent affiliate programs generating millions of dollars in revenue every month.
However, the affiliate program is a bottleneck in the spam ecosystem: thousands of individual spammers advertise hundreds of thousands of online storefronts, but only dozens of affiliate programs administer the stores. Thus, disrupting the operation of affiliate programs (e.g., by disabling payment processing) can cripple the entire spam business model. At the heart of this intervention, then, is a classification problem: how to identify affiliate programs from the Web pages of their online storefronts?
Our work addresses this large-scale classification problem. We develop an automated system that classifies spam-advertised storefronts according to the affiliate programs that run them and, in turn, enables security practitioners to track and target these affiliate programs. Because a program's storefront Web pages share a distinctive underlying structure in their HTML, classification is highly accurate, even when limited to a small initial seed of labeled data.
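To make the idea of classifying storefronts by HTML structure concrete, here is a minimal Python sketch. It is not the system described above: the feature set (counts of tag names and tag/attribute/value tokens), the nearest-neighbor rule with cosine similarity, and the names `html_features`, `classify`, and `SEEDS` are all assumptions chosen for illustration, and the seed pages are invented toy examples.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class TagFeatureParser(HTMLParser):
    """Collects a bag of structural tokens, ignoring visible text."""

    def __init__(self):
        super().__init__()
        self.features = Counter()

    def handle_starttag(self, tag, attrs):
        # One token per tag name, plus one per tag/attribute/value triple.
        self.features[tag] += 1
        for name, value in attrs:
            self.features[f"{tag}:{name}={value}"] += 1


def html_features(html):
    """Return a Counter of structural tokens for one page."""
    parser = TagFeatureParser()
    parser.feed(html)
    parser.close()
    return parser.features


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(html, seeds):
    """Return the label of the labeled seed page most similar to `html`."""
    query = html_features(html)
    scored = ((label, cosine(query, html_features(page))) for label, page in seeds)
    best_label, _ = max(scored, key=lambda pair: pair[1])
    return best_label


# Hypothetical labeled seed pages, one per (made-up) affiliate program.
SEEDS = [
    ("program_a", '<html><body><div class="cart-box"><table id="meds">'
                  '<tr><td>x</td></tr></table></div></body></html>'),
    ("program_b", '<html><body><ul class="watch-list">'
                  '<li><img src="a.jpg"></li></ul></body></html>'),
]

if __name__ == "__main__":
    page = ('<html><body><div class="cart-box"><table id="meds">'
            '<tr><td>Other pills</td></tr></table></div></body></html>')
    print(classify(page, SEEDS))
```

Because the features ignore the visible text, two storefronts built from the same template score as similar even when they list different products, which is the intuition behind the claim that a small labeled seed can suffice.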
Last modified: 03.03.2015