The Power Of Data Analysis – “Spamalytics”
Some of you may have seen the article in the New York Times by John Markoff (endnote1) announcing a paper to be presented at last week’s IEEE conference. This paper is an update to research conducted by a team at the International Computer Science Institute in Berkeley, California. The institute is associated with the University of California, San Diego and the University of California, Berkeley. In 2008 the team published a paper, "Spamalytics: An Empirical Analysis of Spam Marketing Conversion," which outlines research in an area the team has dubbed “spamalytics.”
The paper describes a methodology for understanding the architecture of a spam campaign and how a spam message converts into a financial transaction. The team looks at the “conversion rate,” the probability that an unsolicited email will produce a sale. By parasitically infiltrating an existing botnet infrastructure, the team analyzed two spam campaigns: one designed to propagate a malware Trojan, the other marketing online pharmaceuticals. The team examined nearly half a billion spam emails to identify:
- the number of spam emails successfully delivered
- the number of spam emails successfully delivered through popular anti-spam filters
- the number of spam emails that elicit user visits to the advertised sites
- the number of “sales” and “infections” produced
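The four measurements above form a funnel, and the conversion rate is the ratio of the last stage to the first. A minimal sketch of that arithmetic follows. The stage names and every count in it are hypothetical placeholders chosen for illustration; they are not figures from the paper, and the `FunnelStage` and `conversion_funnel` names are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class FunnelStage:
    name: str
    count: int


def conversion_funnel(stages):
    """Return (name, count, rate_vs_previous_stage, rate_vs_first_stage) per stage."""
    if not stages:
        return []
    first = stages[0].count
    prev = first
    rows = []
    for stage in stages:
        # Guard against division by zero if an earlier stage is empty.
        step_rate = stage.count / prev if prev else 0.0
        overall_rate = stage.count / first if first else 0.0
        rows.append((stage.name, stage.count, step_rate, overall_rate))
        prev = stage.count
    return rows


# HYPOTHETICAL counts, for illustration only (not data from the study).
stages = [
    FunnelStage("sent", 1_000_000),
    FunnelStage("delivered", 250_000),
    FunnelStage("passed_filters", 50_000),
    FunnelStage("site_visits", 500),
    FunnelStage("sales", 2),
]

for name, count, step_rate, overall_rate in conversion_funnel(stages):
    print(f"{name:15s} {count:>10,d}  step={step_rate:.4%}  overall={overall_rate:.6%}")
```

Reporting both the step rate and the overall rate shows where a campaign loses most of its messages, in addition to the final conversion figure.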
In their latest paper, "Click Trajectories: End-to-End Analysis of the Spam Value Chain," the ICSI team addresses the technical and business transactions that make spam pervasive and that allow spammers to monetize a successful response to a spam email.
“Each click on a spam-advertised link is in fact just the start of a long and complex trajectory, spanning a range of both technical and business components that together provide the necessary infrastructure needed to monetize a customer’s visit.” (endnote2)
This research is important. If we can trace spammers' financial transactions, then in theory we can shut down their efforts, because a variety of means exist to compel banks not to process those transactions.
However, the real story from a broader perspective is the power of data and the traceability of spam back to the financial transaction. Admittedly, not all hacking is for financial gain, but much of it is. Using data analysis to trace transactions back to the banks that collect and transfer the funds will be key to stopping this type of cybercrime.
The information security industry has spent billions fighting cybercrime from a technical perspective. We now need to go further, and, like the team from the International Computer Science Institute, look at the financial elements of cybercrime and how the cybercriminal gets paid.
Markoff, John. (2011). Study Sees Way to Win Spam Fight, New York Times.
Levchenko, Kirill; Pitsillidis, Andreas; Chachra, Neha; Enright, Brandon; Felegyhazi, Mark; Grier, Chris; Halvorson, Tristan; Kanich, Chris; Kreibich, Christian; Liu, He; McCoy, Damon; Weaver, Nicholas; Paxson, Vern; Voelker, Geoffrey; Savage, Stefan. (2011). Click Trajectories: End-to-End Analysis of the Spam Value Chain. Monograph. Department of Computer Science and Engineering, University of California, San Diego; Computer Science Division, University of California, Berkeley; International Computer Science Institute, Berkeley, California; Laboratory of Cryptography and System Security (CrySyS), Budapest University of Technology and Economics. San Diego, California, USA.