Using Web 2.0 to Identify Phishing


noam at SecuriTeam has started an experiment: searching for the sender of an email to help identify spam. I've had a couple of ideas for combating phishing specifically (not spam in general), but haven't spent a lot of cycles on them.

Currently, most phishing site detection depends on an RBL (real-time blackhole list) to identify a phishing site. But botnets are becoming the front door into those phishing networks, so hundreds of thousands of users could follow a phishing link before the same web server is used twice. Also, because the warning is the same famous security question, with the same responses, that users already see for expired or mismatched certificates, the user's natural response ("Click Yes if you want to do this dumb thing") is actually the wrong one.

I had come up with a couple of anti-phishing techniques, but they would require buy-in from frequently phished companies that want to cut back on their own fraud.

The first idea centers on using heuristics to determine whether a site is trying to look like the victim site. If XYZ.com is the real site that users are being lured away from, XYZ.com knows what its look and feel is - the colors it uses, the fonts, the spacing, its key phrases, logos, etc. Browsers actually have enough information when rendering a site to run a lot of these heuristics. So when a user visits a site, and the color scheme closely matches, and there's an XYZ.com logo, and some wording similar to XYZ.com's, and a login form, but the host is not in XYZ.com's address space, the user is warned. Only now, since we know that the phisher was trying to lure XYZ.com customers and not ABC.com customers, we can give the user three choices: 1) Click YES if you want to go to XYZ.com (the real correct answer), 2) Click NO to do nothing, or 3) Check the "I understand this is silly and I wish to go to ECKSYZ.com" box and click You've Been Warned to go to the phishing site. It's not perfect: attackers would change just enough to slip below the threshold. But depending on how much it wants to reduce fraud, XYZ.com can change either its heuristics or the threshold score. Or the user could change the confidence level.
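
To make the scoring idea a little more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the BrandProfile and PageFeatures structures, the three features compared (colors, key phrases, logo hashes), the weights, and the 0.6 threshold are all made up, and a real browser plugin would have to pull these features out of the rendered page somehow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandProfile:
    """Hypothetical look-and-feel profile published by the real site."""
    name: str
    hosts: frozenset        # domains the brand really owns
    colors: frozenset       # dominant CSS colors, e.g. "#003366"
    phrases: frozenset      # key phrases, lowercase
    logo_hashes: frozenset  # hashes of known logo images
    threshold: float = 0.6  # assumed default; the site or user could tune it


@dataclass(frozen=True)
class PageFeatures:
    """Hypothetical features a browser could extract while rendering."""
    host: str
    has_login_form: bool
    colors: frozenset
    text: str
    image_hashes: frozenset


def overlap(profile_items, page_items):
    """Fraction of the profile's items that also appear on the page."""
    if not profile_items:
        return 0.0
    return len(profile_items & page_items) / len(profile_items)


def similarity(profile, page):
    """Weighted look-alike score in [0, 1]; the weights are arbitrary."""
    text = page.text.lower()
    phrase_hits = frozenset(p for p in profile.phrases if p in text)
    color_score = overlap(profile.colors, page.colors)
    phrase_score = overlap(profile.phrases, phrase_hits)
    logo_score = 1.0 if profile.logo_hashes & page.image_hashes else 0.0
    return 0.3 * color_score + 0.3 * phrase_score + 0.4 * logo_score


def host_belongs_to(profile, host):
    """True if host is one of the brand's domains or a subdomain of one."""
    host = host.lower()
    return any(host == h or host.endswith("." + h) for h in profile.hosts)


def check_page(profile, page):
    """Return "warn" for a look-alike login page on a foreign host."""
    if not page.has_login_form or host_belongs_to(profile, page.host):
        return "allow"
    if similarity(profile, page) >= profile.threshold:
        return "warn"
    return "allow"
```

A "warn" result is where the browser would show the three-choice dialog, with choice 1 pointing at one of the hosts in the matched profile. Lowering the threshold catches more look-alikes at the cost of more false positives.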

Another idea I bounced around with a colleague centers on using the user community to make a determination about the legitimacy of a site. If you visit a site that has a login form, the browser checks against del.icio.us or Alexa or even Google to determine whether the target site has been bookmarked before, is highly ranked, or shows up in search results yet. The warnings on these, unfortunately, would tend to be a little less clear: "Nobody has ever bookmarked this site, are you silly enough to be the first?" or "Hmm...this site isn't very popular - sure you want to do this?" And of course, attackers can create lots of fake social bookmarking accounts and bookmark their own phishing site.
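
A sketch of how that check might hang together, again in Python and again hypothetical: I'm not assuming any particular del.icio.us, Alexa, or Google API here, so the lookup is just a function you pass in, and the Reputation fields and the cutoffs (at least one bookmark, rank within the top 100,000) are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Reputation:
    """Hypothetical summary of what the community knows about a host."""
    bookmark_count: int             # users who have bookmarked the site
    popularity_rank: Optional[int]  # 1 = most popular; None = unranked
    is_indexed: bool                # whether a search engine has seen it


def reputation_warning(rep, min_bookmarks=1, max_rank=100_000):
    """Return a warning string, or None if the site looks established.

    The default cutoffs are arbitrary placeholders.
    """
    if rep.bookmark_count < min_bookmarks:
        return "Nobody has ever bookmarked this site."
    if not rep.is_indexed:
        return "No search engine has seen this site yet."
    if rep.popularity_rank is None or rep.popularity_rank > max_rank:
        return "This site isn't very popular."
    return None


def check_login_page(host: str, has_login_form: bool,
                     lookup: Callable[[str], Reputation]):
    """Only pages that ask for a login get the reputation check."""
    if not has_login_form:
        return None
    return reputation_warning(lookup(host))
```

Keeping the lookup as a parameter means the same check could sit in front of whichever bookmarking or ranking service turns out to be hardest for attackers to game.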

So for you plugin developers with gobs of time, go write those so we can see how ineffective they really are.


  1. Interesting ideas. But as you mention, attackers might find ways around the "links to this site" idea. If it were possible to get a date from Google on when it first indexed the site, that might help a bit...

  2. Of the two ideas, I think the "links to this site" method is the more feasible. The heuristics method sounds very complex to design and also difficult to tune for false positives and negatives.

    I think phishers would have a harder time countering the "links" method than you think. They rely on stealth and speed to catch as many victims as possible before the site gets discovered and taken down and/or added to your browser's list of sites to block. So they would have to quickly build a reputable set of links and spam their victims almost simultaneously. Not impossible, but difficult.

    I would love to see this as a Firefox plugin. Maybe if it gained enough momentum, it would eventually get built into the browser.

  3. @keith - I thought the heuristics would be hard, too - or at least prone to false positives. But if you limit the checking to pages asking for a login, it becomes a little more reliable - or at least the false positives would be less annoying, because you're less likely to see a login form on a "berate Insert Financial Institution Here" site than on a well-crafted phishing attempt.

    What's most promising to me about the first method is that if you get a close-enough match, you know what the victim meant to do, so you can give them the option of doing the right thing. Right now, when you're about to do something dangerous, your browser tells you "Click yes to do it anyway".