
Over the last two articles, you’ve learned what a reputation system is and the types of data with which to seed it. In this last article, I will discuss how to tune your reputation system: how to optimize the data you have in order to accomplish specific security goals.
One of the best reasons to create your own reputation system is that it gives you greater freedom to tune it to your organization’s specific needs. Commercially available IP and domain reputation systems are often one-size-fits-all, and you may find that those systems’ goals are not congruent with yours, or you may wish to fill a need that the other products available to you do not meet. I will discuss some common goals and how you might use available data to fit your use case.
The most common use cases for IP and domain reputation systems are:
For email, your inputs are relatively self-explanatory. Check DNSBLs and “record” listings for a certain amount of time. I prefer to make that time relatively short for most inputs. For example, if an IP is listed in CBL or SpamCop BL, that indicates a transient threat like a malware infection; once it is resolved, there is really no security justification to downgrade the reputation of that IP for a long period of time. If the threat persists or recurs, the IP will likely be re-listed quickly. Repeated listings indicate an ongoing problem and justify longer degradation. Longer downgrades are also appropriate for IPs that directly transmitted spam to you in the past, such as ESP outbound servers.
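One way to express "short memory for a single listing, longer memory for repeat offenders" is an exponentially decaying penalty whose half-life grows with each repeat listing. The sketch below is only an illustration: the function name `listing_penalty`, the 24-hour base half-life, the doubling per repeat, and the 30-day cap are all assumptions of mine, not values taken from any particular DNSBL or product.

```python
from datetime import datetime, timedelta, timezone

# All three constants are illustrative assumptions; tune them to your own data.
BASE_HALF_LIFE_HOURS = 24.0        # short memory for a single, transient listing
REPEAT_MULTIPLIER = 2.0            # each repeat listing doubles the half-life
MAX_HALF_LIFE_HOURS = 24.0 * 30    # never remember a listing longer than this


def listing_penalty(initial_penalty, listed_at_times, now):
    """Return the penalty still owed by an IP, given when it was seen listed.

    listed_at_times holds one timestamp per distinct listing event.
    The penalty halves every half-life, measured from the latest listing.
    """
    if not listed_at_times:
        return 0.0
    repeats = len(listed_at_times) - 1
    half_life = min(
        BASE_HALF_LIFE_HOURS * REPEAT_MULTIPLIER ** repeats,
        MAX_HALF_LIFE_HOURS,
    )
    age_hours = (now - max(listed_at_times)).total_seconds() / 3600.0
    age_hours = max(age_hours, 0.0)  # guard against clock skew
    return initial_penalty * 0.5 ** (age_hours / half_life)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    once = [now - timedelta(hours=24)]
    twice = [now - timedelta(days=5), now - timedelta(hours=24)]
    print(listing_penalty(10.0, once, now))
    print(listing_penalty(10.0, twice, now))
```

With these assumed values, an IP listed once and last seen 24 hours ago retains half of its initial penalty, while an IP with two listings and the same last-seen time retains more, because its half-life has doubled.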
Consider downgrading IPs from TLDs and geolocations that have sent a significant amount of spam to your systems in the past. (Note that I am not saying that it is a good idea to drop all mail from certain regions or TLDs. I know of administrators who categorically do not accept mail from IPs allocated to AFRINIC, for example, because they do not want 419 spam. I don’t want 419 spam either, but I believe rejecting all mail from a continent is too heavy-handed, and an unwise policy that may result in an unacceptable rate of false positives.) Also remember that a “hit” impacts reputation, but it doesn’t define it: reputation is made of many positive and negative factors considered together, not just one.
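To make the "a hit impacts reputation, but doesn’t define it" point concrete, here is a minimal sketch in which every signal nudges a baseline score up or down and the result is clamped to a fixed range. The function name `combined_score`, the 0 to 100 scale, the baseline of 50, and the signal names and values in the example are all hypothetical.

```python
def combined_score(signals, baseline=50.0, floor=0.0, ceiling=100.0):
    """Combine positive and negative adjustments into one bounded score.

    signals maps a signal name to its adjustment: negative values lower
    the reputation, positive values raise it. No single signal decides
    the outcome on its own; the sum does.
    """
    total = baseline + sum(signals.values())
    return max(floor, min(ceiling, total))


if __name__ == "__main__":
    # Hypothetical signals: a spam-heavy region counts against the sender,
    # but a long clean history with your own systems outweighs it.
    signals = {
        "region_spam_history": -10.0,
        "long_clean_history": 15.0,
    }
    print(combined_score(signals))
```

Under this scheme a regional penalty lowers a sender's standing without rejecting it outright, which is the behavior argued for above.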
Many sites parse the bodies of email messages in order to look for domains that may be associated with malware or botnets. I would check SURBL and snowshoe DNSBLs first, since that is low-hanging fruit, and I would decay any listings I found very slowly. I would improve the score of any site in the Alexa top 500. (If you want to be more conservative and are inclined to expend the effort, you could check against a list of all URLs your users have ever visited previously. You will get false positives, which you can then add to your database of visited sites manually, but it’s more effective than you might think.) I would also quarantine any mail containing links to file sharing sites when the mail does not come from the site itself.
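As a rough sketch of the body-parsing step, the code below pulls hostnames out of the URLs in a message body and adjusts each one's score depending on whether it appears in a set of listed hosts or a set of popular sites. It assumes you have already loaded both sets from your own sources; the function names, the regular expression, and the penalty and bonus values are illustrative choices of mine. It also matches full hostnames only, whereas a production system would probably reduce each hostname to its registered domain first.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple pattern: anything that starts with http:// or https://
# and runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hostnames(body):
    """Return the set of lowercase hostnames found in URLs in the body."""
    hosts = set()
    for url in URL_PATTERN.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 literal
        if host:
            hosts.add(host.rstrip(".").lower())
    return hosts


def score_hostnames(hosts, listed, popular, listed_penalty=-40, popular_bonus=20):
    """Map each hostname to a score adjustment (assumed values)."""
    scores = {}
    for host in hosts:
        score = 0
        if host in listed:
            score += listed_penalty
        if host in popular:
            score += popular_bonus
        scores[host] = score
    return scores


if __name__ == "__main__":
    body = "Log in at http://Bad.Example/login or read https://www.example.org/news."
    hosts = extract_hostnames(body)
    print(score_hostnames(hosts, listed={"bad.example"}, popular={"www.example.org"}))
```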
Other things to be suspicious of include URLs on sites with dynamic DNS and any URL that resolves to a dynamic IP. I do not find URLs belonging to newly registered domains to have value, so consider checking domains against Farsight Security’s Newly Observed Domains (NOD) list and letting those listings decay slowly; after a week or so, we can talk, maybe, depending on other criteria. There are lists of domains used by botnet command and control servers; weight those very heavily and let them decay slowly. Some domain reputation services temporarily degrade the reputation of any URL currently seen in Pastebin and the like, which you may find useful as well.
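The different weights and decay rates suggested above could be kept in a small per-source policy table, as in the sketch below. Every number here is an assumption chosen only to reflect the relative ordering in the text (command and control listings heaviest and slowest to decay, paste-site sightings light and short-lived); the source names and the `domain_penalty` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePolicy:
    weight: float          # penalty applied at the moment of the sighting
    half_life_days: float  # time for that penalty to fall by half


# Illustrative values only; none of these come from a real feed or product.
POLICIES = {
    "newly_observed": SourcePolicy(weight=30.0, half_life_days=7.0),
    "botnet_c2": SourcePolicy(weight=90.0, half_life_days=90.0),
    "dynamic_dns": SourcePolicy(weight=20.0, half_life_days=14.0),
    "paste_site": SourcePolicy(weight=15.0, half_life_days=1.0),
}


def domain_penalty(sighting_age_days):
    """Sum the decayed penalties for one domain.

    sighting_age_days maps a source name to the age, in days, of the most
    recent sighting of the domain in that source.
    """
    total = 0.0
    for source, age_days in sighting_age_days.items():
        policy = POLICIES[source]
        age = max(age_days, 0.0)
        total += policy.weight * 0.5 ** (age / policy.half_life_days)
    return total


if __name__ == "__main__":
    print(domain_penalty({"newly_observed": 7.0}))
    print(domain_penalty({"botnet_c2": 7.0, "paste_site": 1.0}))
```

Keeping the policy in data rather than in code makes it easier to adjust a single source's weight or half-life as you learn how well it predicts abuse on your own network.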
With this final article, you should have the information you need to create a basic reputation system, identify data that is useful, feed your system with data from your own and other publicly available sources, and weight and degrade that data appropriately.
Kelly Molloy is a Senior Program Manager for Farsight Security, Inc.