The DomainTools Report, Fall 2021: Concentrations of Badness with a Side of Surprise
It’s time for another DomainTools Report! As some readers may recall, since the first DomainTools Report in 2015, we have sought to explore our stores of domain registration, hosting, and content-related data to surface patterns and trends that might be of interest to security practitioners, researchers, and anyone else interested in the suspicious or malicious use of online infrastructure. Several of the reports have had specific areas of focus, ranging from TLDs and email privacy providers (2015) to affixes in domain names (2016) to domain “blooms” and “spikes” (Spring 2021).
For this edition, we chose to go “back to basics” and focus on concentrations of malicious activity across six categories of domain characteristics, several of which we also studied in earlier reports. We expect that some criteria (such as top level domain, IP autonomous system number, and IP geolocation) will remain relevant for the foreseeable future; that is, as datapoints related to domain names, these are unlikely to become less forensically valuable unless the Internet’s fundamental structure changes. Other datapoints may wax and wane in relevance. For example, email privacy providers as a domain characteristic are dramatically less relevant in the post-GDPR world of default privacy for most registrations.
But the constant across all of these reports is our interest in providing insights into where malicious activity lurks on the Internet, with the aim of ultimately helping the community get better at staying ahead of those entities wishing to do harm online.
Domain Characteristics Evaluated
For this edition of the report, we examined the following characteristics of a domain:
- Top Level Domain (TLD): for example, .com or .net
- IP Autonomous System Number (ASN): an aspect of the domain’s hosting
- Nameserver ASN: the hosting of the nameserver associated with a domain
- IP Geolocation: the country code associated with the location of the domain’s IP address
- Registrar: the entity through which the domain was registered
- SSL Certificate Authority (CA): the CA for certificate(s) associated with domains
We chose these characteristics because they are often used by defenders and security researchers as part of a process of building out a better understanding of a domain. Seasoned practitioners often develop intuitions about the implications of a given characteristic, based on their experience, expertise, and judgment in the analysis of adversary assets. In many cases, the data seen at scale tend to support those intuitions. Certain TLDs, for example, have reputations among security analysts as being dangerous “neighborhoods” of the Internet, and as this and previous DomainTools Reports show, there are indeed some TLDs that have high concentrations of malicious domains. Other criteria are more ambiguous; for example, we will see that when it comes to SSL certificate issuers, some readers may be surprised by what this large-scale analysis shows—and does not show—about where the danger lies.
Methodology of the DomainTools Report
To pinpoint the hotspots of malicious activity, we calculated something that we call “signal strength.” A high signal strength value means that the characteristic in question is over-represented in the population of known bad domains, as compared with neutral ones. The larger the proportion of malicious domains in a given population (an IP address, a nameserver, a registrar, etc.), the higher our confidence that any unknown domain from that population may be involved in the threat in question.
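To make the idea of over-representation concrete, here is a minimal Python sketch. This post does not spell out the exact formula used in the report, so the definition below is an assumption for illustration: the share of all malicious domains that carry a characteristic, divided by the share of all neutral domains that carry it. The names `Counts`, `signal_strength`, and `top_n`, and all of the counts, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Counts:
    """Domain counts for one value of a characteristic (e.g. one TLD)."""
    malicious: int  # domains blocklisted for the threat type in question
    neutral: int    # domains not found on the blocklists


def signal_strength(item: Counts, total_malicious: int, total_neutral: int) -> float:
    """ASSUMED definition, not necessarily the report's exact formula:
    (item's share of all malicious domains) / (item's share of all neutral domains).
    A value above 1 means the item is over-represented among malicious domains."""
    if total_malicious <= 0 or total_neutral <= 0:
        raise ValueError("totals must be positive")
    if item.neutral == 0:
        # No neutral baseline: treat as unbounded if any malicious domains exist.
        return float("inf") if item.malicious else 0.0
    malicious_share = item.malicious / total_malicious
    neutral_share = item.neutral / total_neutral
    return malicious_share / neutral_share


def top_n(table: Dict[str, Counts], n: int = 10) -> List[Tuple[str, float]]:
    """Rank items by signal strength, highest first, and keep the top n."""
    total_m = sum(c.malicious for c in table.values())
    total_n = sum(c.neutral for c in table.values())
    scored = [(name, signal_strength(c, total_m, total_n)) for name, c in table.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


# Made-up counts for three placeholder TLDs; these are NOT figures from the report.
example = {
    ".aaa": Counts(malicious=500, neutral=1_000),
    ".bbb": Counts(malicious=400, neutral=90_000),
    ".ccc": Counts(malicious=100, neutral=9_000),
}
ranking = top_n(example)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this made-up data, the small placeholder TLD `.aaa` ranks first even though `.bbb` has nearly as many malicious domains, because `.bbb` also accounts for most of the neutral population. That is the sense in which a numerically small population can still carry a strong signal.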
For each of the six domain characteristics, the report gives “top-ten” tables, sorted by signal strength, for each of the three threat types (phishing, malware, spam). Each table also includes the actual counts of domains associated with the item. As an example, consider the row for .bar in the TLD section of the report.
The TLD .bar has a malware signal strength of 108.93 (the highest malware signal of any TLD on the Internet, by our methodology). There are 6,321 domains in that TLD whose chief threat type is malware, according to the blocklists we used. For comparison, we also give the numbers of phishing, spam, and neutral domains associated with the TLD. Note that all domains under consideration had shown recent activity in passive DNS as of the time the snapshot was taken, so the numbers do not include the inactive domains associated with that TLD.
Report Findings: Some Confirmations, Some Surprises
We’re not going to give out much in the way of spoilers here, but we think readers will find some of the conclusions consistent with their expectations, and others more surprising. In general, defenders and researchers have come to expect certain patterns when they look into suspicious or known-bad infrastructure. The more surprising results of this year’s report reminded us of a finding in the very first DomainTools Report that many of us considered surprising at the time: the majority of newly created domains each day do not show strong signals of maliciousness. It is almost an article of faith among defenders that new domains are dangerous, but the data say otherwise. However, we hasten to add that the converse does comport with expectations: the majority of malicious domains are, indeed, young. We think there are a few takeaways worth considering:
- Your instincts as a defender are probably good, but watch for confirmation bias when assuming that a particular data point reflects risk in a domain.
- Context is everything! There are cases where a given data point might not statistically mean much in terms of domain risk, but when that data point is combined with others, the picture could be very different (and might help reinforce those instincts of yours).
- There are, without question, some real hotspots of malicious activity on the Internet, and even though some contain relatively few domains, they can still pack a punch, and are forensically valuable.
We invite you to download the report and check out the findings!