It’s a Big World (of Data) Out There

The Data Points of a Domain

When it comes to profiling adversaries and mapping their infrastructure, more data is definitely better. Think about it this way: if you were about to land on an enemy beach which you knew held gun emplacements, you’d want to know where all of them were. Knowing about some, or even most of them, still portends a bad day; but if you could neutralize all of them ahead of time, you’d be in much better shape at the end of the day. It’s like this with the domains and IP addresses that an adversary stands up for an online attack: what you see—the first thing that “fires” at you—is almost certainly part of a larger offensive structure. However, in almost every case, that first object you observe has data attached to it that can help uncloak the larger campaign.

The data we use to do this comes in many different forms, from DNS mappings to Whois records to SSL certificates to web tracking codes…and the list goes on. But all of these different data types can be thought of as providing two key functions for the investigator: characterizing and connecting. Most data points do both.

Characterizers are data points that tell you something about what you’re dealing with. You can learn about the actor controlling the infrastructure, their goals, their MO, and sometimes their TTPs (tactics, techniques, and procedures).

Connectors are data points that help link a given domain or IP address to other infrastructure that may be part of the same campaign. In most cases, a domain or IP will have some connectors that are very valuable and others that are not. More on this later.

We often think of Whois record data as central to our investigations in Iris Investigate, but in fact there is incredible value to be gained from data points entirely outside of Whois records. Here are some examples:

IP address: potentially a great connector—what other domains are hosted on this IP?
IP geolocation and ASN: Geolocation can be a good characterizer: does the location of the IP for a domain seem logical? The ASN (Autonomous System Number) can be a connector as well.
Screenshot: When a domain has a website up and running, the screenshot can be a great characterizer. Additionally, screenshots can reveal intent.
Response code: When a web server is present, what response code does it give?
Google Analytics and Adsense codes: these can be characterizers in a basic sense—they show that a web page exists and is doing some tracking—but they can also be great connectors: what other domains share the same code?
MX records: A domain’s mail servers often resolve to IP addresses that differ from its web hosting IP, which gives another opportunity to pivot through malicious infrastructure.
SSL Hash: These can be great connectors: what other domains are using the same SSL certificate?
SSL Organization and Country: These can be connectors and characterizers. Do the SSL organization and country make sense based on other things you know about the domain?
Redirects: If a domain has a page that is redirecting to another domain, does the redirection tell you something? Or, as a connector, are there other domains that share the same redirect target?
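
The connector idea in the list above can be sketched in a few lines of Python. This is only an illustration, not how Iris Investigate works internally: the profile records, field names (ip, ssl_hash, ga_code), and all values are invented, and the domains and IPs use reserved documentation ranges.

```python
from collections import defaultdict

# Hypothetical domain profiles; every value below is made up for illustration.
PROFILES = [
    {"domain": "alpha.example", "ip": "192.0.2.10", "ssl_hash": "aaa111", "ga_code": "UA-000001"},
    {"domain": "bravo.example", "ip": "192.0.2.10", "ssl_hash": "bbb222", "ga_code": "UA-000001"},
    {"domain": "charlie.example", "ip": "198.51.100.7", "ssl_hash": "aaa111", "ga_code": None},
]

# Assumed set of data points treated as connectors in this sketch.
CONNECTOR_FIELDS = ("ip", "ssl_hash", "ga_code")


def build_connector_index(profiles, fields=CONNECTOR_FIELDS):
    """Map (field, value) -> set of domains that share that value."""
    index = defaultdict(set)
    for profile in profiles:
        for field in fields:
            value = profile.get(field)
            if value:  # skip data points that are missing for this domain
                index[(field, value)].add(profile["domain"])
    return index


def pivot(index, profiles, domain, fields=CONNECTOR_FIELDS):
    """Return the other domains linked to `domain`, keyed by shared connector."""
    start = next(p for p in profiles if p["domain"] == domain)
    linked = {}
    for field in fields:
        others = index.get((field, start.get(field)), set()) - {domain}
        if others:
            linked[field] = sorted(others)
    return linked


index = build_connector_index(PROFILES)
related = pivot(index, PROFILES, "alpha.example")
print(related)
```

Each key in the result names the connector that produced the link, which matters because, as discussed next, some connectors carry far more weight than others.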

Obviously not all data points are created equal in their connecting or characterizing value. For example, if a domain shares its IP address with a million other domains, that isn’t of much value, because a) no human is going to review a million domains to look for patterns; and b) high numbers like that make it hard to infer any specific connection between the domains. On the other hand, if an IP has 10 domains, or even a few dozen, there is a higher chance that they are related to each other, and it’s a quick job for a human to start to evaluate those connections.
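
One rough way to express that judgment is to rate a connector by how many domains share it. The sketch below is an assumption-laden illustration: the function name and both thresholds (review_limit and noise_floor) are hypothetical, chosen only to reflect the "a few dozen" versus "a million" contrast above, and any real cutoff would depend on the data point and the investigation.

```python
def rate_connector(domain_count, review_limit=50, noise_floor=10_000):
    """Rate a connector by the number of domains that share its value.

    domain_count includes the starting domain. Both thresholds are
    hypothetical defaults for illustration, not recommended values.
    """
    if domain_count <= 1:
        return "unique"   # nothing else to pivot to
    if domain_count <= review_limit:
        return "strong"   # small enough for a human to review quickly
    if domain_count >= noise_floor:
        return "noise"    # shared hosting scale; little inferential value
    return "weak"         # possibly useful when combined with other filters


for count in (1, 10, 42, 5_000, 1_000_000):
    print(count, rate_connector(count))
```

A "weak" rating does not mean the connector is useless; it usually means a second filter (registration date, risk score, another shared data point) is needed to bring the set down to a reviewable size.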

Example: Russian malware domains

The domain microsoftfree[.]ru is not a good place to get free products from Microsoft—take our word for it! The domain has been added to industry blocklists for malware, and even if it hadn’t been, a quick look at its domain profile would confirm that it is not controlled by our friends in Redmond. But let’s use some characterizers and connectors to see whether this domain is part of a larger campaign.

Spoiler alert: the registrant, Private Person, is not a low-ranking member of the Russian military. This is the .ru registry’s method of granting Whois privacy. So the Whois data won’t be of much help for this domain. Fortunately, there are some other data points that are intriguing.

In particular, the name server host stands out. Both the name server and the IP are potentially good connectors (note the relatively small numbers of domains connected to them), but the name server gets extra “characterizer points” because its domain is a seemingly random keyboard-smash of letters. Legitimate domains don’t tend to be associated with name servers like that. Let’s pivot on the second of those name servers (since it has 42 domains) and see what we get.

This is a close-up of a few of the 42 domains sharing that name server. Notice that these are all recent registrations (as of this writing), and that they all have SSL certificates associated with them. The certificates are associated with a variety of sources; it’s likely that at least some of them—such as the one from Hewlett-Packard Enterprise—are “borrowed.” These are good characterizers; they tell us that the domains intend to use HTTPS, but that the certificates aren’t really there to convey trust or authenticity to a visitor. There are some good connectors in there too—the one from sputnik-forum[.]ru connects to 107 other domains. This suggests a relationship among those. Not a guarantee, but it ups the chances that it’s not coincidental that they have that certificate in common.

When we pivot on that SSL hash and then filter down to just the domains registered since January 1, 2018, we find 50 domains that all have quite high risk scores—some are already convicted (score of 100), while the others are all high-risk (the lowest score in the set is 73).
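
The filter-then-assess step can be approximated as follows. This is a sketch over invented records, not the actual result set: the field names (created, risk_score) and the sample domains, dates, and scores are all hypothetical, and only the January 1, 2018 cutoff and the convention that a score of 100 marks a convicted domain come from the text.

```python
from datetime import date

# Hypothetical pivot results; domains, dates, and scores are invented.
PIVOT_RESULTS = [
    {"domain": "one.example", "created": date(2018, 3, 14), "risk_score": 100},
    {"domain": "two.example", "created": date(2018, 6, 2), "risk_score": 73},
    {"domain": "three.example", "created": date(2016, 11, 30), "risk_score": 12},
    {"domain": "four.example", "created": date(2019, 1, 9), "risk_score": 88},
]


def registered_since(results, cutoff):
    """Keep only the domains registered on or after the cutoff date."""
    return [r for r in results if r["created"] >= cutoff]


def summarize_risk(results, convicted_score=100):
    """Summarize the risk scores of a result set."""
    scores = [r["risk_score"] for r in results]
    return {
        "count": len(scores),
        "convicted": sum(1 for s in scores if s >= convicted_score),
        "lowest": min(scores) if scores else None,
    }


recent = registered_since(PIVOT_RESULTS, date(2018, 1, 1))
summary = summarize_risk(recent)
print(summary)
```

Filtering on registration date first and reading the scores afterward mirrors the order used above: the date narrows the set, and the score distribution then says whether what remains looks like one risky cluster.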

With just a few pivots—none of which relies on Whois records—we’ve identified a set of domains with a convincing case to be made that they are a) related and b) risky. We could download these as a .csv file in order to build a custom blocklist, or as a STIX document to share with a trust group. While this specific set of domains may not be pertinent to your environment, the process by which we characterized and connected them is applied every day by analysts and threat hunters who seek to expand from an initial indicator to map and assess infrastructure that is potentially dangerous. Remember—if your starting point is a domain or IP that touched your network, that indicator is ipso facto relevant. Anything closely connected to it, by extension, is of interest.
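
For readers who assemble such a set outside the product, a custom blocklist file might be produced along these lines. The column names and sample rows are assumptions made for this sketch and do not reflect the layout of the product's own .csv export; STIX output is not covered here.

```python
import csv
import io

# Hypothetical rows for domains judged related and risky.
ROWS = [
    {"domain": "two.example", "risk_score": 73, "connector": "ssl_hash"},
    {"domain": "one.example", "risk_score": 100, "connector": "ssl_hash"},
]

# Assumed column layout for this sketch.
FIELDNAMES = ["domain", "risk_score", "connector"]


def to_blocklist_csv(rows):
    """Render rows as CSV text, sorted by domain for stable diffs."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in sorted(rows, key=lambda r: r["domain"]):
        writer.writerow(row)
    return buffer.getvalue()


csv_text = to_blocklist_csv(ROWS)
print(csv_text)
```

Keeping the connector column alongside each domain records why the domain is on the list, which helps when the blocklist is reviewed later.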

The characterizations and connections afforded by these data points validate and underscore our commitment to collecting and provisioning the best, most comprehensive infrastructure profile data available.
