
Depending on your interest in certain subjects, feel free to jump ahead:
What is the Domain Name System (DNS)?
Leveraging DNS For Investigations in Your Environment
On a typical morning you might be sipping on tea or coffee when you receive an urgent email from your finance team. They received a strange request from the CFO for a prompt payment of funds, and you suspect a business email compromise. Right after your post-lunch food coma kicks in, you are tasked with incident response on something your endpoint detection picked up, so you quickly pull the endpoint in to do some analysis. And finally, as you are about to shut your laptop for the day, you hear from a concerned manager regarding your organization’s owned infrastructure. Is there an exhaustive list of domains owned by your organization? Is this obscure domain just something marketing forgot to share with your team? Or is an attacker looking to target your organization? To take appropriate action on these scenarios and finish out your day at a reasonable hour, one question you’ll likely ask yourself is “what datasets do I have at my disposal to gain additional insight and context to resolve these scenarios?”
The purpose of this blog series is to highlight a number of datasets proven by experienced IR teams to be valuable when analyzing network infrastructure. Throughout this series, I’ll provide some context as to why the datasets exist, how they interact with your own internal threat intelligence, and their key strengths and limitations. I feel a bit like a Southwest Airlines flight attendant when I say “we know you have many choices when it comes to selecting your threat intelligence, and we thank you for choosing [enter dataset here] for your investigations!”, but in reality, many folks like yourself are juggling a multitude of internal and external intel, so I hope this blog highlights some tools that exist in your proverbial toolbox, and can help identify when they are valuable to pull off your “threat intelligence pegboard.” This way when you look at the aforementioned scenarios, you have confidence that you didn’t miss the signal.
In this blog, I’ll be focusing on the Domain Name System (DNS). To name a few of its components, you have IP addresses, nameserver hostnames and IP information, Start of Authority (SOA) records, and Top Level Domains (TLDs). Looking at these types of data in a vacuum is nowhere near as valuable as understanding their relationships to one another (and other datasets), so I would be remiss if I didn’t remind you to read Joe Slowik’s masterpiece “Analyzing Network Infrastructure as Composite Objects,” where he illustrates how to analyze network observables according to their relationships and patterns of composition, which in turn yields insights into adversary behaviors, enriching the value of the network indicator.

Credit: Joe Slowik
As one can imagine, DNS is our cup of tea here at DomainTools. For a short period of time, we joked internally about starting a series called “Drunk DNS” (a clever parody of Comedy Central’s Drunk History). Of course, DNS is both celebrated and cursed throughout the security industry (depending on the day). The late Dan Kaminsky, who discovered a fundamental flaw in DNS back in 2008, declared “it’s always DNS.” A similar sentiment is shared in a popular haiku.

DNS is quite robust and vast and as a result many folks portray its complexities as security shortcomings. While it’s true that it can be abused and create security risks, the richness of DNS also makes it a treasure trove of information, and as an industry we are only scratching the surface by taking advantage of a small number of DNS records. But to set the stage, let’s begin with why DNS exists and how it operates.
The process of DNS, which I’ll explain here in a moment, dates back to the days of ARPANET. SRI International (known previously as The Stanford Research Institute) maintained a text file referred to as HOSTS.TXT that mapped host names to the numerical addresses of computers on the ARPANET. This process was developed by American information scientist, Elizabeth Feinler. These addresses were assigned manually. You could call into the SRI’s Network Information Center (NIC) and they would grab a computer’s hostname and address and add them to the primary file.

Elizabeth Feinler, Credit: The New-York Historical Society Museum & Library
As one can imagine, this manual process to maintain a centralized host table quickly became too cumbersome. By the early ’80s, an automated approach to the naming system was needed. As a result, the Domain Name System (DNS) was created in 1983, when Paul Mockapetris published the original DNS specifications in RFC 882 and RFC 883. Shortly afterward, UC Berkeley students wrote the first UNIX name server implementation, the Berkeley Internet Name Domain (known to many as BIND). (If this topic interests you, I’d highly recommend tuning into our podcast with Paul Vixie, who maintained BIND starting in 1985 before it was ported to the Windows NT platform.)
All devices on the Information Superhighway (whether it be your smartphone, laptop, etc.) communicate amongst themselves using numbers known to us humanoids as “IP addresses”. This description always reminds me of the episode of The Office where Dwight finds himself in an epic sales duel with the online Dunder Mifflin store and yells “I assume you read binary, so why don’t you 011 1111 011 011!”. There is a delightful Reddit thread on this altercation if you’re interested.

Credit: The Office, NBC Universal
My sincere apologies for the tangent, now where were we? Ah yes, IP addresses. One of the reasons DNS exists is because it’s awfully difficult to remember 174.35.6.21 in order to enjoy The Onion’s brilliant use of satire (if only they had a special router for that?). Instead, the following process happens behind the scenes, which is known as a recursive lookup. Below is a list of steps in this process:

Credit: Quest10
This is all happening in the blink of an eye, and through this process, you can pull apart many interesting artifacts to inform your investigation. We’ll talk through these individual aspects in this next section.
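To make the chain of referrals concrete, here is a toy sketch of a recursive lookup in Python. It follows the commonly described sequence (root, then TLD, then authoritative nameserver), which may not match the steps in the original diagram one for one. Nothing here touches the network: the three dictionaries, the server names, and the 192.0.2.10 address are all made up for illustration.

```python
# Toy model of a recursive lookup. Each "server" is a dictionary, so the
# example runs offline. All names and addresses are illustrative only.
ROOT_SERVER = {"com.": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"example.com.": "ns1.example.com."}}
AUTHORITATIVE_SERVERS = {"ns1.example.com.": {"www.example.com.": "192.0.2.10"}}


def resolve(hostname):
    """Walk root -> TLD -> authoritative and return (address, trail of steps)."""
    fqdn = hostname.lower().rstrip(".") + "."
    labels = fqdn.rstrip(".").split(".")
    tld = labels[-1] + "."
    zone = ".".join(labels[-2:]) + "."
    trail = []

    # Step 1: the root only knows which servers handle the TLD.
    tld_server = ROOT_SERVER[tld]
    trail.append(f"root refers {tld} queries to {tld_server}")

    # Step 2: the TLD server only knows the zone's authoritative nameserver.
    nameserver = TLD_SERVERS[tld_server][zone]
    trail.append(f"{tld_server} refers {zone} queries to {nameserver}")

    # Step 3: the authoritative nameserver holds the actual answer.
    address = AUTHORITATIVE_SERVERS[nameserver][fqdn]
    trail.append(f"{nameserver} answers {fqdn} -> {address}")
    return address, trail


if __name__ == "__main__":
    address, trail = resolve("www.example.com")
    for step in trail:
        print(step)
```

Each entry in the trail corresponds to an artifact you can pull apart later: the TLD, the authoritative nameserver, and the final address. A real resolver also caches answers and handles failures, which this sketch leaves out.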
For one-off DNS lookups, your command line/terminal can be a useful tool. Here is a quick guide on terminal lookups (note that this article incorrectly implies that reverse lookups can show all domains on an IP, but otherwise appears accurate) and command line prompts.
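If you would rather script a one-off lookup than type it into a terminal, a minimal Python sketch using only the standard library might look like the following. The function names are my own. Note that `forward_lookup` needs network access, while `ptr_name` only builds the special name a reverse lookup queries, which helps show why a reverse lookup returns whatever the IP's owner published at that one name rather than every domain hosted on the address.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the name a reverse (PTR) lookup actually queries for this IP."""
    return ipaddress.ip_address(ip).reverse_pointer


def forward_lookup(hostname):
    """Resolve a hostname with the operating system's resolver (needs network)."""
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({result[4][0] for result in results})


if __name__ == "__main__":
    print(ptr_name("174.35.6.21"))
```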
There is a lot of value in DNS that is oftentimes forgotten or overlooked, so what are the indicators that should capture your attention? In this next section, I’ll walk through key elements of DNS and include a list of interesting signals when leveraging this dataset.
The tried and true analogy for DNS is the good ole yellow pages. The domain name is the name of the individual you’re looking for, and the IP is their phone number. I covered this process above.
A familiar adage in security surrounding IP addresses is “rent an IP, buy a domain,” meaning threat actors have the ability to move around their infrastructure at will, making IPs more ephemeral than their “phonebook” counterpart, the domain. Regardless, looking at IPs can provide some critical information.
Nameservers are a critical part of DNS, a pillar of a piece of infrastructure’s foundation. For the sake of clarity, I’ll be referring to the authoritative nameserver (rather than the nameserver for the TLD). Here are some things to look for.
Hosting providers and their nameserver naming conventions
Start of Authority (SOA) records define administrative information at the DNS zone level, and every zone is required to have one. Very often, however, the SOA values are just left at the defaults given by the hoster/registrar. Simply put, the analyst is looking for a non-default, unique email address in the SOA data. SOA records typically include fields called MNAME, RNAME, SERIAL, REFRESH, RETRY, EXPIRE, and MINIMUM (a time to live, or TTL, value). I won’t describe each of these elements, but if you’re interested, read more on them here.
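Since the email address is the field analysts care about most, here is a small sketch of pulling it out of an SOA record. It assumes the record data arrives in the usual text form (seven whitespace-separated fields, as a tool like dig prints them). The example record is invented, and the class and function names are my own. The last field is named MINIMUM in RFC 1035 and is commonly described as a TTL.

```python
from typing import NamedTuple


class SOA(NamedTuple):
    mname: str    # primary nameserver for the zone
    rname: str    # responsible party's mailbox, encoded as a domain name
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int  # often described as the TTL field


def parse_soa(rdata):
    """Parse SOA record data given as seven whitespace-separated fields."""
    mname, rname, *numbers = rdata.split()
    if len(numbers) != 5:
        raise ValueError("expected MNAME, RNAME and five numeric fields")
    return SOA(mname, rname, *(int(n) for n in numbers))


def rname_to_email(rname):
    """Turn an RNAME into an email address: the first unescaped dot becomes '@'."""
    name = rname.rstrip(".")
    i = 0
    while i < len(name):
        if name[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if name[i] == ".":
            local_part = name[:i].replace("\\.", ".")
            return local_part + "@" + name[i + 1:]
        i += 1
    return name  # no separator found, return unchanged


if __name__ == "__main__":
    record = parse_soa(
        "ns1.example.com. hostmaster.example.com. 2024010101 7200 3600 1209600 3600"
    )
    print(rname_to_email(record.rname))
```

Once the address is in ordinary email form, it is easier to compare against the defaults your hosting providers hand out and to pivot on the ones that stand out.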
The mail exchange (MX) record points email to a mail server. It describes how email messages should be routed in conformity with the Simple Mail Transfer Protocol (SMTP). The MX record must point to a hostname rather than directly to an IP address. MX records also include a preference value that sets the order in which mail servers are tried (the lower the number, the higher the priority).
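The ordering rule is easy to get backwards, so here is a short sketch. Per RFC 5321, a sending server tries the MX host with the lowest preference value first. The record values below are invented, and the function name is my own.

```python
def order_mx(records):
    """Sort (preference, exchange) pairs into the order a sender tries them.

    The lowest preference value is tried first.
    """
    return sorted(records, key=lambda record: record[0])


if __name__ == "__main__":
    mx_records = [(20, "backup-mail.example.net."), (10, "mail.example.net.")]
    for preference, exchange in order_mx(mx_records):
        print(preference, exchange)
```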
Per the earlier description of how DNS functions, the top level domain is at the second highest level in the hierarchy after the root. As of June 2020, there are over 1,500 TLDs. The number of TLDs has exploded in recent years, now that any established public or private organization can apply to create a generic top level domain (gTLD). This increases the surface area for attackers to take advantage of lookalike domains to target organizations or leverage trusted brands to maximize credibility.
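One simple lookalike pattern is a domain that reuses your brand's label under a different TLD. The sketch below flags that case only. It is deliberately naive: it treats the last label as the TLD and the one before it as the brand label, so it mishandles multi-label suffixes such as co.uk, and it does not catch typos or homoglyphs. The function names and sample domains are hypothetical.

```python
def split_domain(domain):
    """Naively split a domain into (label, tld) using its last two labels."""
    labels = domain.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError(f"not a fully qualified domain: {domain!r}")
    return labels[-2], labels[-1]


def tld_lookalikes(owned, observed):
    """Return observed domains that reuse the owned label under another TLD."""
    owned_label, owned_tld = split_domain(owned)
    hits = []
    for candidate in observed:
        label, tld = split_domain(candidate)
        if label == owned_label and tld != owned_tld:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    seen = ["example.xyz", "example.com", "other.com", "login.example.top"]
    print(tld_lookalikes("example.com", seen))
```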
The list above should be a great starting point for taking advantage of DNS. No useful investigation or analysis happens in a vacuum with a single dataset, so below I’ll highlight complementary data that pairs with DNS, ways to automate this approach, and finally what action to take when your investigation or analysis is complete.
In order to clear room in your day for solving more complex problems, automation is key. Whenever possible, turning a series of manual tasks into security orchestration, automation, and response (SOAR) playbooks is recommended.
Here is an example pulled from Tim Helming’s blog, Streamlining Adversary Infrastructure Hunting With SOAR:
This would give you a list of domains that aren’t among the Internet’s most common. It would by and large select-in most young domains, since young domains are less likely to make the top million than older ones. Likewise, it will tend to select-in domains of higher risk, because malicious domains tend to be flagged and placed on block lists before they reach the top million. Of course, there are exceptions to these, but it’s a first-stage filter that many SOCs like to use. To put SOAR in context, the diagram below illustrates the workflow of logs from endpoints to further action.
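The first-stage filter described above can be sketched in a few lines of Python. This is not the playbook from the referenced post, just one way to express the same idea. It assumes you already have the observed domains (for example, from your DNS logs) and a popularity list loaded as plain strings. The function name and sample values are hypothetical.

```python
def uncommon_domains(observed, top_domains):
    """Return observed domains that are absent from a popularity list.

    Names are lowercased and stripped of any trailing dot before comparison.
    Duplicates are dropped and first-seen order is preserved.
    """
    popular = {domain.lower().rstrip(".") for domain in top_domains}
    already_reported = set()
    result = []
    for domain in observed:
        key = domain.lower().rstrip(".")
        if key not in popular and key not in already_reported:
            already_reported.add(key)
            result.append(key)
    return result


if __name__ == "__main__":
    logs = ["Google.com", "rare-site.example", "rare-site.example."]
    print(uncommon_domains(logs, ["google.com"]))
```

Using a set for the popularity list keeps each membership check cheap even with a million entries. What survives the filter is a candidate list for enrichment, not a verdict.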

Identifying DNS data that requires further investigation is the beginning. Enriching indicators from your endpoint with DNS data and many other types of data (ideally through an automated process) can help answer the question “is this bad” and be used to expose larger campaigns.
The lists of interesting DNS signals above are far from exhaustive, but I hope they are useful in making quick determinations throughout your investigations. Making yourself the authority on DNS (pun intended) has its benefits. While DNS wasn’t built as a security forensics tool, it contains a lot of details, and you can use these details to your advantage to paint a fuller picture of an attacker’s intent, assess relative risk, and take action. Remember your training and refer to David Bianco’s Pyramid of Pain. It can be easy to dismiss the bottom of the pyramid, but I encourage you to get the most out of the bottom sections before rushing to higher levels. This all starts with a foundational understanding of how and why these datasets exist, and can be quickly improved with Joe Slowik’s methodology for analyzing infrastructure as composite objects. Join me for the next installment of this blog series to explore the Whois protocol.

Download the full cheat sheet (includes DNS, Whois, Passive DNS)
Analyzing Network Infrastructure as Composite Objects
Formulating a Robust Pivoting Methodology
Maximizing Your Defense with Windows DNS Logging
Microsoft Sinkhole Events Report