Valuable Datasets to Analyze Network Infrastructure | Part 3

Depending on your interest in certain subjects, feel free to jump ahead:

What is Passive DNS (pDNS) and How Does it Work?

Valuable Artifacts From Passive DNS (pDNS)

Leveraging PDNS Data in Your Environment

Conclusion

PDNS Cheat Sheet


Introduction

This series, thus far, has covered DNS and Whois datasets. In the final installment of this series, I’ll tackle passive DNS (pDNS) and describe what it is, how it works, valuable artifacts, how to leverage it in your environment, and other helpful resources.

What is Passive DNS (pDNS) and How Does it Work?

The first blog post in this series went into depth on DNS and how it works: a real-time system of queries that resolve to IP addresses, sometimes referred to as active DNS. PDNS, however, takes advantage of a global sensor network that logs anonymous authoritative query and response pairs. These sensors also capture the first and last time a record set was seen, along with a count of how many times that record set was observed. As a result, the pDNS dataset includes a hugely useful asset: subdomains (for example, blog.domaintools.com). Something to note: if a domain exists but hasn’t been operationalized, pDNS isn’t going to deliver much useful intel, and active DNS will give you the upper hand. On the other hand, with active DNS, you can only discover subdomains that you already know or suspect exist. There are therefore trade-offs between active and passive DNS. It is also worth noting that pDNS isn’t available via a command line or terminal lookup, as it is only captured by vendors.

Passive DNS Collection

Credit: CTOVision.com

 

Below is a List of Questions PDNS Can Help You Answer:

  • What are all of the domains observed on a given IP address, and when were they hosted there?
  • What are the IP addresses that a given domain uses, or has used?
  • When did DNS requests for a given domain first appear?
  • What does the domain and IP correlation look like in a timeline?
  • What are the subdomains tied to a given domain, or observed on a given IP address?
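
To make these questions concrete, here is a minimal Python sketch of what a pDNS record and the two most common lookups might look like. The field names (rrname, rrtype, rdata, and so on) and the sample data are assumptions for illustration only; each vendor exposes its own schema and API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PdnsRecord:
    """One observed query/response pair (hypothetical field names)."""
    rrname: str          # the queried name, e.g. "login.example.com"
    rrtype: str          # record type, e.g. "A", "AAAA", "CNAME"
    rdata: str           # the response value, e.g. an IP address
    first_seen: datetime
    last_seen: datetime
    count: int


def domains_on_ip(records, ip):
    """Reverse lookup: names that resolved to `ip`, oldest first."""
    hits = [r for r in records if r.rrtype in ("A", "AAAA") and r.rdata == ip]
    return sorted(hits, key=lambda r: r.first_seen)


def ips_for_domain(records, name):
    """Forward lookup: addresses `name` has resolved to, oldest first."""
    hits = [r for r in records if r.rrtype in ("A", "AAAA") and r.rrname == name]
    return sorted(hits, key=lambda r: r.first_seen)


# Made-up sample data using documentation-reserved names and addresses.
SAMPLE = [
    PdnsRecord("login.example.com", "A", "192.0.2.10",
               datetime(2020, 3, 1), datetime(2020, 3, 9), 42),
    PdnsRecord("example.net", "A", "192.0.2.10",
               datetime(2020, 1, 5), datetime(2020, 2, 1), 7),
    PdnsRecord("login.example.com", "A", "198.51.100.7",
               datetime(2020, 3, 10), datetime(2020, 4, 2), 13),
]

# Timeline of everything seen on one IP address.
for rec in domains_on_ip(SAMPLE, "192.0.2.10"):
    print(rec.rrname, rec.first_seen.date(), rec.last_seen.date())
```

Sorting by first seen is what turns a flat list of records into the domain and IP timeline described above.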

In a Threat Hunt or Incident Response Investigation, PDNS data:

  • Provides fine-grained correlation of the timing of events such as attacks or breaches with domain and hostname resolutions for malicious infrastructure.
  • Produces evidence of unusual DNS behavior such as fast-flux configurations.
  • Yields comprehensive context on IP addresses by showing what domains are currently, or were previously, hosted on them. This can help an analyst determine whether an IP is part of a given adversary’s infrastructure.
  • Can help the analyst decide whether a domain or IP warrants blocking.
  • Gives you insight into the nature of a domain by exposing subdomains. For example, domains used as part of credential-harvesting or phishing schemes often use subdomains such as “login,” “download,” or the name of a legitimate service or company.
  • Provides additional characterizing or connecting data for domains, including strings in TXT records, emails from Start of Authority (SOA) records, and Sender Policy Framework (SPF) rules.

Valuable Artifacts From Passive DNS (pDNS)

Query and Response

An “A” record, which is short for “address”, maps a domain to an IPv4 address in DNS. Similarly, AAAA records are used to resolve domain names corresponding to IPv6 addresses. For a refresher on DNS, flip back to the first blog in this series. Other record types include canonical name (CNAME), mail exchanger (MX), nameserver (NS), start of authority (SOA), and text (TXT). Note there are many more record types; you can find the full list here. In short, these record types describe a pDNS response. As you might recall from the first installment of this series, DNS is a process that takes a single query, and through the process of a recursive lookup, produces a response. It can be difficult to look at a query or response in a vacuum, so this blog will look at query response pairs in addition to single records in pDNS (a slightly different approach than the previous two blogs).

Subdomains

One reason pDNS is so powerful is that it includes fully qualified domain names (FQDNs) rather than just the apex domain. This means you get to bask in the glory of subdomains. When used for legitimate purposes, subdomains are a helpful tool for organizations to break out functions (e.g. hr.company[.]com) or introduce an action (e.g. login.company[.]com). Unfortunately, subdomains are yet another tool in a cybercriminal’s toolbox to spoof trustworthy brands.

  • Similar to how one might use affixes or suffixes (mentioned in the second blog in this series), actors might spin up second or third level domains with terms like “download” or “login” to appear legitimate. 
  • Queries containing terms like “cpanel” or “webdisk” on A/AAAA records can indicate where a threat actor is standing up infrastructure (webdisk, for instance, is associated with OVH).
  • Because a subdomain does not have to be registered—you can create any subdomain on a domain that you own or control—spoofing of legitimate domains often happens in the subdomain space, e.g. “blog.domaintools.com.some.other.domain.tld”
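
The last pattern above lends itself to a simple check. The sketch below is a hypothetical helper, not a vendor feature: it flags an FQDN that carries a protected domain as whole labels somewhere inside it without actually belonging to that domain. The function name and the label-matching approach are my own assumptions.

```python
def embeds_foreign_domain(fqdn, protected_domain):
    """True if `protected_domain` appears as whole labels inside `fqdn`
    while `fqdn` does not actually belong to `protected_domain`."""
    labels = fqdn.lower().rstrip(".").split(".")
    target = protected_domain.lower().rstrip(".").split(".")
    n = len(target)
    if labels[-n:] == target:
        return False  # the domain itself or a genuine subdomain of it
    # Slide a window across the labels looking for the protected sequence.
    return any(labels[i:i + n] == target for i in range(len(labels) - n + 1))


print(embeds_foreign_domain(
    "blog.domaintools.com.some.other.domain.tld", "domaintools.com"))
print(embeds_foreign_domain("blog.domaintools.com", "domaintools.com"))
```

Matching on whole labels rather than substrings avoids flagging unrelated names that merely contain the brand string inside a longer word.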

Response

  • A hostname with high entropy can be an indicator of DNS tunneling (e.g. 24yl1nvnvvm3.d.75e31a11a1d84bdbb80d.msoffice365update[.]com, which was a AAAA record associated with the ISMDoor malware variant that leverages DNS for signaling and data exfiltration).
  • According to a fantastic paper written by Greg Farnham of SANS, the number of unique characters in a DNS name can indicate the likelihood of DNS tunneling. The more unique characters a name contains, the more likely it is to be tunneling (his research recommends alerting on anything with over 27 unique characters).
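
Both signals above are easy to compute. This is a rough sketch, assuming that dots are ignored and case is folded before counting; the paper may count characters differently, and the function names are mine. The threshold of 27 is the one quoted above.

```python
import math
from collections import Counter

UNIQUE_CHAR_THRESHOLD = 27  # alert when a name has more unique characters


def unique_characters(hostname):
    """Count distinct characters in the name, ignoring dots and case."""
    return len(set(hostname.lower().replace(".", "")))


def shannon_entropy(hostname):
    """Shannon entropy in bits per character, ignoring dots and case."""
    text = hostname.lower().replace(".", "")
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def exceeds_unique_char_threshold(hostname):
    """Apply the unique-character alerting rule to one hostname."""
    return unique_characters(hostname) > UNIQUE_CHAR_THRESHOLD


name = "abcdefghijklmnopqrstuvwxyz0123.example.com"
print(unique_characters(name), round(shannon_entropy(name), 2),
      exceeds_unique_char_threshold(name))
```

Neither number is a verdict on its own. CDN and tracking hostnames can also score high, so treat these as triage signals to combine with the other artifacts in this post.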

Query and Response

  • Queries with odd AAAA responses can indicate a unique session ID for the C2 communication. An example of an odd response might be 2020:2020:2020:2020:2020:2020[:]2020[:]2020. You can analyze this by doing a reverse pDNS lookup to see all of the queries that resulted in a given response. Sometimes it helps to look for an IPv4 or IPv6 address that is outside the global unicast scope for valid addresses to track these down.
  • A single IP address cycling quickly through queries (or domains) is a potential signal of fast flux DNS techniques. First and last seen pDNS fields display query/response pairs that would have otherwise been missed.  
  • DNS hijacking, which is sometimes referred to as DNS redirection or DNS poisoning, is when an attacker redirects users to malicious sites by incorrectly resolving DNS queries. Oftentimes this tactic is used for phishing attacks, to display ads to generate revenue (sigh), or for censorship (e.g. the Great Firewall of China). I’ll also note there are a few types of DNS hijacking (router, man-in-the-middle, local, and rogue). When staring down a potential DNS hijacking attack, pDNS allows you to see when attackers changed DNS records (when a campaign went active/inactive) and where the hijacked DNS records pointed (which offers some intent), and it unearths infrastructure for further hunting. Here’s a great example of a deep dive on DNS hijacking attacks via Brian Krebs.
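
The global unicast check mentioned in the first bullet can be sketched with Python’s standard ipaddress module. The helper name is hypothetical, and treating 2000::/3 as the IPv6 global unicast block is the only assumption beyond what the module provides.

```python
import ipaddress

# IPv6 global unicast addresses are currently allocated from 2000::/3.
IPV6_GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")


def is_outside_global_unicast(value):
    """True if a response address is not a globally routable unicast address."""
    addr = ipaddress.ip_address(value)
    if addr.is_multicast or not addr.is_global:
        return True  # private, loopback, link-local, reserved, multicast
    if addr.version == 6 and addr not in IPV6_GLOBAL_UNICAST:
        return True
    return False


for response in ("10.1.2.3", "fe80::1", "8.8.8.8",
                 "2020:2020:2020:2020:2020:2020:2020:2020"):
    print(response, is_outside_global_unicast(response))
```

Note that the repeating 2020 address from the example above sits inside 2000::/3, so this check would not flag it. Patterned responses like that still need a separate look, for instance via the reverse lookup described in the same bullet.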

Nameservers

I detailed DNS in the first blog of this series, so my only additional notes for nameservers are as follows:

  • Responses can also help you decipher and map nefarious infrastructure. For example, if the response type is a nameserver and includes the term “traeumtgerade”, you can quickly glean that the attacker is using Dyn Dynamic DNS services to point at their nameservers, which illustrates that an attacker is running their own infrastructure.

CNAME

CNAME records are used to create an alias from one domain to another. Legitimate uses include things like registering a domain in multiple geographical regions, pointing websites under a brand’s umbrella to a single website, custom subdomains that all direct users to the same site, etc.

  • CNAMEs making use of typosquatting tactics that point to a known bad domain are likely up to no good. 
  • CNAMEs act as great connectors. Pivoting on a CNAME to see whether known-bad domains are associated with the record is a good way to surface badness.
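
As a rough sketch of that pivot, the helper below follows a chain of aliases and reports whether any hop lands on a domain you already consider bad. The dictionary of alias-to-target pairs stands in for CNAME records pulled from pDNS; the names, the hop limit, and the sample data are all hypothetical.

```python
def cname_reaches_bad(alias, cname_map, known_bad, max_hops=10):
    """Follow alias -> target links; True if any hop is in `known_bad`.

    `cname_map` keys and values are assumed to be lowercase names
    without a trailing dot.
    """
    current = alias.lower().rstrip(".")
    seen = set()
    for _ in range(max_hops):
        if current in known_bad:
            return True
        if current in seen or current not in cname_map:
            return False  # loop detected or end of the chain
        seen.add(current)
        current = cname_map[current]
    return False


# Made-up records: a lookalike name aliased through to a flagged host.
CNAMES = {
    "cdn.examp1e.com": "assets.example.net",
    "assets.example.net": "bad.example.org",
}
print(cname_reaches_bad("cdn.examp1e.com", CNAMES, {"bad.example.org"}))
```

The hop limit and the seen set guard against alias loops, which do turn up in real zone data.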

MX and SOA

I wrote about MX and SOA records at length earlier in this series, but here is a quick refresher:

  • An MX record for a mail server on the same domain, much like with nameservers, indicates an owner-managed setup. When this is coupled with mail server validation like Sender Policy Framework (SPF) records, it can be a sign that an attacker is trying to make their mail look legitimate so it passes through mail servers while phishing.
  • Non-standard MX records, meaning MX records that aren’t stood up by the hosting company of the IP/nameserver, are an interesting signal. This could mean the actors are operating their own email server locally, which makes it easier to monitor their activity and profile their behaviors.
  • Number of MX servers on a domain: the same rules apply for MX servers and nameservers. The number of MX servers on a domain can be a good indicator as to whether or not something is legitimate. If a domain has two or so MX servers, this is pretty typical for valid domains. But, if there is only one MX server, or a mismatch of where those MX servers are hosted, this is cause for concern. 
  • MX record name: an “odd” MX record, especially if the MX server name closely matches the semantic structure of the domain name itself, is a good indicator of badness. If you see a bunch of domains with a common naming pattern like “{10,11,12,13,14…}sharepoint-login[.]com” it is likely these are phishing domains. Similarly, when seeing a bunch of domains with keyboard-smash names like “k23j23jklkjlkj32l[.]com” on the same IP, one might assume they are associated with spam infrastructure.
  • RNAME is an extremely important field included in SOA records. This is the email address of the admin who is responsible for the zone. In a post-GDPR world, these email addresses, which aren’t redacted in SOA records, can be uniquely shared attributes that help correlate campaigns and activities. You might also find an administrative email for the nameserver hosting provider, which is not as valuable. Inclusion of RNAMEs in SOA records isn’t terribly common, but they are great low hanging fruit. And hey, sometimes threat actors make simple opsec mistakes!
  • Length of TTL can be a signal of techniques used by bad actors. A short TTL (a second or two) means DNS will continue doing lookups because actors are moving infrastructure around to make it more difficult for vendors and defenders to catch them. Short TTLs aren’t always an indication of badness. They are also commonly used in content delivery networks (CDNs) for less sinister purposes.
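
To pivot on RNAME you first have to turn it back into an email address, because SOA records store it in domain-name form: the first unescaped dot stands in for the “@”, and dots inside the mailbox are backslash-escaped. Here is a minimal sketch of that conversion plus a grouping step; the function names and sample zones are hypothetical.

```python
import re
from collections import defaultdict


def rname_to_email(rname):
    """Convert an SOA RNAME such as 'hostmaster.example.com.' to an email."""
    name = rname.rstrip(".")
    # Split on the first dot that is not preceded by a backslash.
    parts = re.split(r"(?<!\\)\.", name, maxsplit=1)
    if len(parts) != 2:
        return None
    local, domain = parts
    return local.replace("\\.", ".") + "@" + domain


def group_zones_by_rname(soa_records):
    """Map each admin email to the set of zones whose SOA record shares it."""
    groups = defaultdict(set)
    for zone, rname in soa_records:
        email = rname_to_email(rname)
        if email:
            groups[email].add(zone)
    return dict(groups)


SOA_SAMPLE = [
    ("a.example", "admin.mail.example."),
    ("b.example", "admin.mail.example."),
    ("c.example", "hostmaster.c.example."),
]
print(group_zones_by_rname(SOA_SAMPLE))
```

Any email that maps to more than one zone is a candidate pivot, keeping in mind the caveat above about generic hosting-provider addresses.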

Count

Due to caching and other technical limitations of collecting pDNS data (i.e. which recursive servers a pDNS provider has sensors on), the count is an imperfect representation of the number of times a query/response pair is seen. Looking at the count in a vacuum isn’t terribly valuable, but it can be used with a number of other signals to determine if an indicator is bad. It is helpful for relative comparisons, e.g. one domain is getting more hits than another. A final sticking point to keep in mind: the count doesn’t need to be high for attackers to be successful.

First and Last Seen

The first and last seen fields in pDNS are critical to understand when a campaign took place, and if it correlated with other events. It essentially provides a historical picture for you to work from. It isn’t an indicator of benign or malicious behavior, but it is worth noting that this data is critical to:

  • Correlate activity with other events
  • Paint a historical picture (similar to historical Whois data)
  • Create advanced hunting queries to find any connected infrastructure
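
Correlating with other events usually boils down to an interval overlap test. The sketch below assumes each record is a dictionary with first_seen and last_seen keys, which is an illustrative shape rather than any vendor’s format.

```python
from datetime import datetime


def active_during(first_seen, last_seen, window_start, window_end):
    """True if the observed lifetime overlaps the event window."""
    return first_seen <= window_end and last_seen >= window_start


def records_active_during(records, window_start, window_end):
    """Keep only records whose first/last seen range touches the window."""
    return [
        r for r in records
        if active_during(r["first_seen"], r["last_seen"], window_start, window_end)
    ]


# Made-up observations and a made-up incident window.
OBSERVED = [
    {"rrname": "login.example.com",
     "first_seen": datetime(2020, 3, 1), "last_seen": datetime(2020, 3, 9)},
    {"rrname": "old.example.net",
     "first_seen": datetime(2019, 1, 1), "last_seen": datetime(2019, 6, 1)},
]
incident = (datetime(2020, 3, 5), datetime(2020, 3, 6))
print([r["rrname"] for r in records_active_during(OBSERVED, *incident)])
```

Because first and last seen only bound what the sensors observed, a record that misses the window is weak evidence of absence, not proof.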

Text (TXT)

Text records provide the ability to associate arbitrary text with a host or other name and were birthed via RFC 1035. This record type was intended to be flexible, and although this field initially only allowed 512 octets for each UDP DNS packet, it is now possible that each UDP DNS packet can provide more than 4000 octets. This change allowed for SPF records and DomainKeys to be stored in TXT records to help protect against spam. Unfortunately, TXT records aren’t always used for legitimate purposes.

There is some excellent research out there on DNS tunneling. One of my favorites is A Study of Newly Observed Hostnames and DNS Tunneling in the Wild, brought to you by the fine folks at Ruhr University Bochum, Germany. I’m going to reference their definition of DNS tunneling directly below.

DNS tunneling is a covert channel technique to transfer arbitrary information over DNS via DNS queries and answers. This technique is often (ab)used by attackers to transfer data in a stealthy way, bypassing traditional network security systems.

  • A good chunk of TXT records are associated with DNS Tunneling. Their research found that nearly all resource record type NULL requests and more than a third of all TXT requests can be attributed to DNS tunnels (gasp!). This is slightly tangential to this blog, but I also found this table from their research interesting, which shows the distribution of record entries they observed in their dataset.
Distribution of record entries they observed in their dataset.

Credit: A Study of Newly Observed Hostnames and DNS Tunneling in the Wild

 

  • TXT records with more characters have also been linked to distributed denial of service (DDoS) attacks in previous years.
  • Responses with manually entered SPF and DomainKeys Identified Mail (DKIM) records can indicate the intention to use a domain to send mail, because business email compromise (BEC) actors need to have their email domains verified through Google, Zoho, or another provider. You can identify this tactic by looking at the TXT record responses. If this tactic is combined with typosquatting (which is described in the second blog in this series), that is a strong indicator of badness.
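
Spotting SPF and DKIM entries among TXT responses is mostly string matching: SPF records begin with “v=spf1”, and DKIM key records that carry a version tag begin with “v=DKIM1”. The sketch below is a simplified classifier with hypothetical function names, not a full parser for either format.

```python
def _unquote(txt):
    """Strip surrounding whitespace and double quotes from a TXT value."""
    return txt.strip().strip('"').strip()


def classify_txt(txt):
    """Label a TXT response as 'spf', 'dkim', or 'other'."""
    lowered = _unquote(txt).lower()
    if lowered == "v=spf1" or lowered.startswith("v=spf1 "):
        return "spf"
    if lowered.startswith("v=dkim1"):
        return "dkim"
    return "other"


def spf_includes(txt):
    """Return the domains named by include: mechanisms in an SPF record."""
    if classify_txt(txt) != "spf":
        return []
    includes = []
    for term in _unquote(txt).split():
        mechanism = term.lstrip("+-~?")  # drop an optional qualifier
        if mechanism.lower().startswith("include:"):
            includes.append(mechanism.split(":", 1)[1])
    return includes


record = '"v=spf1 include:_spf.example.com ~all"'
print(classify_txt(record), spf_includes(record))
```

The include: targets hint at which mail provider a domain was set up to send through, which is the piece worth comparing against typosquatting signals.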

Leveraging PDNS Data in Your Environment

If you have read my previous two posts in this series, it will come as no surprise that I highly recommend scaling up your manual investigations by applying these tactics through security orchestration, automation, and response (SOAR), as well as beefing up machine learning capabilities.

What Pairs Well with Domain pDNS Data?

Conclusion

Thank you for sticking with me through this long-winded series on DNS, Whois, and Passive DNS. I hope this series serves as a useful guide to refer to and familiarize yourself with in your investigations and as you analyze network infrastructure. You can find the pDNS cheat sheet below, as well as a PDF that includes helpful signals from all three datasets I covered. If there are other datasets you’d like for me to explore, please don’t hesitate to DM me on Twitter. Thank you again and stay safe out there!

PDNS Cheat Sheet

Record Type | Observation | Potential Indication
Expiration Date | Recently expired domain with changed registration information from previous ownership. | Potential BEC or phishing infrastructure.
Subdomain | Typosquatting or non-typo spoofing (e.g. affixes/prefixes). | Suspicious infrastructure.
Hostname | Hostname with high entropy. | Potential DNS tunneling infrastructure.
Hostname | Hostname (DNS RNAME) with 27+ unique characters. | High likelihood of DNS tunneling infrastructure.
AAAA | Queries with odd A or AAAA responses. | Potential C2 infrastructure.
IP | A single IP cycling quickly through queries or domains. | Fast flux.
IP | (a) hosting on dynamic consumer broadband IPs (e.g., DHCP addresses) and (b) IPs from multiple ASNs, and (c) NOT something associated with a CDN. | Fast flux.
Nameserver | Nameserver response associated with known badness associated with a provider. | Attacker is likely running their own infrastructure.
CNAME | Typosquatting. | Potential phishing infrastructure.
CNAME | CNAME connected to other known-bad infrastructure (e.g. hostnames, domains, etc). | Suspicious infrastructure.
TXT | TXT responses with SPF or DKIM records associated with typosquatting domains/CNAMEs/Subdomains/Nameservers. | Potential BEC or phishing infrastructure.

 

Additional Resources

A Study of Newly Observed Hostnames and DNS Tunneling in the Wild

Analysis of DNS TXT Record Usage and Consideration of Botnet Communication Detection

Detecting DNS Tunneling

DomainTools 101: Looking at Greenbug’s DNS Tunneling in ISMDoor with DomainTools Iris

Finding Additional Indicators With a SeaTurtle Deep Dive in Passive DNS Within DomainTools Iris

Passive DNS Data in Iris

Strengthen Your Investigations’ Resolve with pDNS

Valuable Datasets to Analyze Network Infrastructure Part 1

Valuable Datasets to Analyze Network Infrastructure Part 2