Analyzing Network Infrastructure as Composite Objects
Share this entry
Introduction
Network infrastructure is one of the primary observables associated with cyber intrusions. From IP addresses serving as scanning or reconnaissance infrastructure through domains functioning as command and control (C2) or exfiltration servers, network infrastructure forms a prerequisite for adversary operations. Viewed through typical models such as the Lockheed-Martin Cyber Killchain, (shown below) network infrastructure observables factor into nearly every stage of the intrusion lifecycle as either a critical dependency or an enabling factor.
Yet while commonly referred to and almost universally present, network infrastructure observables—specifically domain names and IP addresses—are also frequently derided as atomic, minimally enriched items for defensive purposes. This paper will show that such a view is mistaken and misguided—but only if we expand our understanding of network infrastructure observables and their characteristics.
A thorough examination of network observables shows differentiation between minimally enriched indicators and composite objects which enable more in-depth understanding and analysis. By adopting the latter view, a seemingly atomic object such as a domain name yields a number of linked observations, which enable further analysis and pivoting. By following this methodology, defenders can use a relatively small set of observations to build a robust picture of adversary tendencies and unearth additional campaigns and adversary observations.
Indicators and Atomic Objects
Information security operations live through the ingestion, analysis, and disposition of technical observations. Frequently called “indicators of compromise” (IOCs), these items are, in theory, composite objects consisting of observations, descriptions, and metadata. Yet, in practice, “IOC” refers to a debased, diminished version of the original concept. Rather than containing built-in contextuality, IOCs instead are reduced to mere observables in isolation: an IP address, a domain name, and a hash value.
While (debased) IOCs still drive much of everyday network security operations, they are increasingly derided by analysts and industry representatives. Examples include calls to emphasize adversary behaviors over specific technical observations for defense or theoretical constructs such as the Pyramid of Pain representing the “staying power” of different types of observations.
While these arguments are generally correct and insightful for improving the practice of information security, such developments come with an implicit and often ignored cost. In a rush to embrace behavior-based, tactics-techniques-procedures (TTP) focused defense, the value of indicators and their nature may be left behind.
A more thorough understanding of what exists within an indicator allows us to explore in greater detail the nature and basis for that indicator’s existence. Just as an atom, while representing the fundamental building block of matter, contains subatomic particles that define its characteristics, the same is true for bare indicators. With further analysis and enrichment, we can discover greater details and better understand these observations.
Nature of Network Infrastructure
Network infrastructure observables are those artifacts related to intrusion events or adversary activity linked to delivery, communication, control, and exfiltration, among other items. Although not exhaustive, examples of network infrastructure observables include domain names, IP addresses, and SSL/TLS certificates. These items are interrelated as they pertain to aspects of the same overall communication scheme: an IP hosts a domain that uses an SSL/TLS certificate to encrypt traffic.
At first glance, these items appear to be unitary atomic indicators. As such, they would appear to require enrichment and context outside of themselves to have lasting, meaningful value for understanding adversary behaviors and tendencies. However, further investigation of these items indicates a more complex nature with multiple subcomponents and characteristics that identify these items, under proper analysis, as composite objects.
As shown in the updated image, infrastructure observables contain a wealth of data when properly analyzed and observed. Extending the comparison to atoms made above, each type of network observable effectively “breaks down” into a mixture of subcomponents. Understanding and analyzing these items, their relationships, and patterns of composition yields insights into adversary behaviors which extends and deepens the value of a “bare” network indicator.
Domain Names
A domain name is a natural language, a human-readable item designed to reference a machine-focused IP address hosting content online. While content or services can be accessed simply through an IP address, domains facilitate the process and allow for a certain degree of “branding” or uniqueness through their name.
The domain itself is not a unitary object, though. Instead, the domain consists of several components or identifying metadata that can be used to “fingerprint” or gain further insight into the domain’s nature or creation. In addition to any hosting information related to the domain and its use, covered below, the following items represent characteristics of a domain that analysts and defenders can leverage:
- Domain Registrar: To create and take ownership of a domain, an individual or entity needs to work through a registrar to secure a domain through one of the registries managing the desired Top Level Domain (TLD – e.g., “.com”). Registrars differ widely in pricing, client scrutiny, and other aspects. As a result of these characteristics and infrastructure preferences, threat actors may prefer or primarily leverage certain registrars over others for infrastructure creation.
- Domain Registrant: Domains are created by a given registrant. While this information was historically quite useful, as such information would include contact email addresses and other information that could be used to fingerprint infrastructure creation, the increasing adoption of privacy protection services and the impact of the European Union’s General Data Protection Regulation (GDPR) have greatly restricted such information at present. Nonetheless, commonality in privacy protection services across registrations can still be a weak link to tie together various domains.
- Name Server: Domain resolution to an IP address requires an authoritative name server in order to translate requests. Identifying name servers associated with registration—especially specific authoritative servers—can reveal patterns of infrastructure creation and adversary tendencies.
- Domain Naming Theme or Convention: In one of my previous posts, I described how actual domain name selection may be used to infer adversary intent as well as adversary tendencies. Threat actors must pick something for a domain name, whether this is a randomly-generated string, an item matching a theme, or a name matching a target or campaign. Identifying these themes or conventions can be a surprisingly useful mechanism to differentiate domain registrations and identify commonalities for an actor.
The totality of these above items defines a domain. Just as they are components that represent the domain, they are also items that can be used to search for similarly-structured or created infrastructure. For example, an adversary may consistently use the same combination of name server, registrar, and registration privacy protection service that can enable pivoting and identification of additional adversary infrastructure.
As seen in the DomainTools Iris Investigate screenshot below, these items can be readily identified and used for pivoting purposes.
IP Addresses
IP addresses are the identifying schema for Internet Protocol (IP) traffic and are used to designate specific machines or servers to receive traffic. While domain names are not necessary for network communication, IP addresses are required for IP-based communication. An active, communicating domain will always be paired with an IP address, while an IP address need not have a related domain to ensure communication between hosts.
While IP addresses are one of the most commonly seen indicators in CTI reporting, just like domains IP addresses hide multiple subcomponents that identify aspects of adversary tendencies and behaviors:
- Hosting Provider: Adversaries must find reasonably private, non-attributable hosting for network infrastructure. Options include any major cloud service providers from Amazon Web Services to DigitalOcean; smaller virtual private server (VPS) providers, or utilizing services such as CloudFlare to mask true hosting from monitoring parties.
- Hosting Location: In addition to hosting providers, threat actors have a degree of choice over hosting location. Cloud, VPS, and other providers typically own infrastructure in various countries. Adversaries can leverage location specificity for purposes ranging from avoiding potential geographic-based traffic filtering to taking advantage of the legal system of the hosting country to maximize privacy or make defender investigations more difficult.
- Server Type: Infrastructure still needs a system on which to run, and the choice of operating system (OS) and version can also be used to fingerprint adversary tendencies. Threat actors can decide between various flavors of Linux to different versions of Windows for the underlying OS. Identifying particular tendencies—especially related to exposed system services, described below—can reveal patterns of activity that can be used to identify or disposition new infrastructure.
- Server Services: To function as a command and control (C2) or another node, a server must listen on some service. The most direct and basic would be HTTP or HTTPS, in which we as defenders, can identify the web server type, version, and, in the case of HTTPS, server SSL/TLS certificates (described further below). Identifying non-standard or atypical services, especially for unique or custom C2 frameworks, can further enable identification and tracking.
To illustrate the above concepts, the suspicious domain “adverting-cdn[.]com” is hosted on a dedicated server at 213.252.246[.]23. Within DomainTools Iris Investigate, we can identify the IP address, hosting provider, and hosting location:
Using an internet scanning and enumeration tool such as Shodan, we can further investigate the server to identify services and fingerprint the server’s OS:
Based on the above, we can identify a suspicious-looking domain hosted using a specific service in Lithuania, exposing HTTP and SSH as well as HTTP over TCP 8000, and allowing us to fingerprint the server as a Debian Linux machine. Using this information, we could work to identify further infrastructure through both domain and server characteristics.
SSL/TLS Certificates
Finally, adversaries (as well as most legitimate web services) frequently employ standard encryption using the Secure Sockets Layer (SSL) or Transport Layer Security (TLS) protocols. While adversaries can certainly use custom encryption or encoding protocols for traffic, the ubiquity of SSL/TLS-wrapped communications and the limited visibility into such communications for most organizations make the publicly-available standard both very effective and significantly cheap for threat actors to deploy.
As a form of public key cryptography, SSL/TLS encryption depends upon certificates for functionality. Certificates can be tracked through various data points, as they typically feature identifying information related to the certificate owner, such as organization or location.
Certificates can take various forms from self-signed, untrusted items to certificates created through free and unvetted services (such as Let’s Encrypt) or sources which perform some degree of vetting when issuing them. While resulting metadata will vary depending on the issuer and the certificate, there is a rich history in using certificate characteristics to track adversary behavior, such as legacy APT28 or Fancy Bear certificate and infrastructure activity.
Looking at the suspicious domain european-who[.]com, we discover a free certificate associated with cPanel services. As such, there is limited data to pivot off of although we can note the limited certificate information and use of a free service to identify this as potentially suspicious, while the certificate hash value provides a way to track the use of this specific item.
Even though in this case we have limited information for direct pivoting, just identifying the use of self-signed, built-in cPanel certificates can be used to differentiate and identify further infrastructure discovered through domain- or IP-based analysis. Identifying another network object with similar domain or IP characteristics that also deploys similar SSL/TLS certificate creation tendencies can allow us to link infrastructure and begin understanding adversary behaviors.
Composite Details and Pivoting
In addition to pivoting within different types of infrastructure characteristics, as shown in the diagram below, understanding of adversary infrastructure tendencies enables pivoting between items as well.
In this view, insight into domain creation can reveal infrastructure hosting tendencies, which can be leveraged to identify additional domains. Or a persistent pattern in SSL/TLS certificate creation yields additional domains which in turn map to additional infrastructure.
However, while the discovery of new indicators is enticing and potentially useful for defense, this represents an intermediate objective as part of a larger process. Instead of merely attempting to identify more IOCs, cyber threat intelligence (CTI) analysts should leverage this work as a means to identify fundamental adversary tendencies which can be used to continuously identify and disclose infrastructure over time. Indicators will certainly be produced through this work, but they represent an output of a more fundamental process of identifying and observing ground-truth adversary behaviors in creating network infrastructure.
Example: Late 2020 Ryuk Activity
To see an example of the above, we can look at the recent episode of Ryuk ransomware incidents across multiple hospital and health care provider systems in the United States, linked to a group referred to as UNC1878 by information security company FireEye. While utilizing malware and droppers such as BazarLoader and BazarBackdoor, Ryuk deployers still required C2 infrastructure to control and further expand infections in victim environments. Based on information released by FireEye and Kyle Ehmke from ThreatConnect, and confirmed by multiple additional parties, the UNC1878 network infrastructure activity encompassed several hundred domains created from summer 2020 through October 2020. (Please see the linked file for IOCs.)
While seemingly overwhelming, the above domains contained a number of commonalities:
- Similar registration patterns used over time.
- Preference for a limited number of hosting providers.
- Consistency in SSL/TLS certificate characteristics.
For example UNC1878-linked domain “drive-boost[.]com” features the following characteristics:
Breaking these observables down, we see the following:
- Use of consistent naming “theme” reflecting IT-related concepts or items (e.g., “driver”).
- Use of the Namecheap registrar.
- Consistent use of WhoisGuard privacy protection service.
- Consistent use of registrar-servers[.]com DNS server.
While the above items individually are quite common, taken together they allow us to begin filtering likely related items. Looking at hosting ISP patterns, shown above in a DomainTools Iris Investigate visualization, reveals additional characteristics—namely four distinct “waves” or series of activity centered around the following providers:
- Private Layer Inc., located in Switzerland.
- Psychz Networks, located in the United States.
- Frantech Solutions, located in the United States.
- Combahton GmbH, located in Germany.
Now we have a better way of clustering known activity, and potentially identifying new, similar items. But even further options exist within a subset of domains that feature an existing SSL/TLS certificate. For example, looking at “driver-boosters[.]com” shows the following certificate information:
In this case, we see a certificate pattern using a Locality Name of “Texas” and an Organization Name of “lol” for a self-signed certificate. Diving into certificate information yields over 100 items which, when looked at in conjunction with other items documented above, allows for high confidence attribution to UNC1878 activity.
Through this process, we have identified a set of characteristics denoting UNC1878 infrastructure. This can be used both for post-incident attribution to determine what entity may be responsible for a breach. Additionally, such methodologies can be used through datasets such as DomainTools and search tools such as Iris Investigate to proactively and preemptively identify infrastructure as it is created. In this fashion, CTI becomes more attuned to adversary tendencies and characteristics while also enabling more direct, proactive support to network defense operations through continuous identification of adversary infrastructure.
Conclusion
While network infrastructure indicators and observables are typically viewed as atomic objects, seeing these items as composites enables powerful analysis able to keep pace with adversary evolution. By understanding the fundamental nature of items such as domain names, IP addresses, and SSL/TLS certificates, analysts can begin understanding fundamental adversary tendencies and tradecraft. When done well, such actions enable defenders to not only accurately disposition new infrastructure as it is discovered, but also to identify new infrastructure as it is created to boost defensive operations.
While this process can be quite powerful in identifying and tracking adversary operations, we must also note limitations to this methodology. For example, adversaries may leverage compromised, legitimate infrastructure for communications and similar activity in order to hide their tracks and confuse analysis. This is an increasing trend for many threat actors, which can throw off analytical techniques such as those described above. However, even in these cases, possibilities exist such as identifying underlying server or software commonalities which may be indicative of a vulnerability exploited to gain access to the legitimate server. These commonalities can enable similar types of pivoting to what was discussed previously with respect to adversary-owned-and-operated infrastructure.
By viewing indicators as not just isolated objects but as composites containing multiple components that can be used to better understand the nature, purpose, and composition of the indicator, CTI analysts can unlock a greater understanding of adversary operations. Furthermore, while this article is limited to network observables, the same fundamental concepts are equally applicable to host and file-based indicators as well. By further refining, researching, and enriching indicators, CTI analysts can continuously push the envelope of threat understanding and threat detection, enabling both a better understanding of past events and potential identification of new threat vectors from known adversaries as they emerge.
To learn how to identify and track adversary operations in DomainTools with Iris Investigate, visit our product page.