Behind every investigation is a detective, tasked with combing through the trivial details at a crime scene to find the clues that count. Every good detective knows what evidence to look for first: fingerprints. Essential to identifying a culprit, fingerprints are the first step toward uncovering other important pieces of information—behaviors, intentions, motives, accomplices, related crimes—that can help crack a case.
This is true in threat hunting too, where understanding the adversary is critical. Gathering intelligence—such as what the attackers are most likely to do, who they are targeting and why, what they want to achieve, their go-to tactics and how they will react if detected—is the only realistic way to make smart decisions about how to defend our networks.
This starts with observing our enemies and their tactics, techniques and procedures (TTPs), even when they are attempting to cover their tracks. Leveraging network forensics, incident response processes and known facts from previous (or active) intrusions can provide a window into a threat actor’s activities and behaviors. With facts from controlled observations, teams can develop a hypothesis about how the adversary operates and which tools are being used. Identifying patterns in activity and the typical targets of a specific adversary will help the team validate or refine the hypothesis. As adversaries evolve or change their TTPs and targets, the hypothesis should be refreshed.
Case in Point
One of the first documented cases of cyber adversary observation was Clifford Stoll’s investigation into hacking attempts against the Lawrence Berkeley National Laboratory in California. After extended, staged observation and logging of all of the adversary’s activities, Stoll was able to trace the threat to Germany. Further efforts led to the identification and apprehension of the culprit, who was found to have been selling the information he obtained to the Soviet KGB.
Once a consistent hypothesis has been achieved, the security team can move into the phase of applied threat intelligence and establish detections that apply to the observed and hypothesized behavior. This allows investigators to take a more proactive approach, and explore threat intelligence to reveal the unknown unknowns. One way to do this is through fingerprinting adversary infrastructure with web assets.
10 Tips for Web Asset Fingerprinting
Threat actors are constantly applying new methods, making it impossible to know everything about them or their capabilities. But with the tips below, teams can begin to discover digital fingerprints that will help lead to additional clues, just like in an investigation of a physical crime.
- Don’t Overlook Operational Security. Operational Security (OPSEC) is important when hunting down an adversary, especially when attribution is one of the investigator’s goals. An OPSEC mistake can reveal critical information that can lead to a takedown.
- Leverage Applied Threat Intelligence. One example of applied threat intelligence is looking at HTTP GET vs. POST ratios. Once you understand the normal ratio for your network, deviations from it become anomalies worth investigating as possible attacker behavior. Newly registered domains are another type of fingerprint that can be useful in the applied threat intelligence phase.
- Tap Into DNS and Whois. An attacker might register 25 domains at the same time, but only use one or two. If they are all registered with the same nameserver, investigators can use that as a fingerprint to pivot upon to find more of the attacker’s infrastructure that can be blocked from the network.
- Remember That Adversaries are Programmers. And programmers are unlikely to write custom code for every site they develop. Instead, they reuse some of their foundational code and tools for every site, and customize portions. Adversaries do this too. So elements like the set of third-party files loaded into an HTML document, and the order in which they are loaded, can reveal an adversary’s “style” and therefore provide a potential fingerprint.
- Connection Searching. This approach involves searching for file names found on a suspect site to find connections to other sites that may share infrastructure. There will be false positives with this tactic, but typically, results that yield a lower number of connections indicate a stronger likelihood of a real connection between sites that isn’t immediately obvious.
- Pivot. Pivoting across pieces of information like nameserver, Whois information, email domains, shared hosting and related domains can provide additional fingerprints for investigators to search. For example, in the DomainTools Iris Investigate Platform, an investigator can look at IP addresses and other characteristics that share the same value with a suspicious domain or known fingerprint. In many cases, this can reveal additional unknown domains or details to help tell the full story about a threat.
- File Hashing. Comparing file hashes can show when files are the same, even if the file names are different. This will help draw connections between dangerous sites. If investigators think an attacker is going to try to avoid detection by adding junk to their files, or by injecting certain CSS files into a page, teams can grab those modules and apply function-level hashing and analysis for further insight.
- Follow Malware Analysis Practices. Techniques such as analyzing signatures, external network calls, code reuse, code style and runtime behavior have been used in malware analysis for some time. These can be extended to the web space to analyze and fingerprint malicious sites.
- Replicate Advertiser Browser Fingerprinting. Advertisers use browser fingerprinting techniques to identify users by a variety of attributes (cookies, preferences, location, etc.) for ad targeting. Security teams can use a similar approach by looking at page attributes to target threat actors.
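Two of the tips above, the order in which third-party files are loaded and file hashing, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice of SHA-256, and the decision to compare file names while ignoring hosts and paths are all hypothetical and are not taken from any particular product.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect script and stylesheet URLs in the order they appear in a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower()
            if "stylesheet" in rel:
                self.resources.append(attrs["href"])


def load_order_fingerprint(html):
    """Hash the ordered list of loaded file names (assumption: hosts and paths are ignored)."""
    parser = ResourceCollector()
    parser.feed(html)
    names = [
        url.split("?")[0].rstrip("/").rsplit("/", 1)[-1]
        for url in parser.resources
    ]
    return hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()


def content_hash(data):
    """Hash file contents so identical files match even when their names differ."""
    return hashlib.sha256(data).hexdigest()


def group_by_fingerprint(pages):
    """Map each load-order fingerprint to the domains sharing it (two or more only)."""
    groups = defaultdict(list)
    for domain, html in pages.items():
        groups[load_order_fingerprint(html)].append(domain)
    return {fp: sorted(doms) for fp, doms in groups.items() if len(doms) > 1}
```

Passing a dictionary of domain names to raw HTML into `group_by_fingerprint` returns only the fingerprints seen on more than one domain, which gives an investigator a short list of sites to pivot on. Ignoring paths and hosts is a deliberate simplification here; a stricter fingerprint could include them at the cost of missing sites where the same kit is deployed in a different directory.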
Evolving the Method
Web asset fingerprinting is a new method, and therefore will require some fine-tuning as it is applied at scale. There will be false positives, and security teams will need to find ways to manage the signal-to-noise ratio. Ways to do this include:
- A system for adding benign content and commonly used files to allowlists.
- Narrowing the scope of files that the team is hashing and pivoting on.
- Focusing on indicators that have a strong correlation to known intelligence.
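As a rough sketch of how the first two measures might look in practice, the Python below drops allowlisted file hashes and then ranks the remaining shared hashes so that those seen on the fewest sites come first, reflecting the earlier point that fewer connections often mean a stronger link. The function name, the input shape and the `max_sites` cutoff are assumptions made for illustration only.

```python
from collections import defaultdict


def rank_shared_files(site_hashes, allowlist, max_sites=5):
    """Rank file hashes shared between sites, rarest first.

    site_hashes: dict mapping a domain to the set of file hashes seen on it.
    allowlist:   set of hashes for benign, commonly used files to ignore.
    max_sites:   hypothetical cutoff; hashes on more sites are treated as noise.
    """
    seen = defaultdict(set)
    for domain, hashes in site_hashes.items():
        for file_hash in hashes:
            if file_hash not in allowlist:
                seen[file_hash].add(domain)

    # Keep only hashes that link at least two sites but are not widespread.
    candidates = [
        (file_hash, sorted(domains))
        for file_hash, domains in seen.items()
        if 2 <= len(domains) <= max_sites
    ]
    # Fewer linked sites sorts first; the hash breaks ties for stable output.
    return sorted(candidates, key=lambda item: (len(item[1]), item[0]))
```

Tightening `max_sites` or growing the allowlist narrows the set of hashes the team pivots on, which is one simple way to trade coverage for a better signal-to-noise ratio.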
Catching cyber culprits will always be a cat-and-mouse game. While attackers are becoming increasingly sophisticated, it will continue to be difficult for them to completely hide their fingerprints. Defenders who are adept at web asset fingerprinting and strategic about how they leverage the intelligence gathered from this approach will have far more success in answering important questions about their adversary and blocking malicious infrastructure from their networks.