Farsight Long View

Threat Intelligence and The DIKW Pyramid

Published on: Jul 1, 2015

Abstract

If you read any security industry press, you will likely see at least one reference to the term “threat intelligence”. But what exactly is it, and what does it mean? Unfortunately, because it has become a saturated marketing term, threat intelligence means different things to different people. As with “Advanced Persistent Threat” (APT) over the past few years, you have to know who’s speaking to understand the context in which they’re using the term.

Threat Intelligence

Gartner defines threat intelligence as:

threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications and actionable advice, about an existing or emerging menace or hazard to assets that can be used to inform decisions regarding the subject’s response to that menace or hazard.

Based on this definition, true threat intelligence would need to come in a rich format such as STIX, or in another format capable of expressing a great level of detail, such as a detailed report. To meet the expectations of the Gartner definition, the output would need to include both indicators and mechanisms, as well as the confidence associated with that information.

The DIKW Pyramid

Though it may be a trope, the DIKW Pyramid gives us a simple framework for discussing the relationship between various forms of security feeds. Each layer is discussed below.

Data

At the first layer, D stands for data, which is the least valuable and most plentiful stage. This stage can be thought of as a collection of atomic facts in isolation. These facts may be interesting, they may be repetitious, and they may even be obvious. We could be talking about NetFlow, pcap, log lines, SNMP traps, or anything else that exists in “raw” form as it would be collected in the field. This raw data is critical source material, but it would be a real stretch to call it threat intelligence. While it might be rich in indicators, it is sorely lacking in context, implications, or advice. In this form it wouldn’t even be appropriate to call it a threat feed, because the raw data as collected would include both harmless and malicious activity.

Information

The next layer, I, represents information. Information is often filtered to include only a defined context, possibly de-duplicated, or otherwise reduced to the subset of the data that has value for a specific application. This stage would also see the filtering of common false positives and the removal of non-critical, privacy-impacting information.
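As a rough illustration of the step from data to information, the sketch below reduces raw events to a de-duplicated list of domains. The `RawEvent` shape, the `KNOWN_BENIGN` allowlist, and the `to_information` function are all hypothetical names invented for this example; a real pipeline would define its own context and filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawEvent:
    """One raw, data-layer record (hypothetical shape)."""
    timestamp: int
    client_ip: str  # privacy-impacting; feed consumers do not need it
    domain: str


# Hypothetical allowlist of destinations known to cause false positives.
KNOWN_BENIGN = {"example.com", "example.org"}


def to_information(events):
    """Reduce raw events to a de-duplicated list of candidate domains."""
    seen = set()
    feed = []
    for event in events:
        # Normalize so that trivially different spellings compare equal.
        domain = event.domain.lower().rstrip(".")
        if domain in KNOWN_BENIGN:
            continue  # filter common false positives
        if domain in seen:
            continue  # de-duplicate
        seen.add(domain)
        feed.append(domain)  # client_ip and timestamp are dropped here
    return feed


raw = [
    RawEvent(1, "192.0.2.10", "bad.example.net."),
    RawEvent(2, "192.0.2.11", "BAD.example.net"),
    RawEvent(3, "192.0.2.12", "example.com"),
    RawEvent(4, "192.0.2.10", "odd.example.net"),
]
print(to_information(raw))  # ['bad.example.net', 'odd.example.net']
```

Four raw events become two feed entries, and nothing in the output identifies which client generated the traffic.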

The I layer is the stage where many researchers start to produce feeds as an output, and most community data sharing projects function at this stage. However, according to Gartner’s definition, we cannot yet call this information threat intelligence. With careful filtering we can start to add depth and context to the information that a feed provides, and in some cases begin to formulate advice for consumers who share the assumptions made by the publisher.

The data that Farsight collects as part of our PassiveDNS platform exists at the I layer. We have designed the collection process such that we gain the benefit of existing caching as a first-stage de-duplication mechanism. We can also leverage the abstraction provided by this cache to anonymize the data collected. This is just one example, but it illustrates how some careful planning can result in easy value-add.
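To make the caching idea concrete, here is a toy model. It is not a description of Farsight's actual sensors. It assumes a hypothetical `CachingResolver` whose observations are recorded only on cache misses, so repeated client queries within a record's TTL produce a single observation, and the observation never includes the client address.

```python
class CachingResolver:
    """Toy recursive resolver with a TTL cache (illustrative only)."""

    def __init__(self, upstream):
        self.upstream = upstream  # callable: (name, rrtype) -> (rdata, ttl)
        self.cache = {}           # (name, rrtype) -> (rdata, expires_at)
        self.observed = []        # what a collector above the cache would see

    def resolve(self, client_ip, name, rrtype, now):
        key = (name, rrtype)
        cached = self.cache.get(key)
        if cached is not None and cached[1] > now:
            return cached[0]  # cache hit: nothing new is observed
        rdata, ttl = self.upstream(name, rrtype)
        # Only the cache-miss transaction is recorded; client_ip is absent.
        self.observed.append((name, rrtype, rdata))
        self.cache[key] = (rdata, now + ttl)
        return rdata


def fake_upstream(name, rrtype):
    """Stand-in for an authoritative answer: fixed address, 60 second TTL."""
    return ("198.51.100.7", 60)


resolver = CachingResolver(fake_upstream)
resolver.resolve("192.0.2.10", "www.example.com", "A", now=0)
resolver.resolve("192.0.2.11", "www.example.com", "A", now=10)
resolver.resolve("192.0.2.12", "www.example.com", "A", now=59)
resolver.resolve("192.0.2.10", "www.example.com", "A", now=61)
print(len(resolver.observed))  # 2
```

Four client queries yield two observations: one for the initial miss and one after the cached answer expires.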

Knowledge

The K layer, sitting on top of information, is knowledge. At this layer we can take multiple information-level feeds and their associated context and combine them such that the output includes additional details or is filtered in such a way that it becomes actionable. Extremely effective products functioning at the knowledge layer also take context from the consumer as input, for example their networks or domain names, security posture, risk tolerance, and so on. Here we start to see the early signs of threat intelligence; however, many outputs of knowledge-level systems can still be ingested by automation.
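One way to picture the knowledge layer is as a join between several information feeds and the consumer's own context. The sketch below is a minimal example under assumed inputs: `feeds` maps hypothetical feed names to sets of domains, `seen_locally` stands in for consumer context (domains the consumer's own network has looked up), and `min_sources` is a crude proxy for risk tolerance.

```python
def build_knowledge(feeds, seen_locally, min_sources=2):
    """Combine information feeds with consumer context into actionable items."""
    # Record which feeds report each indicator.
    sources = {}
    for feed_name, indicators in feeds.items():
        for indicator in indicators:
            sources.setdefault(indicator, set()).add(feed_name)

    actionable = []
    for indicator in sorted(sources):
        reporters = sorted(sources[indicator])
        if len(reporters) < min_sources:
            continue  # not enough corroboration for this consumer
        if indicator not in seen_locally:
            continue  # not relevant to this consumer's network
        actionable.append({"indicator": indicator, "reported_by": reporters})
    return actionable


feeds = {
    "feed_a": {"bad.example.net", "odd.example.net"},
    "feed_b": {"bad.example.net", "other.example.net"},
    "feed_c": {"other.example.net", "bad.example.net"},
}
seen_locally = {"bad.example.net", "odd.example.net"}
print(build_knowledge(feeds, seen_locally))
# [{'indicator': 'bad.example.net', 'reported_by': ['feed_a', 'feed_b', 'feed_c']}]
```

Only the domain that is both corroborated by multiple feeds and present in the consumer's own traffic survives, and the output carries the added detail of which feeds reported it.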

Wisdom

Finally we have the wisdom layer, W. If you’re going to stick strictly to the Gartner definition, true threat intelligence products must exist here. This is where you would take elements from the lower layers and combine them with a detailed analysis and even more external data to produce either a report or a rich data feed providing almost step-by-step actionable countermeasures and detailed risk analysis to the consumer.

Timing, Action, and Value

There are many products and services in the security marketplace today that can be called threat intelligence; there are also many that cannot. A key factor that we have not yet discussed is timeliness. Wisdom-layer threat intelligence takes, at the very least, hours if not days to collect, vet, and compile. With sufficiently advanced automation, knowledge can often be generated in a matter of minutes. The big advantage of the information layer is that it can often be acted on automatically within seconds. Last we have data: while it’s much harder to act on effectively in real time, it can be on hand within milliseconds.

The value of a feed has a complex relationship with its timeliness. Feeds that are completely digested into a human-readable report with context and confidence are often actionable, and therefore operationally valuable and often more expensive. It is worth noting that reaching the level of detail required to meet Gartner’s definition takes time. While a report is more immediately actionable, the delay required to generate and ingest that report can reduce its effectiveness in preventing attacks.

Further down the stack, feeds at the information or data layers often lend themselves to automation. With careful architecture, systems can consume lower-confidence, higher-volume feeds that enable the security team to defend first and understand later. With the reduction of manual oversight, these lower-layer feeds often have a higher false positive rate; however, if the use of the feed is carefully aligned with the assertions made by the feed, the risks to operations can be minimized. It is often preferable to block a site for a few minutes based on an automated feed while a manual determination is made.
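The "defend first, understand later" pattern can be sketched as a blocklist whose automated entries expire unless an analyst confirms them. The `TemporaryBlocklist` class and its five-minute default hold are assumptions made for illustration, not a recommendation for any particular product or hold time.

```python
class TemporaryBlocklist:
    """Blocks feed indicators for a short hold period pending manual review."""

    def __init__(self, hold_seconds=300):
        self.hold_seconds = hold_seconds
        self.expires = {}       # indicator -> time the automated block lapses
        self.confirmed = set()  # indicators an analyst has confirmed

    def add_from_feed(self, indicator, now):
        """Automated path: block immediately, but only for the hold period."""
        self.expires[indicator] = now + self.hold_seconds

    def confirm(self, indicator):
        """Manual path: an analyst judged the indicator malicious."""
        self.confirmed.add(indicator)

    def release(self, indicator):
        """Manual path: an analyst judged the indicator a false positive."""
        self.expires.pop(indicator, None)
        self.confirmed.discard(indicator)

    def is_blocked(self, indicator, now):
        if indicator in self.confirmed:
            return True
        expiry = self.expires.get(indicator)
        return expiry is not None and now < expiry


blocklist = TemporaryBlocklist(hold_seconds=300)
for name in ("bad.example.net", "odd.example.net", "unreviewed.example.net"):
    blocklist.add_from_feed(name, now=0)

blocklist.confirm("bad.example.net")  # analyst confirms
blocklist.release("odd.example.net")  # analyst clears a false positive

print(blocklist.is_blocked("bad.example.net", now=400))         # True
print(blocklist.is_blocked("odd.example.net", now=10))          # False
print(blocklist.is_blocked("unreviewed.example.net", now=10))   # True
print(blocklist.is_blocked("unreviewed.example.net", now=400))  # False
```

Because every automated entry lapses on its own, a false positive from a high-volume feed costs at most a few minutes of blocking, which limits the operational risk of acting without manual oversight.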

As you’d expect, wisdom-layer products are usually more expensive than those at the lower layers. Knowledge is less so, and information can often be found in the open-source domain. Data is somewhat interesting: while it is the least valuable in terms of actionability, access to it often touches on privacy or regulatory issues. Sources at the data layer are also encumbered by volume and transmission challenges. So while it may be the least processed, data is at times only available under an NDA or other agreement. As a result, you often find that data partners prefer to exchange information-layer products.

Conclusion

Defining precisely what a “threat intelligence” service looks like can be a challenge. No two vendors build to the exact same parameters. The DIKW Pyramid gives us some simple building blocks with which to assemble an array of products. Knowing how your feeds come together can help you leverage them to the best effect. Be sure to understand how the source material is collected and processed; this will enable you to select the most appropriate actions to take or inferences to draw.

Ben April is the Director of Engineering for Farsight Security, Inc.