Hunting on the Cheap, Part 1: The Architecture

As security approaches reliant on known indicators of compromise (IOCs) are increasingly failing, “assume breach” has become a common expression in the industry. Far too often, intrusions go undetected until an external party discovers a breach and notifies the organization. Instead of relying on signature-based solutions or visits from a third party to learn of a problem, network defenders need to “assume breach” by unknown adversaries who are already active within the enterprise. Given the increasingly targeted and personalized nature of attacks, network defenders must expand beyond searching for known IOCs and hunt for unknown breaches within their networks. This systematic pursuit of unknown adversaries is known as cyber adversary hunting.

Hunting is not without its challenges. Because it is a relatively new and ill-defined concept, some believe hunting is beyond their personnel or resource capabilities. Defenders need powerful tools to sift through mountains of data to rapidly detect and deal with a compromise. A full-featured hunt platform dramatically increases a hunter’s power, but security budgets are limited and organizations cannot always invest in every promising technology. Fortunately, there are several ways to hunt “on the cheap.”

At this month’s SANS Threat Hunting and Incident Response Summit, Endgame addressed some of these misperceptions and described ways security professionals can begin hunting without making large, up-front investments. This first of three related posts addresses how to get started hunting on the cheap on your network. The second post will address the various open source ways to cheaply analyze and identify high-order trends on networks, and the final post will conclude with a discussion of some easy ways to begin hunting on your hosts.


Limitations of IOC Search

Security at the network level has traditionally involved searching for IOCs, such as known bad domains, blacklisted IP addresses and sometimes CIDR blocks, or has relied on tools such as Snort or Bro to search for signatures associated with malicious traffic. With malicious tradecraft rapidly evolving and adversary infrastructure becoming less static and harder to distinguish from legitimate services, using network IOCs to detect threats has become harder and less effective. In other words, network IOCs quickly become obsolete. Threat actors often monitor their network assets, and as soon as an asset appears on a blocklist, they move on to a different endpoint. Some attackers segment infrastructure on a per-target basis, reducing the value of global knowledge of the associated IOCs.

Cloud computing has only accelerated these challenges associated with IOC search. It is very easy for an adversary to get IP addresses from one of many hosting providers. Similarly, new ccTLDs and ICANN TLDs managed by registrars that require little or no background check make domains easy to obtain at little or no cost, and WHOIS privacy services keep those registrations stealthy.

For these reasons and more, a smarter approach is required: instead of chasing the past and searching for the known bad, network defenders hunt for patterns and signals that reveal the unknown bad. Once previously unknown indicators of malicious activity are identified, organizations can activate their standing incident response procedures.


Hunting with Passive DNS

Passive DNS is very good at capturing such signals and patterns in a concise and structured way. Passive DNS is the data collected by passively capturing inter-server DNS traffic and reassembling it into DNS transactions. Florian Weimer proposed this technique at the 17th FIRST conference in 2005 to slow down botnet propagation. Since then, a number of security organizations have started collecting passive DNS by placing DNS sensors on geographically diverse networks and analyzing the resulting data to generate threat intelligence. In today’s threat environment, passive DNS can be immensely useful in driving threat hunting.

Passive DNS sensors, in essence, capture DNS traffic – UDP packets to and from port 53 (DNS) – and reassemble all the messages into a single record containing query and responses. We have experimented with two open source sensors:

The sensors can collect either only the iterative DNS queries (shown here in green) or all of the DNS traffic.


[Figure: DNS query]


These sensors can be placed at any point in the network where a sniffer like tcpdump can capture DNS traffic. The best place to install a sensor is on a local recursive DNS server, but a SPAN port will also work.
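To make the reassembly step concrete, here is a minimal sketch in Python of what a sensor does after the sniffer hands it raw DNS payloads. It is not the code of any particular sensor. The names `read_name`, `parse_message`, and `Reassembler`, and the record fields, are hypothetical, and the sketch assumes one question per message and decodes only A records.

```python
import socket
import struct


def read_name(msg: bytes, offset: int):
    """Decode a (possibly compressed) DNS name; return (name, next_offset)."""
    labels = []
    next_offset = None
    hops = 0
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:  # compression pointer to an earlier name
            pointer = struct.unpack_from("!H", msg, offset)[0] & 0x3FFF
            if next_offset is None:
                next_offset = offset + 2
            offset = pointer
            hops += 1
            if hops > 16:
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:  # root label ends the name
            break
        labels.append(msg[offset:offset + length].decode("ascii", "replace"))
        offset += length
    if next_offset is None:
        next_offset = offset
    return ".".join(labels).lower(), next_offset


def parse_message(msg: bytes) -> dict:
    """Parse the header, the first question, and any A-record answers."""
    ident, flags, _qdcount, ancount, _, _ = struct.unpack_from("!6H", msg, 0)
    qname, offset = read_name(msg, 12)
    qtype, _qclass = struct.unpack_from("!2H", msg, offset)
    offset += 4
    answers = []
    for _ in range(ancount):
        _, offset = read_name(msg, offset)
        rtype, _rclass, ttl, rdlength = struct.unpack_from("!2HIH", msg, offset)
        offset += 10
        rdata = msg[offset:offset + rdlength]
        offset += rdlength
        if rtype == 1 and rdlength == 4:  # A record
            answers.append({"type": "A", "ttl": ttl,
                            "data": socket.inet_ntoa(rdata)})
    return {"id": ident, "is_response": bool(flags & 0x8000),
            "rcode": flags & 0x000F, "qname": qname, "qtype": qtype,
            "answers": answers}


class Reassembler:
    """Pair each response with its pending query to form a single record."""

    def __init__(self):
        self.pending = {}

    def feed(self, client: str, msg: bytes, timestamp: float):
        parsed = parse_message(msg)
        key = (client, parsed["id"], parsed["qname"], parsed["qtype"])
        if not parsed["is_response"]:
            self.pending[key] = timestamp
            return None
        query_time = self.pending.pop(key, None)
        if query_time is None:  # response without a query we saw
            return None
        return {"client": client, "query_time": query_time,
                "response_time": timestamp, "qname": parsed["qname"],
                "qtype": parsed["qtype"], "rcode": parsed["rcode"],
                "answers": parsed["answers"]}
```

A real sensor would also expire pending queries that never receive a response and would handle the remaining record types; both are omitted here for brevity.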

Once the passive DNS data is collected by the sensors, it must be transferred and aggregated to a single point for analysis and monitoring. A message queue like Kafka can be used by the sensors to publish the passive DNS records. This enables a flexible and loosely coupled – and open source! – architecture wherein any number of consumers can subscribe to the queue and perform necessary data analysis for threat hunting.
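The loose coupling described above can be sketched as follows. This is an illustration under stated assumptions, not a production pipeline. `make_publisher`, `decode`, and `InMemoryBus` are hypothetical names, the topic name `passive-dns` is made up, and the in-memory bus only stands in for a real broker. The sensor side needs nothing more than a `send(topic, payload)` callable; with the kafka-python client, for example, `KafkaProducer.send` takes a topic and a bytes value and could be passed in its place.

```python
import json
from collections import defaultdict
from typing import Callable


def make_publisher(send: Callable[[str, bytes], object],
                   topic: str = "passive-dns"):
    """Return a function that serializes a record and hands it to `send`."""
    def publish(record: dict) -> bytes:
        payload = json.dumps(record, sort_keys=True,
                             separators=(",", ":")).encode("utf-8")
        send(topic, payload)
        return payload
    return publish


def decode(payload: bytes) -> dict:
    """Consumer-side inverse of the serialization used by the publisher."""
    return json.loads(payload.decode("utf-8"))


class InMemoryBus:
    """Stand-in for a message queue: fans each message out to all subscribers."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[bytes], object]):
        self.subscribers[topic].append(handler)

    def send(self, topic: str, payload: bytes):
        for handler in self.subscribers[topic]:
            handler(payload)
```

Because the sensor only knows about the topic, storage, monitoring, and real-time consumers can be added or removed without touching the sensor.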

Broadly, there are three main applications of this data that are relevant for hunting:


1. Data Sinks to Long-term Storage

Depending upon the use case, long-term storage such as HDFS can enable large-scale batch analysis to discover “what’s normal” for the network and identify historical trends. Alternatively, ingesting the data into an ELK (Elasticsearch, Logstash, Kibana) stack to perform searches and trend analysis is a simpler approach. This quickly enables searching for known IOCs using an open-source stack, while also supporting outlier detection for any deviations from the norm.
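Whatever the storage backend, the two batch questions are the same: which stored records match a known IOC, and which names have never appeared in the historical baseline? The sketch below shows both over plain Python lists. The function names and the `qname` field are assumptions for illustration; at scale the same logic would run as a batch job or a search query.

```python
from collections import Counter


def build_baseline(history):
    """Count how often each queried name appears in historical records."""
    return Counter(record["qname"] for record in history)


def match_iocs(records, bad_domains):
    """Return records whose name is a known-bad domain or one of its subdomains."""
    hits = []
    for record in records:
        qname = record["qname"]
        if any(qname == bad or qname.endswith("." + bad) for bad in bad_domains):
            hits.append(record)
    return hits


def never_seen(records, baseline):
    """Return the set of queried names that are absent from the baseline."""
    return {record["qname"] for record in records
            if record["qname"] not in baseline}
```

A name missing from the baseline is not malicious by itself. It is a lead that narrows what a hunter has to examine.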


2. Monitoring

Monitoring various statistics of the DNS traffic, such as the number of NXDOMAINs, the number of queries by type, the total number of queries, the number of queries by user, or the distribution of queried TLDs, can be immensely helpful for understanding hourly and daily trends. Monitoring applications like Graphite generate graphs and statistics for different data points, and allow us to proactively identify anything out of the ordinary.
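As a rough sketch of such a monitoring consumer, the code below counts the statistics listed above for one batch of records and formats them as Graphite plaintext lines (`metric.path value timestamp`). The function names, the record fields (`client`, `qname`, `qtype`, `rcode`), and the `dns.` metric prefix are assumptions for illustration. Sending the lines to a Graphite listener is left out.

```python
from collections import Counter

NXDOMAIN = 3  # DNS response code for "no such domain"


def summarize(records):
    """Count totals, query types, clients, TLDs, and NXDOMAINs for a batch."""
    stats = Counter()
    for record in records:
        client = record["client"].replace(".", "_")  # dots separate path parts
        tld = record["qname"].rstrip(".").rsplit(".", 1)[-1]
        qtype = str(record["qtype"])
        stats["queries.total"] += 1
        stats["queries.type." + qtype] += 1
        stats["queries.client." + client] += 1
        stats["queries.tld." + tld] += 1
        if record["rcode"] == NXDOMAIN:
            stats["responses.nxdomain"] += 1
    return stats


def graphite_lines(stats, timestamp, prefix="dns"):
    """Format counters as 'path value timestamp' lines, sorted by metric name."""
    return ["{}.{} {} {}".format(prefix, name, value, int(timestamp))
            for name, value in sorted(stats.items())]
```

Emitting one batch per minute or per hour gives the time series from which hourly and daily trends can be read.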


3. Real-time Threat Hunting

Real-time consumers process records as they arrive and detect threats in real time, continuously looking for malicious traffic patterns and performing outlier detection. Time-series analysis, using libraries like Kairos, facilitates the hunt, detecting unusual activity and any breakpoints or periodicity in the data.
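One simple form of streaming outlier detection is a rolling z-score over a per-interval count, such as NXDOMAINs per minute. The sketch below uses only the standard library and is a stand-in for a dedicated time-series library, not an example of any library's API. The class name and the default window, threshold, and floor values are assumptions that would need tuning per network.

```python
from collections import deque
from statistics import mean, pstdev


class RollingOutlierDetector:
    """Flag values that sit far from the mean of a sliding window."""

    def __init__(self, window=24, threshold=3.0, min_points=6, min_sigma=1.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold
        self.min_points = min_points
        self.min_sigma = min_sigma  # floor avoids division by zero on flat data

    def observe(self, value: float) -> bool:
        """Return True if `value` is an outlier relative to recent history."""
        is_outlier = False
        if len(self.history) >= self.min_points:
            mu = mean(self.history)
            sigma = max(pstdev(self.history), self.min_sigma)
            is_outlier = abs(value - mu) / sigma > self.threshold
        # Every value joins the window, so a sustained shift in level
        # eventually becomes the new baseline instead of alerting forever.
        self.history.append(value)
        return is_outlier
```

A z-score catches sudden spikes but not periodicity or gradual drift, which is where more capable time-series tooling earns its place.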



[Figure: Message queue]


Next Steps

Once the architecture is established and data is being collected, network defenders can conduct a wide range of analyses on this passive DNS data to hunt for unknown intrusions in networks. In our next post, I’ll describe how this architecture can be used to detect newly registered domains, fast flux techniques, domain generation algorithm (DGA) malware, and a variety of other indications of intrusion. Together, these posts will provide an overview of the power of open-source libraries and techniques for hunting on the cheap.