Reflections on Grizzly Steppe
On December 29, 2016, the United States Department of Homeland Security (DHS) and Federal Bureau of Investigation (FBI) released a joint analysis report (JAR) detailing, in their words, “tools and infrastructure used by the Russian civilian and military intelligence Services (RIS) to compromise and exploit networks and endpoints associated with the U.S. election, as well as a range of U.S. Government, political, and private sector entities”. While the report doesn’t name the DNC outright, its technical information clearly focuses on data associated with the DNC intrusion, which was publicly attributed to two different state-sponsored Russian hacking groups.
The report, which dubs the operation “GRIZZLY STEPPE”, has been consumed and analyzed by various members of the information security community at large. The report basically has three parts: 1) what I take as an official government statement that the tools and groups listed in the Alternate Names section are in fact RIS; 2) a set of indicators which FBI/DHS thought would help defenders detect some segment of the activity; 3) a set of guidelines to harden networks against RIS activity.
While the report has received some positive feedback, the majority of feedback has been negative. Many of the critiques are valid and focus on the indicators themselves. This is where the JAR fell short and could easily be improved. It lacked several key technical and contextual details, which left it vague and made it difficult for defenders to derive substantial value from it. To that end, and given that we are anticipating future JARs, I’ll walk through some reasons why the technical indicators were not useful and suggest small and hopefully achievable changes which would make the next public JAR more actionable for defenders.
Critiques of the Technical Indicators
While the report provides a clear statement of attribution, the technical information it included was of limited value. The most significant technical weaknesses were as follows.
Noisy IP addresses
The report included 876 unique IP addresses to search for as “indicators of compromise” for malware command and control (C2) or data exfiltration infrastructure. The problem is that a substantial number of the indicators belong to websites and online services that are used legitimately by a huge number of people around the world, such as websites and services owned by Yahoo, Google, Dropbox, Tor, and various cloud providers. This will create many false positives, and potentially panic, in organizations that look for connectivity to these IPs and jump to the conclusion that they’ve been hacked by the Russians. If the Russian-sponsored hackers are using potentially high-traffic IPs belonging to organizations like Yahoo for C2 or data exfiltration (which is completely feasible), then the report should include that as context (see below for more on that). We are left wondering whether FBI/DHS were aware that some of the indicators would be noisy and prone to false positives (FPs).
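To make the false-positive problem concrete, here is a minimal sketch of how a defender might separate indicators that fall inside known shared infrastructure from the rest before alerting on them. Everything specific in it is an assumption for illustration: the range labels are invented, the networks are reserved documentation ranges standing in for real provider ranges, and `split_indicators` is a hypothetical helper, not something the JAR provides.

```python
import ipaddress

# Hypothetical shared-infrastructure ranges. These are reserved documentation
# networks used as stand-ins; a real list would come from provider-published
# ranges or an internal allowlist.
SHARED_RANGES = {
    "example-webmail": ipaddress.ip_network("192.0.2.0/24"),
    "example-cloud": ipaddress.ip_network("198.51.100.0/24"),
}


def classify_indicator(ip_text):
    """Return the label of the shared range containing ip_text, or None."""
    ip = ipaddress.ip_address(ip_text)
    for label, network in SHARED_RANGES.items():
        if ip in network:
            return label
    return None


def split_indicators(ip_texts):
    """Split indicators into (dedicated, shared) lists.

    'shared' entries carry the label of the range they matched, so an analyst
    knows why a hit on that address is weak evidence on its own.
    """
    dedicated, shared = [], []
    for ip_text in ip_texts:
        label = classify_indicator(ip_text)
        if label is None:
            dedicated.append(ip_text)
        else:
            shared.append((ip_text, label))
    return dedicated, shared
```

A hit on an address in the shared list is not necessarily benign, since the text above notes that actors may well use such services. It simply needs additional context before anyone concludes there has been a compromise.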
Included only one Yara signature, not specific to RIS activity
The report includes a single Yara signature. Yara is widely used by security researchers to match byte sequences and strings in files. Yara signatures are often used to identify fingerprints of sorts for malware, and they are less brittle than unique hashes, filenames on disk, or other IOCs. The single Yara signature in the report appears to be for a PHP web shell (backdoor) likely used by an attacker once they’ve compromised an external-facing web server. The RIS actors certainly use a substantial number of tools, yet this signature was the only one included in the report. Additional signatures would be welcome. More problematic, the sample identified by this signature doesn’t necessarily implicate RIS actors specifically. It is used by a variety of actors and cybercriminals, and it is publicly available for anyone to download and use. If this signature hits on a file on your webserver, you should be concerned and remediate, but it wouldn’t be conclusive proof that you were hacked by the Russians.
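The claim that pattern-based signatures are less brittle than hashes can be illustrated with a small sketch. This is plain Python, not Yara, and it only approximates the idea with a single substring check; real Yara rules combine multiple strings and conditions. The marker bytes and sample contents below are invented placeholders and are not taken from the report's signature.

```python
import hashlib

# Hypothetical marker bytes standing in for a distinctive string in a tool.
MARKER = b"HYPOTHETICAL_SHELL_MARKER"


def sha256_hex(data):
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_hash(data, known_hashes):
    """Exact-match detection: any change to the file changes the hash."""
    return sha256_hex(data) in known_hashes


def matches_pattern(data, pattern=MARKER):
    """Pattern detection: hits as long as the marker bytes are present."""
    return pattern in data


# An invented sample and a trivially modified variant of it.
original = b"header " + MARKER + b" body"
variant = original + b" extra padding appended by whoever reused the tool"
known_hashes = {sha256_hex(original)}
```

Here the variant evades the hash check but still matches the pattern. The same property is why a pattern hit says only that a file resembles a known tool, not who deployed it.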
Lack of context
The majority of IP address indicators provided in the DHS report include nothing more than the sentence, “It is recommended that network administrators review traffic to/from the IP address to determine possible malicious activity.” The GRIZZLY STEPPE report does not give any sort of time window, description, or severity for these indicators, making them less actionable by security teams around the world. This is an especially big problem since, as mentioned above, many of the IPs seem to be shared by many users or subject to changing ownership over time. A hit outside of the time window when FBI/DHS believed the IP to be in use by the Russians would be a false positive.
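A short sketch shows how much a time window would change the handling of a hit. It assumes a hypothetical indicator record with `first_seen` and `last_seen` fields, which the JAR does not supply, and a hypothetical `assess_hit` helper; the field names and return labels are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class IpIndicator:
    """An IP indicator with an optional activity window (hypothetical fields)."""
    ip: str
    first_seen: Optional[datetime]
    last_seen: Optional[datetime]


def assess_hit(indicator, observed_at):
    """Classify a log hit relative to the indicator's activity window.

    Without a window, the hit cannot be ruled in or out; this is the
    situation defenders face when indicators ship without dates.
    """
    if indicator.first_seen is None or indicator.last_seen is None:
        return "unknown-window"
    if indicator.first_seen <= observed_at <= indicator.last_seen:
        return "in-window"
    return "out-of-window"
```

With a window, an analyst can discard out-of-window hits quickly. Without one, every hit lands in the unknown bucket and has to be investigated by hand.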
Areas of Improvement
The following are specific recommendations that would drastically improve the value of the technical details of the GRIZZLY STEPPE report.
The United States Intelligence Community (IC) has the resources and expertise to differentiate between high-value and useless indicators. All indicators handed to the public should be high value. The IC could prioritize the indicators, explicitly calling out those that will provide the most value and noting which high-value indicators are prone to false positives or sit on shared infrastructure. Future reports should also state which indicators may have been compromised infrastructure, that is, random servers on the Internet that were compromised and used as a means to an end towards the real target, such as the DNC. Such actor tradecraft is quite common. Conversely, reports should specify which indicators are believed to have been used solely by RIS. The list provided in the report contains noise and seems to lack curation, which undermines confidence in the entire list and has been used in attempts to refute the attribution. Confidence would increase with slightly more information.
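As a rough illustration of what that curation could enable, the sketch below orders indicators for review using two pieces of metadata the paragraph asks for: the kind of infrastructure and whether the indicator is prone to false positives. The category names, the `Indicator` fields, and the ranking are all assumptions made for this example, not a format used by the IC.

```python
from dataclasses import dataclass

# Hypothetical infrastructure categories; a lower rank is reviewed first.
INFRA_RANK = {
    "actor-exclusive": 0,          # believed to be used solely by the actor
    "compromised-third-party": 1,  # someone else's server used as a hop
    "shared-service": 2,           # large multi-tenant service
}


@dataclass(frozen=True)
class Indicator:
    value: str
    infrastructure: str  # one of the keys in INFRA_RANK
    fp_prone: bool       # flagged by the publisher as false-positive prone


def triage_order(indicators):
    """Sort indicators so low-noise, actor-exclusive ones come first."""
    return sorted(
        indicators,
        key=lambda i: (i.fp_prone, INFRA_RANK[i.infrastructure]),
    )
```

Even two extra fields per indicator would let a security team spend its first hours on the hits most likely to matter.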
Future reports would be greatly improved by including the time period during which the malicious activity took place and the lifespan of each indicator. Is the activity ongoing, or should security teams dig through historical netflow logs from 2012 to find malicious activity? Did attacks take place via that infrastructure for a year or a week? Which week? Will attacks be coming from a VPS rented from a cloud provider or from a compromised website? How many indicators are proxies or Tor exit nodes? The lack of context makes it difficult to interpret hits, sift out FPs, and discover actual evidence of RIS activity, which must have been the point of providing the IPs in the first place.
Actionize the nomenclature table
The alternate names table on page four of the GRIZZLY STEPPE report potentially confirmed years of claims from commercial threat intelligence providers, but in its current format it is completely useless and possibly even counterproductive from a technical standpoint. A few pieces of information from the IC could be a force multiplier on the value of the naming convention table. At a minimum, the additional information should: include a breakdown of which groups use what tactic or implant; cite existing commercial threat intelligence reports; provide insights on several key procedures each actor consistently employs; and link hashes and IPs to a group or toolchain. Stating that a given group named by industry is RIS is great. When possible, the report should also specify what the IC thinks commercial threat intelligence got right in those reports so we can more confidently use those IOCs.
To be very clear, I applaud the government’s efforts to get information about this activity into the unclassified realm. This report is probably the result of significant effort by many dedicated and talented individuals. The reception has been overly harsh and often focused on the attribution question, which I think is missing the point. The technical IOCs aren’t intended to provide evidence of attribution, but to enable detection of RIS activity.
Government sources need to be kept secret for various very good reasons. The IC can’t always reveal specifically how it came to a conclusion when attributing an event or series of compromises. The United States’ IC pours billions of dollars into cyber security research and development and is the most sophisticated in the world, but it is also, by nature, conservative about revealing sources. It is possible, for instance, that the sources of the information that led to the attribution claims are still being actively used. They may even include human sources who would be put in grave physical danger should the information come to light. In some cases, it is more important to continue to observe an adversary than to arm the public with the irrefutable proof used to declare with high confidence that it was Russian activity.
However, if the IC is going to publish a technical report specifically calling out state-sponsored hacking activity, the report needs to be actionable. The technical details need to include context and be resilient to false positives. The IC can empower the private sector further by citing or confirming specific reports and research already available in the open source. The GRIZZLY STEPPE report represents a welcome move by the IC to confirm and call out RIS hacking activity, but it lacked several key technical details needed to provide substantial technical value to the private sector. I look forward to the next JAR and hope it proves more actionable and better equips defenders against this global campaign.