Natural Language Processing in Security Tools

Security concepts, once explained, are not hard for the average employee to understand, and establishing policies that apply those concepts is not difficult. However, the challenge lies in interpreting, enforcing and evolving security guidelines in the context of daily business operations. As a result, organizations continue to increase their investments in cybersecurity. In a recent survey conducted by EY, 87 percent of enterprises stated that they require a 50 percent increase in their current cybersecurity budgets.1 According to Gartner, worldwide spending on cybersecurity will reach $98 billion in 2018.2

Cybersecurity has become a priority in the boardroom with the almost daily stream of news stories about recent breaches. Security teams are able to better justify the acquisition of new tools that improve the ability to keep up with the latest threats. Today’s security solutions are able to help us process large volumes of data, visualize information, detect patterns and manage users and assets. Nevertheless, for all the technological advancements made in cybersecurity, our security systems still don’t understand our security intentions.

Security tools are almost mechanical in their behavior, operating primarily on pre-established rules or decision trees. While the application of machine learning has unlocked greater analytical capabilities, the gap in comprehension remains. For example, a user behavior analytics solution can identify anomalous data access, but its assessment does not explain why the anomaly is suspicious. It is therefore up to security professionals themselves to infer the security significance of every event that security solutions report. This model is not sustainable as the number of security events requiring manual review increases, especially since the cybersecurity industry is already facing a massive skills shortage.

The first step to arriving at an answer to this seemingly intractable problem is teaching our security tools to understand us. Advancements in Natural Language Processing (NLP) present an opportunity.

The potential is considerable. Organizations would be able to make security more accessible to every employee through just-in-time guidance. If security professionals could simply speak to security tools, their training programs could focus on mastery of security principles and investigative techniques rather than skills particular to deployed solutions. Minimizing the training overhead would lower the skills barrier and broaden the potential talent pool for security teams. Security teams would be able to specialize in crafting policy, implementing procedures and establishing analytical approaches, while the security systems themselves translate security objectives and specific commands into application or system behavior. Reporting could be simplified by enabling management teams to review the organization’s security posture in real time on their own.

Before exploring the potential applications, let’s better understand NLP.

Teaching Machines the Language of Security

NLP enables computing systems and applications to interpret human language. A system enabled with NLP is first able to recognize proper nouns such as malware names and different types of IT assets in a body of text. Next, the system gleans associations and relationships among proper nouns. Subsequently, it is able to draw conclusions about the topic or meaning of specific messages.
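The first two steps described above, recognizing named terms and then associating them, can be sketched in a few lines of Python. This is a minimal illustration that uses fixed word lists and sentence-level co-occurrence in place of a trained model; the lists, the sample report and the function names `find_entities` and `relate` are assumptions made for this example.

```python
import re
from itertools import product

# Illustrative word lists (assumptions); a real system would learn or import these.
MALWARE = {"wannacry", "emotet", "notpetya"}
ASSETS = {"file server", "domain controller", "mail gateway"}


def find_entities(text):
    """Return (label, term) pairs for every listed term that appears in text."""
    lowered = text.lower()
    found = []
    for label, terms in (("MALWARE", MALWARE), ("ASSET", ASSETS)):
        for term in sorted(terms):
            # Word boundaries prevent matches inside longer words.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                found.append((label, term))
    return found


def relate(text):
    """Pair each malware name with each asset named in the same sentence."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = find_entities(sentence)
        malware = [term for label, term in entities if label == "MALWARE"]
        assets = [term for label, term in entities if label == "ASSET"]
        relations.extend(product(malware, assets))
    return relations


# Invented sample text, used only to exercise the functions.
report = "Emotet was found on the mail gateway. The domain controller is clean."
print(find_entities(report))
print(relate(report))
```

Sentence-level co-occurrence is a crude stand-in for relationship extraction, but it shows how recognized terms become linked facts that later steps can reason over.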

While NLP research began over half a century ago, the past decade has witnessed rapid advancement in the field. With machine learning, it became possible to teach a system to recognize a variety of different terms or messages as being similar. More advanced NLP also incorporates semantic comprehension, which involves defining a vocabulary, developing associations between terms and cultivating a knowledge base. It teaches a system about physical and abstract concepts. For example, advanced NLP analysis of a security report could resolve the specific malware being referenced, and it would know that a piece of malware has aliases, an ancestry, a threat classification, and applications or assets it typically targets.
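A knowledge base of this kind can be pictured as a set of records indexed by every name a piece of malware goes by. The sketch below is one possible shape, not a reference design; the record fields, the malware names (`ExampleLocker`, `ExampleCrypt`) and the helper functions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MalwareRecord:
    name: str
    aliases: set = field(default_factory=set)
    parent: Optional[str] = None          # ancestry: the variant this one derives from
    classification: str = "unknown"
    typical_targets: list = field(default_factory=list)


# Hypothetical entries; the names are invented for illustration.
KNOWLEDGE_BASE = [
    MalwareRecord("ExampleLocker", {"ExLock", "XLocker"}, "ExampleCrypt",
                  "ransomware", ["file servers"]),
    MalwareRecord("ExampleCrypt", {"ExCrypt"}, None,
                  "ransomware", ["workstations"]),
]


def build_alias_index(records):
    """Map every lowercased name and alias to its record."""
    index = {}
    for record in records:
        for label in {record.name, *record.aliases}:
            index[label.lower()] = record
    return index


ALIAS_INDEX = build_alias_index(KNOWLEDGE_BASE)


def resolve(mention, index=ALIAS_INDEX):
    """Return the record a mention refers to, or None if it is unknown."""
    return index.get(mention.strip().lower())


def ancestry(mention, index=ALIAS_INDEX):
    """Follow parent links from the resolved record up to the root."""
    chain = []
    record = resolve(mention, index)
    while record is not None and record.name not in chain:
        chain.append(record.name)
        record = resolve(record.parent, index) if record.parent else None
    return chain


print(resolve("xlocker").classification)
print(ancestry("ExLock"))
```

Once a mention resolves to a record, the system can answer follow-up questions about classification or typical targets without the analyst knowing which alias a given report happened to use.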

While generalized natural language understanding is the vision, the scope for NLP needs to be focused to address specific use cases. A security system powered by NLP can be incrementally taught to broaden its capabilities. Two primary categories of use cases will be covered in this article. The first is the use of NLP to interpret a user’s requests and perform actions on the user’s behalf. The second is the use of NLP to process information contained in textual documentation to either perform actions or make that knowledge accessible to users. Note that a security solution that supports both categories of use cases is also feasible today.

NLP for Improving Data Security

One of the first applications of NLP in security was improving data analysis. This is especially timely given the emphasis being placed on data security and privacy as a result of the European Union General Data Protection Regulation (GDPR) going into effect. With NLP, security teams can be proactively informed about the presence of sensitive data such as personally identifiable information (PII), monitor its use for compliance and generate alerts when that data is being mishandled.
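As a rough picture of what flagging sensitive data involves, the sketch below scans text for two common PII shapes. It uses plain regular expressions as a stand-in for a trained language model, so it only catches well-formatted values; the pattern set and the function name `scan_for_pii` are assumptions made for this example.

```python
import re

# Simplified patterns (assumptions); production detectors combine many more
# patterns with context-aware models.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_for_pii(text):
    """Return (label, value) pairs for every PII-like match in text."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
    return findings


# Invented sample text.
sample = "Contact jane.doe@example.com, SSN 123-45-6789."
for label, value in scan_for_pii(sample):
    print(f"ALERT: possible {label} found: {value}")
```

A language-aware detector would go further than this, for example by recognizing a name or an address written in free text where no fixed pattern applies.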

NLP is also being applied to combat phishing attacks. Wombat Security reported that 76 percent of respondents to its 2017 survey said their organization had experienced a phishing attack.3 With NLP, the accuracy of detecting phishing emails can be increased through content classification.
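Content classification can be illustrated with a small multinomial Naive Bayes classifier, one of the simplest text classification techniques. This is a sketch under strong assumptions: the six training messages are invented, the labels `phishing` and `legitimate` are chosen for this example, and a usable detector would need far more data and features.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented training messages, for illustration only.
TRAINING = [
    ("Verify your account now or it will be suspended", "phishing"),
    ("Urgent: confirm your password to avoid suspension", "phishing"),
    ("Click here to claim your prize", "phishing"),
    ("Meeting moved to Thursday afternoon", "legitimate"),
    ("Attached is the quarterly report for review", "legitimate"),
    ("Lunch on Friday to celebrate the release", "legitimate"),
]

classifier = NaiveBayes()
classifier.train(TRAINING)
print(classifier.predict("Urgent: verify your password now"))
print(classifier.predict("Quarterly report meeting on Thursday"))
```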

Interactive User Interfaces for Security Solutions Powered by NLP

NLP-powered user interfaces can be added to any existing security solution to provide an interactive user experience. The most common is a voice or chat interface that allows a user to pose inquiries or issue commands. The security solution then interprets the input and invokes the necessary actions, either responding with an answer or presenting the results of the tasks performed. While the experiences delivered today are rudimentary, the possibilities are numerous.

Reporting can be streamlined with NLP. In most security operations today, reporting and analytical processes are highly manual, and every new request by management requires security teams to allocate time and resources to deliver. Security professionals have to translate their questions into queries that security tools can handle, or navigate several screens to arrive at answers. With NLP, security professionals can focus on asking the right questions while the security tools automate the effort of assembling answers and generating reports. A related use case is initiating security analytics via an NLP-powered interface. Training the system to translate user requests into a tool-specific query language could mask the complexity of the underlying security solution.
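The translation from a user request to a tool-specific query can be sketched with pattern templates. Everything specific here is hypothetical: the query syntax (`event_type=... earliest=-24h`) is invented rather than taken from any real product, and hand-written patterns stand in for the trained language understanding the article describes.

```python
import re

# Hypothetical request patterns mapped to a made-up query syntax.
TEMPLATES = [
    (
        re.compile(
            r"failed logins? (?:for|by) (?P<user>\w+) "
            r"in the last (?P<n>\d+) (?P<unit>hour|day)s?"
        ),
        "event_type=auth_failure user={user} earliest=-{n}{u}",
    ),
    (
        re.compile(r"(?:show|list) (?:all )?alerts with severity (?P<sev>low|medium|high)"),
        "event_type=alert severity={sev}",
    ),
]


def to_query(request):
    """Translate a plain-language request into a query string, or None."""
    text = request.lower().strip()
    for pattern, template in TEMPLATES:
        match = pattern.search(text)
        if match:
            fields = match.groupdict()
            if "unit" in fields:
                # Abbreviate the time unit: "hour" -> "h", "day" -> "d".
                fields["u"] = fields["unit"][0]
            return template.format(**fields)
    return None  # the request is outside what the system has been taught


print(to_query("Show failed logins for alice in the last 24 hours"))
print(to_query("List all alerts with severity high"))
```

Returning `None` for unrecognized requests matters in practice: a system that admits it did not understand is safer than one that runs a guessed query and presents the result as an answer.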

NLP Facilitating Incident Response

NLP has an important role to play in security incident investigation and response management. Typically, threat hunters and forensic specialists have had to wade through a significant amount of security research and constantly monitor security news and bulletins to be effective. Language-capable security solutions can assume some of this burden by categorizing and summarizing research and news content. Additionally, these solutions can enable security specialists to perform semantic searches on ingested content. Semantic search is more powerful than the keyword-based search we are used to because it finds relevant content based on meaning rather than exact matches. More advanced solutions can present semantic links between suspicious incidents and content in the research or news, improving the productivity of security investigators. Given that enterprise security teams average 191 days to detect breaches and require up to 66 days to contain the risk, NLP could translate into tangible security value for organizations by improving the speed of investigations and response management.4
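The difference between keyword search and semantic search can be shown with a toy example. In this sketch a small hand-built table maps words to concepts and stands in for the learned representations a real semantic search engine would use; the table, the two sample documents and the function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hand-built word-to-concept table (an assumption standing in for a learned model).
CONCEPTS = {
    "ransomware": "ransomware", "ransom": "ransomware", "encrypted": "ransomware",
    "phishing": "phishing", "spoofed": "phishing",
    "passwords": "credential_theft", "credentials": "credential_theft",
}

# Invented sample documents.
DOCS = [
    "Attackers encrypted file shares and demanded payment in bitcoin.",
    "A spoofed invoice email tricked staff into entering passwords.",
]


def concept_vector(text):
    """Count the concepts mentioned in text, ignoring unmapped words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(CONCEPTS[t] for t in tokens if t in CONCEPTS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_search(query, docs):
    """Return documents that contain the query string verbatim."""
    return [d for d in docs if query.lower() in d.lower()]


def semantic_search(query, docs):
    """Return (document, score) pairs that share concepts with the query."""
    q = concept_vector(query)
    scored = [(d, cosine(q, concept_vector(d))) for d in docs]
    return sorted([p for p in scored if p[1] > 0], key=lambda p: p[1], reverse=True)


print(keyword_search("ransomware", DOCS))
print(semantic_search("ransomware", DOCS))
```

The keyword search returns nothing because neither document contains the word "ransomware", while the concept-based search surfaces the first document because it describes the same behavior in different words.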

Security Advisor for Employees Enabled by NLP

Perhaps the biggest transformation that NLP can bring to security practices is fostering broader awareness and participation across an organization. The typical user often does not understand security, so language-enabled security solutions can be deployed to explain the business context and implications of security policies and procedures. This will allow organizations to actively test the security knowledge of their employees, as well as offer in-the-moment advice on proper security hygiene with supporting justification and reasoning to encourage adherence.

From a security support standpoint, language-enabled security solutions can address frequently asked questions autonomously. Minimizing the manual effort security teams expend on keeping employees informed, enforcing policies, and monitoring compliance will allow more resources to be dedicated to core security activities such as threat monitoring.

Security solutions will progressively incorporate greater NLP capabilities. Some of these solutions have already been deployed to demonstrate their feasibility. When making security vendor decisions, security managers and CISOs should review vendor roadmaps to evaluate the priority placed on NLP. When evaluating an NLP-powered security solution, they should also examine its management and administrative requirements to verify that the projected efficiencies can be realized.

Alternatively, security organizations can look to vendors who specialize in NLP to overlay language capabilities on their existing security infrastructure. Thus far, security solutions have integrated NLP features to address specific security gaps. Moving forward, security organizations should consider developing a strategic perspective on how language capabilities can be employed to transform security practices in an integrated and comprehensive manner.


1. EY Study:

2. Gartner Study:

3. Wombat Study:

4. IBM Study: