The only thing worse than having a huge problem is having a huge problem and not realising it. Believe it or not, many organisations are in the latter boat right now. Specifically, many organisations are experiencing a proliferation of secrets at a scale and scope that outstrips the ability of the mechanisms and controls they have in place to keep those secrets protected.

If all that sounds like hyperbole to you, consider just how many secrets there are in even one application that might exist in an organisation’s environment. The application might have an account password it uses to authenticate itself to a backend database, it might have cryptographic keys used to encrypt application data at rest (e.g. in database tables or on the filesystem), and it might have API keys used to authenticate itself to external applications. The application might live on a web server that requires a TLS private key for its web UI (i.e. the private key corresponding to the X.509 certificate used for web interface TLS sessions). It might also use server certificates for web services the application provides -- and client certificates to authenticate itself to others. Maybe the host running the application has SSHD keys for system administration and automated processing, symmetric keys for encrypted files, and admin credentials for cloud environments. That’s just for one application: multiply it by dozens or hundreds of applications and network nodes and you start to get an idea of the potential scale of the issue and the “sprawl” that can occur when hundreds or thousands of secrets proliferate throughout the environment.

The truth is that most organisations are bursting at the seams with secrets, the exposure of any one of which could directly and adversely impact the security of the organisation. As applications become more modular and incorporate more underlying services, secrets compound. As organisations become more interdependent on business partners and external services (e.g. cloud), secrets compound. As organisations add new layers of virtualisation and containerisation or add more devices through mobile and IoT, secrets compound. Uncontrolled, this situation can introduce significant risks. 

Enter Secrets Management

They say that the first step to solving a problem is realising that there is one. This is good advice as many organisations don’t stop to question how they handle secrets.

There are a few reasons why this is the case. First, the secrets required to keep the organisation humming can become so entrenched that they are effectively “invisible”: like plumbing, they are out of sight and out of mind unless something truly catastrophic happens. Consider SSH keys. Whether they realise it or not, almost every organisation uses SSH (usually extensively), as SSH is ubiquitous for system administration (including for cloud IaaS services) as well as for automated batch transactions and file transfers. Only rarely, though, do organisations stop to examine in detail how they secure these keys.

The second reason is that secrets can be deeply embedded in application logic. And, as we all know, security teams are often so busy “putting out fires” that delving into the weeds of applications isn’t feasible with the time and resources available. Unless a security team is systematically deconstructing and analysing every application in their environment (e.g. using a technique like application threat modeling), they may not know when an application creates or uses a secret – or when an application is architected in such a way that a secret is required for it to function.

The point is, recognising that secrets need to be managed is the first step on the pathway. That’s a good starting point, but since “wishing doesn’t make it so”, further action is required once we realise that the problem exists. That further action is secrets management: the systematic process of identifying, tracking, protecting, and monitoring the lifecycle of secrets throughout the environment. 


So how does a practitioner begin to get a handle on – and subsequently start to manage – the secrets that they might have within their organisation? A good starting point, upon realising that the problem exists in the first place, is discovery. Meaning, start by understanding what secrets there are, where they’re located, and what their purpose is. This is harder to do than it sounds. Why? Because there are multiple different types of secrets – and multiple layers of the stack at which they might be used.

A useful way to approach it is to look for secrets by type. For example, start with a subset of secrets like TLS certificate private keys or SSH host keys. If you already have a product in place that tracks these things (e.g. a commercial tool like Venafi for TLS certificates or SSH Communications Security’s Tectia for SSH keys), start there, since it makes the discovery process easier for these types of secrets. If you don’t have a commercial tool in place, you can canvass the environment for running services, either manually or with scripts, or look at vulnerability scanning output to identify where TLS and SSH are running. This will give you a starting point, but note that you’ll need to do a little more digging, because just knowing where a service that uses a secret is running won’t tell you exactly where the secret is: for example, knowing TLS is running won’t tell you where the private key lives on the server.
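As an illustration of the scripted approach, here is a minimal Python sketch that checks whether a port answers with an SSH identification banner. The function names `classify_banner` and `probe` are hypothetical, chosen for this example. It assumes you are authorised to probe the hosts in question, and it only recognises SSH; detecting TLS would need a handshake attempt, which is left out here.

```python
import socket


def classify_banner(banner: bytes) -> str:
    """Classify a service from the first bytes it sends after connect."""
    # An SSH server announces itself with a line that starts with "SSH-".
    if banner.startswith(b"SSH-"):
        return "ssh"
    return "unknown"


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Connect to host:port and report 'ssh', 'unknown', or 'closed'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            try:
                banner = conn.recv(64)
            except socket.timeout:
                # The service is waiting for the client to speak first
                # (as a TLS listener would), so there is no banner to read.
                banner = b""
    except OSError:
        return "closed"
    return classify_banner(banner)
```

A result of "ssh" tells you only that a host key exists somewhere on that machine; locating the key file itself still requires looking at the server's configuration.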

Once you have a handle on the more easily gathered information, progress to the harder-to-find secrets like encryption keys, database accounts, and API keys in the application source code. A useful strategy to find these is using a systematic process like application threat modeling to decompose how the application actually works and to look in depth at the individual components and interface points of the application. It bears saying that threat modeling will only get you so far, though: since the application threat modeling process uses a data flow diagram (DFD) as its primary starting point to decompose the application, secrets that are used entirely within the boundary of one component or module may not be immediately apparent just by looking at the DFD. However, threat modeling can give you a pretty substantial runway to find secrets within a given application, giving you a scaffolding upon which to build your knowledge of secrets the application might use down the road.
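Alongside threat modeling, a simple text search over source code can surface some candidates. The sketch below is one possible approach, not a complete scanner: the pattern list and the function name `find_candidate_secrets` are assumptions made for this example, and matches are only candidates that need human review (expect both false positives and misses).

```python
import re

# Illustrative patterns only; a real scan would need a broader, tuned set.
PATTERNS = {
    # PEM header that precedes an embedded private key.
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A credential-like name assigned a quoted literal of 4+ characters.
    "credential_assignment": re.compile(
        r"""(password|passwd|api_?key|secret|token)\w*\s*[:=]\s*["'][^"']{4,}["']""",
        re.IGNORECASE,
    ),
}


def find_candidate_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Running this over each file in a repository gives a list of places to inspect by hand; it will not find secrets that are assembled at runtime or stored in binary form.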

The point with this isn’t to try to “boil the ocean” from the get-go. Instead, slowly build out an inventory of secrets as you learn and expand your knowledge. Just like you would approach an asset or application inventory, start with what you know, build on it with information that you can obtain relatively easily, and only once your efforts have started to bear fruit move on to the much harder to obtain information that might be more hidden from direct view. 
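To make the idea of an inventory concrete, here is a minimal sketch of what one record might hold. The field names and the `by_kind` helper are assumptions for illustration rather than a standard schema. Note that the record describes the secret (what it is, where it lives, why it exists) and never stores the secret value itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SecretRecord:
    """Metadata about one secret; the secret value is deliberately absent."""
    kind: str                       # e.g. "tls-private-key", "ssh-host-key", "api-key"
    location: str                   # host and path, repository and file, and so on
    purpose: str                    # what the secret is used for
    owner: str = "unknown"          # filled in as you learn more
    expires: Optional[date] = None  # None when unknown or not applicable


def by_kind(inventory: list[SecretRecord]) -> dict[str, list[SecretRecord]]:
    """Group records by secret type, mirroring discovery by type."""
    grouped: dict[str, list[SecretRecord]] = {}
    for record in inventory:
        grouped.setdefault(record.kind, []).append(record)
    return grouped
```

Starting with a handful of fields like these keeps the inventory cheap to begin and easy to extend as harder-to-find secrets come to light.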

Controlling Secrets and Addressing Low Hanging Fruit

As you start to build out your knowledge of where secrets exist in the environment, you can begin to address some of the areas that are especially problematic. For example, if you encounter situations where API keys, encryption keys, or middleware usernames/passwords are hardcoded into application source code, you can start to weed them out and move them to more tightly controlled locations. You can start to track when keys and certificates expire so that you’re not caught out when an expired certificate causes web services to fail unexpectedly (and sometimes silently). Where they’re not already in place, you can begin to enforce robust best practices for things like application logins and keys (e.g. ensuring that encryption keys rotate and that user IDs and passwords aren’t shared across hosts).
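Expiry tracking for TLS certificates can be sketched with Python's standard library alone. The function names below are hypothetical. One assumption to flag: `fetch_not_after` relies on the default trust store, so it only works for certificates that validate against it; internally issued certificates would need a different approach.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate presented by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` (timezone-aware) and a notAfter string."""
    # cert_time_to_seconds parses strings such as "Jan  5 09:34:43 2018 GMT".
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400
```

Run on a schedule against the hosts in your inventory, a check like this lets you raise an alert when the remaining days fall below a threshold of your choosing.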

There are commercial tools, and also open source tools like Vault and Confidant, that can assist by giving you a secure location in which to store secrets as you learn about them, helping you bring them under control. There are also specialised tools that you can employ depending on the type of platform where the secret is used; for example, in a Kubernetes environment you might employ the Kubernetes secrets interface, while for Docker Swarm you might employ Swarm’s own secrets support. Deployment architecture might also play a role: for example, within a cloud environment, you might leverage the native secrets management capability provided within the environment (e.g. Google Cloud Secret Manager). The point is, customise your approach to the type of secret you identify, all the while ensuring that each secret is used in a way that conforms to best practices and is stored in a reasonable location (e.g. not on GitHub or in your source control system).
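From the application's side, moving a secret out of source code often means reading it at runtime from wherever the platform places it. Docker Swarm, for instance, mounts secrets as files under /run/secrets, and Kubernetes can expose secrets as mounted files or environment variables. The sketch below assumes that kind of setup; `load_secret` is a hypothetical helper, and the file-then-environment lookup order is a design choice for this example, not a convention of either platform.

```python
import os
from pathlib import Path


def load_secret(name: str, secrets_dir: str = "/run/secrets") -> str:
    """Read a secret from a mounted file, else from the environment."""
    path = Path(secrets_dir) / name
    if path.is_file():
        # Mounted secret files often end with a newline; strip it.
        return path.read_text().strip()
    value = os.environ.get(name.upper())
    if value is None:
        raise KeyError(
            f"secret {name!r} not found in {secrets_dir} or environment"
        )
    return value
```

Failing loudly when the secret is missing is deliberate: a silent fallback to a default value is how hardcoded credentials tend to creep back in.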


Once you’ve identified where secrets are and how they’re used and have started to get some road behind you in moving them toward best practices, the next step is to begin to monitor access.  In an ideal world, you’d have one secrets management system that would give you an extensive audit trail every time a secret is used. In the real world though, it’s likely that: 1) you won’t have one system that works for every potential secret in scope and 2) there might be secrets that you just can’t manage centrally using a system at all. Therefore, you’ll need to do some legwork to ensure that you’re able to monitor access consistently.

You can do this by working again with the inventory that you’ve collected about what secrets are in use and establishing monitoring mechanisms for each secret as you discover it and add it to the list. If you discover a database username/password combination used by applications to access backend data stores, ensure that the underlying database system is collecting logs. If you discover that system administrators are employing SSH user keys to gain access to remote devices for system administration, track their use. The point is, ensure that you have a way to know when secrets are employed and by whom. You can use log aggregation tools, SIEM tools, and correlation tools to assist with this.
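For the SSH case, one low-cost starting point is the server's own authentication log. The sketch below counts successful public key logins per user and key fingerprint. It assumes OpenSSH's usual "Accepted publickey for ..." message format; the function name `key_usage` and the sample values in the usage note are invented for illustration, and log location and format vary by system and configuration.

```python
import re
from collections import Counter

# Matches OpenSSH's success message for public key authentication.
ACCEPTED = re.compile(
    r"Accepted publickey for (?P<user>\S+) from (?P<source>\S+) "
    r"port \d+ ssh2: (?P<key_type>\S+) (?P<fingerprint>\S+)"
)


def key_usage(log_lines: list[str]) -> Counter:
    """Count successful logins per (user, key fingerprint) pair."""
    usage: Counter = Counter()
    for line in log_lines:
        match = ACCEPTED.search(line)
        if match:
            usage[(match["user"], match["fingerprint"])] += 1
    return usage
```

A fingerprint that shows up for a user or from a source you don't expect, or a key in the inventory that never appears at all, are both worth a follow-up.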

As you can imagine, this can be a time-consuming endeavour. As a result, the sooner you begin, the easier it will ultimately be. As with anything, realising that there’s a potential issue is a useful first step, and using a workmanlike approach to finding secrets in the environment, taking steps to control and organise how they’re used, and ultimately monitoring that usage over time can provide tremendous value in closing a potential truck-sized hole present in many of today’s environments.