An introduction to risk

Callam McMillan 13/05/2020

I started this as a brief introduction, but in making sure I explained the fundamentals, I ended up writing pretty much a chapter of a book on Information Security. If you read this and you are saying “but you’re telling me stuff I already know”, lucky you! Risk management is one of the very first things I teach my junior analysts at the start of their careers; and it’s something you should ensure your stakeholders understand. By giving them an appreciation of risk and its management, you’re much more likely to be able to deliver your security messages effectively.

You might not think about it, but if you operate a motor vehicle, you’ll be performing a risk assessment, exercising risk judgement, and conducting risk management every time you turn the wheel. That’s how we avoid ending up in the situation that the driver in the picture found themselves in. The type of risk management we do when driving is instinctive – we know that having an accident would be bad, so we act accordingly. When we talk about risk as it relates to information security, we cannot rely on a vague definition of “bad”; we need something more.

What is risk

First we need to define what a risk is. Let’s start with the premise that it is the chance of something happening. Now, as nobody ever talks about the risk of winning the lottery, there is an implicit expectation that a risk is the chance of an undesirable outcome. Great, we can use this as our definition, but how do you define undesirable? Remember, at this point we need to be more specific than “bad”. Let’s start with what makes it undesirable – what impact will crashing your car have on your life? We can group the impact into 5 categories:

Financial: This is going to cost you money to fix;
Legal: There could be police involvement or points on your licence;
Operational: How are you going to get to work without a car;
Reputational: How are people going to look at you, especially if it was your fault; and
Safety: You could be hurt or killed as a result.

Obviously the impact categories you choose are going to correspond to the type of risk that you are assessing. For instance, Safety may not be relevant for a pure information risk, while political impact may be a factor. Knowing what the impact is provides half the risk, but we also need to know how big that risk is – or what the likelihood (or probability) of the risk is. This is expressed as 1/[interval], such as:

The probability of a car accident could be accidents per year, or accidents per million miles driven;
The probability of inheriting a disease could be sufferers per million people;
The probability of a website transaction failing could be failed transactions per billion; and so on.

At this point we know that a risk is the probability and impact of an event. As we’re going for specific, we also need a way to classify the impact and probability…

To measure or to feel

Defining the probability and impact of a risk is half the challenge, but in order to be useful, there needs to be a way to measure it. To do this, there are two methodologies, both valid, and both useful. We’ll look at each separately.

Going by feel

Qualitative measurement is effectively going by feel: rather than measuring the as-is against a defined set of criteria, you’re instead judging it relative to your experiences. Going back to the opening example, this is what you’re probably doing when you drive, and it’s why inexperienced drivers are more likely to have accidents – their frame of reference against which to compare the as-is risk is too small to make effective judgements. This also highlights the weaknesses of qualitative assessment:

You have to be skilled in what you’re measuring to make effective qualitative measurements; and
Two assessors could come to two different conclusions.

The advantage however is that you can measure things which would be difficult if not impossible to measure quantitatively; or where the thing being measured is of limited value. Take for instance calculating the risk of losing sentimental family photos on your computer. There’s no financial impact, nor any legal, operational, or reputational impacts; and you’re probably not going to be scouring the Internet for the failure rates of data storage devices. So in this case you’ll go by feel – “losing photos that cannot be replaced would be bad; I would feel terrible; and I know that I have to replace a hard drive every 5 – 7 years.” Actually, for many risks, that’s a good enough measurement. The other advantage of qualitative measurement is that it is cheaper to perform.

You can make a qualitative assessment into a semi-quantitative one by grouping those relative risks. For instance, you could score “That’s likely to be really bad” as a high, “I don’t feel right about this” as a medium, and “It should be fine” as a low, and then use those groupings going forward.
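As a rough illustration of that grouping, here is a minimal Python sketch. The three statements and their ratings come from the paragraph above; the example risk names, the numeric ranks, and the function names are assumptions made up for the sketch, not part of any standard.

```python
# Map gut-feel statements to coarse ratings (the three groupings described above).
RATINGS = {
    "That's likely to be really bad": "high",
    "I don't feel right about this": "medium",
    "It should be fine": "low",
}

# Ranks exist only so ratings can be ordered; they are not measurements.
RANK = {"low": 1, "medium": 2, "high": 3}


def rate(feeling: str) -> str:
    """Return the coarse rating for a recorded gut-feel statement."""
    return RATINGS[feeling]


def highest_first(assessments: dict[str, str]) -> list[str]:
    """Sort risk names so that the highest-rated risks come first."""
    return sorted(
        assessments,
        key=lambda name: RANK[rate(assessments[name])],
        reverse=True,
    )


# Hypothetical assessments, recorded as the assessor's own words.
assessments = {
    "Laptop stolen from car": "I don't feel right about this",
    "Family photos lost to disk failure": "That's likely to be really bad",
    "Printer runs out of toner": "It should be fine",
}

print(highest_first(assessments))
```

The ratings are still only as good as the assessor’s judgement; all the grouping buys you is a consistent vocabulary for comparing risks.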

By the numbers

The alternative to qualitative measurement is quantitative measurement, which, as the name suggests, involves quantifying risk in absolute terms: pounds and pence, dollars and cents, or even lives lost. You should hopefully understand why this would be impractical to do on a day to day basis: you wouldn’t sit at a junction and, while you consider whether the gap is big enough to pull out into, be thinking “If I pull out now and that Ferrari t-bones me, it’ll cause £15,000 of damage, I’ll be without a car for 17 days, and have 6 points on my licence.” However, this is what you would be doing with quantitative assessment. So where might you find an example of this?

Formula 1 is a great example of where quantitative risk assessment reigns supreme. Every decision is based on calculated risk; distances are measured to the millimetre and beyond, times to the millisecond. Being able to lap a couple of tenths of a second faster is the difference between winning and being just another driver. You simply cannot get that level of precision based on feel alone. Insurance is another industry where quantitative risk assessment rules; and even if you don’t know much about insurance, you’re likely to have heard of actuarial tables (for instance, the ones that predict how long you’re likely to live for a given set of conditions).

The advantage of quantitative risk assessment is that 10 people can assess the same data and come to the same conclusion. This means your assessment has become repeatable, which further means your risk can be tracked over time to assess whether the risk itself has changed rather than your perception of it. The downsides are that it’s expensive to define a measurement scale for every aspect of a risk, collect the data, and analyse it; and, as in the case of the sentimental photos, it would undervalue certain types of risk.

Understanding yourself and the enemy

What we have looked at thus far is a risk itself and the output from it – the impact. But a risk doesn’t magically turn from the theoretical to the practical. So now let us consider the full definition of a risk, and look at the input side of it:

A risk is the probability and the impact of an uncontrollable threat exploiting a controllable vulnerability.

Behold the uncontrollable threat

The flood above is a perfect example of an uncontrollable threat, and when it comes to information security should still be a consideration. As the name suggests, a threat is anything that threatens what you are trying to protect. The uncontrollable part comes from the fact that you cannot change the nature of a fire or a flood, nor can you stop a criminal from being a criminal; all you can do is minimise the harm they can cause to you.

Threats come in two classes – environmental and manmade. An environmental threat is the weather, fire, pandemics (thank you Covid-19) etc. Manmade threats are just that and include anything from negligence and targeted malicious acts through to war and civil disorder; and collateral damage from a threat targeted elsewhere. It is also possible to have a manmade environmental threat, for instance if you have a toilet above your data centre and someone causes the toilets to overflow, you could end up with a flood in your computer room.

What types of threats you consider depends on what you’re protecting, and conditions where you happen to be. If you’re assessing a risk in California, earthquakes will be a much greater threat than if you’re assessing a risk in London. Also, just because a threat seems incredibly remote, it doesn’t mean you shouldn’t consider it. Take for instance the threat of lightning hitting your data centre: The probability of this is negligible to the point of non-existent, however the impact could be the same as a fire that destroyed the building, so you could include it in that class of threat. This is why insurance policies sometimes have weird wording in the small print about things such as nuclear contamination – while it’s extremely unlikely that you’re going to be claiming on your car insurance because it’s now radioactive; if they don’t consider that threat, the impact could far outweigh what they predicted (and priced for).

Beware the controllable vulnerability

A risk cannot be realised without both a threat and a vulnerability. Even with the flood above, if you’re situated at the top of a large hill, you’re not vulnerable to being flooded. The vulnerability is controllable because you have a choice where you’re located, or whether to have adequate flood defences.

For each threat, there could be multiple vulnerabilities. Even if your building is at the top of the hill, if the only road to it runs past that river, then that is a vulnerability because people and goods will not be able to reach your building.

If you’re considering security risk, you may wish to evaluate each threat-vulnerability pair against each component of the CIA triad (Confidentiality, Integrity, Availability). For instance, a flood that damages your data centre will have a much greater impact on the availability and integrity of your data than its confidentiality, whereas an employee losing an unencrypted copy of your data is going to be a threat to the confidentiality of your data, but won’t affect its integrity or availability.

Identifying threats and vulnerabilities

Being able to identify threats and vulnerabilities is where the real skill of the security professional comes in. Fortunately however there are tools to help you. One such tool is a threat catalogue, such as that included in the appendix of ISO27005 or the BSI one (link) which contains a list of 48 elementary threats. Starting with a catalogue such as this, you can then add additional threats unique to your situation.

Vulnerabilities are identified through experience, healthy cynicism, and use of tools and frameworks. As a vulnerability is a weakness, you’re looking for gaps in your people, processes, and technology that a threat could exploit – if there’s no gap, there’s no vulnerability. Experience will help you spot commonly seen vulnerabilities, and tools and frameworks will further guide you. These include frameworks for development security such as the OWASP Top 10, and vulnerability scanners that look for system-level vulnerabilities.

The healthy cynicism, however, is your most powerful asset: the ability to question what would happen if anything and everything goes wrong, without appearing to be negative, will help you to collect the most accurate picture of your vulnerabilities. This will generally be collected as part of a business impact assessment.

Starting the paperwork

The risk management process relies on two key documents: Business Impact Assessments (or BIAs) and Risk Registers. In short, the BIA is used to identify risks, which are then stored on and tracked using the risk register.

The BIA is a tool containing a methodology for identifying the threats, vulnerabilities, and the likely impact from them. If your organisation doesn’t have its own BIA template, then there is a wide range to choose from. There is an irony that when working with smaller businesses, they are more likely to resist conducting a thorough BIA, despite being the most likely to suffer major business disruption as a result of major risks occurring.

Analysis of the BIA may result in the identification of one or more risks. Each risk will have one or more threats exploiting a vulnerability, which will have an impact and a probability. These risks are fed into your risk register. At this point it is far too easy to say that you have assessed your risks and understand them, and then forget all about them and go on your merry way! Doing this though misses the point that your risk register is a living document that needs to be tended regularly. The risk landscape can change rapidly, and if you don’t review it often enough, the risks contained within can be inaccurate to the point of being useless. Ideally you should be reviewing your risk register on a monthly basis, especially for major risks.
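To make the idea of a living register a little more concrete, here is a minimal Python sketch of a register entry and a check for overdue reviews. The field names, the example entries, and the 30-day threshold (standing in for “monthly”) are all assumptions made for illustration; a real register will usually carry more detail, such as an owner and the chosen treatment.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Risk:
    """One entry in a (hypothetical) risk register."""
    name: str
    threat: str
    vulnerability: str
    impact: str          # e.g. "high", or a monetary figure if assessed quantitatively
    probability: str
    last_reviewed: date


def overdue(register: list[Risk], today: date, max_age_days: int = 30) -> list[Risk]:
    """Return the risks whose last review is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [risk for risk in register if risk.last_reviewed < cutoff]


# Example entries, invented for the sketch.
register = [
    Risk("Data centre flood", "river flooding", "ground-floor server room",
         "high", "low", date(2020, 3, 1)),
    Risk("Lost laptop", "employee negligence", "unencrypted disk",
         "medium", "medium", date(2020, 5, 1)),
]

for risk in overdue(register, today=date(2020, 5, 13)):
    print(f"Review overdue: {risk.name}")
```

Even a spreadsheet with these columns and a “last reviewed” date does the job; what matters is that something prompts you when an entry has gone stale.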

Managing risk

So, we’ve learnt what risk is; we’ve learnt how to identify, collect and catalogue it; and now we must do something with it. This is where risk management comes in. The first fallacy we need to deal with is the concept of “zero risk”. In a conservative company, it may be tempting to state that no risk is tolerated; however when you look at the threats and vulnerabilities, there will always be an inherent risk, even if it is small and well controlled. What needs to be defined therefore is risk appetite.

Risk appetite is, simply put, how much risk the company is willing to take when conducting a certain activity, and is best defined in absolute terms such as the size of financial loss it is willing to tolerate. All your risk management activities are going to be with respect to the risk appetite, with the target being to bring risks in at or under the risk appetite.

The second fallacy to deal with is that you can dismiss a risk. Once you’ve identified a risk and are aware of it, you cannot just ignore it and pretend it doesn’t exist. In this case, being ignorant of the risk is better than actively ignoring it – especially if your company is ever audited over it. That said, you do not have to actively manage every risk.

If a risk is within the risk appetite of the company, then it can be accepted. This means the company has considered the risk (as opposed to ignoring it) and has actively decided that it is acceptable.

There are four valid and accepted risk management techniques that can be employed either singly, or in various combinations to reduce the level of risk below the risk appetite; in the case of driving, these are:

Risk Avoidance: Reducing the risk by stopping the risky activity. If the roads are covered in snow, you may choose not to drive your car, thus avoiding the chance of having an accident. In this case you have reduced the probability to zero, but the potential impact remains the same;
Risk Deterrence: Warning the threat not to attempt to exploit the vulnerability. Obviously this only works on manmade threats where there is a degree of free will. While nothing will stop you from drinking excessively then driving, there is the deterrent of getting stopped by the police. This reduces the probability, but again leaves the impact unchanged;
Risk Mitigation: Putting measures in place to reduce the impact and probability of an accident. Controls to reduce the impact could be safety systems such as airbags and seat belts to lessen the chance of injury; while controls to reduce the probability include better lights and anti-lock brakes to help avoid an accident in the first place; and
Risk Transference: Moving some of the impact to a third party, often in exchange for a fee. The most common example of this is insurance, where in return for a fee, the insurer will cover the financial impact of having an accident, and provide support to limit the legal and operational impact. In this case, it reduces the impact, but not the probability of having an accident in the first place.
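The four techniques above differ in which side of the risk they reduce, and that can be summarised in a small lookup. This is only a sketch of the list as written; the dictionary and function names are made up for illustration.

```python
# Which side of the risk each technique reduces, as described in the list above.
TREATMENT_EFFECTS = {
    "avoidance": {"probability"},
    "deterrence": {"probability"},
    "mitigation": {"probability", "impact"},
    "transference": {"impact"},
}


def factors_reduced(treatments: list[str]) -> set[str]:
    """Return the union of risk factors reduced by a combination of techniques."""
    reduced: set[str] = set()
    for treatment in treatments:
        reduced |= TREATMENT_EFFECTS[treatment]
    return reduced


# Deterrence alone leaves the impact untouched; adding insurance covers both sides.
print(sorted(factors_reduced(["deterrence"])))
print(sorted(factors_reduced(["deterrence", "transference"])))
```

This is handy when combining techniques: if the combination you have chosen leaves the impact untouched, that is worth knowing before you compare the result to your risk appetite.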

Each risk management option should be evaluated against the risk, using the sums below to work out whether the reduction is worth its cost. Whatever risk is left over after this activity should be accepted.

Doing the sums

Just because you can reduce the risk, it doesn’t mean you should. It’s unlikely that you would spend £10,000 in order to not risk having to pay £1,000! But how we come to that decision depends on how we calculated our risk in the first place. If we did it qualitatively, then we have to use feeling to determine whether it is worth reducing the risk. Assuming that you chose not to back up those sentimental photos to the cloud (which would be a potential risk mitigation), then you need to determine whether the emotional distress the loss of those photos would cause outweighs the cost of buying a storage device and somewhere to store it.

For quantitative assessment, the methodology is a little more complex, but gives a clearer result. You will need to know the following terms and formulae:

AV (Asset Value): The value of the asset(s) covered by the risk. This is calculated as the cost of returning the asset(s) to the pre-event state and covering any associated costs (legal, regulatory etc.) A value should be assigned to reputational damage in terms of potential lost business;
EF (Exposure Factor): What percentage of the asset(s) would be impacted if the risk occurred;
SLE (Single Loss Expectancy): How much value would be at risk given a single event causing a loss. This is calculated by multiplying the AV by the EF (SLE = AV x EF);
ARO (Annualised Rate of Occurrence): How many times this event is likely to happen in a year – the probability of the risk. It is calculated as 1 / years between occurrence. If you had one event every 6 months, it would be 1 / 0.5 = 2 while if you had one event every 50 years, it would be 1 / 50 = 0.02; and
ALE (Annualised Loss Expectancy): How much is this risk going to notionally cost per year. It is calculated by multiplying the SLE by the ARO (ALE = SLE x ARO).

If you were not legally required to carry car insurance, you could use this method to work out if it was worth obtaining it. We use the £15,000 figure from earlier as the cost of the damages in an accident, and because that is the full cost incurred, we use 100% as the exposure factor. The SLE is therefore £15,000 (100% of the AV). You reckon that you are likely to have one accident every 20 years, so your ARO is 1 / 20 = 0.05. This means your ALE is £15,000 x 0.05 = £750 per year.
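The car insurance sums above can be sketched in a few lines of Python. The function names are my own and purely illustrative; the figures are the ones from the worked example.

```python
def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """SLE = AV x EF, with the exposure factor given as a fraction (1.0 = 100%)."""
    return asset_value * exposure_factor


def annualised_rate_of_occurrence(years_between_events: float) -> float:
    """ARO = 1 / years between occurrences."""
    return 1 / years_between_events


def annualised_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE x ARO: the notional cost of the risk per year."""
    return sle * aro


# Figures from the car accident example: £15,000 damage, 100% exposure,
# one accident expected every 20 years.
sle = single_loss_expectancy(15_000, 1.0)
aro = annualised_rate_of_occurrence(20)
ale = annualised_loss_expectancy(sle, aro)

print(f"SLE £{sle:,.0f}, ARO {aro}, ALE £{ale:,.0f} per year")
```

Changing the inputs shows how sensitive the result is to your estimate of the ARO: halving the expected time between accidents to 10 years doubles the ALE.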

Now, should you insure your risk? Information Security books will tell you that if the cost of the control is less than the ALE, then yes. However this is misleading. Let’s consider some different scenarios:

The insurance is £1,000 per year. Textbook logic says that you shouldn’t spend this, and if you didn’t, by year 15, you would have saved the AV in premiums, meaning over the 20 year cycle, you would have saved at least £5,000, even if you had an accident. However that assumes that you don’t have an accident until at least year 16 and that you did nothing with the money you weren’t spending; what happens if you have an accident in year 1? If you have enough cash in the bank to pay the £15,000 without impacting anything else, then you can choose to forgo insurance and do that. If not, then the £250 above the ALE is the cost of reducing the risk of paying £15,000 that you don’t have available.

The insurance is £100 per year. In this case, you should purchase your insurance. Even if you could invest that £100 every year and make 10% per year, over the 20 year period, the £2,000 spent would have only grown to around £6,300, significantly less than the money at risk.
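If you want to check the £6,300 figure for yourself, here is a minimal Python sketch. It assumes the £100 is invested at the start of each year and that the 10% return compounds annually; those assumptions are mine, and different timing gives a somewhat different number.

```python
def future_value(annual_payment: float, rate: float, years: int) -> float:
    """Value after `years` of investing `annual_payment` at the start of each year."""
    total = 0.0
    for _ in range(years):
        # Add this year's payment, then let the whole pot grow for a year.
        total = (total + annual_payment) * (1 + rate)
    return total


# Scenario from the text: £100 a year at 10% for 20 years, against a £15,000 loss.
invested = future_value(100, 0.10, 20)

print(f"Invested premiums after 20 years: £{invested:,.0f}")
print(f"Shortfall against a £15,000 loss: £{15_000 - invested:,.0f}")
```

Even with a generous 10% return, the pot falls well short of the £15,000 at risk, which is why the cheap insurance wins in this scenario.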

There is a third scenario, which is often seen where a company is found guilty of some form of financial wrongdoing and has to pay a penalty. While the penalty is significant, the company is easily able to pay it, but has in the meantime made substantially more from investing the initial amount.

In this third scenario, where the cost of risk management controls could be otherwise invested and make a sum greater than the Asset Value, then unless the company is unable to pay the residual SLE in full each time that risk occurs, the money should be invested rather than being spent on the control.

Empowered risk taking

Some security professionals will look at a risk and insist that it is reduced; and that it is the most important thing to do. A more considered professional will help the business to understand the benefits and liabilities that come from reducing or accepting a risk; and this gives rise to empowered risk taking.

Silicon Valley coined the term Minimum Viable Product. The goal of a startup company is to survive long enough to become an established company. This means you don’t wait for your product to be perfect, and you don’t wait for your corporate governance to be 100%. Instead, get something to the market, and hope that you have a customer base significant enough that when risks start to occur, you can cover the impact.

If done correctly, such an approach isn’t ignorance. Instead it demonstrates an acute understanding of the risks associated with that company, and intimate knowledge of what is important at any given time. If you can demonstrate that, you can make security a powerful tool in your company, not just for keeping it safe, but for helping it to grow.