
The problem with modern IT is that, on the whole, it just works. Its reliability has made us lazy and overconfident, so that when it does fail the pain is all the more intense. Twenty years ago a damaged floppy disk might have lost you 1.44 MB of data; now even a humble USB memory stick can hold 64 GB.

The loss of some data is one thing, but nearly all businesses are heavily dependent on their IT systems, laptops, smartphones and internet connectivity. Businesses spend many thousands of pounds deploying systems which become integral to their operation, but frequently spend no time considering the what-if disaster scenarios or how to mitigate those risks.

With many small and medium-sized businesses now moving to cloud-based solutions there seems to be an even more relaxed attitude, due partly to the belief that cloud systems are 100% reliable. Unfortunately the cloud is no more than a buzzword behind which sits computer and networking equipment no different to any other IT system, and in the same way it will fail from time to time.

Hardware is very reliable, and with redundant systems the physical side can be designed very effectively. However, there will be a single point of failure somewhere, and more often than not today that point of failure is human, typically when making a configuration or software change.

Even an outage of a few hours can cost a business a large amount through lost sales, production delays, shipping delays and a host of other effects depending on the business type. It’s not just the IT systems directly, though: fire, flood, terrorism, loss of building access, cyber-attack, loss of internet access and so on can all have a potentially devastating impact, albeit with varying degrees of probability.

Whether you are a sole trader or a large corporation, a sensible approach to business continuity and disaster recovery is essential. For a small business it may be very straightforward, but it is nonetheless important that the risks are reviewed and appropriate actions taken.

The first step is to identify the risks and run through each scenario noting down the potential impact. Each scenario can then be scored based on probability of occurrence and impact to the business. The next step is then to mitigate these risks as much as possible, looking at aspects such as processes, system design and environmental factors, from which a prioritised list of actions can be generated based on feasibility and cost.
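As a rough illustration of the scoring step, the sketch below rates each scenario from 1 to 5 for probability and for impact, multiplies the two, and sorts the results. The 1 to 5 scale, the multiplication and the example figures are all assumptions made for illustration; they are not a prescribed method, and a real register would use your own scenarios and ratings.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One disaster scenario with illustrative 1-5 ratings (assumed scale)."""
    name: str
    probability: int  # 1 = rare, 5 = almost certain
    impact: int       # 1 = minor inconvenience, 5 = business-threatening

    @property
    def score(self) -> int:
        # Simple probability x impact score; other weightings are possible.
        return self.probability * self.impact


def prioritise(risks):
    """Return the risks ordered from highest score to lowest."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical example figures, not taken from any real assessment.
risks = [
    Risk("Building fire", probability=1, impact=5),
    Risk("Loss of internet access", probability=3, impact=3),
    Risk("Failed configuration change", probability=4, impact=4),
    Risk("Basement flood", probability=2, impact=4),
]

for risk in prioritise(risks):
    print(f"{risk.score:>2}  {risk.name}")
```

The ordered list is only a starting point: feasibility and cost still decide which mitigations are actually taken first.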

This process is broad, covering physical building and operational aspects through to much-maligned data backups, but it is important that everything is looked at, as it will always be the smaller details which cause the problems. One common issue, for example, is that in many buildings the internet connectivity comes in via the basement, with sensitive networking equipment located in the area most at risk from flooding!

Not all risks can be eliminated, so for those that remain the next step is to look at contingency. For example, a building fire or flood is likely to necessitate a relocation, so a disaster recovery plan should be in place which details the steps and actions to be taken if such a disaster occurs. This may include pre-identified space to move into, stand-by equipment and a recovery plan for bringing services back online.

The biggest risk for most disaster recovery and business continuity plans is that they frequently do not get tested. Only when a disaster strikes is it discovered that the system backups have been failing all along! (Yes, I have seen that happen.) Checking and testing plans on a regular basis is a key part of the process, just like a fire drill.
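One small part of that testing can be automated. The sketch below is a minimal example, assuming backups are written as files into a single directory: it checks that the newest file is recent and not empty. The function name, the 24-hour threshold and the file name used in the demonstration are all hypothetical, and a check like this is no substitute for actually restoring from a backup.

```python
import os
import tempfile
import time
from pathlib import Path


def check_backup(backup_dir, max_age_hours=24.0, min_size_bytes=1):
    """Return a list of problems found with the newest file in backup_dir.

    An empty list means the basic checks passed. The thresholds are
    illustrative assumptions and should be set to suit the backup schedule.
    """
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    if not files:
        return ["no backup files found"]

    latest = max(files, key=lambda p: p.stat().st_mtime)
    problems = []

    age_hours = (time.time() - latest.stat().st_mtime) / 3600
    if age_hours > max_age_hours:
        problems.append(f"{latest.name} is too old ({age_hours:.0f} hours)")

    if latest.stat().st_size < min_size_bytes:
        problems.append(f"{latest.name} is smaller than expected")

    return problems


# Demonstration against a temporary directory (hypothetical file name).
with tempfile.TemporaryDirectory() as demo_dir:
    empty_result = check_backup(demo_dir)

    backup = Path(demo_dir) / "nightly.tar.gz"
    backup.write_bytes(b"example data")
    fresh_result = check_backup(demo_dir)

    # Pretend the backup was last written two days ago.
    two_days_ago = time.time() - 48 * 3600
    os.utime(backup, (two_days_ago, two_days_ago))
    stale_result = check_backup(demo_dir)

print(empty_result)
print(fresh_result)
print(stale_result)
```

Run on a schedule and wired to an alert, a check like this can catch a backup job that has silently stopped, but only a periodic test restore shows that the data can really be recovered.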

Disaster recovery and business continuity planning is not necessarily as big a job as it might be perceived to be, but without it the reality of a disaster is all the more painful. Relying on an ‘it won’t happen to us’ strategy is not good business practice.