Business has never been so reliant on a stable Internet connection. The cloud has displaced a large number of historically on-premise solutions, from accounting software through to CRM systems. It’s mobile, always on and readily available. But this raises the question: what happens when the faithful cloud (or, to be exact, a service hosted in it) does go offline?
While it’s rare, there have been instances when the worst has happened. When Amazon went down, the estimated loss was $1,100 per second. And when a platform such as Amazon Web Services fails, the applications and services that rely on it are affected too, taking down the likes of Airbnb and Instagram with it.
In the event of service outages, it’s important that the impact is minimized as much as possible.
1. Sign an SLA.
A service level agreement (SLA) is fairly standard for cloud services and commits the provider to a certain level of uptime. Even so, a service that guarantees 99.5% availability can still be down for about 1.83 days a year, or about 3.6 hours in a 30-day month, in scheduled or unscheduled downtime. That may sound small, but critical applications must take it into account.
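The arithmetic behind those figures is easy to check for any availability target. The following is a minimal sketch; the function name and the choice of a 365-day year and a 30-day month are assumptions made for illustration, and a real SLA will define its own measurement period.

```python
HOURS_PER_DAY = 24


def allowed_downtime_hours(availability_pct: float, period_days: float) -> float:
    """Hours of downtime an availability target permits over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * period_days * HOURS_PER_DAY


# Compare a few common targets (assumed: 365-day year, 30-day month).
for target in (99.5, 99.9, 99.99):
    days_per_year = allowed_downtime_hours(target, 365) / HOURS_PER_DAY
    hours_per_month = allowed_downtime_hours(target, 30)
    print(f"{target}%: {days_per_year:.3f} days/year, {hours_per_month:.3f} hours/month")
```

For 99.5% this works out to 1.825 days a year and 3.6 hours a month, which matches the figures above once rounded.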
2. Remember that on-premise solutions still have a place.
On-premise solutions can still offer significant benefits. If a service depends on specific in-house skills, as bespoke applications often do, a business may be unable to move it to an external provider, or may need a more ‘hands-on’ approach to maintaining it. While cloud may be the hot topic, some things are best kept within an organization’s walls.
3. Plan for failure.
Assume that failure will happen and plan for it in advance. There are several redundancy and failover options that help make sure that, if systems do fail, the necessary precautions are in place to keep an organization’s processes running.
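As one illustration of failover, an application can try a primary service and fall back to a standby when the primary fails. This is only a sketch of the idea: `call_with_failover`, `primary_cloud_service` and `standby_service` are hypothetical names, and the two services are stand-in functions rather than real providers.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(backends: Sequence[Callable[[], T]]) -> T:
    """Try each backend in order and return the first successful result."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(backends)} backends failed: {errors!r}")


# Hypothetical stand-ins for a cloud service and its standby.
def primary_cloud_service() -> str:
    raise ConnectionError("primary is offline")


def standby_service() -> str:
    return "served by standby"


print(call_with_failover([primary_cloud_service, standby_service]))
# prints "served by standby"
```

The order of the list is the failover order, so the standby is only used when the primary raises an error. Production failover usually also needs health checks and data replication, which this sketch leaves out.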
4. Quantify the cost of downtime.
Quantifying the cost of downtime shows what an outage really costs the business, and gives leverage when making the case for further investment in both business infrastructure and IT provisioning.
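A rough first estimate can be as simple as lost revenue plus the cost of staff who cannot work. The sketch below assumes that simple model; the function and parameter names are hypothetical, and the second example uses made-up figures purely to show the calculation. Only the $1,100 per second rate comes from the Amazon estimate quoted earlier.

```python
SECONDS_PER_HOUR = 3600


def downtime_cost(
    outage_seconds: float,
    revenue_loss_per_second: float,
    staff_affected: int = 0,
    staff_cost_per_hour: float = 0.0,
) -> float:
    """Estimate outage cost as lost revenue plus idle staff time."""
    lost_revenue = outage_seconds * revenue_loss_per_second
    idle_staff = staff_affected * staff_cost_per_hour * outage_seconds / SECONDS_PER_HOUR
    return lost_revenue + idle_staff


# One hour at the estimated Amazon rate of $1,100 per second.
print(downtime_cost(outage_seconds=3600, revenue_loss_per_second=1100))

# Illustrative figures only: a 30-minute outage, $2 per second in lost
# sales, and 50 staff at $40 per hour unable to work.
print(downtime_cost(1800, 2, staff_affected=50, staff_cost_per_hour=40.0))
```

At the Amazon rate, a single hour of downtime comes to $3,960,000. A fuller model might add SLA penalties, recovery effort or reputational damage, none of which are estimated here.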
5. Be both proactive and reactive to downtime.
When downtime occurs it’s easy to be purely reactive. Once the service is restored, though, it’s even more important to be proactive: find out what went wrong, how and why, so that further issues can be mitigated.
6. Review IT processes when downtime occurs.
When downtime does occur, it’s important to trace where the issue came from and look at how IT processes can be improved, so that the organization is in the best position to deal with downtime the next time it occurs.
7. Monitor internal and external services.
Monitoring can greatly help organizations: it alerts them when downtime occurs, shows the knock-on effects and gives insight into the causes. Monitoring should be seen as a critical part of the infrastructure, enabling organizations to track and maintain the availability of their network and business services.
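At its most basic, an availability check asks whether each service still accepts connections. The sketch below uses only Python’s standard `socket` module; the service names and addresses are hypothetical, and a dedicated monitoring product does far more (alerting, history, dependency mapping) than this illustrates.

```python
import socket
from typing import Callable, Dict, List, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_probe(endpoint: Endpoint, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


def check_services(
    services: Dict[str, Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_probe,
) -> Dict[str, bool]:
    """Probe every service and map its name to an up/down flag."""
    return {name: probe(endpoint) for name, endpoint in services.items()}


def down_services(status: Dict[str, bool]) -> List[str]:
    """Names of the services that failed their probe, sorted for reporting."""
    return sorted(name for name, is_up in status.items() if not is_up)


# Hypothetical internal and external services to watch.
SERVICES = {
    "internal-crm": ("crm.internal.example", 443),
    "cloud-accounting": ("accounting.example.com", 443),
}

# Demonstration with a stub probe so the example runs without a network;
# drop the `probe` argument to use the real TCP check.
status = check_services(SERVICES, probe=lambda endpoint: endpoint[0].endswith(".com"))
print(down_services(status))
# prints ['internal-crm']
```

Passing the probe in as a parameter keeps the check itself separate from the reporting, so the same loop can cover internal and external services with different kinds of check.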
8. Remove complexity.
It may seem contradictory, but removing complexity from networks enables a faster, more efficient and more effective response to downtime. Planning for failure means adding redundancy and failover procedures, which increases complexity, so it’s also important to simplify wherever possible to give a clearer view of what is causing downtime.
9. Follow best practices.
Any IT team should follow best practices, such as ITIL, to manage an entire business service from end to end. This complete view of a business’s services keeps IT aligned with business needs and gives clear accountability when issues do arise.
10. Understand the risks.
Downtime is not the only risk that businesses face when dealing with the cloud. Security breaches and network outages can both incapacitate a business’s processes and, especially in the case of a serious data breach, can be more damaging to a business than downtime.
While downtime is sporadic, it should always be planned for. All businesses should try to minimize both the amount of downtime they experience and its impact, to ensure a level of system availability for their customers and staff.
About the author
Brian King is Digital Marketing Manager at Opsview, a leading network monitoring company.