All companies are exposed to “disasters” that can interrupt their ability to conduct business, and all must take responsible steps to ensure that they can tolerate such a disaster without incurring unacceptable losses or permanent damage. By increasing their disaster tolerance, companies can meet their business requirements and fiduciary responsibilities while reducing disaster recovery planning costs. The challenge is to increase disaster tolerance by improving disaster resilience and enabling recovery of complete business processes, while adopting proportionate solutions that reduce the cost of recovery planning and still meet business objectives.
Disaster Recovery planning started as a “Glass House” initiative and generally came to mean planning to recover the IT function. Then came Business Continuity planning, which professed to extend the concept to include the business units. The industry’s blurring of these terms has led to the belief that planning to recover IT is a separate event from planning to recover business functions or, in a strange about-face, that Disaster Recovery planning and Business Continuity planning are one and the same. Both of these perspectives are artificial and inaccurate, and they result in inappropriate planning efforts and, eventually, sub-optimal recovery. By not addressing IT recovery and business continuity as two distinct yet inextricably related processes, businesses are cheating themselves out of increased levels of disaster tolerance, creating confusion and a false sense of security in their planning efforts, and ultimately paying more for less recoverability.
Disaster tolerance results from an ability to conduct business after a disaster, not simply from an ability to recover systems. Depending on the nature of your disaster and the nature of your business, recovery of systems is only a portion of what is required to ensure that business processes can be conducted (consider loss of non-machine-readable vital records, site-wide power loss or an ice storm that prevents physical access). Recovery planning efforts cannot be arbitrarily delineated by drawing a horizontal line between IT and business processes. Instead, they must be delineated vertically along critical business processes, based on each process’s unique recovery requirements. Once a process is defined as mission-critical, recovery planning must address all of the requisites of that process, regardless of whether they are IT related or not.
Whether your recovery planning initiatives stop at the IT function or extend into the business area depends on how much of your business capability is independent of IT. Continuity planning is always an exercise in trade-offs: determining exactly how much protection is worth and how much is enough. But the trade-offs must be based on business requirements, not artificial limitations like project scope, budget restrictions or resource availability. To increase your disaster tolerance, narrow the breadth of your recovery planning efforts to the processes that matter most. An Iterative Business Process Decomposition™ (the NextGen alternative to the traditional Business Impact Analysis) or a better-than-average BIA will identify which processes and process interactions are critical to your ability to conduct business. It will also identify the requisites, IT and otherwise, to keep those critical processes flowing. By vertically planning to recover all aspects of your critical processes, you can be assured of complete recovery of critical core business functionality, and you can then extend your recoverability horizontally to other processes as time and budget allow, or as critical processes evolve.
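The vertical approach described above can be illustrated with a small data model. This is only a minimal sketch, not part of the NextGen methodology itself: the class names, the `is_it` flag and the sample processes are all hypothetical, chosen to show how listing every requisite of a critical process exposes the items an IT-only plan would miss.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Requisite:
    """Something a process needs in order to run (hypothetical model)."""
    name: str
    is_it: bool  # True for systems/applications; False for records, people, facilities


@dataclass
class BusinessProcess:
    name: str
    critical: bool
    requisites: List[Requisite] = field(default_factory=list)


def vertical_plan(processes: List[BusinessProcess]) -> Dict[str, List[str]]:
    """Map each critical process to ALL of its requisites, IT and otherwise."""
    return {p.name: [r.name for r in p.requisites] for p in processes if p.critical}


def non_it_gaps(processes: List[BusinessProcess]) -> Dict[str, List[str]]:
    """Requisites of critical processes that an IT-only plan would not cover."""
    return {
        p.name: [r.name for r in p.requisites if not r.is_it]
        for p in processes
        if p.critical
    }


# Illustrative data only; a real inventory would come from the BIA or decomposition.
processes = [
    BusinessProcess("order fulfilment", True, [
        Requisite("order entry application", True),
        Requisite("signed customer contracts (paper)", False),
        Requisite("alternate workspace for order desk", False),
    ]),
    BusinessProcess("internal newsletter", False, [
        Requisite("publishing server", True),
    ]),
]

plan = vertical_plan(processes)
gaps = non_it_gaps(processes)
```

In this sketch the non-critical process is left out of the plan entirely, which mirrors the idea of covering critical processes in full before extending horizontally.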
Once vertical process requirements are defined, an appropriate recovery architecture and recovery action plan will provide comprehensive tools to direct your complete recovery of business processes. The NextGen plan development process is similar to traditional recovery planning, but our vertical recovery methodology adds five often-overlooked components to ensure that critical processes, not just critical systems, are restored. First, business vital records are addressed to ensure that all documents and information needed to conduct business are available at time of disaster. This requires identification of non-machine-readable information and documents such as contracts, original source documents and reference manuals. Next, application reconciliation procedures are developed to ensure that applications can be “certified” as accurate and usable by business personnel after restoration. Then, business recovery teams are defined and action plans are developed so that both IT and business unit personnel respond proactively to disaster situations. Finally, facility issues are addressed to ensure that during physical disasters which prevent access to the normal workspace, business processes (not simply IT processes) can be conducted from an alternate site. Admittedly, this is a thorny problem for many companies faced with the potential relocation of large numbers of business personnel. However, through creative use of sister sites, a focus on only critical processes, workflow planning and an understanding of the true RTOs and RPOs, WTG can address this key requirement with practical and proportional solutions.
NextGen recovery plans include all the traditional component procedures such as assessment, notification, declaration, mobilization and restoration, although with subtle differences required for business process restoration. However, NextGen Plan Development also adds a stage to the recovery process to address interim business processing, extending disaster tolerance even further. Even for insulated, low-impact environments, where systems can be down for a long period of time without significant damage, the recovery plan must address extended interim business processing methods. Bridging procedures enable business units to proactively manage their processes throughout the duration of an outage. In scenarios with relatively short RTOs, bridging procedures ensure an orderly resumption of business activities once systems are restored. For scenarios with longer periods of system unavailability, bridging procedures direct alternate manual processing efforts until the primary systems are restored. Bridging procedures include various combinations of stockpiling procedures for “transactions” that cannot be processed without system support, alternate processing methods for cases where “transactions” must be processed despite lack of system support, and catch-up procedures to address the inevitable backlog after system restoration.
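The three kinds of bridging procedure described above (stockpiling, alternate processing and catch-up) can be sketched as a simple routing rule. This is an illustrative sketch under stated assumptions, not a NextGen artifact: the `Transaction` and `BridgingLog` names, the `must_process_now` flag and the `system_process` callback are hypothetical, and the sketch assumes that manually handled work is re-entered into the restored system before the stockpile is drained.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transaction:
    id: int
    must_process_now: bool  # True if the business cannot wait for system restoration


class BridgingLog:
    """Tracks work handled while systems are down (hypothetical helper)."""

    def __init__(self):
        self.stockpile = deque()       # deferred until systems return
        self.manually_processed = []   # handled by an alternate manual method

    def handle_during_outage(self, txn):
        """Route a transaction to alternate processing or to the stockpile."""
        if txn.must_process_now:
            self.manually_processed.append(txn)
            return "alternate"
        self.stockpile.append(txn)
        return "stockpiled"

    def catch_up(self, system_process):
        """After restoration, record manual work first, then drain the stockpile
        in arrival order. Returns the transaction ids in the order handled."""
        handled = []
        for txn in self.manually_processed:
            system_process(txn, already_done=True)
            handled.append(txn.id)
        self.manually_processed.clear()
        while self.stockpile:
            txn = self.stockpile.popleft()
            system_process(txn, already_done=False)
            handled.append(txn.id)
        return handled


# Usage: three transactions arrive during an outage, then systems are restored.
log = BridgingLog()
routes = [
    log.handle_during_outage(Transaction(1, False)),
    log.handle_during_outage(Transaction(2, True)),
    log.handle_during_outage(Transaction(3, False)),
]
seen = []
order = log.catch_up(lambda t, already_done: seen.append((t.id, already_done)))
```

Which transactions must be processed during the outage, and in what order the backlog is cleared, are business decisions; the ordering used here is just one plausible choice.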
The strategies and techniques described in this solution are equally applicable to all levels of implementation, including a single business process, an individual application, a single server, a platform, a whole site or the entire enterprise.
Ninety percent of disasters impact business areas as well as the IT infrastructure. Consequently, few companies actually recover the primary site’s business processing when they recover only the IT function. WTG’s planning methodology has enabled many companies to increase their disaster tolerance by bridging business services across the duration of the disaster, without the overhead of traditional “business continuity planning” efforts. The result is a pragmatic, real-world recovery capability that costs less and offers more.