Companies need disaster recovery plans for many obvious reasons. Shareholders and other stakeholders want their assets protected, and employees want to do everything possible to ensure continued employment. These ramifications are clear and easy to understand. What is often less well understood is how clinical disasters manifest themselves over time.
Companies at the forefront of disaster recovery are focusing on addressing the disaster before it happens. Whether the disaster is catastrophic or clinical in nature, a company can take a variety of steps to reduce its risk. The key step in this new approach is to focus on architectural availability during the application development and deployment phases.
The survivability of an application is much easier to address during its development: API and web service layers can be written to tolerate latency and interruption, and data replication techniques can be built into the application itself. Even simple recovery techniques, such as the automated removal of corrupt data, can have a big impact on the availability of a company’s critical applications.
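As one illustration of a service layer written to tolerate interruption, here is a minimal sketch of a retry wrapper with exponential backoff. The function name `call_with_retry` and its parameters are hypothetical and chosen only for this example; a real application would tune the attempt count and delays to its own latency budget.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on connection or timeout errors.

    The wait doubles after each failed attempt (0.5s, 1s, 2s, ...).
    The last failure is re-raised so the caller can fail over or alert.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the wait can be replaced in tests; in production the default `time.sleep` would be used.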
I have found it helpful to give people a framework to use when considering architectural availability. This framework comprises four key considerations.
Transactional Data. The ability of an application to duplicate transaction data to a second, remote site without third-party tools or costly professional services is a key item to consider when architecting a bet-the-business application. A good rule of thumb is to always replicate data as close to the native transaction as possible. Examples include tools like Oracle Data Guard and Microsoft SQL Server log shipping.
Reference Data. Reference data is all the non-transactional data that is part of an application’s data set. Common examples are images and documents. Because these files are not stored in the database but may correspond directly to specific transactions, it is critical that they stay “in sync” with the database records that point to them. The vast majority of these data types live in standard file systems and can therefore be protected with any file-system-aware replication technology.
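To make the “in sync” requirement concrete, here is a minimal sketch of a consistency check that could be run against a replica site. It assumes, purely for illustration, that each database record stores a relative file path and a SHA-256 checksum for the file it points to; the function name `find_out_of_sync` and the record layout are hypothetical.

```python
import hashlib
from pathlib import Path


def find_out_of_sync(
    records: dict[str, tuple[str, str]], replica_root: Path
) -> dict[str, str]:
    """Return {record_id: problem} for files that do not match their records.

    records maps a record id to (relative path, expected SHA-256 hex digest).
    """
    problems = {}
    for record_id, (relative_path, expected_sha256) in records.items():
        file_path = replica_root / relative_path
        if not file_path.is_file():
            problems[record_id] = "missing"
            continue
        # Reads the whole file into memory; fine for a sketch, not for huge files.
        actual = hashlib.sha256(file_path.read_bytes()).hexdigest()
        if actual != expected_sha256:
            problems[record_id] = "checksum mismatch"
    return problems
```

An empty result means every referenced file exists on the replica with the expected contents.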
System State. Think of System State as all the metadata that defines how a server and its applications are configured. System State is often overlooked by application architects and, as a result, can introduce significant delays in any disaster recovery exercise. Just imagine the effort required to rebuild a group of servers from bare metal using only documentation, and you can quickly appreciate the importance of being able to reconstitute the way an application infrastructure is deployed.
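One way to keep System State from being overlooked is to capture it as data and compare the primary and recovery servers regularly. The sketch below assumes each server's configuration has already been collected into a flat dictionary of setting names and values; how that collection happens is out of scope here, and the function name `diff_state` is hypothetical.

```python
from typing import Optional


def diff_state(
    primary: dict[str, str], recovery: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return settings that differ, as {name: (primary value, recovery value)}.

    A value of None means the setting is absent on that server.
    """
    drift = {}
    for name in sorted(primary.keys() | recovery.keys()):
        primary_value = primary.get(name)
        recovery_value = recovery.get(name)
        if primary_value != recovery_value:
            drift[name] = (primary_value, recovery_value)
    return drift
```

Any non-empty result is configuration drift that would otherwise surface only during a recovery exercise.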
Accessibility. In its most basic form, accessibility asks whether an application’s intended users can reach the application without an interruption that exceeds that application’s recovery time objective (RTO). Specific issues to consider here include directory extensibility, DNS change propagation and private network connectivity.
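The effect of DNS change propagation on the RTO can be estimated with simple arithmetic. The sketch below assumes a simplified worst case in which the outage a user sees is the sum of the time to detect the failure, the time to fail over, and the DNS record's TTL; real resolvers may cache for longer or shorter periods, and the function names are hypothetical.

```python
from datetime import timedelta


def estimated_outage(
    detection: timedelta, failover: timedelta, dns_ttl: timedelta
) -> timedelta:
    """Simplified worst case: users keep the old address until the TTL expires."""
    return detection + failover + dns_ttl


def meets_rto(
    detection: timedelta, failover: timedelta, dns_ttl: timedelta, rto: timedelta
) -> bool:
    """True if the estimated outage fits within the recovery time objective."""
    return estimated_outage(detection, failover, dns_ttl) <= rto
```

Under these assumptions, with 5 minutes to detect and 10 minutes to fail over, a 60-minute TTL gives an estimated 75-minute outage and misses a one-hour RTO, while a 5-minute TTL gives 20 minutes and fits comfortably.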
By focusing on architectural availability and utilizing Hosting.com’s Critical Availability Service, a company can achieve a higher level of availability while optimizing its disaster recovery budget.