Five Tips for Successful Disaster Recovery Implementations

Disaster Recovery is one of the major topics that come up in my interviews. I have been implementing DR (short for Disaster Recovery) for most of my career, so below are five tips, with concrete examples, for implementing disaster recovery solutions that work well.

1. Properly set DR Expectations with clients – I can’t tell you how many times I’ve been in situations where clients expected one level of service for disaster recovery and IT provided a different, and usually less functional, recovery plan/implementation. Proper client communication is necessary. As an IT manager, it is critical to understand how the application behaves, how the client expects the application to behave and the level of service required for the application. Trust needs to be established between IT and the application/development folks. Without it, the disaster recovery plan will be weak.

Example: A new client wanted to test the disaster recovery for the SQL Server. I asked them if there was a DR plan with implementation steps. They said no, they usually just turned off the production server, turned on the disaster recovery server and were done in five minutes (which was their expectation). The reality was that the DBA had a read-only copy of the database on the production server and did not turn off the production server at all. The DR test was marked “success”, but the client’s expectation and the actual implementation were two different things.

2. Simplify Everything - When something hits the fan, the last thing IT staff will do well is follow complicated disaster recovery instructions. The failover and client communication should be as automated as possible. The client should have all major failover steps with approximate times for completing each step. The client will want to gauge for him/herself how well the plan is being executed in real time. This will also help identify difficult steps so they can be improved or adjusted, with the goal of simplifying the implementation.

Example: One client gave me a series of twelve steps that needed to be completed in an hour to fail over an application properly. The SLA was two hours. Previous attempts using this plan had taken four to six hours. One of the steps was uninstalling and reinstalling web server software on the disaster recovery system; basically, the client didn’t update the disaster recovery system on a regular basis. We eliminated six steps by adding the DR systems to their current software deployment process. The application’s disaster recovery plan was reduced to thirty minutes.
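
One way to let the client gauge progress step by step is to keep the failover plan as data and compare estimated and measured times against the SLA. The sketch below is only an illustration under stated assumptions: the step names, the durations and the `Step` and `summarize` helpers are all hypothetical and not taken from any client’s actual plan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One failover step with its estimated and (optionally) measured duration."""
    name: str
    estimated_minutes: int
    actual_minutes: Optional[int] = None  # filled in during a test or real failover


def summarize(steps: List[Step], sla_minutes: int) -> dict:
    """Compare the plan's total time against the SLA and list steps that ran long."""
    estimated_total = sum(s.estimated_minutes for s in steps)
    # Use the measured time where we have one, otherwise fall back to the estimate.
    projected_total = sum(
        s.actual_minutes if s.actual_minutes is not None else s.estimated_minutes
        for s in steps
    )
    overruns = [
        s.name for s in steps
        if s.actual_minutes is not None and s.actual_minutes > s.estimated_minutes
    ]
    return {
        "estimated_total": estimated_total,
        "projected_total": projected_total,
        "within_sla": projected_total <= sla_minutes,
        "overruns": overruns,
    }


# Hypothetical six-step plan; names and numbers are made up for illustration.
plan = [
    Step("Notify stakeholders", 5, actual_minutes=5),
    Step("Stop application on primary", 5, actual_minutes=8),
    Step("Promote DR database", 10),
    Step("Repoint DNS", 5),
    Step("Start application on DR", 3),
    Step("Smoke test", 2),
]

report = summarize(plan, sla_minutes=120)
print(report)
```

Steps that show up under `overruns` are the natural candidates to simplify or automate before the next test.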

3. Test the DR Plan as well as the implementation regularly – If the plan requires people getting on a conference call, test that too, along with the IT implementation. Get people on the conference call. There have been dozens of times where conference call numbers changed or different people needed to be notified. Make sure that all tangible assets in the plan are used and tested. Make note of the items that were not used so they can be removed later, and of any new ones that need to be added. People, numbers and plan steps will change as applications add more systems, features and staff.
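
Noting which plan items were actually exercised can be as simple as comparing two lists after each test. This is a minimal sketch, assuming the plan’s assets are kept as plain names; the `review_test` helper and every asset name below are hypothetical.

```python
def review_test(planned_assets, used_assets):
    """Return plan items never exercised and items used that the plan lacks."""
    planned = set(planned_assets)
    used = set(used_assets)
    return {
        "unused": sorted(planned - used),     # candidates for removal from the plan
        "unplanned": sorted(used - planned),  # candidates to add to the plan
    }


# Hypothetical asset lists recorded before and during a DR test.
planned = ["conference bridge", "DBA on-call phone", "backup tape vault", "DR web server"]
used = ["conference bridge", "DBA on-call phone", "DR web server", "network on-call phone"]

result = review_test(planned, used)
print(result)
```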

4. Communicate/Distribute the plan to key parties – Everyone needs to be on the same page with disaster recovery plans. Executives need to know what conference number to call and what room they need to be in. IT folks need to know the steps required and the time needed to perform them. Stakeholders need to communicate with their clients as well as monitor IT progress. Most folks will attend a DR planning meeting, say “yes” throughout the meeting and stick the plan in a drawer. Please get it out once every six months, or more frequently if needed, and go over it so folks still understand the plan. No one really cares about the plan until it’s time to implement it. Then everyone will be calling you. The time to implement the plan will certainly come when an emergency arises. It will take your fortitude to make sure that the tests, the plan and a shared understanding of both are communicated effectively.

5. Don’t worry about it - If the communication has been handled effectively and everyone knows what they need to do, don’t sweat all the little things that will go wrong. Just note them so you can adjust the plan later. The DR plan is a living, breathing document, not something you write once and distribute. Most stakeholders aren’t going to care if step two took ten minutes longer or if another thing needed to be updated. There will usually be some stuff that was missed. Make note of it and follow up. The most important thing is whether the plan was implemented successfully: the applications failed over correctly and are working.

Five Tips for Successful Off-Shoring

Off-Shoring is a sensitive topic among large and small institutions alike. I’ve heard horror stories about off-shoring initiatives and have been involved in many successes with off-shoring. Below are some tips to make off-shoring resources successful. This is by no means a comprehensive list.

1. Get buy-in from all Management levels. Off-shoring will get screwed up by middle and lower management because they don’t believe that it will work. They don’t share your cost-cutting vision. They don’t share your efficiency vision. They don’t share the this-will-take-work-off-their-plates vision. The managers will do the absolute bare minimum so that it will fail and you will look like a buffoon to Sr. Management. Do the work and get buy-in.

2. Work in small teams. Many firms outsource a large function: a help desk, a call center, etc. Large implementations usually do not work well because the help desk or call center is trained only to deal with low-level requests, meaning that anything that requires them to think gets kicked up to another level, which is usually back in the host country. You want the people answering the phone in the off-shore location to actually help the customer. You want the issue to be handled in one phone call, or at least to have one contact manage the process. Training the off-shore team in small teams is the most effective way to train the large unit. Fly your senior staff members over to the off-shore country/area to hold training sessions. Each staff member should be working with no more than 12-18 people. The staff member will remain off-shore until the off-shore members are trained. This usually takes a minimum of two to three weeks and a maximum of six months. (Most staff members going off-shore should expect to stay between one month and three months, as that is more typical.)

3. Assign an off-shore Czar - Most off-shore programs fail because they lack one. For large off-shore installations you need an off-shore Czar managing all of the off-shore resources. This person is typically a member of the Sr. Management team who either relocates or spends 60%+ of his time in the off-shore region. The Czar performs the following functions:

4. Have a right-sized off-shoring budget. Successful off-shoring should not save the company much money in the short run. It will save the company money in the long run if implemented properly. Remember, your first concern should be client service. It’s not worth spending money on an off-shore program to save a few dollars this quarter while dropping client service. Make sure Sr. Management understands that this is a medium- to long-term investment in cost savings. If successful, savings can be seen after a year and a half. Most off-shoring programs do not have the right budget. Many are so slim that they do not account for all of the travel and work required for on-shore members to get the off-shore members properly trained. So most of that work doesn’t get done, and customers complain that they aren’t getting the service they require.

5. Let time take its course - Even if steps 1-4 are executed well, setting up an off-shore team will take longer than anticipated, even if you’ve done it a few times. The reason is that culture and personality can add roadblocks to the process. It’s up to the off-shore Czar to identify roadblocks in any off-shore implementation and make sure the implementation is going smoothly. Sometimes there are unforeseen events that either delay or stop the process. These can be government conditions, weather conditions or general resource constraints (like not being able to find the right talent). Make sure that enough time is allowed for the implementation so that Sr. Management and the Czar can work through problems and issues.
