fail-fast - managed solution

Failing Fast: The Quickest Route to Digital Transformation

Speed, agility, and innovation—these are the hallmarks for success in the digital era. What’s less proven is the “fail fast” concept, which encourages exploration by floating trial balloons quickly, then allowing ideas to fail if they don’t help drive positive business outcomes. Using this approach, failure becomes a stepping-stone for success—allowing teams to iterate quickly, learn from mistakes, and triage ideas in rapid succession, much like a startup.
Not too long ago, a failed IT project could be catastrophic, resulting in missed delivery targets, productivity drop-offs, even lost revenue. It’s no wonder that IT teams remain a bit gun-shy about adopting a fail-fast approach to business given that their job stability likely depends on their success. The IDG Cloud Futurecast survey found that just 15% of IT directors believe risk-taking is critical to transforming products and services.
C-level executives, by comparison, are far more bullish on the advantages of fail-fast techniques, in part, no doubt, because their jobs don’t directly hang in the balance. According to the IDG survey, 29% recognize the ability to fail fast as a key enabler for revenue-driving strategies compared with only 17% of vice presidents and general managers.
These forward-thinking leaders understand that the cloud is the engine behind creating an iterative, agile mindset across the entire organization. The cloud allows teams to test-drive many small innovations without the traditional risk of a major setback. A cloud-based, fail-fast approach removes technology constraints and allows organizations to explore new ideas constantly, versus one or two ideas annually.

AI Increases The Pace

Advances in big data analytics and artificial intelligence promise to take fail-fast strategies to entirely new heights, delivering real-time monitoring and robust data access capabilities that can detect failure almost instantaneously.
The cloud’s ability to drive innovation and mitigate risk are also central to a fail-fast mentality. Twenty-eight percent of IDG survey respondents say the cloud is essential for reinventing their businesses. Perhaps more important, 25% of respondents say the cloud’s ability to support testing product concepts in early stages with greater accuracy is a key asset. Another 21% say exploratory, fail-fast methods can help minimize the risk associated with continuous innovation. These percentages, while relatively low, signal an important shift in thinking: The cloud is no longer just a cost-saver, but an innovation engine.
For many organizations, enhancing customer experience is the leading priority for success in the digital era. And applying a fail-fast methodology may be the secret sauce. Organizations emphasizing the cloud for customer engagement strategies are far more likely to embrace fail-fast capabilities—32% of survey respondents, compared with companies using cloud to optimize operations (19%), empower employees (17%), or transform products (15%). Experimenting with new products and services using the fail-fast method may give these leaders a leg up on their competitors.
For all the negative connotations, fast failure in the digital era may be just the right recipe for digital transformation success.

Earlier this month, a monkey caused a nationwide power outage in Kenya. Millions of homes and businesses were without electricity. Which just goes to show that “not all disasters come in the form of major storms with names and categories,” says Bob Davis, CMO, Atlantis Computing.

“Electrical fires, broken water pipes, failed air conditioning units [and rogue monkeys] can cause just as much damage,” he says. And while “business executives might think they’re safe based on their geographic location,” it’s important to remember that “day-to-day threats can destroy data [and] ruin a business,” too, he says. That’s why it is critical for all businesses to have a disaster recovery (DR) plan.

However, not all DR plans are created equal. To ensure that your systems, data and personnel are protected and your business can continue to operate in the event of an actual emergency or disaster, use the following guidelines to create a disaster plan that will help you quickly recover.

1. Inventory hardware and software. Your DR plan should include “a complete inventory of [hardware and] applications in priority order,” says Oussama El-Hilali, vice president of Products for Arcserve. “Each application [and piece of hardware] should have the vendor technical support contract information and contact numbers,” so you can get back up and running quickly.

2. Define your tolerance for downtime and data loss. “This is the starting point of your planning,” says Tim Singleton, president, Strive Technology Consulting. “If you are a plumber, you can probably be in business without servers or technology [for] a while. [But] if you are eBay, you can’t be down for more than seconds. Figuring out where you are on this spectrum will determine what type of solution you will need to recover from a disaster.”

“Evaluate what an acceptable recovery point objective (RPO) and recovery time objective (RTO) is for each set of applications,” advises says David Grimes, CTO, NaviSite. “In an ideal situation, every application would have an RPO and RTO of just a few milliseconds, but that’s often neither technically nor financially feasible. By properly identifying these two metrics businesses can prioritize what is needed to successfully survive a disaster, ensure a cost-effective level of disaster recovery and lower the potential risk of miscalculating what they’re able to recover during a disaster.”

“When putting your disaster recovery plan in writing, divide your applications into three tiers,”

says Robert DiLossi, senior director, Testing & Crisis Management, Sungard Availability Services. “Tier 1 should include the applications you need immediately. These are the mission-critical apps you can’t do business without. Tier 2 covers applications you need within eight to 10 hours, even up to 24 hours. They’re essential, but you don’t need them right away. Tier 3 applications can be comfortably recovered within a few days,” he explains.

“Defining which applications are most important will aid the speed and success of the recovery. But most important is testing the plan at least twice per year,” he says. “The tiers might change based on the results, which could reveal unknown gaps to fill before a true disaster.”

3. Lay out who is responsible for what – and identify backup personnel. “All disaster recovery plans should clearly define the key roles, responsibilities and parties involved during a DR event,” says Will Chin, director of cloud services, Computer Design & Integration. “Among these responsibilities must be the decision to declare a disaster. Having clearly identified roles will garner a universal understanding of what tasks need to be completed and who is [responsible for what]. This is especially critical when working with third-party vendors or providers.  All parties involved need to be aware of each other's responsibilities in order to ensure the DR process operates as efficiently as possible.”

“Have plans for your entire staff, from C-level executives all the way down, and make sure they understand the process,” and what’s expected of them, says Neely Loring, president, Matrix, which provides cloud-based solutions, including Disaster-Recover-as-a-Service. “This gets everyone back on their feet quicker.”

“Protocols for a disaster recovery (DR) plan must include who and how to contact the appropriate individuals on the DR team, and in what order, to get systems up and running as soon as possible,” adds Kevin Westenkirchner, vice president, operations, Thru. “It is critical to have a list of the DR personnel with the details of their position, responsibilities [and emergency contact information].”

“One final consideration is to have a succession plan in place with trained back-up employees in case a key staff member is on vacation or in a place where they cannot do their part [or leaves the company],” says Brian Ferguson, product marketing manager, Digium.

4. Create a communication plan. “Perhaps one of the more overlooked components of a disaster recovery plan is having a good communication plan,” says Mike Genardi, solutions architect, Computer Design & Integration. “In the event a disaster strikes, how are you going to communicate with your employees? Do your employees know how to access the systems they need to perform their job duties during a DR event?

“Many times the main communication platforms (phone and email) may be affected and alternative methods of contacting your employees will be needed,” he explains. “A good communication plan will account for initial communications at the onset of a disaster as well as ongoing updates to keep staff informed throughout the event.”

“Communication is critical when responding to and recovering from any emergency, crisis event or disaster,” says Scott D. Smith, chief commercial officer at ModusLink. So having “a clear communications strategy is essential. Effective and reliable methods for communicating with employees, vendors, suppliers and customers in a timely manner are necessary beyond initial notification of an emergency. Having a written process in place to reference ensures efficient action post-disaster and alignment between organizations, employees and partners.”

“A disaster recovery plan should [also] include a statement that can be published on your company’s website and social media platforms in the event of an emergency,” adds Robert Gibbons, CTO, Datto, a data protection platform. And be prepared to “give your customers timely status updates on what they can expect from your business and when. If your customers understand that you are aware of the situation, you are adequately prepared and working to take care of it in a timely manner, they will feel much better.”

5. Let employees know where to go in case of emergency – and have a backup worksite. “Many firms think that the DR plan is just for their technology systems, but they fail to realize that people (i.e., their employees) also need to have a plan in place,” says Ahsun Saleem, president, Simplegrid Technology. “Have an alternate site in mind if your primary office is not available. Ensure that your staff knows where to go, where to sit and how to access the systems from that site. Provide a map to the alternate site and make sure you have seating assignments there.”

“In the event of a disaster, your team will need an operational place to work, with the right equipment, space and communications,” says DiLossi. “That might mean telework and other alternative strategies need to be devised in case a regional disaster causes power outages across large geographies. Be sure to note any compliance requirements and contract dedicated workspace where staff and data can remain private. [And] don’t contract 50 seats if you’ll really need 200 to truly meet your recovery requirements.”

6. Make sure your service-level agreements (SLAs) include disasters/emergencies. “If you have outsourced your technology to an outsourced IT firm, or store your systems in a data center/co-location facility, make sure you have a binding agreement with them that defines their level of service in the event of a disaster,” says Saleem. “This [will help] ensure that they start working on resolving your problem within [a specified time]. Some agreements can even discuss the timeframe in getting systems back up.”

7. Include how to handle sensitive information. “Defining operational and technical procedures to ensure the protection of…sensitive information is a critical component of a DR plan,” says Eric Dieterich, partner, Sunera. “These procedures should address how sensitive information will be maintained [and accessed] when a DR plan has been activated.”

8. Test your plan regularly. “If you’re not testing your DR process, you don’t have one,” says Singleton. “Your backup hardware may have failed, your supply chain may rely on someone incapable of dealing with disaster, your internet connection may be too slow to restore your data in the expected amount of time, the DR key employee may have changed [his] cell phone number. There are a lot of things that may break a perfect plan. The only way to find them is to test it when you can afford to fail.”

“Your plan must include details on how your DR environment will be tested, including the method and frequency of tests,” says Dave LeClair, vice president, product marketing, Unitrends, a cloud-based IT disaster recovery and continuity solution provider. “Our recent continuity survey of 900 IT admins discovered less than 40 percent of companies test their DR more frequently than once per year and 36 percent don’t test at all.

“Infrequent testing will likely result in DR environments that do not perform as required during a disaster,” he explains. “Your plan should define recovery time objective (RTO) and recovery point objective (RPO) goals per workload and validate that they can be met. Fortunately, recovery assurance technology now exists that is able to automate DR testing without disrupting production systems and can certify RTO and RPO targets are being met for 100 percent confidence in disaster recovery even for complex n-tier applications.”

Also keep in mind that “when it comes to disaster recovery, you’re only as good as your last test,” says Loring. “A testing schedule is the single most important part of any DR plan. Compare your defined RTO and RPO metrics against tested results to determine the efficacy of your plan. The more comprehensive the testing, the more successful a company will be in getting back on their feet,” he states. “We test our generators weekly to ensure their function. Always remember that failing a test is not a bad thing. It is better to find these problems early than to find them during a crisis. Decide what needs to be modified and test until you’re successful.”

And don’t forget about testing your employees. “The employees that are involved need to be well versed in the plan and be able to perform every task they are assigned to without issue,” says Ferguson. “Running simulated disasters and drills help ensure that your staff can execute the plan when an actual event occurs.”

Contact us Today!

Chat with an expert about your business’s technology needs.