British Airways: Thousands disrupted as flights axed amid IT crash

As written on bbc.com
Serious problems with British Airways' IT systems have led to thousands of passengers having their plans disrupted, after all flights from Heathrow and Gatwick were cancelled.
Passengers described "chaotic" scenes at the airports, with some criticising BA for a lack of information.
The airline has apologised, and told passengers not to come to the airport.
BA chief executive Alex Cruz said: "We believe the root cause was a power supply issue."
In a video statement released via Twitter, he added: "I am really sorry we don't have better news as yet, but I can assure you our teams are working as hard as they can to resolve these issues."
Mr Cruz said there was no evidence the computer problems were the result of a cyber attack.
The airline hoped to be able to operate some long haul inbound flights on Saturday, landing in London on Sunday, Mr Cruz added.
The GMB union has suggested the failure could have been avoided, had the airline not outsourced its IT work.
BA denied the claim, saying: "We would never compromise the integrity and security of our IT systems".
All passengers affected by the failure - which coincides with the first weekend of the half-term holiday for many in the UK - will be offered the option of rescheduling or a refund.
The airline, which had previously said flights would be cancelled until 18:00 BST, has now cancelled all flights for Saturday and asked passengers not to come to Gatwick or Heathrow airports.
Other airlines flying in and out of the two airports are unaffected.
Architect and TV presenter George Clarke was stuck in Heathrow. He told the BBC it was one of the "most turbulent, badly organised days, that I've ever experienced in Britain".
"The lack of communication all day was woeful. There wasn't a single Tannoy announcement all day in the terminal, not a single member of staff came up to us," he said.
"The only time I found out my flight was cancelled was from the BBC News website."
Piles of checked luggage could be seen on the floor at Heathrow

Some passengers have reported having to leave Heathrow without their luggage
The problems have affected BA call centres, the website and the mobile app.
Aviation expert Julian Bray said: "It's frozen the whole system so no British Airways plane can actually take off, they can't move the baggage, they can't issue passenger credentials, in fact they can't do anything at all.
"This is a very serious problem, they should have been able to switch to an alternative system - surely British Airways should be able to do this."
Malcolm Ginsberg, editor in chief at Business Travel News, expects the disruption to last for "three or four days".
BA aircraft landing at Heathrow are unable to park as outbound aircraft cannot vacate the gates, which has resulted in passengers being stuck on aircraft.
Journalist Martyn Kent said he had been sitting on a plane at Heathrow for 90 minutes. He said the captain told passengers the IT problems were "catastrophic".


Mick Rix, GMB's national officer for aviation said: "This could have all been avoided.
"BA in 2016 made hundreds of dedicated and loyal IT staff redundant and outsourced the work to India... many viewed the company's actions as just plain greedy."
BA staff in Heathrow's Terminal 5 were resorting to using white boards, according to passenger Gareth Wharton.
Delays have been reported in Rome, Prague, Milan, Stockholm and Malaga due to the system failure.
Philip Bloom said he had been waiting on board a Heathrow-bound flight at Belfast for two hours.
He added: "We haven't been told very much just that there is a worldwide computer system failure.
"We were told that we couldn't even get on other flights because they are unable to see what flights we can be moved to."

Analysis - By Richard Westcott, BBC transport correspondent

With a lack of technology, staff were using whiteboards in Heathrow
As ever, it is a lack of information that is really making BA passengers angry… we're still awaiting an explanation from the airline and a timescale for how long the problems might last.
The GMB union says this meltdown could have been avoided if BA had not made hundreds of IT staff redundant and outsourced their jobs to India at the end of last year.
Yes, the union has a big axe to grind, but people will want to know if the airline made its IT systems more vulnerable by scaling back computer support to save money - although BA has just flatly denied it to me.
IT problems ripple through an airline. If planes cannot take off, they cannot leave gaps at the gate for others to land.
If flights are delayed by more than around five hours, the airline must swap crews because shift lengths are strictly limited for safety reasons.
Telling customers to stay away is a drastic measure, but it is the only chance BA has of clearing the backlog of flights.

The BBC's Phillip Norton was at Rome international airport, waiting to fly to London.
He said BA staff were unable to say how long delays would be, telling him "all flights are grounded around the world".
Alma Saffari was in Marseille waiting to get her flight back to Heathrow.
She said: "When we finally boarded the captain came out and told us their computer systems were down worldwide.
"Eventually after sitting on the tarmac for one and a half hours we disembarked the plane.
"Now we are sitting in the departure area outside the gate."
Ms Saffari, who is with her 13-month-old baby, said she had been given a voucher for food and drink.

EU flight delay rights

Passengers have experienced large queues and disruption at Heathrow Terminal 5, British Airways' main London terminal
  • If your flight departed the European Union or was with a European airline, you might have rights under EU law to claim if the delay or cancellation was within the airline's control
  • Short-haul flights: 250 euros for delays of more than three hours
  • Medium-haul flights: 400 euros for delays of more than three hours
  • Long-haul flights: 300 euros for delays of between three and four hours; and 600 euros for delays of more than four hours
  • If your flight's delayed for two or more hours the airline must offer food and drink, access to phone calls and emails, and accommodation if you're delayed overnight - including transfers between the airport and the hotel
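
As a rough illustration, the compensation bands listed above can be written as a small lookup. This is a sketch of the figures in this list only: the function name and the "short"/"medium"/"long" labels are hypothetical, and the actual regulation has further conditions (such as distance thresholds) that are not modelled here.

```python
def eu_delay_compensation(haul: str, delay_hours: float,
                          within_airline_control: bool = True) -> int:
    """Return compensation in euros using the bands listed above.

    `haul` is one of "short", "medium" or "long" (hypothetical labels).
    """
    if haul not in ("short", "medium", "long"):
        raise ValueError(f"unknown haul type: {haul!r}")
    # No claim if the cause was outside the airline's control,
    # or if the delay is not more than three hours.
    if not within_airline_control or delay_hours <= 3:
        return 0
    if haul == "short":
        return 250
    if haul == "medium":
        return 400
    # Long-haul: 300 euros between three and four hours, 600 beyond that.
    return 300 if delay_hours <= 4 else 600
```

For example, `eu_delay_compensation("long", 5)` falls in the top band, while the same delay caused by something outside the airline's control returns 0.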





Earlier this month, a monkey caused a nationwide power outage in Kenya. Millions of homes and businesses were without electricity. Which just goes to show that “not all disasters come in the form of major storms with names and categories,” says Bob Davis, CMO, Atlantis Computing.

“Electrical fires, broken water pipes, failed air conditioning units [and rogue monkeys] can cause just as much damage,” he says. And while “business executives might think they’re safe based on their geographic location,” it’s important to remember that “day-to-day threats can destroy data [and] ruin a business,” too, he says. That’s why it is critical for all businesses to have a disaster recovery (DR) plan.

However, not all DR plans are created equal. To ensure that your systems, data and personnel are protected and your business can continue to operate in the event of an actual emergency or disaster, use the following guidelines to create a disaster plan that will help you quickly recover.

1. Inventory hardware and software. Your DR plan should include “a complete inventory of [hardware and] applications in priority order,” says Oussama El-Hilali, vice president of Products for Arcserve. “Each application [and piece of hardware] should have the vendor technical support contract information and contact numbers,” so you can get back up and running quickly.

2. Define your tolerance for downtime and data loss. “This is the starting point of your planning,” says Tim Singleton, president, Strive Technology Consulting. “If you are a plumber, you can probably be in business without servers or technology [for] a while. [But] if you are eBay, you can’t be down for more than seconds. Figuring out where you are on this spectrum will determine what type of solution you will need to recover from a disaster.”

“Evaluate what an acceptable recovery point objective (RPO) and recovery time objective (RTO) is for each set of applications,” advises David Grimes, CTO, NaviSite. “In an ideal situation, every application would have an RPO and RTO of just a few milliseconds, but that’s often neither technically nor financially feasible. By properly identifying these two metrics businesses can prioritize what is needed to successfully survive a disaster, ensure a cost-effective level of disaster recovery and lower the potential risk of miscalculating what they’re able to recover during a disaster.”

“When putting your disaster recovery plan in writing, divide your applications into three tiers,” says Robert DiLossi, senior director, Testing & Crisis Management, Sungard Availability Services. “Tier 1 should include the applications you need immediately. These are the mission-critical apps you can’t do business without. Tier 2 covers applications you need within eight to 10 hours, even up to 24 hours. They’re essential, but you don’t need them right away. Tier 3 applications can be comfortably recovered within a few days,” he explains.

“Defining which applications are most important will aid the speed and success of the recovery. But most important is testing the plan at least twice per year,” he says. “The tiers might change based on the results, which could reveal unknown gaps to fill before a true disaster.”
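
One way to make the three tiers concrete is to derive each application's tier from its RTO and sort the inventory accordingly. The sketch below is illustrative only: the `Application` class, the function names and the one-hour cutoff used for "immediately" are assumptions, not figures from the experts quoted here.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    rto_hours: float  # maximum tolerable time to restore, in hours


def recovery_tier(app: Application) -> int:
    """Map an RTO onto the three tiers described above."""
    if app.rto_hours <= 1:    # assumption: "immediately" means within an hour
        return 1
    if app.rto_hours <= 24:   # tier 2: needed within roughly 8 to 24 hours
        return 2
    return 3                  # tier 3: can wait a few days


def recovery_order(apps: list[Application]) -> list[Application]:
    """Sort applications by tier, then by RTO within each tier."""
    return sorted(apps, key=lambda a: (recovery_tier(a), a.rto_hours))
```

Re-running the sort after each test makes it easy to see when an application has moved between tiers.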

3. Lay out who is responsible for what – and identify backup personnel. “All disaster recovery plans should clearly define the key roles, responsibilities and parties involved during a DR event,” says Will Chin, director of cloud services, Computer Design & Integration. “Among these responsibilities must be the decision to declare a disaster. Having clearly identified roles will garner a universal understanding of what tasks need to be completed and who is [responsible for what]. This is especially critical when working with third-party vendors or providers.  All parties involved need to be aware of each other's responsibilities in order to ensure the DR process operates as efficiently as possible.”

“Have plans for your entire staff, from C-level executives all the way down, and make sure they understand the process,” and what’s expected of them, says Neely Loring, president, Matrix, which provides cloud-based solutions, including Disaster-Recovery-as-a-Service. “This gets everyone back on their feet quicker.”

“Protocols for a disaster recovery (DR) plan must include who and how to contact the appropriate individuals on the DR team, and in what order, to get systems up and running as soon as possible,” adds Kevin Westenkirchner, vice president, operations, Thru. “It is critical to have a list of the DR personnel with the details of their position, responsibilities [and emergency contact information].”

“One final consideration is to have a succession plan in place with trained back-up employees in case a key staff member is on vacation or in a place where they cannot do their part [or leaves the company],” says Brian Ferguson, product marketing manager, Digium.
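
The contact list and succession plan described in this step can be kept as structured data so that a backup is picked automatically when the primary person is unreachable. This is a minimal sketch; the `Role` class, the `resolve_contacts` function and all names in the usage note are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Role:
    title: str
    primary: str
    backups: list[str] = field(default_factory=list)  # in succession order


def resolve_contacts(roles: list[Role],
                     unavailable: set[str]) -> dict[str, Optional[str]]:
    """For each role, pick the first person in line who is reachable.

    A value of None means nobody is available for that role,
    which is a gap the plan should close.
    """
    contacts = {}
    for role in roles:
        candidates = [role.primary, *role.backups]
        contacts[role.title] = next(
            (person for person in candidates if person not in unavailable),
            None,
        )
    return contacts
```

Calling `resolve_contacts(roles, unavailable={"Alice"})` with a role whose primary is Alice and whose first backup is Bob would return Bob for that role.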

4. Create a communication plan. “Perhaps one of the more overlooked components of a disaster recovery plan is having a good communication plan,” says Mike Genardi, solutions architect, Computer Design & Integration. “In the event a disaster strikes, how are you going to communicate with your employees? Do your employees know how to access the systems they need to perform their job duties during a DR event?

“Many times the main communication platforms (phone and email) may be affected and alternative methods of contacting your employees will be needed,” he explains. “A good communication plan will account for initial communications at the onset of a disaster as well as ongoing updates to keep staff informed throughout the event.”

“Communication is critical when responding to and recovering from any emergency, crisis event or disaster,” says Scott D. Smith, chief commercial officer at ModusLink. So having “a clear communications strategy is essential. Effective and reliable methods for communicating with employees, vendors, suppliers and customers in a timely manner are necessary beyond initial notification of an emergency. Having a written process in place to reference ensures efficient action post-disaster and alignment between organizations, employees and partners.”

“A disaster recovery plan should [also] include a statement that can be published on your company’s website and social media platforms in the event of an emergency,” adds Robert Gibbons, CTO, Datto, a data protection platform. And be prepared to “give your customers timely status updates on what they can expect from your business and when. If your customers understand that you are aware of the situation, you are adequately prepared and working to take care of it in a timely manner, they will feel much better.”

5. Let employees know where to go in case of emergency – and have a backup worksite. “Many firms think that the DR plan is just for their technology systems, but they fail to realize that people (i.e., their employees) also need to have a plan in place,” says Ahsun Saleem, president, Simplegrid Technology. “Have an alternate site in mind if your primary office is not available. Ensure that your staff knows where to go, where to sit and how to access the systems from that site. Provide a map to the alternate site and make sure you have seating assignments there.”

“In the event of a disaster, your team will need an operational place to work, with the right equipment, space and communications,” says DiLossi. “That might mean telework and other alternative strategies need to be devised in case a regional disaster causes power outages across large geographies. Be sure to note any compliance requirements and contract dedicated workspace where staff and data can remain private. [And] don’t contract 50 seats if you’ll really need 200 to truly meet your recovery requirements.”

6. Make sure your service-level agreements (SLAs) include disasters/emergencies. “If you have outsourced your technology to an outsourced IT firm, or store your systems in a data center/co-location facility, make sure you have a binding agreement with them that defines their level of service in the event of a disaster,” says Saleem. “This [will help] ensure that they start working on resolving your problem within [a specified time]. Some agreements can even discuss the timeframe in getting systems back up.”

7. Include how to handle sensitive information. “Defining operational and technical procedures to ensure the protection of…sensitive information is a critical component of a DR plan,” says Eric Dieterich, partner, Sunera. “These procedures should address how sensitive information will be maintained [and accessed] when a DR plan has been activated.”

8. Test your plan regularly. “If you’re not testing your DR process, you don’t have one,” says Singleton. “Your backup hardware may have failed, your supply chain may rely on someone incapable of dealing with disaster, your internet connection may be too slow to restore your data in the expected amount of time, the DR key employee may have changed [his] cell phone number. There are a lot of things that may break a perfect plan. The only way to find them is to test it when you can afford to fail.”

“Your plan must include details on how your DR environment will be tested, including the method and frequency of tests,” says Dave LeClair, vice president, product marketing, Unitrends, a cloud-based IT disaster recovery and continuity solution provider. “Our recent continuity survey of 900 IT admins discovered less than 40 percent of companies test their DR more frequently than once per year and 36 percent don’t test at all.

“Infrequent testing will likely result in DR environments that do not perform as required during a disaster,” he explains. “Your plan should define recovery time objective (RTO) and recovery point objective (RPO) goals per workload and validate that they can be met. Fortunately, recovery assurance technology now exists that is able to automate DR testing without disrupting production systems and can certify RTO and RPO targets are being met for 100 percent confidence in disaster recovery even for complex n-tier applications.”

Also keep in mind that “when it comes to disaster recovery, you’re only as good as your last test,” says Loring. “A testing schedule is the single most important part of any DR plan. Compare your defined RTO and RPO metrics against tested results to determine the efficacy of your plan. The more comprehensive the testing, the more successful a company will be in getting back on their feet,” he states. “We test our generators weekly to ensure their function. Always remember that failing a test is not a bad thing. It is better to find these problems early than to find them during a crisis. Decide what needs to be modified and test until you’re successful.”
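
Comparing defined RTO and RPO targets against tested results can be as simple as the check below. It is a sketch under stated assumptions: the function name, the workload names in the usage note and the choice of minutes as the unit are all hypothetical.

```python
def evaluate_dr_test(targets: dict[str, tuple[float, float]],
                     measured: dict[str, tuple[float, float]]) -> dict[str, list[str]]:
    """Compare (RTO, RPO) targets with measured test results, in minutes.

    Returns only the workloads with a problem: a missed target,
    or no test result at all.
    """
    report = {}
    for workload, (rto_target, rpo_target) in targets.items():
        if workload not in measured:
            report[workload] = ["untested"]
            continue
        rto_actual, rpo_actual = measured[workload]
        misses = []
        if rto_actual > rto_target:
            misses.append(f"RTO missed: {rto_actual} min > {rto_target} min")
        if rpo_actual > rpo_target:
            misses.append(f"RPO missed: {rpo_actual} min > {rpo_target} min")
        if misses:
            report[workload] = misses
    return report
```

An empty report means every workload was tested and met both targets; anything else is a list of what to fix before the next test.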

And don’t forget about testing your employees. “The employees that are involved need to be well versed in the plan and be able to perform every task they are assigned to without issue,” says Ferguson. “Running simulated disasters and drills help ensure that your staff can execute the plan when an actual event occurs.”

Cloud migration and disaster recovery of load balanced multi-tier applications

Support for Microsoft Azure virtual machine availability sets has been a highly anticipated capability for many Azure Site Recovery customers who use the product for either cloud migration or disaster recovery of applications. Today, I am excited to announce that Azure Site Recovery now supports creating failed over virtual machines in an availability set. This in turn allows you to configure an internal or external load balancer to distribute traffic between multiple virtual machines of the same tier of an application. With the Azure Site Recovery promise of cloud migration and disaster recovery of applications, this first-class integration with availability sets and load balancers makes it simpler for you to run your failed over applications on Microsoft Azure with the same guarantees that you had while running them on the primary site.
In an earlier blog of this series, you learned about the importance and complexity involved in recovering applications – Cloud migration and disaster recovery for applications, not just virtual machines. The next blog was a deep-dive on recovery plans describing how you can do a One-click cloud migration and disaster recovery of applications. In this blog, we look at how to failover or migrate a load balanced multi-tier application using Azure Site Recovery.
To demonstrate real-world usage of availability sets and load balancers in a recovery plan, a three-tier SharePoint farm with a SQL Always On backend is being used.  A single recovery plan is used to orchestrate failover of this entire SharePoint farm.
Disaster Recovery of three tier SharePoint Farm
Here are the steps to set up availability sets and load balancers for this SharePoint farm when it needs to run on Microsoft Azure:
  1. Under the Recovery Services vault, go to Compute and Network settings of each of the application tier virtual machines, and configure an availability set for them.
  2. Configure another availability set for the web tier virtual machines.
  3. Add the two application tier virtual machines and the two web tier virtual machines in Group 1 and Group 2 of a recovery plan respectively.
  4. If you have not already done so, import the most popular Azure Site Recovery automation runbooks into your Azure Automation account.
  5. Add script ASR-SQL-FailoverAG as a pre-step to Group 1.
  6. Add script ASR-AddMultipleLoadBalancers as a post-step to both Group 1 and Group 2.
  7. Create an Azure Automation variable using the instructions outlined in the scripts. For this example, these are the exact commands used.
# Settings read by the ASR-SQL-FailoverAG and ASR-AddMultipleLoadBalancers runbooks
$InputObject = @{
    # SQL Server virtual machines used for test failover and for real failover
    "TestSQLVMRG"   = "SQLRG"
    "TestSQLVMName" = "SharePointSQLServer-test"
    "ProdSQLVMRG"   = "SQLRG"
    "ProdSQLVMName" = "SharePointSQLServer"
    # SQL Always On availability groups to fail over
    "Paths" = @{
        "1" = "SQLSERVER:\SQL\SharePointSQL\DEFAULT\AvailabilityGroups\Config_AG"
        "2" = "SQLSERVER:\SQL\SharePointSQL\DEFAULT\AvailabilityGroups\Content_AG"
    }
    # One entry per virtual machine GUID: the load balancer to attach after failover
    "406d039a-eeae-11e6-b0b8-0050568f7993" = @{ "LBName" = "ApptierInternalLB"; "ResourceGroupName" = "ContosoRG" }
    "c21c5050-fcd5-11e6-a53d-0050568f7993" = @{ "LBName" = "ApptierInternalLB"; "ResourceGroupName" = "ContosoRG" }
    "45a4c1fb-fcd3-11e6-a53d-0050568f7993" = @{ "LBName" = "WebTierExternalLB"; "ResourceGroupName" = "ContosoRG" }
    "7cfa6ff6-eeab-11e6-b0b8-0050568f7993" = @{ "LBName" = "WebTierExternalLB"; "ResourceGroupName" = "ContosoRG" }
}
# Serialize the settings to JSON and store them as an Azure Automation variable
$RPDetails = New-Object -TypeName PSObject -Property $InputObject | ConvertTo-Json
New-AzureRmAutomationVariable -Name "SharePointRecoveryPlan" -ResourceGroupName "AutomationRG" -AutomationAccountName "ASRAutomation" -Value $RPDetails -Encrypted $false
You have now completed customizing your recovery plan and it is ready to be failed over.
Azure Site Recovery SharePoint Recovery Plan
Once the failover (or test failover) is complete and the SharePoint farm runs in Microsoft Azure, it looks like this:
SharePoint Farm on Azure failed over using Azure Site Recovery
Watch this demo video to see all this in action: how, using the built-in constructs that Azure Site Recovery provides, we can fail over a three-tier application using a single-click recovery plan. The recovery plan automates the following tasks:
  1. Failing over SQL Always On Availability Group to the virtual machine running in Microsoft Azure
  2. Failing over the web and app tier virtual machines that were part of the SharePoint farm
  3. Attaching an internal load balancer on the application tier virtual machines of the SharePoint farm that are in an availability set
  4. Attaching an external load balancer on the web tier virtual machines of the SharePoint farm that are in an availability set
With relentless focus on ensuring that you succeed with full application recovery, Azure Site Recovery is the one-stop shop for all your disaster recovery and migration needs. Our mission is to democratize disaster recovery with the power of Microsoft Azure, to enable not just the elite tier-1 applications to have a business continuity plan, but offer a compelling solution that empowers you to set up a working end to end disaster recovery plan for 100% of your organization's IT applications.
You can check out additional product information and start protecting and migrating your workloads to Microsoft Azure using Azure Site Recovery today. You can use the powerful replication capabilities of Azure Site Recovery for 31 days at no charge for every new physical server or virtual machine that you replicate, whether it is running on VMware or Hyper-V. To learn more about Azure Site Recovery, check out our How-To Videos. Visit the Azure Site Recovery forum on MSDN for additional information and to engage with other customers, or use the Azure Site Recovery User Voice to let us know what features you want us to enable next.

Protect and extend your datacenter

As written on microsoft.com
Backup and disaster recovery solutions powered by Azure Backup and Azure Site Recovery, built on Microsoft’s trusted cloud platform, protect and extend your datacenter. Site Recovery’s replication capabilities help protect critical applications and extend your datacenter securely to Azure, enabling recovery, dev/testing and migration of applications to Azure—all while you control data privacy and access. Azure Backup is a scalable solution that protects your application data and gives you visibility into where and how the data is managed. Azure Backup retains data for up to 99 years with zero capital investment and minimal operating costs, helping you meet your compliance obligations.

Video: https://youtu.be/X-NKtpXxX-s


The public cloud is disrupting how modern enterprises back up their mission critical applications. Operations Management Suite delivers a purpose-built, intelligent backup platform that makes Azure the ideal replacement for ailing legacy solutions. Azure Backup, an OMS service, protects your investments across your datacenter, hosted clouds, and on Azure.
With up to 99 years of retention, Azure Backup meets or exceeds even the most demanding regulatory and business requirements. You can depend on Azure’s largest portfolio of compliance standards and certifications in the industry.
Use Azure as your disaster recovery site and eliminate the capital and operating expenses of maintaining a secondary datacenter.
Leverage Azure’s secure infrastructure to test copies of your production workloads, without impacting production users, or test new versions of applications while maintaining updated production data.
Creating new workloads in the cloud is easy, but migrating existing production applications can be a daunting task. Operations Management Suite makes migration chores a thing of the past. Site Recovery, an OMS service, effortlessly replicates complex production workloads and allows you to validate their functionality before performing a migration. Gain peace of mind with the operational assurance of state-of-the-art security and encryption of Azure.

Managed Solution is a full-service technology firm that empowers business by delivering, maintaining and forecasting the technologies they’ll need to stay competitive in their marketplace. Founded in 2002, the company quickly grew into a market leader and is recognized as one of the fastest growing IT companies in Southern California.

We specialize in providing full managed services to businesses of every size, industry, and need.

