When Microsoft IT deployed Skype for Business 2015 to support our highly mobile global user base, our goal was to provide the best user experience in the industry. We learned valuable lessons about hardware requirements, managing our complex network, accommodating diverse and remote clients, and running a unified communications platform in a hybrid cloud environment. We also helped develop a Call Quality Dashboard to help other organizations optimize the user experience.
Microsoft is a leader in unified communications—where voice, instant messaging, and conferencing converge to help employees communicate and collaborate effectively from anywhere. In 2011, Microsoft acquired Skype and integrated it into our Lync unified communications solution to create Skype for Business. Skype for Business has a design inspired by Skype and the security, compliance, and control of Lync.
In 2013, Microsoft IT planned to deploy a pre-release version of Skype for Business to the Microsoft global user base. Feedback from these users would help the product team improve the product before public release. To get Skype for Business to work well for our internal users, though, we would need to manage a complex environment. Unified communications is a real-time service that’s sensitive to change, client-to-client or server health anomalies, network latency, packet loss, and jitter.
Also, we knew that our hardware would be insufficient to support peak usage. We knew this because when we upgraded from Lync 2010 to Lync 2013, users experienced poor call quality, dropped calls, and bad connections. In 2014, we had 10 major incidents in which as many as 1,000 Lync users were unable to make calls or join meetings, or were disconnected during a call. We determined that the problem was outdated hardware. The Lync 2013 architecture requires more robust hardware than Lync 2010, but we were still running the old servers. Skype for Business has the same architecture as Lync 2013, so without a hardware upgrade, the user experience would be poor, no matter what else we did.
Together with the product team, we launched the Get to Green program in March 2014, with “green” being the desired state of the service as shown in our metrics. Our goal was to make the end-to-end Skype for Business user experience the best in the industry. In addition to upgrading hardware, we needed to address issues arising from incompatible client drivers and hardware and a variety of networking environments. Also, more and more of our users were connecting to Skype for Business using personal devices and personal wireless networks that we don’t manage. We would need to find ways to improve the way our service performs on these unmanaged devices and external networks.
We got together with the product team to plan the Get to Green program. Our goal was to improve the user experience so there would be fewer dropped calls and better voice and video quality. To succeed, we would need to assess the environment and identify areas of opportunity to improve the service.
We would measure our success by using the Global Employee Satisfaction Survey and the Poor Call Rate (PCR). The employee satisfaction survey is administered bi-annually to a cross-section of employees that represent all roles and regions. It gathers their opinions about Microsoft IT services and resources, including their unified communications user experiences. PCR is an objective measure of call quality, based on a mean opinion score (MOS) for packet loss, jitter, concealment ratio, and round-trip times.
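As a rough illustration of how a metric like PCR can be derived from per-call measurements, the following Python sketch marks a call as poor when any one of its network measurements crosses a threshold, and then reports the percentage of poor calls. The `CallRecord` type, its field names, and all threshold values are assumptions made for this example. They are not taken from the Call Quality Dashboard or from the MOS calculation described above.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Network measurements for one call (hypothetical field names)."""
    packet_loss_rate: float  # fraction of packets lost, 0.0 to 1.0
    jitter_ms: float         # average jitter in milliseconds
    concealed_ratio: float   # fraction of audio samples concealed, 0.0 to 1.0
    round_trip_ms: float     # average round-trip time in milliseconds


# Illustrative thresholds only; assumed for this sketch.
MAX_PACKET_LOSS_RATE = 0.10
MAX_JITTER_MS = 30.0
MAX_CONCEALED_RATIO = 0.07
MAX_ROUND_TRIP_MS = 500.0


def is_poor(call: CallRecord) -> bool:
    """Return True if any single measurement exceeds its threshold."""
    return (
        call.packet_loss_rate > MAX_PACKET_LOSS_RATE
        or call.jitter_ms > MAX_JITTER_MS
        or call.concealed_ratio > MAX_CONCEALED_RATIO
        or call.round_trip_ms > MAX_ROUND_TRIP_MS
    )


def poor_call_rate(calls: list[CallRecord]) -> float:
    """Return the percentage of calls classified as poor (0.0 if no calls)."""
    if not calls:
        return 0.0
    poor = sum(1 for call in calls if is_poor(call))
    return 100.0 * poor / len(calls)


# Made-up sample: three healthy calls and one call with high jitter.
sample_calls = [
    CallRecord(0.01, 5.0, 0.00, 80.0),
    CallRecord(0.02, 12.0, 0.01, 120.0),
    CallRecord(0.00, 8.0, 0.00, 60.0),
    CallRecord(0.03, 45.0, 0.02, 150.0),
]
print(poor_call_rate(sample_calls))  # 25.0
```

Treating a call as poor when any single measurement is out of range keeps the metric conservative: one bad dimension is enough to count against the service.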
To plan improvements that would have the most impact, we assessed the service environment and identified the following areas that affect the user experience the most.
To improve the user experience, we focused our efforts on improving these areas:
We decided to focus on improving service quality for our most challenging group of users: field salespeople. Of all our users, they depend most on the Skype for Business service, and they rely heavily on unified communications to do their work. Because they’re often away from corporate offices, they don’t have the benefit of our stable corporate network. Instead, they often connect over external wireless networks of variable quality, so their calls are the most affected by network anomalies and by quality and reliability issues. We knew that once we got the service working well for them, all of our users would benefit.
The following two tables show the roles that are most affected by service quality, and the percentage of field sales people that are affected by poor PCR, respectively.
Over a period of several months, we made improvements to the server and network infrastructure, client devices, and user support. We’ve also continued migrating more of our user base to the cloud. While we still have a way to go, early results show that our approach is working, and the user experience is improving.
For the on-premises deployment of Skype for Business, a key area that we needed to address was server reliability and availability. To improve both, we needed to increase server capacity and introduce redundancy to support the Skype for Business architecture. The old hardware we were using had been designed for Lync 2010, which had a distributed architecture where each capability or service ran on a separate server. To increase scalability, the Lync 2013 architecture allows multiple services to run on a single server or across server farms. Capacity can then be increased by adding servers. This architecture places greater demands on each server, though. More CPU and memory are required to serve peak loads. For redundancy, we would need to add servers.
Skype for Business uses the same architecture as Lync 2013. To increase reliability and performance, we deployed more robust hardware to meet the new requirements. Also, to take advantage of its threading improvements over Microsoft Windows Server 2008, we decided to run the infrastructure on Windows Server 2012 R2 instead. Upgrading to Windows Server 2012 R2 yielded the added benefits of Windows Fabric, which Skype for Business makes extensive use of.
While still running Lync 2013, we upgraded all of our hardware to support the new consolidated architecture, where multiple services run on the same server. We first set up the new hardware infrastructure and then migrated our Lync 2013 servers over to it. This increased server capacity and network bandwidth to support optimal performance at peak load. It eliminated single points of failure and created redundancy to make the service highly available. Once Lync 2013 was up and running on the new hardware, we were able to do an in-place upgrade to Skype for Business.
To do this migration, we started with the back-end servers and user pools, and then migrated the front-end servers. We migrated groups of users in a phased manner so that we could monitor and correct issues as we went along. When all users were migrated, we decommissioned the old hardware. After the servers were upgraded, we upgraded the Lync clients to Skype for Business clients.
We needed to ensure that the network could support peak load, which meant upgrading our data center circuits. We also configured the appropriate firewall settings, provided better DNS infrastructure, and enabled end-to-end Quality of Service (QoS) on the network to prioritize voice and video traffic.
We also needed to account for changes in the way users access unified communications. With Lync 2010, most of our users had hard-wired connections. By the time we were ready to deploy Skype for Business, most of them used wireless connections. The wireless infrastructure in our buildings was creating a huge bottleneck that we had to fix.
We’ve improved our networks and upgraded our unified communications devices to gain better performance and call quality, as follows:
For details on network planning approaches for Lync Server and Skype for Business Server 2015, see Network Planning, Monitoring, and Troubleshooting with Lync Server.
We developed a Skype for Business tool called the Call Quality Dashboard to help us track down call quality issues. Some of these issues are caused by devices that have incompatible drivers and hardware. The dashboard lets us drill down and identify exactly which devices are causing problems, even personal, unmanaged devices. We can then work with the users to correct the issues. We’re now able to manage all of our devices better. The Call Quality Dashboard is discussed in more detail later, in Monitoring service health.
We’re gradually moving our users to the cloud-based Office 365 Enterprise E5 service, which includes Skype for Business. By 2017, we plan to move 90 percent of our users to this service (keeping some users on-premises so we can continue to support our on-premises server product). This will resolve many of our current reliability and availability issues. It will also reduce the cost of supporting unified communications.
We’re migrating our users in steps. Within the United States, we’ve moved almost all of our users to the Office 365 Enterprise E5 service. To support our customers outside the United States, we still use the Skype for Business 2015 on-premises solution. This is because, until recently, Office 365 Enterprise E5 was available only in North America. Now the service is expanding globally, and we plan to move all of our international users to it by 2017. We’ll do this in stages as the service becomes available in different parts of the world. As we gradually migrate our international users, we’ll be able to eliminate the on-premises infrastructure in other countries/regions and data centers.
In the meantime, some of our users are hosted on a cloud server, but still have on-premises voice service provided by a telecommunications company. Ultimately, when we move everyone to Office 365 Enterprise E5, we will no longer need the external telecommunications provider, but will receive all of our communications services through Office 365 Enterprise E5.
Telemetry doesn’t tell the entire story. We also collect and prioritize user feedback to reveal blind spots and drive improvements to the product and service. The Global Employee Satisfaction Survey—our main mechanism for listening to users—tells us where we need to improve. In addition, we’ve created an internal SharePoint site called Skype@Microsoft (shown in Figure 3) that gives users ways to send us feedback and requests. It’s the starting point for everything to do with using Skype for Business: community engagement, information, self-service tools, and alerts.
We also gather data from a questionnaire that pops up when a user finishes a Skype call. It lets us know about call quality issues. We view the data in our Call Quality Dashboard, described later.
We depend on our users to make good technology choices. Using the right kinds of devices, peripherals, and Wi-Fi networks with Skype for Business improves their experience. Our Skype@Microsoft SharePoint site gives users help on using Skype for Business, including guidance on technology selection and self-service tools to help them assess how well their client is working. We recommend that they select from a list of peripheral devices that we certified for Skype for Business. The certification process ensures that the devices work well. For the list, see Phones and devices for Skype for Business. We also provide instructional videos.
For our field salespeople, our most challenging user group, we’ve also developed an outreach program that includes training on tools, tips, and best practices to get the best Skype for Business user experience. These are summarized in the following figure.
We use a number of tools to continuously monitor service health, so that we can correct issues that might interfere with a good user experience.
To help us diagnose network infrastructure issues affecting call quality, we developed the Call Quality Dashboard, which is included with Skype for Business Server 2015. For each phone call, it shows the type of call (wired or wireless, internal or external) and provides a measure of call quality. It uses PCR as a key performance indicator and rates calls from 1 to 4 based on packet loss and jitter. We also developed the Call Quality Methodology to use with the dashboard data. It provides a step-by-step approach to improving call quality. This has helped us to speed up our investigations and quickly resolve issues.
Using the dashboard, Microsoft IT managers drill down into the metrics—even to the individual call—to ensure that we’re delivering the best user experience at each location or building. We look at the following information:
We use this data along with the Call Quality Methodology to drive improvements across Microsoft, and so far have reduced PCR from 8 percent to less than 2 percent. We’re training IT managers to use the tools to drive improvements in their buildings by correcting issues with underperforming devices, incompatible drivers and client versions, and insufficient network bandwidth.
Our IT site managers perform site investigations by drilling down into Call Quality Dashboard data to uncover the source of issues. Once they know the source, they can remediate it. The following screen capture shows a top-level view of the data for one of our buildings. The yellow trend lines in the graphs represent the PCR rates on wired and Wi-Fi networks and by day of week. In this case, they’re all trending down, which means the service is getting healthier. The red sections in the graphs represent calls with a PCR that’s higher than the target desirable state. We drill down for more detail, such as the type of calls involved, the network device drivers being used, the wireless hotspot in use, the wireless channel, and so forth. The user ratings that we capture on call quality are also included in the dashboard.
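The drill-down described above amounts to splitting calls into segments (wired versus Wi-Fi, driver version, wireless access point, and so on) and comparing each segment's PCR with a target. The Python sketch below shows that idea in a minimal form. The dictionary keys, the sample data, and the 2 percent target are assumptions for illustration; the dashboard's actual data model and target value are not described in this paper.

```python
from collections import defaultdict


def pcr_by_segment(calls, key):
    """Group calls by the value of `key` and return {segment: PCR percent}."""
    totals = defaultdict(int)
    poor = defaultdict(int)
    for call in calls:
        segment = call[key]
        totals[segment] += 1
        if call["poor"]:
            poor[segment] += 1
    return {segment: 100.0 * poor[segment] / totals[segment] for segment in totals}


def segments_over_target(rates, target_pct):
    """Return the segments whose PCR is above the target, sorted by name."""
    return sorted(segment for segment, rate in rates.items() if rate > target_pct)


# Made-up calls for one building; "poor" is a precomputed flag per call.
building_calls = [
    {"network": "wired", "poor": False},
    {"network": "wired", "poor": False},
    {"network": "wifi", "poor": True},
    {"network": "wifi", "poor": False},
]

rates = pcr_by_segment(building_calls, "network")
print(rates)                             # {'wired': 0.0, 'wifi': 50.0}
print(segments_over_target(rates, 2.0))  # ['wifi']
```

Because the grouping key is a parameter, the same function can be reused to segment by any other per-call attribute, such as a driver version or access point name, if those fields are present in the records.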
We use the management pack for Skype for Business Server 2015 to monitor our servers and get alerts on issues, such as when Skype for Business processes exceed a defined performance threshold.
We use the following Key Health Indicator (KHI) performance counters to get metrics about server health: CPU and memory utilization, and TCP transmit time. Along with other resources, you can download the KHI Guide that outlines the methodology that we use to measure KHIs on servers and our environment.
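A threshold check over KHI samples can be sketched in a few lines of Python. The counter names and limits below are hypothetical placeholders chosen for this example; the KHI Guide defines the real counters and their recommended thresholds.

```python
# Hypothetical counter names and limits, assumed for this sketch only.
KHI_LIMITS = {
    "cpu_percent": 80.0,
    "memory_percent": 85.0,
    "tcp_transmit_ms": 100.0,
}


def breached_khis(sample, limits=KHI_LIMITS):
    """Return {counter: value} for every counter in `sample` above its limit.

    Counters missing from the sample are skipped rather than treated as healthy
    or unhealthy, so a partial sample never raises an error.
    """
    breaches = {}
    for counter, limit in limits.items():
        value = sample.get(counter)
        if value is not None and value > limit:
            breaches[counter] = value
    return breaches


# Made-up sample from one server: only CPU is above its limit.
server_sample = {"cpu_percent": 91.5, "memory_percent": 60.0, "tcp_transmit_ms": 40.0}
print(breached_khis(server_sample))  # {'cpu_percent': 91.5}
```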
We use tools such as the policy assurance manager tool in HP Network Automation to ensure that routers and switches in the data centers are running a compliant configuration and to ensure QoS is enabled end to end. We can also determine where we need to provide additional capacity to achieve availability and reliability for the network and server infrastructure. We use another internal tool to ensure all the network devices are running the gold code and that they’re meeting our capacity and compliance standards.
We also use tools such as Unify Square PowerMon to measure quality during synthetic transactions. We set up probes and test accounts in data centers.
While we’re continually improving, we’re already seeing improvements in the user experience and also enjoying cost benefits:
Use these best practices to improve the user experience with Skype for Business in your organization.
Make sure that server capacity and network bandwidth support optimal performance at peak load. Use redundant systems to make sure that the service is highly available. Enable networking QoS, and open the recommended ports for optimal performance. To ensure your infrastructure supports the best possible service, be sure to follow the capacity planning guidelines for Skype for Business.
Acquire and set up the tools discussed in this paper so you can monitor and manage Skype for Business service quality.
To gain performance and feature benefits, plan to move your Skype for Business users to the cloud—Office 365 Enterprise E5. Not only will it cost less, but it will increase your unified communications capabilities. Also, users like the Skype for Business client. Our Microsoft users are much happier with it.
If you haven’t already deployed a unified communications service, you can start offering a 100-percent cloud-based service through Office 365 Enterprise E5. Not only will you avoid needing to support the infrastructure, but you’ll no longer have to pay telecommunications providers for telephone services. Rather, your users can connect to the Internet using Skype for Business, and Microsoft Azure will route telephone calls for them. This can represent significant savings for your organization.
Take these steps to ensure a great user experience:
Make sure that users are empowered with tools and training to get the best possible Skype for Business experience. There are many situations that users can manage better than IT can. Help your users help themselves by giving them guidance and the right tools. Provide real-time notification of incidents and self-service workarounds. Make information on best practices easy to find.
Provide tools to ensure that the client is as healthy as possible before a user joins a meeting.
For remote users, provide guidance for selecting and configuring a home router. Have a list of recommended Wi-Fi routers. Use diagnostic tools to make sure the home Wi-Fi network is performing well.
Recommend Skype-certified headsets and peripherals to ensure the best possible experience for your meetings. The certification process ensures that peripherals work well.
By Nat Friedman as written on blogs.microsoft.com
Watching Xamarin co-founder and open source pioneer Miguel de Icaza announce this onstage was a proud moment for all of us. The future of native cross-platform mobile development is now in the hands of every developer. We look forward to seeing your contributions; go to open.xamarin.com to get involved.