Today I caught up with Matt Kliem, who manages the Network Operations Centre (NOC) at NewVoiceMedia. With millions of calls coming through the platform each month, it's essential that the platform is not just up but performing to a high standard every minute of the day.
Matt is focused on delivering a service that far exceeds the service levels a business could hope to achieve in-house. To maintain such a high level of uptime, Matt monitors a number of screens in our NOC. Here are the top 5 things Matt measures so that you don't have to.
1. End user experience
We are a web-based application, and it's essential that the service delivers when users click to make a call. "Many vendors will measure their service availability from simple ping/server uptime tests," says Matt.
"We go further and run a series of tests emulating typical user actions, like logging in, changing state, running a statistics report, or configuring a call plan. Not only are we measuring the availability of these user actions, we are also reporting on the performance of how long these tasks take to run. This allows us to be confident that we are providing a “great performing service”, rather than just an “available” service."
To use the analogy of Gmail, a ping test tells you whether Gmail.com can be reached, whereas our tests log in to Gmail, send an email, have a chat with a contact and accept a calendar invite.
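To make the difference between "available" and "great performing" concrete, here is a minimal Python sketch of a user-journey check. It is an illustration only and not the tooling used in the NOC: the names `run_journey`, `availability` and `slow_steps` are hypothetical, and each step is assumed to be a plain function that raises an exception when the user action fails.

```python
import time
from dataclasses import dataclass


@dataclass
class StepResult:
    name: str       # label of the user action, e.g. "log in"
    ok: bool        # True if the action completed without raising
    seconds: float  # how long the action took


def run_journey(steps, clock=time.perf_counter):
    """Run each (name, action) pair in order, timing it and recording success.

    `steps` is a list of (name, callable) pairs. The callables are
    placeholders for whatever drives the real application.
    """
    results = []
    for name, action in steps:
        start = clock()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        results.append(StepResult(name, ok, clock() - start))
    return results


def availability(results):
    """Fraction of steps that succeeded, between 0.0 and 1.0."""
    if not results:
        return 0.0
    return sum(r.ok for r in results) / len(results)


def slow_steps(results, threshold_seconds):
    """Names of steps that succeeded but took longer than the threshold."""
    return [r.name for r in results if r.ok and r.seconds > threshold_seconds]
```

A ping test would only report whether a server answered. A journey like this reports which user actions worked and how long each one took, which is the distinction Matt draws above.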
"We run over 9000 end-user experience web tests per data centre per day and all of these results are published live and un-edited to the public trust site - customers have access to the same information we do" Matt explains.
2. Application Servers
Matt continues, "Across our multiple data centres we run many web and database servers and these are also reporting to us in real-time. We're able to see the current load on each area of the platform and take remedial action if required."
A key benefit of this reporting is capacity planning. Across our entire customer base we have a really good idea of the typical call volume and load on our infrastructure for a typical user. As more customers are added to the platform we can plan accurately for more capacity within a data centre and across our entire platform.
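The arithmetic behind this kind of capacity planning can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the model used in the NOC: the function name `servers_needed`, the idea of a single "load per user" figure, and the `headroom` reserve are all hypothetical, and no real platform figures are implied.

```python
import math


def servers_needed(users, load_per_user, capacity_per_server, headroom=0.5):
    """Estimate how many servers are needed for a given number of users.

    users               -- expected number of users on the platform
    load_per_user       -- typical load one user generates (any consistent unit)
    capacity_per_server -- load one server can carry, in the same unit
    headroom            -- fraction of each server's capacity kept in reserve
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be at least 0 and less than 1")
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Only the capacity left after reserving headroom is planned against.
    usable = capacity_per_server * (1 - headroom)
    return math.ceil(users * load_per_user / usable)
```

Re-running the estimate with the projected user count as customers are added shows when a data centre would need more capacity, before the live load figures get there.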
3. Network activity
As well as ensuring the platform is running fast, we need to ensure that users can get to it. We therefore monitor the telephony and web network activity on each of our carriers into each of our data centres.
"Whilst we're confident in our capacity planning, keeping an eye on what's actually happening is important, especially if a specific carrier has issues and we're moving traffic around our platform to compensate," says Matt.
4. Telephony
As a telephony-based company, it's important for us to monitor the telephony circuits that deliver calls to our customers. In addition to our telephony circuit monitoring, we run regular telephony tests to emulate calls coming in to the platform. As Matt describes it, "Our automated tests make phone calls, and we check that these calls are successfully connected/answered by one of our automated monitoring agents. These automated tests also replicate choosing options from an IVR menu as well as dynamically routing and prioritising calls based on CRM data - it's pretty comprehensive."
"We make over 3000 automated telephony test calls per day per data centre, in addition to our standard telephony circuit monitoring, which means we'll know about any issues before customers do," says Matt.
5. PCI compliance
Matt's fifth selection relates to our PCI-compliant payment service. As part of our PCI-DSS compliance we constantly record any activity at our PCI data centres, and have live webcam feeds of those data centres shown on our NOC wall.
Matt explains, "As well as being securely locked and filmed, we run a range of daily security procedures in accordance with PCI-DSS compliance that give me the confidence that our PCI data centres are secure."
It's because of the business we're in that we can afford to do it, Matt says: "Our entire business is built on running and maintaining a secure platform so we can afford to make the investment in our intrusion detection systems, file integrity monitoring, access control audits as well as the people to constantly monitor these systems. This enables us to ensure we adhere to our strict security standards, policies and procedures outlined by PCI-DSS."
More to come
Matt is constantly looking to enhance the NOC, and to put more tests onto the Trust Site. "When you buy an on-premise solution you have a green light that tells you it's working. The Trust Site is our green light, and the more real-time tests we can put on it, the more confidence our customers can have in our platform."
We hope this post has been useful and given you some insight into Matt's work in the Network Operations Centre. If you have any questions for Matt, just ask them in the comments section below.
For more information on ContactWorld, and how a cloud-based contact centre could change the way you treat your customers when they call you, visit the ContactWorld page.