Co-authored by: Peter Butler
Did you know that 30% of the 100,000 monthly requests to a large European furniture retailer’s I.T. service desk are made via email? Until recently, these were all handled manually, taking over 2,000 hours per month to review.
In this blog post, we will demonstrate how we applied state-of-the-art artificial intelligence and deep learning techniques to transform this process, generating approximately $3.6 million per annum in savings. Keen to know more about this impact of artificial intelligence? Read on!
Service desk requests range from a staff member resetting their password to a notification that a server is shutting down. The sheer breadth of topics handled by the service desk makes automating this process with deep learning and artificial intelligence a complex big data problem.
The client believed that being able to handle such a large quantity of unstructured data in a smart and efficient manner through AI technology and machine learning models would generate substantial cost savings. As a result, the client invited us to bring in a team of data scientists, data engineers, and automation engineers to tackle this problem through AI technology, neural networks, and deep learning.
Previously, the client assigned these tickets manually: a reviewer read the emailed service request, assigned it to the correct user group to deal with the issue, and set a priority on the ticket. This process was prone to error because it relied on the subjective opinion of the reviewer and on external factors such as the length of time a reviewer had worked.
We proposed to create leading-edge deep learning models to automate this whole procedure. To achieve this objective, our team worked in an Agile fashion, delivering results every four to five weeks, with overall outcomes assessed every four months. The team primarily used open-source technologies and coded our data science models in Python. Our development environment was a server hosted on Google Cloud Platform (GCP).
Figure 1: Ticket classification flow
Figure 1 demonstrates how we successfully amended the flow to reduce the need for manual review; further details follow.
Data Science Modeling
From a data science perspective, this is a classification problem where we used natural language processing (NLP) to extract critical information from the email. Our aim was to use historical data where the correct classifications had already been made. We used these correct labels to teach a machine learning algorithm the patterns between the email and the label; this is called a supervised approach.
First, we pre-processed the data to get it into a format we can use for a machine learning algorithm. We took the following steps:
- We cleaned the data, removing all punctuation, special characters, and digits, as these carry no relevant information.
- We converted the data to lowercase.
- We removed what we call ‘stop words’, such as ‘and’, ‘the’, etc. These are extremely common words that contain little information relevant to the problem.
- We applied vectorization, as described below.
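The cleaning steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `preprocess` function name and the short `STOP_WORDS` list are assumptions made for the example, since the real stop-word list is not given here.

```python
import re

# Illustrative stop-word list (an assumption; the real list is not given).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "is", "in", "my", "i", "on", "for", "it"}

def preprocess(text):
    """Clean an email and return its remaining tokens."""
    # Step 1: replace punctuation, special characters, and digits with spaces.
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    # Step 2: convert to lowercase.
    text = text.lower()
    # Step 3: split on whitespace and drop stop words.
    return [token for token in text.split() if token not in STOP_WORDS]

print(preprocess("Hi, I forgot my PASSWORD for the office laptop (asset #4521)!"))
# ['hi', 'forgot', 'password', 'office', 'laptop', 'asset']
```

The output is a list of tokens that is ready for the vectorization step.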
Vectorization is an important process that converts text data into a numerical format that can be used as an input to machine learning models. In this example, keywords, such as “password” or “office”, were found to be extremely indicative of the overall request, irrespective of their surrounding context, as seen in Figure 2.
Figure 2: Keywords associated with each service desk team
To best identify important keywords, we converted the email summary and body into a numerical feature vector using a technique called term frequency-inverse document frequency (TF-IDF). TF-IDF creates a vector by weighting the frequency of each word in the body of text against the frequency of that word across the entire email corpus, which in our case contained approximately three million emails.
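To make the weighting concrete, here is a small pure-Python sketch of one common TF-IDF formulation: term frequency multiplied by log(N / document frequency). The `tfidf_vectors` function and the three toy emails are assumptions for illustration; production libraries typically use smoothed or normalized variants, and the exact variant used in this project is not stated.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, list of TF-IDF vectors)."""
    vocab = sorted({token for doc in docs for token in doc})
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(token for doc in docs for token in set(doc))
    idf = {token: math.log(n_docs / df[token]) for token in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for an empty email
        vectors.append([counts[token] / total * idf[token] for token in vocab])
    return vocab, vectors

# Toy corpus standing in for the real emails (already pre-processed into tokens).
emails = [
    ["forgot", "password", "reset", "password"],
    ["office", "printer", "broken"],
    ["reset", "office", "wifi", "password"],
]
vocab, vectors = tfidf_vectors(emails)
for token, weight in zip(vocab, vectors[0]):
    print(f"{token:10s} {weight:.3f}")
```

In this formulation, a word that appears in every email receives a weight of zero, while a word that is frequent in one email but rare across the corpus receives a high weight.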
Figure 3: Data modeling pipeline
An artificial neural network was used to solve the classification problem based on the TF-IDF representation described above. The neural network can learn the complex non-linear relationships that map an email vector to, for instance, the right priority rating a ticket should have. This flow is seen in Figure 3. Neural networks are powerful black-box machine learning algorithms that are very flexible in learning complicated problems. Some example applications are:
- Autonomous driving
- Predictive text
- Facial recognition
- Medical diagnosis
In this case, the neural network was able to learn the decision boundary that separates user groups or priority ratings. Figure 4 shows such an example for a ticket with three possible priority ratings. A complex decision boundary (dotted line) is learned by the neural network from historical data. When new data is passed to the model, the network classifies the data point based on what it learned from the historical data.
Figure 4: Example of a decision boundary
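As a rough illustration of how a trained network maps a feature vector to a priority rating, the sketch below runs the forward pass of a tiny network with one hidden layer and a softmax output. Everything in it is an assumption for the example: the layer sizes, the hand-picked weights in `PARAMS`, and the three labels in `PRIORITIES`. In a real model the weights are learned from the historical data and the input vector is far larger.

```python
import math

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def predict_proba(features, params):
    hidden = relu(dense(features, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))

# Hypothetical weights: 3 input features -> 2 hidden units -> 3 priority classes.
PARAMS = {
    "w1": [[1.0, -1.0, 0.5], [-0.5, 2.0, 0.0]],
    "b1": [0.0, 0.1],
    "w2": [[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]],
    "b2": [0.0, 0.0, 0.0],
}
PRIORITIES = ["low", "medium", "high"]

probs = predict_proba([0.3, 0.0, 0.6], PARAMS)
best = max(range(len(probs)), key=lambda i: probs[i])
print(PRIORITIES[best], [round(p, 3) for p in probs])
```

The class with the highest probability is the side of the decision boundary on which the data point falls.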
The model was created using approximately three million examples. 90% of the examples were used to train the model and configure the correct weightings for the neural network, and 10% of examples were used to test and validate the model to ensure it is applicable to future data. Figure 5 shows the model training and production flow.
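A 90/10 split like the one described above can be sketched as follows. The `train_test_split` helper, the fixed seed, and the stand-in data are assumptions for illustration; how the real split was drawn (for example, randomly or by date) is not stated.

```python
import random

def train_test_split(examples, test_fraction=0.1, seed=0):
    """Shuffle a copy of the examples and hold out test_fraction of them."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # Everything after the held-out slice is used for training.
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in data: (email_id, label) pairs instead of the real emails.
examples = [(i, "password" if i % 2 else "hardware") for i in range(1000)]
train, test = train_test_split(examples)
print(len(train), len(test))  # 900 100
```

Keeping the held-out examples away from training is what lets the test score indicate how the model may behave on future emails.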
Figure 5: Machine learning and production flow
In step 1, we trained and validated our models to ensure they gave excellent results and were ready for production. In step 2 (production), we hosted the model in a Docker container. The model receives email data through an API, transforms it into numerical features using our trained TF-IDF model, and passes these features to the trained neural network, which makes a prediction based on its learned decision boundary. The result is then returned through the API, where our automation team creates an incident in the client’s ticketing systems based on the model’s output.
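The production step can be pictured as a single request handler. The sketch below is a simplified assumption of how such a handler might look, not the deployed service: the JSON field names (`summary`, `body`, `label`, `confidence`), the `handle_request` function, and the toy stand-ins for the trained TF-IDF and neural network models are all hypothetical.

```python
import json

def handle_request(raw_body, vectorize, predict_proba, labels):
    """Turn a JSON email payload into a JSON prediction.

    vectorize and predict_proba stand in for the trained TF-IDF model and
    the trained neural network; the field names are hypothetical.
    """
    email = json.loads(raw_body)
    text = f"{email.get('summary', '')} {email.get('body', '')}"
    features = vectorize(text)
    probabilities = predict_proba(features)
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return json.dumps({"label": labels[best], "confidence": probabilities[best]})

# Toy stand-ins so that the sketch runs end to end.
def toy_vectorize(text):
    words = text.lower().split()
    return [words.count("password"), words.count("server")]

def toy_predict_proba(features):
    total = sum(features) + 2  # add-one smoothing over the two classes
    return [(f + 1) / total for f in features]

request = json.dumps({"summary": "Password reset", "body": "I forgot my password"})
print(handle_request(request, toy_vectorize, toy_predict_proba, ["access", "infrastructure"]))
```

Passing the models in as arguments keeps the handler independent of how they were trained, which makes it straightforward to swap in a retrained model.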
Figure 6: Model confidence level
As a final failsafe, each model prediction has an associated probability that indicates how confident the model is in that prediction. We require a confidence of at least 70% for a prediction to be considered valid, so we don’t return results where the model is not confident. Figure 6 shows an example of the distribution of probabilities the model returned on the test set. Currently over 95% of predictions are more than 70% confident, and 93% are more than 99% confident.
With regard to this application, we currently have five machine learning models in production following this process, which are classifying and automating different elements of the service desk flow. Each is updated on a monthly basis with the latest data to ensure we capture the latest information. This is a highly scalable approach; we have since applied this to other areas and could roll out similar solutions rapidly to other clients.
The client estimated that these service desk models are producing cost savings of approximately $10 per ticket. With over 30,000 tickets per month, that is roughly $300,000 per month, or an estimated $3.6 million per annum. This work has been showcased internally and generated huge interest, leading to our team being scaled up. We are currently working on numerous interesting A.I. problems, such as our work on automating responses to reviews left on a new phone app, which will be covered in a future blog post.
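The thresholding logic itself is simple, as the following sketch shows. The function names and the sample confidences are made up for illustration, and treating exactly 70% as acceptable is an assumption. What happens to a rejected prediction (for example, falling back to manual review) is also an assumption and is left to the caller here.

```python
CONFIDENCE_THRESHOLD = 0.70  # the 70% cut-off described above

def accept_prediction(label, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return the label if the model is confident enough, otherwise None."""
    return label if confidence >= threshold else None

def coverage(confidences, threshold=CONFIDENCE_THRESHOLD):
    """Fraction of predictions that clear the threshold."""
    if not confidences:
        return 0.0
    return sum(c >= threshold for c in confidences) / len(confidences)

# Made-up confidences, not the real test-set distribution.
sample = [0.999, 0.995, 0.85, 0.62, 0.999]
print(accept_prediction("password_reset", 0.93))  # password_reset
print(accept_prediction("password_reset", 0.41))  # None
print(coverage(sample))  # 0.8
```

Tracking the coverage figure over time gives an early signal if incoming emails start to drift away from the data the model was trained on.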