Project Overview
Confiz partnered with a Fortune 500 retail client to build machine learning (ML) models that predict POS controller downtime, improving the efficiency of IT operations and securing the transport of data from source to destination.
The Need
In the retail sector, keeping POS controllers healthy has become essential, especially for organizations operating at a large scale, because it ensures a smooth flow of data from the store registers to mainframe computers. Controller downtime can be particularly costly: it disrupts staff productivity and causes the loss of data that is crucial to business growth and revenue.
To mitigate this risk, one of Confiz’s biggest Fortune 500 customers needed a solution that could predict POS controller downtime and so keep data moving smoothly and securely from source to destination.
The Solution
Under Confiz’s leadership, a team of highly skilled data scientists, software engineers and certified architects was brought together. The team proposed applying ‘Artificial Intelligence for IT Operations (AIOps)’ techniques to POS controller downtime prediction. The goal was to predict downtime at least one hour in advance, giving the IT department enough time to resolve any network issues proactively.
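The one-hour lead time can be made concrete as a labeling rule for training data. The following is a minimal sketch, not the team's actual implementation: it assumes event logs have already been grouped into time windows, and the function name `label_windows`, the window representation and the sample timestamps are all hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Assumed lead time, taken from the "at least one hour in advance" goal.
HORIZON = timedelta(hours=1)

def label_windows(window_ends, downtime_starts, horizon=HORIZON):
    """Return 1 for each window whose end is followed by a downtime
    starting within `horizon`, and 0 otherwise."""
    starts = sorted(downtime_starts)
    labels = []
    for end in window_ends:
        # Index of the first downtime that starts strictly after this window ends.
        i = bisect_right(starts, end)
        upcoming = i < len(starts) and starts[i] - end <= horizon
        labels.append(1 if upcoming else 0)
    return labels

# Hypothetical sample data: four log windows and one recorded downtime.
day = datetime(2024, 1, 1)
window_ends = [
    day.replace(hour=10, minute=0),
    day.replace(hour=10, minute=30),
    day.replace(hour=11, minute=0),
    day.replace(hour=12, minute=30),
]
downtime_starts = [day.replace(hour=11, minute=15)]

labels = label_windows(window_ends, downtime_starts)
print(labels)
```

Windows labeled 1 are the ones a classifier would need to flag so that an alert reaches the IT department before the downtime begins.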
A large volume of event-log data (3-4 GB/day) was exported to Hive on Hadoop for better performance during real-time ML model training. Multiple algorithms, including Support Vector Machine, Logistic Regression and Random Forest, were evaluated to improve prediction accuracy. To develop a sophisticated modeling pipeline, Kafka was used as the messaging layer to ensure scalability and fault tolerance, Spark MLlib was used to enable high-performance model serving, and Databricks Slack was used to trigger email alerts. The resulting solution successfully predicted POS controller downtime, empowering IT operations.
The Outcome
Powerful visualization
Powerful business intelligence (BI) visualization allowed the client to create informative data representations, facilitating better decision-making and understanding of complex data patterns.
Prediction accuracy
The solution achieved 80% prediction accuracy for the client, giving confidence that the data-driven models performed correctly.
Optimal service performance
Preventing large-scale server outages ensured uninterrupted service availability.
Timely alerts
Timely performance alerts provided real-time information about the health and status of systems, helping the client to respond promptly to issues.
Improved end-user experience
Enhancing the end-user experience, often through user interface and experience design, allowed the client to raise user satisfaction and interaction with its products or services.
ARIMA application
Successfully applying ARIMA (Autoregressive Integrated Moving Average) made it easy to analyze and predict data trends.
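To illustrate what an ARIMA forecast involves, here is a minimal sketch of the simplest useful case, ARIMA(1,1,0) without a constant, fitted by conditional least squares in plain Python. The case study does not state which model order, data series or library was used, so the order, the function names and the sample series below are assumptions for illustration only.

```python
def fit_arima_110(series):
    """Estimate the AR coefficient phi of an ARIMA(1,1,0) model
    (no constant term) by conditional least squares."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    # d = 1: work on first differences of the series.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # p = 1: regress each difference on the previous one.
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den if den else 0.0

def forecast_arima_110(series, phi, steps=1):
    """Forecast `steps` values ahead by projecting the last difference
    forward with phi and accumulating it back onto the level."""
    level = series[-1]
    diff = series[-1] - series[-2]
    forecasts = []
    for _ in range(steps):
        diff = phi * diff
        level = level + diff
        forecasts.append(level)
    return forecasts

# Hypothetical series whose increments halve at each step.
series = [0, 8, 12, 14, 15]
phi = fit_arima_110(series)
forecasts = forecast_arima_110(series, phi, steps=2)
print(phi, forecasts)
```

In practice a library implementation with automatic order selection and diagnostics would normally be preferred over a hand-rolled fit like this one.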