Really Big Data At Walmart!
Walmart, the world’s biggest retailer with over 20,000 stores in 28 countries, is in the process of building the world’s biggest private cloud, designed to process 2.5 petabytes of data every hour.
To make sense of all of this information, and put it to work solving problems, the company has created what it calls its Data Café – a state-of-the-art analytics hub located within its Bentonville, Arkansas headquarters.
Here, over 200 streams of internal and external data, including 40 petabytes of recent transactional data, can be modelled, manipulated and visualised. Teams from any part of the business are invited to bring their problems to the analytics experts and then see a solution appear before their eyes on the nerve centre’s touch screen “smart boards”.
This tool has cut the time it takes to answer complex business questions, which depend on multiple internal and external variables, from weeks to minutes.
Senior Statistical Analyst Naveen Peddamail, who won his job with the company through a contest on the crowd-sourced data-science site Kaggle, spoke to me about the project.
When analysts can interrogate huge amounts of verified, quantifiable data at high speed, problems caused by human error or miscalculation at the planning or execution stage of a particular business activity will often simply melt away.
For example, Naveen told me about a grocery team that could not understand why sales had suddenly declined in a particular product category. The team came to the Café to find out why and, by drilling into the data, quickly saw that pricing miscalculations had led to the products being listed at a higher price than intended in some regions.
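Walmart has not published its tooling, but the kind of drill-down that surfaces an error like this can be sketched in a few lines of Python: join the intended price book against the prices actually listed, keyed by region. All of the table and column names below are invented for illustration.

```python
import pandas as pd

# Hypothetical data: intended vs. actually listed prices per region.
intended = pd.DataFrame({
    "sku": ["A1", "A1", "A1"],
    "region": ["Northeast", "Midwest", "South"],
    "intended_price": [2.99, 2.99, 2.99],
})
listed = pd.DataFrame({
    "sku": ["A1", "A1", "A1"],
    "region": ["Northeast", "Midwest", "South"],
    "listed_price": [2.99, 3.99, 2.99],
})

# Join the two price books and flag regions where the shelf price
# drifted above what merchandising intended.
prices = intended.merge(listed, on=["sku", "region"])
mispriced = prices[prices["listed_price"] > prices["intended_price"]]
print(mispriced)  # -> Midwest listed at 3.99 instead of 2.99
```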
In another example, during Halloween, sales analysts were able to see in real time that although a particular novelty cookie was very popular in most stores, there were two stores where it wasn’t selling at all. The alert allowed the situation to be investigated quickly, and it turned out that a simple stocking oversight had left the cookies off the shelves. The company was then able to rectify the situation immediately, avoiding further lost sales.
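A minimal sketch of that check, with invented numbers: if an item sells briskly chain-wide, a store reporting zero units almost certainly has a stocking or display problem rather than zero demand.

```python
import pandas as pd

# Hypothetical 24-hour sales of one novelty cookie SKU across stores.
sales = pd.DataFrame({
    "store": ["S001", "S002", "S003", "S004", "S005"],
    "units_sold_last_24h": [148, 212, 0, 175, 0],
})

# A zero-sales store only stands out if the item is genuinely popular
# elsewhere, so gate the alert on the chain-wide median.
chain_median = sales["units_sold_last_24h"].median()
suspect = sales[(sales["units_sold_last_24h"] == 0) & (chain_median > 50)]
for store in suspect["store"]:
    print(f"ALERT: {store} sold 0 units; chain median is {chain_median:.0f}")
```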
The system also provides automated alerts: when a particular metric falls below a set threshold in any department, the team responsible is invited to bring the problem to the Data Café and, hopefully, find a quick solution.
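Walmart has not described its alerting logic, but a threshold rule of this kind is conceptually simple, as the hypothetical sketch below shows; the departments, metrics and thresholds are placeholders, not Walmart’s actual KPIs.

```python
# Hypothetical threshold floors, keyed by (department, metric).
THRESHOLDS = {
    ("grocery", "weekly_sales_growth_pct"): -2.0,
    ("apparel", "sell_through_rate"): 0.55,
}

def check_metrics(latest: dict) -> list[str]:
    """Return an alert for every metric that fell below its threshold."""
    alerts = []
    for (dept, metric), floor in THRESHOLDS.items():
        value = latest.get((dept, metric))
        if value is not None and value < floor:
            alerts.append(
                f"{dept}: {metric} = {value} fell below {floor}; "
                "team invited to the Data Cafe"
            )
    return alerts

print(check_metrics({("grocery", "weekly_sales_growth_pct"): -3.4}))
```

In practice such rules would run continuously against streaming data rather than a single dictionary, but the decision at the heart of each alert is the same comparison.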
As well as 200 billion rows of transactional data (representing only the past few weeks!), the Café pulls in information from 200 sources, including meteorological data, economic data, Nielsen data, telecom data, social media data, gas prices, and local events databases.
Anything within these vast and varied datasets could hold the key to solving a particular problem, and Walmart’s algorithms are designed to blaze through them in microseconds to come up with real-time solutions.
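As a sketch only (the real pipeline is proprietary and runs at a vastly larger scale), blending an internal sales stream with an external feed such as weather often boils down to a keyed join followed by a correlation check. The data below is invented.

```python
import pandas as pd

# Hypothetical daily sales and an external weather feed for one region.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2015-10-01", "2015-10-02", "2015-10-03"]),
    "umbrella_units": [40, 95, 38],
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2015-10-01", "2015-10-02", "2015-10-03"]),
    "rainfall_mm": [0.0, 22.5, 1.2],
})

# Join the internal and external streams on date, then measure how
# strongly the external variable tracks sales.
blended = sales.merge(weather, on="date")
print(blended["umbrella_units"].corr(blended["rainfall_mm"]))
```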
Initially, the solution was geared mainly towards solving problems in the merchandising arm of the business, but in time it will be expanded to other areas such as HR and marketing. This is all part of the company’s plan to build the largest cloud-based database in the world.
While separate, siloed systems are still used within stores for many functions, such as inventory management, customer loyalty and price comparisons with local rivals, the eventual aim is to bring all of this information under one roof, where its impact on every part of the company’s operations can be assessed and assimilated into the analytics.
Walmart has huge amounts of data at its fingertips and the resources to collect far more. Combine this with the ability to make very fast decisions and implement changes based on incoming, real-time data, and it is clear why Walmart sees data as key to keeping itself at the top.