Automating sales data matching and consolidation with IDQO

Automating the Tedious Task of Product and Customer Data Matching
Recently, an international food brand approached Faktion because it faced challenges in consolidating its downstream sales data, specifically in matching product and customer data. The brand regularly received sales data from its numerous tier-1 wholesalers in a broad variety of formats. Each delivery had to be matched against the brand's own product and customer data, a long and intensive process that, when done manually, leads to inefficiencies and inconsistencies. The brand therefore worked with Faktion to automate this process.

Dynamic Product Range

The range of products sold by a manufacturer to downstream resellers is a dynamic and ever-changing landscape, due to a multitude of factors:

  • Temporary promotional items that are only available for a limited time and are typically used as a marketing strategy to drive sales. These kinds of products are usually only available for several weeks or months, after which they are removed from the product range.

  • Changes in article numbers and barcodes for the same products. Manufacturers often change a product's label, bill of materials, or content. To keep track of these changes, the manufacturer changes the article number or barcode.

  • Removal of old and addition of new products. A manufacturer's product strategy depends on changes in consumer demand, technology advancements, and updates to the product range, among other things.

Combined, these factors make the range of products sold by a manufacturer to downstream resellers highly variable, and it requires constant monitoring and adjustment to remain competitive. The main challenge is to link the articles in the resellers' sales data to the manufacturer's actual database, because resellers don't necessarily create new articles when the manufacturer does.
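
As a rough illustration of this linking problem, the sketch below tries an exact barcode match first and falls back to fuzzy similarity on the product description. It is a minimal example under stated assumptions, not Faktion's actual method: the catalogue contents, the `match_product` helper, and the 0.6 similarity cut-off are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical manufacturer catalogue: article number -> (barcode, description).
CATALOGUE = {
    "A100": ("5400000000011", "Tomato Ketchup 500ml"),
    "A101": ("5400000000028", "Tomato Ketchup 500ml Summer Promo"),
    "A200": ("5400000000035", "Mayonnaise 250ml"),
}


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(text.lower().split())


def match_product(barcode, description, min_score=0.6):
    """Return (article_number, score), or (None, best_score) if nothing is close enough.

    min_score is an illustrative cut-off, not a recommended value.
    """
    # Step 1: an exact barcode hit is the strongest signal.
    for article, (known_barcode, _) in CATALOGUE.items():
        if barcode and barcode == known_barcode:
            return article, 1.0

    # Step 2: fall back to fuzzy similarity on the normalized description.
    best_article, best_score = None, 0.0
    for article, (_, known_description) in CATALOGUE.items():
        score = SequenceMatcher(
            None, normalize(description), normalize(known_description)
        ).ratio()
        if score > best_score:
            best_article, best_score = article, score

    if best_score >= min_score:
        return best_article, best_score
    return None, best_score
```

A reseller line whose barcode is missing or outdated can still be linked through its description, while lines that fall below the cut-off are left unmatched for a person to review.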

Dynamic Customer Data

The customer data in the resellers' sales data also changes constantly, for various reasons, and matching it can be more complex than matching product data.

Firstly, the types of customers are diverse. Some are other resellers, some are private customers, and some are direct customers such as hotels. This makes it difficult for the manufacturer to consolidate all the sales data from the resellers into its existing dataset.

Secondly, the customer base changes frequently, as companies often change ownership, VAT numbers, or company names, which makes it hard to keep track of customer data. This becomes even more challenging when the manufacturer is trying to consolidate the sales data from multiple resellers, as each reseller may have different information about the same customer.

Lastly, sales data from the resellers is often inconsistent and incomplete, which makes it hard for the manufacturer to accurately consolidate the data. This is because the resellers don’t always report all the data and additional info.
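
To make the customer side concrete, here is a minimal sketch of how two reseller records might be compared after normalizing VAT numbers and company names. The field names (`vat`, `name`), the list of legal-form suffixes, and the `same_customer` helper are assumptions made for illustration; a real solution would also need to handle customers whose VAT number or name has changed.

```python
import re

# Hypothetical legal-form suffixes to ignore when comparing company names.
LEGAL_FORMS = {"nv", "bv", "bvba", "sa", "srl", "ltd", "gmbh", "inc"}


def normalize_vat(vat):
    """Keep only letters and digits, uppercased: 'be 0123.456.789' -> 'BE0123456789'."""
    return re.sub(r"[^A-Za-z0-9]", "", vat or "").upper()


def normalize_name(name):
    """Lowercase, remove punctuation, and drop legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", "", (name or "").lower()).split()
    return " ".join(token for token in tokens if token not in LEGAL_FORMS)


def same_customer(a, b):
    """Compare VAT numbers when both records have one; otherwise compare names."""
    vat_a, vat_b = normalize_vat(a.get("vat")), normalize_vat(b.get("vat"))
    if vat_a and vat_b:
        return vat_a == vat_b
    # Incomplete records: fall back to the normalized company name.
    name_a, name_b = normalize_name(a.get("name")), normalize_name(b.get("name"))
    return bool(name_a) and name_a == name_b
```

With these rules, "Hotel Astoria N.V." from one reseller and "HOTEL ASTORIA nv" from another would be treated as the same customer even when one of the records has no VAT number.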

Automating a Manual Process Using AI

Integrating the downstream sales data by matching the product and customer data to the existing sales database is a process heavily reliant on manual work by domain experts (sales managers and data specialists) and is expensive, time-consuming, and prone to errors. In order to address these challenges, Faktion envisioned and developed an AutoML solution that automates the bulk of these tedious tasks.

To kickstart the project with Faktion, a highly interactive workshop was organized between the teams of the brand and Faktion to discuss the current challenges and opportunities of the sales data sharing process and to maximize knowledge transfer.

The Four-Phased Approach

Faktion developed a four-phased approach to address these challenges. The first phase involves creating a Proof of Concept that builds the foundations of the solution, followed by the Proof of Value in phase two, where an advanced classification system for data-linking is developed. The third phase consists of the development of a fully operational AI product. In the last stage, phase four, the AI product is scaled across the company, adding more features and functionalities. Between phases, strategic review meetings involving all stakeholders are an essential part of Faktion’s way of working. During these review meetings, the approach and results are evaluated and the subsequent work packages are discussed and confirmed.

In Phase 1 of the data matching & consolidation project, the foundation for a robust and scalable system architecture is built. This includes the database, analytics tools, and a baseline AI system to automate product and customer linking. This allows the client to internalize and improve data observability, provide better insight into data quality, and set realistic targets for data matching accuracy.

In Phase 2, Faktion brings the company-specific product and customer linking to a state-of-the-art level with advanced classification techniques. Using the infrastructure from Phase 1, explicit natural language processing tasks are defined and an appropriate linking model is implemented for each task. This phase specifically uses a combination of rule-based and machine learning algorithms to match and classify the data. Basic ML modelling is complemented with additional techniques such as data enrichment and active-learning-based data annotation to improve the solution’s performance.
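
The combination of rules, a model, and active learning described for Phase 2 can be pictured roughly as follows. This is a sketch under assumptions rather than the actual implementation: `link_record`, `select_for_annotation`, and the shape of the rule and model callables are hypothetical, and uncertainty sampling is just one common way to choose records for annotation.

```python
def link_record(record, rules, model_score):
    """Try deterministic rules first; fall back to a model score in [0, 1].

    rules:       callables that return a target id, or None if they don't apply.
    model_score: callable that returns (target_id, probability) for a record.
    """
    for rule in rules:
        target = rule(record)
        if target is not None:
            return {"record": record, "target": target,
                    "probability": 1.0, "source": "rule"}
    target, probability = model_score(record)
    return {"record": record, "target": target,
            "probability": probability, "source": "model"}


def select_for_annotation(results, budget):
    """Uncertainty sampling: pick the model results closest to probability 0.5."""
    model_results = [r for r in results if r["source"] == "model"]
    model_results.sort(key=lambda r: abs(r["probability"] - 0.5))
    return model_results[:budget]
```

The records returned by `select_for_annotation` are the ones the model is least sure about, so labelling them first tends to give the most value per annotation.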

In Phase 3, the focus is on operationalizing the solution. During the strategic review at the end of Phase 2, the team decides on the industrialization and productization of the AI solution within the preferred technology stack. Initial productization includes a basic lightweight user interface (Excel plugin) and API to monitor, validate, and operate the data ingestion and automated data matching and classification.

In Phase 4, further productization is done after extensive user validation. It includes additional components and features that provide automated insights into sales data, let business users monitor and manage the AI component’s performance, and let the client’s users define and configure new tasks within the Matching & Classification AutoML Platform. In this phase we typically build a custom front-end integrated with the business users' standard workflow tools.

Impressive Results

The result was the automation of more than 85% of the data matching and data consolidation process (at more than 97% accuracy), with domain experts involved only for human validation or additional annotation where the machine learning models aren't yet confident enough. Within the platform, our client can set the respective human validation thresholds for different product data or customer data fields and types.
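
The idea of configurable validation thresholds can be sketched as a simple routing rule. The threshold values, field-type names, and helper functions below are hypothetical and only illustrate the mechanism; they are not the platform's actual configuration.

```python
# Hypothetical per-field thresholds; the values are illustrative only.
THRESHOLDS = {"product": 0.95, "customer": 0.90}
DEFAULT_THRESHOLD = 0.99  # be conservative for field types without a setting


def route(field_type, confidence):
    """Accept a match automatically or send it to a domain expert."""
    threshold = THRESHOLDS.get(field_type, DEFAULT_THRESHOLD)
    return "auto_accept" if confidence >= threshold else "human_validation"


def automation_rate(decisions):
    """Share of decisions that needed no human validation."""
    if not decisions:
        return 0.0
    return sum(d == "auto_accept" for d in decisions) / len(decisions)
```

Raising a threshold sends more matches to human validation and lowers the automation rate; lowering it does the opposite, at the risk of accepting more incorrect matches.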

We're proud of our customized solution for 'Intelligent Data Quality Optimization', which significantly reduces the amount of manual work required and speeds up the availability of newly processed data.

