Revolutionizing Data Matching and Consolidation with AI

Automating the Tedious Task of Product and Customer Data Matching
Recently, an international food brand approached us about challenges in consolidating its downstream sales data, specifically in matching product and customer data. The brand regularly received sales data from its numerous tier-1 wholesalers in a wide variety of formats.

Dynamic Product Range

The range of products that a manufacturer sells to downstream resellers changes constantly. This is due to factors such as temporary promotional items, changes in article numbers and barcodes, and the phasing out of old products and introduction of new ones.

One of the main reasons for this dynamic nature is the inclusion of temporary promotional items. These are products that are only available for a limited time, and are typically used as a marketing strategy to drive sales. As a result, these products may only be available for a few weeks or months, and then phased out of the product range.

Another reason for the dynamic nature of the product range is that article numbers and barcodes change for what is essentially the same product. Manufacturers often make minor changes to a product, such as its label, bill of materials, or content, and want to maintain traceability. As a result, the article number or barcode may need to change to reflect these modifications.

In addition, products are phased out and new products are phased in as part of the manufacturer's product strategy. This could be due to a variety of reasons such as changes in consumer demand, advancements in technology, or the need to update the product range.

All these factors combined make the range of products sold by a manufacturer to downstream resellers a dynamic and ever-changing landscape that requires constant monitoring and adjustment in order to remain competitive. The main challenge is to link the articles in the resellers' sales data to the manufacturer's current product database, since resellers do not necessarily create a new article whenever the manufacturer does.
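As an illustration only, linking a reseller's article to the manufacturer's catalog could start with an exact barcode match and fall back to fuzzy matching on the description. The sketch below is not Faktion's implementation: the catalog entries, the `link_article` function, and the 0.6 similarity cutoff are all hypothetical, and it uses only Python's standard `difflib` module.

```python
from difflib import SequenceMatcher

# Hypothetical manufacturer catalog: article number -> (barcode, description).
CATALOG = {
    "A100": ("5400000000017", "Tomato Ketchup 500ml"),
    "A101": ("5400000000024", "Tomato Ketchup 500ml Promo +20%"),
    "A200": ("5400000000031", "Mayonnaise 250ml"),
}


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences matter less."""
    return " ".join(text.lower().split())


def link_article(barcode, description, min_score=0.6):
    """Return (article_number, score); article_number is None if nothing is close enough.

    min_score is an assumed cutoff, not a value taken from the project.
    """
    # Step 1: an exact barcode match is treated as certain.
    for article, (catalog_barcode, _) in CATALOG.items():
        if barcode and barcode == catalog_barcode:
            return article, 1.0

    # Step 2: fall back to the most similar catalog description.
    best_article, best_score = None, 0.0
    for article, (_, catalog_description) in CATALOG.items():
        score = SequenceMatcher(
            None, normalize(description), normalize(catalog_description)
        ).ratio()
        if score > best_score:
            best_article, best_score = article, score

    if best_score >= min_score:
        return best_article, best_score
    return None, best_score
```

A reseller line with an outdated barcode but a recognizable description, for example `link_article("", "MAYONNAISE 250 ML")`, would still resolve to article `A200` in this toy catalog, while an unrecognizable description returns `None` and can be handed to a domain expert.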



Dynamic Customer Data

The customer data reported in the resellers' sales data is also very dynamic, which makes matching and consolidating it even more complex than matching product data. Customer data in the retail and distribution context of a food manufacturer changes constantly. One of the main factors is that customers come and go, and there are various types of customers: other resellers, private customers, and direct customers such as hotels, restaurants, bars and food service companies. This diversity makes it difficult for the manufacturer to consolidate all the sales data from the resellers into its existing dataset.

Another factor that adds to the complexity of the customer data is the frequent changes in the customer base. Companies often change ownership, VAT numbers, and company names, which makes it difficult for the manufacturer to keep track of the customer data. This is particularly challenging when the manufacturer is trying to consolidate the sales data from multiple resellers, as each reseller may have different information about the same customer.

Furthermore, the sales data from the resellers is often inconsistent and incomplete, which makes it difficult for the manufacturer to consolidate the data accurately. Resellers do not always report all the data, nor do they always pass on additional information such as changes in ownership, VAT number, or company name.
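To make the customer-matching problem concrete, here is a minimal sketch of how two customer records from different resellers might be compared. It is an assumption-laden illustration, not the project's method: the record fields (`name`, `vat`, `postcode`), the list of legal-form suffixes, and the 0.85 name threshold are all hypothetical. Because VAT numbers can change, a differing VAT number does not rule out a match here; the comparison falls back to company name and postcode.

```python
import re
from difflib import SequenceMatcher

# Assumed list of legal-form suffixes to ignore when comparing company names.
LEGAL_FORMS = {"bv", "bvba", "nv", "sa", "srl", "gmbh", "ltd"}


def normalize_vat(vat):
    """Keep only letters and digits, uppercased: 'BE 0123.456.789' -> 'BE0123456789'."""
    return re.sub(r"[^A-Z0-9]", "", (vat or "").upper())


def normalize_name(name):
    """Lowercase, drop punctuation and legal-form suffixes (ASCII-only for simplicity)."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(token for token in tokens if token not in LEGAL_FORMS)


def same_customer(a, b, name_threshold=0.85):
    """Decide whether two reseller records likely describe the same customer."""
    vat_a, vat_b = normalize_vat(a.get("vat")), normalize_vat(b.get("vat"))
    # Identical VAT numbers are treated as a definite match.
    if vat_a and vat_a == vat_b:
        return True
    # Otherwise require a similar name AND the same postcode.
    name_score = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return name_score >= name_threshold and a.get("postcode") == b.get("postcode")
```

In practice the fallback rule is where most of the difficulty lies: a customer that changed both its name and its VAT number after a takeover would not be caught by this sketch and would still need expert review or additional data.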



Introducing the AI Revolution

Integrating the downstream sales data by matching the product and customer data to the existing sales database was a process heavily reliant on manual work by domain experts (sales managers and data specialists), which made it expensive, time-consuming and prone to errors. To address these challenges, Faktion envisioned and developed an AutoML solution that automates the bulk of these tedious tasks.

To kickstart the project with Faktion, a highly interactive workshop was organized between the teams of the brand and Faktion to discuss the current challenges and opportunities of the sales data sharing process and maximize knowledge transfer.


The Three-Phased Approach

Faktion developed a three-phased approach to address these challenges. The first two phases involve creating a Proof of Concept by building the foundations and then developing an advanced classification system for data-linking. The third phase consists of the development of a fully operational AI product. Strategic review meetings involving all stakeholders take place between the phases and are an essential part of Faktion’s way of working. During these review meetings, the approach and results are evaluated and the subsequent work packages are discussed and confirmed.

In Phase 1 of the data matching & consolidation project, the foundation for a robust and scalable system architecture is built. This includes the database, analytics tools, and a baseline AI system to automate product and customer linking. This allows the client to internalize and improve data observability, provide better insight into data quality, and set realistic targets for data matching accuracy.

In Phase 2, Faktion brings the company-specific product and customer linking to a state-of-the-art level with advanced classification techniques. Using the infrastructure from Phase 1, explicit natural language processing tasks are defined and an appropriate linking model is implemented for that task. This phase specifically uses a combination of rule-based and machine learning algorithms to classify and match the data. Basic ML modelling is complemented with additional techniques such as Active Learning and Data Enrichment to improve the solution’s performance.
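Active Learning can take several forms, and the article does not say which one was used. One common variant is least-confidence uncertainty sampling: the records the model is least sure about are sent to a domain expert first, so that each annotation improves the model as much as possible. The sketch below assumes a hypothetical `predictions` mapping from record ID to per-article probabilities produced by some linking model; the function name and the `budget` parameter are illustrative.

```python
def least_confident(predictions, budget):
    """Pick the `budget` records whose top predicted probability is lowest.

    predictions: dict mapping record ID -> dict of {candidate label: probability}.
    Returns record IDs ordered from least to most confident.
    """
    ranked = sorted(
        predictions.items(),
        key=lambda item: max(item[1].values()),  # confidence = highest class probability
    )
    return [record_id for record_id, _ in ranked[:budget]]


# Hypothetical model output for three reseller records.
predictions = {
    "r1": {"A100": 0.98, "A200": 0.02},
    "r2": {"A100": 0.55, "A200": 0.45},
    "r3": {"A100": 0.70, "A200": 0.30},
}
to_annotate = least_confident(predictions, budget=2)
```

Here `to_annotate` contains `r2` and `r3`, the two records the model is least certain about. Once an expert has labeled them, they would be added to the training data and the model retrained.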

In Phase 3, the focus is on operationalizing and productizing the solution. During the strategic review at the end of Phase 2, the team decides on the industrialization and productization of the AI solution within the preferred technology stack. Initial productization includes a basic lightweight user interface and essential software components to monitor, validate, and operate the data ingestion and automated classification. Further productization, in subsequent phases and after extensive user validation, includes additional components and features to allow for automated insights in sales data, for monitoring and managing the AI component’s performance by the business users, and to define and configure new tasks within the Matching & Classification AutoML Platform by the client’s users.


Impressive Impact

+85% automation (at +97% accuracy) of the data matching and data consolidation process, with domain experts only being involved for human validation or additional annotation where the machine learning models aren't yet confident enough. Within the platform, the client can set the human validation thresholds for different product data or customer data fields and types. Our solution significantly reduces the amount of manual work required and improves the speed at which newly processed data is made available.
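The configurable validation thresholds can be pictured as a simple routing rule: a prediction is accepted automatically when the model's confidence reaches the threshold set for that field type, and sent to a domain expert otherwise. The following is a minimal sketch under assumed values; the field types, threshold numbers, and function names are hypothetical and not taken from the platform.

```python
# Hypothetical per-field-type thresholds, as a client might configure them.
THRESHOLDS = {"product": 0.90, "customer": 0.95}
DEFAULT_THRESHOLD = 0.99  # assumed conservative fallback for unconfigured field types


def route(field_type, confidence):
    """Return 'auto_accept' or 'human_review' for one model prediction."""
    threshold = THRESHOLDS.get(field_type, DEFAULT_THRESHOLD)
    return "auto_accept" if confidence >= threshold else "human_review"


def automation_rate(decisions):
    """Share of decisions that did not need a human, as a fraction between 0 and 1."""
    if not decisions:
        return 0.0
    return sum(d == "auto_accept" for d in decisions) / len(decisions)
```

With these assumed thresholds, a prediction with confidence 0.93 would be accepted automatically for a product field but routed to review for a customer field. Raising a threshold trades automation rate for fewer unchecked errors.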

