Tony Paul

Improving the quality of data in the Product Information Management system using Datahut APIs

Updated: Sep 12


eCommerce inventory management, or eCommerce catalog management, is not as simple as it seems. In theory, you only have to maintain a single catalog of all your eCommerce products. In practice, it is a nightmare for retailers, and it costs them a fortune.


In industries where product dimensions are a major deciding factor, the retailer has to make sure that the attributes have correct values. Sometimes even a millimeter of difference makes the product useless to a customer.


To understand why a clean PIM system matters, start with the primary purpose of a PIM system. A PIM process has to deal with the following:


  1. A supplier usually sells to multiple retailers, and each retailer requires different content and formats. The supplier, however, holds the product data in a single format.

  2. When the supplier updates a piece of product information, you need a way to track the change and apply it on your side.

  3. Supplier data sometimes needs to be enriched with extra attributes and information, and reformatted, before it goes on the website.

  4. Retailers often have large teams who spend hours updating and publishing product data. If the team members are not all on the same page, the data can get messy, and that hurts both you and your customers.

  5. The supplier may send you incorrect information; you need a way to find it, clean it, and enrich it.


Missing product data attributes in PIM software are a common issue among retailers. Manually checking and enriching the data is inefficient and does not scale. PIM or MDM systems are powerless if they lack quality product data.


The Customer


The customer is a retailer whose PIM systems were managed by a large consulting company. The consulting firm approached us to get the PIM system up and running. The customer wanted to implement a solution that would enable them to manage their data effectively and efficiently. They also wanted to improve the quality of their product information and make it available across different channels.


Challenges faced by the customer


The firm had a robust, proprietary PIM system that it used for product development and management, including inventory data. But as the firm grew, it found that the data from its suppliers wasn't always reliable. It didn't have the resources to manually check each supplier's data against its own systems, so it knew there were inconsistencies but couldn't fix them.


The data was mostly populated through CSV uploads from suppliers. However, the firm faced the following problems:

  1. Many data attributes were missing from the PIM software. The marketing department suffered the most from this.

  2. The marketing department wanted additional attributes that the supplier did not provide in the CSV file.

  3. A large number of products were being returned: The firm witnessed a staggering number of product returns. The main reason was the mismatch between the product specifications on the website and the product delivered to the customer.

  4. The content team worked overtime to fix issues: As the firm onboarded new vendors, the errors accumulated, forcing the content team to work extra hours to fix the discrepancies. There were around 1700 errors from the suppliers in one month, and finding and fixing them was becoming a cumbersome task for the content team.

  5. Delay in new launches: New product launches were getting postponed because of the existing content problem.

  6. An onslaught of negative reviews: The site received numerous negative reviews as customers grew frustrated with data discrepancies that led to wrong purchases.
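
The first two problems, missing and unsupplied attributes in supplier CSV uploads, can be detected automatically before the data reaches the PIM software. The following is a minimal sketch, not the customer's actual pipeline: the attribute names in `REQUIRED_ATTRIBUTES`, the sample rows, and the function name `find_missing_attributes` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical attribute names; a real catalog defines its own required set.
REQUIRED_ATTRIBUTES = ["upc", "title", "width_mm", "height_mm", "material"]


def find_missing_attributes(csv_text, required=REQUIRED_ATTRIBUTES):
    """Map each incomplete row to the list of attributes it is missing.

    Rows are identified by their UPC, or by file line number if the UPC
    itself is missing. A column absent from the file counts as missing
    for every row.
    """
    report = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        missing = [
            name for name in required
            if not (row.get(name) or "").strip()
        ]
        if missing:
            key = (row.get("upc") or "").strip() or f"line {line_no}"
            report[key] = missing
    return report


# Made-up sample upload: no "material" column, plus two blank cells.
SAMPLE = """upc,title,width_mm,height_mm
012345678905,Steel bracket,40,
036000291452,,25,60
"""

if __name__ == "__main__":
    for product, missing in find_missing_attributes(SAMPLE).items():
        print(product, "is missing:", ", ".join(missing))
```

A report like this gives the content team a worklist per supplier upload instead of leaving them to discover gaps one product page at a time.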


The Solution

To address these issues, we worked with our client to identify areas where we could improve their PIM system. We then designed and built an API that allowed us to pull data from multiple sources into one location so that our client could access an accurate picture of what was happening with their products at any given time.


Datahut used its proprietary eCommerce APIs and web crawlers to extract data from the PIM software based on product URL, UPC, or ASIN. Then we used our enterprise-grade data enrichment platform to enrich this information so that it was complete and accurate. The APIs run 24/7: whenever a new product is added to the PIM software, the missing data is added automatically.


This allowed them to integrate their new PIM system with other systems in place at their company so that they could generate reports from different perspectives and make better decisions about new product launches or changes made within existing products over time.
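
The enrichment step described above, where missing data is added automatically for each product, can be sketched as a fill-only merge keyed on UPC. This is an illustration under stated assumptions rather than Datahut's implementation: the record layout, the `extracted_by_upc` dictionary standing in for the extraction results, and the function names are hypothetical. The sketch deliberately never overwrites a value the PIM already holds.

```python
def is_blank(value):
    """Treat None and empty or whitespace-only values as missing."""
    return value is None or str(value).strip() == ""


def enrich_record(pim_record, extracted):
    """Return (enriched copy, names of filled attributes).

    Only blank attributes are filled; existing PIM values are kept even
    when the extracted data disagrees with them.
    """
    enriched = dict(pim_record)
    filled = []
    for name, value in extracted.items():
        if is_blank(enriched.get(name)) and not is_blank(value):
            enriched[name] = value
            filled.append(name)
    return enriched, filled


def enrich_catalog(pim_records, extracted_by_upc):
    """Enrich every record that has extracted data under the same UPC."""
    results = []
    for record in pim_records:
        extracted = extracted_by_upc.get(record.get("upc"), {})
        results.append(enrich_record(record, extracted))
    return results


if __name__ == "__main__":
    # Made-up data: one PIM record with a blank height and no material.
    pim_records = [
        {"upc": "012345678905", "title": "Steel bracket",
         "width_mm": "40", "height_mm": ""},
    ]
    extracted_by_upc = {
        "012345678905": {"title": "Bracket, steel",
                         "height_mm": "60", "material": "steel"},
    }
    for enriched, filled in enrich_catalog(pim_records, extracted_by_upc):
        print(enriched["upc"], "filled:", filled)
```

Returning the list of filled attributes alongside each record keeps the change auditable, so a reviewer can see exactly which values came from enrichment rather than from the supplier.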


The Impact

Within 30 days of implementing our solution, the consulting firm was able to see significant results:

  1. Updated product information led to better sales and resulted in fewer returns. Our client saw a significant drop in the number of returned products. In industries like fashion, where returns are more than 20%, data errors could be catastrophic.

  2. The content team was able to reduce their manual effort: It is estimated that 25 minutes are spent on manual data synchronization per SKU per year. With PIM software coupled with a web scraping solution, you can bring that under five minutes per SKU per year. That’s a lot of hours saved.

  3. We improved the accuracy of product data: Reliable product information led to increased sales and fewer headaches.

  4. The customer saw a significant rise in SEO rankings after the listings were enriched with additional attributes.
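
To put the per-SKU estimate from item 2 into hours, the short calculation below multiplies the saving per SKU by the catalog size. The 25-minute and five-minute figures come from the text above; the catalog size of 10,000 SKUs is a hypothetical example, not the customer's actual number. Because five minutes is an upper bound on the new effort, the result is a lower bound on the saving.

```python
def hours_saved_per_year(sku_count, minutes_before=25, minutes_after=5):
    """Yearly hours saved when per-SKU sync effort drops from
    minutes_before to minutes_after."""
    return sku_count * (minutes_before - minutes_after) / 60


if __name__ == "__main__":
    sku_count = 10_000  # hypothetical catalog size
    saved = hours_saved_per_year(sku_count)
    print(f"{sku_count} SKUs: at least {saved:.0f} hours saved per year")
```

For the hypothetical 10,000-SKU catalog this works out to roughly 3,333 hours per year, at 20 minutes saved per SKU.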


Future Plans


The customer wants to onboard new vendors and scale the usage of Datahut APIs. They also want to explore more ways to use the data and APIs.


Contact DATAHUT to help you solve your PIM problems.


If you are a retailer suffering from bad product data despite having a dedicated PIM system in place, contact Datahut for fully-managed data extraction services to fight retail data errors. Get in touch; we should talk.

