Retail-786k: a Large-Scale Dataset for Visual Entity Matching

Published in Data-centric Machine Learning Research (DMLR) Workshop, ICLR 2024, Vienna, 2024

We introduce the first publicly available large-scale dataset for “visual entity matching”, based on a production-level use case in the retail domain. Using scanned advertisement leaflets collected over several years from different European retailers, we provide a total of ~786k manually annotated, high-resolution product images covering ~18k individual retail products, which are grouped into ~3k entities.

Download here

Fine-Grained Product Classification on Leaflet Advertisements

Published in FGVC10: 10th Workshop on Fine-grained Visual Categorization, CVPR 2023, Vancouver, 2023

In this paper, we describe a first publicly available fine-grained product recognition dataset based on leaflet images. We provide a total of 41.6k manually annotated product images in 832 classes. Further, we investigate three different approaches for this fine-grained product classification task.

Download here