Retail-786k: a Large-Scale Dataset for Visual Entity Matching
Published in Data-centric Machine Learning Research (DMLR) Workshop, ICLR 2024, Vienna, 2024
We introduce the first publicly available large-scale dataset for “visual entity matching”, based on a production-level use case in the retail domain. Using scanned advertisement leaflets collected over several years from different European retailers, we provide a total of ~786k manually annotated, high-resolution product images containing ~18k different individual retail products, which are grouped into ~3k entities.
Download here