Sitemap

A list of all the posts and pages found on the site. For you robots out there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

Retail-786k

Published:

We introduce the first publicly available large-scale dataset for “visual entity matching”, based on a production level use case in the retail domain. Using scanned advertisement leaflets, collected over several years from different European retailers, we provide a total of ~786k manually annotated, high resolution product images containing ~18k different individual retail products which are grouped into ~3k entities.

Download here

publications

Fine-Grained Product Classification on Leaflet Advertisements

Published in FGVC10: 10th Workshop on Fine-grained Visual Categorization, CVPR 2023, Vancouver, 2023

In this paper, we describe a first publicly available fine-grained product recognition dataset based on leaflet images. We provide a total of 41.6k manually annotated product images in 832 classes. Further, we investigate three different approaches for this fine-grained product classification task.

Download here

Retail-786k: a Large-Scale Dataset for Visual Entity Matching

Published in Data-centric Machine Learning Research (DMLR) Workshop, ICLR 2024, Vienna, 2024

We introduce the first publicly available large-scale dataset for “visual entity matching”, based on a production level use case in the retail domain. Using scanned advertisement leaflets, collected over several years from different European retailers, we provide a total of ~786k manually annotated, high resolution product images containing ~18k different individual retail products which are grouped into ~3k entities.

Download here

Can Visual Language Models Replace OCR-Based Visual Question Answering Pipelines in Production? A Case Study in Retail.

Published in Emergent Visual Abilities and Limits of Foundation Models Workshop, ECCV 2024, Milan, 2024

Most production-level deployments for Visual Question Answering (VQA) tasks are still build as processing pipelines of independent steps. However, the recent advances in vision Foundation Models [25] and Vision Language Models (VLMs) [23] raise the question if these custom trained, multi-step approaches can be replaced with pre-trained, single-step VLMs. This paper analyzes the performance and limits of various VLMs in the context of VQA and OCR [5, 9, 12] tasks in a production-level scenario. In conclusion, the VQA task which aims to predict specific product information from images being satisfying but performs less fulfilling in identifying specific features, possibly due to a lack of domain-specific knowledge.

Download here

talks

teaching