Browsing by Author "Maciej Szankin"

Privacy-Preserving Audience Analytics: Lightweight Thermal Face Recognition for Real-Time Marketing Intelligence at the Edge
(2025) Maciej Szankin; Jacek Rumiński
Modern retail analytics demand real-time audience measurement, yet privacy regulations and consumer concerns limit the use of traditional RGB cameras for demographic analysis. We present LTAS (Lightweight Thermal Architecture Search), a privacy-preserving framework that optimizes pre-trained models for edge-deployed thermal face recognition with minimal adaptation, requiring only single-batch fine-tuning on 64 images. Unlike expensive super-network approaches, LTAS leverages the constrained visual diversity of thermal imagery to achieve rapid optimization. Evaluating 500 architectural variants across three thermal face datasets reveals that network depth reduction is the primary efficiency driver, achieving up to 48% parameter reduction while maintaining 82% of baseline accuracy. Depth optimization alone delivers 35-45% parameter reduction without accuracy loss, while kernel size modifications provide limited benefits. This enables real-time, privacy-compliant audience analytics on resource-constrained retail devices, making thermal-based marketing measurement both practical and scalable.
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis
(2025) Maciej Szankin; Vidhyananth Venkatasamy; Lihang Ying
Outdoor advertisements remain a critical medium for modern marketing, yet accurately verifying billboard text visibility under real-world conditions is still challenging. Traditional Optical Character Recognition (OCR) pipelines excel at cropped text recognition but often struggle with complex outdoor scenes, varying fonts, and weather-induced visual noise. Recently, multimodal Vision-Language Models (VLMs) have emerged as promising alternatives, offering end-to-end scene understanding with no explicit detection step. This work systematically benchmarks representative VLMs (including Qwen 2.5 VL 3B, InternVL3, and SmolVLM2) against a compact CNN-based OCR baseline (PaddleOCRv4) across two public datasets (ICDAR 2015 and SVT), augmented with synthetic weather distortions to simulate realistic degradation. Our results reveal that while selected VLMs excel at holistic scene reasoning, lightweight CNN pipelines still achieve competitive accuracy for cropped text at a fraction of the computational cost, an important consideration for edge deployment. To foster future research, we release our weather-augmented benchmark and evaluation code publicly: https://github.com/macsz/ocr-cnn-vlm/.
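As a rough illustration of how recognition accuracy on cropped text might be scored when comparing an OCR pipeline with a VLM, the sketch below computes a character error rate and a case-insensitive word accuracy. The abstract does not state which metrics the authors use, so the metric choice and the function names (`edit_distance`, `character_error_rate`, `word_accuracy`) are assumptions made for illustration and are not taken from the released evaluation code.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(predictions, references):
    """Total edit distance divided by total reference length (assumed metric)."""
    total_edits = sum(edit_distance(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


def word_accuracy(predictions, references):
    """Fraction of predictions equal to the reference, ignoring case and padding."""
    if not references:
        return 0.0
    hits = sum(p.strip().lower() == r.strip().lower()
               for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    # Hypothetical model outputs and ground-truth labels for two cropped words.
    preds = ["SALE", "0PEN"]
    refs = ["SALE", "OPEN"]
    print(character_error_rate(preds, refs))
    print(word_accuracy(preds, refs))
```

Running the same scoring functions on clean and weather-distorted copies of each image would give one simple way to quantify how much a given distortion degrades a model, under the assumptions stated above.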