Improved Industrial Spill Detection With Synthetic Data
Aaditya Baranwal¹†, Abdul Mueez¹, Jason Voelker², Guneet Bhatia², Shruti Vyas¹
¹University of Central Florida • ²Siemens Energy
Large-scale Vision-Language Models (VLMs) have transformed general-purpose visual recognition through strong zero-shot capabilities. However, their performance degrades significantly in niche, safety-critical domains such as industrial spill detection, where hazardous events are rare, sensitive, and difficult to annotate.
This scarcity—driven by privacy concerns, data sensitivity, and the infrequency of real incidents—renders conventional fine-tuning of detectors infeasible for most industrial settings.
We address this challenge by introducing a scalable framework centered on a high-quality synthetic data generation pipeline. We demonstrate that this synthetic corpus enables effective Parameter-Efficient Fine-Tuning (PEFT) of VLMs and substantially boosts the performance of state-of-the-art object detectors such as YOLO and DETR.
Notably, without our synthetic dataset (SynSpill), VLMs still generalize better to unseen spill scenarios than these detectors. With SynSpill, both VLMs and detectors improve markedly, and their performance becomes comparable.
Our results underscore that high-fidelity synthetic data is a powerful means to bridge the domain gap in safety-critical applications. The combination of synthetic generation and lightweight adaptation offers a cost-effective, scalable pathway for deploying vision systems in industrial environments where real data is scarce or impractical to obtain.
Breakthrough innovations in synthetic data generation and model adaptation for industrial safety applications
First comprehensive framework for automated industrial spill detection using computer vision
Novel AnomalInfusion technique using Stable Diffusion XL + IP adapters for realistic spill generation
Parameter-efficient fine-tuning with LoRA for domain specialization without full model retraining
Benefits both Vision-Language Models and traditional object detectors (YOLO, DETR)
VLMs generalize better than detectors to unseen spill scenarios, even without synthetic-data training
Tested and validated on actual industrial CCTV footage from manufacturing facilities
End-to-end pipeline for industrial spill detection using synthetic data and Vision-Language Models
Base image generation with controlled prompts
Style and content conditioning for realism
Precise spill placement and anomaly insertion
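The placement stage can be pictured as choosing a region of the base image where the anomaly will be inserted. The sketch below is only an illustration under stated assumptions: it draws a random elliptical mask inside a floor region, and the function name `place_spill_mask`, the `floor_top` parameter, and the size ranges are all hypothetical rather than taken from the SynSpill pipeline.

```python
import random


def place_spill_mask(width, height, floor_top, rng, min_frac=0.05, max_frac=0.2):
    """Return (mask, box) for a random ellipse inside the floor region.

    mask is a list of rows of 0/1 values; box is (x0, y0, x1, y1), inclusive.
    The floor region is assumed to be the rows floor_top .. height - 1.
    """
    # Ellipse radii as fractions of the image size (hypothetical ranges).
    rx = max(1, int(width * rng.uniform(min_frac, max_frac)))
    ry = max(1, int(height * rng.uniform(min_frac, max_frac) / 2))
    if rx > width - 1 - rx or floor_top + ry > height - 1 - ry:
        raise ValueError("floor region is too small for the sampled ellipse")
    # Sample a centre so the whole ellipse stays inside the floor region.
    cx = rng.randint(rx, width - 1 - rx)
    cy = rng.randint(floor_top + ry, height - 1 - ry)
    mask = [
        [1 if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0 else 0
         for x in range(width)]
        for y in range(height)
    ]
    return mask, (cx - rx, cy - ry, cx + rx, cy + ry)


if __name__ == "__main__":
    spill_mask, spill_box = place_spill_mask(96, 64, 32, random.Random(0))
    print("spill box:", spill_box)
```

In a diffusion-based setup, a mask like this would typically be handed to the image generator to mark where the spill should appear, and the same mask can be reused later as ground truth.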
Comprehensive evaluation of different approaches to industrial spill detection
Zero-shot VLMs: baseline performance without any adaptation
VLMs + LoRA: parameter-efficient fine-tuning with synthetic data
YOLO/DETR baseline without synthetic data
Traditional detectors trained with synthetic data
Create synthetic spill images using diffusion models
Automatically label synthetic data with ground truth
Fine-tune models using LoRA on synthetic dataset
Deploy adapted model for real-world detection
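The automatic labeling step works because the generator already knows where each spill was placed. As a minimal sketch, assuming the ground truth is available as a binary mask and that labels are written in the common YOLO text format (class id, then normalized center x, center y, width, height), the conversion could look like this. The helper name `mask_to_yolo_label` and the single class id 0 are assumptions for illustration.

```python
def mask_to_yolo_label(mask, class_id=0):
    """Convert a binary mask (list of rows of 0/1) into one YOLO label line.

    Returns None when the mask contains no spill pixels.
    """
    height, width = len(mask), len(mask[0])
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    # Pixel-edge coordinates of the tight bounding box (right/bottom exclusive).
    x0, x1 = min(xs), max(xs) + 1
    y0, y1 = min(ys), max(ys) + 1
    # YOLO stores the box center and size, normalized by the image dimensions.
    cx = (x0 + x1) / 2 / width
    cy = (y0 + y1) / 2 / height
    w = (x1 - x0) / width
    h = (y1 - y0) / height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 4x8 image with a spill covering rows 1-2 and columns 2-5.
    demo = [[1 if 1 <= y <= 2 and 2 <= x <= 5 else 0 for x in range(8)]
            for y in range(4)]
    print(mask_to_yolo_label(demo))  # 0 0.500000 0.500000 0.500000 0.500000
```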
Our synthetic data generation pipeline bridges the gap between scarce real-world data and the need for robust industrial spill detection, enabling effective model adaptation with minimal computational overhead.
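For intuition on why LoRA keeps the adaptation overhead small, recall its standard formulation: the pretrained weight matrix stays frozen and only a low-rank update is trained. The dimensions and rank in the worked example (d = k = 4096, r = 16) are illustrative assumptions, not values reported for SynSpill.

```latex
W' = W_0 + \frac{\alpha}{r} B A, \qquad
B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)

\frac{\text{trainable parameters}}{\text{frozen parameters}}
  = \frac{r\,(d + k)}{d\,k}
\quad\Rightarrow\quad
\frac{16\,(4096 + 4096)}{4096 \cdot 4096}
  = \frac{131072}{16777216} \approx 0.78\%
```

Under these assumed sizes, each adapted layer trains under 1% as many parameters as the frozen matrix it modifies.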
Comprehensive evaluation demonstrates significant improvements with synthetic data
84%: best mAP@50, achieved with Qwen-VL 32B + LoRA
High-quality samples generated via pipeline
Parameter-efficient fine-tuning approach
Outperforms fine-tuned detectors
Method | Public Dataset (mAP@50) | Proprietary Dataset (mAP@50) |
---|---|---|
Qwen-VL 7B (Zero-Shot) | 35% | 15% |
Qwen-VL 32B (Zero-Shot) | 42% | 24% |
YOLOv11 (Fine-tuned) | 81% | 64% |
RF-DETR (Fine-tuned) | 83% | 67% |
Qwen-VL 7B + LoRA (V+L) | 78% | 66% |
Qwen-VL 32B + LoRA (V+L) (Best) | 84% | 71% |
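The scores above are mAP@50 values, meaning a predicted box counts as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. The sketch below shows that idea for a single class on a single image, using greedy matching and all-point interpolation of the precision-recall curve. It is a simplified illustration, not the evaluation code behind the table; real evaluations pool detections across the whole dataset, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision_at_50(predictions, ground_truth):
    """AP at IoU >= 0.5 for one class.

    predictions: list of (score, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched, hits = set(), []
    # Visit predictions from most to least confident.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        hit = best_idx is not None and best_iou >= 0.5
        if hit:
            matched.add(best_idx)
        hits.append(hit)
    # Build (recall, precision) points along the ranked list.
    points, true_positives = [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        points.append((true_positives / len(ground_truth), true_positives / rank))
    # Area under the precision-recall curve with all-point interpolation.
    ap, prev_recall = 0.0, 0.0
    for i, (recall, _) in enumerate(points):
        best_precision = max(p for _, p in points[i:])
        ap += (recall - prev_recall) * best_precision
        prev_recall = recall
    return ap


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 28))]
    print(round(average_precision_at_50(preds, gt), 4))  # 0.8333
```

With a single "spill" class, mAP@50 reduces to this AP value; with several classes it would be the mean of the per-class APs.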
PEFT VLMs achieve the best performance - Qwen-VL 32B + LoRA (V+L) outperforms all baselines, with 84% mAP@50 on the public dataset and 71% on the proprietary dataset
Synthetic data enables effective adaptation - 2,000 synthetic images bridge the domain gap for industrial spill detection
Joint vision-language adaptation optimal - LoRA (V+L) provides the best performance across both datasets
First scalable solution for industrial spill detection using synthetic data
Enables deployment in data-scarce industrial environments
Provides cost-effective alternative to manual monitoring
Collaborative effort between academic researchers and industry experts to advance Smart Sensing
This work represents a successful collaboration between academic research and industrial application, combining cutting-edge computer vision research with real-world safety requirements in industrial environments. Our partnership ensures that research innovations translate directly into practical solutions for industrial safety.
If you use our work, please cite our paper
@inproceedings{baranwal2025synspill,
  title={SynSpill: Improved Industrial Spill Detection With Synthetic Data},
  author={Baranwal, Aaditya and Mueez, Abdul and Voelker, Jason and Bhatia, Guneet and Vyas, Shruti},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision - Workshops (ICCV-W)},
  year={2025}
}