SynSpill

Improved Industrial Spill Detection With Synthetic Data

Aaditya Baranwal¹, Abdul Mueez¹, Jason Voelker², Guneet Bhatia², Shruti Vyas¹

¹University of Central Florida • ²Siemens Energy

Abstract

Large-scale Vision-Language Models (VLMs) have transformed general-purpose visual recognition through strong zero-shot capabilities. However, their performance degrades significantly in niche, safety-critical domains such as industrial spill detection, where hazardous events are rare, sensitive, and difficult to annotate.

This scarcity—driven by privacy concerns, data sensitivity, and the infrequency of real incidents—renders conventional fine-tuning of detectors infeasible for most industrial settings.

We address this challenge by introducing a scalable framework centered on a high-quality synthetic data generation pipeline. We demonstrate that this synthetic corpus enables effective Parameter-Efficient Fine-Tuning (PEFT) of VLMs and substantially boosts the performance of state-of-the-art object detectors such as YOLO and DETR.

Notably, without synthetic data (the SynSpill dataset), VLMs still generalize better to unseen spill scenarios than these detectors. With SynSpill, both VLMs and detectors improve markedly, and their performance becomes comparable.

Our results underscore that high-fidelity synthetic data is a powerful means to bridge the domain gap in safety-critical applications. The combination of synthetic generation and lightweight adaptation offers a cost-effective, scalable pathway for deploying vision systems in industrial environments where real data is scarce or impractical to obtain.

Key Highlights

Breakthrough innovations in synthetic data generation and model adaptation for industrial safety applications

🏭

Industrial Safety Focus

First comprehensive framework for automated industrial spill detection using computer vision

🎨

Synthetic Data Pipeline

Novel AnomalInfusion technique using Stable Diffusion XL + IP adapters for realistic spill generation

🧠

VLM Adaptation

Parameter-efficient fine-tuning with LoRA for domain specialization without full model retraining

📊

Dual Approach

Benefits both Vision-Language Models and traditional object detectors (YOLO, DETR)

Zero-Shot Capability

Strong generalization to unseen spill scenarios even without synthetic data training

🎯

Real-World Validation

Tested and validated on actual industrial CCTV footage from manufacturing facilities

🚀 Revolutionizing Industrial Safety with AI

System Architecture

End-to-end pipeline for industrial spill detection using synthetic data and Vision-Language Models

📹

Input Stage

CCTV Feed
Text Prompt
🧠

VLM Processing

PEFT Training
LoRA Adaptation
Confidence Scoring
🚨

Alert System

Detection Results
Alert Trigger
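The three stages above (input, VLM processing, alert) can be sketched as a small decision function. This is only an illustrative sketch, not the deployed system: the `Detection` type, the `Detector` signature, the `fake_detector` stand-in, and the 0.5 confidence threshold are all assumptions introduced here, since the page does not specify how confidence scores are turned into alerts.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    label: str
    confidence: float  # model score in [0, 1]


# Hypothetical detector signature: (frame, text prompt) -> detections.
Detector = Callable[[object, str], List[Detection]]


def should_alert(detections: List[Detection], threshold: float = 0.5) -> bool:
    """Return True if any spill detection meets the confidence threshold."""
    return any(
        d.label == "spill" and d.confidence >= threshold for d in detections
    )


def process_frame(
    frame: object, prompt: str, detector: Detector, threshold: float = 0.5
) -> dict:
    """Run the detector on one CCTV frame and decide whether to alert."""
    detections = detector(frame, prompt)
    return {
        "detections": detections,
        "alert": should_alert(detections, threshold),
    }


def fake_detector(frame: object, prompt: str) -> List[Detection]:
    """Stand-in for the adapted VLM; always reports one confident spill."""
    return [Detection("spill", 0.82)]
```

In practice the threshold would be tuned on held-out footage to balance missed spills against false alarms.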

AnomalInfusion Pipeline

1

Stable Diffusion XL

Base image generation with controlled prompts

2

IP Adapters

Style and content conditioning for realism

3

Inpainting

Precise spill placement and anomaly insertion
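To make the inpainting step more concrete, the sketch below builds a binary mask that marks where a spill should be painted into a base image. It is a minimal illustration under stated assumptions: the elliptical shape, the "floor band" heuristic, and the names `ellipse_mask` and `random_floor_spill_mask` are hypothetical and are not taken from the AnomalInfusion pipeline. A mask of this kind is what an inpainting model (for example, an inpainting pipeline in Hugging Face diffusers) typically consumes to decide which pixels to repaint.

```python
import random
from typing import List, Tuple

Mask = List[List[int]]  # 1 = pixel to repaint, 0 = pixel to keep


def ellipse_mask(
    width: int, height: int, center: Tuple[int, int], radii: Tuple[int, int]
) -> Mask:
    """Rasterize a filled axis-aligned ellipse into a binary mask."""
    cx, cy = center
    rx, ry = radii
    return [
        [
            1 if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0 else 0
            for x in range(width)
        ]
        for y in range(height)
    ]


def random_floor_spill_mask(
    width: int, height: int, floor_top: int, rng: random.Random
) -> Tuple[Mask, Tuple[int, int], Tuple[int, int]]:
    """Sample an ellipse that lies fully inside rows [floor_top, height)."""
    band = height - floor_top
    rx = rng.randint(max(1, width // 16), max(1, width // 6))
    ry = rng.randint(max(1, band // 8), max(1, band // 3))
    cx = rng.randint(rx, width - 1 - rx)
    cy = rng.randint(floor_top + ry, height - 1 - ry)
    return ellipse_mask(width, height, (cx, cy), (rx, ry)), (cx, cy), (rx, ry)
```

Restricting the ellipse to a floor band is one simple way to keep spills in plausible locations; a real pipeline might use a segmentation map of the scene instead.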

Data Flow

Real Images: Scarce
Synthetic Data: Abundant
Trained Model: Ready

Methodology Comparison

Comprehensive evaluation of different approaches to industrial spill detection

mAP: 42.1%

Zero-Shot VLM

Baseline performance without any adaptation

  • No training required
  • General capabilities
  • Limited domain knowledge
mAP: 63.4%

PEFT + SynSpill

Parameter-efficient fine-tuning with synthetic data

  • LoRA adaptation
  • Synthetic data training
  • Domain specialization
mAP: 39.8%

Traditional Detectors

YOLO/DETR baseline without synthetic data

  • Object detection focus
  • Limited generalization
  • Real-world challenges
mAP: 61.2%

Detectors + SynSpill

Traditional detectors trained with synthetic data

  • Improved accuracy
  • Better robustness
  • Enhanced performance

PEFT Training Process

🎨

Generate

Create synthetic spill images using diffusion models

🏷️

Annotate

Automatically label synthetic data with ground truth

🔧

Adapt

Fine-tune models using LoRA on synthetic dataset

Deploy

Deploy adapted model for real-world detection
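One way the Annotate step can be automatic is that the region where a spill was inserted is already known at generation time. The sketch below derives a bounding box from such a placement mask and writes it in the normalized `class x_center y_center width height` text format used by YOLO-style datasets. This is an assumption about how labels could be produced, not a description of the authors' annotation code; `mask_to_bbox` and `to_yolo_label` are hypothetical helper names.

```python
from typing import List, Optional, Tuple

BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def mask_to_bbox(mask: List[List[int]]) -> Optional[BBox]:
    """Return the tight box around all nonzero mask pixels, or None if empty."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return (min(xs), ys[0], max(xs), ys[-1])


def to_yolo_label(bbox: BBox, width: int, height: int, class_id: int = 0) -> str:
    """Format a pixel box as a normalized YOLO-style label line."""
    x_min, y_min, x_max, y_max = bbox
    box_w = x_max - x_min + 1  # inclusive pixel coordinates
    box_h = y_max - y_min + 1
    x_center = (x_min + box_w / 2) / width
    y_center = (y_min + box_h / 2) / height
    return (
        f"{class_id} {x_center:.6f} {y_center:.6f} "
        f"{box_w / width:.6f} {box_h / height:.6f}"
    )
```

For example, a 2×2 spill in the middle of a 4×4 mask yields the box `(1, 1, 2, 2)` and the label `0 0.500000 0.500000 0.500000 0.500000`.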

💡 Key Innovation

Our synthetic data generation pipeline bridges the gap between scarce real-world data and the need for robust industrial spill detection, enabling effective model adaptation with minimal computational overhead.
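To give a rough sense of the "minimal computational overhead" of LoRA: for a frozen weight matrix W of shape d_out × d_in, LoRA trains only two low-rank factors, B (d_out × r) and A (r × d_in), and uses W + (alpha / r)·B·A at inference. The sketch below counts the trainable parameters for one such matrix. The layer size and rank in the example are illustrative assumptions, not values reported for SynSpill.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_in + d_out)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the original dense weight matrix."""
    return d_out * d_in


def trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Share of the layer's parameters that LoRA actually updates."""
    return lora_trainable_params(d_out, d_in, rank) / full_params(d_out, d_in)


# Illustrative only: a 4096 x 4096 projection with rank 16.
example_fraction = trainable_fraction(4096, 4096, 16)
```

For this hypothetical layer, LoRA trains 131,072 parameters against 16,777,216 in the dense matrix, which is under 1% of the layer.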

Experimental Results

Comprehensive evaluation demonstrates significant improvements with synthetic data

🎯
84%

Detection Accuracy

Best mAP@50 with Qwen-VL 32B + LoRA

📊
2,000

Synthetic Images

High-quality samples generated via pipeline

🔧
PEFT

Model Adaptation

Parameter-efficient fine-tuning approach

VLM

State-of-the-Art

Outperforms fine-tuned detectors

Quantitative Comparison

Method                            Public Dataset   Proprietary Dataset
Qwen-VL 7B (Zero-Shot)            35%              15%
Qwen-VL 32B (Zero-Shot)           42%              24%
YOLOv11 (Fine-tuned)              81%              64%
RF-DETR (Fine-tuned)              83%              67%
Qwen-VL 7B + LoRA (V+L)           78%              66%
Qwen-VL 32B + LoRA (V+L) [Best]   84%              71%
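For readers unfamiliar with the mAP@50 metric cited on this page, the sketch below shows the core idea: a predicted box counts as a true positive when its intersection-over-union (IoU) with an unmatched ground-truth box is at least 0.5. This is a simplified illustration with greedy matching and non-interpolated average precision for a single class; benchmark toolkits use more elaborate conventions, and this is not the evaluation code behind the table.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_at_iou(
    preds: List[Tuple[Box, float]], gts: List[Box], threshold: float = 0.5
) -> List[bool]:
    """Greedily match predictions (highest score first) to unused ground truths.

    Returns one true-positive flag per prediction, in ranked order.
    """
    used = set()
    flags = []
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, threshold
        for j, gt in enumerate(gts):
            if j in used:
                continue
            value = iou(box, gt)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            used.add(best_j)
        flags.append(best_j >= 0)
    return flags


def average_precision(flags: List[bool], num_gt: int) -> float:
    """Non-interpolated AP: sum of precision at each true positive / num_gt."""
    true_positives, total = 0, 0.0
    for rank, is_tp in enumerate(flags, start=1):
        if is_tp:
            true_positives += 1
            total += true_positives / rank
    return total / num_gt if num_gt else 0.0
```

Averaging this per-class value over all classes gives a mean average precision; with a single "spill" class the two coincide.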

🔍 Key Findings

PEFT VLMs achieve state-of-the-art performance: Qwen-VL 32B + LoRA (V+L) outperforms all baselines, reaching 84% mAP@50 on the public dataset

Synthetic data enables effective adaptation: 2,000 synthetic images bridge the domain gap for industrial spill detection

Joint vision-language adaptation is optimal: LoRA (V+L) gives the best performance on both datasets

🚀 Impact

1

First scalable solution for industrial spill detection using synthetic data

2

Enables deployment in data-scarce industrial environments

3

Provides cost-effective alternative to manual monitoring

Research Collaboration

Collaborative effort between academic researchers and industry experts to advance Smart Sensing

🤝 Academic-Industry Collaboration

This work represents a successful collaboration between academic research and industrial application, combining cutting-edge computer vision research with real-world safety requirements in industrial environments. Our partnership ensures that research innovations translate directly into practical solutions for industrial safety.

Citation

If you use our work, please cite our paper

📝 BibTeX Citation

@inproceedings{baranwal2025synspill,
    title={SynSpill: Improved Industrial Spill Detection With Synthetic Data},
    author={Baranwal, Aaditya and Mueez, Abdul and Voelker, Jason and Bhatia, Guneet and Vyas, Shruti},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision - Workshops (ICCV-W)},
    year={2025}
}