Engineering Path for AI Defect Detection in PV Inspection: From Training Set to the Production Line

Almost every PV inspection vendor's pitch deck mentions "AI defect detection." The reality on the shop floor is less romantic: great demos, very few AI systems running stably in production. The bottleneck is not the algorithm: ResNet, EfficientNet, and Transformer architectures all handle image classification just fine. The bottleneck is the engineering path from algorithm to line.

This article distills MVCreate's experience building production AI into SC-EPL, SC-PLEL-PS, SC-MC-W and SC-Seed, and lays out the full path.

1. Why "Just Train a Net" Fails in PV

Versus generic vision tasks (cat/dog classification, face recognition), PV defect detection is structurally different:

1.1 Many defect classes, extreme class imbalance

Common PV defects run to 30+ types: hidden crack, broken finger, dark corner, dark spot, dark edge, concentric ring, snake line, cold solder, over-etch, mismatch, PID, hotspot precursor, moon-crescent mark, dislocation line, fiber anomaly, and more. All need to be recognized.

But the distribution is brutally imbalanced: tens of thousands of hidden-crack examples per year versus hundreds of snake-line examples. Under this long tail, an unweighted classification loss is dominated by the majority classes, so the model learns the common defects well and the rare ones poorly.

1.2 Highly subjective defect boundaries

What counts as a "significant hidden crack"? What is a "mild crack, pass"? Even experienced inspectors agree only ~80% of the time on the same image. This is a natural source of label noise.

1.3 Zero-tolerance on the line

An AI grader that misses a severely cracked module and lets it ship to the downstream customer incurs rework costs on the order of 100× the inspection cost. This zero-tolerance regime puts extreme pressure on recall — especially recall on severe defects.

2. Six Stages of the Engineering Path

Stage 1 — Data collection and labeling

The most expensive, most tedious stage. It also sets the ceiling for everything that follows.

Diversity matters. The training set must span:

Multiple production lines (different cameras, different lighting)
Multiple cell technologies (mono, multi, TOPCon, HJT, perovskite)
Multiple process regimes (normal, edge-of-spec, fault batches)
Multiple time points (fresh vs aged cells)

MVCreate has accumulated 2.5M+ labeled samples across 30+ defect classes over the past five years.

Double-blind labeling with arbitration. Each image is labeled by two independent inspectors; disagreements are resolved by a third-party arbitrator. Final label consistency holds above 95%.
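
The double-blind flow above can be sketched in a few lines. This is a minimal illustration, not MVCreate's labeling tool: the function name `resolve_labels`, the `arbitrate` callback, and the toy labels are all hypothetical. It keeps a label when both inspectors agree, hands disagreements to the arbitrator, and reports the raw inter-inspector agreement rate.

```python
from typing import Callable, List, Tuple


def resolve_labels(
    labels_a: List[str],
    labels_b: List[str],
    arbitrate: Callable[[int, str, str], str],
) -> Tuple[List[str], float]:
    """Merge two independent label passes.

    Returns the final labels and the raw agreement rate between the
    two inspectors (before arbitration).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both inspectors must label the same images")
    if not labels_a:
        raise ValueError("no labels to resolve")

    final: List[str] = []
    agreed = 0
    for idx, (a, b) in enumerate(zip(labels_a, labels_b)):
        if a == b:
            final.append(a)
            agreed += 1
        else:
            # Disagreement: the third-party arbitrator decides.
            final.append(arbitrate(idx, a, b))
    return final, agreed / len(labels_a)


# Toy data (hypothetical): the inspectors disagree on image 2 only.
inspector_a = ["hidden_crack", "ok", "dark_spot", "ok", "broken_finger"]
inspector_b = ["hidden_crack", "ok", "ok", "ok", "broken_finger"]
arbitrator_calls = {2: "dark_spot"}

final_labels, agreement = resolve_labels(
    inspector_a, inspector_b, lambda idx, a, b: arbitrator_calls[idx]
)
```

Tracking the raw agreement rate per inspector pair is also a cheap way to spot defect classes whose definitions need tightening.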

Active learning to cut cost. Not all samples need human labeling. The current model predicts on new data; only the lowest-confidence samples go to human review. Active learning cuts labeling cost to roughly one-tenth of the naive approach.
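
A minimal sketch of that selection step, assuming the model exposes per-class probabilities for each sample. The function name `select_for_review`, the sample ids, and the review budget are illustrative assumptions; the source does not specify how confidence is measured, and top-class probability is used here as the simplest choice.

```python
from typing import Dict, List


def select_for_review(probs_by_id: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the `budget` sample ids with the lowest top-class probability."""
    confidence = {sid: max(probs) for sid, probs in probs_by_id.items()}
    # Sort by confidence, then by id so ties resolve deterministically.
    ranked = sorted(confidence, key=lambda sid: (confidence[sid], sid))
    return ranked[:budget]


# Hypothetical model outputs for four new cells (three classes each).
predictions = {
    "cell_001": [0.98, 0.01, 0.01],
    "cell_002": [0.40, 0.35, 0.25],
    "cell_003": [0.55, 0.44, 0.01],
    "cell_004": [0.90, 0.05, 0.05],
}
to_label = select_for_review(predictions, budget=2)
```

Here `to_label` contains the two ambiguous cells, `cell_002` and `cell_003`; the confident predictions never reach a human.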

Stage 2 — Model selection and training

For fine-grained PV vision, the standard structure is two-stage:

Segmentation — U-Net / DeepLab localizes the defect region
Classification — ResNet / EfficientNet classifies the region

MVCreate's production model uses a customized two-stream architecture:

One stream on the raw EL image
A second stream on the difference image (vs. local background)
Features fuse at a mid-layer

This design measurably improves recognition of "defects only visible against context" — mild dark spots, subtle edge degradation.
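
To make the second stream's input concrete, here is one way a difference image could be computed. The source only says "vs. local background"; using a box-filter mean as the background estimate, the `radius` parameter, and the function names are assumptions for illustration. The pure-Python loops are for readability; a production pipeline would use a vectorized filter.

```python
from typing import List

Image = List[List[float]]


def local_mean(img: Image, radius: int) -> Image:
    """Box-filter mean around each pixel, with the window clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            out[y][x] = sum(window) / len(window)
    return out


def difference_image(img: Image, radius: int = 1) -> Image:
    """Pixel value minus its local background estimate."""
    background = local_mean(img, radius)
    return [
        [img[y][x] - background[y][x] for x in range(len(img[0]))]
        for y in range(len(img))
    ]


# Toy 5x5 EL patch: uniform brightness with one mildly dark pixel.
patch = [[100.0] * 5 for _ in range(5)]
patch[2][2] = 80.0
diff = difference_image(patch, radius=1)
```

In the toy patch the dark pixel stands out as a clearly negative value in `diff`, while uniform regions sit at zero. That is the kind of contrast-against-context signal the second stream is meant to receive.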

Training strategies:

Long-tail distribution handled with Focal Loss + class-balanced sampling
Severe-defect recall tracked as a separate target and held at ≥99.5%
5-fold cross-validation on every training run to avoid line-specific overfit
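
The two long-tail tools named above can be sketched as follows. This is a framework-free illustration under stated assumptions: `gamma = 2.0` is a commonly used focal-loss default, not a value given in the source, and the function names and toy label counts are hypothetical. The focal term `(1 - p_t) ** gamma` shrinks the loss on samples the model already gets right, and the sampling weights give every class the same total probability of being drawn.

```python
import math
from collections import Counter
from typing import List


def focal_loss(probs: List[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t)."""
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def balanced_sample_weights(labels: List[str]) -> List[float]:
    """Per-sample weights such that each class has equal total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[label]) for label in labels]


# Toy long-tail set (hypothetical counts): 8 majority vs. 2 minority samples.
labels = ["hidden_crack"] * 8 + ["snake_line"] * 2
weights = balanced_sample_weights(labels)

easy = focal_loss([0.9, 0.1], target=0)  # confident and correct
hard = focal_loss([0.1, 0.9], target=0)  # confident and wrong
```

The weights can be passed to a weighted sampler such as `random.choices(samples, weights=weights, k=batch_size)`, so rare classes appear in batches as often as common ones.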

Stage 3 — Inference acceleration

Line cycle time is a hard constraint. SC-EPL runs at 0.5–2 s per cell; AI inference must fit inside that envelope.

Acceleration stack:

Model compression — knowledge distillation from a ResNet-50-class teacher to a ResNet-18-class student. Accuracy loss <1%, speed up ~3×
Mixed precision — FP32 → FP16 on NVIDIA GPUs, another 2×
Engine compilation — TensorRT / ONNX Runtime converts PyTorch graphs into optimized GPU kernels
Batch optimization — batching across multiple cells exploits GPU parallelism

SC-EPL's AI inference latency stabilizes at 200–400 ms, which fits inside even the fastest 0.5-second cycle and sits well under the 2-second ceiling.
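
The budget arithmetic behind those figures is simple enough to check directly. The sketch below uses only numbers quoted in this section; the helper name `budget_fraction` is hypothetical, and treating the distillation and FP16 gains as multiplicative is an assumption (real speedups rarely compose perfectly).

```python
def budget_fraction(latency_ms: float, cycle_s: float) -> float:
    """Share of one line cycle consumed by inference."""
    return latency_ms / (cycle_s * 1000.0)


# Worst case: slowest quoted inference against the fastest quoted cycle.
worst_case = budget_fraction(latency_ms=400.0, cycle_s=0.5)
# Best case: fastest quoted inference against the slowest quoted cycle.
best_case = budget_fraction(latency_ms=200.0, cycle_s=2.0)

# Assumed multiplicative composition of distillation (~3x) and FP16 (~2x).
combined_speedup = 3.0 * 2.0
```

Even in the worst case inference uses about 80% of the cycle, which leaves limited headroom for image acquisition and I/O on the fastest lines.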

Stage 4 — Deployment

Two deployment patterns:

Pattern A — local inference. Each station has a GPU; model runs locally. Low latency, data never leaves the facility. Higher hardware cost and harder multi-station updates.

Pattern B — edge server, centralized inference. One GPU server per line, all stations stream images to it. Higher hardware utilization, one-shot updates. Network latency, single point of failure.

MVCreate recommends Pattern B for large lines (>10 stations) and Pattern A for R&D and small lines.

Stage 5 — Drift monitoring

AI models are not trained once and forgotten. Two drift modes hit production:

Data drift — process changes (e.g., P-type → N-type) alter EL image statistics, invalidating the old model
Concept drift — customer definitions of "defect" evolve (e.g., from "only severe cracks" to "mild cracks too"), making the old decision boundary stale

Monitoring:

Daily automatic tracking of model confidence distribution
Weekly human review of ~200 freshly scored samples, comparing AI verdict vs. human verdict
Any drop >3% in agreement triggers a retraining cycle
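
The weekly agreement check could look like the sketch below. Two things here are assumptions rather than statements from the source: the 3% drop is read as three percentage points below a baseline agreement rate recorded when the model was released, and the baseline value of 0.97 is invented for the example. The function names are hypothetical.

```python
from typing import List, Tuple


def agreement_rate(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (ai_verdict, human_verdict) pairs that match."""
    if not pairs:
        raise ValueError("need at least one reviewed sample")
    return sum(ai == human for ai, human in pairs) / len(pairs)


def needs_retraining(baseline: float, current: float, max_drop: float = 0.03) -> bool:
    """True when agreement fell by more than `max_drop` (absolute) from baseline."""
    return (baseline - current) > max_drop


# Toy weekly review (hypothetical): 200 samples, 14 disagreements.
weekly_review = [("crack", "crack")] * 186 + [("ok", "crack")] * 14
current = agreement_rate(weekly_review)
retrain = needs_retraining(baseline=0.97, current=current)
```

With 186 of 200 verdicts matching, agreement is 93%, four points below the assumed baseline, so `retrain` is true.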

Stage 6 — Closed-loop iteration

The AI system has to keep seeing and learning from new data. MVCreate's loop:

AI inference on the line auto-collects low-confidence samples (typically 2–5% of all samples)
Monthly human review of these low-confidence samples produces new training data
Quarterly full retraining cycle
New model validated on a held-out set is canary-released (5% → 20% → 100%) to the line
Monitoring during rollout — any anomaly triggers instant rollback
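
A staged canary release with rollback can be sketched as below. The stage fractions come from the loop above; everything else is an illustrative assumption, including hash-based routing by cell id, the function names, and the `stage_is_healthy` callback that stands in for whatever rollout monitoring is in place.

```python
import hashlib
from typing import Callable, Sequence

STAGES = (0.05, 0.20, 1.00)  # canary fractions quoted in the text


def routes_to_new_model(cell_id: str, fraction: float) -> bool:
    """Deterministically send a fixed share of cells to the new model."""
    digest = hashlib.sha256(cell_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < round(fraction * 10_000)


def run_rollout(
    stage_is_healthy: Callable[[float], bool],
    stages: Sequence[float] = STAGES,
) -> float:
    """Walk through the stages; return the final traffic share of the new model.

    Returns 0.0 if any stage reports an anomaly (rollback to the old model).
    """
    for fraction in stages:
        if not stage_is_healthy(fraction):
            return 0.0  # instant rollback
    return stages[-1]


# Hypothetical outcomes: one clean rollout, one anomaly at the 20% stage.
clean = run_rollout(lambda fraction: True)
rolled_back = run_rollout(lambda fraction: fraction < 0.20)
```

Because routing hashes the cell id, a cell served by the new model at 5% stays on the new model at 20%, which keeps before/after comparisons consistent across stages.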

3. Three Products Built on This Stack

SC-EPL (production AI grader)

0.5–2 s per cell
30+ defect classes for crystalline silicon
On-line fine-tuning capability for per-line adaptation
Deployed across major Chinese TOPCon/HJT lines

SC-PLEL-PS (R&D AI analysis)

PL + EL dual-mode
Crystalline silicon + perovskite + tandem support
AI model outputs quantitative parameters (minority-carrier lifetime, series resistance), not just classification
Widely used in R&D labs

SC-DEL family (field AI inspection)

Portable and UAV form factors
AI model hardened for field conditions — light variability, angle variability, distance variability
Edge inference, no cloud dependency

4. Three Engineering Lessons

After many years of this work, three lessons stand out:

Lesson 1: Data matters 10× more than model architecture

Label quality and data hygiene cap everything else. Months spent cleaning labels often beat months spent tuning architecture.

Lesson 2: Recall on severe defects is the only hard target

Over-kill can be mopped up by downstream human review. A miss cannot. Every optimization decision must respect severe-defect recall ≥99.5%.
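
One way to honor that constraint when choosing a decision threshold is sketched below. It assumes the model emits a severity score per cell and that a cell is rejected when its score is at or above the threshold; the function name `pick_threshold` and the toy scores are hypothetical. Among thresholds that keep severe-defect recall at or above the target, it picks the highest one, since a higher threshold means less over-kill.

```python
from typing import List, Tuple


def pick_threshold(
    severe_scores: List[float],
    ok_scores: List[float],
    min_recall: float = 0.995,
) -> Tuple[float, float, float]:
    """Return (threshold, severe_recall, overkill_rate).

    A cell is rejected when score >= threshold.
    """
    if not severe_scores:
        raise ValueError("need severe-defect samples to measure recall")

    # The lowest severe score always gives recall 1.0, so a valid
    # threshold exists whenever min_recall <= 1.0.
    best = (min(severe_scores), 1.0, 0.0)
    for t in sorted(set(severe_scores)):
        recall = sum(s >= t for s in severe_scores) / len(severe_scores)
        if recall < min_recall:
            break  # recall only falls as the threshold rises
        overkill = (
            sum(s >= t for s in ok_scores) / len(ok_scores) if ok_scores else 0.0
        )
        best = (t, recall, overkill)
    return best


# Toy validation scores (hypothetical): 1000 severe cells, 100 good cells.
severe = [0.2] + [0.6] * 9 + [0.9] * 990
good = [0.1] * 90 + [0.7] * 10
threshold, recall, overkill = pick_threshold(severe, good)
```

In this toy set a threshold of 0.9 would drop recall to 99.0%, so the function settles on 0.6 and accepts 10% over-kill, which downstream human review absorbs.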

Lesson 3: Explainability wins customer trust

An AI system on a production line cannot be a black box. MVCreate's AI outputs both a verdict and a saliency heatmap — which region, why. Letting inspectors "see what the model is thinking" earns more trust than another 0.5% accuracy would.

5. Common Misconceptions

Misconception 1 — AI replaces human inspectors. It doesn't. AI handles 90–95% of routine samples; the remaining 5–10% boundary cases still need humans.

Misconception 2 — bigger models are better. On a production line, inference speed is a hard constraint. Real deployed models are typically 5–10× smaller than SOTA.

Misconception 3 — train once, run forever. False. Retrain every 3–6 months or drift silently degrades the model.

Closing

Getting AI defect detection from a slide bullet to a stably running production system means six engineering stages done right: data, training, inference acceleration, deployment, monitoring, iteration. MVCreate's AI product line isn't built on a one-shot "big model"; it's built on a full engineering stack spanning collection, labeling, training, acceleration, deployment, monitoring, and iteration.

To discuss AI-based PV inspection or arrange a demo, reach the MVCreate technical team (+86 159-5048-9233).

Website: www.mvcreate.com

Originally published by Vision Potential / MVCreate.
