Over the past three years we have watched customers swing between two extremes when deciding where to host AI inference for inspection equipment: "all-cloud", on the belief that the cloud offers more compute and faster updates, or "all-edge", out of concern for data privacy and network reliability. Both are wrong. After 50+ production deployments, this is MVCreate's engineering decision framework: score four dimensions independently (latency, bandwidth, maintainability, and data security); the optimum is usually a hybrid architecture.
Latency budgets in PV-line inspection are unforgiving:
| Stage | Cycle | Inference budget | Network round-trip budget |
|---|---|---|---|
| EPL full inspection (mass prod) | 0.5–2 s | <100 ms | 0 (must be local) |
| PLEL all-in-one (pilot) | 15 s | <500 ms | <200 ms |
| MC-W microcrack | 0.6 s | <80 ms | 0 (must be local) |
| EL/IV plant inspection | 5–30 s | <500 ms | <2 s |
| Offline review | unbounded | — | — |
Conclusion: any line-side check with a cycle under 2 s must run on the edge, because even 50–200 ms of network round-trip consumes the entire inference budget. Plant-side scenarios with cycles above 5 s can use the cloud.
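The budget check can be written down directly. The following is a minimal sketch, assuming the network budgets from the table above are treated as strict upper bounds and that a round-trip time has been measured on site; the `Stage` class and `cloud_fits` function are illustrative names, not part of any MVCreate software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    inference_budget_ms: float
    network_budget_ms: float  # 0 means "must be local"


# Upper bounds copied from the latency table (offline review is unbounded and omitted).
STAGES = [
    Stage("EPL full inspection", 100, 0),
    Stage("PLEL all-in-one", 500, 200),
    Stage("MC-W microcrack", 80, 0),
    Stage("EL/IV plant inspection", 500, 2000),
]


def cloud_fits(stage: Stage, round_trip_ms: float) -> bool:
    """True if a measured network round-trip stays under the stage's network budget."""
    return round_trip_ms < stage.network_budget_ms


def cloud_capable_stages(round_trip_ms: float) -> list[str]:
    """Names of the stages whose network budget tolerates the given round-trip."""
    return [s.name for s in STAGES if cloud_fits(s, round_trip_ms)]
```

With a measured round-trip of 150 ms, `cloud_capable_stages(150)` returns only the PLEL and EL/IV stages; the two stages with a zero network budget never qualify, whatever the link quality.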
A point many decision-makers miss — EL image volume:
One 24.16 MP image (lossless PNG): ~10 MB
An 8,000-wafer/h line: 80 GB/h
One workday (20 h): 1.6 TB/day
One month (22 days): 35 TB/month
All-cloud inference means uploading 35 TB/month. Keeping pace with 80 GB/h takes roughly 180 Mbps of sustained uplink, which is infeasible on most cell-factory networks (typically 100 Mbps to 1 Gbps). Even when feasible, public-cloud data-transfer fees are steep (~$5K–7K/month for 35 TB).
Edge inference keeps raw images local and uploads only labels and small defect crops, so the uploaded volume falls to roughly 1/100 of the raw stream.
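The arithmetic behind these figures is easy to reproduce for another line. This sketch assumes one image per wafer and decimal units (1 GB = 1,000 MB), which is what the numbers above imply; the function names are illustrative.

```python
def monthly_volume_tb(image_mb, wafers_per_hour, hours_per_day, days_per_month):
    """Raw image volume per month in TB (decimal units, one image per wafer)."""
    total_mb = image_mb * wafers_per_hour * hours_per_day * days_per_month
    return total_mb / 1_000_000


def required_uplink_mbps(image_mb, wafers_per_hour):
    """Sustained uplink in Mbit/s needed to upload images as fast as the line produces them."""
    return image_mb * 8 * wafers_per_hour / 3600


raw_tb = monthly_volume_tb(10, 8_000, 20, 22)   # 35.2 TB/month
raw_mbps = required_uplink_mbps(10, 8_000)      # about 178 Mbit/s
edge_mbps = raw_mbps / 100                      # about 1.8 Mbit/s at the 1/100 ratio
```

A 100 Mbps link cannot sustain the raw stream at all, while the edge-filtered stream fits comfortably on the same link.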
Cloud advantages live mostly in maintainability:
| Item | Cloud | Edge |
|---|---|---|
| Model updates | One update reaches all lines | Per-device push |
| Fault localization | Central logs, fast | On-site debug, slow |
| A/B testing | Easy | Hard |
| Compute elasticity | High | Low (factory-fixed) |
| Monitoring dashboards | Direct | Pull data up first |
Our compromise is an "edge inference + cloud management" hybrid:
Inference stays on edge — meets latency and bandwidth budgets;
Model registry in cloud — edge nodes pull latest models periodically;
Critical logs to cloud — error events, anomaly confidences, model version metadata;
A/B mechanism — cloud pushes experimental models to a subset of edges and aggregates results.
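One way to pick the "subset of edges" for an experiment is deterministic hashing of the device ID, so a node stays in the same cohort across polls. This is a sketch of that general technique, not a description of MVCreate's actual registry; the function names, device IDs, and version strings are hypothetical.

```python
import hashlib
from typing import Optional


def in_experiment(device_id: str, experiment: str, fraction: float) -> bool:
    """Deterministically place about `fraction` of edge nodes in an experiment."""
    digest = hashlib.sha256(f"{experiment}:{device_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # stable bucket 0..9999
    return bucket < fraction * 10_000


def model_to_pull(
    device_id: str,
    stable_version: str,
    experiment: Optional[str] = None,
    experimental_version: Optional[str] = None,
    fraction: float = 0.0,
) -> str:
    """Version an edge node should fetch on its next periodic poll of the registry."""
    if experiment and experimental_version and in_experiment(device_id, experiment, fraction):
        return experimental_version
    return stable_version
```

Because the cohort depends only on the experiment name and device ID, the cloud side can recompute which nodes ran which model when it aggregates results, without storing an assignment table.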
PV-line inspection data exposes throughput, yield, and process — sensitive commercial information. Pure cloud inference means all customer data flows through our cloud — large customers refuse this.
Our security design:
Raw images never leave the line;
Feature vectors may go to cloud (cannot reconstruct raw images) for federated learning;
Metadata may go to cloud (labels, confidences, timestamps) for monitoring;
Customer kill switch — local config can disable all cloud communication; the device is then fully offline.
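The four rules above reduce to a small upload gate on the device. The sketch below assumes a local boolean setting for the kill switch and a separate opt-in flag for federated learning; `Payload`, `may_upload`, and both flags are illustrative names rather than actual configuration keys.

```python
from enum import Enum


class Payload(Enum):
    RAW_IMAGE = "raw_image"
    FEATURE_VECTOR = "feature_vector"
    METADATA = "metadata"  # labels, confidences, timestamps


def may_upload(kind: Payload, cloud_enabled: bool, federated_opt_in: bool = False) -> bool:
    """Decide whether a payload may leave the device under the rules above."""
    if not cloud_enabled:
        # Customer kill switch: the device is fully offline.
        return False
    if kind is Payload.RAW_IMAGE:
        # Raw images never leave the line.
        return False
    if kind is Payload.FEATURE_VECTOR:
        # Assumption: feature vectors are sent only when federated learning is enabled.
        return federated_opt_in
    return True  # metadata for monitoring
```

Checking the kill switch first means no later rule can accidentally re-enable traffic once the customer has turned cloud communication off.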
We benchmarked SC-EPL (Gen 4 algorithm, INT8 quantized) on several platforms:
| Hardware | Inference per image | System power | Integration cost |
|---|---|---|---|
| NVIDIA Jetson Orin Nano | 70 ms | 15 W | Low |
| NVIDIA Jetson Orin AGX | 28 ms | 60 W | Mid |
| Hailo-8 NPU | 45 ms | 8 W | Mid |
| Cambricon MLU220 | 55 ms | 12 W | Mid |
| CPU (Intel i7-12700) | 800 ms | 65 W | Low (already on hand) |
| Cloud A100 | 12 ms (+ ≥200 ms network) | — | Very high |
MVCreate's default is the Jetson Orin AGX: 28 ms per image fits mass-production cycles, and 60 W is manageable. The Hailo-8 wins where power is paramount (for example, handheld kits).
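The benchmark table can be filtered against a stage's inference budget and a power cap. This is a minimal sketch using only the latency and power figures above; it ignores integration cost, and the cloud A100 row is left out because it has no power figure and depends on the network. `Platform` and `candidates` are illustrative names.

```python
from typing import NamedTuple, Optional


class Platform(NamedTuple):
    name: str
    latency_ms: float
    power_w: float


# Per-image inference time and system power from the benchmark table.
BENCHMARKS = [
    Platform("NVIDIA Jetson Orin Nano", 70, 15),
    Platform("NVIDIA Jetson Orin AGX", 28, 60),
    Platform("Hailo-8 NPU", 45, 8),
    Platform("Cambricon MLU220", 55, 12),
    Platform("CPU (Intel i7-12700)", 800, 65),
]


def candidates(latency_budget_ms: float, power_cap_w: Optional[float] = None) -> list[Platform]:
    """Platforms under the latency budget (and optional power cap), fastest first."""
    fits = [
        p for p in BENCHMARKS
        if p.latency_ms < latency_budget_ms
        and (power_cap_w is None or p.power_w <= power_cap_w)
    ]
    return sorted(fits, key=lambda p: p.latency_ms)
```

For an 80 ms budget, `candidates(80)` lists the Orin AGX first, followed by the Hailo-8, MLU220, and Orin Nano; adding `power_cap_w=10` leaves only the Hailo-8.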
Where we land on each product line:
| Product | Inference location | Hardware | Cloud role |
|---|---|---|---|
| SC-EPL (mass prod) | Edge | Jetson Orin AGX | Updates + monitoring |
| SC-PLEL-PS (all-in-one) | Edge | Jetson Orin AGX + CPU | Archive + monitoring |
| SC-MC-W (microcrack) | Edge | Hailo-8 NPU | Monitoring |
| SC-DEL-Portable | Edge | Qualcomm 8 Gen3 SoC | Optional offline mode |
| SC-DEL-Drone | Edge (airborne) | Jetson Orin Nano | Batch upload after landing |
| SC-EL-Drone | Edge + ground station | Jetson Orin Nano + AGX | Same as above |
| SC-IV-Portable | Edge | ARM Cortex-A78 | Metadata only |
All products default to local inference; cloud handles model distribution, monitoring, archive. Customers may opt in to extended cloud capabilities (federated learning, cross-line analytics).
Simplified:
1. Cycle < 2 s? → must be edge
2. Data sensitive and cloud refused? → must be edge
3. Cycle > 10 s + cloud OK + compute-bound? → cloud is viable
4. Most other cases → edge inference + cloud management (hybrid)
In practice, more than 90% of PV inspection deployments fall into case 4, the hybrid architecture.
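The simplified rules translate into a short function, evaluated in the order listed. This is a sketch; the function name, parameter names, and return labels are illustrative, and "cloud refused" is modelled as `cloud_allowed=False`.

```python
def inference_location(cycle_s: float, cloud_allowed: bool, compute_bound: bool) -> str:
    """Apply the simplified decision rules in order and return the first match."""
    if cycle_s < 2:
        return "edge"    # latency leaves no room for a network round-trip
    if not cloud_allowed:
        return "edge"    # sensitive data and the customer refuses cloud
    if cycle_s > 10 and compute_bound:
        return "cloud"   # slow cycle, cloud accepted, compute is the bottleneck
    return "hybrid"      # edge inference + cloud management
```

For example, `inference_location(0.6, True, False)` returns `"edge"`, while a 5 s cycle with cloud allowed falls through to `"hybrid"`.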
For AI-deployment architecture review or edge-hardware selection, contact MVCreate at +86 159-5048-9233.
Originally published by Vision Potential (Nanjing MVCreate Intelligent Technology Co., Ltd.). Reproductions must credit the source.