TIN MAN v3.0 · MULTIMODAL EDGE AI · FULLY ON-DEVICE · MAJOR RELEASE (JUNE 2026)
Tin Man v3.0 — multimodal edge AI, fully on-device.
A synthetic co-pilot that sees, hears, speaks, and reasons — running entirely on a single NVIDIA Jetson AGX Thor. Air-gapped. Zero cloud. Real-time.
THE DIFFERENTIATOR
Native NVFP4 across the entire multimodal stack.
Most edge AI quantizes only the language model and leaves the vision and audio encoders at higher precision. Tin Man v3.0 runs the full pipeline — vision (ViT), audio (Conformer), and the language head — in native NVFP4, runtime-validated on a single Blackwell-class device, in English and Italian.
The result: a complete sense–think–speak system resident on one embedded module, with memory and throughput headroom that usually demands the cloud.
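As a rough illustration of what block-scaled 4-bit storage means, the sketch below quantizes one block of values in plain Python. It assumes the published NVFP4 layout (4-bit E2M1 values sharing one scale per 16-element block) and simplifies it: the block scale is kept as an ordinary Python float rather than an FP8 E4M3 value, and the second-level per-tensor scale is omitted. The function names are hypothetical, and this is not the Tin Man or TensorRT implementation.

```python
# Magnitudes representable by a 4-bit E2M1 value (a sign bit is stored separately).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16  # elements that share one scale factor


def quantize_block(values):
    """Return (scale, codes) for one block; codes are signed E2M1 magnitudes."""
    amax = max(abs(v) for v in values)
    # Map the largest magnitude in the block onto the largest E2M1 value (6.0).
    scale = amax / 6.0 if amax > 0 else 1.0
    codes = []
    for v in values:
        target = abs(v) / scale
        # Round to the nearest representable magnitude.
        mag = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(-mag if v < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate values from the block scale and the codes."""
    return [scale * c for c in codes]


# Hypothetical weights, evenly spaced from -2.0 to 1.75.
weights = [(i - 8) / 4.0 for i in range(BLOCK)]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Storage cost: 4 bits per value plus one 8-bit scale amortized over the block.
bits_per_value = 4 + 8 / BLOCK
```

Under these assumptions the storage cost works out to 4.5 bits per value, compared with 16 bits for BF16, which is where the memory headroom comes from.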
PERFORMANCE · v3.0 PLATFORM MIGRATION
Up to +36% sustained LLM throughput on the new platform.
v3.0 moves Tin Man to NVIDIA JetPack 7.2; every inference engine was rebuilt on-device. All figures below were measured on this hardware.
LLM · Nemotron-Nano-3-Omni, native NVFP4 (tokens/sec)
| Workload | Prev. gen | v3.0 | Δ |
|---|---|---|---|
| Short greeting (IT) | 19.9 | 24.3 | +22% |
| Short greeting (EN) | 21.1 | 25.8 | +22% |
| Medium technical (IT) | 23.6 | 29.4 | +25% |
| Medium technical (EN) | 25.7 | 28.3 | +10% |
| Long explanation (EN) | 25.2 | 30.7 | +22% |
Per-request gains of +10–25% come from the platform migration and the TensorRT 10.16 recompile. Combined with the new platform's max-performance clocks, they reach ~34 tokens/sec sustained, +36% over the previous baseline.
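The Δ column can be recomputed from the two throughput columns. The sketch below does that in Python and also backs out the sustained baseline implied by "~34 tokens/sec, +36%". That baseline is an inference from the two published numbers, not a figure stated in this release, and the variable names are illustrative.

```python
# (previous generation, v3.0) tokens/sec, copied from the LLM table.
LLM_TOKENS_PER_SEC = {
    "Short greeting (IT)": (19.9, 24.3),
    "Short greeting (EN)": (21.1, 25.8),
    "Medium technical (IT)": (23.6, 29.4),
    "Medium technical (EN)": (25.7, 28.3),
    "Long explanation (EN)": (25.2, 30.7),
}


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Per-workload gains, rounded to whole percent as in the table.
deltas = {
    name: round(percent_change(prev, new))
    for name, (prev, new) in LLM_TOKENS_PER_SEC.items()
}

# Sustained baseline implied by the headline figure (assumption: +36% of ~34).
implied_baseline = 34.0 / 1.36

for name, delta in deltas.items():
    print(f"{name}: +{delta}%")
print(f"implied sustained baseline: ~{implied_baseline:.1f} tokens/sec")
```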
Voice · streaming synthesis first-audio latency (ms)
| Workload | Prev. gen | v3.0 | Δ |
|---|---|---|---|
| Short (IT) | 192 | 146 | −24% |
| Medium technical (IT) | 176 | 144 | −18% |
| Short (EN) | 161 | 155 | −4% |
Perception
Real-time object detection (camera + LIDAR sensor fusion) — ~7 ms end-to-end (≈ −63% vs previous gen), ~20 Hz (camera-limited).
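To put the perception numbers in context, the short calculation below compares the ~7 ms pipeline latency with the frame period of a ~20 Hz camera, and backs out the previous-generation latency implied by the −63% figure. The implied value is an estimate derived from the rounded numbers above, not a measured result.

```python
LATENCY_MS = 7.0        # end-to-end detection latency quoted above
FRAME_RATE_HZ = 20.0    # camera-limited frame rate quoted above
REDUCTION = 0.63        # latency reduction vs previous generation

# Time available per frame at the camera's rate.
frame_period_ms = 1000.0 / FRAME_RATE_HZ

# Fraction of each frame period spent in the detection pipeline.
budget_used = LATENCY_MS / frame_period_ms

# Previous-generation latency implied by the quoted reduction.
implied_previous_ms = LATENCY_MS / (1.0 - REDUCTION)

print(f"frame period: {frame_period_ms:.0f} ms")
print(f"pipeline share of frame period: {budget_used:.0%}")
print(f"implied previous latency: ~{implied_previous_ms:.0f} ms")
```

On these figures the pipeline occupies a small share of each frame period, which is consistent with the rate being limited by the camera rather than by compute.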
Efficiency
~58 W under LLM load against a 120 W budget, GPU at its hardware clock ceiling (1575 MHz, ~96% util). Remaining headroom is parallel throughput, not higher clocks.
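A simple way to read these figures is as energy per generated token. The sketch below assumes the ~58 W draw corresponds to the ~34 tokens/sec sustained rate reported in the performance section; the release does not state that the two were measured together, so treat the result as indicative only.

```python
POWER_W = 58.0           # draw under LLM load, quoted above
BUDGET_W = 120.0         # power budget, quoted above
SUSTAINED_TOK_S = 34.0   # sustained throughput from the performance section

# Energy per generated token, assuming both figures describe the same run.
joules_per_token = POWER_W / SUSTAINED_TOK_S

# Share of the power budget in use, and what remains.
budget_fraction = POWER_W / BUDGET_W
headroom_w = BUDGET_W - POWER_W

print(f"~{joules_per_token:.2f} J/token")
print(f"{budget_fraction:.0%} of budget used, {headroom_w:.0f} W headroom")
```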
CAPABILITIES
One device. Sense, think, and speak.
- 01 · One model, three modalities
  Text, image, and audio in a single on-device NVFP4 model.
- 02 · Real-time voice, 7 languages
  Bidirectional voice in EN, IT, ES, FR, DE, ZH, VI; streaming first-audio latency as low as ~146 ms.
- 03 · Real-time perception
  Object detection with camera + LIDAR sensor fusion.
- 04 · Alchemi: materials-science co-pilot
  Inverse design, de-novo materials generation, and uncertainty quantification.
- 05 · Content Factory
  Knowledge-grounded drafting with citations and an honesty contract.
- 06 · Demonstrated resilience
  Full platform migration (JetPack 7.1 → 7.2) with complete on-device recovery; every engine was rebuilt.
- 07 · 21 cooperating subsystems
  A unified on-device cognitive architecture orchestrating the full stack.
COGNITIVE ARCHITECTURE
21 subsystems. One mind. In unison.
Tin Man v3.0 orchestrates 21 cooperating subsystems on a single Jetson AGX Thor — each isolated in a fault-bounded execution environment, sequenced deterministically, and coordinated by a dispatch fabric with bounded inter-core latency. Perception, language, reasoning, memory, action, and safety run as one harmonized system: real-time, on-device.
Perception & Sensing
- vision — camera-LIDAR sensor fusion
- retina — visual perception (planned)
- brainstem — LIDAR sensor bridge
- lidar_driver — LIDAR sensor driver
Language & Voice
- nemo — multimodal LLM front-door (NVFP4)
- chat — conversational front-door
- riva — speech recognition + 7-language synthesis
- voice_agent — in-room voice agent
Reasoning & Memory
- prefrontal — decision router + intent
- conscience — on-device reasoning LLM (Odino)
- memory — episodic memory + retrieval
Action, Safety & Governance
- realtime — real-time scheduler
- motion — path planning (planned)
- guardian — safety & compliance enforcement
- conscience_observer — meta-cognitive audit observer
Science & Materials
- science — materials simulation
- alchemi_workbench — materials workbench
- alchemi_em — electromagnetic materials simulation
Platform & Orchestration
- gateway — strategic gateway
- cockpit — operator console
- agent — content-factory drafting
Subsystems by cognitive domain — Tin Man v3.0 orchestrates 21 cooperating subsystems in unison (23 nominal, 2 planned for Sprint 4+).
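One common way to get a deterministic startup sequence for cooperating subsystems is to order them by their dependencies. The sketch below uses Python's standard `graphlib.TopologicalSorter` and groups subsystems into waves that can start together. The subsystem names come from the list above, but the dependency edges are hypothetical, chosen only for illustration; the actual Tin Man dependency graph and sequencing mechanism are not published here.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each subsystem maps to the ones it must wait for.
DEPENDS_ON = {
    "lidar_driver": [],
    "brainstem": ["lidar_driver"],
    "vision": ["brainstem"],
    "guardian": [],
    "realtime": ["guardian"],
    "nemo": ["guardian"],
    "chat": ["nemo"],
}


def startup_waves(depends_on):
    """Group subsystems into ordered waves; each wave only needs earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        # Sort names so the result does not depend on dict or set ordering.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for index, wave in enumerate(startup_waves(DEPENDS_ON), start=1):
    print(f"wave {index}: {', '.join(wave)}")
```

Sorting each wave by name makes the sequence reproducible across runs, and a dependency cycle is rejected up front instead of surfacing as a hang at boot.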
ARCHITECTURE
Tin Man is how Shield Brain runs.
Tin Man v3.0 implements the Shield Brain deterministic AI control architecture on NVIDIA Jetson AGX Thor — a unified on-device cognitive architecture orchestrating 21 cooperating subsystems, each isolated in fault-bounded execution environments with deterministic startup sequencing and quota-enforced resources.
Shield Brain is the deterministic AI control architecture covered by a pending patent application filed in Canada in 2025. It defines the hardware-isolated execution model that guarantees safety-critical determinism under variable generative workloads, and Tin Man is where that architecture becomes operational.
TECHNICAL CONTRIBUTION
A 13-line wiring patch falsified a widely assumed limitation.
Native NVFP4 across the vision and audio encoders wasn't free. A widely circulated inference held that TensorRT could not parse NVFP4 dequantization for encoder architectures (ViT, Conformer), which would force a BF16 fallback. We falsified this empirically. The apparent limitation was a small wiring gap: a post-export rewriter step that was wired into the LLM export path but not the encoder path. Once that step is wired through, the encoder ONNX parses cleanly and the engines build natively in NVFP4.
That fix is the foundation of the full-stack native quantization that v3.0 now ships on JetPack 7.2. The fix pattern, popularly cited as the 13-line wiring patch, is precisely 12 insertions across 6 files in the TRT-Edge-LLM rewriter layer, and it is reusable for any NVFP4-quantized encoder architecture.
We documented the finding and prepared a bundle of six atomic patches for upstream contribution to NVIDIA TRT-Edge-LLM.
- 13 LOC
  the popularly cited figure (precisely 12 insertions across 6 files)
- +331 / −63 LOC
  total atomic patch bundle for upstream
- 6 patches · 3 rounds
  NVFP4 encoder export wiring · prepared for upstream
PLATFORM
NVIDIA Jetson AGX Thor — Blackwell embedded.
- NVIDIA JetPack 7.2
- TensorRT 10.16.2
- CUDA 13.2
- Blackwell (sm_110a)
- 128 GB unified memory
- GPU: NVIDIA Blackwell embedded (sm_110a) · 80 Tensor Cores Gen 5 · 2560 CUDA cores
- CPU: 14× Arm Neoverse V3AE + efficiency cluster
- Unified memory: 128 GB LPDDR5x · ~273 GB/s bandwidth
- Storage: NVMe PCIe 5.0 · ~14 GB/s sequential read
- Software stack: JetPack 7.2 · CUDA 13.2 · TensorRT 10.16.2 · native NVFP4 runtime
- Deployment: Containerized · air-gap capable · zero cloud
DIRECT CONTROL ROADMAP
Cognitive cluster as the direct actuator.
The roadmap introduces direct motor and actuator control from the cognitive cluster, with the flight controller serving as a dormant safety pilot. This architecture eliminates the latency overhead of the companion-computer-to-flight-controller link that is standard in commercial autonomy stacks.
ECOSYSTEM
NVIDIA Inception member. Ready to contribute upstream.
Reinventy is a member of the NVIDIA Inception program. Tin Man v3.0 runs the full multimodal stack natively in NVFP4 — including encoder paths the official NVIDIA reference recipe does not yet quantize natively.
The encoder NVFP4 export-wiring fix documented above is reusable beyond Reinventy's own deployment, and we have prepared a six-patch bundle ready for community review.
ENGAGE
Capability briefs are released under partnership.
The v3.0 release report, the technical context for the upstream patch bundle, integration roadmaps, and the partnership pathway are released under a non-disclosure agreement. Reach out and we will route the conversation to the technical lead.
Direct: engage@reinventy-solutions.ca