Reinventy

TIN MAN v3.0 · MULTIMODAL EDGE AI · FULLY ON-DEVICE · MAJOR RELEASE (JUNE 2026)

Tin Man v3.0 — multimodal edge AI, fully on-device.

A synthetic co-pilot that sees, hears, speaks, and reasons — running entirely on a single NVIDIA Jetson AGX Thor. Air-gapped. Zero cloud. Real-time.

THE DIFFERENTIATOR

Native NVFP4 across the entire multimodal stack.

Most edge AI quantizes only the language model and leaves the vision and audio encoders at higher precision. Tin Man v3.0 runs the full pipeline — vision (ViT), audio (Conformer), and the language head — in native NVFP4, runtime-validated on a single Blackwell-class device, in English and Italian.

The result: a complete sense–think–speak system resident on one embedded module, with memory and throughput headroom that usually demands the cloud.

PERFORMANCE · v3.0 PLATFORM MIGRATION

Up to +36% sustained LLM throughput on the new platform.

v3.0 moves Tin Man to NVIDIA JetPack 7.2; every inference engine was rebuilt on-device. Figures measured on this hardware.

LLM · Nemotron-Nano-3-Omni, native NVFP4 (tokens/sec)

Workload                Prev. gen   v3.0    Δ
Short greeting (IT)     19.9        24.3    +22%
Short greeting (EN)     21.1        25.8    +22%
Medium technical (IT)   23.6        29.4    +25%
Medium technical (EN)   25.7        28.3    +10%
Long explanation (EN)   25.2        30.7    +22%

Per-request gains of +10–25% come from the platform migration and the TensorRT 10.16 recompile. Combined with the new platform's max-performance clocks, they yield ~34 tokens/sec sustained, +36% over the previous baseline.
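The Δ column can be re-derived from the two measured columns. The sketch below recomputes each relative change from the published figures and back-solves the baseline implied by the sustained number. The function and variable names are illustrative and not part of Tin Man.

```python
# Recompute the Δ column from the published tokens/sec figures.
# Only the numbers come from the table; all names are illustrative.

def pct_change(prev: float, new: float) -> float:
    """Relative change from prev to new, in percent."""
    return (new - prev) / prev * 100.0

llm_tokens_per_sec = {
    "Short greeting (IT)": (19.9, 24.3),
    "Short greeting (EN)": (21.1, 25.8),
    "Medium technical (IT)": (23.6, 29.4),
    "Medium technical (EN)": (25.7, 28.3),
    "Long explanation (EN)": (25.2, 30.7),
}

# Round to whole percent, as the table does.
deltas = {
    name: round(pct_change(prev, new))
    for name, (prev, new) in llm_tokens_per_sec.items()
}
for name, delta in deltas.items():
    print(f"{name:24s} {delta:+d}%")

# Baseline implied by "~34 tokens/sec sustained, +36%": 34 / 1.36.
implied_baseline = 34.0 / 1.36
print(f"implied previous sustained baseline: ~{implied_baseline:.1f} tokens/sec")
```

The per-request deltas and the sustained figure are different measurements, so the implied baseline of roughly 25 tokens/sec is only a consistency check on the +36% claim.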

Voice · streaming synthesis first-audio latency (ms)

Workload                Prev. gen   v3.0    Δ
Short (IT)              192         146     −24%
Medium technical (IT)   176         144     −18%
Short (EN)              161         155     −4%
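First-audio latency is the time from submitting text to receiving the first audio chunk from a streaming synthesizer. The following is a minimal sketch of how such a figure can be measured. `fake_streaming_tts` is a hypothetical stand-in for a real streaming synthesizer, and nothing here reflects Tin Man's actual measurement harness.

```python
import time
from typing import Callable, Iterator

def first_audio_latency_ms(
    synthesize: Callable[[str], Iterator[bytes]], text: str
) -> float:
    """Milliseconds from request submission to the first audio chunk."""
    start = time.perf_counter()
    stream = synthesize(text)
    first_chunk = next(stream, None)  # blocks until the first chunk arrives
    if first_chunk is None:
        raise ValueError("synthesizer produced no audio")
    return (time.perf_counter() - start) * 1000.0

def fake_streaming_tts(text: str) -> Iterator[bytes]:
    """HYPOTHETICAL synthesizer: ~20 ms startup, then one chunk per word."""
    time.sleep(0.02)
    for _ in text.split():
        yield b"\x00" * 320  # placeholder audio bytes

latency = first_audio_latency_ms(fake_streaming_tts, "ciao, come stai?")
print(f"first audio after {latency:.1f} ms")
```

Timing only the first chunk, rather than the full utterance, is what makes the metric independent of utterance length once streaming has started.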

Perception

Real-time object detection (camera + LIDAR sensor fusion) — ~7 ms end-to-end (≈ −63% vs previous gen), ~20 Hz (camera-limited).
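The "camera-limited" note follows from simple arithmetic: at ~20 Hz a new frame arrives every 50 ms, while the fused detection path takes ~7 ms. A short sketch using only the figures quoted above (variable names are illustrative, and the maximum-rate estimate assumes frames are processed one at a time with no pipelining):

```python
# Frame-budget arithmetic for the perception path, from the quoted figures.
frame_rate_hz = 20.0        # camera-limited rate quoted above
pipeline_latency_ms = 7.0   # end-to-end detection latency quoted above

frame_period_ms = 1000.0 / frame_rate_hz              # time between frames
budget_used = pipeline_latency_ms / frame_period_ms   # share of each period spent

# ASSUMPTION: strictly serial processing, one frame in flight at a time.
max_rate_hz = 1000.0 / pipeline_latency_ms

# Previous-generation latency implied by "≈ −63%": 7 / (1 − 0.63).
implied_prev_ms = pipeline_latency_ms / (1.0 - 0.63)

print(f"frame period: {frame_period_ms:.0f} ms")
print(f"budget used:  {budget_used:.0%}")
print(f"serial upper bound: ~{max_rate_hz:.0f} Hz")
print(f"implied previous latency: ~{implied_prev_ms:.0f} ms")
```

Because the pipeline uses only a small share of each 50 ms frame period, the sensor frame rate, not the compute path, sets the 20 Hz ceiling.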

Efficiency

~58 W under LLM load against a 120 W budget, with the GPU at its hardware clock ceiling (1575 MHz, ~96% utilization). The remaining headroom is parallel throughput, not higher clocks.

CAPABILITIES

One device. Sense, think, and speak.

  • 01

    One model, three modalities

    Text, image, and audio in a single on-device NVFP4 model.

  • 02

    Real-time voice, 7 languages

    Bidirectional voice in EN, IT, ES, FR, DE, ZH, VI, with streaming first-audio latency as low as ~146 ms (short Italian workload).

  • 03

    Real-time perception

    Object detection with camera + LIDAR sensor fusion.

  • 04

    Alchemi — materials-science co-pilot

    Inverse design, de-novo materials generation, and uncertainty quantification.

  • 05

    Content Factory

    Knowledge-grounded drafting with citations and an honesty contract.

  • 06

    Demonstrated resilience

    Full platform migration (JetPack 7.1 → 7.2) with complete on-device recovery — every engine rebuilt.

  • 07

    21 cooperating subsystems

    A unified on-device cognitive architecture orchestrating the full stack.

COGNITIVE ARCHITECTURE

21 subsystems. One mind. In unison.

Tin Man v3.0 orchestrates 21 cooperating subsystems on a single Jetson AGX Thor — each isolated in a fault-bounded execution environment, sequenced deterministically, and coordinated by a dispatch fabric with bounded inter-core latency. Perception, language, reasoning, memory, action, and safety run as one harmonized system: real-time, on-device.

  • Perception & Sensing

    • vision — camera-LIDAR sensor fusion
    • retina — visual perception (planned)
    • brainstem — LIDAR sensor bridge
    • lidar_driver — LIDAR sensor driver
  • Language & Voice

    • nemo — multimodal LLM front-door (NVFP4)
    • chat — conversational front-door
    • riva — speech recognition + 7-language synthesis
    • voice_agent — in-room voice agent
  • Reasoning & Memory

    • prefrontal — decision router + intent
    • conscience — on-device reasoning LLM (Odino)
    • memory — episodic memory + retrieval
  • Action, Safety & Governance

    • realtime — real-time scheduler
    • motion — path planning (planned)
    • guardian — safety & compliance enforcement
    • conscience_observer — meta-cognitive audit observer
  • Science & Materials

    • science — materials simulation
    • alchemi_workbench — materials workbench
    • alchemi_em — electromagnetic materials simulation
  • Platform & Orchestration

    • gateway — strategic gateway
    • cockpit — operator console
    • agent — content-factory drafting

Subsystems by cognitive domain — Tin Man v3.0 orchestrates 21 cooperating subsystems in unison (23 nominal, 2 planned for Sprint 4+).
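Deterministic startup sequencing means a given set of subsystem dependencies always yields the same start order. One way to get that property is a topological sort with a fixed tie-break, sketched below. The subsystem names come from the list above, but the dependency map is entirely HYPOTHETICAL and does not describe Tin Man's real startup graph or the Shield Brain implementation.

```python
import heapq

# HYPOTHETICAL dependency map: subsystem -> subsystems that must start first.
DEPENDS_ON = {
    "lidar_driver": [],
    "brainstem": ["lidar_driver"],
    "vision": ["brainstem"],
    "guardian": [],
    "realtime": ["guardian"],
    "nemo": ["guardian"],
    "riva": ["guardian"],
    "voice_agent": ["riva", "nemo"],
    "prefrontal": ["nemo", "vision"],
}

def startup_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Kahn's algorithm with a min-heap, so ties always break alphabetically."""
    remaining = {name: set(deps) for name, deps in depends_on.items()}
    dependents: dict[str, list[str]] = {name: [] for name in depends_on}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)  # KeyError if a dependency is unknown

    ready = [name for name, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        name = heapq.heappop(ready)  # smallest name among startable subsystems
        order.append(name)
        for child in dependents[name]:
            remaining[child].discard(name)
            if not remaining[child]:
                heapq.heappush(ready, child)

    if len(order) != len(depends_on):
        raise ValueError("dependency cycle detected")
    return order

order = startup_order(DEPENDS_ON)
print(" -> ".join(order))
```

The heap makes the order a pure function of the dependency map, so it does not vary with dictionary insertion order or scheduling jitter.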

ARCHITECTURE

Tin Man is how Shield Brain runs.

Tin Man v3.0 implements the Shield Brain deterministic AI control architecture on NVIDIA Jetson AGX Thor — a unified on-device cognitive architecture orchestrating 21 cooperating subsystems, each isolated in fault-bounded execution environments with deterministic startup sequencing and quota-enforced resources.

Shield Brain is the deterministic AI control architecture protected by a patent application filed in Canada (2025, pending). It defines the hardware-isolated execution model that guarantees safety-critical determinism under variable generative workloads — and Tin Man is where that architecture becomes operational.


TECHNICAL CONTRIBUTION

A 13-line wiring patch falsified a public-domain limitation.

Native NVFP4 across the vision and audio encoders wasn't free. A public-domain inference held that TensorRT could not parse NVFP4 dequantization for encoder architectures (ViT, Conformer), mandating BF16 fallback. We empirically falsified this. The apparent limitation was a small wiring gap: a post-export rewriter step was wired into the LLM export path but not into the encoder path. Once wired through, the encoder ONNX parses cleanly and the engines build natively in NVFP4.

That fix is the foundation of the full-stack native quantization that v3.0 now ships on JetPack 7.2. The fix-pattern, popularly cited as the 13-line wiring patch, applies as 12 insertions across 6 files in the TRT-Edge-LLM rewriter layer, and is reusable for any NVFP4-quantized encoder architecture.

We documented the finding and prepared a bundle of six atomic patches for upstream contribution to NVIDIA TRT-Edge-LLM.
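For readers unfamiliar with what "NVFP4 dequantization" involves, the following is a simplified sketch of block-scaled 4-bit quantization in the NVFP4 style: values are grouped into blocks of 16, each block stores one scale, and each element is stored as a 4-bit E2M1 value. The sketch keeps the block scale as an ordinary float instead of rounding it to FP8, omits the per-tensor scale, and breaks rounding ties toward the smaller magnitude. It is an illustration only and does not reproduce TensorRT or TRT-Edge-LLM behaviour.

```python
# Simplified NVFP4-style block quantization (illustrative, not bit-exact).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)  # representable |x|
BLOCK = 16  # elements sharing one scale

def quantize_block(values: list[float]) -> tuple[float, list[float]]:
    """Return (scale, codes) where each code is a signed E2M1 grid value."""
    amax = max(abs(v) for v in values)
    # Map the largest magnitude in the block onto the largest grid value (6).
    scale = amax / 6.0 if amax > 0 else 1.0
    codes = []
    for v in values:
        # Nearest grid magnitude; ties go to the smaller one (simplification).
        mag = min(E2M1_MAGNITUDES, key=lambda m: abs(abs(v) / scale - m))
        codes.append(-mag if v < 0 else mag)
    return scale, codes

def dequantize_block(scale: float, codes: list[float]) -> list[float]:
    """Dequantization is one multiply per element: scale * code."""
    return [scale * c for c in codes]

block = [i / 4 - 2 for i in range(BLOCK)]  # 16 sample values from -2.0 to 1.75
scale, codes = quantize_block(block)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(block, restored))
print(f"scale={scale:.4f} max_error={max_error:.4f}")
```

Dequantization is the step an inference engine must recognise in the exported graph: a multiply of low-precision codes by a per-block scale. That is the operation the text above says had to be wired through for the encoder path.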

  • 13 LOC

    popularly cited (12 insertions across 6 files, precise)

  • +331 / −63 LOC

    total atomic patch bundle for upstream

  • 6 patches · 3 rounds

    NVFP4 encoder export wiring · prepared for upstream

PLATFORM

NVIDIA Jetson AGX Thor — Blackwell embedded.

  • NVIDIA JetPack 7.2
  • TensorRT 10.16.2
  • CUDA 13.2
  • Blackwell (sm_110a)
  • 128 GB unified memory
  • GPU: NVIDIA Blackwell embedded (sm_110a) · 96 Tensor Cores Gen 5 · 2560 CUDA cores
  • CPU: 14× Arm Neoverse V3AE + efficiency cluster
  • Unified memory: 128 GB LPDDR5x · ~273 GB/s bandwidth
  • Storage: NVMe PCIe 5.0 · ~14 GB/s sequential read
  • Software stack: JetPack 7.2 · CUDA 13.2 · TensorRT 10.16.2 · native NVFP4 runtime
  • Deployment: Containerized · air-gap capable · zero cloud

DIRECT CONTROL ROADMAP

Cognitive cluster as the direct actuator.

The roadmap introduces direct motor and actuator control from the cognitive cluster, with the flight controller serving as a dormant safety pilot. This architecture eliminates the latency overhead of the companion-computer-to-flight-controller communication that is standard in commercial autonomy stacks.

ECOSYSTEM

NVIDIA Inception member. Ready to contribute upstream.

Reinventy is a member of the NVIDIA Inception program. Tin Man v3.0 runs the full multimodal stack natively in NVFP4 — including encoder paths the official NVIDIA reference recipe does not yet quantize natively.

The encoder NVFP4 export-wiring fix documented above is reusable beyond Reinventy's own deployment, and we have prepared a six-patch bundle ready for community review.

ENGAGE

Capability briefs are released under partnership.

The v3.0 release report, technical context for the upstream patch bundle, integration roadmaps, and the partnership pathway are released under a non-disclosure agreement. Reach out and we will route the conversation to the technical lead.

Direct: engage@reinventy-solutions.ca