TIN MAN v2.1 · COGNITIVE CLUSTER · NVIDIA BLACKWELL EMBEDDED
The first publicly documented deployment of Nemotron-3-Nano-Omni V3 running 100% natively in NVFP4 — encoders included.
A 12-core multimodal cognitive cluster engineered for sovereign edge autonomy. Tin Man v2.1 runs the full Nemotron-3-Nano-Omni V3 stack — LLM body, RADIO visual encoder, Parakeet audio encoder — natively in NVFP4 on a single NVIDIA Jetson AGX Thor 5000 node. A configuration the official NVIDIA reference recipe leaves in BF16.
ARCHITECTURE
12 cores. One cognitive cluster.
Tin Man v2.1 implements the Shield Brain deterministic AI control architecture on NVIDIA Jetson AGX Thor 5000. The cluster is a set of containerized cognitive cores, each isolated in a fault-bounded execution environment with deterministic startup sequencing and quota-enforced resources. Sprint 2 brings the first core — omni_core — to operational status, multimodal end-to-end.
- omni_core multimodal cognition (LLM + Vision + Audio) ACTIVE
- retina_core 4K video preprocessing via DeepStream PLANNED
- riva_core ASR + TTS + NMT (Riva ARM64) PLANNED
- brainstem_core ROS2 Isaac + LiDAR sensor fusion PLANNED
- prefrontal_core decision controller + multimodal prompt assembly PLANNED
- guardian_core AI Act safety gates · deterministic rule engine PLANNED
- realtime_core reflex layer · sub-100 ms hard real-time loop PLANNED
- isr_core pattern recognition · intelligence layer PLANNED
- memory_core RAG + episodic context store PLANNED
- motion_core path planning · trajectory control PLANNED
- gateway_core API I/O PLANNED
- science_core ad-hoc compute · experimental workloads PLANNED
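The container topology above can be sketched as a Compose fragment. Everything in this sketch is illustrative — image names, quotas, and health endpoints are assumptions, not the project's actual configuration — but it shows the three mechanisms the cluster design names: fault bounding, quota enforcement, and deterministic startup sequencing.

```yaml
services:
  omni_core:
    image: tinman/omni_core:2.1        # hypothetical image name
    restart: on-failure                # fault-bounded: a crash stays inside the core
    mem_limit: 24g                     # quota-enforced resources (illustrative values)
    cpus: "6.0"
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8000/health"]
      interval: 10s
      retries: 6
  prefrontal_core:
    image: tinman/prefrontal_core:2.1  # hypothetical image name
    depends_on:
      omni_core:
        condition: service_healthy     # deterministic startup sequencing
```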
The hardware platform — Jetson AGX Thor 5000 with 128 GiB LPDDR5x unified memory — has approximately 75 GiB of headroom available for Sprint 3 expansion without compromising the cognitive trunk performance characteristics observed in Sprint 2.
PERFORMANCE · SPRINT 2
Numbers from a single Jetson AGX Thor node.
All metrics below are from the Sprint 2 closure benchmarks (2026-05-12), reproducible from the documented engine builds. The cluster is 19.4 GiB on-disk in total, with approximately 1 GiB runtime activation in steady state.
Table A · Cluster engine inventory
| Component | Build duration | Engine size | Runtime activation |
|---|---|---|---|
| LLM (Nemotron-3-Nano-Omni V3, 30B MoE A3B) | 1 m 45 s | 18.14 GiB | 134 MB |
| Visual encoder (RADIO C-RADIOv2-H, ViT-H/16) | 27.7 s | 768 MiB | 115 MB |
| Audio encoder (Parakeet Fast Conformer) | 2 m 04 s | 565 MiB | 136 MB |
| Cluster total | — | 19.4 GiB on-disk | ~390 MB runtime |
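The cluster totals in the table can be reproduced from the per-component rows. A quick sketch, using binary GiB/MiB conversions:

```python
# Reproduce the "Cluster total" row of Table A from the per-component rows.
# Sizes are copied from the table; conversions are binary (1 GiB = 1024 MiB).

engine_sizes_mib = {
    "llm": 18.14 * 1024,   # 18.14 GiB
    "radio": 768,          # 768 MiB
    "parakeet": 565,       # 565 MiB
}
runtime_mb = {"llm": 134, "radio": 115, "parakeet": 136}

total_gib = sum(engine_sizes_mib.values()) / 1024
total_runtime_mb = sum(runtime_mb.values())

print(f"on-disk total: {total_gib:.1f} GiB")    # ~19.4 GiB
print(f"runtime total: {total_runtime_mb} MB")  # 385 MB, reported above as ~390 MB
```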
Table B · Long-context inference (Hybrid Mamba prefill invariance)
| Input context | Setup time | Inference (prefill + decode) | Output tokens | Aggregate e2e throughput |
|---|---|---|---|---|
| 8K | 13 s | 7 s | 113 | ~1.5K tok/s |
| 32K | 13 s | 17 s | 122 | ~2.4K tok/s |
| 64K | 13 s | 31 s | 96 | ~2.5K tok/s |
| 100K | 12 s | 48 s | 106 | ~2.6K tok/s |
Prefill throughput remains constant at approximately 2.5K tok/s from 8K to 100K input tokens. This is a characteristic of the hybrid Mamba-2 + Transformer GQA architecture (46 O(N) Mamba layers + 6 O(N²) attention layers; 30B total parameters, 3B active per token) that pure-transformer architectures cannot match at this scale on embedded hardware.
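A toy cost model makes the mechanism concrete. The constants below are illustrative, not measured: as long as the per-token cost contributed by the six O(N²) attention layers stays small relative to the 46 O(N) Mamba layers over this context range, throughput stays nearly flat.

```python
# Illustrative prefill cost model (not measured numbers): per-token cost of a
# hybrid stack with 46 linear-cost Mamba layers and 6 quadratic-cost attention
# layers. A_MAMBA and B_ATTN are arbitrary weights chosen so the attention term
# stays small over 8K-100K tokens -- the regime where throughput appears flat.

A_MAMBA = 1.0   # cost per token per Mamba layer (illustrative)
B_ATTN = 1e-6   # cost per token per attention layer, per context token (illustrative)

def relative_throughput(n_ctx: int) -> float:
    per_token_cost = 46 * A_MAMBA + 6 * B_ATTN * n_ctx
    return 1.0 / per_token_cost

for n in (8_000, 32_000, 64_000, 100_000):
    ratio = relative_throughput(n) / relative_throughput(8_000)
    print(n, round(ratio, 3))
```

With a pure transformer (all layers quadratic), the same model would show per-token cost growing linearly with context, i.e. throughput falling steadily from 8K to 100K instead of holding flat.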
TECHNICAL CONTRIBUTION
A 13-line wiring patch falsified a publicly documented limitation.
A publicly circulated claim held that TensorRT 10.13.3.9 could not parse NVFP4 dequantization for encoder architectures (ViT, Conformer), mandating a BF16 fallback. We falsified this empirically: the apparent limitation was a 13-line wiring gap in a post-export rewriter, present in the LLM export path but missing from the encoder export path. Once wired through, the encoder ONNX parses cleanly and the engines build natively in NVFP4.
We documented the finding and submitted a bundle of six atomic patches for upstream contribution to NVIDIA TRT-Edge-LLM. The fix pattern is reusable for any NVFP4-quantized encoder architecture in the upstream codebase.
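The actual upstream patch is not reproduced here. As a purely hypothetical sketch of the fix pattern — toy node dicts standing in for ONNX graph nodes, with invented op and field names — the shape of the gap looks like this:

```python
# Hypothetical sketch of the fix pattern, NOT the actual upstream patch.
# Nodes are toy dicts standing in for ONNX graph nodes; op names are invented.

def rewrite_nvfp4_dequant(nodes):
    """Rewrite raw quantized-weight ops into parser-friendly dequant + matmul."""
    rewritten = []
    for node in nodes:
        if node["op"] == "NVFP4_QuantizedMatMul":  # placeholder op name
            # Split into an explicit DequantizeLinear the parser understands,
            # followed by a plain MatMul on the dequantized weight.
            rewritten.append({"op": "DequantizeLinear",
                              "inputs": [node["inputs"][1], node["scale"]]})
            rewritten.append({"op": "MatMul",
                              "inputs": [node["inputs"][0], "dequantized_w"]})
        else:
            rewritten.append(node)
    return rewritten

def export(graph, apply_rewriter: bool):
    # The "wiring gap": the LLM export path called the rewriter, the encoder
    # export path did not -- so encoder graphs reached the parser unrewritten.
    return rewrite_nvfp4_dequant(graph) if apply_rewriter else graph
```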
- 13 LOC — the wiring fix in the post-export rewriter
- +331 / −63 LOC — the total atomic patch bundle for upstream
- 6 patches · 3 rounds — vendored on the TRT-Edge-LLM v0.7.0 base
HARDWARE PLATFORM
NVIDIA Jetson AGX Thor 5000 — Blackwell embedded.
Tin Man v2.1 runs on a single Jetson AGX Thor 5000 node. The hardware platform was selected for the combination of unified memory capacity, Tensor Core generation, and embedded deployability that the cluster's design envelope required.
- GPU NVIDIA Blackwell embedded (sm_110a) · 80 Tensor Cores Gen 5 · 2560 CUDA cores
- CPU 14× Arm Neoverse V3AE + efficiency cluster
- Unified memory 128 GiB LPDDR5x · ~273 GB/s bandwidth
- Storage NVMe PCIe 5.0 · ~14 GB/s sequential read
- Software stack JetPack 7.0 · CUDA 13.0 · TensorRT 10.13.3.9 · TRT-Edge-LLM v0.7.0
- Deployment Containerized (Docker 27.5.1 + NVIDIA Container Toolkit CDI) · air-gap capable
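As a deployment sketch: under the NVIDIA Container Toolkit's CDI mode (supported by Docker 25+ with the CDI feature enabled), GPU access is requested with a `--device` spec rather than the legacy runtime flag. The image tag below is hypothetical, and `--network none` simply reflects the air-gap posture.

```shell
# Launch a core container with GPU access via CDI. Image tag is hypothetical.
docker run --rm \
  --device nvidia.com/gpu=all \
  --network none \
  tinman/omni_core:2.1
```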
INTEGRATION
Tin Man is how Shield Brain runs.
Shield Brain is the deterministic AI control architecture described in a patent application filed in Canada (2025, pending). It defines the cognitive cores — Prefrontal, Guardian, Realtime, and supporting cores — and the hardware-isolated execution model that guarantees safety-critical determinism under variable generative workloads.
Tin Man v2.1 is the implementation of Shield Brain on Jetson AGX Thor — the cluster where the architecture becomes operational. The Sprint 2 deliverable activates omni_core (multimodal cognition); Sprint 3+ activates the additional cores defined by the Shield Brain framework.
ECOSYSTEM
NVIDIA Inception member. Contributing upstream.
Reinventy is a member of the NVIDIA Inception program. Tin Man v2.1 is the first publicly documented edge deployment to characterize Nemotron-3-Nano-Omni V3's native NVFP4 behavior end-to-end, including the encoder paths the official NVIDIA reference recipe does not yet quantize natively.
Sprint 2 deliverables are formatted for upstream contribution to NVIDIA TRT-Edge-LLM. The fix-pattern documented above (encoder NVFP4 export wiring) is reusable beyond Reinventy's own deployment, and we have prepared a 6-patch bundle PR ready for community review.
- NVIDIA Inception Member
- Upstream PR bundle ready: TRT-Edge-LLM v0.7.0 + 6 patches
- Sprint 2 report: external-distribution-ready under partnership
ENGAGE
Capability briefs are released under partnership.
The Sprint 2 performance report, the upstream patch bundle technical context, integration roadmaps, and the partnership pathway are released under non-disclosure agreement. Reach out and we will route the conversation to the technical lead.
Direct: engage@reinventy-solutions.ca