Synchronized first-person video, gaze tracking, and IMU telemetry // perception training · robotics · driver monitoring · wearable AI
RT-Fusion delivers the human perspective data that no vehicle, robot, or fixed camera can generate — synchronized first-person video, gaze direction, and 200Hz IMU telemetry. For autonomous driving, robotics, driver monitoring, and wearable AI. EU-native acquisition across 6 countries.
Operational Capacity: 4h+ continuous World-View (GoPro 5.3K) combined with Gaze-View (Ray-Ban Meta) — audio-synced, running in parallel.
BUILT FOR:
Resolution: 5.3K
Telemetry: 200Hz IMU
Scenario: Real-World Weather
Whether you build for autonomous vehicles, robotics, driver monitoring, or wearable AI — your model needs human perspective data that no vehicle, robot, or fixed camera can generate. RT-Fusion captures this signal at human eye level, across real European environments, on-demand.
Bounding boxes don't show intent. A pedestrian's gaze direction 400ms before a crossing decision is invisible to vehicle and robot sensors. RT-Fusion captures this signal at human eye level.
A robot can record its own telemetry, but it can't record what a human body does while performing the same task. First-person video with gaze and IMU from real human activity — the training data imitation learning needs.
IR cabin cameras approximate driver gaze. RT-Fusion captures visible-light gaze ground truth on real European roads — where the driver actually looked when the traffic event occurred.
GoPro-on-helmet footage differs from smart-glasses footage in perspective, motion, and social context. RT-Fusion captures on the exact Ray-Ban Meta device that wearable AI products ship on.
A pedestrian wears the rig in real traffic. The Ray-Ban Meta captures where they look before stepping off a kerb — gaze direction, head-pose, social signaling. The GoPro captures synchronized world-view context with 200Hz IMU and GPS. Together, this produces the intent signal that no vehicle-mounted or robot-mounted sensor can generate: what the human was attending to before they acted. Delivered as time-stamped MP4 + GPMF telemetry, ingestible via PyTorch DataLoader, ROS 2 bag, or OpenCV.
A person wears the rig while performing physical tasks — climbing stairs, opening doors, navigating terrain, carrying objects. The Ray-Ban Meta captures first-person POV with natural head movement and gaze direction. The GoPro captures synchronized wider context with 200Hz IMU telemetry. This produces exactly what imitation learning, behavior cloning, and VLA architectures need: large-scale egocentric human demonstration video showing not just what the person did, but where they looked while doing it. No robot fleet can generate this data — it requires a human performing the task.
The driver wears Ray-Ban Meta glasses while driving real routes. The glasses capture exactly where the driver looks — road ahead, mirrors, phone, passengers, blind spot checks, gaze shifts at intersections and roundabouts. The GoPro on the dashboard captures the road scene simultaneously. Together, this produces synchronized driver gaze ground truth paired with traffic context, on real European roads, at a fraction of instrumented vehicle rig cost. No IR cabin camera provides this level of visible-light gaze fidelity with scene context.
A person wears the Ray-Ban Meta doing everyday activities — navigating city streets, shopping, commuting, socializing. The glasses capture first-person POV from the exact consumer device that Meta and its ecosystem partners are building for. This is not GoPro-on-helmet footage — the perspective, motion patterns, social context, and field of view match the actual user experience of smart glasses. The synchronized GoPro adds wider context with 200Hz IMU and GPS telemetry for spatial grounding.
Chest-mounted GoPro (world-view) + head-worn Ray-Ban Meta (gaze-view), running in parallel with audio-synced timestamps.
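The sync can be verified or re-derived offline by cross-correlating the two audio tracks. A minimal sketch, assuming both tracks have first been extracted to mono 16kHz WAV (e.g. with ffmpeg); file names mirror the sample metadata and are illustrative only:

# Sketch: estimate the offset between the two recordings from their
# shared audio. Assumes both MP4 audio tracks were extracted to mono
# 16kHz WAV beforehand; file names are illustrative, not delivery spec.
import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate

RATE = 16000

def load_mono(path):
    rate, samples = wavfile.read(path)
    assert rate == RATE, f"resample {path} to {RATE} Hz first"
    if samples.ndim > 1:
        samples = samples.mean(axis=1)  # stereo -> mono
    samples = samples.astype(np.float64)
    return samples / (np.abs(samples).max() + 1e-9)

world = load_mono("GH010492.wav")  # GoPro world-view audio
gaze = load_mono("RM010492.wav")   # Ray-Ban Meta gaze-view audio

# The cross-correlation peak gives the lag (in samples) of the
# gaze-view track relative to the world-view track.
corr = correlate(world, gaze, mode="full")
lag = np.argmax(corr) - (len(gaze) - 1)
print(f"gaze-view offset: {lag / RATE:+.3f}s relative to world-view")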
HARDWARE: GOPRO HERO 13 (CUSTOM ACQUISITION RIG)
Captures the "World Model." High dynamic range handles the "Tunnel Exit" blinding light problem. Rolling shutter stress-tests VIO pipelines against vibration artifacts.
HARDWARE: RAY-BAN META GEN 2
Captures the "Agent Model." Solves the high-density VRU (vulnerable road user) problem by recording the eye-contact negotiation and intent signaling that LiDAR cannot see.
7 scenario folders, each containing synchronized paired sensor output. Raw sensor data — no stabilization, no grading. Each folder is relevant to multiple buyer use cases.
RT-Fusion delivers structured, time-synchronized assets. Every frame is mapped to IMU telemetry and operator head-pose, enabling direct ingestion into standard machine learning and robotics pipelines.
{
"timestamp_utc": "2026-02-11T09:14:22.045Z",
"frame_id": 4920,
"environment": {
"location": "NL_Amsterdam_Canal_District",
"weather": "overcast_diffuse",
"surface": "asphalt_bike_lane"
},
"telemetry": {
"imu_accel_x_y_z": [0.02, -0.81, 0.15],
"speed_mps": 5.8
},
"sensors": {
"world_cam_file": "GH010492.MP4",
"attention_cam_file": "RM010492.MP4",
"head_pose_proxy": true
}
}
All assets delivered as time-stamped MP4 + GPMF telemetry, directly ingestible via ROS 2 bag conversion or PyTorch DataLoader.
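As a minimal sketch of the PyTorch path, assuming one JSON sidecar per scene in the schema above; the directory layout and "scene.json" file name are illustrative:

# Sketch: a PyTorch Dataset that pairs world-view frames with the
# per-scene JSON metadata shown above. Field names follow the sample
# record. Frame-level IMU comes from the GPMF track and is omitted
# here for brevity.
import json
from pathlib import Path

import cv2
import torch
from torch.utils.data import DataLoader, Dataset

class RTFusionScene(Dataset):
    def __init__(self, scene_dir):
        self.scene_dir = Path(scene_dir)
        self.meta = json.loads((self.scene_dir / "scene.json").read_text())
        video = self.scene_dir / self.meta["sensors"]["world_cam_file"]
        self.cap = cv2.VideoCapture(str(video))
        self.n_frames = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT))

    def __len__(self):
        return self.n_frames

    def __getitem__(self, idx):
        # Random seek is simple but slow; decode sequentially in practice.
        self.cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = self.cap.read()
        if not ok:
            raise IndexError(idx)
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image = torch.from_numpy(frame).permute(2, 0, 1).float() / 255.0
        accel = torch.tensor(self.meta["telemetry"]["imu_accel_x_y_z"])
        return {"image": image, "imu_accel": accel}

loader = DataLoader(RTFusionScene("NL_Amsterdam_Canal_District/scene_0001"), batch_size=8)

For multi-worker loading, open the VideoCapture lazily inside each worker rather than in __init__, since decoder handles do not survive process forking.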
Optimized For Standard Engineering Pipelines
CREDENTIALS // METHODOLOGY
ARTY ZUEV
10+ years in professional media production (camera systems, color science, lighting, and post-production) across commercial, documentary, and marketing projects in the EU. When the industry shifted from language models to real-world perception, a critical gap emerged: companies building autonomous systems in Europe had no dedicated, on-demand source for the human perspective data that no vehicle, robot, or fixed camera can generate. RT-Fusion was built to close that gap, applying professional acquisition methodology to capture synchronized first-person video, gaze tracking, and IMU telemetry across autonomous driving, robotics, driver monitoring, and wearable AI use cases.
FROM BRIEF TO PIPELINE-READY DATASET
You specify target scenarios, locations, and environmental conditions. Campaign scoped per acquisition day.
Dual-sensor rig deploys to target location. GoPro 5.3K World-View + Ray-Ban Meta Gaze-View running in parallel. 4h+ continuous acquisition.
Time-stamped MP4 + GPMF telemetry, paired with JSON metadata per scene. All clips indexed by scenario category and sensor config.
Convert directly to ROS 2 bag via rosbag2, or load into a PyTorch DataLoader. GPMF telemetry parsed with gopro2gpx. Zero custom tooling required.
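A minimal sketch of the rosbag2 side, assuming IMU samples have already been extracted from the GPMF track as (timestamp_ns, ax, ay, az) tuples; the topic name is illustrative:

# Sketch: write extracted 200Hz IMU samples into a ROS 2 bag with
# rosbag2_py. Assumes samples were already parsed out of the GPMF
# track; the "/rt_fusion/imu" topic name is illustrative.
import rosbag2_py
from rclpy.serialization import serialize_message
from sensor_msgs.msg import Imu

def write_imu_bag(samples, bag_uri="rt_fusion_imu"):
    writer = rosbag2_py.SequentialWriter()
    writer.open(
        rosbag2_py.StorageOptions(uri=bag_uri, storage_id="sqlite3"),
        rosbag2_py.ConverterOptions("", ""),
    )
    writer.create_topic(rosbag2_py.TopicMetadata(
        name="/rt_fusion/imu",
        type="sensor_msgs/msg/Imu",
        serialization_format="cdr",
    ))
    for t_ns, ax, ay, az in samples:
        msg = Imu()
        msg.header.stamp.sec = t_ns // 1_000_000_000
        msg.header.stamp.nanosec = t_ns % 1_000_000_000
        msg.linear_acceleration.x = ax
        msg.linear_acceleration.y = ay
        msg.linear_acceleration.z = az
        writer.write("/rt_fusion/imu", serialize_message(msg), t_ns)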
/// DIRECT ENGINEERING FEED
Direct line to Engineering. No sales agents.
Prefer async? rt@rt-fusion.com
— or submit a full brief below:
ENCRYPTION: PGP-4096 // CONNECTION: SECURE