EU-NATIVE ACQUISITION // BUILT FOR AI ACT ART. 10

HUMAN PERSPECTIVE DATA
NO MACHINE SENSOR CAN CAPTURE.

Synchronized first-person video, gaze tracking, and IMU telemetry // perception training · robotics · driver monitoring · wearable AI

RT-Fusion delivers the human perspective data that no vehicle, robot, or fixed camera can generate — synchronized first-person video, gaze direction, and 200Hz IMU telemetry. For autonomous driving, robotics, driver monitoring, and wearable AI. EU-native acquisition across 6 countries.

Operational Capacity: 4h+ continuous World-View (GoPro 5.3K) combined with Gaze-View (Ray-Ban Meta) — audio-synced, running in parallel.

BUILT FOR:

SCENE: HEAVY RAIN / OCCLUSION

Resolution: 5.3K
Telemetry: 200Hz IMU
Scenario: Real-World Weather

IMU Frequency: 200 Hz TRACKING · Color Depth: 10-BIT GP-LOG · Telemetry: GPMF LOCK · Pipeline: NATIVE INGESTION · Scenarios: 7 ODDs ACTIVE · Sensor Pairs: DUAL-STREAM SYNC · EU Coverage: 6 COUNTRIES ACTIVE · Deployment: 48H NOTICE ON-DEMAND

/// THE MISSING LAYER IN YOUR STACK

Whether you build for autonomous vehicles, robotics, driver monitoring, or wearable AI — your model needs human perspective data that no vehicle, robot, or fixed camera can generate. RT-Fusion captures this signal at human eye level, across real European environments, on-demand.

Intent Prediction

Bounding boxes don't show intent. A pedestrian's gaze direction 400ms before a crossing decision is invisible to vehicle and robot sensors. RT-Fusion captures this signal at human eye level.

Human Demonstrations

A robot can record its own telemetry, but it can't record what a human body does while performing the same task. First-person video with gaze and IMU from real human activity — the training data imitation learning needs.

Driver Attention

IR cabin cameras approximate driver gaze. RT-Fusion captures visible-light gaze ground truth on real European roads — where the driver actually looked when the traffic event occurred.

Glasses-Native Data

GoPro-on-helmet footage has a different perspective, motion profile, and social context than smart glasses footage. RT-Fusion captures on the exact Ray-Ban Meta device that wearable AI products ship on.

BUYER SECTION 01

/// AUTONOMOUS DRIVING & OUTDOOR ROBOTICS

VRU Intent Ground Truth for Perception Stacks

A pedestrian wears the rig in real traffic. The Ray-Ban Meta captures where they look before stepping off a kerb — gaze direction, head-pose, social signaling. The GoPro captures synchronized world-view context with 200Hz IMU and GPS. Together, this produces the intent signal that no vehicle-mounted or robot-mounted sensor can generate: what the human was attending to before they acted. Delivered as time-stamped MP4 + GPMF telemetry, ingestible via PyTorch DataLoader, ROS 2 bag, or OpenCV.
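To make "ingestible via PyTorch DataLoader" concrete, here is a minimal loader sketch. The scene-folder layout, file names, and the assumption that metadata.json holds a JSON array of per-frame records (modeled on the metadata sample further down this page) are illustrative, not a fixed delivery contract.

ingest_sketch.py (illustrative)

import json
from pathlib import Path

import cv2                      # pip install opencv-python
import torch
from torch.utils.data import DataLoader, Dataset


class PairedEgocentricClip(Dataset):
    """Yields (world_frame, gaze_frame, imu_accel) triples for one scene folder."""

    def __init__(self, scene_dir: str):
        scene_dir = Path(scene_dir)
        # Assumed: metadata.json is a JSON array of per-frame records.
        self.meta = json.loads((scene_dir / "metadata.json").read_text())
        sensors = self.meta[0]["sensors"]
        self.world = cv2.VideoCapture(str(scene_dir / sensors["world_cam_file"]))
        self.gaze = cv2.VideoCapture(str(scene_dir / sensors["attention_cam_file"]))

    def __len__(self):
        return len(self.meta)

    def _frame(self, cap: cv2.VideoCapture, idx: int) -> torch.Tensor:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)      # seek to frame index
        ok, bgr = cap.read()
        assert ok, f"frame {idx} unreadable"
        # HWC uint8 BGR -> CHW float tensor in [0, 1]
        return torch.from_numpy(bgr).permute(2, 0, 1).float() / 255.0

    def __getitem__(self, idx: int):
        rec = self.meta[idx]
        imu = torch.tensor(rec["telemetry"]["imu_accel_x_y_z"])
        return self._frame(self.world, idx), self._frame(self.gaze, idx), imu


loader = DataLoader(PairedEgocentricClip("scenes/02_VRU_CYCLIST_URBAN"), batch_size=8)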

Sample folders:
Keywords: VRU intent prediction · pedestrian intent ground truth · autonomous driving training data Europe · perception model training data · pedestrian gaze dataset · cycling dataset Europe
Why this data cannot be replicated by vehicle or robot sensors:
  • A chest-mounted LiDAR cannot see where the pedestrian's eyes are pointing
  • A roadside camera cannot capture head-pose relative to traffic flow
  • Simulation cannot render the micro-social signals of real street negotiation
  • RT-Fusion captures from inside the VRU — the only sensor position that matters for intent
BUYER SECTION 02

/// ROBOTICS & EMBODIED AI

Egocentric Human Activity Data for Imitation Learning

A person wears the rig while performing physical tasks — climbing stairs, opening doors, navigating terrain, carrying objects. The Ray-Ban Meta captures first-person POV with natural head movement and gaze direction. The GoPro captures synchronized wider context with 200Hz IMU telemetry. This produces exactly what imitation learning, behavior cloning, and VLA architectures need: large-scale egocentric human demonstration video showing not just what the person did, but where they looked while doing it. No robot fleet can generate this data — it requires a human performing the task.

Sample folders:
Keywords: egocentric video data · egocentric data collection · imitation learning training data · behavior cloning data · humanoid robot training data · embodied AI data · first-person human activity video · robot foundation model data
BUYER SECTION 03

/// DRIVER MONITORING (DMS)

Real-World Driver Gaze Ground Truth on European Roads

The driver wears Ray-Ban Meta glasses while driving real routes. The glasses capture exactly where the driver looks — road ahead, mirrors, phone, passengers, blind spot checks, gaze shifts at intersections and roundabouts. The GoPro on the dashboard captures the road scene simultaneously. Together, this produces synchronized driver gaze ground truth paired with traffic context, on real European roads, at a fraction of instrumented vehicle rig cost. No IR cabin camera provides this level of visible-light gaze fidelity with scene context.

Sample folders:
Keywords: driver monitoring system training data · DMS training data · driver gaze dataset · driver attention ground truth · drowsiness detection training data · driver distraction dataset · Euro NCAP DMS data · real-world driver gaze · driver monitoring dataset Europe
BUYER SECTION 04

/// WEARABLE AI & SMART GLASSES

First-Person Video From the Glasses You Ship On

A person wears the Ray-Ban Meta doing everyday activities — navigating city streets, shopping, commuting, socializing. The glasses capture first-person POV from the exact consumer device that Meta and its ecosystem partners are building for. This is not GoPro-on-helmet footage — the perspective, motion patterns, social context, and field of view match the actual user experience of smart glasses. The synchronized GoPro adds wider context with 200Hz IMU and GPS telemetry for spatial grounding.

Sample folders:
Keywords: smart glasses training data · wearable AI training data · egocentric scene understanding · activity recognition dataset · first-person video dataset · Ray-Ban Meta video data · AR glasses AI training data · egocentric AI data

/// THE SENSOR RIG

HARDWARE_ID: EU_V2.5

Chest-mounted GoPro (world-view) + head-worn Ray-Ban Meta (gaze-view), running in parallel with audio-synced timestamps.
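How the audio sync works in principle: both devices record the same ambient sound, so their relative start offset can be estimated by cross-correlating the two audio tracks. The sketch below is illustrative only (numpy/scipy on pre-extracted WAV audio), not the production pipeline.

sync_offset_sketch.py (illustrative)

import numpy as np
from scipy.io import wavfile          # audio pre-extracted from MP4, e.g. via ffmpeg
from scipy.signal import correlate


def audio_offset_seconds(wav_a: str, wav_b: str) -> float:
    """Positive result: track B started recording later than track A."""
    rate_a, a = wavfile.read(wav_a)
    rate_b, b = wavfile.read(wav_b)
    assert rate_a == rate_b, "resample to a common rate first"
    # Collapse stereo to mono and cast to float before correlating
    a = a.astype(np.float32).mean(axis=-1) if a.ndim > 1 else a.astype(np.float32)
    b = b.astype(np.float32).mean(axis=-1) if b.ndim > 1 else b.astype(np.float32)
    xc = correlate(a, b, mode="full")
    lag = int(np.argmax(xc)) - (len(b) - 1)   # peak position -> lag in samples
    return lag / rate_a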

[ WORLD SENSOR ]

KINEMATICS & CONTEXT

HARDWARE: GOPRO HERO 13 (CUSTOM ACQUISITION RIG)

  • RESOLUTION 4K (CONFIG A) / 5.3K (CONFIG B)
  • COLOR SCIENCE GP-LOG (10-BIT)
  • TELEMETRY GPS + ACCEL + GYRO
  • POWER WEATHERPROOF CONTINUOUS MAG-LINK
  • MOUNTING VIBRATION-ISOLATED

Captures the "World Model." High dynamic range handles the "Tunnel Exit" blinding light problem. Rolling shutter stress-tests VIO pipelines against vibration artifacts.

Serves: All buyers (AD, Robotics, DMS, Wearable AI)
[ GAZE SENSOR ]

INTENT, ATTENTION & POV

HARDWARE: RAY-BAN META GEN 2

  • DATA STREAM 3K (CONFIG C1) / 1080p (CONFIG C2)
  • AUDIO BINAURAL SPATIAL ARRAY
  • POV ALIGNMENT TRUE HUMAN EYE-LEVEL
  • SOCIAL CUES MICRO-GESTURE CAPTURE
  • TARGETING CENTER-FRAME GAZE LOCK

Captures the "Agent Model." Solves the High-Density VRU problem by recording the eye-contact negotiation and intent signaling that LiDAR cannot see.

Serves: AD (VRU intent), Robotics (gaze during demonstrations),
DMS (driver gaze ground truth), Wearable AI (form-factor match)

/// SAMPLE DATA PORTFOLIO

7 scenario folders, each containing synchronized paired sensor output. Raw sensor data — no stabilization, no grading. Each folder is relevant to multiple buyer use cases.

PREVIEW: URBAN CYCLING — DENSE VRU INTERACTIONS
FOLDER: 02 — VRU_CYCLIST_URBAN · GPS: 52.3676° N, 4.9041° E · REC ISO: 200

PREVIEW: STAIRS & BUILDING ENTRY — VERTICAL NAVIGATION
FOLDER: 06 — ROBOTICS_STAIRS_ENTRY · IMU: 200Hz ACTIVE · REC SHUTTER: 1/1250

/// ENGINEERED FOR INTEGRATION

RT-Fusion delivers structured, time-synchronized assets. Every frame is mapped to IMU telemetry and operator head-pose, enabling direct ingestion into standard machine learning and robotics pipelines.

metadata_sample.json

{
  "timestamp_utc": "2026-02-11T09:14:22.045Z",
  "frame_id": 4920,
  "environment": {
      "location": "NL_Amsterdam_Canal_District",
      "weather": "overcast_diffuse",
      "surface": "asphalt_bike_lane"
  },
  "telemetry": {
      "imu_accel_x_y_z": [0.02, -0.81, 0.15],
      "speed_mps": 5.8
  },
  "sensors": {
      "world_cam_file": "GH010492.MP4",
      "attention_cam_file": "RM010492.MP4",
      "head_pose_proxy": true
  }
}

All assets delivered as time-stamped MP4 + GPMF telemetry, directly ingestible via ROS 2 bag conversion or PyTorch DataLoader.
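As one illustration of what the per-frame metadata enables, the records can be flattened into a queryable index for scenario filtering. The sketch assumes metadata.json is a JSON array of records shaped like the sample above.

metadata_index_sketch.py (illustrative)

import json

import pandas as pd

records = json.load(open("metadata.json"))
df = pd.json_normalize(records, sep=".")      # nested keys become dotted columns

# e.g. select every frame captured on a bike lane under overcast light
mask = (df["environment.surface"] == "asphalt_bike_lane") & \
       (df["environment.weather"] == "overcast_diffuse")
print(df.loc[mask, ["frame_id", "timestamp_utc", "sensors.world_cam_file"]])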

Optimized For Standard Engineering Pipelines

ROS 2
ISAAC
PyTorch
OpenCV

/// THE OPERATOR

CREDENTIALS // METHODOLOGY

ARTY ZUEV

10+ years in professional media production — camera systems, color science, lighting, and post-production — across commercial, documentary, and marketing projects in the EU. When the industry shifted from language models to real-world perception, RT-Fusion identified a critical gap: companies building autonomous systems in Europe had no dedicated, on-demand source for the human perspective data that no vehicle, robot, or fixed camera can generate. RT-Fusion was built to close that gap — applying professional acquisition methodology to capture synchronized first-person video, gaze tracking, and IMU telemetry across autonomous driving, robotics, driver monitoring, and wearable AI use cases.

Based In: Venlo / Amsterdam, NL
Coverage: NL (Active) · DE · BE · FR · AT · CH (Deployable)
Acquisition Capacity: 4h+ continuous dual-sensor // Rapid deployment, 48h notice
RT-Fusion operator Arty Zuev with dual-sensor capture rig — GoPro Hero 13 and Ray-Ban Meta Gen 2 — on location in the Netherlands

/// A COMPLETE DELIVERY

FROM BRIEF TO PIPELINE-READY DATASET

STEP 01

Deployment Brief

You specify target scenarios, locations, and environmental conditions. Campaign scoped per acquisition day.

STEP 02

On-Location Capture

Dual-sensor rig deploys to target location. GoPro 5.3K World-View + Ray-Ban Meta Gaze-View running in parallel. 4h+ continuous acquisition.

STEP 03

Structured Delivery

Time-stamped MP4 + GPMF telemetry, paired with JSON metadata per scene. All clips indexed by scenario category and sensor config.

STEP 04

Pipeline Ingestion

Convert directly to ROS 2 bag via rosbag2, or load into a PyTorch DataLoader. GPS tracks extracted from GPMF telemetry with gopro2gpx. Zero custom tooling required.
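As a sketch of the ROS 2 path (rosbag2_py API as of ROS 2 Humble; one possible ingestion route, not RT-Fusion tooling), the per-frame IMU records can be written into a bag like this:

bag_conversion_sketch.py (illustrative)

import json
from datetime import datetime

import rosbag2_py
from rclpy.serialization import serialize_message
from sensor_msgs.msg import Imu

writer = rosbag2_py.SequentialWriter()
writer.open(
    rosbag2_py.StorageOptions(uri="rt_fusion_imu_bag", storage_id="sqlite3"),
    rosbag2_py.ConverterOptions("cdr", "cdr"),
)
writer.create_topic(rosbag2_py.TopicMetadata(
    name="/rt_fusion/imu", type="sensor_msgs/msg/Imu", serialization_format="cdr"))

for rec in json.load(open("metadata.json")):   # assumed: array of per-frame records
    msg = Imu()
    ax, ay, az = rec["telemetry"]["imu_accel_x_y_z"]
    msg.linear_acceleration.x = ax
    msg.linear_acceleration.y = ay
    msg.linear_acceleration.z = az
    t = datetime.fromisoformat(rec["timestamp_utc"].replace("Z", "+00:00"))
    writer.write("/rt_fusion/imu", serialize_message(msg), int(t.timestamp() * 1e9))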

/// DIRECT ENGINEERING FEED

ESTABLISH UPLINK

Direct line to Engineering. No sales agents.

Prefer async? rt@rt-fusion.com

// Pricing: €1,500–€2,500 per acquisition day. Multi-day campaigns and recurring field deployments quoted per project. All deliverables include time-stamped MP4 + GPMF telemetry + JSON metadata.

— or submit a full brief below:

ENCRYPTION: PGP-4096 // CONNECTION: SECURE