ITAP 2025 Hand-Tracking Demo — Technical Documentation

End-to-end architecture of the real-time hand-tracking demonstration presented to the Minister of State for Trade and Industry at ITAP 2025.

1. System Architecture

┌──────────────────┐    Serial/USB    ┌──────────────────┐   Modbus RTU   ┌──────────────────┐
│   STM32N6570-DK  │ ──────────────► │  Python Bridge   │ ────────────► │   OMRON TM Cobot │
│  (Edge AI Board) │                 │  (bridge.py)     │               │  + DH-5-6 Hand   │
│                  │                 │                  │               │                  │
│ • NPU-accelerated│                 │ • Serial parser  │               │ • 6-DoF dexterous│
│   palm detection │                 │ • Landmark→angle │               │   robotic hand   │
│ • Hand landmark  │                 │   conversion     │               │ • Live gesture   │
│   inference      │                 │ • Modbus RTU     │               │   mirroring      │
│ • Camera capture │                 │   driver         │               │                  │
└──────────────────┘                 └──────────────────┘               └──────────────────┘

2. STM32N6570-DK — NPU Edge Inference

The STM32N6570-DK board features a dedicated Neural Processing Unit (NPU) that accelerates two cascaded models:

Two-Stage Detection Pipeline

  1. Palm Detection: Lightweight SSD-style model that localizes hands in the camera frame, outputting bounding boxes
  2. Hand Landmark Detection: Regresses 21 keypoint landmarks from the cropped hand region

Both models run on the NPU at real-time frame rates. The firmware streams detected landmark coordinates over serial (USB) to the host Python bridge.

// app.c — Landmark streaming over serial
// After NPU inference completes:
void stream_landmarks(HandLandmarks* landmarks) {
    // Format: "LM|x0,y0,z0|x1,y1,z1|...|x20,y20,z20\n"
    printf("LM");
    for (int i = 0; i < 21; i++) {
        printf("|%.3f,%.3f,%.3f", 
               landmarks->points[i].x,
               landmarks->points[i].y,
               landmarks->points[i].z);
    }
    printf("\n");
}

3. Python Bridge

The bridge script (bridge.py) reads landmark data from serial, converts 3D joint positions to finger angles, and sends motor commands via Modbus RTU:

import serial
from ComDriver import DHHand

class HandTrackingBridge:
    def __init__(self, stm32_port, hand_port):
        self.serial_conn = serial.Serial(stm32_port, 115200)
        self.hand = DHHand(hand_port)
    
    def parse_landmarks(self, line):
        """Parse 21 landmarks from serial line."""
        parts = line.strip().split('|')[1:]  # Skip 'LM' prefix
        landmarks = []
        for part in parts:
            x, y, z = map(float, part.split(','))
            landmarks.append((x, y, z))
        return landmarks
    
    def landmarks_to_angles(self, landmarks):
        """Convert 3D landmarks to finger joint angles."""
        angles = {}
        # Thumb: landmarks 1-4
        angles['thumb'] = self._compute_finger_angle(
            landmarks[1], landmarks[2], 
            landmarks[3], landmarks[4]
        )
        # Index: landmarks 5-8
        angles['index'] = self._compute_finger_angle(
            landmarks[5], landmarks[6], 
            landmarks[7], landmarks[8]
        )
        # Middle, Ring, Pinky follow same pattern...
        return angles
    
    def run(self):
        while True:
            line = self.serial_conn.readline().decode()
            if line.startswith('LM'):
                landmarks = self.parse_landmarks(line)
                angles = self.landmarks_to_angles(landmarks)
                self.hand.set_finger_angles(angles)

4. Modbus RTU Communication

The ComDriver.py module handles Modbus RTU communication with the DH-Robotics DH-5-6 dexterous hand, which is mounted on the OMRON TM cobot:

from pymodbus.client import ModbusSerialClient

class DHHand:
    """Driver for DH-Robotics DH-5-6 6-axis dexterous hand."""
    
    def __init__(self, port, baudrate=115200):
        self.client = ModbusSerialClient(
            port=port,
            baudrate=baudrate,
            parity='N',
            stopbits=1,
            bytesize=8,
            timeout=0.1
        )
        self.client.connect()
        self.slave_id = 1
    
    def set_finger_position(self, finger_id, position):
        """Set individual finger position (0-1000)."""
        register = 0x0100 + finger_id * 2
        self.client.write_register(
            register, position, 
            slave=self.slave_id
        )
    
    def set_finger_angles(self, angles):
        """Set all finger angles from landmark data."""
        finger_map = {
            'thumb': 0, 'index': 1, 'middle': 2,
            'ring': 3, 'pinky': 4, 'wrist': 5
        }
        for name, angle in angles.items():
            if name in finger_map:
                pos = int(angle / 180.0 * 1000)
                pos = max(0, min(1000, pos))
                self.set_finger_position(
                    finger_map[name], pos
                )

5. OMRON TM Cobot Integration

The DH-5-6 hand is mounted on the end-effector of an OMRON TM collaborative robot. The cobot holds the hand in a fixed position while the Python bridge controls individual finger movements in real-time, creating a live gesture mirroring demonstration.

A tmflow.py script interfaces with the TM cobot to set the arm pose via the TMflow Listen Node protocol.

6. Source Repositories