# Autonomous Typing

A system that autonomously types on a keyboard by detecting keys with computer vision and commanding robot movements to press them in sequence via ROS 2.

This project uses computer vision to enable a robotic arm to type predefined text on an external keyboard, with ROS 2 for communication:
- ArUco Marker Detection: Estimates the 3D pose of the robot end-effector using ArUco markers
- YOLO-based Key Detection: Identifies individual keyboard keys using a trained YOLOv8 model
- Distance Measurement: Calculates inter-key distances for precise movement planning
- Camera Calibration: Uses ZED stereo camera calibration data for accurate pose estimation
- ROS 2 Communication: Publishes movement vectors to control the robot via ROS 2 topics
- Motion Planning: Generates movement commands to navigate between keys and press them
## Project Structure

```
autonomous_typing/
├── package.xml                        # ROS 2 package configuration
├── setup.py                           # Python package setup
├── autonomous_typing/
│   ├── __init__.py
│   ├── movement_vector_publisher.py   # ROS 2 publisher for movement vectors
│   ├── movement_vector_subscriber.py  # ROS 2 subscriber for robot control
│   └── AutonomousTyping/
│       ├── detection.py               # Main detection pipeline and pose estimation
│       ├── distance.py                # Key detection and distance measurement
│       ├── configure.py               # Camera calibration data loader
│       ├── camera.py                  # Camera capture utility
│       ├── requirements.txt           # Python dependencies
│       ├── best.pt                    # Trained YOLOv8 model weights for key detection
│       ├── SN30980871.conf            # ZED camera calibration file
│       ├── testfile.py                # Test script
│       └── README.md                  # This file
```
## Modules

### `detection.py`

Main module that:
- Loads ArUco marker dictionary and detects markers in images
- Estimates 3D pose (rotation and translation) of each detected marker
- Computes centroid of multiple markers for stable reference point
- Checks alignment tolerance before key pressing
- Executes movement sequence to type the input string
Function Summary:
- `estimatePoseSingleMarkers()`: Solves the PnP problem for marker pose estimation
- `centroid()`: Computes the mean position of detected markers
- `aligned()`: Verifies end-effector alignment with the target within tolerance
- `move()`: Commands robot motion (stub for integration)
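The centroid and alignment checks are simple geometry. A minimal sketch in pure Python — the function names mirror the summary above, but the bodies (and the 0.01 m tolerance, taken from the parameter table) are assumptions, not the exact implementation:

```python
import math

def centroid(tvecs):
    """Mean position of the detected markers' translation vectors (x, y, z)."""
    n = len(tvecs)
    return tuple(sum(v[i] for v in tvecs) / n for i in range(3))

def aligned(x, y, z, threshold=0.01):
    """True when the lateral (x, y) offset to the target is within tolerance."""
    return math.hypot(x, y) < threshold
```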
### `distance.py`

Computer vision analysis module that:
- Uses YOLO model to detect keyboard keys in images
- Estimates positions of non-detected keys (y, z) through interpolation
- Measures Euclidean and component distances between consecutive keys
- Converts pixel distances to real-world measurements using reference key calibration
Function Summary:
- `DistanceData()`: Main function returning the list of inter-key movements
- `get_center()`: Computes bounding box center
- `draw_box()`: Visualizes detections on the image
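For reference, `get_center()` and the Euclidean inter-key distance reduce to a few lines. A sketch assuming YOLO boxes in `[x1, y1, x2, y2]` pixel format (the helper `key_distance_pixels` is illustrative, not a function from `distance.py`):

```python
import math

def get_center(box):
    """Center (u, v) of an [x1, y1, x2, y2] bounding box, in pixels."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0

def key_distance_pixels(box_a, box_b):
    """Euclidean pixel distance between the centers of two key boxes."""
    (ua, va), (ub, vb) = get_center(box_a), get_center(box_b)
    return math.hypot(ub - ua, vb - va)
```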
### `configure.py`

Camera configuration module that:
- Parses the ZED camera calibration file (`.conf` format)
- Constructs the 3×3 intrinsic camera matrix
- Extracts lens distortion coefficients in OpenCV format
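Loading the intrinsics might look like the following sketch. The section and key names (`LEFT_CAM_HD`, `fx`/`fy`/`cx`/`cy`, `k1`…`k3`) follow the usual ZED `.conf` layout but should be checked against your file:

```python
import configparser

def load_calibration(conf_path, section="LEFT_CAM_HD"):
    """Parse a ZED .conf file into a 3x3 intrinsic matrix and
    OpenCV-ordered distortion coefficients [k1, k2, p1, p2, k3]."""
    cfg = configparser.ConfigParser()
    cfg.read(conf_path)
    cam = cfg[section]
    fx, fy = float(cam["fx"]), float(cam["fy"])
    cx, cy = float(cam["cx"]), float(cam["cy"])
    camera_matrix = [[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]]
    # Missing distortion keys default to 0
    dist_coeffs = [float(cam.get(k, "0")) for k in ("k1", "k2", "p1", "p2", "k3")]
    return camera_matrix, dist_coeffs
```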
### `movement_vector_publisher.py`

ROS 2 publisher node that:
- Publishes 3D movement vectors to the 'movement_vector' topic
- Integrates with the detection pipeline to send commands for robot motion
- Uses geometry_msgs/Vector3 messages for X, Y, Z displacements
### `movement_vector_subscriber.py`

ROS 2 subscriber node that:
- Subscribes to the 'movement_vector' topic for incoming movement commands
- Subscribes to the 'Completion' topic for typing sequence status
- Provides a callback framework for implementing hardware control logic (e.g., PID controllers for robotic arms)
## Dependencies

ROS 2:
- ROS 2 Humble or a compatible distribution
- rclpy: ROS 2 Python client library
- geometry_msgs: Standard ROS 2 geometry messages
- std_msgs: Standard ROS 2 messages
Install Python dependencies with:

```bash
pip install -r requirements.txt
```

Dependencies:
- opencv-python: Computer vision operations (ArUco detection, solvePnP)
- ultralytics: YOLOv8 model for key detection
- numpy: Numerical computations
- configparser: Config file parsing
- pythonbible (optional): Launch confirmation message
## Setup

Ensure ROS 2 is installed and sourced:

```bash
source /opt/ros/humble/setup.bash
```

Build the ROS 2 package:

```bash
cd /path/to/ros2_ws
colcon build --packages-select autonomous_typing
source install/setup.bash
```

## Camera Calibration

The project uses a ZED camera (serial: SN30980871). To use with a different camera:
- Download your ZED calibration file from: https://www.stereolabs.com/developers/calib?SN=YOUR_SERIAL_NUMBER
- Replace `SN30980871.conf` with your calibration file
- Update the serial number reference in the documentation
## YOLO Model

The project requires a trained YOLOv8 model for keyboard key detection:
- Place `best.pt` in the project directory (or update `model_path` in `distance.py`)
- The model should be trained to detect individual keyboard keys with confidence > 0.5
## Usage

Run the movement vector publisher:

```bash
ros2 run autonomous_typing movement_vector_publisher
```

Run the movement vector subscriber:

```bash
ros2 run autonomous_typing movement_vector_subscriber
```

Run the detection pipeline:

```bash
python detection.py
```

This prompts for the string to type and executes the typing sequence on the keyboard by publishing movement vectors via ROS 2.
## Robot Integration

Replace the stub `move()` function in `detection.py` with actual robot control commands that publish to ROS 2 topics:

```python
def move(x, y, z):
    """
    Command robot end-effector motion via ROS 2.

    Args:
        x: Displacement in meters (horizontal)
        y: Displacement in meters (vertical)
        z: Displacement in meters (depth/press)
    """
    # Publish movement vector to ROS 2 topic
    msg = Vector3()
    msg.x = x
    msg.y = y
    msg.z = z
    publisher.publish(msg)
```

Uncomment the alignment loop in `detection.py` to enable continuous re-detection and correction during approach:
```python
while not aligned(x, y, z):
    print(f"Centroid of detected markers: x={x:.4f} m, y={y:.4f} m, z={z:.4f} m")
    move(x, y, z)  # Move closer
    corners, ids, rejected = detector.detectMarkers(image)
    # Re-estimate pose and check alignment...
```

## Key Parameters

| Parameter | Value | Purpose |
|---|---|---|
| `marker_size` | 0.02 m | ArUco marker physical dimension |
| `real_key_width_mm` | 15.0 mm | Standard keyboard key width (reference) |
| `alignment_threshold` | 0.01 m | Lateral alignment tolerance |
| `press_depth` | 0.035 m | Key press distance |
| `model_confidence` | 0.5 | YOLO detection confidence threshold |
## Workflow

1. **Initialization**
   - Load camera calibration parameters
   - Load the pre-trained YOLO model
   - Read the input string and keyboard image
   - Initialize the ROS 2 publisher for movement vectors
2. **Marker Detection**
   - Detect ArUco markers in the image
   - Estimate the 3D pose of each marker
   - Compute the centroid of all marker positions
3. **Key Detection & Measurement**
   - Run YOLO inference on the keyboard image
   - Identify positions of all target keys
   - Interpolate missing key positions (y, z)
   - Measure inter-key distances in millimeters
4. **Motion Execution**
   - For each character in the input string:
     - Publish a movement vector to the ROS 2 topic for horizontal/vertical positioning
     - Publish a press command (negative Z)
     - Publish a release command (positive Z)
5. **ROS 2 Communication**
   - The subscriber node receives movement vectors
   - Translates vectors into hardware commands (PID controllers, motor drivers)
## Coordinate Conventions

- X-axis: Horizontal (left-right), positive = right
- Y-axis: Vertical (up-down), positive = down
- Z-axis: Depth (toward/away from camera), positive = away from camera
  - Negative Z = press down
  - Positive Z = lift up
## ROS 2 Topics

- `movement_vector` (`geometry_msgs/Vector3`): 3D displacement commands (x, y, z in meters)
- `Completion` (`std_msgs/String`): Signals completion of the typing sequence or individual key presses
## Pose Estimation

Uses OpenCV's `solvePnP()` to solve the perspective-n-point problem:
- 3D object points: ArUco marker corners in marker-local frame (z=0 plane)
- 2D image points: Detected marker corners in image
- Camera matrix + distortion: From ZED calibration
- Output: Rotation vector (3×1) and translation vector (3×1)
## Distance Scaling

Keyboard distances are scaled using the "1" key as reference:

```
scale_factor = real_key_width_mm / detected_1_key_width_pixels
distance_mm  = distance_pixels × scale_factor
```
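As a worked check of the formulas above (the 50 px detected key width is a made-up example value):

```python
REAL_KEY_WIDTH_MM = 15.0  # physical width of the reference "1" key

def pixels_to_mm(distance_pixels, detected_1_key_width_pixels):
    """Scale a pixel distance to millimeters via the reference key width."""
    scale_factor = REAL_KEY_WIDTH_MM / detected_1_key_width_pixels
    return distance_pixels * scale_factor

print(pixels_to_mm(100.0, 50.0))  # 100 px at 0.3 mm/px -> 30.0
```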
## Troubleshooting

**Issue: ArUco markers not detected**
- Ensure good lighting on the markers
- Check that the marker size parameter matches the actual marker
- Verify the camera calibration is correct

**Issue: Missing keyboard keys in YOLO detection**
- If confidence < 0.5, lower the threshold or retrain the model
- Check image resolution and keyboard visibility
- Verify the model was trained on similar keyboard layouts

**Issue: Inaccurate distances**
- Recalibrate the camera using ZED tools
- Verify the reference key ("1") is detected
- Check that `real_key_width_mm` matches your keyboard
## Authors

Joshua Dayal, Mariam Husain, Clara Fang