Johns-Hopkins-URC-Mars-Rover/AutonomousTyping
Autonomous Typing System

A system that autonomously types on a keyboard by detecting keys with computer vision and commanding robot movements to press them in sequence via ROS 2.

Overview

This project uses computer vision to enable a robotic arm to type predefined text on an external keyboard, with ROS 2 for communication:

  • ArUco Marker Detection: Estimates the 3D pose of the robot end-effector using ArUco markers
  • YOLO-based Key Detection: Identifies individual keyboard keys using a trained YOLOv8 model
  • Distance Measurement: Calculates inter-key distances for precise movement planning
  • Camera Calibration: Uses ZED stereo camera calibration data for accurate pose estimation
  • ROS 2 Communication: Publishes movement vectors to control the robot via ROS 2 topics
  • Motion Planning: Generates movement commands to navigate between keys and press them

Project Structure

autonomous_typing/
├── package.xml                    # ROS 2 package configuration
├── setup.py                       # Python package setup
├── autonomous_typing/
│   ├── __init__.py
│   ├── movement_vector_publisher.py   # ROS 2 publisher for movement vectors
│   ├── movement_vector_subscriber.py  # ROS 2 subscriber for robot control
│   └── AutonomousTyping/
│       ├── detection.py            # Main detection pipeline and pose estimation
│       ├── distance.py             # Key detection and distance measurement
│       ├── configure.py            # Camera calibration data loader
│       ├── camera.py               # Camera capture utility
│       ├── requirements.txt        # Python dependencies
│       ├── best.pt                 # Trained YOLOv8 model weights for key detection
│       ├── SN30980871.conf         # ZED camera calibration file
│       ├── testfile.py             # Test script
│       └── README.md               # This file

Components

detection.py

Main module that:

  • Loads ArUco marker dictionary and detects markers in images
  • Estimates 3D pose (rotation and translation) of each detected marker
  • Computes the centroid of multiple markers for a stable reference point
  • Checks alignment tolerance before key pressing
  • Executes movement sequence to type the input string

Function Summary:

  • estimatePoseSingleMarkers(): Solves PnP problem for marker pose estimation
  • centroid(): Computes mean position of detected markers
  • aligned(): Verifies end-effector alignment with target within tolerance
  • move(): Commands robot motion (stub for integration)
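The centroid and alignment helpers can be sketched as below. This is a minimal illustration, not the repository's exact implementation; the default threshold mirrors the 0.01 m alignment tolerance listed under Key Parameters, and checking only x/y (leaving depth to the press motion) is an assumption.

```python
import numpy as np

def centroid(translations):
    """Mean 3D position of the detected markers' translation vectors."""
    return np.mean(np.asarray(translations), axis=0)

def aligned(x, y, z, threshold=0.01):
    """True when the lateral (x, y) offset to the target is within tolerance.

    Assumed convention: z (depth) is handled by the press motion,
    so only the lateral components are checked against the threshold.
    """
    return abs(x) < threshold and abs(y) < threshold
```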

distance.py

Computer vision analysis module that:

  • Uses YOLO model to detect keyboard keys in images
  • Estimates positions of non-detected keys (y, z) through interpolation
  • Measures Euclidean and component distances between consecutive keys
  • Converts pixel distances to real-world measurements using reference key calibration

Function Summary:

  • DistanceData(): Main function returning list of inter-key movements
  • get_center(): Computes bounding box center
  • draw_box(): Visualizes detections on image
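The box-center computation and the Euclidean inter-key distance can be sketched as follows; `key_distance` is an illustrative helper name, not necessarily a function in the repository.

```python
import math

def get_center(box):
    """Center (cx, cy) of an (x1, y1, x2, y2) bounding box in pixels."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def key_distance(box_a, box_b):
    """Euclidean pixel distance between the centers of two key boxes."""
    (ax, ay), (bx, by) = get_center(box_a), get_center(box_b)
    return math.hypot(bx - ax, by - ay)
```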

configure.py

Camera configuration module that:

  • Parses ZED camera calibration file (.conf format)
  • Constructs 3×3 intrinsic camera matrix
  • Extracts lens distortion coefficients in OpenCV format
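A minimal sketch of such a loader, assuming the standard ZED .conf layout (per-resolution sections such as LEFT_CAM_HD containing fx, fy, cx, cy and the distortion terms k1, k2, p1, p2, k3); the function name and section default are illustrative:

```python
import configparser
import numpy as np

def load_calibration(conf_path, section="LEFT_CAM_HD"):
    """Build OpenCV-style intrinsics from a ZED .conf calibration file."""
    cfg = configparser.ConfigParser()
    cfg.read(conf_path)
    cam = cfg[section]
    # 3x3 intrinsic camera matrix from focal lengths and principal point
    camera_matrix = np.array([
        [float(cam["fx"]), 0.0,              float(cam["cx"])],
        [0.0,              float(cam["fy"]), float(cam["cy"])],
        [0.0,              0.0,              1.0],
    ])
    # Distortion coefficients in OpenCV's (k1, k2, p1, p2, k3) order;
    # missing entries default to zero
    dist_coeffs = np.array(
        [float(cam.get(k, 0.0)) for k in ("k1", "k2", "p1", "p2", "k3")]
    )
    return camera_matrix, dist_coeffs
```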

ROS 2 Nodes

movement_vector_publisher.py

ROS 2 publisher node that:

  • Publishes 3D movement vectors to the 'movement_vector' topic
  • Integrates with the detection pipeline to send commands for robot motion
  • Uses geometry_msgs/Vector3 messages for X, Y, Z displacements

movement_vector_subscriber.py

ROS 2 subscriber node that:

  • Subscribes to the 'movement_vector' topic for incoming movement commands
  • Subscribes to the 'Completion' topic for typing sequence status
  • Provides a callback framework for implementing hardware control logic (e.g., PID controllers for robotic arms)

Requirements

ROS 2 Dependencies

  • ROS 2 Humble or compatible distribution
  • rclpy: ROS 2 Python client library
  • geometry_msgs: Standard ROS 2 geometry messages
  • std_msgs: Standard ROS 2 messages

Python Dependencies

Install dependencies with:

pip install -r requirements.txt

Dependencies:

  • opencv-python: Computer vision operations (ArUco detection, solvePnP)
  • ultralytics: YOLOv8 model for key detection
  • numpy: Numerical computations
  • configparser: Config file parsing
  • pythonbible (optional): Launch confirmation message

Setup

1. ROS 2 Environment

Ensure ROS 2 is installed and sourced:

source /opt/ros/humble/setup.bash

Build the ROS 2 package:

cd /path/to/ros2_ws
colcon build --packages-select autonomous_typing
source install/setup.bash

2. Camera Calibration

The project uses a ZED camera (serial: SN30980871). To use with a different camera:

  1. Download your ZED calibration file from: https://www.stereolabs.com/developers/calib?SN=YOUR_SERIAL_NUMBER
  2. Replace SN30980871.conf with your calibration file
  3. Update the serial number reference in documentation

3. Model Weights

The project requires a trained YOLOv8 model for keyboard key detection:

  • Place best.pt in the project directory (or update model_path in distance.py)
  • Model should be trained to detect individual keyboard keys with confidence > 0.5

Usage

ROS 2 Nodes

Run the movement vector publisher:

ros2 run autonomous_typing movement_vector_publisher

Run the movement vector subscriber:

ros2 run autonomous_typing movement_vector_subscriber

Basic Usage

python detection.py

This prompts for a launch key string and executes the typing sequence on the keyboard by publishing movement vectors via ROS 2.

Robot Integration

Replace the stub move() function in detection.py with actual robot control commands that publish to ROS 2 topics:

from geometry_msgs.msg import Vector3

def move(x, y, z):
    """
    Command robot end-effector motion via ROS 2.

    Args:
        x: Displacement in meters (horizontal)
        y: Displacement in meters (vertical)
        z: Displacement in meters (depth/press)
    """
    # Publish the movement vector; 'publisher' must be a previously created
    # ROS 2 publisher for Vector3 messages on the 'movement_vector' topic.
    msg = Vector3()
    msg.x = float(x)
    msg.y = float(y)
    msg.z = float(z)
    publisher.publish(msg)

Enabling Closed-Loop Alignment

Uncomment the alignment loop in detection.py to enable continuous re-detection and correction during approach:

while (not aligned(x, y, z)):
    print(f"Centroid of detected markers: x={x:.4f} m, y={y:.4f} m, z={z:.4f} m")
    move(x, y, z)  # Move closer
    corners, ids, rejected = detector.detectMarkers(image)
    # Re-estimate pose and check alignment...

Key Parameters

Parameter            Value    Purpose
marker_size          0.02 m   ArUco marker physical dimension
real_key_width_mm    15.0 mm  Standard keyboard key width (reference)
alignment_threshold  0.01 m   Lateral alignment tolerance
press_depth          0.035 m  Key press distance
model_confidence     0.5      YOLO detection confidence threshold

Workflow

  1. Initialization

    • Load camera calibration parameters
    • Load pre-trained YOLO model
    • Read input string and keyboard image
    • Initialize ROS 2 publisher for movement vectors
  2. Marker Detection

    • Detect ArUco markers in the image
    • Estimate 3D pose for each marker
    • Compute centroid of all marker positions
  3. Key Detection & Measurement

    • Run YOLO inference on keyboard image
    • Identify positions of all target keys
    • Interpolate missing key positions (y, z)
    • Measure inter-key distances in millimeters
  4. Motion Execution

    • For each character in input string:
      • Publish movement vector to ROS 2 topic for horizontal/vertical positioning
      • Publish press command (negative Z)
      • Publish release command (positive Z)
  5. ROS 2 Communication

    • Subscriber node receives movement vectors
    • Translates vectors into hardware commands (PID controllers, motor drivers)
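The motion-execution step above can be sketched as a plain loop. Here `displacements` and `publish` are hypothetical names for the measured inter-key moves and the ROS 2 publishing callable; the press depth matches the Key Parameters table.

```python
PRESS_DEPTH = 0.035  # meters, from the key-parameters table

def type_string(displacements, publish, press_depth=PRESS_DEPTH):
    """Emit a position/press/release triple per inter-key displacement.

    displacements: list of (dx, dy) lateral moves in meters, one per character.
    publish: callable taking (x, y, z), e.g. a ROS 2 Vector3 publisher.
    """
    for dx, dy in displacements:
        publish(dx, dy, 0.0)             # horizontal/vertical positioning
        publish(0.0, 0.0, -press_depth)  # press (negative Z = down)
        publish(0.0, 0.0, press_depth)   # release (positive Z = up)
```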

Coordinate System

  • X-axis: Horizontal (left-right), positive = right
  • Y-axis: Vertical (up-down), positive = down
  • Z-axis: Depth (toward/away camera), positive = away from camera
    • Negative Z = press down
    • Positive Z = lift up

ROS 2 Topics

  • movement_vector (geometry_msgs/Vector3): Publishes 3D displacement commands (x, y, z in meters)
  • Completion (std_msgs/String): Signals completion of typing sequence or individual key presses

Technical Details

Pose Estimation

Uses OpenCV's solvePnP() to solve the perspective-n-point problem:

  • 3D object points: ArUco marker corners in marker-local frame (z=0 plane)
  • 2D image points: Detected marker corners in image
  • Camera matrix + distortion: From ZED calibration
  • Output: Rotation vector (3×1) and translation vector (3×1)

Distance Scaling

Keyboard distances are scaled using the "1" key as reference:

scale_factor = real_key_width_mm / detected_1_key_width_pixels
distance_mm = distance_pixels × scale_factor
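In code, this scaling is a one-liner; the function name is illustrative:

```python
REAL_KEY_WIDTH_MM = 15.0  # physical width of the reference "1" key

def pixels_to_mm(distance_pixels, ref_key_width_pixels,
                 real_key_width_mm=REAL_KEY_WIDTH_MM):
    """Convert a pixel distance to millimeters via the reference key width."""
    scale_factor = real_key_width_mm / ref_key_width_pixels
    return distance_pixels * scale_factor
```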

Troubleshooting

Issue: ArUco markers not detected

  • Ensure good lighting on markers
  • Check marker size parameter matches actual marker
  • Verify camera calibration is correct

Issue: Missing keyboard keys in YOLO detection

  • If confidence < 0.5, lower threshold or retrain model
  • Check image resolution and keyboard visibility
  • Verify model was trained on similar keyboard layouts

Issue: Inaccurate distances

  • Recalibrate camera using ZED tools
  • Verify reference key ("1") is detected
  • Check that real_key_width_mm matches your keyboard

Authors

Joshua Dayal, Mariam Husain, Clara Fang
