Questions about adapting DualMap to a ROS2 humanoid robot with custom detector and LiDAR localization #51

@nguyenvybao27-wq

Dear DualMap authors,

Thank you for releasing this excellent work. I am a graduate student working on semantic navigation for a humanoid robot in a grain warehouse environment, and I would like to ask whether my current adaptation direction is reasonable.

My robot system runs Ubuntu 22.04 and ROS2 Humble. The robot has an Intel RealSense D435i RGB-D camera and a Livox Mid-360 LiDAR. Localization is currently LiDAR-based, and the robot is controlled through a cmd_vel interface. My goal is not to replace the low-level humanoid locomotion controller, but to use DualMap as an upper-level semantic mapping and object-goal navigation module.

So far, I have completed the following experiments:

  1. I successfully ran DualMap with the Habitat Data Collector in ROS2 mode.
  2. The ROS2 stream provides RGB, depth, camera info, and pose topics.
  3. I exported a custom YOLOv5 detector trained for grain warehouse objects to ONNX.
  4. I replaced the original YOLO frontend with my custom ONNX detector.
  5. The system log confirms that DualMap loads my custom detector.
  6. In the simulation stream, I have observed detected objects being converted into 3D object visualizations in Rerun, including bbox, rgb_pcd, and sem_pcd.
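
For reference, my understanding of the depth-projection step behind item 6 can be sketched as below. This is only a simplified illustration under my own assumptions, not DualMap's actual code: the function names are hypothetical, it uses a plain pinhole model with intrinsics fx, fy, cx, cy, and it takes the median depth inside the detection box instead of a SAM mask.

```python
from statistics import median


def backproject_pixel(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth z (meters)
    into the camera frame. Returns (x, y, z)."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def bbox_to_camera_point(bbox, depth_m, fx, fy, cx, cy):
    """Hypothetical helper: estimate one 3D point for a detection.

    bbox is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    depth_m is a row-major 2D list of depths in meters, 0 meaning invalid.
    Returns None when the box contains no valid depth.
    """
    x0, y0, x1, y1 = bbox
    valid = [
        depth_m[v][u]
        for v in range(y0, y1)
        for u in range(x0, x1)
        if depth_m[v][u] > 0.0
    ]
    if not valid:
        return None
    # The median is less sensitive to background pixels than the mean.
    z = median(valid)
    u_c = (x0 + x1) / 2.0
    v_c = (y0 + y1) / 2.0
    return backproject_pixel(u_c, v_c, z, fx, fy, cx, cy)
```

The resulting point is still in the camera frame; it would need the camera pose to be placed in the map frame.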

My custom detector currently contains warehouse-related classes such as grain pile, gate, equipment, electrical equipment, silo, barricade, person, and others.

I would like to ask several questions:

  1. For deploying DualMap on a real ROS2 humanoid robot, is it sufficient to provide synchronized RGB, depth, camera intrinsics, and robot/camera pose, or are there additional assumptions in the current implementation?
  2. In your real-world experiments, does DualMap rely on an external localization / odometry module and only consume pose, or does it require a specific SLAM/localization pipeline?
  3. Is replacing the detection frontend with a closed-set domain-specific detector, while keeping SAM, CLIP, depth projection, local map, and global map unchanged, a reasonable adaptation strategy?
  4. For a robot with only one forward/downward RGB-D camera, do you think DualMap can still build a useful object-level semantic map, or would you recommend multi-camera input for better coverage?
  5. For real robot navigation, should DualMap’s output path be treated as a high-level semantic goal/path and then passed to the robot’s own navigation stack, rather than being directly used as the low-level controller?
  6. When using a prebuilt map on a real robot, how do you recommend handling coordinate consistency if the robot starts from a different pose or if LiDAR odometry drifts over time?
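
To make question 6 concrete, the sketch below shows the kind of frame correction I have in mind. It is an assumption on my side, not something taken from DualMap: poses are simplified to planar (x, y, yaw) tuples, the function names are hypothetical, and I assume some relocalization step can provide the robot pose in the prebuilt map frame at least once.

```python
import math


def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def compose(a, b):
    """Compose two planar poses (x, y, yaw): apply b in the frame of a."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, wrap(at + bt))


def inverse(p):
    """Inverse of a planar pose (x, y, yaw)."""
    x, y, t = p
    c, s = math.cos(t), math.sin(t)
    return (-(c * x + s * y), s * x - c * y, wrap(-t))


def map_from_odom(map_T_base, odom_T_base):
    """Correction that maps odometry-frame poses into the prebuilt map frame.

    map_T_base:  robot pose in the map frame (from relocalization, assumed).
    odom_T_base: robot pose in the odometry frame at the same instant.
    """
    return compose(map_T_base, inverse(odom_T_base))


def to_map_frame(map_T_odom, odom_T_base):
    """Express a later odometry pose in the map frame."""
    return compose(map_T_odom, odom_T_base)
```

With this structure, a different start pose only changes the initial correction, and drift could be absorbed by recomputing the correction whenever a new relocalization result arrives. Whether DualMap expects this correction to be applied before poses are published is part of what I am asking.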

Thank you very much for your time. Any advice on the correct integration boundary between DualMap and a real ROS2 robot navigation system would be very helpful.
