Dear DualMap authors,
Thank you for releasing this excellent work. I am a graduate student working on semantic navigation for a humanoid robot in a grain warehouse environment, and I would like to ask whether my current adaptation direction is reasonable.
My robot system runs Ubuntu 22.04 and ROS2 Humble. The robot has an Intel RealSense D435i RGB-D camera and a Livox Mid-360 LiDAR. Localization is currently LiDAR-based, and the robot is controlled through a `cmd_vel` interface. My goal is not to replace the low-level humanoid locomotion controller, but to use DualMap as an upper-level semantic mapping and object-goal navigation module.
So far, I have completed the following experiments:
- I successfully ran DualMap with the Habitat Data Collector in ROS2 mode.
- The ROS2 stream provides RGB, depth, camera info, and pose topics.
- I exported a custom YOLOv5 detector trained for grain warehouse objects to ONNX.
- I replaced the original YOLO frontend with my custom ONNX detector.
- The system log confirms that DualMap loads my custom detector.
- In the simulation stream, I have observed detected objects being converted into 3D object visualizations in Rerun, including `bbox`, `rgb_pcd`, and `sem_pcd`.
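For context, this is how I currently understand the step from a 2D detection to a 3D point: standard pinhole back-projection using the depth value and the camera intrinsics. The sketch below is my own illustration, not DualMap's code; the function names and the intrinsics in the usage note are placeholders I made up.

```python
def backproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into the camera frame.

    Assumes an ideal pinhole camera with no lens distortion.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def bbox_center_to_3d(bbox, depth_m, fx, fy, cx, cy):
    """Back-project the center of a 2D box (x_min, y_min, x_max, y_max).

    depth_m is the depth at the box center, in meters (hypothetical input;
    a real pipeline would more likely use the masked depth pixels).
    """
    x_min, y_min, x_max, y_max = bbox
    u = 0.5 * (x_min + x_max)
    v = 0.5 * (y_min + y_max)
    return backproject_pixel(u, v, depth_m, fx, fy, cx, cy)
```

For example, with placeholder intrinsics `fx = fy = 600`, `cx = 320`, `cy = 240`, a pixel at the principal point with 2 m depth lands on the optical axis at `(0, 0, 2)`. If the real pipeline makes additional assumptions here (depth units, aligned depth, distortion), I would appreciate knowing.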
My custom detector currently contains warehouse-related classes such as `grain pile`, `gate`, `equipment`, `electrical equipment`, `silo`, `barricade`, `person`, and others.
I would like to ask several questions:
- For deploying DualMap on a real ROS2 humanoid robot, is it sufficient to provide synchronized RGB, depth, camera intrinsics, and robot/camera pose, or are there additional assumptions in the current implementation?
- In your real-world experiments, does DualMap rely on an external localization / odometry module and only consume pose, or does it require a specific SLAM/localization pipeline?
- Is replacing the detection frontend with a closed-set domain-specific detector, while keeping SAM, CLIP, depth projection, local map, and global map unchanged, a reasonable adaptation strategy?
- For a robot with only one forward/downward RGB-D camera, do you think DualMap can still build a useful object-level semantic map, or would you recommend multi-camera input for better coverage?
- For real robot navigation, should DualMap’s output path be treated as a high-level semantic goal/path and then passed to the robot’s own navigation stack, rather than being directly used as the low-level controller?
- When using a prebuilt map on a real robot, how do you recommend handling coordinate consistency if the robot starts from a different pose or if LiDAR odometry drifts over time?
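To make the last question concrete, here is a minimal sketch of what I currently have in mind for coordinate consistency. It assumes planar (x, y, yaw) poses and that LiDAR relocalization against the prebuilt map gives me the pose of the `odom` frame expressed in the `map` frame. All names are hypothetical and this is not taken from DualMap.

```python
import math


def invert_se2(pose):
    """Invert a planar pose (x, y, yaw): returns (R^T, -R^T t)."""
    x, y, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (-(c * x + s * y), s * x - c * y, -yaw)


def transform_point(pose, point):
    """Apply a planar pose (x, y, yaw) to a 2D point."""
    x, y, yaw = pose
    px, py = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * px - s * py, y + s * px + c * py)


def map_goal_to_odom(t_map_odom, goal_map):
    """Re-express a goal stored in the prebuilt map frame in the odom frame.

    t_map_odom: pose of the odom frame in the map frame (assumed to come
    from LiDAR relocalization and to be refreshed as odometry drifts).
    """
    return transform_point(invert_se2(t_map_odom), goal_map)
```

With this approach, object positions stay fixed in the prebuilt map frame and only `t_map_odom` is updated when the robot starts from a different pose or the odometry drifts. I am not sure whether this matches how DualMap expects poses to be provided, which is why I am asking.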
Thank you very much for your time. Any advice on the correct integration boundary between DualMap and a real ROS2 robot navigation system would be very helpful.