Question about reproduction gap on NAVSIM v1 released checkpoints (Stage 2 PDMS 86.29 vs 87.4, Stage 3 PDMS 89.7 vs 91.1)

Hi, thanks for releasing the code and checkpoints.

I am trying to reproduce the NAVSIM v1 results using the released checkpoints, but my numbers come out roughly 1.1 to 1.4 PDMS below those reported in the paper.

Specifically:

- **Stage 2: Diffusion Planner Imitation Learning**
  - Paper: **87.4 PDMS**
  - My reproduction with the released checkpoint: **86.29 PDMS**

- **Stage 3: Diffusion Planner Reinforcement Learning Training**
  - Paper: **91.1 PDMS**
  - My reproduction with the released checkpoint: **89.7 PDMS**
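
For reference, here is a quick sketch of how I computed the gaps. The dictionary layout and the helper name `pdms_gap` are only illustrative and are not part of the released code; the numbers are the ones listed above.

```python
# PDMS values copied from the paper and from my runs listed above.
results = {
    "stage2": {"paper": 87.4, "reproduced": 86.29},
    "stage3": {"paper": 91.1, "reproduced": 89.7},
}


def pdms_gap(paper, reproduced):
    """Return (absolute gap, gap as a percentage of the paper value), rounded to 2 decimals."""
    gap = paper - reproduced
    return round(gap, 2), round(100.0 * gap / paper, 2)


# Hypothetical helper applied to each stage.
gaps = {stage: pdms_gap(**vals) for stage, vals in results.items()}

for stage, (abs_gap, rel_gap) in gaps.items():
    print(f"{stage}: {abs_gap} PDMS below the paper ({rel_gap}% of the paper value)")
```

Both gaps are larger than what I would expect from rounding alone, which is why I suspect a difference in checkpoints or evaluation settings.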

I would like to ask whether there are any additional details that are important for reproducing the reported numbers. For example:

1. Are the released checkpoints exactly the same as the ones used to report the paper results?
2. Are there any missing evaluation settings, data filtering steps, or preprocessing details that are not included in the current release?

If helpful, I can also provide more details about my environment, the commit I am using, and my evaluation command.

Any clarification would be greatly appreciated. Thanks!
