Issue with prospector MPI and Error in Saving hdf5 Files #378

@bharadwajsumukha

Issue with prospector MPI and Lack of Examples on Spectrum Fits

I'm encountering multiple errors when trying to use prospector with MPI on a Linux system with Python 3.11. They prevent me from running a multi-core fit and appear to stem from library compatibility problems and gaps in the documentation. Even the demo script provided with the documentation fails when I run it with mpirun -np <number_of_cores> python demo_mpi_params.py --nested_sampler dynesty

1. Core Issue: ModuleNotFoundError on MPI Worker Processes

When running a prospector script with mpirun, all worker processes fail with a ModuleNotFoundError. The error points to a specific submodule, even when the Python executable's full path is specified.

Error: ModuleNotFoundError: No module named 'prospect.observation'

  • Context: The prospector package is confirmed to be installed in the Conda environment. The same code runs without error in a notebook (such as the Interactive_demo.ipynb provided in the documentation). This suggests the error is specific to how the MPI-launched processes resolve Python imports.
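
One way to narrow this down is to have each MPI rank report which interpreter it runs and whether it can locate the submodule. The sketch below uses only the standard library. The helper name describe_import_environment is hypothetical, and reading the rank from OMPI_COMM_WORLD_RANK (Open MPI) or PMI_RANK (MPICH-style launchers) is an assumption about the launcher in use.

```python
import importlib.util
import os
import sys


def describe_import_environment(module_name="prospect.observation"):
    """Report which interpreter this process runs and where a module resolves."""
    # Rank variables set by common launchers; "?" if neither is present.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", os.environ.get("PMI_RANK", "?"))
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "prospect") is itself missing.
        spec = None
    return {
        "rank": rank,
        "executable": sys.executable,
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_import_environment())
```

Saving this under a hypothetical name such as check_env.py, launching it with mpirun -np 2 python check_env.py, and comparing the output with the same call in the notebook kernel should show whether the workers pick up a different interpreter or a different prospector install that lacks prospect.observation.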

2. Python 3.11 Incompatibility

The library's read_results function fails because it relies on the deprecated imp module, which was removed from the standard library in Python 3.12 (on 3.11 it still imports, with a DeprecationWarning).

Error: ModuleNotFoundError: No module named 'imp'

  • Context: This prevents the loading of any saved .h5 results file, as a core library function is incompatible with recent Python versions.
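
For reference, the standard-library replacement for loading a source file by path (what imp.load_source used to do) lives in importlib. The following is a minimal sketch of that pattern, not the library's actual fix; the function name load_module_from_path and the default module name "paramfile" are hypothetical.

```python
import importlib.util
import sys


def load_module_from_path(path, module_name="paramfile"):
    """Load a Python source file as a module, similar to imp.load_source."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {path!r}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code inside the file can import itself.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```

Whether read_results only needs this kind of path-based loading is an assumption on my part; the call sites that currently import imp would need to be checked.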

3. Unstable HDF5 I/O and Inconsistent Data Structure

The HDF5 output is unstable and generates multiple warnings. The data structure for optimization results is also inconsistent.

Warnings on write_hdf5:

  • RuntimeWarning: Could not store paramfile text
  • RuntimeWarning: Could not JSON serialize model_params, pickled instead
  • RuntimeWarning: Could not obtain prospector version info

Error when loading results:

  • KeyError: 0
  • Context: This occurs when trying to access optimization results from a file where optimization was skipped, indicating that the result["optimization"] object is not an empty list but an object that cannot be indexed as expected.
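
Until the stored structure is made consistent, a defensive accessor on the reading side avoids the KeyError. This is only a sketch: I do not know what type result["optimization"] actually holds when optimization is skipped, so the helper (hypothetical name first_optimization_result) treats anything that is missing, empty, or not indexable at position 0 as "no optimization results".

```python
def first_optimization_result(result):
    """Return the first optimization entry, or None if there is none."""
    opt = result.get("optimization")
    if opt is None:
        return None
    # A dict without a 0 key is what makes opt[0] raise KeyError: 0.
    if isinstance(opt, dict):
        return opt.get(0)
    try:
        return opt[0] if len(opt) > 0 else None
    except (TypeError, KeyError, IndexError):
        # Not usable as a sequence (e.g. a scalar or a placeholder object).
        return None
```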

4. Inadequate Documentation for MPI

The provided documentation, including tutorials.rst and the demo scripts, does not offer clear, complete instructions for running multi-core fits using mpirun. The user is left to infer the correct syntax and setup, which led to the aforementioned errors. A clear, self-contained example would be very helpful.


5. NumPy Compatibility Issues

The library uses np.infty, an alias that was deprecated and then removed in NumPy 2.0 in favor of np.inf. On NumPy 2.x any access to np.infty raises an AttributeError, so this is a hard failure rather than just a warning on newer installations.
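
To find every place that needs changing, a small standard-library scan of the source tree is enough. The sketch below is my own helper (the name find_removed_aliases is hypothetical) and also looks for a few sibling aliases that were removed in the same NumPy release; it only matches the np. and numpy. prefixes, so other import styles would be missed.

```python
import re
from pathlib import Path

# Aliases removed in NumPy 2.0, mapped to their replacements.
REMOVED_ALIASES = {"infty": "inf", "Inf": "inf", "Infinity": "inf", "NaN": "nan"}
PATTERN = re.compile(r"\b(?:np|numpy)\.(" + "|".join(REMOVED_ALIASES) + r")\b")


def find_removed_aliases(root):
    """Yield (path, line_number, alias, replacement) for each match under root."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in PATTERN.finditer(line):
                alias = match.group(1)
                yield path, lineno, alias, REMOVED_ALIASES[alias]
```

Calling list(find_removed_aliases("path/to/checkout")) on a local clone (the path is a placeholder) returns the file and line of each occurrence, which can then be replaced with np.inf or np.nan.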


6. Lack of Spectroscopic Fitting Tutorials

All the provided demo files (demo_params.py, demo_mock_params.py) demonstrate fitting with photometric data only. Given that the prospector pipeline is complex and capable of fitting spectroscopic data, there is a strong need for a dedicated tutorial on this type of analysis.


7. HDF5 Save Failure with Dynesty

The script fails to save the HDF5 output file after a dynesty run. The traceback shows a series of errors indicating that the write_hdf5 function is trying to access attributes on a tuple that it expects to be the dynesty sampler object, causing the save process to fail.

Error Traceback:

  • AttributeError: 'tuple' object has no attribute 'acceptance_fraction'
  • TypeError: tuple indices must be integers or slices, not str

Questions for the Development Team

Given that I am new to this pipeline and its complex file structure, could you please provide some guidance on which specific files I need to modify to address these errors?

Also, are there any existing tutorials or demo files for fitting spectroscopic data? I would greatly appreciate it if you could share code or a guide on how to fit nebular emission lines and use them to find the star formation history (SFH) and star formation rate (SFR) for galaxies.
