Issue with prospector MPI and Lack of Examples on Spectrum Fits
I'm encountering multiple errors when trying to use prospector with MPI on a Linux system with Python 3.11. The issues prevent me from running a multi-core fit and are related to library compatibility and documentation. Even the MPI demo script from the documentation fails when I run it with mpirun -np <number_of_cores> python demo_mpi_params.py --nested_sampler dynesty
1. Core Issue: ModuleNotFoundError on MPI Worker Processes
When running a prospector script with mpirun, all worker processes fail with a ModuleNotFoundError. The error points to a specific submodule, even when the Python executable's full path is specified.
Error: ModuleNotFoundError: No module named 'prospect.observation'
- Context: The prospector package is confirmed to be installed in the Conda environment. The script runs without error in a notebook (such as the Interactive_demo.ipynb provided in the documentation). This suggests the error is specific to how MPI handles Python imports.
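One way to narrow this down is to have every MPI process report which interpreter it runs and whether it can locate the submodule. The following is a minimal diagnostic sketch, not part of prospector: the helper name describe_import_environment is made up for illustration, and the rank lookup assumes the launcher exports OMPI_COMM_WORLD_RANK (Open MPI) or PMI_RANK (MPICH-style launchers).

```python
import importlib.util
import os
import sys


def describe_import_environment(module_name="prospect.observation"):
    """Collect facts about how this process would import `module_name`.

    Hypothetical debugging helper; it is not part of prospector.
    """
    # Rank as exported by common MPI launchers (assumption: Open MPI or an
    # MPICH-style launcher); "n/a" when the script is run without mpirun.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", os.environ.get("PMI_RANK", "n/a"))

    try:
        # find_spec imports the parent package to locate the submodule.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package itself cannot be imported.
        spec = None

    return {
        "rank": rank,
        "executable": sys.executable,
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_import_environment())
```

Saving this under any file name and launching it once with mpirun and once from the notebook kernel allows a direct comparison: a different executable or origin on the workers would point to a different interpreter or a different prospect installation being picked up.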
2. Python 3.11 Incompatibility
The library's read_results function fails because it relies on the deprecated imp module, which has been removed from recent Python releases (the removal was finalized in Python 3.12).
Error: ModuleNotFoundError: No module named 'imp'
- Context: This prevents the loading of any saved .h5 results file, as a core library function is incompatible with recent Python versions.
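For reference, the standard-library replacement for imp.load_source is built from importlib.util. The sketch below shows that pattern in isolation. The function name load_module_from_path is illustrative, and how read_results actually uses imp is not shown in this report, so this is a starting point rather than a patch.

```python
import importlib.util
import sys


def load_module_from_path(module_name, path):
    """Import the Python file at `path` under the name `module_name`.

    Illustrative stand-in for the removed `imp.load_source(name, path)`.
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {path!r}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code inside the file can look itself up.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```

A call such as load_module_from_path("my_params", "/path/to/my_params.py") (both names hypothetical) returns the loaded module object, whose attributes can then be accessed as usual.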
3. Unstable HDF5 I/O and Inconsistent Data Structure
The HDF5 output is unstable and generates multiple warnings. The data structure for optimization results is also inconsistent.
Warnings on write_hdf5:
RuntimeWarning: Could not store paramfile text
RuntimeWarning: Could not JSON serialize model_params, pickled instead
RuntimeWarning: Could not obtain prospector version info
Error when loading results:
KeyError: 0
- Context: This occurs when trying to access optimization results from a file where optimization was skipped, indicating that the result["optimization"] object is not an empty list but an object that cannot be indexed as expected.
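Until the structure is made consistent, calling code can guard the lookup. This is a minimal sketch that only assumes what the error implies: result["optimization"] may be missing, empty, or a container that does not support integer indexing. The helper name first_optimization_result is hypothetical.

```python
def first_optimization_result(result, default=None):
    """Return result["optimization"][0], or `default` if that lookup is impossible."""
    optimization = result.get("optimization") if isinstance(result, dict) else None
    if optimization is None:
        return default
    try:
        return optimization[0]
    except (KeyError, IndexError, TypeError):
        # KeyError: mapping without key 0; IndexError: empty sequence;
        # TypeError: object that does not support indexing at all.
        return default
```

With this guard, a file written with optimization skipped yields the default value instead of raising KeyError: 0.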
4. Inadequate Documentation for MPI
The provided documentation, including tutorials.rst and the demo scripts, does not offer clear, complete instructions for running multi-core fits using mpirun. The user is left to infer the correct syntax and setup, which led to the aforementioned errors. A clear, self-contained example would be very helpful.
5. Numpy Compatibility Issues
The library uses np.infty, an alias that was removed in NumPy 2.0 in favor of np.inf. On older NumPy versions the alias still works, but on NumPy 2.0 and later any access to np.infty raises an AttributeError.
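The replacement is mechanical, because np.inf is the canonical name and np.infty was only an alias for it. A small sketch, using a made-up prior function (log_uniform_prior is illustrative, not a prospector function):

```python
import numpy as np


def log_uniform_prior(x, lower, upper):
    """Log-density of a uniform prior on [lower, upper]; -inf outside the bounds."""
    if lower <= x <= upper:
        return -np.log(upper - lower)
    # np.inf works on old and new NumPy alike, unlike the removed np.infty alias.
    return -np.inf
```

Replacing every np.infty with np.inf in this way keeps the behavior identical on older NumPy releases while also running on NumPy 2.0 and later.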
6. Lack of Spectroscopic Fitting Tutorials
All the provided demo files (demo_params.py, demo_mock_params.py) demonstrate fitting with photometric data only. Given that the prospector pipeline is complex and capable of fitting spectroscopic data, there is a strong need for a dedicated tutorial on this type of analysis.
7. HDF5 Save Failure with Dynesty
The script fails to save the HDF5 output file after a dynesty run. The traceback shows a series of errors indicating that the write_hdf5 function is trying to access attributes on a tuple that it expects to be the dynesty sampler object, causing the save process to fail.
Error Traceback:
AttributeError: 'tuple' object has no attribute 'acceptance_fraction'
TypeError: tuple indices must be integers or slices, not str
Questions for the Development Team
Given that I am new to this pipeline and its complex file structure, could you please provide some guidance on which specific files I need to modify to address these errors?
Also, are there any existing tutorials or demo files for fitting spectroscopic data? I would greatly appreciate it if you could share code or a guide on how to fit nebular emission lines and use them to find the star formation history (SFH) and star formation rate (SFR) for galaxies.