94d7918 to 22ae857
…ies. Improves portability.
…fetching cron job.
…es and in-place UDP data streaming download.
… Use download functions instead of os.system calls for shell distribution dependent web downloads.
…roxy archive cron job.
… Use File IO operations to process large archive file.
…ies. Improves portability.
…tting and writing the webpage cron job.
…ies. Improves portability.
…ile read of the existing GOES differential and integral protons in the existing data directory.
…ies. Improves portability.
22ae857 to 3d1d277
| # | ||
| SPACE_WEATHER = Path(os.getenv('SPACE_WEATHER', "/data/mta4/Space_Weather")) | ||
| GOES_DATA_DIR : Path = SPACE_WEATHER / "GOES" / "Data" | ||
| HRC_PROXY_ARCHIVE : Path= GOES_DATA_DIR / "hrc_proxy.csv" |
We now define our data directories from the running environment, with defaults set to the primary-run paths. The environment variables set by the primary and secondary cronjob runs are documented in the GOES README.md.
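A minimal sketch of how this lookup behaves, for illustration only. The helper name `resolve_goes_data_dir` and the override path `/tmp/test_space_weather` are hypothetical; the `SPACE_WEATHER` variable, its default, and the `GOES/Data` layout come from the diff above.

```python
import os
from pathlib import Path


def resolve_goes_data_dir(env=None):
    """Return the GOES data directory for a given environment mapping.

    Hypothetical helper: the scripts in this PR compute the same path
    with module-level constants instead of a function.
    """
    env = os.environ if env is None else env
    # Fall back to the primary-run location when SPACE_WEATHER is unset.
    space_weather = Path(env.get("SPACE_WEATHER", "/data/mta4/Space_Weather"))
    return space_weather / "GOES" / "Data"


# Primary run: no override, so the default path is used.
primary = resolve_goes_data_dir({})
# Secondary/test run: the cron entry exports SPACE_WEATHER (illustrative value).
secondary = resolve_goes_data_dir({"SPACE_WEATHER": "/tmp/test_space_weather"})
```

Passing the mapping explicitly is only a convenience for demonstrating both cases without touching the real process environment.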
| print(msg) | ||
| else: | ||
| p = Popen(["/sbin/sendmail", "-t", "-oi"], stdin=PIPE) | ||
| p.communicate(msg.as_bytes()) |
Adds test-mode handling for the notification email sent when there is an interruption in the HRC proxy archive record.
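The test-mode branch can be sketched roughly as follows. The function signature, the `test` parameter, and the return value are assumptions for illustration; only the `print(msg)` branch and the `/sbin/sendmail -t -oi` call are taken from the diff.

```python
from email.message import EmailMessage
from subprocess import PIPE, Popen


def send_mail(subject, body, recipients, test=False):
    """Build a notification email and either print it or hand it to sendmail.

    Hypothetical signature; the real script may structure this differently.
    """
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["To"] = ", ".join(recipients)
    msg.set_content(body)
    if test:
        # Test mode: show the message instead of sending it.
        print(msg)
    else:
        # -t reads recipients from the headers; -oi stops a lone "." line
        # from ending the message early.
        p = Popen(["/sbin/sendmail", "-t", "-oi"], stdin=PIPE)
        p.communicate(msg.as_bytes())
    return msg
```

Returning the message is an addition made here so the test-mode path can be checked without a mail server.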
| check_cadence() | ||
|
|
||
| #: Remove lock file once process is completed | ||
| os.remove(lock) No newline at end of file |
This lock-file section handles race conditions and stalled runs. The design approach is the same, but instead of executing shell commands directly, we now use process-management libraries such as psutil and os.
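As a rough illustration of the lock-file pattern, here is a standard-library-only sketch. It is not the code in this PR: the names `pid_is_running` and `run_with_lock` are hypothetical, and `os.kill(pid, 0)` (POSIX only) stands in for the psutil check the scripts use.

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def run_with_lock(lock, main):
    """Run main() unless another live process already holds the lock file."""
    lock = Path(lock)
    if lock.exists():
        text = lock.read_text().strip()
        if text.isdigit() and int(text) > 0 and pid_is_running(int(text)):
            return False  # a previous run is still active
        lock.unlink()  # stale lock left behind by a dead or stalled run
    lock.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "x" fails if another process created the file in the meantime.
        with open(lock, "x") as f:
            f.write(str(os.getpid()))
    except FileExistsError:
        return False
    try:
        main()
    finally:
        #: Remove lock file once process is completed
        os.remove(lock)
    return True
```

Creating the file in exclusive mode narrows the window in which two cron invocations could both believe they hold the lock.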
Checking the HRC proxy archive is now a single Python script called directly from the cron table, so the check_archive shell scripts are no longer necessary. See README.md for the cronjob entry.
| OUT_DATA_DIR = "/data/mta4/Space_Weather/GOES/Data" | ||
| SPACE_WEATHER = Path(os.getenv('SPACE_WEATHER', "/data/mta4/Space_Weather")) | ||
| GOES_DATA_DIR : Path = SPACE_WEATHER / "GOES" / "Data" | ||
| OUT_GOES_DATA_DIR : Path = SPACE_WEATHER / "GOES" / "Data" |
We now define our data directories from the running environment, with defaults set to the primary-run paths. The environment variables set by the primary and secondary cronjob runs are documented in the GOES README.md.
| # | ||
| os.system(f"rm /tmp/{user}/{name}.lock") | ||
| #: Remove lock file once process is completed | ||
| os.remove(lock) No newline at end of file |
This lock-file section handles race conditions and stalled runs. The design approach is the same, but instead of executing shell commands directly, we now use process-management libraries such as psutil and os.
| GOES_DATA_DIR = '/data/mta4/Space_Weather/GOES/Data' | ||
| SPACE_WEATHER = Path(os.getenv('SPACE_WEATHER', "/data/mta4/Space_Weather")) | ||
| GOES_DATA_DIR : Path = SPACE_WEATHER / "GOES" / "Data" | ||
|
|
We now define our data directories from the running environment, with defaults set to the primary-run paths. The environment variables set by the primary and secondary cronjob runs are documented in the GOES README.md.
| os.system(f"rm /tmp/{user}/{name}.lock") | ||
|
|
||
| #: Remove lock file once process is completed | ||
| os.remove(lock) No newline at end of file |
This lock-file section handles race conditions and stalled runs. The design approach is the same, but instead of executing shell commands directly, we now use process-management libraries such as psutil and os.
Performs the same operation as goes_main_script, which bundled GOES data processing for file fetching, plotting, and webpage generation. It is now environment portable.
Archiving long-term GOES data is now handled by calling the collect_goes_long.py Python script directly in the cron table, so the two goes_long shell scripts are no longer needed. See GOES README.md for details.
Incorporated into the goes.sh script.
| SPACE_WEATHER = Path(os.getenv("SPACE_WEATHER", "/data/mta4/Space_Weather")) | ||
| SPACE_WEATHER_WEB = Path(os.environ.get('SPACE_WEATHER_WEB', "/data/mta4/www/RADIATION")) | ||
| GOES_DATA_DIR : Path = SPACE_WEATHER / "GOES" / "Data" | ||
| GOES_PLOT_DIR : Path = SPACE_WEATHER_WEB / "GOES" / "Plots" |
We now define our data directories from the running environment, with defaults set to the primary-run paths. The environment variables set by the primary and secondary cronjob runs are documented in the GOES README.md.
Additionally, instead of re-fetching the GOES data to generate the plot, we read the data files already written by the fetch_goes_data.py script. The later changes in this plot_goes_data.py script therefore remove data fetching and formatting that fetch_goes_data.py already performs.
| intg_data_dict["title"] = "Proton Flux (Integral)" | ||
| intg_data_dict["labels"] = INTG_GROUP_SELECTION | ||
| intg_data_dict["colors"] = ["red", "blue", "#51FF3B"] | ||
| intg_data_dict["limits"] = {"y_min": 1e-2, "y_max": 1e4} |
The main function has been rewritten to read the existing data files and perform all plotting operations. The algorithmic approach is the same.
| sleep(5) | ||
| _last_exception.add_note(f'Decorator ran function {_freq} times. Still encountered error.') | ||
| raise _last_exception | ||
| return wrapper_func |
This rerun decorator was included to retry fetching the GOES data table after a JSON decode or network error. The fetch_goes_data.py script already handles this, and the plotting script now reads the existing data files rather than recreating them, so the decorator is no longer needed.
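For readers unfamiliar with the pattern being removed, a retry decorator of this kind can be sketched as below. This is a reconstruction under assumptions, not the deleted code: the name `rerun`, its parameters, and the loop structure are hypothetical; only the sleep between attempts, the note text, and the final re-raise are suggested by the diff fragment.

```python
import functools
from time import sleep


def rerun(freq=3, wait=5, exceptions=(Exception,)):
    """Retry the wrapped function up to `freq` times before giving up."""
    if freq < 1:
        raise ValueError("freq must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper_func(*args, **kwargs):
            last_exception = None
            for attempt in range(freq):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    last_exception = exc
                    if attempt < freq - 1:
                        sleep(wait)  # pause before the next attempt
            # add_note() exists only on Python 3.11+, so guard the call.
            if hasattr(last_exception, "add_note"):
                last_exception.add_note(
                    f"Decorator ran function {freq} times. Still encountered error."
                )
            raise last_exception

        return wrapper_func

    return decorator
```

A fetch function decorated with `@rerun(freq=3, wait=5)` would be attempted up to three times, which is the kind of resilience fetch_goes_data.py is described as providing on its own.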
| with urllib.request.urlopen(jlink, timeout = 10) as url: | ||
| data = json.loads(url.read().decode()) | ||
| data = Table(data) | ||
| return data |
This function was included to fetch the GOES data table. The fetch_goes_data.py script already handles this, and the plotting script now reads the existing data files rather than recreating them, so the function is no longer needed.
| intg_data_dict["colors"] = ["red", "blue", "#51FF3B"] | ||
| intg_data_dict["limits"] = {"y_min": 1e-2, "y_max": 1e4} | ||
|
|
||
| return Table(rows = new_rows) |
This reorientation function was included to format SWPC NOAA data into an astropy table in MeV units. The fetch_goes_data.py script already handles this, and the plotting script now reads the existing data files rather than recreating them, so the function is no longer needed.
| f.write(str(pid)) | ||
| main() | ||
| #: Remove lock file once process is completed | ||
| os.remove(lock) No newline at end of file |
This lock-file section handles race conditions and stalled runs. The design approach is the same, but instead of executing shell commands directly, we now use process-management libraries such as psutil and os.
Fetching the GOES-19 associated media is now performed by calling the swpc_media.py Python script directly in the cron table, so the two pull_swpc_media shell scripts are no longer needed. See GOES README.md for details.
|
|
||
| 2-59/5 * * * * ${ENV_FLIGHT}/bin/skare ${SPACE_WEATHER}/goes.sh >> ${HOME}/Logs/goes_main_new.cron 2>&1 | ||
| 3-59/5 * * * * cd ${SPACE_WEATHER}/GOES/Scripts; ${ENV_FLIGHT}/bin/skare python alert_hrc.py -m flight >> ${HOME}/Logs/goes_main_new.cron 2>&1 | ||
| 4-59/5 * * * * cd ${SPACE_WEATHER}/GOES/Scripts; ${ENV_FLIGHT}/bin/skare python check_archive.py -m flight >> ${HOME}/Logs/goes_archive_check.cron 2>&1 |
These newly listed cronjobs are not truly new. Instead, the Python scripts have been refactored to be environment portable, so the shell wrapper scripts are no longer necessary.
| # | ||
| GOES_MEDIA_DIR = '/data/mta4/www/RADIATION/GOES/Media' | ||
| SPACE_WEATHER_WEB = Path(os.environ.get('SPACE_WEATHER_WEB', "/data/mta4/www/RADIATION")) | ||
| GOES_MEDIA_DIR : Path = SPACE_WEATHER_WEB / "GOES" / "Media" |
We now define our data directories from the running environment, with defaults set to the primary-run paths. The environment variables set by the primary and secondary cronjob runs are documented in the GOES README.md.
| resp = requests.get(url, timeout=30) | ||
| resp.raise_for_status() | ||
| img = Image.open(io.BytesIO(resp.content)) | ||
| return img |
The download_img() function removes the need to download the image file with a wget command into an intermediary file directory. The image is loaded directly into memory in the Python process instead.
| with open(file_out, 'wb') as f: | ||
| for chunk in resp.iter_content(chunk_size = 1024*1024): | ||
| if chunk: | ||
| f.write(chunk) |
Refactored the video download into a function for stream downloading, thereby removing the dependence on wget.
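A small sketch of the chunked write, separated from the network call so it can be exercised on its own. The helper name `save_stream` and its return value are hypothetical; the loop body mirrors the diff, and `resp` is assumed to be any object exposing `iter_content(chunk_size=...)`, as a `requests` response does.

```python
def save_stream(resp, file_out, chunk_size=1024 * 1024):
    """Write a streamed response body to file_out one chunk at a time."""
    written = 0
    with open(file_out, "wb") as f:
        for chunk in resp.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                written += f.write(chunk)
    return written


# Typical use with requests (not executed here; url and file_out are
# placeholders supplied by the caller):
#   resp = requests.get(url, stream=True, timeout=30)
#   resp.raise_for_status()
#   save_stream(resp, file_out)
```

With `stream=True`, the body is pulled from the connection as the loop iterates, so a large video never has to fit in memory at once.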
| GOES_DATA_DIR : Path = SPACE_WEATHER / "GOES" / "Data" | ||
| GOES_WEB_DIR : Path = SPACE_WEATHER_WEB / "GOES" | ||
| TESTMAIL = False | ||
| ADMIN = 'mtadude@cfa.harvard.edu' |
We now define our data directories from the running environment, with defaults set to the primary-run paths. The environment variables set by the primary and secondary cronjob runs are documented in the GOES README.md.
| print(msg) | ||
| else: | ||
| p = Popen(["/sbin/sendmail", "-t", "-oi"], stdin=PIPE) | ||
| p.communicate(msg.as_bytes()) |
Because the GOES HRC proxy alert depends on the Gp_pchan_5m.txt file, this update_goes_html_page.py script includes a notification system that warns the MTA team of any runtime issues. This change to the send_mail() function adds better testing capabilities and handling for platform-independent mail commands.
| # | ||
| os.system(f"rm /tmp/{user}/{name}.lock") | ||
| #: Remove lock file once process is completed | ||
| os.remove(lock) No newline at end of file |
This lock-file section handles race conditions and stalled runs. The design approach is the same, but instead of executing shell commands directly, we now use process-management libraries such as psutil and os.
This PR refactors the rest of the GOES scripts to be portable across platforms and environments, thereby allowing the alert cronjob to be run in configurable environments. This is achieved by the following: