Skip to content

Fix for standard error and output stream#170

Open
xop5 wants to merge 8 commits intomainfrom
stream_task_output
Open

Fix for standard error and output stream#170
xop5 wants to merge 8 commits intomainfrom
stream_task_output

Conversation

@xop5
Copy link
Copy Markdown
Collaborator

@xop5 xop5 commented May 3, 2026

No description provided.

@xop5 xop5 force-pushed the stream_task_output branch from d90f9d7 to 7e13f84 Compare May 5, 2026 03:57
@xop5 xop5 requested a review from ryanraaschCDC May 5, 2026 03:59
Comment thread cfa/cloudops/_cloudclient.py Outdated
command_line: str,
mount_pairs: list[dict] | None = None,
name_suffix: str = "",
blob_container: str | None = None,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need to introduce another parameter to this function. We're already saving the blob container name that is specified in save_logs_to_blob in create_job() and saved to self. This could get passed in directly when we need it.

Comment thread cfa/cloudops/_cloudclient.py Outdated
)
if rel_mnt_path != "ERROR!":
rel_mnt_path = "/" + helpers.format_rel_path(rel_path=rel_mnt_path)
rel_mnt_path = "/"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need this rel_mnt_path = "/" when we aren't appending the stdout/stderr file names to the task command?

Comment thread cfa/cloudops/_cloudclient.py Outdated
@@ -838,16 +840,7 @@ def add_task(
logger.debug(f"Using provided container name: {container_name}.")

if self.save_logs_to_blob:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could maybe get rid of this check on if self.save_logs_to_blob here and just create the OutputFile in the batch_helpers.add_task

Comment thread cfa/cloudops/_cloudclient.py Outdated
job_name=job_name,
task_id_base=job_name,
command_line=command_line,
save_logs_rel_path=rel_mnt_path,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we use rel_mnt_path anywhere anymore? If not, we could change the save_logs_rel_path parameter name to something like save_log_blob_container that a user can input a blob container directly or if it's None here then we can pull from self.save_logs_to_blob. This can replace the blob_container parameter below since they both do the same thing

Comment thread cfa/cloudops/batch_helpers.py Outdated
logger.debug(f"Stdout will be saved to: '{stdout_file}'")
logger.debug(f"Stderr will be saved to: '{stderr_file}'")
return f"""/bin/bash -c "mkdir -p {_folder}; {command_line} > {stdout_file} 2> {stderr_file}" """
return f"""/bin/bash -c "mkdir -p {_folder}; {command_line} > >(tee {stdout_file}) 2> >(tee {stderr_file})" """
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're using OutputFiles for saving logs to blob then we don't need to do any of the tee commands. We can use the command that the user provides

@@ -1309,24 +1324,17 @@ def add_task(
logger.debug(f"Generated string-based task ID: '{task_id}'")

if save_logs_rel_path is not None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this whole check be removed?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need this check so I can modify the command sent to Azure task.

working_directory="containerImageDefault",
),
output_files = None
if blob_container and blob_storage_account:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of blob_container we check for save_logs_blob_container or whatever the parameter name is that you choose (from the comment above)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CloudClient will pass the self.save_logs_blob_container as value of blob_container parameter to batch_helpers.add_task

Comment thread cfa/cloudops/batch_helpers.py Outdated
file_pattern="../std*.txt",
blob_container=blob_container,
blob_account=blob_storage_account,
path=job_name,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the path we may want to save to job_name/task_id. I believe the output of each task is simply a stdout.txt and stderr.txt so we will need a way to distinguish between tasks too

Comment thread pyproject.toml Outdated
[project]
name = "cfa.cloudops"
version = "0.3.22"
version = "0.3.23"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're doing more changes and possibly renaming parameters, I would suggest bumping the version to 0.4.0 for the next minor update

@xop5 xop5 force-pushed the stream_task_output branch from 2978f61 to 746818a Compare May 6, 2026 15:54
@xop5 xop5 force-pushed the stream_task_output branch from 786fafa to 9491fd2 Compare May 6, 2026 16:30
@xop5 xop5 force-pushed the stream_task_output branch from 3100c35 to 3281955 Compare May 6, 2026 16:34
@xop5 xop5 force-pushed the stream_task_output branch from 1c76c7a to cfd0f18 Compare May 6, 2026 16:38
@xop5 xop5 requested a review from ryanraaschCDC May 6, 2026 20:33
@ryanraaschCDC
Copy link
Copy Markdown
Collaborator

when I test the code on my side to input-test with a logs folder of <job_name>_logs, I get stdout and stderr saved here with long title names, rather than the nested job_nam/task_name/stdout.txt that I expected. It makes me think the OutputFiles isn't working and it's the task output redirection doing the work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants