Conversation
|
@Helveg I tried to get OpenTelemetry logs from the bsb reconstruction on HPC with the following command:
|
Can you try some smaller debug examples first?
If none of them work, I think I have a CINECA login, or I can use yours, but I keep forgetting how hahaha :D Then I can try to debug the code there.
|
Ok, I have a small test. The following command produces a `jsonlines` file on my local PC but not on the login node of CINECA:
```
# bla.yaml does not exist
opentelemetry-instrument --traces_exporter jsonlines bsb compile bla.yaml -v4
```
Should we look into differences between the installed Python libraries?
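One quick way to compare the two environments is to list which trace exporters each Python installation actually registers. This is only a minimal sketch: it assumes the `jsonlines` exporter is exposed through the `opentelemetry_traces_exporter` entry point group, which is the group the OpenTelemetry SDK consults when resolving `--traces_exporter` names. The helper name `registered_exporters` is made up for this example.

```python
from importlib.metadata import entry_points


def registered_exporters(group="opentelemetry_traces_exporter"):
    """Return the sorted names of the entry points registered in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered with .select()
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name
        selected = eps.get(group, [])
    return sorted({ep.name for ep in selected})


if __name__ == "__main__":
    names = registered_exporters()
    print("registered trace exporters:", names)
    # Assumption: the exporter selected by `--traces_exporter jsonlines`
    # is registered under the name "jsonlines".
    print("jsonlines available:", "jsonlines" in names)
```

Running this on both the local PC and the login node and diffing the output should show whether the exporter is simply not registered on one of them.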
|
Ok, it seems I was missing some OpenTelemetry libraries, even though the code ran without throwing any errors.
```
opentelemetry-api 1.40.0
opentelemetry-distro 0.61b0
opentelemetry-exporter-otlp 1.40.0
opentelemetry-exporter-otlp-proto-common 1.40.0
opentelemetry-exporter-otlp-proto-grpc 1.40.0
opentelemetry-exporter-otlp-proto-http 1.40.0
opentelemetry-instrumentation 0.61b0
opentelemetry-proto 1.40.0
opentelemetry-sdk 1.40.0
opentelemetry-semantic-conventions 0.61b0
```
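Since nothing fails loudly when a package is absent, a small check like the one below could be run in each environment before launching a job. It is a sketch, not part of bsb: the `EXPECTED` list is copied from the package listing above, and the helper names `installed_versions` and `missing_packages` are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the listing above.
EXPECTED = [
    "opentelemetry-api",
    "opentelemetry-distro",
    "opentelemetry-exporter-otlp",
    "opentelemetry-exporter-otlp-proto-common",
    "opentelemetry-exporter-otlp-proto-grpc",
    "opentelemetry-exporter-otlp-proto-http",
    "opentelemetry-instrumentation",
    "opentelemetry-proto",
    "opentelemetry-sdk",
    "opentelemetry-semantic-conventions",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def missing_packages(packages):
    """Return the distribution names that are not installed."""
    return [name for name, ver in installed_versions(packages).items() if ver is None]


if __name__ == "__main__":
    for name, ver in installed_versions(EXPECTED).items():
        print(f"{name:45s} {ver or 'MISSING'}")
```

An empty result from `missing_packages(EXPECTED)` only says the distributions are present; it does not check that their versions are compatible with each other.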
|
Let's just say all of them. It's hard to tell because the otel package ecosystem for Python specifically is a MESS. It's multiple monorepos on GitHub that all contain namespace packages for multiple namespaces 🥲 It's really hard to tell what comes from which package. But I'll fix that as part of the move to a
Describe the work done
WIP branch for @drodarie to experiment with and see if it helps with tracing MPI errors. VERY MUCH A DRAFT.
Still, it might help already. First order of business is to make JSONLines replay possible, then make running spans immortal, then refactor everything into its own `bsb-otel` package.
@drodarie you can already work a bit in parallel with me and try to obtain the JSONLines logs the following way: