13:36:44 tinker_cookbook.tfh.launcher INFO Created run directory: logs/TinkerRuns/04/01/fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy
13:36:44 tinker_cookbook.tfh.launcher INFO Run ID: t49yfhhy
13:36:44 tinker_cookbook.tfh.launcher INFO Wandb: https://wandb.ai/matan-shtepel-carnegie-mellon-university/apps-tinker/runs/t49yfhhy
13:36:44 tinker_cookbook.tfh.launcher INFO Launching recipe: tinker_cookbook.recipes.apps_rl.train_step_ranged
13:36:46 tinker_cookbook.tfh.launcher INFO Registered with VFH run tracker (run_id=t49yfhhy)
13:36:46 tinker_cookbook.tfh.launcher INFO Starting training...
13:36:46 tinker.lib.internal_client_holder WARNING Your Tinker SDK version is outdated. Please upgrade to the latest version.
13:36:46 tinker.lib.public_interfaces.service_client INFO ServiceClient initialized for session d598355c-f51a-5ebd-85e5-0ba384c4bd8a
13:36:47 tinker_cookbook.checkpoint_utils INFO Using renderer from checkpoint metadata for tinker://d07c70ac-ee41-5bd0-bc58-817060f69db4:train:0/sampler_weights/final: gpt_oss_no_sysprompt
wandb: [wandb.login()] Loaded credentials for https://api.wandb.ai from WANDB_API_KEY.
wandb: Currently logged in as: matan-shtepel (matan-shtepel-carnegie-mellon-university) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.25.1
wandb: Run data is saved locally in logs/TinkerRuns/04/01/fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy/wandb/run-20260401_133647-t49yfhhy
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy
wandb: ⭐️ View project at https://wandb.ai/matan-shtepel-carnegie-mellon-university/apps-tinker
wandb: 🚀 View run at https://wandb.ai/matan-shtepel-carnegie-mellon-university/apps-tinker/runs/t49yfhhy
13:36:49 tinker_cookbook.utils.ml_log INFO
Configuration:
learning_rate: 2e-05
dataset_builder: {'phases_json': '[{"start_step": 0, "end_step": null, "parquet_path":
"/shared/matan/data/apps_backdoor_w_hidden_tagged_prompt", ... ': 8, 'model_name_for_tokenizer': 'openai/gpt-oss-120b',
'renderer_name': 'gpt_oss_no_sysprompt', 'seed': 0, 'total_steps': 200}
model_name: 'openai/gpt-oss-120b'
max_tokens: 6144
log_path: 'logs/TinkerRuns/04/01/fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy'
eval_every: 20
save_every: 20
evaluator_builders: []
load_checkpoint_path: 'tinker://d07c70ac-ee41-5bd0-bc58-817060f69db4:train:0/sampler_weights/final'
renderer_name: 'gpt_oss_no_sysprompt'
wandb_project: 'apps-tinker'
wandb_name: 'fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy'
wandb_run_id: 't49yfhhy'
kl_penalty_coef: 0.0
kl_discount_factor: 0.0
kl_reference_config: None
log_kl_from_base: False
loss_fn: 'importance_sampling'
loss_fn_config: None
num_substeps: 1
lora_rank: 32
temperature: 1.0
compute_post_kl: False
remove_constant_reward_groups: False
enable_trace: False
span_chart_every: 0
async_config: None
stream_minibatch_config: None
base_url: None
ttl_seconds: 604800
num_groups_to_log: 4
rollout_json_export: False
max_steps: 200
root:535 [INFO] Command line invocation: /shared/matan/code/tinker-cookbook/tinker_cookbook/tfh/__main__.py new --base-config configs/base/apps.json5 --override-config configs/03/31/fork_u5mmibjz_tent_abs_exp_penalty.json5 --description fork-u5mmibjz-exp-penalty-nonhidden
tinker_cookbook.utils.ml_log:485 [INFO] Logging to: logs/TinkerRuns/04/01/fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy
tinker_cookbook.checkpoint_utils:294 [INFO] No checkpoints found at logs/TinkerRuns/04/01/fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy/checkpoints.jsonl
tinker_cookbook.checkpoint_utils:325 [INFO] No checkpoints found with key state_path in logs/TinkerRuns/04/01/fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy
tinker.lib.internal_client_holder:326 [WARNING] Your Tinker SDK version is outdated. Please upgrade to the latest version.
tinker.lib.public_interfaces.service_client:75 [INFO] ServiceClient initialized for session 8b2eae2a-b416-5ac5-a87d-2fb256df6f0c
tinker_cookbook.checkpoint_utils:139 [INFO] Renderer metadata matches for checkpoint tinker://d07c70ac-ee41-5bd0-bc58-817060f69db4:train:0/sampler_weights/final: gpt_oss_no_sysprompt
tinker.lib.public_interfaces.service_client:159 [INFO] TrainingClient initialized for model 8b2eae2a-b416-5ac5-a87d-2fb256df6f0c:train:0
tinker.lib.telemetry:204 [INFO] Exception logged for session ID: 8b2eae2a-b416-5ac5-a87d-2fb256df6f0c
tinker.lib.telemetry:204 [INFO] Exception logged for session ID: 8b2eae2a-b416-5ac5-a87d-2fb256df6f0c
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/tfh/__main__.py", line 5, in <module>
main()
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/tfh/launcher.py", line 151, in main
_handle_new(args)
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/tfh/launcher.py", line 333, in _handle_new
_launch_recipe(recipe_module=recipe_module, config=resolved, metadata=metadata)
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/tfh/launcher.py", line 593, in _launch_recipe
asyncio.run(cli_main_fn(cli_config))
File "/shared/matan/dotfiles/aiai-cluster/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/shared/matan/dotfiles/aiai-cluster/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/dotfiles/aiai-cluster/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/recipes/apps_rl/train_step_ranged.py", line 128, in cli_main
await main(config)
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/utils/trace.py", line 526, in async_wrapper
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/tinker_cookbook/rl/train.py", line 1493, in main
training_client = await service_client.create_training_client_from_state_async(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/telemetry.py", line 384, in _awrapper
return await cast(Callable[..., Awaitable[R]], func)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/public_interfaces/service_client.py", line 301, in create_training_client_from_state_async
await load_future.result_async()
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/public_interfaces/api_future.py", line 132, in result_async
return await asyncio.wrap_future(self._future)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/telemetry.py", line 384, in _awrapper
return await cast(Callable[..., Awaitable[R]], func)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/public_interfaces/training_client.py", line 564, in _load_state_impl
future = await self.holder.execute_with_retries(_send_request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/internal_client_holder.py", line 452, in execute_with_retries
{
raise e
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/internal_client_holder.py", line 413, in execute_with_retries
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/lib/public_interfaces/training_client.py", line 559, in _send_request
return await client.weights.load(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/resources/weights.py", line 62, in load
return await self._post(
^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/_base_client.py", line 1230, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/matan/code/tinker-cookbook/.venv/lib/python3.12/site-packages/tinker/_base_client.py", line 1031, in request
raise self._make_status_error_from_response(err.response) from None
tinker.BadRequestError: Error code: 400 - {'detail': 'Path is invalid'}
wandb:
wandb: 🚀 View run fork-u5mmibjz-exp-penalty-nonhidden_t49yfhhy at: https://wandb.ai/matan-shtepel-carnegie-mellon-university/apps-tinker/runs/t49yfhhy
I did a training run yesterday that resulted in some checkpoints. Here is a partial view of my config, which `tinker` runs.

However, replacing `weights` with `sampler_weights` makes it stop working, i.e. it raises the following error (see the error trace):

`tinker.BadRequestError: Error code: 400 - {'detail': 'Path is invalid'}`
`tfh.launcher` is my own convenience wrapper, and it seems unlikely that it interferes. I think it would also be nice to enhance this error message with the path that the server thinks is invalid.

I don't think that I deleted the checkpoint, and I am not even sure if it's possible to delete the `sampler_weights` without deleting the `weights`.

Thanks for the great service! Sorry if I am missing something!
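
To illustrate the kind of error message I have in mind, here is a rough client-side sketch. It is not the Tinker SDK's behavior, and both helper names (`checkpoint_kind`, `call_with_path_context`) are hypothetical. The path layout in the regex is an assumption inferred only from the single path in the log above; it does not explain why the server rejects the path, it only makes the offending path and its kind visible in the error.

```python
import re
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed layout, inferred only from the path in the log above:
#   tinker://<session-id>:train:<index>/<kind>/<name>
_PATH_RE = re.compile(
    r"^tinker://(?P<session>[0-9a-f-]+):train:(?P<index>\d+)"
    r"/(?P<kind>[^/]+)/(?P<name>[^/]+)$"
)


def checkpoint_kind(path: str) -> str:
    """Return the <kind> component, e.g. 'weights' or 'sampler_weights'."""
    match = _PATH_RE.match(path)
    if match is None:
        raise ValueError(f"unrecognized checkpoint path: {path!r}")
    return match.group("kind")


def call_with_path_context(loader: Callable[[str], T], path: str) -> T:
    """Call loader(path); on failure, re-raise with the path in the message.

    `loader` stands in for whatever function actually loads the checkpoint.
    """
    try:
        return loader(path)
    except Exception as exc:
        raise RuntimeError(
            f"loading checkpoint failed for path {path!r}: {exc}"
        ) from exc


if __name__ == "__main__":
    path = (
        "tinker://d07c70ac-ee41-5bd0-bc58-817060f69db4:train:0"
        "/sampler_weights/final"
    )
    print(checkpoint_kind(path))  # prints "sampler_weights"
```

With a wrapper like this, the failure above would at least name the path that was sent, which would make it easier to tell a typo from a path kind the endpoint does not accept.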