# FitScript/openenv.yaml (repository coffeine16/FitScript, branch main)

spec_version: 1
name: FitScript
description: >
  A fitness prescription environment where an AI agent generates safe, effective,
  and personalized workout and nutrition plans for clients with varying health
  conditions, injuries, and equipment constraints. Three tasks of increasing
  difficulty test safety reasoning, constraint satisfaction, and multi-client
  resource allocation.
type: space
runtime: fastapi
app: server.app:app
port: 8000

tasks:
  - id: 1
    name: basic_safe_prescription
    difficulty: easy
    description: >
      Generate a complete 4-week fitness plan for a healthy 24-year-old male
      targeting fat loss. Covers caloric deficit, protein targets, workout split,
      and beginner-appropriate progressive overload.

  - id: 2
    name: injury_constraints
    difficulty: medium
    description: >
      Generate a modified plan for a 35-year-old female with Type 2 diabetes
      (HbA1c 7.8%) and a left knee meniscus tear, working night shifts.
      Requires simultaneous constraint satisfaction: knee safety, glycemic
      management, and night-shift recovery adaptation.

  - id: 3
    name: multi_client_allocation
    difficulty: hard
    description: >
      Prescribe for 4 clients (a cardiac patient, a marathon runner, a client
      with a herniated disc, and a postpartum client) who share a home gym
      (dumbbells ≤20 kg, bands, pull-up bar) and only 3 hours of coaching time
      per week. Tests multi-constraint reasoning and resource allocation.

reward:
  formula: "safety^1.5 × efficacy × personalization^0.8 × completeness^0.5 (any CRITICAL safety violation → 0.0)"
  range: [0.0, 1.0]
  description: >
    Multiplicative reward with per-component exponents. A CRITICAL safety
    violation collapses the reward to 0.0 regardless of the other scores. Safety
    (exponent 1.5) is weighted heaviest because unsafe prescriptions cause real
    harm. Efficacy (exponent 1.0) checks whether the plan achieves the stated
    goal. Personalization (exponent 0.8) penalizes generic templates.
    Completeness (exponent 0.5) rewards fully specified plans.
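  # Worked example of the formula above. This is an illustrative sketch only:
  # the component scores are hypothetical, and they are assumed to lie in
  # [0.0, 1.0], which this file does not state explicitly.
  #   safety = 0.81, efficacy = 0.80, personalization = 1.00, completeness = 0.64
  #   reward = 0.81^1.5 × 0.80 × 1.00^0.8 × 0.64^0.5
  #          = 0.729 × 0.80 × 1.00 × 0.80
  #          ≈ 0.467
  # If the same plan carried a CRITICAL safety violation, the reward would be
  # 0.0 whatever the other scores are.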

tags:
  - openenv
  - fitness
  - healthcare
  - safety-critical
  - multi-constraint
  - personalization