It would be very helpful to expose more detailed performance metrics in gpullama, similar to what Ollama provides.
Right now, it is difficult to properly evaluate performance (especially across CPU/GPU backends and TornadoVM execution) without fine-grained timing information. Having a consistent set of metrics would significantly improve benchmarking, profiling, and optimization.
Proposed metrics
Core metrics (aligned with Ollama-style reporting):
total_duration – total time to generate the full response
load_duration – time spent loading the model
prompt_eval_count – number of input tokens processed
prompt_eval_duration (prefill) – time spent processing the prompt
eval_count – number of generated output tokens
eval_duration (decode) – time spent generating tokens
TornadoVM-specific metrics:
tornado_task_graph_compile_duration – time to compile the Tornado task graph
tornado_task_graph_warmup_duration – time spent in warmup/execution until steady state
All timings should ideally be reported in nanoseconds for consistency and precision.
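As a rough illustration of what the proposed metrics could look like on the Java side, here is a minimal sketch of a metrics holder with all durations in nanoseconds. The type and field names are hypothetical (nothing like this exists in gpullama yet); they simply mirror the Ollama-style counters plus the TornadoVM-specific ones listed above.

```java
// Hypothetical sketch only: a value object collecting the proposed metrics.
// All *_DurationNs fields are in nanoseconds, as suggested above.
record InferenceMetrics(
        long totalDurationNs,                   // total_duration
        long loadDurationNs,                    // load_duration
        int promptEvalCount,                    // prompt_eval_count (input tokens)
        long promptEvalDurationNs,              // prompt_eval_duration (prefill)
        int evalCount,                          // eval_count (output tokens)
        long evalDurationNs,                    // eval_duration (decode)
        long tornadoTaskGraphCompileDurationNs, // tornado_task_graph_compile_duration
        long tornadoTaskGraphWarmupDurationNs   // tornado_task_graph_warmup_duration
) { }
```

A record keeps this immutable and trivial to serialize into whatever reporting format the CLI or API ends up using.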
With the above we can calculate:
time_to_first_token (TTFT) – can be derived from existing durations (e.g. load + prefill + first decode step), so it may not need separate instrumentation if timestamps are available
prefill_throughput as tok/s = prompt_eval_count / prompt_eval_duration
decode_throughput as tok/s = eval_count / eval_duration
total_throughput as tok/s = (prompt_eval_count + eval_count) / total_duration <--- as we do now
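The derived quantities above could be computed as below. This is a sketch under the assumption that the raw counts and nanosecond durations proposed earlier are available; the class and method names are made up for illustration.

```java
// Hypothetical helpers for the derived metrics; all durations in nanoseconds.
final class DerivedMetrics {
    static final double NS_PER_SEC = 1_000_000_000.0;

    // prefill_throughput = prompt_eval_count / prompt_eval_duration (tok/s)
    static double prefillThroughput(int promptEvalCount, long promptEvalDurationNs) {
        return promptEvalCount / (promptEvalDurationNs / NS_PER_SEC);
    }

    // decode_throughput = eval_count / eval_duration (tok/s)
    static double decodeThroughput(int evalCount, long evalDurationNs) {
        return evalCount / (evalDurationNs / NS_PER_SEC);
    }

    // total_throughput = (prompt_eval_count + eval_count) / total_duration (tok/s)
    static double totalThroughput(int promptEvalCount, int evalCount, long totalDurationNs) {
        return (promptEvalCount + evalCount) / (totalDurationNs / NS_PER_SEC);
    }

    // TTFT derived from existing durations: load + prefill + first decode step
    static long timeToFirstTokenNs(long loadNs, long prefillNs, long firstDecodeStepNs) {
        return loadNs + prefillNs + firstDecodeStepNs;
    }
}
```

Keeping the division in `double` and converting nanoseconds to seconds once avoids the integer-truncation pitfalls of dividing two `long` durations directly.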
Why this is useful
These metrics would make it easier to:
break down execution into loading, prefill, decode, and runtime overheads
understand TornadoVM-specific costs (compilation and warmup)
compare CPU vs GPU vs TornadoVM performance more accurately
identify bottlenecks and guide optimizations