Possible bug: Fisher influence uses only token 0 for Qwen-2.5VL vision activations

I think there may be a logic issue in `fisher.py` (around line 113) in the Fisher/influence computation.

The code indexes the activations like:

```python
original_ffn_activations[layer][0]
```

Qwen-2.5VL has no CLS token; its vision features are a sequence of visual patch tokens with shape `(batch * n_patches, hidden_size)`, which in this case would be `(1*256, 3420)`. Indexing `[0]` therefore computes the Fisher information and influential paths from only the first visual token (the top-left patch) rather than from an aggregate over the full image.

Questions:
1. Was using only index 0 intentional?
2. If not, should the influence be computed over all patches, either by mean-pooling across all tokens or by attention-weighted pooling?
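
To make the two alternatives in question 2 concrete, here is a minimal sketch of the reduction over the patch axis. It uses NumPy arrays as a stand-in for the real tensors. The helper `pool_patch_activations`, its `mode` argument, and the per-patch `weights` are hypothetical names for illustration and are not taken from `fisher.py`. How the pooled vector would then feed into the Fisher computation is left open.

```python
import numpy as np

def pool_patch_activations(acts, mode="mean", weights=None):
    """Reduce (n_patches, hidden_size) activations to one (hidden_size,) vector.

    mode="first"    -> acts[0], i.e. what indexing [0] does today
    mode="mean"     -> uniform average over all patches
    mode="weighted" -> average using per-patch weights (e.g. attention scores)
    """
    acts = np.asarray(acts, dtype=np.float64)
    if acts.ndim != 2:
        raise ValueError("expected activations of shape (n_patches, hidden_size)")
    if mode == "first":
        return acts[0]
    if mode == "mean":
        return acts.mean(axis=0)
    if mode == "weighted":
        if weights is None:
            raise ValueError("mode='weighted' requires per-patch weights")
        w = np.asarray(weights, dtype=np.float64)
        if w.shape != (acts.shape[0],):
            raise ValueError("weights must have shape (n_patches,)")
        w = w / w.sum()  # normalise so the weights sum to 1
        return w @ acts
    raise ValueError(f"unknown mode: {mode!r}")

# Random data with the shape quoted above: 256 patches, 3420 FFN units.
rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 3420))

first = pool_patch_activations(acts, mode="first")
mean = pool_patch_activations(acts, mode="mean")
uniform = pool_patch_activations(acts, mode="weighted", weights=np.ones(256))
```

With uniform weights the weighted variant reduces to mean-pooling, so the two options differ only in where the per-patch weights come from.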
