SemEval 2026 Task 2: Predicting Variation in Emotional Valence and Arousal over Time from Ecological Essays
This repository contains the PyTorch implementation of our SemEval-2026 Task 2 submissions. It addresses three problems in affective computing: state prediction, change forecasting, and long-term trajectory prediction.
Because we trained on consumer-grade hardware (8GB VRAM), we rely on efficient "hybrid" architectures, notably a Siamese network (the "Bifurcated Leviathan"), and on a custom Concordance Correlation Coefficient (CCC) loss that prevents regression to the mean.
├── paper/
│ └── SemEval_Paper.pdf
├── src/ # Source code for training and inference
│ ├── subtask1_longitudinal.py
│ ├── subtask2a_forecasting.py
│ └── subtask2b_disposition.py
├── LICENSE
├── predictions/ # Output CSVs for submission
├── splits_subtask1/ # Generated automatically
├── splits_subtask2a/ # Generated automatically
├── splits_subtask2b/ # Generated automatically
├── train_subtask1.csv # Raw Dataset
├── train_subtask2a.csv # Raw Dataset
├── train_subtask2b.csv # Raw Dataset (Main file for Subtask 2B)
├── train_subtask2b_detailed.csv
├── train_subtask2b_user_disposition_change.csv
├── README.md # Project documentation
└── requirements.txt # Python dependencies

The Task:
Given a chronological sequence of m texts
- Constraint: The test split includes Unseen Users (zero-shot generalization) and Seen Users (temporal tracking).
Our Solution: The Hybrid Early-Fusion Model
- Architecture: `distilbert-base-uncased` + BiLSTM.
- Innovation: Instead of relying solely on text, we implement Early Fusion. An explicit User Embedding (dim=32) is concatenated with the text embedding before temporal processing. This allows the LSTM to condition its memory on the specific user identity.
- Inference: Uses a custom `SlidingWindowDataset` to prevent "context starvation" (forgetting history) during testing.
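
The early-fusion idea above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/subtask1_longitudinal.py`: the class name `EarlyFusionBiLSTM`, the hidden size, the use of index 0 for unseen users, and the random tensors standing in for DistilBERT text vectors are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class EarlyFusionBiLSTM(nn.Module):
    """Hypothetical sketch: user embedding joined to text features before the LSTM."""

    def __init__(self, num_users: int, text_dim: int = 768,
                 user_dim: int = 32, hidden: int = 128):
        super().__init__()
        # Assumption: index 0 is reserved for unseen users and stays a zero vector.
        self.user_emb = nn.Embedding(num_users + 1, user_dim, padding_idx=0)
        self.lstm = nn.LSTM(text_dim + user_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)  # valence and arousal per time step

    def forward(self, text_feats: torch.Tensor, user_ids: torch.Tensor) -> torch.Tensor:
        # text_feats: (batch, seq_len, text_dim); user_ids: (batch,)
        seq_len = text_feats.size(1)
        # Repeat the user vector at every time step of the sequence.
        user = self.user_emb(user_ids).unsqueeze(1).expand(-1, seq_len, -1)
        fused = torch.cat([text_feats, user], dim=-1)  # early fusion
        out, _ = self.lstm(fused)
        return self.head(out)  # (batch, seq_len, 2)


model = EarlyFusionBiLSTM(num_users=10)
feats = torch.randn(4, 6, 768)       # stand-in for per-essay text vectors
users = torch.tensor([1, 2, 0, 3])   # 0 marks an unseen user in this sketch
preds = model(feats, users)
print(preds.shape)
```

Because the user vector is concatenated before the LSTM, every recurrent update sees who the author is, which is the conditioning effect described above. A late-fusion variant would instead append the user vector after the LSTM output.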
The Task:
Given a sequence of texts and their V&A scores up to time
Our Solution: The State-Aware Projector
- The Problem: "The Drowning Problem." High-dimensional text vectors (768-dim) overwhelm low-dimensional scalar inputs (current state $v_t, a_t$).
  - The Fix: A Projection MLP boosts the scalar state features into a higher-dimensional space (64-dim) before fusion.
- Loss Function: We replaced MSE (Mean Squared Error) with CCC Loss.
  - Observation: MSE caused the model to predict "zero change" (flatline) to minimize error.
  - Result: CCC forces the model to match the variance of the trajectory, improving correlation from 0.39 to 0.64.
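
The projection fix and the CCC loss described above might look roughly like this in PyTorch. It is a sketch under stated assumptions rather than the repository's implementation: `StateAwareFusion`, `ccc_loss`, the two-layer MLP, and the toy tensors are hypothetical, and the loss uses the standard definition CCC = 2·cov / (var_pred + var_target + (mean_pred − mean_target)²).

```python
import torch
import torch.nn as nn


class StateAwareFusion(nn.Module):
    """Hypothetical sketch: lift the scalar state to 64-dim before fusing with text."""

    def __init__(self, text_dim: int = 768, state_dim: int = 2, proj_dim: int = 64):
        super().__init__()
        self.state_proj = nn.Sequential(
            nn.Linear(state_dim, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim)
        )
        self.head = nn.Linear(text_dim + proj_dim, 2)  # predicted change in (v, a)

    def forward(self, text_vec: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([text_vec, self.state_proj(state)], dim=-1)
        return self.head(fused)


def ccc_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Return 1 - CCC, computed per output column and averaged."""
    pm, tm = pred.mean(dim=0), target.mean(dim=0)
    pv = pred.var(dim=0, unbiased=False)
    tv = target.var(dim=0, unbiased=False)
    cov = ((pred - pm) * (target - tm)).mean(dim=0)
    ccc = 2 * cov / (pv + tv + (pm - tm) ** 2 + eps)
    return (1 - ccc).mean()


target = torch.tensor([[-1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [2.0, 2.0]])
flatline = torch.zeros_like(target)           # the "zero change" prediction
flat_loss = ccc_loss(flatline, target)        # covariance is 0, so CCC is 0
perfect_loss = ccc_loss(target, target)       # CCC is ~1

model = StateAwareFusion()
pred = model(torch.randn(4, 768), torch.randn(4, 2))
loss = ccc_loss(pred, target)
loss.backward()  # gradients flow through both the projection and the head
```

A constant prediction has zero covariance with the target, so its CCC is 0 and the loss stays at its neutral value of 1 no matter how small the squared error is. That is why this objective does not reward the flatline behaviour observed with MSE.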
The Task: Predict the change between the mean observed affect (past) and the mean future affect (future): $$ \Delta_{\text{avg}} = \text{avg}(v_{t+1:n}) - \text{avg}(v_{1:t}) $$
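
As a worked example of the formula above, the helper below computes the difference between the mean of the future scores and the mean of the past scores. The function name `delta_avg` and the toy valence series are illustrative only; `t` is counted from 1, as in the formula.

```python
def delta_avg(scores, t):
    """avg(v_{t+1:n}) - avg(v_{1:t}) for a list of scores, with t counted from 1."""
    if not 0 < t < len(scores):
        raise ValueError("t must leave at least one past and one future score")
    past, future = scores[:t], scores[t:]
    return sum(future) / len(future) - sum(past) / len(past)


valence = [1.0, 2.0, 0.0, -1.0, -2.0]
result = delta_avg(valence, t=2)
print(result)  # -2.5: past mean is 1.5, future mean is -1.0
```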
Our Solution: The "Bifurcated Leviathan"
- Architecture: A Siamese Network with a shared `deberta-v3-large` backbone.
- Sampling: Implements a "Head-Tail" protocol, sampling the first 16 essays (Head) and last 16 essays (Tail) to model long-term drift.
- Residual Learning: We inject the arithmetic difference of the raw scores ("Naive Math") into the final layer. The network learns to refine this statistical trend rather than deriving it from scratch.
- Bifurcation: The network splits immediately after the backbone into separate Valence and Arousal heads to prevent noisy Arousal gradients from disrupting Valence learning.
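
The sketch below shows one way the four ideas above could fit together. It is an interpretation, not the code in `src/subtask2b_disposition.py`: `head_tail`, `BifurcatedSiamese`, and all layer sizes are hypothetical, a small linear encoder stands in for the shared transformer backbone so the example runs without downloading weights, and the "Naive Math" term is both fed to each head and added back as a skip connection, which is only one possible reading of "injected into the final layer".

```python
import torch
import torch.nn as nn


def head_tail(essays, k=16):
    """Return the first k and last k items (they overlap if len(essays) < 2 * k)."""
    return essays[:k], essays[-k:]


class BifurcatedSiamese(nn.Module):
    """Hypothetical sketch: shared encoder, separate valence and arousal heads."""

    def __init__(self, in_dim: int = 32, hid: int = 64):
        super().__init__()
        # Stand-in for the shared backbone; the same weights encode head and tail.
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid), nn.Tanh())
        # Each head sees [head_repr, tail_repr, its own naive delta].
        self.valence_head = nn.Linear(2 * hid + 1, 1)
        self.arousal_head = nn.Linear(2 * hid + 1, 1)

    def forward(self, head_x, tail_x, naive_delta):
        # head_x, tail_x: (batch, k, in_dim); naive_delta: (batch, 2)
        head_repr = self.encoder(head_x).mean(dim=1)
        tail_repr = self.encoder(tail_x).mean(dim=1)
        pair = torch.cat([head_repr, tail_repr], dim=-1)
        dv, da = naive_delta[:, :1], naive_delta[:, 1:]
        # Residual learning: each head predicts a correction to the naive difference.
        v = dv + self.valence_head(torch.cat([pair, dv], dim=-1))
        a = da + self.arousal_head(torch.cat([pair, da], dim=-1))
        return torch.cat([v, a], dim=-1)  # (batch, 2)


first, last = head_tail(list(range(40)))
net = BifurcatedSiamese()
out = net(torch.randn(3, 16, 32), torch.randn(3, 16, 32), torch.randn(3, 2))
print(len(first), len(last), out.shape)
```

With the skip connection, a network whose heads output zero already reproduces the naive estimate, so training only has to learn the correction on top of it.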
| Task | Metric | Score (Pearson $r$) | Key Insight |
|---|---|---|---|
| Subtask 1 | Valence (Seen) | 0.7026 | User Embeddings are critical for known users. |
| Subtask 1 | Arousal (Seen) | 0.5186 | Arousal is notoriously harder to model than Valence using text. |
| Subtask 2A | Avg Correlation | 0.64 | CCC Loss outperformed MSE by ~27%. |
| Subtask 2B | Valence Change | 0.7031 | Residual learning ("Naive Math") prevents scale collapse. |
- Python 3.10+
- NVIDIA GPU (Minimum 8GB VRAM recommended for training)
# Clone the repository
git clone https://github.com/YourUsername/SemEval-2026-Task2.git
cd SemEval-2026-Task2
# Install dependencies
pip install -r requirements.txt
- Subtask 1: `python src/subtask1_longitudinal.py`
  This script handles the "Seen/Unseen" user split automatically.
- Subtask 2A: `python src/subtask2a_forecasting.py`
  This executes the V5 architecture (DeBERTa + Projection) using CCC Loss to replicate our best results.
- Subtask 2B: `python src/subtask2b_disposition.py`
  This implements the "Bifurcated Leviathan" model with Head-Tail sampling.
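
For readers who want to build a comparable "Seen/Unseen" split on their own data, one possible approach is sketched below. It does not reproduce the logic in the Subtask 1 script: the function `split_seen_unseen`, the `(user_id, timestamp, text)` record layout, the 20% unseen fraction, and the one-entry temporal holdout are all assumptions for illustration.

```python
import random


def split_seen_unseen(records, unseen_frac=0.2, holdout_last=1, seed=0):
    """Split (user_id, timestamp, text) records into train, seen-test, unseen-test.

    Unseen users are held out entirely; seen users keep their earliest entries
    for training and give up their last `holdout_last` (>= 1) entries for testing.
    """
    users = sorted({r[0] for r in records})
    random.Random(seed).shuffle(users)
    n_unseen = max(1, int(len(users) * unseen_frac))
    unseen = set(users[:n_unseen])

    by_user = {}
    for rec in sorted(records, key=lambda r: (r[0], r[1])):
        by_user.setdefault(rec[0], []).append(rec)

    train, test_seen, test_unseen = [], [], []
    for user, rows in by_user.items():
        if user in unseen:
            test_unseen.extend(rows)                  # zero-shot generalization
        else:
            train.extend(rows[:-holdout_last])        # chronological prefix
            test_seen.extend(rows[-holdout_last:])    # temporal tracking
    return train, test_seen, test_unseen


records = [(u, t, f"essay {u}-{t}") for u in range(5) for t in range(4)]
train, test_seen, test_unseen = split_seen_unseen(records)
print(len(train), len(test_seen), len(test_unseen))
```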
The architectures presented here (including the "Leviathan" Siamese network and the Hybrid LSTM-Fusion) are original contributions developed for this competition.
We gratefully acknowledge the open-source community. Specifically, initial data processing patterns and file handling structures were informed by the work of:
- ThickHedgehog (2025): Deep-Learning-project-SemEval-2026-Task-2. Available at: GitHub.
Note: While preprocessing logic was inspired by the above, the modeling strategies (Early vs. Late Fusion, usage of LSTM for Subtask 1, and CCC optimization) differ significantly in implementation and topology.
If you use this code or our findings in your research, please cite:
@inproceedings{jumakhan2026longitudinal,
title={Longitudinal Affective Forecasting: Architectures for Generalization, State Change, and Trajectory Prediction},
author={Jumakhan, Haseebullah and Assad, Soud and Ahmad, Seyed Abdullah},
booktitle={Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)},
year={2026}
}