SemEval 2026 Task 2: Predicting Variation in Emotional Valence and Arousal over Time from Ecological Essays

📄 Abstract

This repository contains the PyTorch implementation of our SemEval-2026 Task 2 submissions. It addresses three problems in affective computing: state prediction, change forecasting, and long-term trajectory prediction.

Because we were limited to consumer-grade hardware (8GB VRAM), we use efficient "hybrid" architectures, most notably a Siamese network (the "Bifurcated Leviathan"), together with a custom loss based on the Concordance Correlation Coefficient (CCC) that prevents regression to the mean.

📂 Repository Structure

├── paper/
│   └── SemEval_Paper.pdf
├── src/                         # Source code for training and inference
│   ├── subtask1_longitudinal.py 
│   ├── subtask2a_forecasting.py 
│   └── subtask2b_disposition.py
├── LICENSE
├── predictions/                 # Output CSVs for submission
├── splits_subtask1/             # Generated automatically
├── splits_subtask2a/            # Generated automatically
├── splits_subtask2b/            # Generated automatically
├── train_subtask1.csv           # Raw Dataset
├── train_subtask2a.csv          # Raw Dataset
├── train_subtask2b.csv          # Raw Dataset (Main file for Subtask 2B)
├── train_subtask2b_detailed.csv
├── train_subtask2b_user_disposition_change.csv
├── README.md                    # Project documentation
└── requirements.txt             # Python dependencies

🎯 Task Definitions & Methodologies

1. Subtask 1: Longitudinal Affect Assessment

The Task: Given a chronological sequence of m texts $e_1, e_2, \dots, e_m$, the model must produce Valence & Arousal (V&A) predictions for each text: $(v_1, a_1), \dots, (v_m, a_m)$.

Constraint: The test split includes Unseen Users (zero-shot generalization) and Seen Users (temporal tracking).

Our Solution: The Hybrid Early-Fusion Model

Architecture: distilbert-base-uncased + BiLSTM.
Innovation: Instead of relying solely on text, we implement Early Fusion. An explicit User Embedding (dim=32) is concatenated with the text embedding before temporal processing. This allows the LSTM to condition its memory on the specific user identity.
Inference: Uses a custom SlidingWindowDataset to prevent "context starvation" (forgetting history) during testing.
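
The early-fusion and sliding-window ideas above can be sketched in plain Python. This is an illustration only, not the repository's `SlidingWindowDataset`: the helper names `early_fuse` and `sliding_windows`, the window size of 3, and the use of plain lists in place of tensors are all assumptions. The 768-dim text vector matches the hidden size of distilbert-base-uncased, and the 32-dim user vector is the size stated above.

```python
from typing import List, Sequence

TEXT_DIM = 768   # hidden size of distilbert-base-uncased
USER_DIM = 32    # user embedding size stated in this README


def early_fuse(text_embs: Sequence[Sequence[float]],
               user_emb: Sequence[float]) -> List[List[float]]:
    """Concatenate the same user vector onto every per-essay text vector.

    The fused sequence is what a recurrent layer would consume, so the
    user identity is visible at every time step.
    """
    return [list(t) + list(user_emb) for t in text_embs]


def sliding_windows(items: Sequence, window: int) -> List[list]:
    """For each position i, return the (up to) `window` most recent items
    ending at i, so later essays always keep some history."""
    if window < 1:
        raise ValueError("window must be >= 1")
    return [list(items[max(0, i - window + 1): i + 1])
            for i in range(len(items))]


# Toy data: 5 essays by one user (hypothetical values).
text_embs = [[0.1 * i] * TEXT_DIM for i in range(5)]
user_emb = [1.0] * USER_DIM

fused = early_fuse(text_embs, user_emb)
windows = sliding_windows(fused, window=3)

print(len(fused[0]))               # 800
print([len(w) for w in windows])   # [1, 2, 3, 3, 3]
```

Each window ends at the essay being scored, so the first essays see a shorter history and later essays always see the most recent `window` steps.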

2. Subtask 2A: Forecasting State Changes

The Task: Given a sequence of texts and their V&A scores up to time $t$, predict the immediate next-step change in Valence and Arousal. For valence (arousal is defined analogously with $a$): $$ \Delta_1 = v_{t+1} - v_t $$
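
As a worked example of the target definition, the next-step changes can be derived from a user's score history as follows. This is a minimal sketch: the function name `next_step_changes` and the toy scores are hypothetical and are not taken from the dataset or the training scripts.

```python
from typing import List, Sequence, Tuple


def next_step_changes(valence: Sequence[float],
                      arousal: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (v[t+1] - v[t], a[t+1] - a[t]) for every consecutive pair."""
    if len(valence) != len(arousal):
        raise ValueError("valence and arousal must have the same length")
    return [(valence[t + 1] - valence[t], arousal[t + 1] - arousal[t])
            for t in range(len(valence) - 1)]


# Hypothetical score history for one user.
valence = [1, 2, 0, 0]
arousal = [0, 1, 1, 2]

targets = next_step_changes(valence, arousal)
print(targets)  # [(1, 1), (-2, 0), (0, 1)]
```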

Our Solution: The State-Aware Projector

The Problem: "The Drowning Problem." High-dimensional text vectors (768-dim) overwhelm low-dimensional scalar inputs (current state $v_t, a_t$).
The Fix: A Projection MLP boosts the scalar state features into a higher-dimensional space (64-dim) before fusion.
Loss Function: We replaced MSE (Mean Squared Error) with CCC Loss.
- Observation: MSE caused the model to predict "zero change" (flatline) to minimize error.
- Result: CCC forces the model to match the variance of the trajectory, improving correlation from 0.39 to 0.64.
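
The flatline effect can be illustrated with a small pure-Python sketch of a CCC-based loss, using the standard definition $\rho_c = 2\,\mathrm{cov}(x, y) / (\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2)$. This is not the repository's loss function: the names `ccc`, `ccc_loss`, and `mse`, the `eps` stabiliser, and the toy targets are assumptions made for illustration.

```python
from typing import Sequence


def ccc(y_true: Sequence[float], y_pred: Sequence[float],
        eps: float = 1e-8) -> float:
    """Concordance Correlation Coefficient with population statistics."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(y_true, y_pred)) / n
    # eps only guards against a zero denominator (assumption of this sketch).
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2 + eps)


def ccc_loss(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Loss to minimise: 0 at perfect concordance, 1 at zero concordance."""
    return 1.0 - ccc(y_true, y_pred)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical change targets and a "zero change" (flatline) prediction.
deltas = [-1.0, 0.0, 1.0, 0.0]
flat = [0.0, 0.0, 0.0, 0.0]

print(mse(deltas, flat))       # 0.5
print(ccc_loss(deltas, flat))  # 1.0
```

A constant prediction has zero covariance with the targets, so its CCC is 0 and its loss is 1 regardless of how small its squared error is. That is why a CCC objective penalises flatlining where MSE does not.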

3. Subtask 2B: Dispositional (Long-Term) Change

The Task: Predict the change between the mean observed affect (past) and the mean future affect (future): $$ \Delta_{\text{avg}} = \text{avg}(v_{t+1:n}) - \text{avg}(v_{1:t}) $$
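
The target above can be computed directly from a score sequence, as in the sketch below. The function name `disposition_change` and the toy valence values are hypothetical; `t` is taken to be the number of observed (past) essays, matching the split $v_{1:t}$ versus $v_{t+1:n}$.

```python
from typing import Sequence


def disposition_change(scores: Sequence[float], t: int) -> float:
    """Mean of the future scores (after the first t) minus mean of the first t."""
    if not 0 < t < len(scores):
        raise ValueError("t must leave at least one past and one future score")
    past, future = scores[:t], scores[t:]
    return sum(future) / len(future) - sum(past) / len(past)


# Hypothetical valence history: two observed essays, three future essays.
valence = [1.0, 1.0, 3.0, 3.0, 3.0]
print(disposition_change(valence, t=2))  # 2.0
```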

Our Solution: The "Bifurcated Leviathan"

Architecture: A Siamese Network with a shared deberta-v3-large backbone.
Sampling: Implements a "Head-Tail" protocol, sampling the first 16 essays (Head) and last 16 essays (Tail) to model long-term drift.
Residual Learning: We inject the arithmetic difference of the raw scores ("Naive Math") into the final layer. The network learns to refine this statistical trend rather than deriving it from scratch.
Bifurcation: The network splits immediately after the backbone into separate Valence and Arousal heads to prevent noisy Arousal gradients from disrupting Valence learning.
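
The Head-Tail protocol can be sketched as a simple slicing helper. This is an assumption-laden illustration rather than the code in `src/subtask2b_disposition.py`: the name `head_tail_sample` is hypothetical, and the behaviour for users with fewer than 32 essays (head and tail are allowed to overlap) is a choice made for this sketch, not something the README specifies.

```python
from typing import List, Sequence, Tuple


def head_tail_sample(essays: Sequence[str],
                     k: int = 16) -> Tuple[List[str], List[str]]:
    """Return the first k and the last k essays of a chronological sequence.

    For sequences shorter than 2 * k the two slices overlap
    (an assumption of this sketch).
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    return list(essays[:k]), list(essays[-k:])


# Hypothetical user with 40 essays in chronological order.
essays = [f"essay_{i}" for i in range(40)]
head, tail = head_tail_sample(essays)

print(head[0], head[-1])  # essay_0 essay_15
print(tail[0], tail[-1])  # essay_24 essay_39
```

In a Siamese setup, the head and tail batches would each pass through the same shared backbone before their representations are compared.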

📊 Results & Performance

| Task | Metric | Score (Pearson $r$) | Key Insight |
|---|---|---|---|
| Subtask 1 | Valence (Seen) | 0.7026 | User Embeddings are critical for known users. |
| Subtask 1 | Arousal (Seen) | 0.5186 | Arousal is notoriously harder to model than Valence using text. |
| Subtask 2A | Avg Correlation | 0.64 | CCC Loss outperformed MSE by ~27%. |
| Subtask 2B | Valence Change | 0.7031 | Residual learning ("Naive Math") prevents scale collapse. |

🚀 Setup & Usage

Prerequisites

Python 3.10+
NVIDIA GPU (at least 8GB VRAM recommended for training)

Installation

# Clone the repository
git clone https://github.com/YourUsername/SemEval-2026-Task2.git
cd SemEval-2026-Task2

# Install dependencies
pip install -r requirements.txt

Subtask 1:
```
python src/subtask1_longitudinal.py
```
This script handles the "Seen/Unseen" user split automatically.

Subtask 2A:
```
python src/subtask2a_forecasting.py
```
This runs the V5 architecture (DeBERTa + Projection) with CCC Loss to replicate our best results.

Subtask 2B:
```
python src/subtask2b_disposition.py
```
This implements the "Bifurcated Leviathan" model with Head-Tail sampling.

🤝 Acknowledgements & Credits

Originality Statement

The architectures presented here (including the "Leviathan" Siamese network and the Hybrid LSTM-Fusion) are original contributions developed for this competition.

External Resources

We gratefully acknowledge the open-source community. Specifically, initial data processing patterns and file handling structures were informed by the work of:

ThickHedgehog (2025): Deep-Learning-project-SemEval-2026-Task-2. Available at: GitHub.

Note: While preprocessing logic was inspired by the above, the modeling strategies (Early vs. Late Fusion, usage of LSTM for Subtask 1, and CCC optimization) differ significantly in implementation and topology.

📜 Citation

If you use this code or our findings in your research, please cite:

@inproceedings{jumakhan2026longitudinal,
  title={Longitudinal Affective Forecasting: Architectures for Generalization, State Change, and Trajectory Prediction},
  author={Jumakhan, Haseebullah and Assad, Soud and Ahmad, Seyed Abdullah},
  booktitle={Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)},
  year={2026}
}
