Published at ICOEI 2026 · IEEE

Lightweight & Explainable
Emotion Cause
Pair Extraction

A sarcasm-aware ECPE framework using DistilBERT and joint multi-task learning to identify emotions and their causes in conversational text — with 40% fewer parameters than BERT-base.

0.84
F1-Score
0.91
Sarcasm AUC
66M
Parameters
~91%
Sarcasm Acc.
ecpe_inference.py
Input Text
"Great service! My package was two weeks late and no one bothered to inform me."
MODEL OUTPUT
EMOTION Anger — detected via sarcasm-aware encoding
CAUSE "My package was two weeks late"
SARCASM Detected · polarity corrected

Five-Stage Pipeline

Each input document flows through structured preprocessing, sarcasm detection, clause segmentation, contextual encoding, and joint prediction.

01
🧹
Hybrid Text Normalisation
Lowercasing, URL and emoji removal, and punctuation canonicalisation, while preserving clause-boundary markers.
02
🎭
Sarcasm Detection
Fine-tuned RoBERTa classifier trained on a Twitter sarcasm/irony dataset; its sarcasm signal is injected as an auxiliary feature during training.
03
✂️
Clause Segmentation
spaCy dependency parser splits text at cc and advcl relations and at punctuation, producing semantically meaningful cuts.
04
🧠
DistilBERT Encoding
Shared transformer encoder (6 layers, 768-dim) produces contextual clause embeddings efficiently.
05
🔗
Joint Multi-Task Learning
Parallel EE and ECE heads share the encoder, reducing error propagation and improving alignment.
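The five stages above can be sketched end to end. The snippet below is a minimal, dependency-free illustration: a regex normaliser stands in for the hybrid normalisation step, and a naive connective-based splitter stands in for spaCy's cc/advcl dependency cuts (all function names and the connective list are illustrative assumptions, not the paper's implementation).

```python
import re

def normalise(text: str) -> str:
    """Stage 1 (sketch): lowercase, strip URLs and emoji/symbols,
    keep clause-boundary punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)      # drop URLs
    text = re.sub(r"[^\w\s.,;!?']", "", text)     # drop emoji/symbols
    return re.sub(r"\s+", " ", text).strip()

def segment_clauses(text: str) -> list[str]:
    """Stage 3 (sketch): split at punctuation and at coordinating/adverbial
    connectives -- a crude stand-in for spaCy's cc/advcl dependency cuts."""
    parts = re.split(r"[.;!?]|\b(?:and|but|because|after|while)\b", text)
    return [p.strip() for p in parts if p and p.strip()]

doc = normalise("Great service! My package was two weeks late "
                "and no one bothered to inform me.")
clauses = segment_clauses(doc)
print(clauses)  # ['great service', 'my package was two weeks late', 'no one bothered to inform me']
```

The resulting clause list is what the shared DistilBERT encoder would consume in stage 4; the real pipeline uses parse-tree relations rather than a fixed connective list, so it handles sentences these regexes would miss.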

Experimental Evaluation

Evaluated on a unified Master ECPE Corpus (DailyDialog + RECCON) with sarcasm-sensitive clause-level annotations.

📊
0.84
ECPE F1-Score
↑ +2.4% vs baseline
🎯
0.91
Sarcasm ROC-AUC
↑ new capability
66M
Parameters
↓ 40% vs BERT-base
~91%
Sarcasm Accuracy
↑ strong discrimination
Metric                  Proposed (DistilBERT + MTL)   Baseline (BERT-base)
ECPE F1-Score           0.84                          0.82
Sarcasm Detection AUC   0.9138                        — (not supported)
Model Parameters        ~66M                          ~110M
Transformer Layers      6                             12
Inference Speed         Faster                        Slower
Sarcasm-Aware Training  Yes                           No

Architecture

A shared DistilBERT encoder feeds two parallel classification heads for simultaneous emotion and cause prediction.

📄
Input Document D
D = {c₁, c₂, …, cₙ} clauses
🎭
Sarcasm Detection (RoBERTa)
P(s | x) = softmax(Wₛh + bₛ)
✂️
Clause Segmentation (spaCy)
cc · advcl · punctuation
🧠
Shared DistilBERT Encoder
6 layers · 768-dim · 66,362,880 params
😊
Emotion Head (EE)
softmax → 6 classes
🔍
Cause Head (ECE)
binary classifier per clause
🔗
Emotion–Cause Pair Formation
conf = P(emotion) × P(cause)
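The pair-formation rule above (conf = P(emotion) × P(cause)) translates directly into code. A minimal sketch, assuming per-clause outputs from the two heads are already available; the 0.5 confidence threshold is an illustrative assumption, not a value reported in the paper:

```python
def form_pairs(clauses, emotion_probs, cause_probs, threshold=0.5):
    """Score every (emotion clause, cause clause) pair as
    conf = P(emotion) * P(cause), keeping pairs above threshold."""
    pairs = []
    for i, (emo_label, p_emo) in enumerate(emotion_probs):
        for j, p_cause in enumerate(cause_probs):
            conf = p_emo * p_cause
            if conf >= threshold:
                pairs.append((clauses[i], emo_label, clauses[j], round(conf, 2)))
    return sorted(pairs, key=lambda p: -p[3])

clauses = ["I am very happy", "I got a new job"]
emotion_probs = [("joy", 0.97), ("neutral", 0.30)]   # (argmax label, prob) per clause
cause_probs = [0.10, 0.96]                            # P(cause) per clause
print(form_pairs(clauses, emotion_probs, cause_probs))
# [('I am very happy', 'joy', 'I got a new job', 0.93)]
```

Multiplying the two head probabilities penalises pairs where either the emotion or the cause prediction is weak, which is why low-confidence clause pairs are filtered out even when one head is certain.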

Key Design Choices

Knowledge Distillation
Training uses both ground-truth labels and soft probability outputs from a larger teacher model, improving generalisation without increasing model size.
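A standard way to combine the two supervision signals is a weighted sum of hard-label cross-entropy and a temperature-softened divergence against the teacher. The sketch below uses that common formulation; α and T are illustrative hyperparameters, not values reported here.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label, alpha=0.5, T=2.0):
    """L = alpha * CE(hard label) + (1 - alpha) * T^2 * KL(teacher || student),
    with both distributions softened by temperature T."""
    ce = -math.log(softmax(student_logits)[hard_label])
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    return alpha * ce + (1 - alpha) * (T ** 2) * kl

loss = distillation_loss([2.0, 0.5, -1.0], [1.8, 0.7, -0.9], hard_label=0)
```

When the student matches the teacher exactly, the KL term vanishes and only the hard-label cross-entropy remains, so the soft targets act as a regulariser rather than replacing the ground truth.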
Sarcasm-Aware Auxiliary Signal
The RoBERTa sarcasm probability is injected as an auxiliary feature during training, preventing polarity inversion errors in ECPE predictions.
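One simple injection mechanism is to append the RoBERTa sarcasm probability as an extra dimension on every clause embedding before the classification heads. The sketch below shows that concatenation; the exact feature layout is an assumption, since the paper does not specify the injection mechanism.

```python
def inject_sarcasm_feature(clause_embeddings, sarcasm_prob):
    """Append the document-level sarcasm probability as one extra
    dimension on every clause embedding (e.g. 768 -> 769 dims)."""
    return [emb + [sarcasm_prob] for emb in clause_embeddings]

embs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]   # toy 3-dim clause embeddings
augmented = inject_sarcasm_feature(embs, sarcasm_prob=0.91)
print(augmented)  # each vector gains a trailing 0.91
```

Because the heads see the sarcasm probability alongside the contextual embedding, a confidently sarcastic input like "Great service!" can be steered away from a literal joy prediction toward the intended negative emotion.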
Dependency-Based Clause Splitting
Syntactic dependency relations produce semantically meaningful clause boundaries — unlike naive punctuation splitting, which misses complex sentence structures.
Joint Multi-Task Optimisation
Emotion Extraction and Cause Extraction are trained simultaneously with a shared encoder, reducing error propagation and improving emotion–cause alignment.
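Joint training of this kind typically optimises a weighted sum of the two head losses, backpropagated through the shared encoder. A minimal sketch; λ is an illustrative mixing weight, not a value reported in the paper.

```python
import math

def cross_entropy(probs, label):
    """Multiclass cross-entropy for the emotion head (softmax output)."""
    return -math.log(probs[label])

def binary_ce(p, y):
    """Binary cross-entropy for the per-clause cause head."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def joint_loss(emotion_probs, emotion_label, cause_prob, cause_label, lam=0.5):
    """L_total = lam * L_EE + (1 - lam) * L_ECE, so gradients from both
    tasks update the shared DistilBERT encoder in a single step."""
    return (lam * cross_entropy(emotion_probs, emotion_label)
            + (1 - lam) * binary_ce(cause_prob, cause_label))

loss = joint_loss([0.7, 0.1, 0.05, 0.05, 0.05, 0.05], 0, 0.9, 1)
```

Because both task losses flow into one encoder update, an error in emotion prediction no longer cascades into cause extraction the way it does in a two-stage pipeline.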
Early Stopping + AdamW
Training on NVIDIA RTX 3060 with batch size 16, AdamW optimiser, and early stopping on validation loss to prevent overfitting.
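The early-stopping criterion on validation loss can be captured by a small helper like the one below; the patience value is an illustrative assumption, as the paper does not report one.

```python
class EarlyStopper:
    """Stop training when validation loss has not improved
    for `patience` consecutive epochs."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=2)
losses = [0.9, 0.7, 0.71, 0.72, 0.6]
stopped_at = next(i for i, l in enumerate(losses) if stopper.step(l))
print(stopped_at)  # 3: two consecutive non-improving epochs after the 0.7 minimum
```

In the training setup described above, this check would wrap each AdamW epoch so the run halts before the final (spurious) improvement at epoch 4 is ever reached.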

Master ECPE Corpus

A unified clause-level corpus combining DailyDialog and RECCON with sarcasm-sensitive annotations and distillation-enhanced supervision.

DailyDialog

Turn-taking conversational dataset with six emotion labels. Each dialogue reflects realistic daily conversation, making it well suited to training models on natural, informal language patterns.

😊 Joy 😢 Sadness 😠 Anger 😨 Fear 😲 Surprise 😐 Neutral

RECCON

A richly annotated multimodal dataset of expressive emotional conversations that provides fine-grained emotion labels and strong coverage of non-neutral emotional expressions.

😊 Happy 😢 Sad 😠 Angry 😐 Neutral 😤 Frustrated 😲 Excited

Published at ICOEI 2026

9th International Conference on Trends in Electronics and Informatics · IEEE · India

©2026 IEEE

Sample Predictions

Emotion–cause pairs extracted by the proposed model from test set sentences, with confidence scores.

ecpe_results.json · RECCON + DailyDialog · Test Split
"I am very happy because I got a new job."
😊 Happiness
"I got a new job"
0.93
"She felt sad after losing her pet."
😢 Sadness
"losing her pet"
0.91
"He was angry because of the delay."
😠 Anger
"because of the delay"
0.88
"I am surprised by the unexpected gift."
😲 Surprise
"the unexpected gift"
0.85
"She felt scared during the storm."
😨 Fear
"during the storm"
0.89

Research Team

Department of Computer Science and Engineering, Ramco Institute of Technology, Rajapalayam, India.

LR
Latha R
Assistant Professor
rlatha08@gmail.com
DK
Dinesh Kumar K
B.E. CSE Student
dineshdk1904@gmail.com
KK
Karthikeyan K
B.E. CSE Student
karthikeyankandhasamy2005@gmail.com
MK
Muthu Krishnan V
B.E. CSE Student
muthukrishnan16105@gmail.com