Accident Analysis and Prevention, ( ISI ), Volume (234), No (101854), Year (2026-9) , Pages (108598-101870)

Title : ( An efficient methodology for modeling imbalanced traffic crashes through deep learning techniques )

Authors: ALI IRANDOOST , Marjan Ghaemi , Rouzbeh Shad , Seyed Ali Ziaee

Access to full-text not allowed by authors

Citation: BibTeX | EndNote

Abstract

Accurately predicting crash injury severity is crucial for enhancing traffic safety. However, crash datasets are often highly imbalanced because severe injuries are infrequent, which can bias predictive models. Addressing these challenges, the present study develops a hybrid methodology named ENN-CTGAN. ENN-CTGAN is designed to address class imbalance and overcome the limitations of traditional resampling techniques. A hybrid LSTM-GRU model was developed to predict injury severity, and its performance was compared with LSTM, GRU, CNN, MLP, XGBoost and Random Forest models. The study investigates different synthetic data ratios (1:1, 1:2, 1:4, and 1:6) to identify the optimal ratio for each prediction model. It also proposes a framework that combines mutual information differences with model efficiency to assess the quality of the generated synthetic data. The performance of ENN-CTGAN was compared with other resampling techniques, including SMOTE, Random Oversampling, and Random Undersampling. Prediction performance was evaluated with AUC, G-mean, Sensitivity, Specificity, and Accuracy. Results demonstrate that the proposed two-stage framework outperforms conventional resampling approaches across multiple predictive models. Under fully balanced data (1:1 ratio) the hybrid LSTM-GRU model achieved the best performance within the ENN-CTGAN framework with a G-mean of 0.5452. Among all evaluated ratio configurations and resampling methods, the XGBoost classifier under the proposed ENN-CTGAN framework achieved the highest performance at a 1:4 synthetic proportion (G-mean = 0.5643). These findings highlight the methodological advantage of the ENN-CTGAN design and underscore the importance of empirical, data-driven ratio optimization rather than assuming a fixed balancing strategy in imbalanced crash severity modeling.
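The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling (severe vs. non-severe crash) and uses the standard definition of G-mean as the geometric mean of sensitivity and specificity; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from math import sqrt


def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, accuracy and G-mean for binary labels.

    Assumption: `positive` marks the rare (severe-injury) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "g_mean": sqrt(sensitivity * specificity),
    }


# Toy data (hypothetical): 2 severe crashes among 10 records.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that finds one of the two severe crashes.
model_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

# A degenerate classifier that always predicts the majority class.
majority_pred = [0] * 10

print(binary_metrics(y_true, model_pred))
print(binary_metrics(y_true, majority_pred))
```

On this toy data both classifiers reach the same accuracy, but the majority-only classifier has a G-mean of zero because its sensitivity is zero. This is one reason G-mean is commonly reported alongside accuracy for imbalanced data.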

Keywords

Hybrid resampling ; Traffic safety ; Deep learning ; Generative models ; Imbalanced data

@article{paperid:1107446,
author = {IRANDOOST, ALI and Ghaemi, Marjan and Shad, Rouzbeh and Ziaee, Seyed Ali},
title = {An efficient methodology for modeling imbalanced traffic crashes through deep learning techniques},
journal = {Accident Analysis and Prevention},
year = {2026},
volume = {234},
number = {101854},
month = {September},
issn = {0001-4575},
pages = {108598--101870},
keywords = {Hybrid resampling ; Traffic safety ; Deep learning ; Generative models ; Imbalanced data},
}


%0 Journal Article
%T An efficient methodology for modeling imbalanced traffic crashes through deep learning techniques
%A IRANDOOST, ALI
%A Ghaemi, Marjan
%A Shad, Rouzbeh
%A Ziaee, Seyed Ali
%J Accident Analysis and Prevention
%@ 0001-4575
%D 2026
