Data Mining and Knowledge Discovery, Volume (40), No (45), Year (2026-5), Pages (1-27)

Title: (Efficient training of deep networks using guided spectral data selection: a step toward learning what you need)

Authors: Mohammadreza Sharifi, Ahad Harati

Abstract

Effective data selection is an essential step for optimizing neural network training. In this paper, we present the Guided Spectrally Tuned Data Selection (GSTDS) algorithm, which dynamically adjusts the subset of data points used for training a learner model. Using a predefined schedule, GSTDS reduces the number of data points processed in each mini-batch by focusing on the most informative samples and avoiding redundancy, leading to significant savings in computational effort. Moreover, by leveraging an off-the-shelf reference model, we obtain an overview of the data through features extracted from each sample, which further aids the selection process. The selection criterion is mainly based on spectral analysis, which uses Fiedler vector-based scoring and weighting mechanisms to dynamically prioritize the most informative and geometrically diverse data points, improving generalization performance. Extensive experiments on standard image classification benchmarks, including CIFAR-10/100, TinyImageNet, Oxford-IIIT Pet, and Oxford Flowers, show that GSTDS outperforms vanilla training and recent state-of-the-art methods across several key metrics. These results highlight the potential of online spectral dataset pruning, combined with dynamic weighting based on residual error and a predefined schedule for the filtering ratio, for data-efficient training of deep learning models. The code is available at https://github.com/rezasharifi82/GSTDS.
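
To make the idea of Fiedler vector-based scoring concrete, the following is a minimal sketch and not the authors' GSTDS implementation. It assumes each row of `features` is a feature vector for one sample of a mini-batch (for example, taken from a reference model), builds a Gaussian-similarity graph over the batch, and uses the eigenvector of the second-smallest eigenvalue of the graph Laplacian as a one-dimensional ordering. The function names, the Gaussian kernel with bandwidth `sigma`, and the rule of keeping evenly spaced samples along that ordering are all illustrative assumptions; the residual-error weighting mentioned in the abstract is not modeled here.

```python
import numpy as np


def fiedler_scores(features, sigma=1.0):
    """Fiedler vector of a Gaussian-similarity graph over the rows of `features`.

    Hypothetical helper: the kernel and `sigma` are assumptions of this sketch.
    """
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    sq_norms = np.sum(features ** 2, axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    dist2 = np.maximum(dist2, 0.0)

    # Dense affinity matrix with no self-loops.
    weights = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    laplacian = np.diag(weights.sum(axis=1)) - weights

    # eigh returns eigenvalues in ascending order, so column 1 is the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, 1]


def select_subset(features, keep_ratio, sigma=1.0):
    """Keep about `keep_ratio` of the batch, spread along the Fiedler ordering.

    Hypothetical selection rule: evenly spaced positions in the sorted order,
    so that every region of the spectral embedding stays represented.
    """
    n = features.shape[0]
    k = min(n, max(1, int(np.ceil(keep_ratio * n))))
    order = np.argsort(fiedler_scores(features, sigma))
    positions = (np.arange(k) * n) // k  # k distinct positions in [0, n)
    return np.sort(order[positions])
```

In a training loop, `keep_ratio` would be the value supplied by the predefined schedule for the current step, and the returned indices would pick the samples of the mini-batch that are actually passed to the learner.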

Keywords

Data efficient training · Dataset pruning · Spectral analysis · Coreset selection · Deep learning · Artificial intelligence

@article{paperid:1107283,
author = {Sharifi, Mohammadreza and Harati, Ahad},
title = {Efficient training of deep networks using guided spectral data selection: a step toward learning what you need},
journal = {Data Mining and Knowledge Discovery},
year = {2026},
volume = {40},
number = {45},
month = {May},
issn = {1384-5810},
pages = {1--27},
numpages = {27},
keywords = {Data efficient training, Dataset pruning, Spectral analysis, Coreset selection, Deep learning, Artificial intelligence},
}

%0 Journal Article
%T Efficient training of deep networks using guided spectral data selection: a step toward learning what you need
%A Sharifi, Mohammadreza
%A Harati, Ahad
%J Data Mining and Knowledge Discovery
%@ 1384-5810
%D 2026
%V 40
%N 45
%P 1-27
