FAST: A New AI Framework Dramatically Cuts Energy Use in Model Training
Researchers have unveiled a novel method for compressing massive AI training datasets that promises to drastically reduce the computational and energy costs of developing deep learning models. The new framework, named FAST, redefines coreset selection—the process of choosing a small, representative subset of data—by solving a fundamental distribution-matching problem that previous approaches could not address. By employing a unique frequency-domain analysis and a progressive sampling strategy, FAST achieves superior accuracy while cutting power consumption by over 96% compared to existing methods, marking a significant leap toward sustainable and efficient AI development.
The Core Challenge in Coreset Selection
Training state-of-the-art deep neural networks (DNNs) requires immense computational resources, often powered by energy-intensive data centers. Coreset selection aims to alleviate this burden by identifying a compact subset of data that retains the full dataset's essential information. However, existing techniques have critical limitations. DNN-based methods are inherently tied to a specific model's architecture and parameters, introducing bias, while DNN-free heuristic methods lack robust theoretical guarantees. A fundamental issue plaguing both approaches is their inability to ensure true distributional equivalence between the coreset and the original dataset.
This failure stems from two primary factors. First, matching a continuous data distribution through discrete sampling has been considered an intractable problem. Second, standard metrics like Mean Squared Error (MSE) or Maximum Mean Discrepancy (MMD) fail to capture crucial higher-order statistical moments, leading to suboptimal and unrepresentative coresets that degrade model performance.
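To see why agreement on low-order statistics is not enough, consider the toy comparison below. This is purely illustrative NumPy, not the paper's code: two samples share the same mean and variance yet differ in skewness, and the gap only becomes visible in their empirical characteristic functions, the frequency-domain object FAST later exploits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 1-D samples with identical mean (0) and variance (1)
# but different higher-order moments (the second one is skewed).
gaussian = rng.standard_normal(50_000)
exponential = rng.exponential(1.0, 50_000) - 1.0   # mean 0, variance 1, skewness 2

print("means:    ", gaussian.mean(), exponential.mean())
print("variances:", gaussian.var(), exponential.var())

# Empirical characteristic function  phi(t) = E[exp(i * t * X)]
def ecf(x, t):
    return np.exp(1j * t * x[:, None]).mean(axis=0)

t = np.linspace(0.5, 4.0, 8)
gap = np.abs(ecf(gaussian, t) - ecf(exponential, t))
print("ECF gap per frequency:", np.round(gap, 3))   # clearly non-zero
```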
The FAST Framework: A Graph-Theoretic and Frequency-Domain Solution
The proposed FAST framework breaks this stalemate by introducing the first DNN-free, distribution-matching coreset selection method with solid theoretical foundations. The researchers reformulate the task as a graph-constrained optimization problem grounded in spectral graph theory, encoding the relationships among data points in a graph so that subset selection becomes a well-defined, constrained optimization rather than a heuristic search.
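The article does not spell out FAST's graph construction, but a common way to impose graph structure on a dataset is a k-nearest-neighbour similarity graph and its normalized Laplacian, the basic object of spectral graph theory. The sketch below is a generic illustration under assumed choices (Gaussian kernel, k, sigma are all placeholders), not FAST's actual formulation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_graph_laplacian(X, k=10, sigma=1.0):
    """Build a k-NN similarity graph over the rows of X and return its normalized Laplacian."""
    d = cdist(X, X)                           # pairwise Euclidean distances
    W = np.exp(-d**2 / (2 * sigma**2))        # Gaussian similarity weights
    # keep only each point's k nearest neighbours (excluding itself)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]
    mask = np.zeros_like(W, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    W = np.where(mask | mask.T, W, 0.0)       # symmetrize the adjacency
    deg = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg + 1e-12))
    return np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt   # normalized graph Laplacian

X = np.random.default_rng(1).normal(size=(200, 16))       # stand-in data
L = knn_graph_laplacian(X)
print(L.shape)
```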
To match distributions accurately, FAST moves beyond traditional metrics and into the frequency domain. It uses the Characteristic Function Distance (CFD), which can, in theory, capture a probability distribution's complete information, including all of its moments, because the characteristic function is the distribution's Fourier transform. However, the team identified a critical flaw in naive applications of CFD: a "vanishing phase gradient" issue in medium- and high-frequency regions that hampers optimization. Their solution is an enhanced Attenuated Phase-Decoupled CFD, which stabilizes the optimization process across all frequencies.
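As a rough picture of what a CFD computation involves, the sketch below estimates the empirical characteristic function of a dataset and of a candidate coreset at a set of sampled frequency vectors and measures their gap. It implements only the vanilla CFD: the frequency-sampling distribution is an arbitrary choice here, and the paper's Attenuated Phase-Decoupled variant is not reproduced.

```python
import numpy as np

def empirical_cf(X, T):
    """Empirical characteristic function of samples X (n, d) at frequency vectors T (m, d)."""
    # phi(t) = (1/n) * sum_j exp(i * <t, x_j>)
    return np.exp(1j * X @ T.T).mean(axis=0)            # shape (m,)

def cf_distance(X_full, X_coreset, T):
    """Monte-Carlo estimate of the characteristic-function distance over frequencies T."""
    diff = empirical_cf(X_full, T) - empirical_cf(X_coreset, T)
    return np.sqrt(np.mean(np.abs(diff) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))                        # stand-in for the full dataset
coreset = X[rng.choice(len(X), 500, replace=False)]     # stand-in coreset (random here)
T = rng.normal(scale=1.0, size=(256, 8))                # sampled frequency vectors
print(cf_distance(X, coreset, T))
```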
Progressive Sampling for Efficient and Accurate Convergence
Ensuring the optimization converges efficiently to a high-quality solution required another innovation. The researchers designed a Progressive Discrepancy-Aware Sampling (PDAS) strategy. This technique intelligently schedules the frequency bands used during optimization, starting with low frequencies to capture the global data structure before progressively incorporating higher frequencies to refine local details.
This phased approach is crucial. It prevents overfitting to noise in the high-frequency bands and enables accurate distribution matching using fewer total frequencies, which accelerates the coreset selection process itself. The result is a method that is both more precise and computationally cheaper to run.
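The exact PDAS schedule is not given in this summary, but the low-to-high progression it describes can be pictured as below: each round draws a band of frequency vectors at a larger scale and re-scores the coreset on the accumulated set. The band scales, counts, and sampling distribution are illustrative placeholders, not the paper's rule.

```python
import numpy as np

def frequency_schedule(dim, n_rounds=4, freqs_per_round=64, seed=0):
    """Yield one band of frequency vectors per round, from low to high frequency scale."""
    rng = np.random.default_rng(seed)
    scales = np.linspace(0.25, 2.0, n_rounds)     # growing frequency scale per round
    for scale in scales:
        yield rng.normal(scale=scale, size=(freqs_per_round, dim))

# Usage: accumulate frequency bands round by round and re-evaluate the distance
# (e.g. cf_distance from the sketch above) on the growing frequency set.
T_so_far = []
for T_band in frequency_schedule(dim=8):
    T_so_far.append(T_band)
    T = np.vstack(T_so_far)
    # score = cf_distance(X, coreset, T)   # re-rank / refine the coreset here
    print("frequencies in use:", len(T))
```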
Benchmark Performance and Energy Efficiency Gains
In extensive experiments across multiple standard benchmarks, FAST demonstrated clear advantages over state-of-the-art coreset selection methods: models trained on its coresets achieved an average accuracy gain of 9.12%.
The efficiency improvements are even more striking. Compared to baseline coreset methods, FAST reduced the power consumption of the training process by 96.57%. The framework itself also runs efficiently, delivering a 2.2x average speedup in the coreset selection phase. These metrics underscore FAST's dual strength: it produces higher-fidelity data subsets that train better models while sharply reducing the environmental and computational cost of AI training.
Why This Matters: Key Takeaways
- Breaks a Theoretical Barrier: FAST is the first method to successfully frame coreset selection as a solvable discrete distribution-matching problem, moving beyond heuristics and model-specific bias.
- Superior Accuracy: By using Characteristic Function Distance in the frequency domain, it captures full distributional information, leading to coresets that improve final model accuracy by over 9% on average.
- Transformative Efficiency: The method slashes the energy footprint of model training by more than 96%, addressing critical sustainability concerns in AI development.
- Faster Processing: Its Progressive Discrepancy-Aware Sampling strategy enables a 2.2x speedup in creating the coreset, making the framework practical for large-scale applications.