The application of energy-saving technologies in parallel computing systems
Abstract
This paper analyzes the architecture of parallel computing systems, their operating principles, and methods for organizing parallel processing. In particular, it investigates the dynamic performance and energy consumption of the OpenBLAS DGEMM function on multi-core processors, using software tools designed to measure energy usage. The study develops methods for improving energy efficiency and applies mathematical models that predict energy consumption in order to assess and optimize the energy cost of computational workloads. It also analyzes the performance metrics of high-efficiency parallel computing blocks and their impact on energy efficiency. Based on these findings, the paper proposes practical recommendations for reducing energy consumption in parallel systems without compromising computational performance.
References
[2] Decree No. PF-5527 of the President of the Republic of Uzbekistan, dated August 28, 2018, "On the Establishment of the National Commission for the Efficient Use of Energy and the Development of Renewable Energy Sources."
[3] Decree No. PF-6079 of the President of the Republic of Uzbekistan, dated October 5, 2020, "On the Development of E-Government and the Integration of State Agencies into Information Systems within the Framework of the 'Digital Uzbekistan-2030' Strategy."
[4] Talbi, E.G. Metaheuristics: From Design to Implementation; John Wiley & Sons: Hoboken, NJ, USA, 2009; Volume 74.
[5] Fahad, M.; Manumachu, R.R. HCLWattsUp: Energy API Using System-Level Physical Power Measurements Provided by Power Meters; Heterogeneous Computing Laboratory, University College Dublin: Dublin, Ireland, 2023.
[6] OpenBLAS. OpenBLAS: An Optimized BLAS Library. Available online: https://github.com/xianyi/OpenBLAS (accessed on 1 December 2022).
[7] Fahad, M.; Shahid, A.; Reddy, R.; Lastovetsky, A. A Comparative Study of Methods for Measurement of Energy of Computing. Energies 2019, 12, 2204.
[8] Top500. The Top500 Supercomputers List. Available online: https://www.top500.org (accessed on 1 December 2022).
[9] Krommydas, K.; Feng, W.C.; Antonopoulos, C.D.; Bellas, N. OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures. Journal of Signal Processing Systems 2016, 85, 373–392.
[10] Kreutzer, M.; Thies, J.; Röhrig-Zöllner, M.; Pieper, A.; Shahzad, F.; Galgon, M.; Basermann, A.; Fehske, H.; Hager, G.; Wellein, G. GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems. International Journal of Parallel Programming 2016, 45, 1046–1072.
[11] Papadrakakis, M.; Stavroulakis, G.; Karatarakis, A. A New Era in Scientific Computing: Domain Decomposition Methods in Hybrid CPU–GPU Architectures. Computer Methods in Applied Mechanics and Engineering 2011, 200, 1490–1508.
[12] Khaleghzadeh, H.; Fahad, M.; Shahid, A.; Reddy, R.; Lastovetsky, A. Bi-Objective Optimization of Data-Parallel Applications on Heterogeneous HPC Platforms for Performance and Energy Through Workload Distribution. IEEE Transactions on Parallel and Distributed Systems 2021, 32, 543–560.
[13] Khaleghzadeh, H.; Reddy, R.; Lastovetsky, A. Efficient Exact Algorithms for Continuous Bi-Objective Performance-Energy Optimization of Applications with Linear Energy and Monotonically Increasing Performance Profiles on Heterogeneous High Performance Computing Platforms. Concurrency and Computation: Practice and Experience 2022, e7285.
[14] FFTW. FFTW: A Fast, Free C FFT Library. Available online: https://www.fftw.org (accessed on 1 December 2022).
[15] Lastovetsky, A.L.; Reddy, R. Data Partitioning with a Realistic Performance Model of Networks of Heterogeneous Computers. In Proceedings of the Parallel and Distributed Processing Symposium, Hong Kong, China, 13–15 December 2004.
[16] Lastovetsky, A.; Reddy, R. Data Partitioning with a Functional Performance Model of Heterogeneous Processors. International Journal of High Performance Computing Applications 2007, 21, 76–90.
[17] Lastovetsky, A.; Twamley, J. Towards a Realistic Performance Model for Networks of Heterogeneous Computers. In High Performance Computational Science and Engineering; Springer: Berlin, Germany, 2005; pp. 39–57.
[18] Khaleghzadeh, H.; Reddy, R.; Lastovetsky, A. A Novel Data-Partitioning Algorithm for Performance Optimization of Data-Parallel Applications on Heterogeneous HPC Platforms. IEEE Transactions on Parallel and Distributed Systems 2018, 29, 2176–2190.
[19] Khaleghzadeh, H.; Fahad, M.; Reddy, R.; Lastovetsky, A. A Novel Data Partitioning Algorithm for Dynamic Energy Optimization on Heterogeneous High-Performance Computing Platforms. Concurrency and Computation: Practice and Experience 2020, 32, e5928.
[20] Fahad, M.; Shahid, A.; Manumachu, R.R.; Lastovetsky, A. Accurate Energy Modelling of Hybrid Parallel Applications on Modern Heterogeneous Computing Platforms Using System-Level Measurements. IEEE Access 2020, 8, 93793–93829.
[21] Lastovetsky, A.; Manumachu, R.R. Energy-Efficient Parallel Computing: Challenges to Scaling. Information 2023, 14, 248.