As big data becomes more critical to society, computer scientists are searching for ways to process it more efficiently. Georgia Tech researchers are tackling this challenge by using machine learning (ML) to make data storage and access easier.
Kleio is a hybrid memory management system that combines ML with more common history-based methods to predict which data will be accessed most frequently. In the researchers' evaluation, it delivers on average 80 percent of the feasible application performance improvements, which shows the potential for applying ML to systems research.
There are two popular memory types: the fast yet expensive dynamic random-access memory (DRAM) and the more affordable but slower non-volatile memory. Researchers often combine their benefits by merging DRAM and non-volatile memory into one system, commonly referred to as heterogeneous memory.
A heterogeneous memory system can be inefficient, however, and must be used strategically to get the maximum benefit.
“There is a difference in access speed, so we need to be clever on where we place the data,” said Thaleia Dimitra Doudali, a fifth-year School of Computer Science (SCS) Ph.D. student.
Most computer scientists put frequently accessed data, or hot data, in DRAM and the rest in non-volatile memory. Hot data is typically identified by which data has historically been accessed most, but this can lead to faulty predictions when access patterns change.
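To make the history-based approach concrete, here is a minimal Python sketch. It is an illustration only, not the researchers' code: the page names, the trace, and the function `hot_pages_by_history` are all made up. It fills DRAM with the pages accessed most in the previous period and then counts how many accesses in the next period actually land in DRAM.

```python
from collections import Counter

def hot_pages_by_history(prev_accesses, dram_capacity):
    """Return the pages accessed most often in the previous period."""
    counts = Counter(prev_accesses)
    return {page for page, _ in counts.most_common(dram_capacity)}

# Two consecutive periods of a made-up access trace (page ids).
period_1 = ["A", "A", "A", "B", "B", "C"]
period_2 = ["C", "C", "C", "D", "D", "A"]

# Placement is decided from the past period only.
dram = hot_pages_by_history(period_1, dram_capacity=2)
# Count how many accesses in the next period are served from DRAM.
hits = sum(1 for page in period_2 if page in dram)
print(sorted(dram), f"{hits}/{len(period_2)} accesses hit DRAM")
```

In this toy trace the access pattern shifts between periods, so pages chosen from history serve only one of the six later accesses from DRAM.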
ML and Memory
The researchers knew that ML was becoming a more popular way to solve systems problems like this and thought it could apply to memory management. They determined that deep recurrent neural networks (RNNs) were the best ML option because they take a sequence of data as input to make a prediction.
“Recurrent neural networks fit perfectly because given past access information, they can predict future access patterns,” Doudali said.
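As a rough illustration of what "a sequence as input" means here, the Python sketch below turns the access history of one page into (past window, next value) pairs, the kind of samples a sequence model could be trained on. The function `make_windows`, the window length, and the access counts are assumptions for illustration; the article does not describe how Kleio actually prepares its training data.

```python
def make_windows(access_counts, history_len):
    """Split a per-period access-count series into (past window, next value) pairs."""
    samples = []
    for end in range(history_len, len(access_counts)):
        past = access_counts[end - history_len:end]   # input sequence for the model
        target = access_counts[end]                   # value the model should predict
        samples.append((past, target))
    return samples

# Made-up access counts of one page over eight scheduling periods.
counts = [0, 5, 0, 5, 0, 5, 0, 5]
samples = make_windows(counts, history_len=3)
for past, target in samples:
    print(past, "->", target)
```

For an alternating pattern like this one, "assume the last period repeats" is wrong in every period, while a model that sees the whole window has enough information to learn the alternation.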
The challenge came in determining which part of the memory access process to predict. The researchers wanted ML prediction for every piece of data. Yet they also needed the system to be practical and scalable.
To achieve both, they created Kleio, a hybrid memory management system that combines ML and historical methods in order to predict the data access frequency and efficiently manage data across the memory components.
How Kleio works
Given the available system resources, Kleio prioritizes ML training for data that can increase application performance if placed in the right memory. Periodically, Kleio combines the historical and ML access pattern predictions to identify hot data in real time and migrate it to DRAM. In this way, Kleio can deliver on average 80 percent of the feasible application performance improvements.
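The hybrid idea can be sketched in a few lines of Python. Everything here is hypothetical: the rule for choosing which pages get an ML prediction, the stand-in predictor `alternating_predictor`, and the data are assumptions made for illustration, and the article does not specify how Kleio selects pages or what its models look like. The sketch spends a limited ML budget on the pages where history has recently been least reliable, uses the last period's count for the rest, and places the top-ranked pages in DRAM.

```python
def hybrid_hot_pages(history, ml_predict, ml_budget, dram_capacity):
    """Pick hot pages using ML for a few selected pages and history for the rest.

    history: dict mapping page -> per-period access counts (oldest first,
             at least two periods per page)
    ml_predict: callable taking a list of counts, returning a predicted count
    """
    # Hypothetical selection rule: spend the ML budget on pages whose last
    # count differed most from the period before, i.e. where "repeat the
    # last period" has recently been least reliable.
    def history_error(page):
        counts = history[page]
        return abs(counts[-1] - counts[-2])

    ml_pages = set(sorted(history, key=history_error, reverse=True)[:ml_budget])

    predicted = {}
    for page, counts in history.items():
        if page in ml_pages:
            predicted[page] = ml_predict(counts)   # ML prediction
        else:
            predicted[page] = counts[-1]           # history: last period repeats

    # Rank pages by predicted access count and keep the top ones in DRAM.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    return set(ranked[:dram_capacity]), ml_pages


def alternating_predictor(counts):
    """Stand-in for a trained model: assumes the pattern repeats every two periods."""
    return counts[-2]


history = {
    "A": [9, 9, 9, 9],
    "B": [0, 8, 0, 8],
    "C": [8, 0, 8, 0],
    "D": [1, 1, 1, 1],
}
hot, ml_pages = hybrid_hot_pages(history, alternating_predictor,
                                 ml_budget=2, dram_capacity=2)
print("ML pages:", sorted(ml_pages), "DRAM pages:", sorted(hot))
```

With this data, history alone would keep pages A and B in DRAM, whereas the hybrid ranking keeps A and C, because the stand-in predictor expects B to go cold and C to become hot in the next period.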
“We showed that the use of ML in memory management is very promising,” Doudali said. “Its accuracy can bridge the performance gap between existing and oracular solutions.”
Doudali presented the work at the 28th International Symposium on High-Performance Parallel and Distributed Computing in Phoenix, Arizona, in June. The paper, "Kleio: A Hybrid Memory Page Scheduler with Machine Intelligence," was honored as a best paper finalist. Doudali co-wrote it with SCS Associate Professor Ada Gavrilovska and Advanced Micro Devices, Inc. researchers Sergey Blagodurov, Abhinav Vishnu, and Sudhanva Gurumurthi.