Machine learning (ML) is one of the most promising fields in computing, but researchers can’t unlock its full potential without organization and oversight. The School of Computer Science’s database and systems researchers can bridge the gap between effective systems and efficient machine learning.
ML and databases
Our researchers provide structure to machine learning that benefits everyone. “Machine learning researchers mostly think about algorithms and theories, but real-world machine learning applications require a data processing pipeline,” said Assistant Professor Xu Chu.
As a database researcher, he realized that database research techniques can facilitate an ML pipeline. For example, approximate query processing can answer SQL queries quickly by running them on a sample of the data. Similarly, Chu is researching approximate model training on only a portion of the training data instead of burdening a model with the complete dataset.
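To make the sampling idea concrete, here is a minimal Python sketch of approximate query processing for a single aggregate, roughly what an approximate `SELECT AVG(amount)` might do. It is an illustration under stated assumptions, not the method used in Chu's research: the `approximate_mean` helper, the synthetic `order_amounts` column, and the 1% sample rate are all hypothetical.

```python
import math
import random
import statistics


def approximate_mean(values, sample_fraction, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin), where margin is an approximate 95%
    confidence half-width based on the normal approximation.
    """
    rng = random.Random(seed)
    sample_size = max(2, int(len(values) * sample_fraction))
    sample = rng.sample(values, sample_size)
    estimate = statistics.fmean(sample)
    margin = 1.96 * statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, margin


# Hypothetical table column: 100,000 synthetic order amounts.
data_rng = random.Random(42)
order_amounts = [data_rng.uniform(5.0, 500.0) for _ in range(100_000)]

# The exact answer scans every row; the approximate one reads only 1% of them.
exact = statistics.fmean(order_amounts)
estimate, margin = approximate_mean(order_amounts, sample_fraction=0.01)

print(f"exact AVG: {exact:.2f}")
print(f"approximate AVG from a 1% sample: {estimate:.2f} +/- {margin:.2f}")
```

The trade-off is the same one behind approximate model training: reading a fraction of the data gives an answer sooner, at the cost of a bounded amount of error.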
Yet the relationship is symbiotic. ML can help ensure data is consistent and easy to use, a process known as data cleaning. Traditionally, researchers clean data with rule-based techniques. Generating these rules is time-consuming and difficult, so Chu is exploring ML-based techniques that leverage the statistical distributions of the data being cleaned.
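The contrast between the two approaches can be sketched in a few lines of Python. This is a simplified illustration, not Chu's technique: the hand-written age rule, the sample `ages` column, and the median-based outlier test (a basic statistical stand-in for a learned model) are all assumptions made for the example.

```python
import statistics


def rule_based_errors(ages):
    """Flag positions that violate a hand-written rule: 0 <= age <= 120."""
    return [i for i, age in enumerate(ages) if not 0 <= age <= 120]


def distribution_based_errors(values, threshold=3.5):
    """Flag positions whose value lies far from the median.

    Distance is measured with the modified z-score, which scales the
    deviation from the median by the median absolute deviation (MAD).
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread in the data, so nothing can be called unusual
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


# Hypothetical column with one data-entry error (350 instead of 35).
ages = [34, 29, 41, 38, 27, 33, 36, 30, 350, 31]

print("rule-based flags:", rule_based_errors(ages))
print("distribution-based flags:", distribution_based_errors(ages))
```

The rule has to be written by someone who knows what a valid age looks like, while the distribution-based check derives its notion of "unusual" from the column itself.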
ML and systems
Systems also play a vital role in the predictive elements of ML. “Machine learning is still in its infancy from a systems perspective,” Professor Ling Liu said. Our researchers have found ways both to build systems for ML and to apply ML to systems, improving performance on both sides.
Associate Professor Hyesoon Kim boosts the coverage and accuracy of distributed ML systems. Internet of Things (IoT) devices, such as thermostats or security cameras, rely on ML to detect anomalies in data. Yet analyzing the data on a company’s main server or in the cloud could compromise security. Distributing the work across IoT devices allows researchers to do computations faster with less power.
Liu creates new systems optimization techniques and libraries for more effective ML. She also develops enhanced ML models and algorithms to improve the performance of cloud systems. This dual perspective, systems for ML and ML for systems, is only the start of her research.
ML is powerful, but it must be used responsibly. Liu wants to raise awareness of the need for fairness, privacy, trust, and accountability in algorithmic decision-making in ML and AI. “We need to encourage innovation in diverse ML algorithms that can extract features and learn different hidden correlations and perspectives over the same datasets,” Liu said. “It is also important to leverage ML as a valuable reference point, rather than the sole factor in decision-making.”
Our hybrid expertise builds stronger ML that can be used in all areas of computing.
Illustrations by Pikisuperstar/Freepik.com