The research area of Data Science Platforms, Machine Learning Systems, and Databases focuses on creating and optimizing the tools and infrastructure needed for data-driven analysis and decision-making. Data Science Platforms provide integrated environments that streamline the entire data analysis process, offering tools for coding, data exploration, and model building, often with cloud scalability and collaboration features. These platforms aim to make data science more accessible and efficient, in part through automation features such as AutoML.

Machine Learning Systems are designed to facilitate the creation, training, and deployment of machine learning models. These systems focus on training models efficiently, deploying them in production environments, and ensuring their interpretability and fairness. They also emphasize integrating machine learning models with existing data pipelines to enable real-time or batch processing of data, which is crucial for applying machine learning in practical, operational contexts.

Databases are the backbone of this research area, providing the necessary infrastructure for storing, retrieving, and processing massive datasets. Work in this area includes developing big data technologies, designing query optimization techniques, and ensuring data security and privacy. Databases must integrate seamlessly with analytical tools to support efficient data extraction and processing, enabling machine learning systems and data science platforms to function effectively. Together, these fields form the core of modern data science, driving innovation and efficiency across various industries.


Faculty

Highlights

https://highlights.cis.upenn.edu/category/research/databases