An Empirical Exploration of Python Machine Learning API Usage
Machine learning is becoming an increasingly important part of many domains, both inside and outside of computer science. As a result, more developers are learning to write machine learning applications in languages like Python, using application programming interfaces (APIs) such as pandas and scikit-learn. However, these APIs are complex and can be challenging to learn, especially for new programmers. To create better tools for assisting developers with machine learning APIs, we need to understand how these APIs are currently used. In this thesis, we present a study of machine learning API usage in Python code, based on a corpus of machine learning projects hosted on Kaggle, a machine learning education and competition community site. We first analyzed which machine-learning-related libraries, and which sub-modules of those libraries, are used most frequently. Next, we studied which calls developers use to solve machine learning tasks. We also identified which libraries are used in combination and discovered a number of cases where libraries were imported but never used. We end by discussing potential next steps for further research and development based on our results.
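As an illustration of the kind of analysis the abstract describes, the following minimal sketch counts imported libraries across scripts and flags libraries that are imported but never referenced. It is not the thesis's actual tooling: the function names (imported_libraries, unused_libraries, library_counts) and the sample script are hypothetical, and the approach assumes that static inspection with Python's standard ast module is an acceptable approximation of usage.

```python
import ast
from collections import Counter


def imported_libraries(source):
    """Map each name bound by an import to its top-level library."""
    bound = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                top = alias.name.split(".")[0]
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                bound[alias.asname or top] = top
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            top = node.module.split(".")[0]
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports bind unknown names; skipped here
                bound[alias.asname or alias.name] = top
    return bound


def unused_libraries(source):
    """Return libraries whose imported names are never referenced."""
    bound = imported_libraries(source)
    used_names = {
        node.id for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Name)
    }
    used_libs = {lib for name, lib in bound.items() if name in used_names}
    return set(bound.values()) - used_libs


def library_counts(sources):
    """Count, per library, how many scripts import it at least once."""
    counts = Counter()
    for source in sources:
        counts.update(set(imported_libraries(source).values()))
    return counts


# Hypothetical sample script; it is only parsed, never executed.
sample = """
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

df = pd.DataFrame({"x": [1, 2], "y": [2, 4]})
model = LinearRegression().fit(df[["x"]], df["y"])
"""

print(sorted(unused_libraries(sample)))
print(library_counts([sample, "import pandas"]).most_common())
```

Because the scripts are parsed rather than run, none of the analyzed libraries needs to be installed. The sketch ignores relative imports, star imports, and names referenced only inside strings, so its results are an approximation.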
Vilkomir, Aleksei. (November 2020). An Empirical Exploration of Python Machine Learning API Usage (Master's Thesis, East Carolina University). Retrieved from the Scholarship. (http://hdl.handle.net/10342/8796)
Vilkomir, Aleksei. An Empirical Exploration of Python Machine Learning API Usage. Master's Thesis. East Carolina University, November 2020. The Scholarship. http://hdl.handle.net/10342/8796. January 15, 2021.
Vilkomir, Aleksei, “An Empirical Exploration of Python Machine Learning API Usage” (Master's Thesis, East Carolina University, November 2020).
Vilkomir, Aleksei. An Empirical Exploration of Python Machine Learning API Usage [Master's Thesis]. Greenville, NC: East Carolina University; November 2020.
East Carolina University