We used Reddit's API to scrape posts from the Microsoft Excel subreddit. We first scraped random posts and separated them into themes. We then searched the forum for terms related to scalability issues, such as "slow" and "crash", and categorized the resulting posts.
We found that, of the 712 posts we collected, 83 related to scalability, and that these scalability posts, as well as the random posts, fell into five main themes: importing data, managing data, querying data, presenting data, and miscellaneous. From these results, we discuss possible ways to improve Microsoft Excel or to inform the development of other spreadsheet tools meant to handle large quantities of data.
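As a rough illustration of the keyword-search step described above, the sketch below builds a subreddit-restricted search URL and flags posts that mention a scalability term. It is not the code used in the study: the subreddit name `excel`, the helper names `build_search_url` and `mentions_scalability`, and the use of Reddit's public JSON search listing are assumptions made for this example, and only the two search terms quoted above are included. The thematic categorization itself is not automated here.

```python
from urllib.parse import urlencode

# Only the two terms quoted in the text; the study may have used others.
SCALABILITY_TERMS = ("slow", "crash")


def build_search_url(term, subreddit="excel", limit=100):
    """Build a search URL restricted to one subreddit.

    Assumes Reddit's public JSON listing at /r/<subreddit>/search.json;
    the subreddit name "excel" is an assumption for this sketch.
    """
    query = urlencode({"q": term, "restrict_sr": 1, "limit": limit})
    return f"https://www.reddit.com/r/{subreddit}/search.json?{query}"


def mentions_scalability(post, terms=SCALABILITY_TERMS):
    """Return True if the post's title or body contains any search term.

    `post` is a dict with optional "title" and "selftext" keys, mirroring
    the fields of a Reddit post. Matching is a case-insensitive substring
    check, so "crash" also matches "crashes".
    """
    text = f"{post.get('title', '')} {post.get('selftext', '')}".lower()
    return any(term in text for term in terms)


if __name__ == "__main__":
    for term in SCALABILITY_TERMS:
        print(build_search_url(term))
```

Fetching these URLs is left out on purpose: real requests are subject to Reddit's API terms, authentication, and rate limits. A substring filter of this kind only narrows the pool of candidate posts; assigning each post to a theme still requires reading it.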
Kelly Mack, John Lee, Kevin Chen-Chuan Chang, Karrie Karahalios, and Aditya Parameswaran. Characterizing Scalability Issues in Spreadsheet Software using Online Forums. CHI 2018.
Kelly Mack- firstname.lastname@example.org
John Lee- email@example.com