Understanding Biased Algorithms and Users' Behavior around Them in Rating Platforms
Algorithms are powerful. They collect, process, and present information in today's online world, and in doing so they exert influence over users' interaction with the system. Rating algorithms are one example: their outputs - i.e., business ratings based on online reviews - significantly impact users' behavior and, accordingly, the success of a business. A simple half-star improvement on a Yelp rating, for instance, results in a 30-49% higher likelihood of a restaurant selling out its seats (Anderson and Magruder 2012).
While rating algorithms are influential, little about how they work is made public. These algorithms' internal processes, and sometimes their inputs, are usually housed in black boxes, both to protect intellectual property and to prevent reviewers from gaming business ratings. Computing a raw average of users' reviews is a simple way to calculate a business rating; however, many rating platforms do not use this approach. Amazon, for instance, calculates a product's overall rating by taking into account factors including the age of the review, helpfulness votes, and whether the reviews are from verified purchases (Bishop 2015). Some other platforms, like Yelp (Yelp 2010), calculate a business rating by computing a raw average of customer reviews, but only of reviews that their rating algorithms classify as authentic or "not fake." While the general approach of these algorithms is known, their details are proprietary.
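To make the difference between these approaches concrete, here is a minimal sketch contrasting a raw average with an average taken only over reviews that pass an authenticity filter. It does not reproduce any platform's actual algorithm, since those details are proprietary: the `Review` type, the `flagged_fake` field, and the sample reviews are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    stars: float
    # Hypothetical flag standing in for the output of a platform's
    # (proprietary, unknown) authenticity classifier.
    flagged_fake: bool = False


def raw_average(reviews: List[Review]) -> float:
    """Plain mean over every submitted review."""
    if not reviews:
        raise ValueError("no reviews to average")
    return sum(r.stars for r in reviews) / len(reviews)


def filtered_average(reviews: List[Review]) -> float:
    """Mean over only the reviews not flagged as fake."""
    kept = [r for r in reviews if not r.flagged_fake]
    if not kept:
        raise ValueError("no authentic reviews to average")
    return sum(r.stars for r in kept) / len(kept)


# Made-up example: two suspicious 5-star reviews and three ordinary ones.
reviews = [
    Review(5, flagged_fake=True),
    Review(5, flagged_fake=True),
    Review(2),
    Review(3),
    Review(1),
]

print(raw_average(reviews))       # mean over all five reviews
print(filtered_average(reviews))  # mean over the three unflagged reviews
```

Even in this toy case the two rules produce noticeably different ratings from the same reviews, which is why the choice of rule, and its opacity, matters to businesses and users.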
The power and opacity of algorithmic rating systems have raised concerns about the bias they might introduce into online ratings. As an example, in May 2016, Australian Uber drivers accused the company of slowly decreasing their ratings in order to suspend them and then charge higher commissions for reinstatement. The president of the Ride Share Drivers' Association of Australia noted that "the lack of transparency makes it entirely possible for Uber to manipulate the ratings" (Tucker 2016). Other algorithmic rating systems such as Yelp (Fowler 2011) and Fandango (Hickey 2015) have faced similar criticisms. Recent studies have investigated users' awareness of and interaction with algorithms, including news feed curation algorithms (Eslami et al. 2015; 2016; Rader and Gray 2015) and ridesharing management algorithms (Lee et al. 2015). However, how users perceive and manage the bias that an algorithm brings to their online experience is still an open question.
In this project, we try to fill this gap by investigating algorithmic bias, and users' awareness of and behavior around that bias, in hotel rating platforms. An initial study suggested that a potential bias on a hotel rating platform (Booking.com) skewed low review scores upwards. To analyze this potential bias, we used a cross-platform audit technique, comparing the outputs of Booking.com with those of two other popular hotel rating platforms. Analyzing the ratings of 803 hotels showed that Booking.com's rating system biased the ratings of hotels, particularly low-to-medium quality hotels, to be significantly higher than on the other platforms (by up to 37%).
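The idea behind a cross-platform comparison can be sketched as follows. This is only an illustration under stated assumptions, not the study's actual procedure: it assumes each hotel's scores have already been normalized to a common scale, measures inflation relative to the mean of the reference platforms, and uses made-up toy scores. The function name `relative_inflation` and the hotel keys are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def relative_inflation(audited: float, reference: List[float]) -> float:
    """Percent by which the audited platform's score exceeds the mean
    score of the reference platforms (all on the same scale)."""
    base = mean(reference)
    return 100.0 * (audited - base) / base


# Toy, made-up scores on an assumed common 0-10 scale; NOT data from the study.
hotels: Dict[str, dict] = {
    "hotel_a": {"audited": 6.0, "reference": [4.0, 5.0]},  # lower-rated hotel
    "hotel_b": {"audited": 9.0, "reference": [9.0, 9.0]},  # higher-rated hotel
}

inflation = {
    name: relative_inflation(h["audited"], h["reference"])
    for name, h in hotels.items()
}

for name, pct in inflation.items():
    print(f"{name}: {pct:.1f}% above the reference platforms")
```

Grouping such per-hotel differences by quality tier is one way a gap concentrated among low-to-medium quality hotels would become visible.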
We then employed a mixed-method design to study users' behavior around this bias. First, we applied a computational technique to identify the users who noticed the bias; next, we conducted a qualitative analysis of their reviews to understand how they behaved around it. We found 162 users who independently discovered the algorithm's bias through their regular use of the site. These users, rather than contributing the usual review content (i.e., informing other users about their hotel stay experience), adopted an "auditing" practice. When confronted with a higher-than-intended review score, they used their review to raise other users' awareness of the bias. To do so, they wrote about how they tried to manipulate the algorithm's inputs to look into its black box, attempted to correct the bias manually, and experienced a breakdown of trust.
M. Eslami, K. Vaccaro, K. Karahalios, and K. Hamilton. "Be careful; things can be worse than they appear": Understanding Biased Algorithms and Users' Behavior around Them in Rating Platforms. The International AAAI Conference on Web and Social Media (ICWSM), 2017.