How customer feedback surveys perpetuate workplace inequality

And what policymakers can do about it

Introduction

Much of the conversation about workplace discrimination centers on employer bias—whether in hiring, pay, or promotion. Yet as customer reviews and ratings become increasingly critical for shaping work hours, roles, and compensation, an overlooked question arises: How might customers, rather than employers, drive inequality in the workplace?

In a recent study of a nationwide American casual-dining restaurant chain, my coauthors, Masoud Kamalahmadi and Yong-Pin Zhou, and I uncovered clear evidence that customers may discriminate against workers based on race and gender (Kamalahmadi et al. 2023). These biases can materially affect servers’ livelihoods—particularly in an industry where schedules, wages, and promotion opportunities increasingly hinge on customer survey results. More strikingly, women, although they make up the majority of the restaurant server workforce, still receive lower ratings than men, even when their performance and all other relevant factors are the same. These biases undermine key workplace ideals of fairness and meritocracy, often resulting in a compounded disadvantage over time.

In this article, I discuss our main findings, highlight the mechanisms driving these biases, and suggest ways to design policies and management strategies that mitigate the harm such biases inflict upon workers. Throughout, we use the terms discrimination and bias interchangeably, defining them as the unfavorable treatment of workers based on race or gender.

Background: The shift from employer-driven to customer-driven evaluations

In service industries, employers have historically been the arbiters of hours, promotions, and pay. Over the last decade, however, a wave of digitization has given rise to online feedback forms, tablet-based survey tools, and app-based rating systems, which in turn give customers direct influence over how workers are evaluated. This dynamic is especially pronounced across a broad spectrum of frontline service roles—from restaurants, retail shops, and hotels to gig economy apps, telehealth platforms, and virtual call centers—where workers’ schedules, earnings, and even career growth can hinge on real-time customer ratings.

While this might offer a more “democratic” means of gauging service quality, it also raises uncomfortable questions: What if customers bring their own biases into these reviews? And how can companies ensure that an employee’s hours, pay, or promotions aren’t unfairly determined by race or gender stereotypes?

Recent research indicates that women in traditionally male-dominated roles—doctors (Hekman et al. 2010), college professors (Boring 2017), and financial advisors (Botelho and Abraham 2017)—often receive lower ratings. This pattern is frequently explained by role congruity theory (Eagly and Karau 2002), which suggests that customers perceive these professions as aligned with stereotypically male traits and, therefore, evaluate female professionals more harshly. By that logic, one might expect men to face a similar bias in female-dominated fields, such as restaurant service. Yet our study challenges this assumption: Even in a setting where women make up the majority of the workforce, female servers still receive lower customer ratings than men. This suggests that customer bias can persist regardless of whether a profession is male- or female-dominated—underscoring the depth and breadth of race and gender stereotypes in the modern service economy.

Data and methodology

Our analysis focuses on a large, full-service casual dining restaurant chain with over 500 locations across the United States. This chain serves a diverse customer base that closely reflects the demographic makeup of the overall U.S. population. At this restaurant chain, customers paid their bills using tabletop tablets. After payment, they were invited to complete a brief customer satisfaction survey, which included a key question: “How likely are you to recommend this restaurant to others?” Customers answered using a scale from 0 (“Not at all likely”) to 10 (“Extremely likely”). Restaurant managers used this rating as a key performance metric to evaluate server performance.

We chose the restaurant industry for this study because it is one of the largest private-sector employers in the United States, accounting for roughly 10% of the national workforce (National Restaurant Association 2023). The industry is also notably diverse: women make up 70% of servers, and racial minorities represent 35% of the server workforce (Shierholz 2014). With increasing digitization, restaurants now rely heavily on customer ratings to assess server performance. Low ratings can lead to fewer scheduled hours, less desirable shifts, lower-paying roles, or even job termination (Darrah 2021). Given the significant career implications of these ratings—and the widespread use of this practice—it is crucial to examine whether customer biases influence these evaluations and to explore the mechanisms driving such biases.

To build our dataset, we combined information from four key sources covering a nine-month period (January to September 2016) across 25 restaurant locations in the Pacific Northwest (a brief sketch after the list illustrates how these sources can be linked):

  • Point-of-sale (POS) transaction data covering over 1.46 million meal orders. This data includes details such as the amount spent, items ordered, meal duration, and which server handled each check.
  • Customer satisfaction surveys, with 259,281 completed responses—an 18% response rate, which is relatively high for customer surveys of this kind (Chung 2022).
  • Employee records, documenting each worker’s job title, hire date, and employment status (full-time or part-time).
  • Work schedules, capturing both the shifts offered to employees and the shifts they actually worked.
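
To make the linkage concrete, the sketch below shows one way these four sources could be joined in pandas. The file and column names are hypothetical stand-ins, not the study's actual schema.

```python
# A minimal sketch of linking the four sources; file and column names are
# illustrative, not the study's actual schema.
import pandas as pd

pos = pd.read_csv("pos_transactions.csv")   # check_id, server_id, store_id, check_total, meal_minutes, ts
surveys = pd.read_csv("surveys.csv")        # check_id, rating (the 0-10 recommendation score)
employees = pd.read_csv("employees.csv")    # server_id, job_title, hire_date, full_time
schedules = pd.read_csv("schedules.csv")    # server_id, store_id, shift_start, shift_end

# Each survey is tied to a check, and each check to the server who handled
# the table, so every rating can be attributed to an individual server.
df = (
    surveys
    .merge(pos, on="check_id", how="left")
    .merge(employees, on="server_id", how="left")
)
# `schedules` would be joined on server and shift window to build the
# workload and fatigue controls described below.

# Example derived control: server tenure (in days) at the time of the meal.
df["ts"] = pd.to_datetime(df["ts"])
df["hire_date"] = pd.to_datetime(df["hire_date"])
df["tenure_days"] = (df["ts"] - df["hire_date"]).dt.days
```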

Although our dataset does not include employees’ race or gender directly, it provides servers’ full first and last names, which can reliably indicate both attributes. We use two widely adopted algorithms—NamSor and NamePrism—to infer race and gender (Barrios et al. 2020; Munz et al. 2020). NamSor, trained on global name datasets (e.g., Olympic athletes), reports over 95% gender accuracy and around 75% average accuracy across U.S. ethnicities. NamePrism, trained on 74 million labeled names from email contact lists, assigns probabilistic ethnic group labels (Ye et al. 2017). When the two tools disagree, we manually review the names using Google Images and assign race and gender based on consensus. Importantly, our results remain robust when using each algorithm independently.
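
The consensus step reduces to a few lines of logic. In the sketch below, `namsor_predict` and `nameprism_predict` are hypothetical wrappers standing in for calls to the two services; their real APIs are not reproduced here.

```python
# Illustrative consensus logic only; the two predictor arguments are
# hypothetical wrappers around the NamSor and NamePrism services.
def infer_attributes(first_name, last_name, namsor_predict, nameprism_predict):
    a = namsor_predict(first_name, last_name)    # e.g., {"gender": "F", "race": "White"}
    b = nameprism_predict(first_name, last_name)
    consensus = {}
    for attr in ("gender", "race"):
        if a[attr] == b[attr]:
            consensus[attr] = a[attr]   # tools agree: accept the label
        else:
            consensus[attr] = None      # disagreement: flag for manual review
    return consensus
```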

Our detailed data links each customer rating directly to the server responsible for that table. It also allows us to account for many factors that could influence customer ratings—such as how long servers have worked at the restaurant, whether they are full-time or part-time, their sales skills, and how quickly they serve customers. We also account for operational factors like how busy the servers were during their shifts, their schedules, and whether they might have been fatigued after long hours. On top of that, we adjust for broader situational factors, such as how crowded the restaurant was, which location the shift took place at, and time trends (hour of the day, day of the week, and week of the year). Controlling for these factors ensures that differences in ratings are not mistakenly attributed to race or gender when they are actually due to differences in experience, workload, shift conditions, or other factors.
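
In econometric terms, this amounts to regressing ratings on server race and gender alongside a rich set of controls and fixed effects. The stylized specification below, written with statsmodels and invented column names (`minority` and `female` as 0/1 indicators), conveys the structure; the paper's actual model is more detailed.

```python
# A stylized version of the ratings regression; every column name here is
# invented, and the real specification is richer.
import statsmodels.formula.api as smf

model = smf.ols(
    "rating ~ minority + female"                                     # 0/1 server indicators
    " + tenure_days + full_time + sales_per_check + service_speed"   # experience and skill
    " + tables_assigned + hours_into_shift"                          # workload and fatigue
    " + store_busyness"                                              # situational conditions
    " + C(store_id) + C(hour_of_day) + C(day_of_week) + C(week_of_year)",  # fixed effects
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["server_id"]})  # cluster SEs by server

print(model.params[["minority", "female"]])  # adjusted rating gaps
```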

Although customer characteristics—like their demographics—could also affect ratings, this is less of a concern in our study because of how customers are assigned to servers. At this restaurant chain, customers are assigned to servers through a round-robin system, so each server interacts with a similar mix of customers. This effectively random assignment makes it unlikely that certain types of customers only interact with certain types of servers, reducing the risk of biased comparisons. To further ensure our analysis is reliable, we apply a statistical adjustment known as the Heckman correction (Heckman 1979), which accounts for the fact that not every customer chooses to complete a survey and corrects for any potential bias caused by hearing only from certain types of customers.
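
For readers interested in the mechanics, below is the textbook two-step version of the correction: a probit model of who responds, followed by an outcome regression that adds the resulting inverse Mills ratio. The selection covariates shown are placeholders, not the study's actual specification.

```python
# Two-step Heckman (1979) selection correction, sketched with placeholder
# variables on a hypothetical `all_checks` dataframe.
import statsmodels.formula.api as smf
from scipy.stats import norm

# Step 1: probit for whether a check produced a completed survey,
# estimated on ALL checks (respondents and non-respondents alike).
sel = smf.probit(
    "responded ~ check_total + party_size + C(store_id)", data=all_checks
).fit()
xb = sel.fittedvalues                                  # linear predictor X'b
all_checks["inv_mills"] = norm.pdf(xb) / norm.cdf(xb)  # inverse Mills ratio

# Step 2: outcome regression on responding checks only; the inverse Mills
# ratio absorbs selection into responding.
rated = all_checks[all_checks["responded"] == 1]
out = smf.ols(
    "rating ~ minority + female + inv_mills + C(store_id)", data=rated
).fit()
```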

Even with these safeguards in place, we recognize that some individual characteristics of servers—such as their accents, appearance, or personality—are not captured in our data. These factors could influence customer ratings and may also be correlated with a server’s race or gender. Ignoring these possibilities could lead to overstating the role of customer bias.

To directly assess whether customer discrimination is driving the differences we observe, we apply two additional tests:

  1. Comparing loyalty-program and non-loyalty-program customers. Because customers are randomly assigned to servers, the same mix of servers interacts with both loyal customers (those enrolled in the restaurant’s loyalty program) and non-loyal customers. If hidden traits of servers—like personality or appearance—were the main explanation for rating differences, those hidden traits would influence both groups in the same way. However, if racial and gender disparities in ratings vary between loyal and non-loyal customers, this suggests that customer perceptions, rather than unmeasured server characteristics, are responsible for the differential disparity. Loyal customers visit the restaurants more often and therefore become more familiar with both the establishment and its staff than non-loyal customers. According to the classic theory of statistical discrimination (Phelps 1972; Arrow 1972), when customers lack direct information about a server’s quality, they lean on group‑based stereotypes. As uncertainty decreases for loyal customers through repeated interactions and growing familiarity, racial or gender biases should diminish relative to non-loyal customers.
  2. Controlling for past performance. We also examine how a server’s race and gender affect their ratings after accounting for their previous ratings. In an unbiased system, past ratings should already reflect a server’s ability to earn high scores—capturing both observable and unobservable skills, such as communication and interpersonal effectiveness. Therefore, if racial or gender disparities persist even after adjusting for these past scores, we can reject the null hypothesis of no discrimination, which in turn provides additional evidence that customer bias is playing a role. A similar approach was previously applied in Fryer et al. (2013) and Wagner et al. (2016) when studying discrimination in the context of offered wages and teaching evaluations (both tests are sketched below).
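
In regression terms, the two tests are simple augmentations of the baseline model, sketched below with invented variable names.

```python
# Hedged sketches of the two tests; variable names are illustrative.
import statsmodels.formula.api as smf

controls = "tenure_days + full_time + C(store_id) + C(hour_of_day)"  # abbreviated

# Test 1: interact race and gender with loyalty membership. Smaller gaps
# among loyal (better-informed) customers are consistent with statistical
# discrimination rather than unobserved server traits.
m1 = smf.ols(
    f"rating ~ (minority + female) * loyalty_member + {controls}", data=df
).fit()
print(m1.params.filter(like=":loyalty_member"))  # differential disparities

# Test 2: condition on each server's own past average rating, which should
# already absorb stable observable and unobservable skill differences.
m2 = smf.ols(
    f"rating ~ minority + female + past_avg_rating + {controls}", data=df
).fit()
print(m2.params[["minority", "female"]])
```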

Together, these steps allow us to isolate the effect of race and gender on customer ratings from other potential explanations.

Key findings

Our analysis provides evidence that customer ratings are biased against racial minority servers and, more strikingly, also against female servers—even though women make up the majority of the restaurant workforce. On average, minority servers receive ratings that are 0.6% lower than those of white servers, while female servers receive ratings that are 0.8% lower than male servers. The bias is more pronounced when considering the combined effects of race and gender. White female servers and minority servers (both male and female) receive ratings between 1.0% and 1.3% lower than white male servers. Additionally, female servers of all races are two percentage points less likely to receive a perfect score.

In a system where most servers receive near-perfect ratings, these differences matter. In our data, the median server’s average rating was 8.75 on a 0 to 10 scale. A 0.1-point drop pushes a server from the 50th to the 42nd percentile, raising the risk of losing shifts or receiving worse schedules. A 0.1-point gain, by contrast, moves a server to the 60th percentile, unlocking better shifts and opportunities. Over time, these biases compound, limiting women’s and minorities’ chances for promotion into higher-paying or managerial roles (Agars 2004).
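
A toy simulation makes the compression point concrete: when most scores bunch near the top of the scale, a 0.1-point shift translates into many percentile ranks. The distribution below is fabricated for illustration and does not reproduce our data.

```python
# Toy illustration of rank compression near the top of the scale; the
# simulated distribution is invented.
import numpy as np
from scipy.stats import percentileofscore

rng = np.random.default_rng(0)
avg_ratings = np.clip(rng.normal(loc=8.7, scale=0.6, size=10_000), 0, 10)

for score in (8.65, 8.75, 8.85):
    pct = percentileofscore(avg_ratings, score)
    print(f"average rating {score}: percentile rank {pct:.0f}")
```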

Why does it matter?

Legal gray area. Current anti-discrimination laws primarily focus on employer practices, not on customer biases. Yet as modern feedback tools increasingly shape employee evaluations, it’s vital for lawmakers to recognize how customer attitudes may perpetuate workplace inequalities.

Practical toll on workers. Reduced hours, poorer shift assignments, and fewer promotion opportunities are more likely when an employee’s score slips. Even a small difference repeated over hundreds of shifts quickly creates large wage gaps.

Organizational reputation and profitability. From a strictly financial standpoint, ignoring discrimination hurts staff morale, increases turnover, and can damage a company’s public image (Nunez-Smith et al. 2009; Heiserman and Simpson 2023). Conversely, addressing bias can stabilize the workforce and enhance service quality, as employees feel supported and fairly evaluated (Heskett et al. 1994; Hogreve et al. 2017).

Management and policy implications

Below, I offer actionable steps that businesses and policymakers can take to mitigate customer bias in ratings and its impact on frontline workers.

Management implications

Provide more objective information to reduce racial stereotypes. Based on our analysis, racial bias in customer evaluations is closely linked to uncertainty about the service experience, aligning with the theory of statistical discrimination (Phelps 1972; Arrow 1972). As such, mitigating customer racial biases should focus on providing reliable, objective information about servers’ performance and abilities. For example, ride-sharing platforms such as Uber and Lyft already provide passengers with key information about drivers, including their name, photo, tenure, and the number of completed rides. Similarly, many restaurants collect customer satisfaction data electronically, creating an opportunity to share objective performance metrics with diners. Restaurants could display information such as a server’s tenure, total number of customers served, or specialized training alongside their name and photo via digital receipts or tablets. By equipping customers with this objective data, restaurants may help mitigate racial biases and foster fairer evaluations of servers’ performance.

Address entrenched gender status hierarchies (internal actions). Providing customers with objective performance information may help mitigate racial bias. This approach, however, is likely to be less effective in addressing gender bias, which our analysis suggests is rooted in status‑based discrimination rather than information gaps (Foschi 2000). Employment in fine‑dining venues remains male‑dominated, signaling that men hold higher status in these roles and reinforcing outdated stereotypes (Hall 1993). Equally qualified women are 39% less likely to receive job offers and 35% less likely to be invited for interviews at upscale restaurants compared to men (Neumark et al. 1996). Management can disrupt this pattern by adopting fair, transparent hiring and promotion practices—blind resume reviews, standardized interview questions, and structured performance tests—that have reduced bias in other sectors (Goldin and Rouse 2000). Actively recruiting and advancing women into visible front‑of‑house roles will, over time, normalize their presence, erode the notion that men are “naturally” better suited to prestige service, and signal fairness to both staff and patrons. The payoff is not merely ethical: Greater gender diversity strengthens brand reputation, broadens the customer base, and boosts morale and retention (Hunt et al. 2015).

Use multiple metrics in performance evaluations. To improve fairness in performance evaluations, restaurants—if they have not already—should adopt a multi-metric approach that combines subjective customer feedback with objective performance data, such as sales performance, order accuracy, and table turnover rates. Relying solely on customer ratings—known to be influenced by biases related to race, gender, and age—risks distorting evaluations and unfairly penalizing certain workers. By blending subjective and objective metrics, managers gain a more complete and more accurate picture of employee performance while reducing the risk that customer prejudice overshadows actual service quality. In addition to promoting fairness, this approach can strengthen workplace culture, improve employee morale, and enhance staff retention—ultimately contributing to better service and business performance over time.
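
As one illustration, a composite score might standardize each metric and weight the customer rating as a single input among several. The weights and metric names below are arbitrary choices for the sketch, not recommendations from the study.

```python
# One possible blended evaluation score; weights and metrics are illustrative.
import pandas as pd

def zscore(s: pd.Series) -> pd.Series:
    # Standardize so that no single metric's scale dominates the blend.
    return (s - s.mean()) / s.std()

def composite_score(perf: pd.DataFrame) -> pd.Series:
    return (
        0.40 * zscore(perf["customer_rating"])    # subjective, bias-prone
        + 0.25 * zscore(perf["sales_per_check"])  # objective
        + 0.20 * zscore(perf["order_accuracy"])   # objective
        + 0.15 * zscore(perf["tables_per_hour"])  # objective
    )
```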

Policy implications

Treat biased customer-rating systems as Title VII selection devices. Regulators have long enforced laws prohibiting discrimination by employers (this is the cornerstone of Title VII of the Civil Rights Act of 1964). However, customer-driven bias remains largely unchecked, even though customer evaluations increasingly influence decisions around employee scheduling, promotions, and pay. The Equal Employment Opportunity Commission, which enforces Title VII of the Civil Rights Act, could issue guidance clarifying that when employers translate customer scores into scheduling, discipline, or promotion decisions, those scores function as an employment test subject to disparate-impact review. Reframing ratings this way would close a critical gap that legal scholars note in current antidiscrimination doctrine, which rarely reaches “consumer-sourced” bias (Cunningham-Parmeter 2023). Firms would then bear the burden of showing that customer scores are job-related and that no less-discriminatory alternative (for example, a composite index that down-weights ratings) would serve the same business need.

Mandate annual bias audits of customer-rating systems. Policymakers could require businesses to analyze customer ratings for patterns of bias based on race, gender, age, or other protected characteristics and demonstrate how they actively mitigate or correct for such bias when using these ratings in employment decisions. This might include adjusting evaluation processes, incorporating objective performance metrics, or applying statistical techniques to detect and mitigate patterns of biased scoring. For example, New York City’s Local Law 144 (2023) tackles algorithmic bias in hiring by requiring annual bias audits of AI-driven employment tools—such as resume screeners—and public disclosure of the results. As one of the first laws of its kind, it highlights the risks of embedding human biases into software and mandates independent audits to assess disparate impact, ensuring transparency and accountability in AI hiring. While fine-dining restaurants may not use AI hiring tools, New York’s approach offers a broader model for fairness in evaluations. Similar measures—such as auditing customer ratings or limiting reliance on subjective feedback—could enhance equity in hospitality and service industries.
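
One simple audit statistic is the EEOC-style “four-fifths” ratio applied to customer ratings, treating “rating clears the scheduling threshold” as the favorable outcome. The threshold and column names below are hypothetical.

```python
# Four-fifths audit of customer ratings; threshold and columns are hypothetical.
import pandas as pd

def four_fifths_ratio(ratings: pd.DataFrame, group_col: str, threshold: float = 9.0):
    # Share of ratings clearing the threshold, by demographic group.
    pass_rates = (ratings["rating"] >= threshold).groupby(ratings[group_col]).mean()
    ratio = pass_rates.min() / pass_rates.max()
    return pass_rates, ratio  # a ratio below 0.8 flags potential disparate impact
```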

Guarantee a worker right of appeal before adverse action. Recent litigation over racially biased ride-share deactivations underscores how easily low ratings can cost someone a job with no procedural safeguards. State labor agencies could require employers to provide (a) advance notice of the rating threshold that triggers penalties and (b) an opportunity for the employee to contest the score or supply context—such as a kitchen delay that was outside the server’s control. The model parallels due-process protections already embedded in credit-reporting and background check laws.

Pair a consumer-education campaign with mandatory “fair-rating” nudges. Regulators (e.g., the Federal Trade Commission or state attorneys general) could require a brief prompt on digital receipts and survey screens—“We value fair and unbiased evaluations. Please rate your server based solely on the quality of service provided, regardless of gender, race, age, or background.” While bias-awareness nudges are not a standalone solution, they are a low-cost tool that signals the company’s commitment to fair treatment and can gradually influence customer behavior. When paired with broader efforts to promote fairness and mitigate biases, these nudges can contribute to fairer performance evaluations for servers of all backgrounds.

Break gender hierarchies through targeted equity audits at the hiring stage. To meaningfully dismantle entrenched gender hierarchies in the restaurant industry, voluntary commitments by individual establishments, such as those described above, are unlikely to suffice. State civil rights agencies could conduct targeted “blind application” audits of upscale restaurants to uncover potential hiring discrimination. These audits would involve submitting matched pairs of resumes—identical in qualifications and experience but differing in gender—and comparing callback, interview, and job offer rates. If gender-based disparities are detected, the restaurant would enter a negotiated remediation process that could include bias training, adoption of anonymized recruitment protocols, and structured interview rubrics, followed by a mandatory follow-up audit. While paired-applicant testing is a well-established tool in housing and corporate employment investigations, it remains underutilized in the restaurant sector—largely due to capacity constraints. To make this approach viable, states could permit agencies to recoup reasonable testing costs through penalties or administrative fees, and partner with third-party nonprofits to build a trained cadre of testers. Moving from a complaint-driven model to proactive, evidence-based enforcement would not only yield clear documentation of discriminatory practices but also incentivize ongoing compliance, fostering a more equitable pipeline into front-of-house roles.
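
Statistically, such an audit reduces to comparing callback (or interview, or offer) rates across the matched resumes. The sketch below runs a simple two-proportion z-test on invented counts; a matched-pairs procedure such as McNemar's test would exploit the pairing more fully.

```python
# Comparing callback rates for matched resumes; the counts are invented.
from statsmodels.stats.proportion import proportions_ztest

callbacks = [34, 52]  # callbacks for female- vs. male-named resumes
sent = [200, 200]     # matched applications sent to the same restaurants

stat, pval = proportions_ztest(callbacks, sent)
print(f"z = {stat:.2f}, p = {pval:.4f}")  # a small p-value indicates a gap
```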

Conclusion

Our findings highlight a critical shift in workplace bias: Customers—not employers—often steer pay and advancement in modern service jobs. Despite laws like the Civil Rights Act curbing overt employer bias, policy lags behind when it comes to discrimination disguised as “customer preference.” As technology continues to spur widespread reliance on customer feedback, ignoring or dismissing discriminatory patterns in these ratings risks perpetuating deeply entrenched gender and racial inequalities. The good news is that robust data—and a willingness to confront these biases—can help leaders craft fairer evaluation systems that protect workers and still account for genuine quality of service.

References

    Agars, Mark D. 2004. “Reconsidering the Impact of Gender Stereotypes on the Advancement of Women in Organizations.” Psychology of Women Quarterly 28 (2): 103—111.

    Arrow, Kenneth J. 1972. Some Mathematical Models of Race Discrimination in the Labor Market. In Racial Discrimination in Economic Life, 187—204.

    Barrios, John M., Laura M. Giuliano, and Andrew J. Leone. 2020. In Living Color: Does In-Person Screening Affect Who Gets Hired? University of Chicago, Becker Friedman Institute for Economics Working Paper No. 2020-38.

    Boring, Anne. 2017. “Gender Biases in Student Evaluations of Teaching.” Journal of Public Economics 145: 27—41.

    Botelho, Tristan L., and Mabel Abraham. 2017. “Pursuing Quality: How Search Costs and Uncertainty Magnify Gender-Based Double Standards in a Multistage Evaluation Process.” Administrative Science Quarterly 62 (4): 698—730.

    Chung, Lucia. 2022. “What Is a Good Survey Response Rate for Online Customer Surveys?” Delighted by Qualtrics, February 17.

    Cunningham-Parmeter, Keith. 2023. “Discrimination by Algorithm: Employer Accountability for Biased Customer Reviews.” UCLA Law Review 70: 92.

    Darrah, Donnie. 2021. “How Customer Service Surveys Are Eroding Workers’ Rights.” Jacobin, April 2021. https://jacobin.com/2021/04/customer-service-surveys-reviews-workers-rights.

    Eagly, Alice H., and Steven J. Karau. 2002. “Role Congruity Theory of Prejudice Toward Female Leaders.” Psychological Review 109 (3): 573—598.

    Foschi, Martha. 2000. “Double Standards for Competence: Theory and Research.” Annual Review of Sociology 26 (1): 21—42.

    Fryer, Roland G., Devah Pager, and Jörg L. Spenkuch. 2013. “Racial Disparities in Job Finding and Offered Wages.” Journal of Law and Economics 56 (3): 633—689.

    Goldin, Claudia, and Cecilia Rouse. 2000. “Orchestrating Impartiality: The Impact of ‘Blind’ Auditions on Female Musicians.” American Economic Review 90 (4): 715—741.

    Hall, Elaine J. 1993. “Smiling, Deferring, and Flirting: Doing Gender by Giving ‘Good Service.’” Work and Occupations 20 (4): 452—471.

    Heckman, James J. 1979. “Sample Selection Bias as a Specification Error.” Econometrica 47 (1): 153—161.

    Heiserman, Nicholas, and Brent Simpson. 2023. “Discrimination Reduces Work Effort of Those Who Are Disadvantaged and Those Who Are Advantaged by It.” Nature Human Behaviour 7 (11): 1890—1898.

    Hekman, David R., Karl Aquino, Bradley P. Owens, Terence R. Mitchell, Pauline Schilpzand, and Keith Leavitt. 2010. “An Examination of Whether and How Racial and Gender Biases Influence Customer Satisfaction.” Academy of Management Journal 53 (2): 238—264.

    Heskett, James L., Thomas O. Jones, Gary W. Loveman, W. Earl Sasser Jr., and Leonard A. Schlesinger. 1994. “Putting the Service-Profit Chain to Work.” Harvard Business Review 72 (2): 164—174.

    Hogreve, Jens, Anja Iseke, Klaus Derfuss, and Tönnjes Eller. 2017. “The Service—Profit Chain: A Meta-Analytic Test of a Comprehensive Theoretical Framework.” Journal of Marketing 81 (3): 41—61.

    Hunt, Vivian, Dennis Layton, Sara Prince, et al. 2015. Diversity Matters. McKinsey & Company.

    Kamalahmadi, Masoud, Qiuping Yu, and Yong-Pin Zhou. 2023. “Racial and Gender Biases in Customer Satisfaction Surveys: Causal Evidence from a Restaurant Chain.” Georgetown McDonough School of Business Research Paper No. 4420106. https://ssrn.com/abstract=4418420 or http://dx.doi.org/10.2139/ssrn.4418420.

    Munz, Kevin P., Moon H. Jung, and Adam L. Alter. 2020. “Name Similarity Encourages Generosity: A Field Experiment in Email Personalization.” Marketing Science 39 (6): 1071—1091.

    National Restaurant Association. 2023. “Restaurant Employee Demographics.” https://restaurant.org/getmedia/a3912d4b-9fd5-42f5-989c-fbc8e8929772/nra-data-brief-restaurant-employee-demographics-april-2025.pdf.

    Neumark, David, Roy J. Bank, and Kyle D. Van Nort. 1996. “Sex Discrimination in Restaurant Hiring: An Audit Study.” Quarterly Journal of Economics 111 (3): 915—941.

    O’Donovan, Caroline. 2018. “An Invisible Rating System at Your Favorite Chain Restaurant Is Costing Your Server.” BuzzFeed News. https://www.buzzfeednews.com/article/carolineodonovan/ziosk-presto-tabletop-tablet-restaurant-rating-servers.

    Phelps, Edmund S. 1972. “The Statistical Theory of Racism and Sexism.” American Economic Review 62 (4): 659—661.

    Rosenblat, Alex, Karen E.C. Levy, Solon Barocas, and Tim Hwang. 2016. “Discriminating Tastes: Customer Ratings as Vehicles for Bias.” Data & Society. https://datasociety.net/library/discriminating-tastes/.

    Shierholz, Heidi. 2014. Low Wages and Few Benefits Mean Many Restaurant Workers Can’t Make Ends Meet. Economic Policy Institute Briefing Paper No. 383.

    Wagner, Nora, Matthias Rieger, and Katie Voorvelt. 2016. “Gender, Ethnicity and Teaching Evaluations: Evidence from Mixed Teaching Teams.” Economics of Education Review 54: 79—94.

    Ye, Junting, Shuchu Han, Yifan Hu, Baris Coskun, Meizhu Liu, Hong Qin, and Steven Skiena. 2017. “Nationality Classification Using Name Embeddings.” In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 1897—1906.
