[EN] Fata Morgana: The Business Case for Diversity
What's left of the business case for diversity?
Note: I will periodically re-release some of my German articles in English for a broader audience. Please be advised that this translation was produced with the help of ChatGPT, and be aware that this text was written for a primarily German audience and from the perspective of someone working at a German university. Some of the topics I will be discussing have had plenty of space in the English-speaking world, but have been largely overlooked in Germany.
TL;DR: The so-called “business case for diversity” (the claim that diversity boosts organizational performance) is widely embraced in politics, business, and academia. Major consulting firms such as McKinsey have popularized this narrative, yet recent research shows that the expected gains in innovation, productivity, or financial performance are barely supported by empirical evidence. Meta-analyses in team research find that diversity typically has no effect, or only a very small one, on performance, even under favorable conditions. For universities, the data are even scarcer and rarely examined systematically. Basing diversity initiatives on performance promises therefore involves political risk: if the effects fail to appear, disappointment and a loss of legitimacy may follow. Diversity remains important, but its justification should rest on honesty, not on unproven claims.
Hardly any narrative has spread as successfully in recent years as the claim that diversity enhances organizational performance. The idea that different perspectives lead to more innovation, better decisions, and higher productivity has become a standard argument in strategy papers, grant proposals, and mission statements.
Large consulting firms such as McKinsey & Company have played a decisive role in popularizing the story. Reports with titles like Diversity Matters (2015), Delivering Through Diversity (2018), and Diversity Wins (2020) are quoted worldwide. Each report claimed a statistically significant link between the diversity of leadership teams and a company’s financial performance.
Recent research, however, casts serious doubt on this tale. Before diving in, let’s clarify a few terms.
What exactly is diversity?
Diversity refers to differences between people within social groups. These differences can involve visible traits as well as less obvious attributes, experiences, and perspectives. Research usually distinguishes two layers (van Knippenberg & Schippers, 2007):
Surface-level diversity
Easily observed, mostly demographic traits such as age, gender, ethnicity, or disability. Because they are visible and (supposedly) easy to measure, these characteristics are often the first focus of diversity initiatives.
Deep-level diversity
Less visible yet often decisive differences—for example in values, beliefs, personality traits, attitudes, educational paths, or cultural backgrounds. Deep-level attributes typically emerge only in work or learning contexts but strongly influence collaboration and the sense of belonging.
Both layers matter and frequently interact. Visible traits (e.g., skin color) are often treated as indicators of deeper differences (e.g., cultural values) even though that is not always empirically correct. Conversely, differences in attitudes or experiences remain invisible unless they are actively addressed.
Three logics behind diversity initiatives
In politics, science, and organizations, diversity measures are tied to quite different expectations. For some, the core issue is justice; for others, it is competitiveness; still others see diversity primarily as an expression of social movements. These competing goals often lead to misunderstandings—and to initiatives that obstruct one another.
Hellerstedt, Uman & Wennberg (2024) offer a useful conceptual map with three fundamental logics:
Moral Justice Logic – Motivated by norms of equality and justice. Change occurs through legal frameworks, regulation, and social pressure. The goal is social participation and equal treatment—not strategic advantage. This logic is grounded in deontological ethics: justice and equality are upheld not because they are useful, but because they are morally required. It treats equality as a moral principle.
Business Case Logic – Views diversity as a means to enhance innovation, problem-solving, and performance. Change is driven by organizational self-interest. This logic is utilitarian in nature, emphasizing business benefits—often based on assumptions about cognitive diversity that are approximated using demographic characteristics. This perspective is also frequently linked to risk management, aiming to prevent legal liability—for example, through anti-discrimination trainings or diversity certifications that can serve as evidence of preventive action in case of lawsuits or public scrutiny.
Power Activism Logic – Seeks visible change through external pressure. Activism, investor campaigns, or government interventions (e.g., quotas) are key mechanisms here. Organizations are seen as power-based actors that must be compelled from the outside to change. Equity and inclusion are considered implicit goals, but they often remain poorly defined.
These three logics shape research, practice, and political debate—sometimes in parallel, sometimes in competition. In higher education, the combination of Business Case and Power Activism is particularly visible: internally, diversity is often framed in terms of utility (e.g., “diversity strengthens research excellence”), while external actors—such as the German Research Foundation (DFG), the German Rectors’ Conference (HRK), or government ministries—exert indirect pressure through programs, target quotas, or certifications.
Why should diversity improve performance?
The Business Case hope that diversity boosts team and organizational performance rests on one of two competing theories (van Knippenberg & Schippers, 2007):
Information/Decision-Making Perspective
Diverse groups possess a broader range of knowledge and viewpoints, potentially leading to deeper analysis, creative solutions, and better decisions, especially for complex or innovative tasks.
Social-Categorization Perspective
Differences may trigger an “us vs. them” dynamic. In-group/out-group thinking can undermine trust, communication, and collaboration, thus lowering performance.
The Business Case relies mainly on the first perspective, assuming mixed teams will outperform homogeneous ones.
What exactly counts as performance?
When studies refer to “performance,” they rarely mean the same thing. In business contexts, the focus is often on financial indicators such as EBIT, revenue growth, or return on capital. In team research, the term may refer to productivity, innovation capacity, decision quality, or satisfaction—depending on how success is defined in a given context.
On top of that, many studies rely on subjective assessments (e.g., by managers or team members), while others use objective measures (e.g., the number of patents, publications, or financial outcomes). This also affects the reliability and comparability of findings.
In short: “Performance” is not a fixed standard—and many seemingly clear claims about the effects of diversity are based on widely differing definitions and methods of measurement. Still, researchers keep trying to synthesize these disparate findings.
How robust is the Business Case?
The McKinsey studies mentioned earlier are based on correlation analyses (explained further below): companies with more diverse leadership teams are said to perform better on EBIT (earnings before interest and taxes). However, it remains unclear whether diversity actually improves performance or whether high-performing companies are simply better able to afford diverse teams (reverse causality). The original data from these studies were never published, and the methodological details remain incomplete.
A quasi-replication by Green & Hand (2024) reaches a sobering conclusion: analyzing 497 companies from the S&P 500® Index, the authors find no statistically significant relationship between executive team diversity and financial performance—whether measured by EBIT, ROA (return on assets), ROE (return on equity), or revenue growth.
Key finding from the replication:
“Despite the imprimatur given to McKinsey’s studies, we find no evidence that firms can expect improved financial performance from increasing executive diversity.”
Additional empirical evidence – what do meta-analyses show?
Even beyond the critique of McKinsey, the overall empirical picture is less promising than many organizational narratives suggest. To summarize the state of research, scholars often turn to meta-analyses. But to understand their value, a brief explanation is helpful:
Method note: What is a meta-analysis?
Meta-analyses aggregate the results of many individual studies that examine the same research question, for example: “To what extent does diversity affect team performance?” They statistically pool the individual effect sizes, giving larger and more precise studies more weight, to produce an overall estimate. Especially in controversial or inconsistent research areas, the results are usually more reliable than those of individual studies.
That said, meta-analyses are only as good as the studies they include. Research that is never published often goes unaccounted for, potentially skewing results. Nevertheless, well-executed meta-analyses tend to provide a comprehensive picture of the field.
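To make the pooling idea concrete, here is a minimal sketch of one simple approach: a fixed-effect average of correlations using Fisher's z-transformation, with each study weighted by its sample size. The function name and the three example studies are hypothetical, invented purely for illustration. The meta-analyses cited in this article typically use more elaborate random-effects models, so this shows the principle rather than their actual method.

```python
import math

def pooled_correlation(studies):
    """Fixed-effect pooled correlation via Fisher's z-transformation.

    `studies` is a list of (r, n) pairs: a correlation and its sample size.
    Each study is weighted by n - 3, the inverse variance of Fisher's z.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        if n <= 3:
            raise ValueError("each study needs a sample size above 3")
        z = math.atanh(r)   # Fisher z-transform of the correlation
        weight = n - 3      # larger studies count more
        weighted_sum += weight * z
        total_weight += weight
    # Transform the weighted mean z back to the correlation scale
    return math.tanh(weighted_sum / total_weight)

# Hypothetical studies (made-up numbers, not taken from any cited paper):
# one small study with a sizeable effect, two large ones near zero.
example_studies = [(0.30, 40), (-0.05, 500), (0.02, 1200)]
pooled = pooled_correlation(example_studies)
print(f"pooled r = {pooled:.3f}")
```

In this made-up example, the striking result from the small study is almost entirely outweighed by the two large studies, so the pooled estimate ends up close to zero. This is the same pattern described later in this article, where large effects in individual studies nearly disappear when aggregated.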
The following table presents findings from several major meta-analyses examining the relationship between diversity and team performance. These analyses cover a range of diversity types—from demographic characteristics (e.g., gender, ethnicity) to deep-level traits (e.g., values, personality).
The meta-analyses listed above show a consistent picture: a small—or even nonexistent—relationship between diversity and performance. Those who want to delve deeper will find in Wallrich et al. (2024) a comprehensive overview of more than 600 primary studies and numerous additional meta-analyses that arrive at similar conclusions. Their work is currently among the most systematic in the field.
Note on sample size:
The sample sizes listed here refer to the total number of analyzed effect sizes (e.g., correlations) in the meta-analyses—not the number of individual studies or participants. These figures often represent data from tens of thousands of individuals across various primary studies, analyzed in aggregated form.
Method note: What do the values “r” and “ρ” represent?
r is the correlation coefficient—a statistical measure of the strength of the relationship between two variables. In diversity research, for example, it indicates how strongly team diversity is associated with performance:
r = 1.00 → perfect positive correlation
r = 0.00 → no correlation
r = –1.00 → perfect negative correlation
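For readers who want to see where such a number comes from, the following sketch computes Pearson's r from raw paired data using only the Python standard library. The function name and the toy data are assumptions made for illustration. They do not come from any of the studies discussed here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cross / math.sqrt(ss_x * ss_y)

# Toy data reproducing the three benchmark cases
print(pearson_r([1, 2, 3], [2, 4, 6]))         # perfect positive relationship
print(pearson_r([1, 2, 3, 4], [1, -1, -1, 1])) # no linear relationship
print(pearson_r([1, 2, 3], [6, 4, 2]))         # perfect negative relationship
```

In a real study, `xs` might hold a diversity index per team and `ys` a performance score per team. Values found in practice almost never reach the extremes and usually sit much closer to zero.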
Many meta-analyses also estimate the “true” correlation—denoted by ρ (rho). This value statistically adjusts the r value for potential measurement errors in the individual studies. ρ can be understood as a kind of “idealized” estimate, while r reflects the actual measured data more closely.
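One common way to make this adjustment is the classical correction for attenuation, which divides the observed correlation by the square root of the product of the two measures' reliabilities. The sketch below assumes this formula and uses made-up reliability values. Individual meta-analyses may apply different or additional corrections, so it should be read as an illustration of the idea only.

```python
import math

def correct_for_attenuation(r, reliability_x, reliability_y):
    """Estimate rho from an observed r, given the reliability of each measure.

    Reliabilities lie in (0, 1]; a value of 1.0 means error-free measurement.
    """
    for reliability in (reliability_x, reliability_y):
        if not 0 < reliability <= 1:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r / math.sqrt(reliability_x * reliability_y)

# Hypothetical values: observed r = 0.04, reliabilities of 0.80 and 0.70
rho = correct_for_attenuation(0.04, 0.80, 0.70)
print(f"observed r = 0.04, corrected rho = {rho:.3f}")
```

Because the divisor is never larger than 1, the corrected value is at least as large in magnitude as the observed one. With a small observed correlation, though, the corrected estimate stays small as well.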
Important: A statistical correlation (r or ρ) does not imply that diversity causes better or worse performance. Demonstrating causality would require experimental or longitudinal studies—which are rare in organizational research.
The takeaway from these meta-analyses?
Diversity either has no effect or only a very small effect on team performance. This directly contradicts the Business Case narrative so strongly promoted by McKinsey.
Even the often-discussed “moderators”—contextual factors under which diversity might be particularly effective (e.g., complex tasks, high team interdependence, creative work)—show little practical impact. Even in teams with complex or creative tasks, the average positive effects remain minimal. No tested moderator explained more than 5% of the variance in the diversity–performance relationship (Wallrich et al., 2024). So the hope that “the right” conditions will unlock major performance gains through diversity is not supported by the current data. While individual studies sometimes report moderate or even large effects, these advantages nearly disappear when aggregated.
Universities and the Business Case
A common objection to critiques of the Business Case goes something like: “But universities aren’t businesses—maybe diversity is important for research?”
What’s interesting here is that this objection already adopts the logic of the Business Case—it still seeks a performance promise, just in the context of academic excellence. And this logic has long since made its way into the higher education system: third-party funding, rankings, output orientation. The narrative that more diversity leads to better research is now a staple of many strategic documents.
So what do we know empirically?
Less than one might expect. There are no comprehensive meta-analyses on the relationship between diversity and research quality or innovation. Individual studies show some positive correlations—but no robust causal evidence.
Take Krammer & Dahlin (2024), for example. They analyzed 1.4 million publications in the field of business and management. Their finding:
More diverse and larger teams achieve higher citation rates—but not necessarily more publications in top-tier journals. In fact, in very large or highly heterogeneous teams, the likelihood of publishing in elite journals even decreases. Again, it's unclear whether this is truly an effect of diversity—or rather a result of resources, networks, and visibility that such teams tend to have.
Hofstra et al. (2020) highlight another perspective:
Doctoral students from underrepresented groups are more likely to generate conceptually novel ideas—but those ideas are taken up less frequently and less often lead to academic success. The issue, then, isn’t a lack of ideas—it’s the lack of structural recognition.
These and other studies suggest:
Diversity can correlate with innovation or impact—but doesn’t necessarily do so.
The mechanisms remain unclear, the data are limited, and there’s no consistent empirical picture.
Personal note: I have not conducted a systematic literature review of the individual studies on this topic, so the studies mentioned here should be considered illustrative rather than exhaustive.
While there are many meta-analyses on diversity in general team research, the academic context relies almost entirely on individual studies—often in big-data formats with high statistical power, but without pre-registration and only limited availability of data or analysis code. A significant publication bias cannot be ruled out. As such, findings—such as positive effects of ethnic or gender diversity on citations or innovation—should be seen as suggestive, not as confirmed causal relationships. Systematic reviews and meta-analyses are still lacking.
The currently available studies on diversity and academic performance are almost entirely correlational. That means they report associations—but cannot establish cause and effect. There is no empirical evidence so far that diversity automatically leads to better research outcomes. It’s also conceivable that especially successful teams become more diverse over time, or that third variables like team size or internationality play a role.
The state of research on diversity in academia is incomplete, selective, and inconsistent. Individual studies offer isolated signals but no systematic evidence. Under such conditions, political or strategic decisions should, in keeping with scientific integrity, be guided by the standard of the null hypothesis: until there is solid evidence for a positive effect, we should assume that there is none.
Why the Business Case Is Not Harmless
When diversity is justified through promises of performance, it creates—in my view—a risky set of expectations: diversity is supposed to "pay off"—quickly, visibly, and measurably. Yet many of these effects cannot be empirically demonstrated. At least in corporate research, there are now numerous studies and meta-analyses showing that the relationship between diversity and performance is either nonexistent or very weak. The picture is even murkier in higher education: here, the evidence is not only inconsistent but, more importantly, incomplete. While individual studies exist, systematic evidence—such as meta-analyses on research teams—is still lacking.
Nevertheless, political and institutional contexts often argue as if the link between diversity and performance were self-evident. The issue isn’t that one hopes for positive effects—but that such effects are assumed without solid proof. When diversity is justified primarily in terms of efficiency, the risk shifts: if the promised results fail to appear, not only disappointment and rollbacks may follow, but also political backlash. What was once introduced as a strategic initiative can then be dismissed as ideologically motivated symbolic politics.
Moreover, the political climate is changing. In times of growing scrutiny of public spending and state-funded programs, the pressure to demonstrate the effectiveness of initiatives is increasing. For universities and public institutions, this means: when diversity initiatives are funded with taxpayer money, they must be justified not only morally but increasingly with evidence. A compelling narrative is no longer enough—solid data is expected.
At the same time, the core claim of justice risks being sidelined. When diversity is treated primarily as a means to an end, it loses its normative grounding. It is no longer seen as an expression of democratic participation, but as an investment expected to deliver returns—and therefore ultimately dispensable if those returns don’t materialize.
That is precisely why we should base our arguments on empirically sound evidence—and be transparent when that evidence doesn’t yet exist. This is not a step backward; it is an expression of scientific integrity and political responsibility. The Business Case made diversity politically accessible—but it has never been scientifically robust.
This does not mean that diversity is unimportant. But it does mean that we need to distinguish carefully between what is normatively justified and what is empirically supported. Both have their place—but only if used deliberately and responsibly.
Defending diversity with performance promises carries serious risks, especially when the empirical foundation is weak. A more honest approach would be to embrace diversity not as a miracle cure for performance, but as part of a democratic and fair society.
References
Green, A., & Hand, J. (2024). Does Diversity Pay? A Replication of McKinsey’s Analyses. Working Paper.
Hellerstedt, K., Uman, T., & Wennberg, K. (2024). How Different Logics Shape the Implementation of Diversity, Equity, and Inclusion Work in Organizations. SSRN. https://doi.org/10.2139/ssrn.4308670
Hofstra, B., Kulkarni, V. V., Galvez, S. M.-N., He, B., Jurafsky, D., & McFarland, D. A. (2020). The diversity–innovation paradox in science. Proceedings of the National Academy of Sciences, 117(17), 9284–9291. https://doi.org/10.1073/pnas.1915378117
Krammer, S. M. S., & Dahlin, K. (2024). Ivory Tower of Babel: The Complex Relationship between Team Diversity and Scientific Impact. Working Paper.
Hunt, V., Layton, D., & Prince, S. (2015). Why Diversity Matters. McKinsey & Company. Available at: McKinsey & Company
Hunt, V., Prince, S., Dixon-Fyle, S., & Yee, L. (2018). Delivering Through Diversity. McKinsey & Company. Available at: McKinsey & Company
Dixon-Fyle, S., Dolan, K., Hunt, V., & Prince, S. (2020). Diversity Wins: How Inclusion Matters. McKinsey & Company. Available at: McKinsey & Company
Schneid, M., Isidor, R., Steinmetz, H., & Kabst, R. (2014). Age diversity and team outcomes: A quantitative review. Journal of Managerial Psychology, 29(1), 5–26. https://doi.org/10.1108/JMP-07-2012-0228
Traylor, Z. M., Carton, A. M., Gündemir, S., Homan, A. C., & van Knippenberg, D. (2024). It’s about the process, not the product: A meta-analytic investigation of team demographic diversity. Journal of Applied Psychology. Advance online publication. https://doi.org/10.1037/apl0001189
Triana, M. d. C., Jayasinghe, M., Pieper, J. R., Delgado, D., & Li, M. (2021). Perceived workplace gender diversity and employee outcomes: A meta-analysis. Journal of Organizational Behavior, 42(2), 211–234. https://doi.org/10.1002/job.2483
van Knippenberg, D., De Dreu, C. K. W., & Homan, A. C. (2004). Work group diversity and performance: An integrative model and research agenda. Journal of Applied Psychology, 89(6), 1008–1022. https://doi.org/10.1037/0021-9010.89.6.1008
van Knippenberg, D., & Schippers, M. C. (2007). Work group diversity. Annual Review of Psychology, 58, 515–541. https://doi.org/10.1146/annurev.psych.58.110405.085546
Wei, X., Ying, Y., & Tingting, D. (2015). Team demographic diversity and team performance: A meta-analysis. International Journal of Project Management, 33(1), 56–68. https://doi.org/10.1016/j.ijproman.2014.04.007
Wallrich, L., Opara, V., Wesołowska, M., Barnoth, D., & Yousefi, S. (2024). The relationship between team diversity and team performance: Reconciling promise and reality through a comprehensive meta-analysis registered report. Journal of Business and Psychology, 39(6), 1303–1354. https://doi.org/10.1007/s10869-024-09977-0