Making Every Identity Count

A Syracuse University professor has created a free tool to help researchers handle complex identity responses in surveys with greater care and transparency.


Key Takeaways:

  • A Personal Mission: A College of Arts and Sciences professor developed CATAcode after experiencing firsthand how demographic forms failed to acknowledge his multiracial identity.
  • Better Data Practices: The free software tool helps researchers systematically explore and document how they handle participants who select multiple demographic categories, making decision-making transparent and reproducible.
  • Real-World Impact: By preserving identity nuance, CATAcode can increase representation of underrepresented groups in research, which could affect policy decisions and resource allocation.


Growing up multiracial in the 1990s, Gabriel “Joey” Merrin regularly encountered demographic forms that forced an impossible choice: Pick one box. Deny the others.

"That act of being forced to choose, to erase parts of myself from an official document, is at the core of this work," says Merrin, who is an assistant professor in the College of Arts and Sciences’ Department of Human Development and Family Science.

That personal frustration eventually became a methodological solution. Merrin, working with collaborators from the University of Minnesota, Yale University, Boston University and the University of North Carolina at Chapel Hill, has developed CATAcode, an R package that helps researchers across the social sciences handle demographic data more thoughtfully.

The tool is now publicly available for researchers to download and use, and its accompanying tutorial paper was published in Advances in Methods and Practices in Psychological Science.

The Invisibility Problem

The problem CATAcode addresses is deceptively simple but widespread. When surveys ask people to "check all that apply" for race, gender identity, or other characteristics, many respondents select multiple options. Researchers then face a choice: How do you handle someone who checked three boxes when your statistical model requires them to be in one category?

For decades, the default approach has been to collapse these individuals into what Merrin calls a "heterogeneous and often nonsensical 'other' category," in which, for example, a Black and Asian person is treated the same as a White and Native American person.

"When we lump everyone together like that, we lose the ability to understand their unique experiences," Merrin says. "And we make entire communities statistically invisible."

The implications extend beyond research. These findings can inform policy decisions, funding allocations and the development of interventions designed to serve diverse communities. When demographic data oversimplifies or erases certain groups, the policies and programs built on that research may fail to address their needs.

Confronting the Numbers

CATAcode provides researchers with systematic approaches for exploring identity combinations in their data and for documenting how they make decisions about grouping participants. It works with both cross-sectional and longitudinal data and with any survey items that allow multiple responses.
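To make the idea concrete, here is a minimal Python sketch of what exploring identity combinations can look like, set against the collapse-to-"other" default described earlier. It is an illustration only: it does not use CATAcode (which is an R package) or reproduce its interface, and the sample responses and the function names combination_counts and collapse_to_other are hypothetical. It also assumes every participant checked at least one box.

```python
from collections import Counter

# Hypothetical check-all-that-apply data: one set of checked boxes per participant.
responses = [
    {"Black"},
    {"Black", "Asian"},
    {"White", "Native American"},
    {"White"},
    {"Asian", "Black"},
    {"White", "Native American", "Latino"},
]


def combination_counts(responses):
    """Count participants for each exact combination of checked boxes."""
    return Counter(frozenset(selected) for selected in responses)


def collapse_to_other(responses):
    """Conventional default: any response with more than one box becomes 'Other'."""
    return Counter(
        next(iter(selected)) if len(selected) == 1 else "Other"
        for selected in responses
    )


combos = combination_counts(responses)
collapsed = collapse_to_other(responses)

# Print each combination with its count, then the collapsed version for comparison.
for combo, n in combos.most_common():
    print(sorted(combo), n)
print(dict(collapsed))
```

In this toy data, the tabulation keeps five distinct combinations apart, while the collapsed version places four of the six participants in a single "Other" group.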

The tool is particularly timely given that the U.S. multiracial population grew by 276% between 2010 and 2020, notes Merrin.

In a dataset of more than 8,000 high school students, CATAcode identified 85 distinct racial combinations, a figure that strongly argues against oversimplification. Using one of the tool's features, researchers can prioritize underrepresented groups to keep them visible in analyses. In one example from the paper, this approach increased the number of participants counted as Native American from 12 to 128.

"That's the difference between a group being invisible and a group being present and accounted for," Merrin says.
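One way to picture the prioritization described above is the following Python sketch. It is a simplified illustration under stated assumptions, not CATAcode's actual implementation or interface: the function priority_code, the sample responses and the priority ordering are all hypothetical, and in practice the ordering is a decision the researcher must make and document.

```python
from collections import Counter

# Hypothetical check-all-that-apply data: one set of checked boxes per participant.
responses = [
    {"Native American"},
    {"White", "Native American"},
    {"Black", "Native American"},
    {"Black", "Asian"},
    {"White"},
]

# Hypothetical analyst decision: categories listed from highest to lowest priority.
priority_order = ["Native American", "Asian", "Black", "White"]


def priority_code(selected, priority_order):
    """Return the first category in priority_order that the participant checked."""
    for category in priority_order:
        if category in selected:
            return category
    return None  # none of the listed categories was checked


# Participants who checked only the Native American box.
single_only = sum(1 for selected in responses if selected == {"Native American"})

# Participants assigned to each group once the priority rule is applied.
coded = Counter(priority_code(selected, priority_order) for selected in responses)

print(single_only, coded["Native American"])
```

In this toy data, only one participant checked Native American alone, but three are assigned to that group under the priority rule. That mirrors, in miniature, the kind of shift the paper's example reports.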

Beyond Race and Ethnicity

While Merrin's work focuses on racial and ethnic identity, CATAcode applies to any survey item that allows multiple responses, such as questions about a person’s health conditions. This broad applicability makes the tool useful across disciplines, from psychology and sociology to public health and education.

Merrin and his collaborators hope CATAcode will push journals, funding agencies and ethics boards to demand greater transparency in how researchers represent the people they study.

"We hope this tool sparks a movement toward more transparent and equitable representations of study participants' identities," Merrin says. "The decisions researchers make about how to categorize people have real consequences for policy and resource allocation."

By improving how demographic data are prepared, analyzed and reported, CATAcode supports greater transparency, reproducibility, generalizability and equity in social science research. The aim is to ensure that when people check multiple boxes, their full identities remain visible in the work that shapes our understanding of communities.

"This is a tool born from a personal wound," Merrin says. "But I hope it offers a path toward more ethical and just research across the social sciences."

Author: Dan Bernardi

Published: Jan. 28, 2026

Media Contact: asnews@syr.edu