Books are the building blocks from which children learn; they are windows into creative spaces that inspire, educate, and encourage, and they serve as learning tools in the classroom.

Yet despite their educational value, researchers are grappling with how books systemically underrepresent some races and genders and how this impacts youth. That’s where TC’s Alex Eble, Associate Professor of Economics and Education, and his team are working to fill in the gaps with their latest research, which examines representation and identity in children’s books using tools like artificial intelligence and computational methods.

“The main question we’re asking is who is represented in the books we most often use to teach children, and why are we seeing these patterns,” explains Eble, who builds on his previous work shedding light on how the economics of education can help understand persistent inequality by gender and other historical sites of exclusion.

We sat down with Eble to discuss his findings.

In your latest study, you use new tools and methods to assess a topic you’ve explored previously in your work: representation in children’s books and the impact of distribution trends. Can you tell us about the process and your findings?

AE: We assessed over 1,000 books that have garnered acclaim from a century of children's book awards. Our analysis concentrated on two primary categories of books for children aged 14 and under: "mainstream" and "diversity" books.

After using advanced tools to identify over 44,000 characters in these books, we found that despite equal population shares, men are more commonly represented than women in both pictures and words. We also found that white populations were represented more frequently than Black and Latino populations. Interestingly enough, children were represented with lighter skin color than adults, even conditional on race (i.e., Black children are depicted with lighter skin than Black adults). It was eye-opening and made us wonder what else we might reveal through this sort of work in the future.


That’s fascinating. How did artificial intelligence and computational methods help uncover these discoveries?

AE: Artificial intelligence played a crucial role in this project—it would not have been possible without it. We needed a tool to measure representation in children's books on a very detailed, micro-scale. It would be impractical for parents and teachers to thoroughly vet every book before offering it to their children or students, a task made even more challenging for librarians, superintendents, or policymakers—and that’s where we wanted to develop a solution.

Over the last three years, we have spent time building a tool using computational resources in New York and Chicago. The tool uses artificial intelligence to scan images in picture books, turning the images into representation data based on gender, identity, and race. The patterns it revealed would not have been evident without the detail that artificial intelligence was able to provide.


Talk more about “Mainstream” books versus “Diversity” books. How do they impact children in the classroom?

AE: Mainstream books are super common in libraries, school curricula, and homes. They are books recognized by the most prestigious children’s book awards: the Newbery and Caldecott. These books have profound recognition and influence; however, our research reveals that they are not very representative, with main characters often depicted with lighter skin. “Diversity” books focus more on centering previously excluded identities and are often recognized for their artistic or literary value. 

Put simply, male and white children encounter more representations of themselves in storytelling compared to underrepresented groups of children, regardless of what they are reading. This discrepancy persists across both collections of books, even in Diversity books, which are intended to do quite the opposite. Even in books designed to highlight and celebrate the experiences of Black children, these children are still less likely to see themselves represented. These patterns can shape children’s beliefs about where they and others do or do not belong in the world and, unfortunately, they are not improving nearly as fast as we would hope.




Your research also discusses economic trends and consumer behavior in relation to identity representation. What should we know about these developments?

AE: This is where parts of my previous research came into play, and this was really cool to see. It’s no secret that people tend to buy books that center their gender and racial identities. However, we found that books that center on dominant identities (typically white men) are more likely to be sold at a higher volume and a lower price, indicating greater demand for them than for other books. In contrast, books that center on non-dominant identities have less consumer demand and are actually priced higher.

Our research indicates a correlation between the content of purchased books in a particular area and the political inclinations of the community. We uncovered that parents are likely to buy books for their kids that represent “their version” of, or perspective on, the world. For example, we see this in how politics splits media audiences: if you lean conservative, you’ll likely watch Fox News; if you lean more liberal, you’ll watch MSNBC. The same theory applies to books that parents are buying for their children.


How can communities work together to make storytelling more inclusive?

AE: As educators, writers, illustrators, and community leaders, we have a great responsibility to understand how exposure to this content can influence children’s trajectories. How does it affect what classes they take? How does it affect how they see themselves as adults in the world? We’re already taking the first step, which is using tools that can measure and evaluate book content at a very large scale and identifying the disparities, but there's more work to be done.

Educators need more tools and resources, like the one we utilized for this project (think almost a central database), to better measure what’s actually in the books we’re reading and promoting to our children. They need more support from communities at large, starting at a policy level.