New AI tool created to combat gender bias in fairytales

Out of 33,577 events analyzed in the study, 69% were linked to male characters, while 31% were connected to female characters.

Cinderella at the ball. (photo credit: FLICKR)

A team of researchers from Northeastern University, the University of California, Los Angeles, and IBM Research has unveiled an artificial intelligence framework capable of identifying gender bias and stereotypes in fairytales, the group announced in a study published on the arXiv preprint server on Friday.

The study sheds light on the pervasive nature of gender biases within classic fairy tales like Snow White, Cinderella and Sleeping Beauty.

For decades, scholars and educators have discussed the impact of fairy tales on children, particularly young girls, and the sociocultural roles these stories portray. Such tales often depict princesses in need of rescue by handsome princes, perpetuating gender stereotypes and limiting the aspirations of young girls.

What is their goal?

The team, led by Associate Professor Dakuo Wang from Northeastern University, aimed to create an AI-driven tool that could analyze storybooks and flag instances of gender bias and stereotypes. Their ultimate goal is to encourage the development of more inclusive and empowering narratives for children.

Wang expressed his motivation, saying that "if we can develop a technology to automatically detect or flag those kinds of gender biases and stereotypes, then it can at least serve as a guardrail or safety net not just for ancient fairy tales but the new stories being written and created every day today."

An illustration of Red Riding Hood meeting the wolf. (credit: PIXABAY)

This research evolved from the team's ongoing exploration of how AI can aid language learning skills in young children. To conduct their analysis, the researchers amassed a diverse collection of hundreds of stories from around the world.

The researchers collaborated with a group of education experts, including teachers and scholars, to generate a set of questions and answers that would measure children's learning from these stories. Through this process, they discovered the stubborn presence of gender stereotypes in all the tales.

While previous studies focused on superficial levels of bias, such as identifying specific word pairings that reinforce gender norms, Wang and his team aimed to dig deeper. They concentrated on "temporal narrative event chains," examining the sequence of events and actions experienced or undertaken by characters within the stories.

Wang explained that "it's actually the experience and the action that defines who this person is, and those actions influence our readers about what [they] should do or shouldn't do to mimic that fictional character."

Employing automated processes, the researchers extracted character names, genders and events from the collected stories. They then arranged the events into chains for each character and categorized them accordingly. Each event was meticulously analyzed and assigned an odds ratio to determine its frequency of association with male or female characters.
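The study itself does not publish its code, but a minimal sketch of what the odds-ratio step described above could look like follows, assuming each extracted event has already been tagged with the gender of the character who performs or experiences it. The data and function names here are illustrative, not the researchers' implementation.

```python
# Illustrative only: compute an odds ratio measuring how strongly an event
# is associated with male vs. female characters, given gender-tagged events.
events = [
    ("cook", "female"), ("sew", "female"), ("fight", "male"),
    ("rescue", "male"), ("clean", "female"), ("fight", "male"),
]

def odds_ratio(event, tagged_events):
    """Odds that `event` co-occurs with male rather than female characters,
    relative to the same odds for all other events."""
    male_with = sum(1 for e, g in tagged_events if e == event and g == "male")
    female_with = sum(1 for e, g in tagged_events if e == event and g == "female")
    male_without = sum(1 for e, g in tagged_events if e != event and g == "male")
    female_without = sum(1 for e, g in tagged_events if e != event and g == "female")
    # Add 0.5 to every cell (a standard continuity correction) to avoid
    # division by zero when a count is empty.
    return ((male_with + 0.5) / (female_with + 0.5)) / \
           ((male_without + 0.5) / (female_without + 0.5))

for ev in sorted({e for e, _ in events}):
    print(ev, round(odds_ratio(ev, events), 2))
```

Values above 1 would indicate an event that skews toward male characters; values below 1 would indicate a female-leaning event.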

Out of 33,577 events analyzed in the study, 69% were linked to male characters, while 31% were connected to female characters. Events associated with female characters were predominantly associated with domestic tasks, such as grooming, cleaning, cooking and sewing. In contrast, events related to male characters often revolved around concepts of success, failure, or aggression.

With this wealth of data, Wang and his team developed a sophisticated natural language processing tool capable of identifying bias in event chains, surpassing the analysis of individual events alone.
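The article does not detail how the tool scores whole chains, but one plausible way to move beyond single events, sketched here purely as an assumption, is to average the log odds ratios of all events in a character's chain.

```python
import math

# Hypothetical illustration: a chain-level bias score formed by averaging
# the log odds ratios of the events in a character's event chain.
# This is not the researchers' published model.
def chain_bias_score(event_chain, odds_ratios):
    """Mean log odds ratio over a character's event chain.
    Positive values lean male-associated; negative values lean female-associated."""
    scores = [math.log(odds_ratios[e]) for e in event_chain if e in odds_ratios]
    return sum(scores) / len(scores) if scores else 0.0

# Example usage with illustrative numbers.
odds_ratios = {"cook": 0.4, "sew": 0.3, "fight": 3.5, "rescue": 2.8}
print(chain_bias_score(["cook", "sew"], odds_ratios))      # negative: female-leaning chain
print(chain_bias_score(["fight", "rescue"], odds_ratios))  # positive: male-leaning chain
```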

The researchers envision their tool being used not only by researchers but also by writers, publishers and creators of children's stories. By simply uploading their initial drafts into the tool, these authors could receive a score or meter indicating potential gender biases, along with suggestions for improvement.