Machine Learning Gives Visibility to Underrepresented Authors
A&S graduate student Brianna Cardillo develops an algorithm in her computational forensics course to promote books by marginalized authors.
While fingerprint powder and microscopes remain essential tools in forensics, machine learning is one of the fastest-emerging technologies in the field. It uses algorithms and computing power to analyze large, complex data sets, making investigations more efficient and effective. The College of Arts and Sciences’ Forensic and National Security Sciences Institute offers customized courses designed to equip students with the skills to examine these problems using computational methods and algorithms.
One such course, Computational Forensics, introduces students to coding, machine learning and artificial intelligence (AI). Taught by Professor Filipe Augusto da Luz Lemos, a leading expert in digital forensics, the course shows students how machine learning and AI are used in the field. A highlight for students is the final project, in which they select a real-world problem they are passionate about and solve it using computational techniques learned in class. The assignment culminates in a presentation where they share their solution to the chosen problem.
Brianna Cardillo, a graduate student in forensics, focused her work on one of her favorite hobbies – reading. Her project, “What to Read Next? Using Historical Reader Preferences to Promote Books from Marginalized Authors,” aimed to develop a machine learning algorithm that could suggest books, with a specific focus on promoting works by underrepresented writers.
“I’ve been in social media spaces surrounding reading and creatively writing books for a long time now, and I really became aware of just how much diversity people’s reading preferences lacked,” says Cardillo. “I have read so many books from authors like Faridah Àbíké-Íyímídé that had such incredible world-building and portrayed such important themes, books that deserved more praise than they got.”
To address this inequity, Cardillo developed an algorithm that suggests books based on readers’ interests. It takes into account information such as genre, length, average rating on the book recommendation site Goodreads, and authors’ race, which she gathered from personal interviews, blog posts and book jackets. She organized this data into Excel spreadsheets and fed it into a machine learning algorithm. Simply put, the algorithm is a content-based filtering system: it considers what readers already enjoy and, based on those interests, estimates whether they will enjoy other books by underrepresented authors.
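For readers curious about what content-based filtering looks like in practice, here is a minimal Python sketch. It is not Cardillo's code, and the article does not describe her implementation in detail. The `Book` fields, the `GENRES` list, the scaling of page counts and ratings, and the choice of cosine similarity are all assumptions made for illustration. In particular, the sketch treats author background as a simple filter on the candidate list, which may differ from how her model uses that information.

```python
from dataclasses import dataclass
from math import sqrt

# Hypothetical genre vocabulary; a real data set would have many more.
GENRES = ["fantasy", "mystery", "romance"]


@dataclass
class Book:
    title: str
    genre: str
    pages: int
    avg_rating: float        # average reader rating on a 0-5 scale
    underrepresented: bool   # assumed flag: author from an underrepresented group


def features(book):
    """Turn a book into a numeric vector: one-hot genre, scaled length, scaled rating."""
    one_hot = [1.0 if book.genre == g else 0.0 for g in GENRES]
    return one_hot + [min(book.pages, 1000) / 1000, book.avg_rating / 5.0]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(liked, catalog, top_n=3):
    """Rank unread books by underrepresented authors by similarity to the reader's taste."""
    vectors = [features(b) for b in liked]
    # The reader profile is the average feature vector of the books they liked.
    profile = [sum(col) / len(vectors) for col in zip(*vectors)]
    liked_titles = {b.title for b in liked}
    candidates = [
        b for b in catalog
        if b.underrepresented and b.title not in liked_titles
    ]
    ranked = sorted(candidates, key=lambda b: cosine(profile, features(b)), reverse=True)
    return ranked[:top_n]
```

Given a list of books a reader liked and a catalog of `Book` records, `recommend(liked, catalog)` returns the closest matches among books by underrepresented authors. A production system would likely weight the features, use a richer genre vocabulary and learn from many readers' histories rather than a single profile.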
“Increasing awareness of marginalized authors requires readers to actively choose and promote diverse stories, especially since we have so much influence over publishing with how we use our dollars,” says Cardillo. “That’s why I wanted to make the algorithm in the first place, with the hope that this could be part of that first step.”
While she focused primarily on race when developing this version of the algorithm, Cardillo would like to one day expand it to include other categories of marginalization, such as sexuality or disability status.
“I would love to include authors of many different identities so that everyone can find books where they feel represented,” she says.
Lemos notes that Cardillo’s work on this project exemplifies the goals and strengths of the course, which center on using computational methods to tackle contemporary problems that would be impractical or too time-consuming for people to work through manually.
“Throughout this project, Brianna honed her ability to identify and analyze problems, determining their suitability for machine learning solutions,” says Lemos. “Brianna’s work not only engaged with her personal interest, but also tapped into a broader societal relevance.”
He explains that the skills Cardillo and other students developed during this project are directly transferable to a professional setting, especially in the field of forensics.
“This project taught students to efficiently identify problems that can be expedited or improved through computational approaches and to create algorithms that can identify patterns where humans would not be able to,” Lemos says. “Additionally, they gain the capability to design algorithms that automate mundane tasks, thereby optimizing productivity so that investigators can focus on more complex, impactful work.”
After graduating this May with an M.S. in forensic science, Cardillo hopes to gain employment in a crime laboratory as a forensic DNA analyst. In such a fast-paced environment, the ability to think creatively and solve problems quickly is a must.
“In that type of work, things will not always go to plan,” says Cardillo. “Sometimes instruments stop working, and it will require creative thinking to find solutions, especially to problems that are not so clear cut. I think this project has prepared me for that, and I know that when these problems happen, I will be able to work through them well.”