Thanks to a machine learning system developed at Primer, John Bohannon, the company's director of science, can sift through scientific works and contributions that were previously inaccessible through a traditional search of Wikipedia.

“It does this much as a human would, if a human could read 500 million news articles, 39 million scientific papers, all of Wikipedia, and then write 70,000 biographical summaries of scientists,” Bohannon explained about the system.

The machine learning system has revealed that Wikipedia may underrepresent women in scientific research, and it also offers a way to help close that gender gap.

The system, which was trained on scholarly journals, became a tool called Quicksilver for filling the gender gap. It can locate female scientists who have been overlooked by Wikipedia.

To build the model, the researchers analyzed 30,000 Wikipedia entries, identifying characteristics that would earn a scientist a mention in an encyclopedia. The system then identified the authors of 200,000 scientific papers by culling information from the academic search engine Semantic Scholar.
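The core idea of comparing paper authors against existing Wikipedia entries can be illustrated with a toy sketch. This is not Primer's implementation: the function name, the exact-name matching, and the sample names and paper counts below are all hypothetical, and a real system would need far more careful entity matching.

```python
def missing_from_wikipedia(candidate_authors, wikipedia_names):
    """Return (name, paper_count) pairs for authors with no Wikipedia entry.

    Matching here is a naive case-insensitive name comparison, used only
    as a stand-in for whatever matching a real system performs.
    """
    known = {name.casefold().strip() for name in wikipedia_names}
    missing = [
        (name, papers)
        for name, papers in candidate_authors.items()
        if name.casefold().strip() not in known
    ]
    # Most-published authors first, as a crude proxy for notability.
    return sorted(missing, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: author name -> number of papers found.
candidates = {"Ada Example": 42, "Bea Sample": 17, "Cy Placeholder": 8}
# Hypothetical list of names that already have Wikipedia entries.
wikipedia = ["ada example", "Cy Placeholder"]

result = missing_from_wikipedia(candidates, wikipedia)
print(result)
```

In this toy run, only the author without a matching entry is returned as a candidate for a new biography.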

The team discovered that only 18 percent of Wikipedia biographies were about women and that between 84 and 90 percent of Wikipedia editors were male.

“Our aim is to help the open data research community build better tools for maintaining Wikipedia and Wikidata, starting with scientific content,” wrote Bohannon.

For more on Quicksilver, go to Primer.
