Thanks to a combination of big data and artificial intelligence (AI), researchers at the University of Copenhagen in Denmark can tell with almost 90% accuracy whether an assignment was written by a student or a ghostwriter.

The program, called Ghostwriter, uses a Siamese neural network to compare the writing styles of different texts. It was trained on a dataset of 130,000 assignments written by 10,000 Danish students. The data was provided by MaCom, the company behind Lectio, a platform currently used at Danish high schools to check whether a student’s work is plagiarized.
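
For readers unfamiliar with the term, a Siamese network passes both texts through the same encoder, with the same weights, and then compares the two resulting vectors. The Python sketch below only illustrates that idea. The three input features, the fixed `WEIGHTS` values and the cosine comparison are all assumptions made for the example; they are not taken from Ghostwriter, whose real encoder is learned from the training data.

```python
import math

# Hypothetical, untrained weights (2 outputs x 3 input features).
# A real Siamese network would learn these values from data.
WEIGHTS = [
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.4],
]


def features(text):
    """Toy inputs: mean token length, mean sentence length, commas per token."""
    words = text.split()  # whitespace tokens, punctuation included
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    if not words or not sentences:
        return [0.0, 0.0, 0.0]
    mean_word_len = sum(len(w) for w in words) / len(words)
    mean_sent_len = len(words) / len(sentences)
    comma_rate = text.count(",") / len(words)
    return [mean_word_len, mean_sent_len, comma_rate]


def encode(text):
    """Shared encoder: the same weights are applied to whichever text comes in."""
    x = features(text)
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in WEIGHTS]


def similarity(text_a, text_b):
    """Cosine similarity between the two embeddings, in the range [-1, 1]."""
    a, b = encode(text_a), encode(text_b)
    norm = math.sqrt(sum(v * v for v in a)) * math.sqrt(sum(v * v for v in b))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm
```

Because both inputs go through the same `encode` function, comparing a text with itself always gives the maximum similarity, and the order of the two texts does not matter.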

Once a student submits an assignment, Ghostwriter compares it to that student’s previous assignments. The network assigns each previous assignment a percentage score based on how closely its writing style matches the new assignment. A weighted average of those scores is then calculated, taking into account factors such as delivery time. The final score is designed to indicate how similar the new assignment is to the student’s established writing style.
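
The article does not give the weighting formula, so the following Python sketch is only one plausible reading of it. It assumes that "delivery time" means how long ago each earlier assignment was handed in, and that more recent work counts for more. The function name `weighted_style_score`, the exponential decay and the 180-day half-life are hypothetical choices made for illustration.

```python
def weighted_style_score(scores, ages_in_days, half_life_days=180.0):
    """Weighted average of per-assignment similarity scores (0-100).

    Assumption: an assignment's weight halves every `half_life_days`,
    so recent work influences the final score more than older work.
    """
    if len(scores) != len(ages_in_days):
        raise ValueError("scores and ages_in_days must have the same length")
    if not scores:
        raise ValueError("at least one previous assignment is required")
    weights = [0.5 ** (age / half_life_days) for age in ages_in_days]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

Under these assumptions, a score of 80 from an assignment handed in today and a score of 40 from one handed in 180 days ago receive weights of 1 and 0.5, giving (80 + 20) / 1.5, or roughly 66.7.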

“Our program identifies discrepancies in writing styles by comparing recently submitted writing against a student’s previously submitted work. Among other variables, the program looks at word length, sentence structure and how words are used, for instance whether ‘for example’ is written as ‘ex.’ or ‘e.g.’,” explains Ph.D. student Stephan Lorenzen of the Department of Computer Science.
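
The variables Lorenzen mentions can be made concrete with a small Python sketch. It is an illustration only: the function name `style_features`, the regular expressions and the choice of four features are assumptions for the example, and Ghostwriter itself is described as a neural network rather than a set of hand-written rules like these.

```python
import re


def style_features(text):
    """Return a few simple writing-style measurements for one text."""
    # Runs of letters, including non-ASCII ones such as the Danish æ, ø and å.
    words = re.findall(r"[^\W\d_]+", text)
    # Naive split after ., ! or ? followed by whitespace. Abbreviations
    # such as "e.g." are miscounted as sentence ends by this rule.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lowered = text.lower()
    return {
        "mean_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "uses_eg": lowered.count("e.g."),
        "uses_ex": len(re.findall(r"\bex\.", lowered)),
    }
```

Comparing the resulting dictionaries for an old and a new assignment would show, for instance, whether a student who has always written "e.g." has suddenly switched to "ex.".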

Beyond schools, colleges and universities, the developers of Ghostwriter imagine that the program could also be applied in police work, helping in the analysis of forged documents.

To contact the author of this article, email mdonlon@globalspec.com