A basic understanding of spell-checking and grammar-checking algorithms
Marie Donlon | March 20, 2024
NASA lost a $125-million Mars Climate Orbiter in 1999 because spacecraft engineers failed to convert from English to metric measurements when exchanging data before the craft was launched.
A team at the Jet Propulsion Laboratory (JPL) used the metric system of millimeters and meters in its calculations. Meanwhile, the Lockheed Martin Astronautics engineers who designed and built the spacecraft provided thruster data in the English system of inches, feet and pounds.
JPL engineers mistook readings reported in English units of pound-seconds for metric newton-seconds. The resulting navigation errors brought the spacecraft too close to Mars, where it was lost.
Ahead of National Grammar Day (March 4) and National Proofreading Day (March 8), engineers may want to pay homage to the tools that offer them an extra layer of security against typos or subject-verb disagreement, while also saving them from possible catastrophe when communicating dimensions or other instructions for a project.
Spell-checker
In 1961, computer scientist Les Earnest led the research on the first spell-checker, which had access to a list of just 10,000 words. In 1971, Ralph Gorin, a graduate student under Earnest, created the first true spell-checker, which was written as an application program.
The first spell-checkers were verifiers more than correctors: they highlighted misspelled words but offered no suggestions for fixing them. Since then, the technology has evolved into a range of spell-checkers with far more sophisticated features. Not only do these spell-checkers suggest replacement words, but they also flag grammatical issues.
Many spell-checking programs are available. They differ from system to system and appear as a feature in email software, word processors, search engines and electronic dictionaries, but at their core they are fundamentally the same.
Native spell-checkers are typically composed of two parts. The first part includes routines enabling the spell-checker to scan text, highlighting each word along the way. The second part involves an algorithm that compares each word with an internal dictionary.
Generally, the first component, the one that does the scanning, is the more advanced of the two, which allows it to handle English-language variations such as possessives and contractions. The second component, the part that searches the dictionary, is often limited to certain words, phrases and hyphenations. It looks up each scanned word string in the internal dictionary. If no match is found, the word is flagged as a possible misspelling and highlighted, typically with a red, squiggly underline.
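The two-part design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of any particular product: the five-word DICTIONARY, the word pattern and the function names scan, normalize and find_misspellings are all invented for the example.

```python
import re

# Hypothetical stand-in for the internal dictionary; a real one is far larger.
DICTIONARY = {"the", "engineer", "units", "don't", "match"}

# Part one, the scanner: a word is a run of letters that may contain
# internal apostrophes, so contractions and possessives stay in one piece.
WORD_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")

def scan(text):
    """Yield (offset, word) for every word-like token in the text."""
    for match in WORD_RE.finditer(text):
        yield match.start(), match.group()

def normalize(word):
    """Lowercase the word, straighten apostrophes and drop a possessive 's."""
    word = word.lower().replace("’", "'")
    return word[:-2] if word.endswith("'s") else word

def find_misspellings(text, dictionary=DICTIONARY):
    """Part two, the lookup: flag each scanned word missing from the dictionary."""
    return [(pos, word) for pos, word in scan(text)
            if normalize(word) not in dictionary]

print(find_misspellings("The engineer's units don't mach"))
```

In this sketch only "mach" is flagged, together with its character offset, which is the information an editor would need to draw the underline. Like the earliest spell-checkers, it verifies words but suggests no corrections.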
While there have been several iterations of the spell-checker, the most successful spell-checking algorithm so far, according to Wikipedia, is the “Winnow-based spelling correction algorithm” from Andrew Golding and Dan Roth. This particular algorithm is capable of catching context-sensitive spelling errors. In other words, it can flag a word that appears in the dictionary, and so is spelled correctly, but is the wrong word given the words that surround it.
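To give a feel for how context can expose a correctly spelled but wrong word, here is a heavily simplified sketch built on the general Winnow learning rule, in which the weights of nearby words are multiplied up or down whenever the learner makes a mistake. It is not Golding and Roth's actual system, which is considerably more elaborate. The confusion set, the four training sentences, the parameter values and every function name are assumptions made for illustration.

```python
CONFUSION_SET = ("their", "there")
ALPHA, BETA, THETA = 2.0, 0.5, 2.0  # promotion, demotion, threshold (illustrative)

# Hypothetical training sentences, assumed to use each word correctly.
TRAINING = [
    "they parked their car outside",
    "the engineers checked their units",
    "is there a problem here",
    "we went there by bus",
]

def context_features(tokens, i, window=2):
    """Return the set of words within `window` positions of tokens[i]."""
    lo = max(0, i - window)
    return {t for j, t in enumerate(tokens[lo:i + window + 1], start=lo) if j != i}

class Winnow:
    """Winnow learner over binary features; unseen features weigh 1.0."""
    def __init__(self):
        self.weights = {}

    def score(self, feats):
        return sum(self.weights.get(f, 1.0) for f in feats)

    def update(self, feats, is_positive):
        predicted = self.score(feats) >= THETA
        if predicted == is_positive:
            return  # mistake-driven: a correct prediction changes nothing
        factor = ALPHA if is_positive else BETA
        for f in feats:
            self.weights[f] = self.weights.get(f, 1.0) * factor

def train(sentences, epochs=5):
    """Train one Winnow learner per word in the confusion set."""
    models = {word: Winnow() for word in CONFUSION_SET}
    for _ in range(epochs):
        for sentence in sentences:
            tokens = sentence.lower().split()
            for i, tok in enumerate(tokens):
                if tok in CONFUSION_SET:
                    feats = context_features(tokens, i)
                    for word, model in models.items():
                        model.update(feats, is_positive=(word == tok))
    return models

def check(sentence, models):
    """Return (index, found, suggested) wherever the context favors another word."""
    tokens = sentence.lower().split()
    flags = []
    for i, tok in enumerate(tokens):
        if tok in CONFUSION_SET:
            feats = context_features(tokens, i)
            best = max(CONFUSION_SET, key=lambda w: models[w].score(feats))
            if best != tok:
                flags.append((i, tok, best))
    return flags

MODELS = train(TRAINING)
print(check("she checked there units", MODELS))
```

With this toy training set, the sketch suggests "their" for "there" in the sample sentence because the neighboring words "checked" and "units" were seen next to "their" during training. A dictionary lookup alone would accept the sentence, since every word in it is spelled correctly.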
Grammar-checker
Like the spell-checker, the grammar-checker in its infancy checked only limited details, such as style inconsistencies and punctuation, rather than a full range of possible grammatical errors. In the 1970s, the first grammar-checker, called the Writer’s Workbench, was developed. It came with different tools for different functions. For instance, the diction tool would check a document for wordiness or misused and clichéd phrases. The style tool analyzed the writing style of the document.
Since the introduction of Writer’s Workbench and its successor, Grammatik, each iteration of grammar-checker has added language processing capabilities.
Working much like a spell-checking algorithm, the modern-day grammar-checker extracts a sentence and checks each word in it against the dictionary, looking up information such as its part of speech and considering its placement within the sentence. Relying on a set of rules, the grammar-checker then detects errors in tense, number agreement, word order and so on.
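The lookup-then-rules flow described above might look something like the following Python sketch. It assumes a hypothetical eight-word LEXICON that stores a part of speech and a grammatical number for each word, and it applies just two hand-written agreement rules to adjacent words. Real grammar-checkers rely on much larger dictionaries and rule sets, and the function names here are invented for the example.

```python
import re

# Hypothetical mini-lexicon: word -> (part of speech, grammatical number).
LEXICON = {
    "the": ("DET", None), "a": ("DET", "sg"),
    "engineer": ("NOUN", "sg"), "engineers": ("NOUN", "pl"),
    "unit": ("NOUN", "sg"), "units": ("NOUN", "pl"),
    "checks": ("VERB", "sg"), "check": ("VERB", "pl"),
}

def split_sentences(text):
    """Extract sentences by splitting after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tag(sentence):
    """Look each word up in the lexicon; unknown words are tagged UNK."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [(w,) + LEXICON.get(w, ("UNK", None)) for w in words]

def check_sentence(sentence):
    """Apply two agreement rules to each pair of adjacent words."""
    tagged = tag(sentence)
    errors = []
    for (w1, pos1, num1), (w2, pos2, num2) in zip(tagged, tagged[1:]):
        # Rule 1: a noun directly followed by a verb must agree in number.
        if pos1 == "NOUN" and pos2 == "VERB" and num1 != num2:
            errors.append(f"subject-verb disagreement: '{w1} {w2}'")
        # Rule 2: a number-marked determiner must agree with its noun.
        if pos1 == "DET" and pos2 == "NOUN" and num1 and num1 != num2:
            errors.append(f"determiner-noun disagreement: '{w1} {w2}'")
    return errors

def check_text(text):
    """Map each extracted sentence to the list of errors found in it."""
    return {s: check_sentence(s) for s in split_sentences(text)}

print(check_text("The engineers checks the units. The engineer checks a unit."))
```

The first sample sentence is reported for "engineers checks", while the second passes. Checks for tense and word order would need further rules of the same kind, which this sketch leaves out.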
While some people believe that correct spelling and grammar are the stuff of literary snobs and bookworms, their importance is not lost on a profession like engineering, which relies on precision.