Spell Check Algorithms

Implementing a spell-checking engine is more complicated than it may seem. Simply looking up each word in a vocabulary is not enough, even if that vocabulary is comprehensive and correct. The spell checker should also take the phonetic aspect of the language into account.

The key elements of our spell-checking engine are described below.

Text Parser

While parsing the text, the spell checker must treat several kinds of text elements specially. These elements include abbreviations, proper names, numbers, e-mail addresses, uniform resource locator (URL) strings (web addresses), and so on. Depending on the spell checker implementation and user options, they can be ignored or checked differently from other words in the text. The SpellChecker component provides the SpellChecker.OptionsSpelling property, which returns an OptionsSpelling instance. Its settings allow the user to skip e-mail and web addresses, words with numbers, mixed-case words, and upper-case words.
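The filtering described above can be illustrated with a minimal Python sketch. It is not the SpellChecker implementation: the function names, the keyword parameters, and the regular expressions are hypothetical, and the parameters only mirror the categories listed above, not the actual OptionsSpelling member names.

```python
import re

# Deliberately simple patterns, assumed for illustration only.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
URL_RE = re.compile(r"^(https?://|ftp://|www\.)\S+$", re.IGNORECASE)


def should_skip(token, ignore_emails=True, ignore_urls=True,
                ignore_words_with_numbers=True,
                ignore_mixed_case=True, ignore_upper_case=True):
    """Return True if the token should be excluded from spell checking."""
    if ignore_emails and EMAIL_RE.match(token):
        return True
    if ignore_urls and URL_RE.match(token):
        return True
    if ignore_words_with_numbers and any(ch.isdigit() for ch in token):
        return True
    # Upper-case words such as "NASA" (a single letter like "I" is kept).
    if ignore_upper_case and len(token) > 1 and token.isupper():
        return True
    # Mixed-case words such as "iPhone": a capital after the first letter
    # combined with at least one lower-case letter.
    if (ignore_mixed_case
            and any(ch.isupper() for ch in token[1:])
            and any(ch.islower() for ch in token)):
        return True
    return False


def words_to_check(text, **options):
    """Split text on whitespace, trim punctuation, and drop skipped tokens."""
    tokens = (raw.strip(".,;:!?()\"'") for raw in text.split())
    return [t for t in tokens if t and not should_skip(t, **options)]
```

For example, words_to_check("Mail bob@example.com or visit www.example.com about the NASA iPhone demo in room B12") keeps only "Mail", "or", "visit", "about", "the", "demo", "in" and "room"; passing ignore_upper_case=False would also keep "NASA".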

Dictionary

An ideal dictionary would contain every word in a given language. In practice, it can be much smaller and can be split into several parts, depending on the language. In many Indo-European languages, including English, words are derived from a base word by adding affixes (prefixes or suffixes). The size of the dictionary can therefore be greatly reduced if the base words, the affixes, and the rules for attaching affixes to base words are stored in separate files. The complete word list can then be built on demand. This technique is especially effective for synthetic languages, which are rich in verbal and inflected forms, such as Lithuanian or Russian.
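The idea of building the word list from base words and affix rules can be sketched in Python as follows. The rule layout, the flag letters, and the "base/FLAGS" entry syntax are simplified assumptions made for illustration; they are not the actual ISpell or OpenOffice file formats, and only suffixes are handled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SuffixRule:
    flag: str       # label a base word uses to opt in to this rule
    strip: str      # characters removed from the end of the base word
    add: str        # characters appended after stripping
    condition: str  # regex that the base word must match


# Hypothetical rules for English plurals ("S") and -ing forms ("G").
RULES = [
    SuffixRule("S", "y", "ies", r"[^aeiou]y$"),  # try  -> tries
    SuffixRule("S", "", "s", r"[aeiou]y$"),      # play -> plays
    SuffixRule("S", "", "es", r"[sxzh]$"),       # box  -> boxes
    SuffixRule("S", "", "s", r"[^sxzhy]$"),      # bake -> bakes
    SuffixRule("G", "e", "ing", r"e$"),          # bake -> baking
    SuffixRule("G", "", "ing", r"[^e]$"),        # try  -> trying
]


def expand(entry, rules=RULES):
    """Expand one 'base/FLAGS' entry into the set of words it stands for."""
    base, _, flags = entry.partition("/")
    forms = {base}
    for rule in rules:
        if rule.flag in flags and re.search(rule.condition, base):
            stem = base[:len(base) - len(rule.strip)]
            forms.add(stem + rule.add)
    return forms


def build_word_list(entries, rules=RULES):
    """Build the complete word list from compact dictionary entries."""
    words = set()
    for entry in entries:
        words |= expand(entry, rules)
    return words
```

With these assumed rules, the three entries "try/SG", "bake/SG" and "box/S" expand to eight words: try, tries, trying, bake, bakes, baking, box and boxes.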

The approach based on base words and affixes is used in the ISpell and ASpell spell checker projects. Thanks to the OpenOffice project, the dictionaries of these projects can be freely used and distributed. The SpellChecker component supports this format because these dictionaries are comprehensive, accurate, and continually updated by their user community. The current US English variant includes more than 62,000 base words.

When a word is considered misspelled (that is, it is not found in the dictionary), the spell checker generates a list of suggestions: words proposed as replacements for the mistake. The final choice is always up to the user.

For more information, see Dictionaries.

Using Near-Miss Strategy to Find Suggestions

The first algorithm that SpellChecker uses to build a suggestion list is the near-miss strategy. It was developed by Geoff Kuenning for ISpell and assumes that the word is mistyped rather than misspelled. The misspelled word is modified by replacing a letter, deleting or adding a letter, inserting a space, or swapping two adjacent letters. If the result is a word contained in the dictionary, its distance from the original word is estimated. A modified Levenshtein distance is used to measure the proximity of words.
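The candidate generation step can be sketched in Python as shown below. This is an illustration of the general near-miss idea under simplifying assumptions (a lower-case ASCII alphabet and a dictionary held in a Python set), not the ISpell or SpellChecker code; the function names are hypothetical.

```python
import string


def near_misses(word, alphabet=string.ascii_lowercase):
    """Return every string that is one single-character edit away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    replaced = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    deleted = {l + r[1:] for l, r in splits if r}
    inserted = {l + c + r for l, r in splits for c in alphabet}
    swapped = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    candidates = replaced | deleted | inserted | swapped
    candidates.discard(word)  # the original word is not a suggestion
    return candidates


def suggest_near_miss(word, dictionary):
    """Keep only the candidates that are dictionary words."""
    found = {c for c in near_misses(word) if c in dictionary}
    # Inserting a space: both halves must be dictionary words.
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in dictionary and right in dictionary:
            found.add(left + " " + right)
    return found
```

With the dictionary {"the", "then", "they", "in", "to", "into", "hat"}, the input "teh" yields "the" (two adjacent letters swapped), and "inthe" yields "in the" (a space inserted). The resulting set is unordered; ranking is a separate step.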

Using Phonetic Comparison to Find Suggestions

The phonetic suggestion algorithm takes the pronunciation of a word into account. The SpellChecker component uses an implementation of the Double Metaphone algorithm. Two phonetic codes (primary and secondary) are calculated for each word. The calculation rules differ from language to language and are based on the pronunciation rules of that language.

The phonetic strategy then compares the phonetic code of the misspelled word with the codes of all the words in the word list. If the phonetic codes match, the word is added to the suggestion list.
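The comparison step can be illustrated with a short Python sketch. Double Metaphone is too long to reproduce here, so the sketch substitutes the much simpler American Soundex code as a stand-in: it produces one code per word instead of a primary and a secondary code, and it is only meaningful for English. The function names are hypothetical.

```python
# Soundex digit groups; letters not listed (vowels, y, h, w) carry no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character American Soundex code of a word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate equal neighbouring codes
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (code + "000")[:4]


def phonetic_suggestions(misspelled, word_list):
    """Return the words whose phonetic code equals that of the misspelled word."""
    target = soundex(misspelled)
    return [w for w in word_list if soundex(w) == target]
```

For the word list ["receive", "recipe", "reserve", "river"], the input "recieve" matches both "receive" and "recipe", since all three share the code R210. A phonetic match alone therefore does not determine the best suggestion.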

Suggestion Ranking

After the suggestion list is composed, it should be ordered so that the user does not have to scroll through it in search of the best match. The implemented solution uses the Levenshtein algorithm to calculate the distance between words, and this distance serves as the sort key for the list. Additional assumptions about the nature of a spelling error can be used to refine the algorithm.
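A minimal Python sketch of distance-based ordering is shown below. It implements the standard Levenshtein distance (insertions, deletions and substitutions, each with a cost of 1), not the modified variant that SpellChecker uses; the alphabetical tie-break and the function names are assumptions made for the example.

```python
def levenshtein(a, b):
    """Standard edit distance, computed row by row to save memory."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def rank_suggestions(misspelled, suggestions):
    """Order suggestions by distance to the misspelled word, then by name."""
    return sorted(suggestions, key=lambda s: (levenshtein(misspelled, s), s))
```

For the input "teh", rank_suggestions("teh", ["then", "the", "tea", "ten"]) returns ["tea", "ten", "the", "then"]. The plain distance counts a swap of two adjacent letters as two edits, so "the" ranks below "tea" here. This is the kind of case where additional assumptions about typing errors, such as a cheaper cost for a swap, would change the order.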

The user then chooses from the list of suggestions. The misspelled word can be replaced with a word from the suggestion list, ignored, or edited by the user. The last case indicates that the spell checker did not offer the right suggestion, and the user is given the option of adding the corrected word to an auxiliary user dictionary.
