Gasim, Israfil (2025) Evaluating Hunspell, SymSpell, Norvig, and N-gram Spellcheckers for Azerbaijani Text. International Journal of Innovative Science and Research Technology, 10 (5): 25MAY1640. pp. 2574-2577. ISSN 2456-2165
Automatic spelling correction is critical for enhancing text quality and usability across digital platforms, particularly for morphologically rich and low-resource languages like Azerbaijani. This paper presents a comparative analysis and benchmarking of four prominent spellchecking algorithms (Hunspell, SymSpell, Norvig's probabilistic model, and N-gram statistical models) implemented specifically for Azerbaijani. A comprehensive evaluation was conducted using a manually annotated corpus drawn from diverse Azerbaijani text sources, simulating orthographic errors common in everyday language usage. Results indicate moderate effectiveness across all tested methods, with Hunspell achieving the highest accuracy (84.5%) due to its robust dictionary-based morphological handling. Despite its speed advantage, SymSpell (81.4% accuracy) requires extensive dictionary resources, making it impractical for morphologically complex languages without significant resource investment. Norvig's method (78.3%) and the N-gram model (82.1%) also showed limitations, related to corpus dependency and computational efficiency, respectively. The findings highlight the substantial challenges posed by Azerbaijani's agglutinative structure and the inadequacy of existing general-purpose algorithms for it. Consequently, the paper emphasizes the need for new hybrid approaches tailored to Azerbaijani and similarly structured languages, and suggests directions for future research and development in spelling correction.
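To make the kind of word-level accuracy comparison described in the abstract concrete, the following is a minimal sketch of a Norvig-style single-edit corrector and an accuracy measure over (misspelled, intended) word pairs. It is not the paper's implementation: the function names, the toy frequency table, and the test pairs are hypothetical and chosen only for illustration, and the paper's reported percentages come from its own annotated corpus, which is not reproduced here.

```python
from collections import Counter

# Lowercase Azerbaijani Latin alphabet (32 letters).
AZ_ALPHABET = "abcçdeəfgğhxıijkqlmnoöprsştuüvyz"


def edits1(word, alphabet=AZ_ALPHABET):
    """Return all strings one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + ch + right[1:]
                for left, right in splits if right for ch in alphabet]
    inserts = [left + ch + right for left, right in splits for ch in alphabet]
    return set(deletes + transposes + replaces + inserts)


def correct(word, freq):
    """Pick the most frequent known word within one edit, else return the input."""
    if word in freq:
        return word
    candidates = [w for w in edits1(word) if w in freq]
    if not candidates:
        return word
    return max(candidates, key=freq.get)


def accuracy(pairs, freq):
    """Fraction of (misspelled, intended) pairs that are corrected exactly."""
    hits = sum(correct(wrong, freq) == right for wrong, right in pairs)
    return hits / len(pairs)


# Hypothetical toy data: a tiny frequency table and four test pairs.
toy_freq = Counter({"kitab": 5, "məktəb": 3, "şəhər": 4})
toy_pairs = [
    ("kitap", "kitab"),    # one replacement
    ("mektəb", "məktəb"),  # one replacement involving ə
    ("şəhr", "şəhər"),     # one insertion
    ("ktb", "kitab"),      # two edits away, so this sketch cannot recover it
]
print(accuracy(toy_pairs, toy_freq))  # 0.75
```

Because the sketch only matches whole surface forms within one edit, any inflected form missing from the frequency table is left uncorrected, which illustrates the dictionary and corpus dependency the abstract attributes to these methods for an agglutinative language.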