LINGUA – AN ARCHITECTURE FOR ROBUST TEXT PROCESSING IN BULGARIAN
At. Totkov
The University of Plovdiv
Chr. T. Tanev
The University of Plovdiv
Keywords: robust processing environment, text processing, POS tagging, sentence splitting, clause segmentation, NP extraction, anaphora resolution
This paper describes LINGUA, an architecture for text processing in Bulgarian. The pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, part-of-speech tagging, clause chunking, noun phrase extraction and anaphora resolution are outlined. Evaluation results are reported for each processing task.