Proceedings 2002

LINGUA – AN ARCHITECTURE FOR ROBUST TEXT PROCESSING IN BULGARIAN

 

 At. Totkov

The University of Plovdiv

totkov@pu.acad.bg

 

Chr. T. Tanev

The University of Plovdiv

htanev@yahoo.co.uk

 

 

Keywords: robust processing environment, text processing, POS tagging, sentence splitting, clause segmentation, NP extraction, anaphora resolution

 

This paper describes LINGUA, an architecture for text processing in Bulgarian. It outlines the pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, part-of-speech tagging, clause chunking, noun phrase extraction and anaphora resolution. Evaluation results are reported for each processing task.
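As a rough illustration of how such a chain of pre-processing modules might be composed, the sketch below passes a document record through a list of stages in order. It is not LINGUA's implementation: the names `tokenise`, `split_sentences`, `run_pipeline` and the dictionary-based document record are assumptions made for this example, and only the first two stages are shown, with deliberately naive rules.

```python
import re
from typing import Any, Callable, Dict, List

# Hypothetical document record shared by all stages (an assumption,
# not LINGUA's actual data model).
Doc = Dict[str, Any]


def tokenise(doc: Doc) -> Doc:
    # Naive tokeniser: runs of word characters, or single punctuation marks.
    doc["tokens"] = re.findall(r"\w+|[^\w\s]", doc["text"])
    return doc


def split_sentences(doc: Doc) -> Doc:
    # Naive splitter: close a sentence after each ".", "!" or "?" token.
    sentences: List[List[str]] = []
    current: List[str] = []
    for tok in doc["tokens"]:
        current.append(tok)
        if tok in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    doc["sentences"] = sentences
    return doc


def run_pipeline(text: str, stages: List[Callable[[Doc], Doc]]) -> Doc:
    # Apply each stage in order; later stages read what earlier ones wrote.
    doc: Doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc


# Further stages (POS tagging, clause chunking, NP extraction, anaphora
# resolution) would be appended to this list in the same way.
STAGES = [tokenise, split_sentences]

result = run_pipeline("Това е тест. Работи ли?", STAGES)
```

Keeping each module as a function over a shared record means a later stage can be replaced or evaluated on its own, which fits the per-task evaluation mentioned in the abstract.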

 

The full text of the paper can be downloaded here:

TotkovTanev.zip