TY - GEN
T1 - A diagnostic evaluation approach for English to Hindi MT using linguistic checkpoints and error rates
AU - Balyan, Renu
AU - Naskar, Sudip Kumar
AU - Toral, Antonio
AU - Chatterjee, Niladri
PY - 2013
Y1 - 2013
AB - This paper addresses diagnostic evaluation of machine translation (MT) systems for Indian languages, English-to-Hindi MT in particular, assessing the performance of MT systems on relevant linguistic phenomena (checkpoints). We use the diagnostic evaluation tool DELiC4MT to analyze the performance of MT systems on various PoS categories (e.g. nouns, verbs). The current system supports only word-level checkpoints, which may be less helpful for evaluating translation quality than phrase-level checkpoints and checkpoints that deal with named entities (NEs), inflections, word order, etc. We therefore suggest phrase-level checkpoints and NEs as additional checkpoints for DELiC4MT. We further use Hjerson to evaluate checkpoints based on word order and inflections, which are relevant for evaluating MT with Hindi as the target language. The experiments conducted using Hjerson generate overall (document-level) error counts and error rates for five error classes (inflectional errors, reordering errors, missing words, extra words, and lexical errors), so that the evaluation takes word order and inflections into account. The effectiveness of the approaches was tested on five English-to-Hindi MT systems.
KW - DELiC4MT
KW - Hjerson
KW - automatic evaluation metrics
KW - checkpoints
KW - diagnostic evaluation
KW - errors
UR - https://www.scopus.com/pages/publications/84875551042
U2 - 10.1007/978-3-642-37256-8_24
DO - 10.1007/978-3-642-37256-8_24
M3 - Conference contribution
SN - 9783642372551
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 285
EP - 296
BT - Computational Linguistics and Intelligent Text Processing - 14th International Conference, CICLing 2013, Proceedings
T2 - 14th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2013
Y2 - 24 March 2013 through 30 March 2013
ER -