TY - GEN
T1 - Markov random field based text identification from annotated machine printed documents
AU - Peng, Xujun
AU - Setlur, Srirangaraj
AU - Govindaraju, Venu
AU - Sitaram, Ramachandrula
AU - Bhuvanagiri, Kiran
PY - 2009
Y1 - 2009
N2 - In this paper, we describe an approach to segment handwritten text, machine printed text and noise from annotated machine printed documents. Three categories of word level features are extracted. We use a modified K-Means clustering algorithm for classification followed by a relabeling procedure using Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
AB - In this paper, we describe an approach to segment handwritten text, machine printed text and noise from annotated machine printed documents. Three categories of word level features are extracted. We use a modified K-Means clustering algorithm for classification followed by a relabeling procedure using Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
UR - https://www.scopus.com/pages/publications/71249143497
U2 - 10.1109/ICDAR.2009.237
DO - 10.1109/ICDAR.2009.237
M3 - Conference contribution
SN - 9780769537252
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 431
EP - 435
BT - ICDAR2009 - 10th International Conference on Document Analysis and Recognition
T2 - ICDAR2009 - 10th International Conference on Document Analysis and Recognition
Y2 - 26 July 2009 through 29 July 2009
ER -