Effective and efficient video text extraction using key text points

Z. Li, G. Liu, X. Qian, D. Guo, H. Jiang

Research output: Contribution to journal › Article › peer-review

22 Scopus citations


Text in video carries important clues for video analysis, indexing and retrieval, and extracting it effectively and efficiently remains a challenging and significant problem. This study proposes a video text extraction scheme based on key text points (KTPs). A KTP is defined as a point that has strong textural structure in multiple directions simultaneously. First, KTPs are acquired from the three high-frequency subbands of the wavelet transform. An anti-texture-direction-projection method is then proposed to improve the accuracy of text localisation and verification. Second, the difference at the KTPs between neighbouring frames is used as the similarity measure in text tracking, which reduces the influence of dramatic background variation. Finally, the total number of KTPs in each connected component is calculated to remove background interference and to extract the text for segmentation. Experimental results show that the proposed text detection and localisation algorithm is robust to the font size, style, colour and alignment of texts. Text tracking significantly improves the efficiency of text detection, and the KTP-based similarity measure decreases the influence of scene changes and motion. The proposed text segmentation performs promisingly on video scenes with complex backgrounds and low contrast between text and background.
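The two KTP-based steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state the wavelet, the threshold or the exact quantity differenced between frames, so everything below is an assumption made for illustration: a single-level Haar transform, a fixed threshold applied to all three detail subbands, and a mean absolute difference of 2x2 block intensities at the KTP positions. The function names (`haar_detail_subbands`, `detect_ktps`, `ktp_frame_difference`) are hypothetical and are not taken from the paper.

```python
def haar_detail_subbands(img):
    """Single-level 2D Haar transform of a 2D list with even dimensions.

    Returns the three high-frequency (detail) subbands as half-size 2D lists:
    horizontal-edge, vertical-edge and diagonal responses.
    """
    horiz, vert, diag = [], [], []
    for r in range(0, len(img), 2):
        h_row, v_row, d_row = [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            h_row.append((a + b - d - e) / 2.0)  # top row vs bottom row
            v_row.append((a - b + d - e) / 2.0)  # left column vs right column
            d_row.append((a - b - d + e) / 2.0)  # diagonal vs anti-diagonal
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return horiz, vert, diag


def detect_ktps(img, threshold):
    """Return subband coordinates whose response is strong in all three
    detail subbands at once (assumed reading of 'multiple directions')."""
    horiz, vert, diag = haar_detail_subbands(img)
    ktps = set()
    for r in range(len(horiz)):
        for c in range(len(horiz[0])):
            if (abs(horiz[r][c]) > threshold
                    and abs(vert[r][c]) > threshold
                    and abs(diag[r][c]) > threshold):
                ktps.add((r, c))
    return ktps


def block_mean(img, r, c):
    """Mean intensity of the 2x2 pixel block behind subband position (r, c)."""
    return (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
            + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0


def ktp_frame_difference(prev_frame, next_frame, ktps):
    """Mean absolute intensity difference between two frames, measured only
    at the KTP positions (an assumed form of the similarity measure)."""
    if not ktps:
        return 0.0
    total = sum(abs(block_mean(prev_frame, r, c) - block_mean(next_frame, r, c))
                for r, c in ktps)
    return total / len(ktps)
```

Because the difference is evaluated only at the KTPs, pixels that change elsewhere in the frame do not contribute to it, which is one way to read the abstract's claim that the measure reduces the influence of background variation. A real implementation would likely use a wavelet library and thresholds tuned per subband.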

Original language: English (US)
Pages (from-to): 671-683
Number of pages: 13
Journal: IET Image Processing
Issue number: 8
State: Published - Dec 2011
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering

