Effective and efficient video text extraction using key text points

Z. Li, G. Liu, X. Qian, D. Guo, Hangyi Jiang

Research output: Contribution to journal (Article)

Abstract

Text information contains important clues for video analysis, indexing and retrieval, and extracting it both effectively and efficiently remains a challenging and significant problem. To address this, the study proposes a video text extraction scheme using key text points (KTPs). A KTP is defined as a point that has strong textural structure in multiple directions simultaneously. First, KTPs are acquired from the three high-frequency subbands obtained from the wavelet transform, and an anti-texture-direction-projection method is then proposed to improve the accuracy of text localisation and verification. Second, the difference at the KTPs between neighbouring frames is calculated as the similarity measure in text tracking, which can reduce the influence of dramatic background variation. Finally, the total number of KTPs in each connected component is calculated to remove background interference and to extract texts for text segmentation. Experimental results show that the proposed text detection and localisation algorithm is robust to the font size, style, colour and alignment of texts. Text tracking significantly improves the efficiency of text detection, and the similarity measure based on KTPs decreases the influence of scene changes and motion. The proposed text segmentation performs promisingly on video scenes with complex backgrounds and low contrast between texts and backgrounds.
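
The three stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the wavelet, the thresholds or the exact similarity formula, so the code assumes a one-level Haar transform, a single magnitude threshold shared by all three detail subbands, and a mean absolute intensity difference as the tracking measure. All function names (`haar_details`, `key_text_points`, `ktp_difference`, `keep_text_components`) and parameters are hypothetical, and the anti-texture-direction-projection step is not covered.

```python
def haar_details(img):
    """One-level 2D Haar transform of a grey image (list of equal-length rows,
    even height and width). Returns three detail subbands at half resolution."""
    horiz, vert, diag = [], [], []
    for y in range(0, len(img), 2):
        h_row, v_row, d_row = [], [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            h_row.append((a + b - c - d) / 4.0)  # responds to horizontal edges
            v_row.append((a - b + c - d) / 4.0)  # responds to vertical edges
            d_row.append((a - b - c + d) / 4.0)  # responds to diagonal structure
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return horiz, vert, diag


def key_text_points(img, threshold):
    """Assumed KTP rule: a subband position is a KTP when all three detail
    coefficients exceed the threshold in magnitude at the same time."""
    horiz, vert, diag = haar_details(img)
    return {
        (y, x)
        for y in range(len(horiz))
        for x in range(len(horiz[0]))
        if min(abs(horiz[y][x]), abs(vert[y][x]), abs(diag[y][x])) > threshold
    }


def ktp_difference(frame_a, frame_b, ktps):
    """Assumed tracking measure: mean absolute intensity difference, taken only
    over the 2x2 pixel blocks that correspond to KTP positions."""
    if not ktps:
        return 0.0
    total = 0.0
    for y, x in ktps:
        for dy in (0, 1):
            for dx in (0, 1):
                py, px = 2 * y + dy, 2 * x + dx
                total += abs(frame_a[py][px] - frame_b[py][px])
    return total / (4 * len(ktps))


def keep_text_components(components, ktps, min_ktps):
    """Keep connected components (sets of subband positions) that contain at
    least min_ktps KTPs; the rest are treated as background."""
    return [comp for comp in components if len(comp & ktps) >= min_ktps]
```

Because `ktp_difference` only looks at pixels under the KTPs, a change confined to the background leaves the measure untouched, which is one way to read the abstract's claim about reduced sensitivity to background variation. Component labelling itself is left to the caller here; any standard connected-component routine could supply the `components` argument.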

Original language: English (US)
Pages (from-to): 671-683
Number of pages: 13
Journal: IET Image Processing
Volume: 5
Issue number: 8
DOI: 10.1049/iet-ipr.2010.0397
State: Published - Dec 2011
Externally published: Yes

Fingerprint

Wavelet transforms
Textures
Color
Processing

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Software
  • Computer Vision and Pattern Recognition

Cite this

Effective and efficient video text extraction using key text points. / Li, Z.; Liu, G.; Qian, X.; Guo, D.; Jiang, Hangyi.

In: IET Image Processing, Vol. 5, No. 8, 12.2011, p. 671-683.

@article{7c8e700adb1740b69a31ea212b9f1196,
title = "Effective and efficient video text extraction using key text points",
abstract = "Text information contains important clues for video analysis, indexing and retrieval. Effective and efficient text extraction has been a challenging and significant topic. Focusing on this issue, this study proposes a video text extraction scheme using key text points (KTPs). A KTP is defined as the point that has strong textural structure in multi-directions simultaneously. First, a KTP can be acquired by the three high-frequency subbands obtained from the wavelet transform. An anti-texture-direction-projection method is then proposed to improve the accuracy of text localisation and verification. Second, the difference at the KTPs between neighbouring frames is calculated as the similarity measure in text tracking, which can reduce the influence of dramatic background variation. Finally, the total number of the KTPs in each connected component is calculated to remove the background interference and to extract texts for text segmentation. Experimental results show that the proposed text detection and localisation algorithm is robust to the font size, style, colour and alignment of texts. Text tracking significantly improves the efficiency of text detection, and the similarity measure based on KTPs decreases the influence of scene changes and motions. The proposed text segmentation has a promising performance when processing video scenes with complex backgrounds and low contrast between texts and backgrounds.",
author = "Z. Li and G. Liu and X. Qian and D. Guo and Hangyi Jiang",
year = "2011",
month = "12",
doi = "10.1049/iet-ipr.2010.0397",
language = "English (US)",
volume = "5",
pages = "671--683",
journal = "IET Image Processing",
issn = "1751-9659",
publisher = "Institution of Engineering and Technology",
number = "8",

}

TY - JOUR

T1 - Effective and efficient video text extraction using key text points

AU - Li, Z.

AU - Liu, G.

AU - Qian, X.

AU - Guo, D.

AU - Jiang, Hangyi

PY - 2011/12

Y1 - 2011/12

AB - Text information contains important clues for video analysis, indexing and retrieval. Effective and efficient text extraction has been a challenging and significant topic. Focusing on this issue, this study proposes a video text extraction scheme using key text points (KTPs). A KTP is defined as the point that has strong textural structure in multi-directions simultaneously. First, a KTP can be acquired by the three high-frequency subbands obtained from the wavelet transform. An anti-texture-direction-projection method is then proposed to improve the accuracy of text localisation and verification. Second, the difference at the KTPs between neighbouring frames is calculated as the similarity measure in text tracking, which can reduce the influence of dramatic background variation. Finally, the total number of the KTPs in each connected component is calculated to remove the background interference and to extract texts for text segmentation. Experimental results show that the proposed text detection and localisation algorithm is robust to the font size, style, colour and alignment of texts. Text tracking significantly improves the efficiency of text detection, and the similarity measure based on KTPs decreases the influence of scene changes and motions. The proposed text segmentation has a promising performance when processing video scenes with complex backgrounds and low contrast between texts and backgrounds.

UR - http://www.scopus.com/inward/record.url?scp=79960811959&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79960811959&partnerID=8YFLogxK

U2 - 10.1049/iet-ipr.2010.0397

DO - 10.1049/iet-ipr.2010.0397

M3 - Article

VL - 5

SP - 671

EP - 683

JO - IET Image Processing

JF - IET Image Processing

SN - 1751-9659

IS - 8

ER -