Abstract
Text in video carries important cues for video analysis, indexing and retrieval, yet extracting it effectively and efficiently remains a challenging and significant problem. This study proposes a video text extraction scheme based on key text points (KTPs). A KTP is defined as a point that has strong textural structure in multiple directions simultaneously. First, KTPs are obtained from the three high-frequency subbands of the wavelet transform, and an anti-texture-direction-projection method is proposed to improve the accuracy of text localisation and verification. Second, the difference at the KTPs between neighbouring frames is used as the similarity measure in text tracking, which reduces the influence of dramatic background variation. Finally, the total number of KTPs in each connected component is calculated to remove background interference and extract the text in the segmentation stage. Experimental results show that the proposed text detection and localisation algorithm is robust to the font size, style, colour and alignment of text. Text tracking significantly improves the efficiency of text detection, and the KTP-based similarity measure reduces the influence of scene changes and motion. The proposed text segmentation performs well on video scenes with complex backgrounds and low contrast between text and background.
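The KTP idea can be illustrated with a minimal sketch. It assumes a one-level Haar wavelet, a single magnitude threshold shared by all three high-frequency subbands, and a mean absolute difference on the low-frequency subband as the tracking measure. The abstract does not specify any of these choices, and the function names `haar_subbands`, `key_text_points` and `ktp_difference` are hypothetical.

```python
import numpy as np

def haar_subbands(img):
    """One-level 2D Haar transform (assumed wavelet); returns (LL, LH, HL, HH)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a, b = img[0::2, 0::2], img[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # row difference: responds to horizontal edges
    hl = (a - b + c - d) / 2.0  # column difference: responds to vertical edges
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def key_text_points(img, thresh):
    """Boolean mask (half resolution) of points that are strong in all three
    high-frequency subbands at once. The shared threshold is an assumption."""
    _, lh, hl, hh = haar_subbands(img)
    return (np.abs(lh) > thresh) & (np.abs(hl) > thresh) & (np.abs(hh) > thresh)

def ktp_difference(frame_a, frame_b, ktp_mask):
    """Mean absolute LL difference between two frames, measured only at KTPs
    (an assumed form of the tracking similarity measure)."""
    if not ktp_mask.any():
        return 0.0
    ll_a = haar_subbands(frame_a)[0]
    ll_b = haar_subbands(frame_b)[0]
    return float(np.mean(np.abs(ll_a - ll_b)[ktp_mask]))
```

In this sketch a plain vertical stripe pattern excites only one subband and yields no KTPs, whereas an isolated corner-like point excites all three. Because the frame difference is evaluated only at KTP positions, changes elsewhere in the background do not affect it.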
Original language | English (US) |
---|---|
Pages (from-to) | 671-683 |
Number of pages | 13 |
Journal | IET Image Processing |
Volume | 5 |
Issue number | 8 |
State | Published - Dec 2011 |
Externally published | Yes |
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering