CCExtractor Development

Hard: Solve an OCR issue

Possibly medium-hard, so we recommend you only take this task if you have already been successful with a few others.

And we definitely recommend it if you want to be a winner :-)

A user reports this (https://github.com/CCExtractor/ccextractor/issues/840):

unable to extract multiple lines from DVB-sub using OCR

There are three problems with ccextractor's output from this file:

  • Timing is slightly off (really not a huge deal, it's only off by 10-30 ms)
  • The last caption (the twenty-fifth caption) is missing from the output completely
  • Only the first line of text from each caption is shown (this is the biggest problem)

Task tags

  • bugfix
  • ocr
  • hard
  • dvb

Students who completed this task

Harry Yu, grayv

Task type

  • Code
  • Quality Assurance

2017