CCAligner: Allow passing raw text transcript
CCAligner uses subtitles to do a "guided search" for speech in the input audio file. Sometimes, however, only a raw text transcript is available: plain text containing the spoken words, with no timing information or formatting. CCAligner should be able to handle that case, at least for the -transcribe parameter.
Allow passing a text transcript directly instead of subtitles. The grammar files should be generated from this file. This task requires at least some understanding of how CCAligner works, so it is recommended to study how the program works before attempting it.
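As a starting point, the raw transcript has to be reduced to a plain sequence of words before any grammar can be built from it. The following is a minimal sketch of that step only, not CCAligner's actual code: the function name tokenizeTranscript is hypothetical, and the normalisation rules (lowercase, keep ASCII letters, digits and apostrophes) are assumptions that may need adjusting to match the existing subtitle pipeline.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of CCAligner: split a raw transcript into
// lowercase words. Any character that is not an ASCII letter, digit or
// apostrophe is treated as a word separator.
// Limitation (assumption of this sketch): in the default "C" locale,
// non-ASCII bytes also act as separators, so UTF-8 words would be split.
std::vector<std::string> tokenizeTranscript(const std::string &text)
{
    std::vector<std::string> words;
    std::string current;

    for (unsigned char ch : text)
    {
        if (std::isalnum(ch) || ch == '\'')
        {
            current.push_back(static_cast<char>(std::tolower(ch)));
        }
        else if (!current.empty())
        {
            // A separator ends the word collected so far.
            words.push_back(current);
            current.clear();
        }
    }

    // Flush the last word if the text does not end with a separator.
    if (!current.empty())
        words.push_back(current);

    return words;
}
```

The resulting word list could then be fed to whatever code currently builds the grammar files from subtitle text; how that hand-off looks depends on CCAligner's internals and is left open here.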
For this task, add a new parameter, -txt, through which the user passes a raw text file to CCAligner. When this mode is chosen, do not allow normal word-level synchronisation; allow only complete timed transcription (the -transcribe parameter).
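The restriction described above could be enforced while parsing the command line. The sketch below is only an illustration under stated assumptions: the Options struct and parseArgs function are hypothetical and do not reflect CCAligner's real argument parser, -txt is assumed to be followed by a file path, and -transcribe is assumed to be a plain switch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical option set for this sketch; CCAligner's real parameters differ.
struct Options
{
    bool rawText = false;        // true when -txt was given
    bool transcribe = false;     // true when -transcribe was given
    std::string transcriptPath;  // path that followed -txt
};

// Hypothetical parser: recognises only -txt <path> and -transcribe,
// ignores everything else, and rejects -txt without -transcribe.
Options parseArgs(const std::vector<std::string> &args)
{
    Options opts;

    for (std::size_t i = 0; i < args.size(); ++i)
    {
        if (args[i] == "-txt")
        {
            if (i + 1 >= args.size())
                throw std::invalid_argument("-txt requires a file path");
            opts.rawText = true;
            opts.transcriptPath = args[++i];
        }
        else if (args[i] == "-transcribe")
        {
            opts.transcribe = true;
        }
    }

    // Raw text carries no timing, so word-level synchronisation is not allowed.
    if (opts.rawText && !opts.transcribe)
        throw std::invalid_argument("-txt is only supported together with -transcribe");

    return opts;
}
```

Failing early with a clear message keeps the raw-text path from reaching the word-level synchronisation code at all, which seems simpler than guarding against it later.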