Generate timed-text synchronised with Text-to-Speech word-by-word?


How can I generate timed-text (e.g. for subtitles) synchronised with Text-to-Speech (TTS) word-by-word?
I'd like to do this using the high-quality SAPI5 voices (e.g. those available from IVONA here) that I have been using on Windows 10.
On Windows we already have some good free TTS programs:
Read4Me - open source
Balabolka - closed source
TTSApp - Microsoft's own very basic GUI, currently available here; it seems to date from 2001.
TTSApp can produce audio files in WAV format. Balabolka creates MP3 files along with synchronised timed-text as LRC files (the format used for karaoke), BUT only on a line-by-line basis, NOT word-by-word.
However, both programs highlight each word on screen, in real time, as they speak it aloud.
If I had some TTS/SAPI5 source code I could simply check the clock every time a new word starts to be generated and write the time and that word to a file. Does anyone know of any project that exposes that level of programming - so I might start from there?
UPDATE SEPT 2016
I've since discovered the TTSApp was reimplemented using AutoHotKey by a certain jballi in 2012.
I've adapted that code to append to a text file the time in ms every time the onWord event handler fires.
I still need to make two passes:
a rapid automated pass to save the WAV file and
a slow (real-time) pass that creates the timing file.
I am still hoping to find a way to accelerate the second pass.
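The real-time timing pass described in this update could look roughly like the sketch below (AutoHotkey v1.1). It is an illustration under stated assumptions, not the adapted TTSApp code itself: it assumes the voice's events are connected with ComObjConnect(SpVoice, "On"), so that the SAPI Word event arrives in a function named OnWord, and the names Text, TimingFile and StartTick are hypothetical.

```autohotkey
; Minimal sketch for AutoHotkey v1.1 (not the adapted TTSApp script).
; Text, TimingFile and StartTick are hypothetical names chosen for illustration.
#Persistent   ; keep the script alive so COM events can be delivered

Text := "The quick brown fox jumps over the lazy dog."
TimingFile := A_ScriptDir . "\timing.txt"

SpVoice := ComObjCreate("SAPI.SpVoice")
ComObjConnect(SpVoice, "On")   ; route SAPI events to functions prefixed "On"

StartTick := A_TickCount       ; reference point taken just before speaking
SpVoice.Speak(Text, 1)         ; 1 = SVSFlagsAsync, so events fire while speaking
return

; Called by SAPI each time a new word starts to be spoken.
OnWord(StreamNum, StreamPos, CharPos, Length)
{
    global Text, TimingFile, StartTick
    Elapsed := A_TickCount - StartTick          ; ms since Speak was called
    Word := SubStr(Text, CharPos + 1, Length)   ; CharPos is zero-based
    FileAppend, %Elapsed%`t%Word%`n, %TimingFile%
}
```

Each line of the timing file then holds a millisecond offset, a tab, and the word. A_TickCount typically has a resolution of only 10 to 16 ms, so the timestamps are approximate.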
BTW The VisualBasic source appears to be archived here.
It is possible to do all of this offline!
You generate a WAV file using SAPI with the DoEvents argument of SpFileStream.Open set to True - documented here.
A binary representation of each event (e.g. phoneme/word/sentence) gets appended to the end of the WAV file. A certain Hans documented the WAV/SAPI format in 2009 here.
This can all be done with a simple modification of jballi's 2012 AutoHotkey version of TTSApp.
Basically, you replace these lines of code in Example1GUI.ahk:
SpFileStream.Open(SaveToFileName,SSFMCreateForWrite,False)
;-- Set the output stream to the file stream
SpVoice.AllowAudioOutputFormatChangesOnNextSet:=False
SpVoice.AudioOutputStream:=SpFileStream
;-- Speak using the given flags
SpVoice.Speak(Text,SpeakFlags)
with the following:
SpFileStream.Open(SaveToFileName,SSFMCreateForWrite,True) ;-- DoEvents
;-- Set the output stream to the file stream
SpVoice.AllowAudioOutputFormatChangesOnNextSet:=False
SpVoice.AudioOutputStream:=SpFileStream
if not Sink ;-- DoEvents label
{
    ComObjConnect(SpVoice, "On")
    Sink:=True
}
;-- Speak using the given flags
SpVoice.Speak(Text,SpeakFlags|SVSFlagsAsync|SVSFPurgeBeforeSpeak)
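Because the audio goes to a file rather than to the speakers, a wall-clock timestamp taken in the event handler would be meaningless. One possible way to get word timings in the same pass is to derive them from the stream position that SAPI passes to the Word event. The handler below is only a sketch (AutoHotkey v1.1) meant to be merged into the script next to the replacement code above. It assumes that StreamPos is a byte offset into the rendered audio, and the globals TimingFile and BytesPerSec are hypothetical names that are not part of jballi's script.

```autohotkey
; Sketch for AutoHotkey v1.1. TimingFile and BytesPerSec are hypothetical
; globals; Text is the text passed to SpVoice.Speak.
; Before calling Speak, cache the data rate of the output format, e.g.:
;   BytesPerSec := SpFileStream.Format.GetWaveFormatEx().AvgBytesPerSec

; Receives the SAPI Word event once ComObjConnect(SpVoice, "On") is in effect.
OnWord(StreamNum, StreamPos, CharPos, Length)
{
    global Text, TimingFile, BytesPerSec
    ; Assumption: StreamPos is the byte offset of the word in the audio stream,
    ; so dividing by the bytes-per-second rate gives the time offset.
    Ms := Round(StreamPos * 1000 / BytesPerSec)
    Word := SubStr(Text, CharPos + 1, Length)   ; CharPos is zero-based
    FileAppend, %Ms%`t%Word%`n, %TimingFile%
}
```

If the stream-position assumption does not hold for a particular voice, the timings can instead be recovered from the event data embedded in the WAV file, as described above.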
