04-06-2020, 07:57 AM
(04-05-2020, 07:30 PM)jwhitten Wrote: Just out of curiosity, how are you handling the 'read aloud' part now? Do you 'chunk it up' into discrete amounts, or just keep shoving text until it blocks?
It's a bit intricate - but basically the algorithm is:
1) select the next paragraph of text
2) construct an SSML string for the selected paragraph of text, including 'marker' elements to show the location of any page-breaks in the text
3) submit the string to the speech engine
4) whenever the speech engine raises a 'marker reached' event, switch the page as necessary, and re-apply the 'selected text' overlay
5) when the speech engine raises an 'end of utterance' event, figure out whether I am meant to keep reading; if so, select the next paragraph and go to (2)
Complicating factors include the facts that:
- In UWP, you need to set up a MediaPlayer separately from the speech engine, to play the speech sound stream and some of those events come from the MediaPlayer object, not the speech engine, and sometimes a MediaPlayer will freeze if fed two speech streams immediately one after another (so I build a round-robin pool of MediaPlayers), and some of these actions must be done on the UI thread, while others must not be.
- In Android, there are many different speech engines that might be installed, and they have different features. None of them provides 'marker' functionality, but some of them provide a 'progress' event, which triggers repeatedly during the utterance (typically at the end of each phoneme) - so if the speech engine provides that feature, I can use that to do logic like step (4). Of course, there is no reliable declarative mechanism to determine whether a particular Android speech engine does or does not support the feature, so I have to test it dynamically. If the feature isn't supported, then I need to use a work-around (basically, I read one sentence at a time, rather than one paragraph at a time).
Issues like these would lead me to believe that adding another speech engine (and moreover one that is in the cloud) might not be entirely straightforward ;-)