PC speaker speech resynthesized

The other day, Disasterpeace mentioned an idea about speech synthesis based on pulse waves, and I remembered playing around with something like that back in 1995, while doing sound effects for a little university project game.

I actually found the original code and ported it to ChipSound! The original was written in Borland Pascal on DOS, using the sound, nosound and delay calls. It’s basically PWM/hardsync, with occasional noise generated by randomly modulating the frequency, and all “commands” issued with 1 ms granularity. (The exception is the “m”, where I was apparently creating a half-amplitude pulse by setting the frequency to 40 kHz, whereas the ChipSound version just creates a very short pulse.)
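To make the "commands at 1 ms granularity" idea a bit more concrete, here is a rough Python sketch. It is not the TALK.PAS or talk.csl code, just an illustration under some assumptions: the function name `render_speaker`, the `(freq_hz, duration_ms, noise_depth)` command format, the 48 kHz sample rate and all numbers in the example are made up here. It renders a plain 50% square wave at fixed amplitude, treats `None` as the equivalent of nosound, and fakes noise by picking a new random frequency every millisecond. PWM and hardsync are left out to keep it short.

```python
import random


def render_speaker(commands, sample_rate=48000, seed=1995):
    """Render (freq_hz, duration_ms, noise_depth) commands to a list of samples.

    Hypothetical command format, not taken from the original sources:
      freq_hz     -- tone frequency in Hz, or None for silence ("nosound")
      duration_ms -- how many 1 ms ticks the command lasts ("delay")
      noise_depth -- 0.0 for a clean tone; larger values randomly modulate
                     the frequency on every tick, which sounds noisy
    Output samples are always +1.0, -1.0, or 0.0 (silence).
    """
    rng = random.Random(seed)
    samples_per_ms = sample_rate // 1000
    out = []
    phase = 0.0  # position within the current cycle, 0.0 .. 1.0

    for freq_hz, duration_ms, noise_depth in commands:
        for _ in range(duration_ms):
            if freq_hz is None:
                # Silence for this tick; restart the waveform afterwards.
                out.extend([0.0] * samples_per_ms)
                phase = 0.0
                continue
            # Pick this tick's frequency; with noise_depth == 0 it is constant.
            f = freq_hz * (1.0 + rng.uniform(-noise_depth, noise_depth))
            step = f / sample_rate
            for _ in range(samples_per_ms):
                out.append(1.0 if phase < 0.5 else -1.0)
                phase = (phase + step) % 1.0
    return out


# A made-up "vowel, gap, hiss" sequence, purely for illustration.
demo = render_speaker([
    (200.0, 30, 0.0),   # 30 ms clean tone
    (None, 10, 0.0),    # 10 ms silence
    (3000.0, 20, 0.8),  # 20 ms of frequency-jittered "noise"
])
```

The resulting list can be scaled to 16-bit integers and written out with Python's standard `wave` module if you want to listen to it.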

It’s all very primitive, and there’s a lot more tweaking that could be done – like adding multiple formants to improve the vowels – but the point is, it seems like one might actually get away with synthesizing speech using only fixed amplitude pulse waves. :)

A recording of the ChipSound version, first with some flanger and delay effects, and then mostly dry:
pc-bot-emergency.mp3

Original Borland Pascal source code: TALK.PAS

ChipSound source code: talk.csl

The thing is trying to say some random nonsense:

Emergency! Emergency! Emergency!
I am PC! I seek revenge!
This is a real emergency!

David

About Olofson

Founder of Olofson Arcade.

10 Responses to PC speaker speech resynthesized

  1. qubodup says:

    The talk.csl is much clearer than the ChipSound music sources I have looked at so far! This should definitely be in part 1 of ChipSound’s tutorial, should there ever be one. ;)

    Would you mind slapping a foss license on the talk.csl file?

    • Olofson says:

      It’s a LOT simpler than any of the proper songs, or even the Kobo II sound effects, as it only uses a single voice with a single waveform at full amplitude. :-)

      And yeah, a bit of documentation and stuff might be nice – before I start forgetting stuff myself…! :-D

      No problem! Just didn’t think about that… zlib?

  2. qubodup says:

    Anything permissive would be great. Copyleft might be a problem (I for one would be confused as to whether or not it affects output :) ). Thanks! (I already noticed the license change)

    Creating documentation might be a good occasion to set up a revision control system. Github has static website publishing support in case you’re interested: http://pages.github.com/

    On the other hand, it might be a horrible idea to start learning new code management software and a new method mid-project – sorry for going on about it :)

    Blog posts are just as good. ;) We at joyridelabs use publicly shared Google Docs and mirror them on IndieDB’s tutorial section (I definitely prefer docs).

    • Olofson says:

      The output (as in, sound) should never be affected by the GPL or LGPL at least. Obviously, a player lib would need a different license to be seriously usable – which is why ChipSound is zlib. :-) Future external tools (tracker/sequencer with live script editor/debugger is something I really need ASAP) might be GPL, but that has no bearing on anything created with them.

      However, since the idea with ChipSound is primarily to distribute the .csl scripts rather than rendered audio, examples, instrument libraries etc should basically be in the public domain – where you can’t legally put anything in most countries…! Thus, zlib for now. Maybe some Creative Commons licenses would be more appropriate?

      Well, revision control might be a bit overkill for ChipSound at this point, but I guess I should start using something for the larger projects anyway… :-D Actually, there is a trac with SVN for Kobo Deluxe, so at least I’m vaguely familiar with that.

      Well there is also a (mostly very quiet) forum on this site. ;-) And of course, the documentation (currently a plain text file) should be updated and perhaps illustrated either way.

      • qubodup says:

        There actually is a PD-like cc license: http://creativecommons.org/publicdomain/zero/1.0/

        .csl scripts are code, so I think all other cc licenses might be a bad choice.

        • Olofson says:

          Yep – and it’s a license. ;-) Apparently, a “this is in the public domain” note doesn’t cut it; one needs to be more specific than that.

          Yes, definitely code, which is why zlib came to mind first. I use it much like any other language; either copy-pasting and modifying, or more properly reusing subprograms by calling or spawning them as is – and the idea is that the code is included in the final product, so it makes sense in that respect too.

  3. angros47 says:

    Looks really interesting… It reminds me of the old SPEECH.COM, by Andy McGuire.
    Your solution is lighter… unfortunately, it does not include all phonemes.

    Is there any chance to implement missing phonemes, to be able to say every phrase?

    • Olofson says:

      Thanks! Yeah, this is just a direct port and translation of an ancient fun hack, so it only covers a few random phonemes.

      One could certainly add more phonemes, but one might want to use more complex waveforms and/or some resonant filters for more intelligible speech. I’m planning on having a go at that using Audiality 2, but I’m not sure when I’ll get around to it. What I have in mind is something like the Black Mesa Announcement System from Half-Life.

  4. angros47 says:

    I ported it to FreeBASIC for DOS, and it works:

    http://freebasic.net/forum/viewtopic.php?f=4&t=22557
