Byungjun Kwon / Björn Erlach / Luc Döbereiner > Speech Recognition Techniques in Non-Speech Sound Systems

STEIM residency period: 20 to 31 August 2007
Participants: Byungjun Kwon, Björn Erlach, and Luc Döbereiner
Using the programmable VR Stamp Voice Recognition Module (photo below), we plan to build a system for creating music, text, and synthesized speech. The intended result is a piece situated between text and sound, between speech and music. Speech recognition techniques offer possibilities for working at these boundaries, because their sensing capabilities are not limited to speech and can be applied equally well to other sounds.

VR Stamp Toolkit
After five days of work, mainly soldering and programming, we have the VR Stamp speech recognition module talking to itself by means of its speaker-dependent continuous listening functions. Video02 shows a partly random concatenation of Latin phonemes, each chosen by the module recognizing its own output in a feedback loop. Video01 shows the training phase.

video01(training)

video02(latin)
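
The feedback loop described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code running on the VR Stamp: the phoneme list, the `recognize` stand-in with its `error_rate`, and the `next_phoneme` rule are all hypothetical, chosen to show how recognition errors on the module's own output could produce a partly random sequence.

```python
import random

# Hypothetical phoneme inventory; the trained set is not listed in this report.
PHONEMES = ["a", "e", "i", "o", "u", "ma", "te", "lu", "ri", "no"]


def recognize(spoken, rng, error_rate=0.3):
    """Toy stand-in for the recognizer (assumption, not the module's API).

    Returns the index of the spoken phoneme, or a random index with
    probability error_rate, imitating a misrecognition.
    """
    if rng.random() < error_rate:
        return rng.randrange(len(PHONEMES))
    return spoken


def next_phoneme(recognized):
    """Hypothetical rule: the recognized phoneme selects its successor."""
    return (recognized + 1) % len(PHONEMES)


def feedback_loop(start, steps, seed=0, error_rate=0.3):
    """Speak a phoneme, 'listen' to it, and let the result pick the next one."""
    rng = random.Random(seed)
    current = start
    sequence = [PHONEMES[current]]
    for _ in range(steps):
        heard = recognize(current, rng, error_rate)
        current = next_phoneme(heard)
        sequence.append(PHONEMES[current])
    return sequence


if __name__ == "__main__":
    print(" ".join(feedback_loop(start=0, steps=12)))
```

With `error_rate=0.0` the sequence simply cycles through the list; raising it makes the chain drift unpredictably, which is the kind of behaviour a self-listening recognizer would be expected to show.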

We looked into software speech recognition systems, which offer great flexibility, but we chose the VR Stamp module because its limitations and sound quality provide us with a framework that we can explore in ten days. As a next step, we want to extend the link to language, both in sound and in meaning.

Final report

STEIM

Studio for Electro-Instrumental Music
