From P2P Foundation

VoxForge = Open Content initiative to collect transcribed speech for use with Free and Open Source Speech Recognition Engines.

URL = http://www.voxforge.org/


"We will make available all submitted audio files under the GPL license, and then 'compile' them into Acoustic Models for use with Open Source Speech Recognition engines such as Sphinx, ISIP, Julius and HTK (note: HTK has distribution restrictions).

Why Do We Need Free GPL Speech Audio?

Most Acoustic Models used by 'Open Source' Speech Recognition engines are 'Closed Source'. They do not give you access to the speech audio and transcriptions (called Speech Corpus or Corpora) used to create the acoustic model.

The reason for this is because there is no free Speech Corpus in a form that can readily be used to create Acoustic Models for Speech Recognition Engines. Open Source projects are required to purchase Speech Corpora which has restrictive licensing (i.e. they are *not* permitted to distribute the 'source' speech audio, but can distribute the 'compiled' Acoustic Model)." (http://www.voxforge.org/)