Phone: (+7 495) 77-55-122, 77-55-123, 77-55-124.
Speech Technologies

Stel – Computer Systems has been developing speech technologies since 1996, including automatic speech recognition and synthesis systems and related tasks. We collaborate with leading scientists of the Philology Department of M.V. Lomonosov Moscow State University, the Computing Center of the Russian Academy of Sciences, and other institutions. Our commercial products include acoustic-phonetic speech databases that consist of read texts of different types together with the corresponding transcription material. The databases are well balanced in terms of their phonemic inventory. Time alignment and transcription were performed by expert phoneticians.
We have developed statistical Russian language models intended for use with large and very large vocabularies. Stel – Computer Systems is building automatic speech recognition and speaker identification systems based on advanced speech processing methods and hidden Markov models of triphones and biphones, i.e. context-dependent realizations of acoustic units.
Stel – Computer Systems' experience in speech recognition and synthesis provides a sound foundation for practical speech technology projects.
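
To illustrate what "context-dependent" units mean in practice, the following minimal sketch expands a phone sequence into triphone labels, falling back to biphones at the edges of the sequence where only one neighbour exists. The "left-phone+right" label notation and the function name are assumptions chosen for illustration; the page does not say which notation or toolkit the company uses.

```python
def to_context_units(phones):
    """Expand a phone sequence into context-dependent unit labels.

    Notation (an assumption, not taken from the source):
    "l-p+r" is phone p with left context l and right context r.
    Edge phones have only one neighbour and become biphones.
    """
    units = []
    for i, phone in enumerate(phones):
        label = phone
        if i > 0:
            label = f"{phones[i - 1]}-{label}"   # add left context
        if i < len(phones) - 1:
            label = f"{label}+{phones[i + 1]}"   # add right context
        units.append(label)
    return units


# Hypothetical three-phone sequence, for illustration only.
print(to_context_units(["d", "v", "a"]))
```

Each label would then correspond to one hidden Markov model, so the same phone is modelled separately for each neighbouring context.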


Russian Speech Database Description

Text Description

The database consists of five independent parts that differ in the type of text read and in the accompanying transcription material.

Database part and description:

The Tables
This part consists of 50 series of 10-11 sentences each (5 words per sentence on average). The material is balanced with respect to the phone set. Each speaker has an individual time-aligned phone transcription.

The Digits
The texts consist of various digit sequences. Each speaker read 5 texts (190 words per text on average) that differ in digit sequence and reading style (continuous speech or isolated words). The database has been used to test the existing model set and to refine models adapted for digit sequence recognition. Its phone transcriptions were generated by the transcription-maker application and are the same for all speakers. A time-aligned word transcription, produced by our Russian speech recognition software, is also available.

The Balanced Texts
Two stories (358 and 398 words). The texts are balanced with respect to the phone set. The database is used to train the existing model set. Its transcriptions were generated by the transcription-maker application and then adapted by phoneticians to each speaker.

The Texts
A set of 51 texts (520 words per text on average) based on Russian newspaper articles. The database is used and transcribed in the same way as The Balanced Texts.

The Sentences
The database consists of 50 series of 10-11 sentences each (5 words per sentence on average). The material is balanced with respect to the phone set. Transcriptions were generated by the transcription-maker software.


The general outline:

Database              Speakers   Males   Females   Total duration (s)
The Tables                   4       2         2                 4364
The Digits                  19      16         3                11057
The Balanced Texts          96      68        28                16880
The Texts                   96      68        28                25816
The Sentences               35      16        19                50281
Whole database             137      89        48               108398
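
As a quick sanity check of the outline, the per-part durations can be summed and converted to hours. This is only an arithmetic sketch over the figures listed; the variable names are illustrative.

```python
# Durations in seconds, copied from the general outline.
durations_s = {
    "The Tables": 4364,
    "The Digits": 11057,
    "The Balanced Texts": 16880,
    "The Texts": 25816,
    "The Sentences": 50281,
}

total_s = sum(durations_s.values())
total_hours = total_s / 3600

print(f"total: {total_s} s = {total_hours:.1f} h")
for part, seconds in durations_s.items():
    print(f"{part}: {100 * seconds / total_s:.1f}% of the recorded time")

# Note: the per-part speaker counts add up to 250, while the outline lists
# 137 speakers for the whole database. Presumably some speakers took part in
# several parts; the source does not state this explicitly.
```

The durations add up exactly to the 108398 s given for the whole database, which is roughly 30 hours of speech.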


Database preprocessing:

The Tables
There are 510 *.wav files for each speaker (one sentence per file). Each has an associated *.lab file containing a time-aligned phone transcription made by Russian phoneticians.
File name mask: NNnniis.wav (NNnniis.lab), where
NN - series number,
nn - sentence number,
ii - speaker ID,
s - speaker sex.

The Digits
There are 5 *.wav files containing digit sequences for each speaker. The associated *.lab files contain phone transcriptions generated by the transcription-maker software; these transcriptions are the same for all speakers. There are also *.rec files corresponding to the *.wav files, containing time-aligned word transcriptions produced by our Russian speech recognition software.
File name mask: 51nniis.wav (NNnniis.lab, NNnniis.rec), where
nn - text number,
ii - speaker ID.

The Balanced Texts
There is one *.wav file for each speaker, containing the balanced text that he or she read. The associated *.lab file contains a phone transcription generated by the transcription-maker software and adapted to the speaker by a Russian phonetician.
File name mask: 53nnii[i]s.wav (nnnnii[i]s.lab), where
nn - balanced text number,
ii[i] - speaker ID (two or three characters),
s - speaker sex.

The Texts
There is one *.wav file for each speaker, containing the text that he or she read. The associated *.lab file contains a phone transcription generated by the transcription-maker software and adapted to the speaker by a Russian phonetician.
File name mask: 54nnii[i]s.wav (Nnnnii[i]s.lab), where
nn - text number,
ii[i] - speaker ID (two or three characters),
s - speaker sex.

The Sentences
There are 510 *.wav files for each speaker (one sentence per file). The associated *.lab files contain phone transcriptions generated by the transcription-maker software; these transcriptions are the same for all speakers.
File name mask: NNnnii[i]s.wav (NNnnii[i]s.lab), where
NN - series number,
nn - sentence number,
ii[i] - speaker ID (two or three characters),
s - speaker sex.
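
The file name masks above can be split into their fields by position. The sketch below assumes that NN and nn are always two characters, that the speaker ID is two or three characters, and that the sex code is a single trailing character; the type and function names, the example file names, and the sex codes "m" and "f" are hypothetical, since the page does not list actual file names or codes.

```python
from pathlib import PurePath
from typing import NamedTuple


class RecordingName(NamedTuple):
    series: str      # NN: series number (fixed per part in some masks, e.g. 53)
    item: str        # nn: sentence or text number
    speaker_id: str  # ii or iii
    sex: str         # s: single-character code; actual values are not documented
    extension: str   # wav, lab, or rec


def parse_recording_name(filename):
    """Split a name such as NNnnii[i]s.wav into its fields by position."""
    path = PurePath(filename)
    stem = path.stem
    # 2 (NN) + 2 (nn) + 2 or 3 (speaker ID) + 1 (sex) = 7 or 8 characters.
    if len(stem) not in (7, 8):
        raise ValueError(f"unexpected file name length: {filename!r}")
    return RecordingName(
        series=stem[:2],
        item=stem[2:4],
        speaker_id=stem[4:-1],
        sex=stem[-1],
        extension=path.suffix.lstrip("."),
    )


# Hypothetical file names, for illustration only.
print(parse_recording_name("010203m.wav"))
print(parse_recording_name("5301123f.lab"))
```

Because the sex code is always the last character, the length of the stem alone determines whether the speaker ID has two or three characters.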



Speaker Description 

Age description:

[Figure: age distribution of all speech database speakers. Horizontal axis: speaker age; vertical axis: number of speakers.]

[Figure: age distribution of The Tables speakers.]

[Figure: age distribution of The Digits speakers.]

[Figure: age distribution of The Texts and The Balanced Texts speakers.]

[Figure: age distribution of The Sentences speakers.]

Dialect peculiarities:

The speakers represent 11 dialect groups: Ladogo-Tikhvinskaya, Kostromskaya, Arkhangelskaya, Vladimiro-Povolgskaya, Ryazanskaya, Kursko-Orlovskaya, Tulskaya, the West Russian Dialect Zone, the East Russian Dialect Zone, Moscow, and St. Petersburg.


Speech Database Recording Method

The following equipment was used to record the speech database:

Microphone: Shure SM10A + Symetrix SX202

Sound card: Turtle Beach Tropez Plus

Filters: none

Sound file format: 22050 Hz, 16 bit
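
The stated format allows a rough estimate of the raw audio volume, and it can be checked against a given file with Python's standard `wave` module. This is a sketch under one explicit assumption: the recordings are mono, which the page does not state. The function names are illustrative.

```python
import wave

SAMPLE_RATE_HZ = 22050       # from the stated sound file format
SAMPLE_WIDTH_BYTES = 2       # 16 bit
TOTAL_DURATION_S = 108398    # total duration from the general outline
ASSUMED_CHANNELS = 1         # assumption: mono is not stated in the source


def raw_pcm_bytes(duration_s, channels=ASSUMED_CHANNELS):
    """Uncompressed PCM size in bytes, ignoring the small WAV headers."""
    return duration_s * SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * channels


def matches_documented_format(path):
    """True if the WAV file at `path` is 22050 Hz with 16-bit samples."""
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == SAMPLE_RATE_HZ
                and wav.getsampwidth() == SAMPLE_WIDTH_BYTES)


print(f"{raw_pcm_bytes(TOTAL_DURATION_S) / 1e9:.2f} GB of raw audio (mono assumed)")
```

Under the mono assumption, the whole database comes to about 4.8 GB of uncompressed audio.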


Speech Database Storing and Structure

Information Storing:

The speech database is stored on ISO 9660 CD-ROMs:

Database              Number of CDs
The Tables                        1
The Digits                        1
The Balanced Texts                4
The Texts                         4
The Sentences                     4
Whole database                   14


Database Structure:

The database is stored as a set of *.wav sound files, organized as follows:

The Tables
2040 files (510 per speaker) are stored on the "Base 1: The Tables" CD, in a separate directory for each speaker. There are also two directories of 2040 *.lab files each, containing the corresponding time-aligned transcriptions (Latin and Cyrillic).

The Digits
19 directories (one per speaker) of 5 files each, plus a directory containing the *.lab transcription files and a directory containing the time-aligned word transcription files (*.rec) produced by our Russian speech recognition software.

The Balanced Texts
One file per speaker, 96 files in all.

The Texts
One file per speaker, 96 files in all.

The Sentences
35 directories (one per speaker) of 510 files each.
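
The per-speaker layout of The Tables, The Digits and The Sentences lends itself to a simple completeness check after the discs have been copied. The sketch below counts *.wav files in each speaker directory and reports those that deviate from the documented count. The function names and the directory names used in the usage note are hypothetical; directories without any *.wav files (for example the transcription directories) are skipped, which is a design choice of this sketch.

```python
from pathlib import Path

# File and speaker counts taken from the structure description.
WAVS_PER_SPEAKER = {"The Tables": 510, "The Digits": 5, "The Sentences": 510}
SPEAKERS = {"The Tables": 4, "The Digits": 19, "The Sentences": 35}


def expected_total(part):
    """Total number of *.wav files expected for one database part."""
    return SPEAKERS[part] * WAVS_PER_SPEAKER[part]


def find_incomplete_speakers(part_root, part):
    """Return {directory name: wav count} for directories with a wrong count."""
    expected = WAVS_PER_SPEAKER[part]
    problems = {}
    for directory in sorted(Path(part_root).iterdir()):
        if not directory.is_dir():
            continue
        count = sum(1 for f in directory.iterdir() if f.suffix.lower() == ".wav")
        if count == 0:
            continue  # no audio here, e.g. a directory of *.lab or *.rec files
        if count != expected:
            problems[directory.name] = count
    return problems


for part in WAVS_PER_SPEAKER:
    print(part, expected_total(part))
```

For example, `find_incomplete_speakers("/mnt/cdrom", "The Digits")` (with a hypothetical mount point) would return an empty dictionary if every speaker directory holds exactly 5 recordings.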



PRICE 

If you are interested in delivery and payment conditions, please contact us.

© 1991-2013 "STEL Computer Systems" , Moscow, Russia