RWCP Sound Scene Database in Real Acoustical Environments
Non-speech sound dry source

Kazuo Hiyane and Jun Iio, Mitsubishi Research Institute, Inc.

See http://tosa.mri.co.jp/sounddb/nospeech/indexe.htm for the latest information.

1. Recording policy

In a real environment, there are many kinds of sounds other than speech. Compared with speech, non-speech sounds have seldom been studied so far. The "Non-speech sound dry source database" was developed to contribute to studies on non-speech sound recognition and on the removal of background sound in speech recognition.

It is impossible to record all kinds of non-speech sounds. Even with the same sound source, various waveforms can be produced depending on the acoustic characteristics of the room. Hence, the database was recorded in an anechoic room as a dry source, that is, a sound source waveform not influenced by the acoustic characteristics of the room. In addition, each sound source was excited in varied ways (different manners of beating and ringing) to yield the many samples (100 per sound source) needed for studies on sound source discrimination.

Convolving the dry source with the impulse response of a given room reproduces the sound as it would be heard in that room. This provides the diversity of sounds necessary for developing a reliable algorithm for sound source discrimination.

2. Outlines of recording

Using a standard single microphone, acoustic signals of about 100 types of sound sources were measured in an anechoic room as the dry source.

Recording information about non-speech sound dry source
  Recording location    Anechoic room
  Recording equipment   Standard microphone (B&K 4134),
                        Filter amp. (B&K 2636),
                        DAT recorder (SONY DTC-77ES)
  Sampling rate         48kHz, 16bit
  Number of samples     9722
  Amount of data        824MB (16bit RAW format)
  SNR                   40〜50dB

The recorded acoustic signals are classified into three categories according to the type of sound source, as described below.

a. Collision sound source
Sounds produced by a single collision of an object, such as beating a hard object or dropping a hard object on the floor.
b. Action sound source
The sound source itself cannot be clearly identified from the sound, but the sound has a characteristic tone.
c. Characteristic sound source
A sound source whose tone characteristically expresses the type of sound source.

List of non-speech dry sources
Category                                     # of samples   Sound source [Japanese onomatopoeia]
Collision sound source
  a1. Collision sound source (wood)                  1187   wood board, wood stick [kon, poku]
  a2. Collision sound source (metal)                 1000   metal board, metal can [kan, kin]
  a3. Collision sound source (plastic)                550   plastic case [kasha, karara]
  a4. Collision sound source (ceramic)                800   glass, china [chin]
Action sound source
  b1. article dropping                                200   dropping articles in box [Za:, barabara]
  b2. gas jetting                                     200   spray, pump [shu:]
  b3. rubbing                                         500   sawing, sanding [gi:]
  b4. bursting and breaking                           200   breaking stick, air cap [baki, puchi]
  b5. clapping sound                                  829   hand clap, slamming clip [pan, pon]
Characteristic sound source
  c1. small metal articles                           1072   small bell, coin [chirin, shan]
  c2. paper                                           400   dropping book, tearing paper [basa, bee]
  c3. musical instruments                            1079   drum, whistle, bugle [pon, pee, pafu]
  c4. electronic sound                                705   phone, toy [pipi, buu]
  c5. mechanical                                     1000   spring, stapler [jii, gacha]
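
The per-category counts in the list of dry sources can be cross-checked against the stated total of 9722 samples. The following is a small sketch of that arithmetic; the dictionary keys and the helper name `totals_by_category` are illustrative only and do not come from the database.

```python
# Sample counts per sub-category, copied from the list of dry sources.
counts = {
    "a1": 1187, "a2": 1000, "a3": 550, "a4": 800,
    "b1": 200, "b2": 200, "b3": 500, "b4": 200, "b5": 829,
    "c1": 1072, "c2": 400, "c3": 1079, "c4": 705, "c5": 1000,
}

def totals_by_category(counts):
    """Sum the counts over the leading letter (a, b, c) of each sub-category."""
    totals = {}
    for key, n in counts.items():
        totals[key[0]] = totals.get(key[0], 0) + n
    return totals

by_category = totals_by_category(counts)
total = sum(counts.values())
print(by_category, total)  # {'a': 3537, 'b': 1929, 'c': 4256} 9722
```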

3. Recording conditions

(1) Recording environment

The recording environment was the anechoic room. The specification of the anechoic room is described below. The measured background noise levels were 17.3 dBA and 44.3 dBC.

(2) Recording instrument

Instruments used for data recording of non-speech sound dry source are given below.

Category        Instrument                    Maker   Model
Receiver        Microphone                    B&K     4134 1/2inch Measuring Microphone
Amplification   Microphone amplifier filter   B&K     2636 Measuring Amp.
Recording       DAT recorder                  Sony    DTC-77ES

The microphone was fixed 20 cm from the floor (a piano wire network), while the sound source was set 5 to 20 cm from the floor. Sounds were generated by methods such as beating the object while holding it in the hand, beating the object while it was placed on a sound absorbing board on the floor, and dropping the object on a board on the floor.

The distance between the sound source and the microphone was basically set at 1.0 m. To obtain a higher S/N ratio, the gain of the microphone amplifier was set to either +30 dB or +40 dB according to the sound volume. For very quiet sounds, +40 dB was selected and the distance was reduced to 50 cm. A 22.4 kHz low-pass filter was used in the microphone amplifier.

[Figures: close-up and overview of the 16-ch microphone array]
A 16-channel circular array: 16 microphones were installed at 22.5-degree intervals on a circular ring measuring 30 cm in diameter.

Steps of editing recorded data
(1) Making a data file
The acoustic signal was recorded on a DAT tape as continuous data. This data was transferred to a computer using DAT-Link (a product of Arcadia) to prepare a raw data file in 16-bit RAW format on the computer.

(2) Cutting out individual sound samples
The raw data file was divided into one file per sound sample. For this, a program that automatically extracts acoustic events was developed; it extracted about 80% of the acoustic events. The remaining events were extracted by visual inspection and semi-automatic processing, so that all acoustic events were covered. Finally, the data was compiled as binary files in 48 kHz, 16-bit RAW format.
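
The source does not describe how the automatic extraction program works. As a rough illustration of the general idea only, the sketch below marks an acoustic event wherever the short-time RMS level rises above a threshold relative to the loudest frame. The function name `find_events`, the 10 ms frame, the -40 dB threshold and the 100 ms gap are all assumptions chosen for the example, not values from the database.

```python
import numpy as np

def find_events(x, sample_rate=48000, frame_ms=10, threshold_db=-40.0, min_gap_ms=100):
    """Return (start, end) sample indices of segments louder than the threshold.

    Assumed method: frame-wise RMS compared with the loudest frame; frames
    separated by less than min_gap_ms of silence are merged into one event.
    """
    x = np.asarray(x, dtype=float)
    frame = int(sample_rate * frame_ms / 1000)
    n_frames = len(x) // frame
    if n_frames == 0:
        return []
    frames = x[:n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    peak = rms.max()
    if peak == 0:
        return []
    level_db = 20 * np.log10(np.maximum(rms, 1e-12) / peak)
    active = level_db > threshold_db
    max_gap = int(min_gap_ms / frame_ms)   # silent frames tolerated inside an event

    events, start, last_active = [], None, None
    for i, is_active in enumerate(active):
        if is_active:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active > max_gap:
            events.append((start * frame, (last_active + 1) * frame))
            start = None
    if start is not None:
        events.append((start * frame, (last_active + 1) * frame))
    return events
```

Each returned pair could then be used to slice the continuous recording into one array per event. A simple threshold of this kind tends to miss very quiet or overlapping events, which is consistent with the need for manual follow-up noted above.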

(3) Down sampling
In acoustical research, 16 kHz sampling is mainly used, rather than 48 kHz sampling. Hence, the 48 kHz signal was passed through a low-pass filter of order 300 with a cut-off frequency of 7.7 kHz, designed with a MATLAB command, and then down-sampled to generate a 16 kHz signal.
See calculation method for low-pass filter and down sampling
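
The down-sampling step can be sketched as follows. The filter order (300) and cut-off frequency (7.7 kHz) are taken from the text; the Hamming-windowed sinc design, the delay compensation and the function names are assumptions, since the exact MATLAB command is not named here.

```python
import numpy as np

def design_lowpass(order=300, cutoff_hz=7700.0, sample_rate=48000):
    """Windowed-sinc FIR low-pass filter with order + 1 taps (Hamming window assumed)."""
    n = np.arange(order + 1) - order / 2.0
    fc = cutoff_hz / sample_rate             # cut-off in cycles per sample
    taps = 2 * fc * np.sinc(2 * fc * n)      # ideal low-pass impulse response
    taps *= np.hamming(order + 1)
    return taps / taps.sum()                 # unity gain at 0 Hz

def downsample_48k_to_16k(x):
    """Low-pass filter a 48 kHz signal, then keep every third sample."""
    x = np.asarray(x, dtype=float)
    taps = design_lowpass()
    delay = (len(taps) - 1) // 2             # group delay of the linear-phase filter
    filtered = np.convolve(x, taps)[delay:delay + len(x)]
    return filtered[::3]
```

Filtering before discarding samples matters because any content above 8 kHz would otherwise fold back into the 0 to 8 kHz band of the 16 kHz signal.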

(4) Spectrogram
Using the 16 kHz signal obtained by down sampling, the spectrogram (the progression of the short-time spectrum over time) was calculated as follows. First, frames of 16 ms (256 samples) are cut out at 4 ms intervals. Each frame is multiplied by a Hamming window, and the spectrum calculated by FFT is converted to decibels (20 log10).
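
The spectrogram calculation described above can be sketched as follows, with 256-sample frames, a 64-sample (4 ms) hop, a Hamming window and 20 log10 of the FFT magnitude. The function name `spectrogram_db` and the small floor that avoids taking the logarithm of zero are assumptions added for the example.

```python
import numpy as np

def spectrogram_db(x, frame_len=256, hop=64):
    """Short-time spectrum in dB for a 16 kHz signal.

    256 samples = 16 ms and 64 samples = 4 ms at 16 kHz.
    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    x = np.asarray(x, dtype=float)
    n_bins = frame_len // 2 + 1
    if len(x) < frame_len:
        return np.empty((0, n_bins))
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    spec = np.empty((n_frames, n_bins))
    for i in range(n_frames):
        frame = x[i * hop:i * hop + frame_len] * window
        magnitude = np.abs(np.fft.rfft(frame))
        spec[i] = 20 * np.log10(np.maximum(magnitude, 1e-10))  # floor avoids log(0)
    return spec
```

With these settings each row holds 129 values covering 0 to 8 kHz in steps of 62.5 Hz.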

(5) Convolution with the impulse response
The 48 kHz signal was convolved with the impulse responses of three kinds of rooms to reconstruct the sound signal as it would be heard in each room.
See convolution calculation method of the dry source with the impulse response
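
A minimal sketch of the convolution step is given below. It assumes the dry source and the impulse response are already available as arrays at the same sampling rate; the file format of the impulse response data is not described here, so loading it is left out. The function name `reverberate` and the FFT-based implementation are choices made for the example.

```python
import numpy as np

def reverberate(dry, impulse_response):
    """Convolve a dry source with a room impulse response using the FFT.

    The result has len(dry) + len(impulse_response) - 1 samples and equals
    the direct linear convolution up to rounding error.
    """
    dry = np.asarray(dry, dtype=float)
    ir = np.asarray(impulse_response, dtype=float)
    n = len(dry) + len(ir) - 1
    nfft = 1 << (n - 1).bit_length()          # next power of two >= n
    spectrum = np.fft.rfft(dry, nfft) * np.fft.rfft(ir, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]
```

The FFT form is used because an impulse response with a reverberation time of around one second has tens of thousands of samples at 48 kHz, which makes direct convolution slow. The output may exceed the range of the input, so it may need scaling before being stored as 16-bit integers.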

List of recorded data

The following data is included in the DVD-ROM.

  1. Dry source (48kHz, 16bit integer RAW format)
  2. Dry source (16kHz, 16bit integer RAW format)

  3. Spectrogram (0〜8kHz, float CSV format)
  4. Graph image of dry source (GIF format)
  5. Graph image of spectrogram (JPEG format)
  6. Sound for replay (48kHz, 16bit WAVE format)

  7. Reconstructed signal [Reverberation time : 0.30sec.] (48kHz, 16bit integer RAW format)
  8. Reconstructed signal [Reverberation time : 0.78sec.] (48kHz, 16bit integer RAW format)
  9. Reconstructed signal [Reverberation time : 1.30sec.] (48kHz, 16bit integer RAW format)

About 9,700 samples of original data (48 kHz) from various sound sources are included, together with the data obtained by down sampling them to 16 kHz with a low-pass filter. For reference, all integers are little endian (as on Pentium, Alpha and similar processors).
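
Since the RAW files have no header, the sample format has to be supplied when reading them. The sketch below reads and writes 16-bit little-endian integer data with NumPy. It assumes one channel per file (the recording used a single microphone), and the function names and the scaling to floating point are choices made for the example.

```python
import numpy as np

def read_raw(path, sample_rate=48000):
    """Read a headerless 16-bit little-endian RAW file (one channel assumed).

    Returns (samples, duration_seconds); samples are floats in [-1, 1).
    """
    pcm = np.fromfile(path, dtype="<i2")      # "<i2" = little-endian int16
    samples = pcm.astype(np.float64) / 32768.0
    return samples, len(samples) / sample_rate

def write_raw(path, samples):
    """Write floats in [-1, 1) as 16-bit little-endian RAW, clipping overflow."""
    scaled = np.round(np.asarray(samples, dtype=float) * 32768.0)
    np.clip(scaled, -32768, 32767).astype("<i2").tofile(path)
```

For the 16 kHz files, pass `sample_rate=16000` so that the reported duration is correct.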

For each of the 92 types of sound sources, the spectrogram (progression of the power spectrum over time), the graph images and the data for test listening are provided for one representative sample only.

For the signal data reconstructed by convolving the impulse response with the dry source, the file 005.raw (48 kHz) of each sound source was used. In other words, there is one reconstructed signal per sound source.

The directory structure is organized by sound source for the dry source, and by impulse response for the reconstructed data.

/nospeech … Current directory
 +---/drysrc ………… Dry source
      +---/a1 ………… Collision sound source (wood)
      |     +---/cherry1         Beating handheld wooden board (small) B with a wooden stick
      |     |     +--- index.htm      
      |     |     +--- cherry1.gif       Graph image of waveform
      |     |     +--- cherry1.jpg       Graph image of spectrogram
      |     |     +--- cherry1.spg       Spectrogram data
      |     |     +--- cherry1.wav       Sound for replay
      |     |     +---/48khz           
      |     |     |     +--- 002.raw   Dry source (48kHz)
      |     |     |     :                  :
      |     |     |     +--- 100.raw   Dry source(48kHz)
      |     |     |
      |     |     +---/16khz        
      |     |           +--- 002.raw   Dry source(16kHz)
      |     |           :                  :
      |     |           +--- 100.raw   Dry source(16kHz)
      |     |
      |     +---/cherry2         Beating handheld wooden board (small) B with a wooden stick
      |     +---/cherry3         Beating handheld wooden board (medium) B with a wooden stick
      |     +---/magno1          Beating handheld wooden board (small) C with a wooden stick 
      |     +---/magno2          Beating handheld wooden board (medium) C with a wooden stick
      |     +---/magno3          Beating handheld wooden board (large) C with a wooden stick 
      |     +---/teak1           Beating handheld wooden board (small) A with a wooden stick                            
      |     +---/teak2           Beating handheld wooden board (medium) A with a wooden stick                           
      |     +---/teak3           Beating handheld wooden board (large) A with a wooden stick                            
      |     +---/wood1           Beating handheld wooden board (small) A and wooden board (small) B against each other  
      |     +---/wood2           Beating handheld wooden board (medium) A and wooden board (medium) B against each other
      |     +---/wood3           Beating handheld wooden board (large) A and wooden board (large) B against each other  
      |
      +---/a2 ………… Collision sound source (metal)
      |     +---/bank            Beating a handheld moneybox with a metal stick
      |     +---/bowl            Beating a handheld bowl with a metal stick
      |     +---/candybwl        Beating a handheld metal box with a metal stick
      |     +---/coffcan         Beating a handheld coffee can with a metal stick
      |     +---/colacan         Beating a handheld Coca-Cola can with a metal stick
      |     +---/metal15         Beating a handheld metal board (1.5 mm in thickness) with a metal stick
      |     +---/metal10         Beating a handheld metal board (1.0 mm in thickness) with a metal stick
      |     +---/metal05         Beating a handheld metal board (0.5 mm in thickness) with a metal stick
      |     +---/pan             Beating a handheld pan  with a metal stick
      |     +---/trashbox        Beating a handheld trash can with a metal stick
      |
      +---/a3 ………… Collision sound source (plastic)
      |     +---/case1           Beating handheld plastic cases A, B and C with a wooden stick
      |     +---/case2           Beating handheld plastic cases A, B and C against each other  
      |     +---/case3           Dropping plastic cases A, B and C on a plywood board         
      |     +---/dice1           Dropping dice A, B, C, D and E on a wooden board (large)     
      |     +---/dice2           Dropping dice A, B, C, D and E on a plywood board            
      |     +---/dice3           Dropping dice A, B, C, D and E on a metal board              
      |
      +---/a4 ………… Collision sound source (ceramic)
      |     +---/cup1            Beating glass cups A, B, C, D and E placed on a sound absorbing board with a wooden stick or a spoon
      |     +---/cup2            Beating handheld glass cups A, B, C, D and E with a wooden stick or a spoon
      |     +---/bottle1         Beating glass bottles A, B, C, D and E placed side down on a sound absorbing board with a wooden stick or a spoon
      |     +---/bottle2         Beating glass bottles A, B, C, D and E placed mouth down on a sound absorbing board with a wooden stick or a spoon
      |     +---/china1          Beating china A, B, C, D and E placed side down on a sound absorbing board with a wooden stick or a spoon
      |     +---/china2          Beating china A, B, C, D and E placed bottom down on a sound absorbing board with a wooden stick or a spoon
      |     +---/china3          Beating handheld china A, B, C, D and E held side down with a wooden stick or a spoon
      |     +---/china4          Putting china A, B, C, D and E one on top of the other on a sound absorbing board
      |
      +---/b1 ………… Action sound source (article dropping)
      |     +---/particl1        Dropping articles A, B, C and D in a paper box
      |     +---/particl2        Dropping articles A, B, C and D in a metal box
      |
      +---/b2 ………… Action sound source (gas jetting)
      |     +---/spray           Sound of gas spray A and B
      |     +---/pump            Sound of air pump
      |
      +---/b3 ………… Action sound source (rubbing)
      |     +---/saw1            Sawing a metal piece with a metal saw
      |     +---/saw2            Sawing a wood piece with a jigsaw    
      |     +---/file            Filing a metal stick with a metal file
      |     +---/sandpp1         Sanding a wooden piece             
      |     +---/sandpp2         Sanding a wooden piece with sandpaper wrapped around wooden piece
      |
      +---/b4 ………… Action sound source (bursting and breaking)
      |     +---/sticks          Breaking disposable wooden chopsticks by hand
      |     +---/aircap          Crushing an air cap by hand
      |
      +---/b5 ………… Action sound source (clapping sound)
      |     +---/clap1           Clapping (once)        
      |     +---/clap2           Clapping (once (no. 2))
      |     +---/claps1          Clapping (multiple claps)
      |     +---/claps2          Clapping (multiple claps (no. 2))
      |     +---/clip1           Slamming a small clip 
      |     +---/clip2           Slamming a large clip 
      |     +---/cap1            Making cap A burst open
      |     +---/cap2            Closing cap B
      |     +---/snap            Beating paper-made bellows (large and small) (used for beating a person's head in a TV laugh-in)
      |     +---/cracker         Setting off a cracker
      |
      +---/c1 ………… Characteristic sound source (small metal articles)
      |     +---/bell1           Ringing a suspended small bell (single) by pulling the cord           
      |     +---/bell2           Ringing a suspended large bell (single) by pulling the cord           
      |     +---/bells1          Ringing suspended small bells (multiple) by pulling the cord         
      |     +---/bells2          Ringing suspended large bells (multiple) by pulling the cord         
      |     +---/bells3          Ringing suspended small and large bells (multiple) by pulling the cord
      |     +---/coin1           Dropping a coin (single) on a wooden board (large)                    
      |     +---/coin2           Dropping a coin (single) on a plywood board                           
      |     +---/coin3           Dropping a coin (single) on a metal board                             
      |     +---/coins1          Dropping coins (multiple) on a wooden board (large)                  
      |     +---/coins2          Dropping coins (multiple) on a plywood board                         
      |     +---/coins3          Dropping coins (multiple) on a metal board                           
      |     +---/coins4          Shaking coins (multiple) in a container
      |     +---/bells4          Ringing a handheld bell
      |     +---/bells5          Ringing a bicycle bell
      |     +---/tabbouri        Ringing a tambourine
      |
      +---/c2 ………… Characteristic sound source (paper)
      |     +---/book1           Dropping books and magazines A, B, C, D and E on paper        
      |     +---/book2           Dropping books and magazines A, B, C, D and E on paper (no. 2)
      |     +---/tear            Tearing copy paper
      |     +---/crumple         Crushing copy paper by hand 
      |
      +---/c3 ………… Characteristic sound source (musical instruments)
      |     +---/castanet        Clicking castanets
      |     +---/maracas         Shaking maracas
      |     +---/horn            Blowing a bugle
      |     +---/drum            Beating a drum
      |     +---/cymbals         Striking cymbals
      |     +---/string          Twanging of a stringed musical instrument
      |     +---/whistle1        Blowing whistle A
      |     +---/whistle2        Blowing whistle B
      |     +---/whistle3        Blowing whistle C
      |     +---/ring            Ringing a bell by shaking
      |     +---/kara            Shaking a baby rattle
      |
      +---/c4 ………… Characteristic sound source (electronic sound)
      |     +---/phone1          Ringing of a home telephone
      |     +---/phone2          Beep of a cellular phone                   
      |     +---/phone3          Beep of a microcellular phone              
      |     +---/clock2          Ringing of  an electronic alarm clock      
      |     +---/pipong          Sound of an electronic sound toy A (pipong)
      |     +---/buzzer          Sound of an electronic sound toy B (beep)
      |     +---/phone4          Beep of a cellular phone (no. 2)
      |     +---/toy2            Sound of an electronic sound toy C (Gyuun)
      |
      +---/c5 ………… Characteristic sound source (mechanical)
            +---/stapler         Stapling copy paper with a stapler
            +---/punch           Punching copy paper with a punch
            +---/padlock         Opening and closing of padlocks A and B
            +---/toy             Sound by releasing spring
            +---/clock1          Ringing of a bell-alarm clock
            +---/doorlock        Opening and closing of door locks A, B, and C
            +---/coffmill        Grinding coffee beans with an electric grinder
            +---/shaver          Sound of electric shavers A and B
            +---/dryer           Sound of hair dryers A and B
            +---/mechbell        Ringing of mechanical bell of a bicycle

 +---/reconst ………… Reconstructed signal
      +--- ir030.dat  Impulse response (Reverberation room A [Reverberation time : 0.30sec.])
      +--- ir078.dat  Impulse response (Meeting room         [Reverberation time : 0.78sec.])
      +--- ir130.dat  Impulse response (Reverberation room B [Reverberation time : 1.30sec.])
      |
      +---/org ………… Dry source
      |     +--- aircap.raw   Crushing an air cap by hand
      |     :                   : (100 signals)
      |     +--- wood3.raw    Beating handheld wooden board (large) A and wooden board (large) B against each other
      |
      +---/ir030 ………… Reverberation room A [Reverberation time : 0.30sec.]
      |     +--- aircap.raw   Crushing an air cap by hand
      |     :                   : (100 signals)
      |     +--- wood3.raw    Beating handheld wooden board (large) A and wooden board (large) B against each other
      |
      +---/ir078 ………… Meeting room         [Reverberation time : 0.78sec.]
      |     +--- aircap.raw   Crushing an air cap by hand
      |     :                   : (100 signals)
      |     +--- wood3.raw    Beating handheld wooden board (large) A and wooden board (large) B against each other
      |
      +---/ir130 ………… Reverberation room B [Reverberation time : 1.30sec.]
            +--- aircap.raw   Crushing an air cap by hand
            :                   : (100 signals)
            +--- wood3.raw    Beating handheld wooden board (large) A and wooden board (large) B against each other
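
Given the directory layout above, file paths can be built from a category, a sound source name and a sample number. The helpers below are a hypothetical convenience, not part of the database; the function names and the `root` argument are assumptions, and the three-digit file numbering follows the `002.raw` to `100.raw` entries shown in the tree.

```python
from pathlib import PurePosixPath

def dry_source_path(root, category, source, index, rate_khz=48):
    """Build the path of one dry-source sample, e.g. drysrc/a1/cherry1/48khz/002.raw."""
    if rate_khz not in (16, 48):
        raise ValueError("rate_khz must be 16 or 48")
    return (PurePosixPath(root) / "drysrc" / category / source
            / f"{rate_khz}khz" / f"{index:03d}.raw")

def reconstructed_path(root, source, room="org"):
    """Build the path of a signal under reconst; room is org, ir030, ir078 or ir130."""
    if room not in ("org", "ir030", "ir078", "ir130"):
        raise ValueError("unknown room: " + room)
    return PurePosixPath(root) / "reconst" / room / f"{source}.raw"

print(dry_source_path("/nospeech", "a1", "cherry1", 2))
# /nospeech/drysrc/a1/cherry1/48khz/002.raw
print(reconstructed_path("/nospeech", "aircap", "ir078"))
# /nospeech/reconst/ir078/aircap.raw
```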

Copyright (c) 1998-2001 Mitsubishi Research Institute,Inc.