Andres Ramos
Posted October 18, 2020 (edited)

1. Abstract

From all my experiments with noise in ITC and the investigation of spirit impulses I drew two conclusions:

- Spirits use only a limited part of the noise spectrum. A lot of waste remains and contaminates the output signal.
- Spirit impulses have a very good signal-to-noise ratio. The information loss of impulses compared to unclipped noise signals can be compensated, up to a certain degree, by techniques like Paulstretch, an algorithm in Audacity.

From those conclusions I got the impression that removing the base floor of a noise signal, i.e. the part below the spirit impulses, could enhance the modulation of the resulting voice signal. I will explain this theory in the following chapter. I wanted to create a setup where I could remove some of the lower-level parts of the noise that do not contribute to the modulation, and see whether the resulting modulation improves.

2. Noise level discrimination

Let's take an example to clarify my thoughts. Imagine we have a noise signal with spirit voices buried in it.

Raw noise signal

You can see very clearly that this signal can be divided into three vertically stacked areas. The upper third is made of spikes with positive amplitude. The lower third is a mirror of the upper one, with negative spikes. In the middle there is an area we can identify as the base noise floor. If we take into account that the crucial information is encoded in the spikes, then the middle part is superfluous, as it adds nothing valuable to the modulation.

If we estimate the modulation factor of the signal shown above, it would be roughly 30%, and only part of that 30% comes from spirits. The modulation factor is, more or less, calculated as a ratio of the amplitude maximum and minimum.

Cutting out the base floor

Now let's imagine we could take a magic pair of scissors and cut out the middle part of the signal that contains the base floor of noise.
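As a rough reference for the 30% figure: the post describes the modulation factor only loosely, so the following assumes the conventional amplitude-modulation definition. Here $A_{\max}$ is the spike peak, $A_{\min}$ the height of the base floor, and $c$ the height of the slice that is cut out and closed up; all three symbols and the example numbers are introduced here purely for illustration.

```latex
m = \frac{A_{\max} - A_{\min}}{A_{\max} + A_{\min}},
\qquad
m' = \frac{(A_{\max} - c) - (A_{\min} - c)}{(A_{\max} - c) + (A_{\min} - c)}
   = \frac{A_{\max} - A_{\min}}{A_{\max} + A_{\min} - 2c},
\qquad 0 \le c \le A_{\min}
```

Under that assumption, illustrative values $A_{\max} = 1$ and $A_{\min} = 0.54$ give $m \approx 0.30$. A cutout of $c = 0.5$ raises this to $m' \approx 0.85$, and $c = A_{\min}$ gives $m' = 1$, because the numerator stays the same while the denominator shrinks.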
Glue together upper and lower section

The last step is to remove the cutout and push the two remaining sections together. The modulation factor is now near 100%, and thus the spirit voice modulation should be enhanced as well. See it as a kind of filtering process in the time domain, but keep in mind that it is not real filtering but rather a discrimination of certain noise levels specified by a window.

3. Electronic schematic of noise level discriminator

Let me explain the electronics that do the discrimination. In the lower left corner you can see the usual virtual ground module I always use when I employ op-amps. In the upper left corner is something you should also remember: my standard realisation of a germanium noise source, preamplified by IC1. On pin 6 of IC1 we have a noise signal of around 3 V amplitude.

The following stage on the right is a window discriminator made with two comparators. With P1 you can set up a symmetrical cutout level for both comparators. In principle, the combined output of the comparators is high if the current noise level is in the middle section (base floor) and low if it is outside, where the spikes are. By turning P1 you can make the cutout section wider or narrower. With P1 = 0 Ohm there is no cutout; with P1 = max the cutout is roughly 30%. The comparator output is inverted by a NAND gate and controls an analog switch, which passes the audio signal only when the noise level is outside the cutout area.

Signal with cutout = 0

The above picture shows the output signal with P1 adjusted to 0 Ohm. As expected, the signal stays unaffected.

Signal with maximum cutout

The second picture shows the signal with P1 adjusted to maximum. Now you can clearly see that the middle part of the signal is missing. The discriminator circuit thus works nicely!

4. Test results with noise level discriminator

For testing and comparison I made three recordings of 10 s duration each in Audacity.
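The gating behaviour of the discriminator circuit, and the cut-and-glue picture from section 2, can be approximated in software. The following is a minimal sketch and not a model of the actual hardware: the function name `discriminate`, the `cutout` and `glue` parameters, and the choice to output 0 while the analog switch is open are all assumptions made for illustration.

```python
import random


def discriminate(samples, cutout, glue=False):
    """Suppress samples whose magnitude lies inside the cutout window.

    samples: sequence of floats centred on 0 (the virtual ground level).
    cutout:  half-width of the window; a hypothetical stand-in for the
             level that P1 sets on the two comparators.
    glue:    if True, also shift surviving samples toward zero by
             `cutout`, imitating the "push both sections together" step.
    """
    out = []
    for x in samples:
        if abs(x) <= cutout:
            # Inside the window (base floor): switch open.
            # Assumption: the output sits at 0 while the switch is open.
            out.append(0.0)
        elif glue:
            # Outside the window: close the gap left by the cutout.
            out.append(x - cutout if x > 0 else x + cutout)
        else:
            # Outside the window: switch closed, sample passes unchanged.
            out.append(x)
    return out


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical test signal: Gaussian noise centred on 0.
    noise = [random.gauss(0.0, 1.0) for _ in range(10)]
    gated = discriminate(noise, cutout=0.5)
    glued = discriminate(noise, cutout=0.5, glue=True)
    for n, g, s in zip(noise, gated, glued):
        print(f"{n:+.3f}  {g:+.3f}  {s:+.3f}")
```

With `cutout=0` the samples pass through unchanged, which corresponds to the P1 = 0 Ohm case; raising `cutout` widens the band that is removed. The `glue` option only exists to mirror the thought experiment, since the circuit as described gates the signal and does not shift it.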
I post-processed the first recording with the following steps, in consecutive order.

First recording with post processing

- I recorded the signal without any level discrimination (P1 = 0 Ohm)
- High-pass filtering with 250 Hz corner frequency and 12 dB/octave rolloff
- Low-pass filtering with 6000 Hz corner frequency and 12 dB/octave rolloff
- 3 times denoised with 11 dB denoising factor
- Again high-pass filtering with 250 Hz corner frequency and 12 dB/octave rolloff

Hear the audio: Recording #1

Second recording with post processing

- I recorded the audio with full level discrimination (P1 = max)
- Paulstretch with delay factor 1.2 and 0.1 s resolution
- High-pass filtering with 250 Hz corner frequency and 12 dB/octave rolloff
- Low-pass filtering with 6000 Hz corner frequency and 12 dB/octave rolloff
- 2 times denoised with 11 dB denoising factor, plus 1 time with 7 dB denoising factor
- Again high-pass filtering with 250 Hz corner frequency and 12 dB/octave rolloff

Hear the audio: Recording #2

Third recording with post processing

I made this recording with zero level discrimination again, to check whether Paulstretch gives the same results with undiscriminated signals as with discriminated ones.

- Paulstretch with delay factor 1.2 and 0.1 s resolution
- High-pass filtering with 250 Hz corner frequency and 12 dB/octave rolloff
- Low-pass filtering with 6000 Hz corner frequency and 12 dB/octave rolloff
- 2 times denoised with 11 dB denoising factor, plus 1 time with 7 dB denoising factor
- Again high-pass filtering with 250 Hz corner frequency and 12 dB/octave rolloff

Hear the audio: Recording #3

If you take a closer look at the displayed recording signals, you may already notice that the signal of recording #2 looks a little sharper, while recordings #1 and #3 are more blurred. Comparing the audio files brings even more findings. In terms of modulation quality, recording #2 is better than the others with respect to how well phonemes can be distinguished.
That does not mean that the intelligibility of recording #2 is really better. The problem with intelligibility lies in the spectral structure of the resulting noise, which is not optimal for spirit voice generation. Here the old problem shows up again: the germanium diode noise, although very sensitive to spirit interaction, doesn't give good-sounding voices. They are always low, rumbling and croaky. It could be worthwhile to try a different noise source such as white noise (germanium noise is pink).

Another finding is that Paulstretch really unfolds its magic only on spiky signals. Applying Paulstretch to a continuous signal yields no improvement.

In a final evaluation, an improvement of the voice modulation was achieved and the principle was proven. Since this technique can be used with any form of noise, it is up to further experiments to find noise sources with better spectral quality. Even if these contain only low modulation, it can now be enhanced by level discrimination.

Here are some exported samples from another recording with discrimination level near maximum:

Voice samples

Edited November 9, 2020 by Chris (minor typos)