The Tone Vocoder


Michael Lee

     The tone vocoder was one of the earliest methods developed to transmit voice digitally through low-bandwidth networks. In the original tone vocoder method, an input voice was transformed into time-frequency space, where at each time interval the sound was decomposed into a series of frequency bands. The amplitudes of the bands were transmitted over wire or radio. On the receiver side, the amplitudes were used to reconstitute the original audio. The quality of the tone vocoder was never that good, and modern methods such as linear predictive coding (LPC) have superseded it. However, for spirit transmission, the simplicity of the tone vocoder is useful.

     In modern times, the term "vocoder" is used to describe either 1) methods for using vocals to modulate instruments in music with the aforementioned tone bank approach, or 2) methods that convert audio to a time-frequency space to perform operations such as changes in pitch and speed (aka a phase vocoder).

     Why would a tone vocoder be good for spirit communication? One hypothesis is that spirit is only able to add short impulses of energy into our devices. No matter how hard they try, a bunch of pulses strung together doesn't sound very voice-like. What if their impulses could be interpreted as musical notes that play as short duration tones?

    We set up a communication system with spirit where in a given 16,384 sample block (1.024 seconds at a 16 kHz sampling rate), there are 32 time intervals and up to 64 tones that can be activated in each of these intervals. 
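The timing implied by those numbers can be checked with a few lines of Python. This is only a sketch of the arithmetic; the constant names are mine, and the values come from the paragraph above.

```python
SAMPLE_RATE = 16_000     # Hz
BLOCK_SAMPLES = 16_384   # samples per block
N_INTERVALS = 32         # time intervals per block
N_TONES = 64             # tone slots (sub-intervals) per time interval

interval_samples = BLOCK_SAMPLES // N_INTERVALS      # 512 samples
slot_samples = interval_samples // N_TONES           # 8 samples

block_seconds = BLOCK_SAMPLES / SAMPLE_RATE          # 1.024 s
interval_ms = 1000 * interval_samples / SAMPLE_RATE  # 32.0 ms
slot_ms = 1000 * slot_samples / SAMPLE_RATE          # 0.5 ms

print(block_seconds, interval_ms, slot_ms)
```

So each tone slot is only 8 samples (0.5 ms) wide, which is why a short impulse is enough to select a tone.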

    The tone detection of the spirit signal that we (spirit and physical experimenters) agreed upon is a pulse-position-modulation (PPM) approach. Within each 32 ms time interval, there are 64 sub-intervals. If the amplitude or energy of the signal (or the inverse amplitude, for null detection) within a sub-interval is above a threshold (peak detection), the respective tone is activated during that interval. We found it useful to increase the duration of each activated tone by 2-3x (to about 60 ms, which is the typical duration of a medium-length vocal phoneme).
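As a rough illustration of the PPM detection just described, here is a minimal Python sketch. The function name `detect_tones` and the threshold of 0.5 are assumptions for the example, not the values used in my software, and the inverse-amplitude (null detection) variant is left out.

```python
def detect_tones(block, n_intervals=32, n_tones=64, threshold=0.5):
    """For each time interval, list the tone indices whose sub-interval
    contains a peak above the threshold (simple peak detection)."""
    interval_len = len(block) // n_intervals   # 512 samples for a 16,384 block
    slot_len = interval_len // n_tones         # 8 samples per sub-interval
    activations = []
    for i in range(n_intervals):
        start = i * interval_len
        active = []
        for k in range(n_tones):
            s = start + k * slot_len
            peak = max(abs(x) for x in block[s:s + slot_len])
            if peak > threshold:
                active.append(k)               # the pulse position selects the tone
        activations.append(active)
    return activations


# Demo: one impulse placed in time interval 3, sub-interval 10.
block = [0.0] * 16_384
block[3 * 512 + 10 * 8 + 2] = 0.9
acts = detect_tones(block)
print(acts[3])
```

The position of the pulse inside the 32 ms interval, not its shape, decides which of the 64 tones plays.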

   The frequencies of interest for synthesis of human speech are between 75 Hz and 4 kHz. Therefore, the agreed 64 tones can be linearly spaced within this range. One can also choose a non-linear spacing (like quadratic or exponential/musical). For linear spacing, the 64 tones can range, say, from 168 Hz up to 4200 Hz, with an equal spacing of 64 Hz.
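A short Python sketch of the tone table follows. The linear values are the ones given above. The exponential table and the `render_tone` helper (with its half-sine fade and 0.5 amplitude) are illustrative assumptions, showing one possible non-linear spacing and one way to play a roughly 60 ms tone.

```python
import math

N_TONES = 64
F_LOW = 168.0    # Hz, lowest tone
SPACING = 64.0   # Hz between neighbouring tones

# Linear spacing: 168, 232, ..., 4200 Hz.
linear = [F_LOW + k * SPACING for k in range(N_TONES)]

# Assumed alternative: exponential (musical) spacing between the same endpoints.
ratio = (linear[-1] / F_LOW) ** (1 / (N_TONES - 1))
exponential = [F_LOW * ratio ** k for k in range(N_TONES)]


def render_tone(freq_hz, duration_s=0.060, sample_rate=16_000, amplitude=0.5):
    """Sine tone with a half-sine fade in and out to avoid clicks."""
    n = int(round(duration_s * sample_rate))
    return [
        amplitude
        * math.sin(math.pi * i / n)                         # fade envelope
        * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


tone = render_tone(linear[10])   # tone index 10 is 808 Hz
print(linear[0], linear[-1], len(tone))
```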

     My spirit team has learned how to activate this vocoder to produce voice. I suspect that any future researchers will have similar success with heaven-level spirit teams. Spirits use the term "mirror" to describe the device that converts their voice into tone indices (likely via a Fourier transform). They use the term "elevator" to describe the fact that, over a 32 ms interval, hitting at the right moment activates a particular tone, from low frequency (168 Hz) to high (4.2 kHz). They also use the term "modem" because, indeed, this is a digital-like transmission protocol.

     I should also point out that, in theory, my configuration is not too different from what researchers do when they use a sweeping frequency. With a sweep, the spirits can "push" the sound at certain times corresponding to different frequencies. Of course, effects like noise gating (to hide the original tone) and reverb (to extend the "escaped" tone) are needed.

     If spirits could perfectly activate tones in this system to produce voice, it might actually sound fairly intelligible. However, even perfect musical speech isn't all that easy to understand. Therefore, I've developed machine learning models to convert the musical speech to real-sounding speech. To do this, I convert real speech phrases into tones, and then train an ML model to reverse those tones back to the original speech.
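To make the "speech into tones" half of that training setup concrete, here is one possible encoder in Python. It is a sketch under assumptions, not my actual code: it projects each 32 ms frame onto the 64 tone frequencies and keeps the tones above a threshold. The names `tone_magnitude` and `speech_to_tones` and the threshold of 0.1 are hypothetical, and the ML model itself is not shown.

```python
import math

SAMPLE_RATE = 16_000
FRAME = 512                                    # 32 ms at 16 kHz
TONES = [168.0 + 64.0 * k for k in range(64)]  # linear tone table


def tone_magnitude(frame, freq, sample_rate=SAMPLE_RATE):
    """Amplitude of one frequency in a frame (single-bin DFT projection)."""
    re = sum(x * math.cos(2 * math.pi * freq * n / sample_rate)
             for n, x in enumerate(frame))
    im = sum(x * math.sin(2 * math.pi * freq * n / sample_rate)
             for n, x in enumerate(frame))
    return 2 * math.hypot(re, im) / len(frame)


def speech_to_tones(signal, threshold=0.1):
    """Per 32 ms frame, list the tone indices whose magnitude exceeds threshold."""
    frames = []
    for start in range(0, len(signal) - FRAME + 1, FRAME):
        frame = signal[start:start + FRAME]
        frames.append([k for k, f in enumerate(TONES)
                       if tone_magnitude(frame, f) > threshold])
    return frames


# Demo with a stand-in for speech: a pure sine at tone index 5 (488 Hz).
sig = [0.8 * math.sin(2 * math.pi * TONES[5] * n / SAMPLE_RATE)
       for n in range(2 * FRAME)]
pairs = speech_to_tones(sig)
print(pairs)
```

Each (tone indices, original frame) pair would then serve as one input/target example for the model that learns the reverse mapping.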

    In my next post, I will share some results...


Here are some samples to explain this idea, not specifically to demonstrate any messages, but you may hear some anyway 😉.

I've recorded each clip in succession, as I turn on each function.

Step 1: Tones only

 software_vocoder_tones.wav

Step 2: Tones with a noise gate, to provide vocal cadence, and remove spurious tones

sv_tones_noise_gate.wav

Step 3: Tones decoded by "De-toning" ML model (notice how the voices sound "ducky")

sv_detone_only.wav

Step 4: De-quantization model added to reduce "duckiness." This converts 3-bit voice to 16-bit.

sv_detone_and_dequant.wav

Step 5: Add a single semitone pitch shift up.

sv_detone_dequant_shift1.wav
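For readers wondering what "3-bit voice" and "a single semitone" mean numerically, here is a small Python sketch. The `quantize` function is an illustrative uniform quantizer of my own naming; it shows the kind of coarse signal the de-quantization model has to smooth, not the model itself.

```python
def quantize(samples, bits=3):
    """Uniformly quantize samples in [-1, 1] to 2**bits levels."""
    levels = 2 ** bits                  # 3 bits gives 8 levels
    step = 2.0 / levels
    out = []
    for x in samples:
        idx = min(levels - 1, max(0, int((x + 1.0) / step)))
        out.append(-1.0 + (idx + 0.5) * step)   # centre of the chosen level
    return out


ramp = [i / 50 - 1 for i in range(101)]         # -1.0 ... 1.0
coarse = quantize(ramp, bits=3)
print(sorted(set(coarse)))                      # the 8 values a 3-bit voice can take

# One semitone up multiplies every frequency by the twelfth root of two.
SEMITONE = 2 ** (1 / 12)
print(round(SEMITONE, 4))
```

A 16-bit signal has 65,536 levels instead of 8, which is the gap Step 4 tries to fill in.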


I still see this method as very promising! The thing is, you can't jump directly into decoding this type of audio; you must take your time to tune in and to train your ears properly. It's a bit like what Michael Brandel gets from the chirping noise of the Raudive diode with the linguaphone.

In your first clip I found a very clear message "We lost him"

We lost him.wav

Wieso Margit am Fenster überlegt.wav

The 2nd clip is even more thrilling. I copied it right from the start of your audio with noise gating. In German I heard "Why is Margit sitting near the window, contemplating?" As I heard it, I looked up and saw her sitting in front of the window, deeply sunk in thought.

Those moments always remind me that the ITC perception process might be more nonlinear than we are able to accept.


Yes, I got heaps on Krisp, but hey, I have a great appreciation for you clever people. I was at the seance in which Konstantin Raudive told Sonia that she needed capacitators. He said he would impress upon her what she needs. She apologised for not getting things right and not doing enough, and he said, "My dear, you are doing more than enough."

It's pretty obvious to people why: a humble, non-rich person has been doing this for years, using what was at hand for modifications, and has produced a great deal of work in this area, and yet people are flocking to a silver-tongued crew who have got nowhere in over ten years. They have the very best equipment but have not released any worthwhile results.

 


Yes, I see your point. It's not good to be seen as a rockstar when you feel committed to your spiritual work, and it's not good if people searching for their way through life are following the rockstars.

Regarding the seance, did Mr. Raudive advise Sonia to use electrical capacitors or am I getting you wrong?


Sounds like a good method to pursue, Michael.
Forgive my non-technical understanding, but if someone wanted to try this method, how would you set it up, say in Adobe Audition or WavePad, etc.?
Would you play these 5 tones in conjunction with gibberish audio?
Take care, Lance


Lance, 

Typically, I've had to write specific software in Python to implement this idea, but there may be a "plugin" solution.

The idea would be to mix your favorite white noise source 50/50 with a fast sweeping tone, then apply a noise gate, followed by a short reverb of around 60 ms.

This would get us to the point of step 1: musical tones. A second noise gate could then be applied to isolate several tones at once vs. single tones, which would be Step 2.

Now, instead of my Step 3, a suggestion would be to multiply the gated tones by a 120 Hz sawtooth waveform, which would yield a glottal pulse (vowel sound) with formants.

An alternative would be to write some software to implement my method. We'll get there, if indeed, this method is worth pursuing.
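For anyone who prefers code to plugins, the chain above can be roughed out in plain Python. Treat this as a sketch under assumptions: the sweep range (168 to 4200 Hz), the 32 ms sweep period, the gate threshold of 0.7, and the four-echo "reverb" are placeholder choices, and the gate works sample by sample rather than on an envelope as a real plugin would.

```python
import math
import random

SR = 16_000  # sample rate in Hz


def sweep(n, f0=168.0, f1=4200.0, period_s=0.032):
    """Sine whose frequency ramps linearly from f0 to f1, repeating every period_s."""
    period = int(period_s * SR)
    out, phase = [], 0.0
    for i in range(n):
        freq = f0 + (f1 - f0) * (i % period) / period
        phase += 2 * math.pi * freq / SR
        out.append(math.sin(phase))
    return out


def noise_gate(x, threshold):
    """Crude gate: pass only samples whose magnitude exceeds the threshold."""
    return [s if abs(s) > threshold else 0.0 for s in x]


def short_reverb(x, decay_s=0.060, n_taps=4):
    """Very rough reverb: a few decaying echoes spread over decay_s."""
    out = list(x)
    for t in range(1, n_taps + 1):
        delay = int(decay_s * SR * t / n_taps)
        gain = 0.5 ** t
        for i in range(delay, len(x)):
            out[i] += gain * x[i - delay]
    return out


def sawtooth(n, freq=120.0):
    """Sawtooth in [-1, 1), standing in for a glottal pulse train."""
    return [2.0 * ((i * freq / SR) % 1.0) - 1.0 for i in range(n)]


random.seed(0)
n = SR // 4                                                    # a quarter second
noise = [random.uniform(-1.0, 1.0) for _ in range(n)]
mixed = [0.5 * a + 0.5 * b for a, b in zip(noise, sweep(n))]   # 50/50 mix
gated = noise_gate(mixed, threshold=0.7)                       # Step 1 gate
tones = short_reverb(gated)                                    # ~60 ms tail
voiced = [t * s for t, s in zip(tones, sawtooth(n))]           # sawtooth in place of Step 3
```

Writing `voiced` to a WAV file (for example with the standard `wave` module) would let you listen to the result and adjust the threshold by ear.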

Thanks Michael, really interesting approach and method.
Take care, Lance
