
The Basics of Voice Reproduction & Your Help Requested



Hello everyone.

As I continue the work that began 16 years ago, I've decided to share some basics with you. The hope is that you will think of some creative solutions to the principles outlined below, which are the result of my experiments over a long period of time. With your creativity I believe we can crack this long-term project and achieve at least one-way real-time voice communication.

We've been successful in creating voice through many different methods; what we lack is a unified approach based upon these principles. I'm hoping for your creative solutions in response to the principles I mention. Please consider that I don't have the technical experience most of you have; my solutions are a combination of intuition and trial and error, so you may need to translate them into your own understanding when thinking of a potential solution. The goal here, as I present it, is to mimic human voice. From there, it will be a piece of cake.

    1. Voice requires a fundamental frequency whose amplitude is louder than that of the rest of the harmonics.

    2. Additional harmonics need to be shaped to resemble those of a human voice, OR a perfect averaged voice sample may be provided. Harmonics should never be of the same amplitude as the fundamental. Tones need to be balanced; without balance it will never be feasible. This may require tuning by ear.

    3. Modulation is required to simulate a glottal pulse and the vocal folds. I don't know if it's 20 Hz or some other rate, but if we determined this we would be much closer.

    4. A comb filter (delay with reverb) can also simulate a fundamental with harmonics, but the harmonics would need to be toned down to simulate a human voice. Overtones come into play.

    5. An impulse train (modulation) is critical, and when combined with a comb filter it is very powerful.

    6. Granulized sounds (short samples of sound) can be used reliably as long as a voice is shaped per number 2 above. Usage of the tool "Emission Control" has proven this theory.

    7. EVPmaker set to 140-600 ms with the X-fade option only will also produce a randomized form of speech with a single voice. If pulsecomb (a comb filter (delay + reverb) plus an impulse train (modulation)) is used in post-processing of the randomized sound (live or recorded), this can be a very powerful tool.

    8. Software enhancement. We now have the ability to enhance noise reduction and do live filtering, which in essence can allow sound to be heard that is otherwise imperceptible to human ears. BIAS SoundSoap has improved greatly, and Krisp's artificial intelligence has been introduced.

    9. One idea that works to produce voice is two tones spaced 2.5 Hz apart. It is not known yet why this is. When tones 2.5 Hz apart are randomized, a certain phenomenon happens: it's either a meditative state or something conducive to speech patterns we are already trained to recognize. We don't know yet. To hear what that sounds like, listen to Stream 7, aka "Crystal trainer".
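As a rough sketch, principles 1-3 could look like this in code (Python/NumPy; the 8 kHz rate, 120 Hz fundamental, 1/n harmonic roll-off, and 20 Hz pulse rate are placeholder assumptions of mine, not tested settings):

```python
import numpy as np

SR = 8000                       # sample rate in Hz (placeholder)
t = np.arange(SR) / SR          # one second of samples

# Principles 1 & 2: a fundamental plus weaker harmonics. The 1/n
# roll-off keeps every harmonic quieter than the fundamental.
F0 = 120.0                      # fundamental, roughly a male voice
tone = sum((1.0 / n) * np.sin(2 * np.pi * F0 * n * t)
           for n in range(1, 9))

# Principle 3: gate the tone with a low-frequency pulse as a crude
# stand-in for glottal excitation. 20 Hz is only the guess above.
gate = (np.sin(2 * np.pi * 20.0 * t) > 0).astype(float)
out = tone * gate
out = out / np.abs(out).max()   # normalize to -1..1
```

Shaping per principle 2 would then just mean replacing the 1/n weights with weights read off an averaged human voice spectrum.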

Please feel free to provide your creative interpretations of how we can simulate human voice. I am happy to demonstrate any of the principles outlined above for a better understanding. One of my shortcomings has been not taking the time to explain the ground already covered.
With your help I am confident we can crack this long-term goal.

🙂


   

Summary: We will have synthesized software "voiceboxes" that spirit can use. The fact is we already have this technology. Trial and error has shown that in order to hear a reproduction of a human voice, an experimental sound needs to have some sort of randomization or be "energized" through a variety of methods. In the end, though, it must mimic human voice characteristics.
In order to create a voice template for spirit we can refer back to human voice samples and provide a template that mimics normal human speech. This should be done visually as well as by listening.

Different methods can be used, such as: granulization; "energizing" using a comb filter (delay + reverb); or randomizing using software such as EVPmaker. ALL of these efforts have shown the same principles to be true.
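To make the "energize" step concrete, here is a minimal feedback comb filter sketch (Python/NumPy; the 8 kHz rate, 120 Hz resonance spacing, and 0.7 feedback amount are illustrative choices, not settings from my experiments, and the reverb half of the recipe is left out):

```python
import numpy as np

def comb_filter(x, sr=8000, f0=120.0, feedback=0.7):
    """Feedback comb filter: a delay of 1/f0 seconds fed back on
    itself puts resonant peaks at f0 and all of its harmonics."""
    delay = int(round(sr / f0))          # delay in samples
    y = x.astype(float).copy()
    for i in range(delay, len(y)):
        y[i] += feedback * y[i - delay]  # recirculate the delayed signal
    return y

# "Energizing" one second of white noise.
rng = np.random.default_rng(0)
noise = rng.standard_normal(8000)
combed = comb_filter(noise)
```

The resonances give flat noise a pitched, voice-like harmonic skeleton, which is the whole point of the energize step.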

In short - do you have ideas for reproducing a human voice without intelligible signal information? I am happy to credit anyone who contributes to this work. My biggest challenge is that I have not taken the time to explain what I've learned so that it may be expanded upon by other creative people. And when I explain it in technical terms, it falls short of what most people would understand or resonate with.

Feel free to add your comments and suggestions as to how to reproduce a human voice. I would find it helpful if you would consider the principles I placed above and let me know what questions you have regarding any of the items presented.

It is my hope that together, once and for all, we can crack this code. We already have all of the tools needed in our possession. We just need to work together. And I've held my ideas too close to my chest for too long.

What questions/suggestions do you have? I'm open to anything, as most of you know. Don't be shy 🙂

We have already succeeded partially with voice... let's take it the rest of the way!

I'm happy to demonstrate any of the methods mentioned, to anyone interested, to better explain the concepts.


Keith Clark

 

 

 

 

 



I value this discussion as a very good attempt to bring together all our results and lessons learned. Like you, I am hoping for something like a unified spirit voice theory.

For a long time I was just experimenting with noise in all its different flavors. I think I designed a dozen different devices where noise was created and processed in all thinkable ways. In the end the results were all comparable. Noise seems to be the ideal stuff to work with because it is "fluid", very agile, and basically contains all the spectral components needed for the creation of human voices. The problem is that the pk-modulation in every device is so poor that all results suffered from a very bad signal-to-noise ratio (SNR). The SNR was improved with pink noise, but I had to pay the price of even more deteriorated spectral material. The voices were rough and croaky, like grunting Stone Age people.

It was some weeks ago that I remembered that Keith always worked with tones and harmonics. This led me to the idea to abandon the experiments with noise and concentrate on the generation of sounds and tones instead. The VISPRE was my first approach, and now I am working with the SpiCa, a semi-chaotic audio circuit that converts voltage fluctuations into combinations of tones. It's a highly fragile and agile circuit that outputs digital signals.

I learned two things from my experiments, and I can say that my results are aligned with Keith's theory. Firstly, the spectral material that we offer to the spirits should be made of tones and harmonics rather than noise. The nearer these tone combinations are to human voices, the easier the spirits can recombine the spectral composition. Secondly, we need something to excite the tone circuit. Jeff calls this dynamism, as far as I remember. Basically it means that static tones left alone will not produce spirit voices; we always need to add what I call speech patterns, which are impulse groups having the shape of a voice envelope function.

In my experiments with microphone recordings I could observe that spirits use everything I offer them as sound. When I was opening and closing drawers, they created rumbling voices with exactly that sound and rhythm. When I was typing on my keyboard, they used the rhythm and sound to generate clicking voices. It always appeared to me that they were able to reconfigure the spectral content but never the rhythm. It appears to me they are constantly analyzing the rhythm of the sounds we create and aligning their desired voice content with it. In short, they have to see what they can do with the rhythm we provide because they cannot change it. What they can change is the spectral distribution of the original sound. How they do that, I have no idea.

The speech impulse patterns, or impulse trains, are clearly to be separated from what I'll call the "tone engine": the circuit that produces tones and harmonics.

Currently I am testing a new setup that contains two devices I designed previously. First there is the LINGER unit, which generates speech patterns from LED light falling on a phototransistor, with enhanced pk modulation by use of the SSM2167 microphone processing circuit. These impulses are rectified, which means they are reduced to an envelope function and become speech without any content; they are just rhythm.
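In software, the rectify-to-envelope step could be sketched like this (Python/NumPy; the 8 kHz rate and 30 Hz smoothing cutoff are assumptions for illustration - the SSM2167 hardware chain of course works differently):

```python
import numpy as np

def envelope(x, sr=8000, cutoff=30.0):
    """Full-wave rectify, then smooth with a one-pole low-pass.
    What remains is the amplitude envelope: rhythm, no content."""
    rect = np.abs(x)                                  # rectification
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff / sr)  # smoothing coeff
    env = np.empty_like(rect)
    acc = 0.0
    for i, v in enumerate(rect):
        acc += alpha * (v - acc)                      # one-pole low-pass
        env[i] = acc
    return env

# The envelope of a steady 200 Hz tone settles near 2/pi (~0.64),
# the mean of a rectified sine.
t = np.arange(8000) / 8000.0
env = envelope(np.sin(2 * np.pi * 200.0 * t))
```

Feeding `env` (instead of the raw audio) to the tone circuit is exactly the "rhythm without content" idea.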

This signal is fed into the SpiCa, where the voltage fluctuations following the speech rhythm cause the SpiCa to jump around between different tone combinations. The result sounds like human speech, but with a very small pool of vowels and consonants so far.

You can hear the results here: SpiCa

Basically I can say that Keith's theory is correct. What we still cannot achieve sufficiently is the combination of entropy and steering tones and harmonics.


2 hours ago, Andres Ramos said:

Jeff calls this dynamism as far as I remember. Basically it means that static tones left alone will not produce spirit voices; we always need to add what I call speech patterns, which are impulse groups having the shape of a voice envelope function.

On this part - YES!!

Right, we have to pass it through a mutable/modulatable (made-up word) medium, whether it be sound, light, radio waves, impulses, other tones, OR randomization such as EVPmaker or granulizer software.



Hi Andres, regarding this: "In my experiments with microphone recordings I could observe that spirits are using everything I offer them as sound."

There are two parts to this: one is your energy field, the other is your brain's wiring and your interpretation. I agree with your experience as a whole because it's the same for me.

 

 



Agreed, Jeff. I would add that in my work I have found the following to be true:

Modulation can also occur through randomization:

  • EVPmaker (if a suitable sound is input, either from an audio file or live)
  • Emission Control software (granulizer)

And like you said, for optimal conditions the environment should contain both of the following:

  • a medium through which the experiment is passed, i.e. light, sound, software, biofeedback, radio waves, and so on
  • tones consistent with human speech as our brains are trained to recognize it

My work in this area started by trying to imitate Spiricom and eventually became reverse-engineering Spiricom to provide a synthetic voice template. I can say the success I've had is more than I expected at first.

To give people some context, below is a picture of the infamous Spiricom "Mary Had A Little Lamb" recording. Note: whether an electrolarynx was used or not is irrelevant; the fact is it was understandable as human speech. So it's valuable because it's not just speech, it's also enhanced/energized/excited speech via the technique. Note the modulation pictured below:

 

[image: spectrogram of the Spiricom "Mary Had A Little Lamb" recording]

 

The closer I get to real-time synthetic voice, and the more I understand it, the more it seems to imitate Spiricom principles - at least as they were outlined.

I have used a variety of techniques, some of which I've held close to my chest. They were all successful in their own ways.
Here they are in their variety:

  • noisegen (white noise) with live noise reduction
  • noisegen with live noise reduction and Krisp A.I.
  • tone generator through evpmaker
  • tone generator through combfilter and live noise reduction
  • sinewave through Emission Control granulizer followed by live noise reduction
  • white noise through pulsecomb effect in audiomulch
  • tones through pulsecomb effect in audiomulch
  • tone generator through pulsecomb followed by combfilter
  • and many many more, including post-processing and "shaping" of the harmonics to be commensurate with human voice

 

I'll now share results of current work - though it may not seem impressive, I'm pretty close to solving it, which is why I ask for help.

Krisp post-processing

 

Without Krisp below:

How it appears visually:

[image: spectrogram of the current output]

 

Granted, the way I explain things tends to be in my own vernacular. When you guys get super technical I grasp the concept but lack the implementation. However, I also know from experience when I'm super "close".

 

Jeff, you're a super smart guy, maybe you can help. Right now I'm using a 200 Hz sine modulated (mixed with) a 1 Hz sine. This is then passed through EVPmaker for the random factor. From there, one of the following:

  • pulsecomb followed by shaping (and with or without Krisp A.I.)
  • no pulsecomb with live vst combfilter (with or without Krisp A.I.)
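For anyone who wants to reproduce the front end of that chain, here is a sketch (Python/NumPy; I'm reading "modulated (mixed with)" as amplitude modulation here - that reading, and the 8 kHz rate, are my assumptions):

```python
import numpy as np

SR = 8000                          # sample rate (assumed)
t = np.arange(2 * SR) / SR         # two seconds

# A 200 Hz sine amplitude-modulated by a 1 Hz sine: the 1 Hz LFO
# makes the tone slowly swell and fade before it hits EVPmaker.
carrier = np.sin(2 * np.pi * 200.0 * t)
lfo = 0.5 * (1.0 + np.sin(2 * np.pi * 1.0 * t))   # 1 Hz, range 0..1
excitation = carrier * lfo
```

`excitation` would then be the input to EVPmaker, followed by pulsecomb or the live comb filter as listed above.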

So now I only have one question:

How can I get the bandwidth of each harmonic to sway (more bandwidth per harmonic) in harmony - to look more like the Spiricom photo above rather than the second picture? I've used the wobble technique... it's not optimal for this type of application, or at least it hasn't increased intelligibility for me (it's more like added vibrato).

Thanks!

 

 



I asked the same question on both platforms. Sorry about that.

Here is Andres' reply:

Theoretically, the bandwidth of an impulse becomes wider as the impulse gets shorter. In my engineering studies we were taught the Dirac impulse. This is an abstract impulse with zero width and an infinitely high amplitude. It yields an infinite bandwidth at a constant level.

 

So practically, you could try to use impulse trains with a lower duty cycle. In a square wave signal you have 50% ON and 50% OFF. Try to make it 20% ON and 80% OFF, or something like that.
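That suggestion is easy to check numerically. A sketch (Python/NumPy; the 100 Hz pulse rate and the 85%-energy bandwidth measure are my own illustrative choices, not anything from the thread):

```python
import numpy as np

SR = 8000
N = SR                            # one second of samples

def pulse_train(duty, period=80):
    """Rectangular pulse train, ON for duty*period samples out of
    every period samples (period 80 @ 8 kHz -> 100 Hz rate)."""
    n = np.arange(N)
    return ((n % period) < int(round(duty * period))).astype(float)

def spectral_width(x, frac=0.85):
    """Frequency in Hz below which frac of the (one-sided)
    spectral energy lies."""
    power = np.abs(np.fft.rfft(x)) ** 2
    cum = np.cumsum(power) / power.sum()
    return float(np.searchsorted(cum, frac)) * SR / N

narrow = spectral_width(pulse_train(0.5))   # 50% ON / 50% OFF
wide = spectral_width(pulse_train(0.2))     # 20% ON / 80% OFF
# Shorter ON-time -> energy spread over more harmonics (wider band).
```

Running this shows the 20% duty train holding its energy over a several-times-wider band than the square wave, which matches the Dirac-impulse argument above.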


  • 1 month later...

Thanks, guys. Jeff, I think your explanation was pretty good. It's slightly over my head 🙂

This part was really good and resonated with me:

So I am thinking that you want to try to synthesize the action of spirit in this request? If so, it is possible you might make it easier for spirit to add their (then smaller) modulation tweaks (the good news), or inhibit their action by screwing up formants that they would want to modulate to their specified degree of level (the bad news). This is a two-edged sword, I think, in going down this path.

Yes, that is definitely what I'm trying to do, and it is a double-edged sword.

Introduce noise = helps voices.
More noise = voices harder to hear.

I'm still hung up on conforming the energy by using feedback - meaning, shaping the energy so that spirit can influence it, yet it is not overblown with additional unnecessary harmonics. One tone is not enough; too many tones is too much noise.

I'm not up to speed on exactly what you guys are describing.

Can you simplify it for me?

"Random" and "patchy" do sound basically like what I'm describing.

I guess it's similar to a carrier wave in principle? The feedback inherently provides harmonics; however, there is difficulty in separating the original tones themselves from the output.

What about some kind of software solution... would that be subtractive synthesis? If I could subtract the original signal from the influenced signal, that would definitely be very interesting.
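One way "subtract the original from the influenced signal" could look in software - a crude magnitude-spectrum sketch (Python/NumPy), not a claim that this is the right method; real spectral-subtraction tools are more involved:

```python
import numpy as np

def spectral_subtract(influenced, original):
    """Subtract the magnitude spectrum of the injected signal from
    the influenced recording, keeping the influenced phase. What
    survives is, roughly, whatever was added on top."""
    inf = np.fft.rfft(influenced)
    org = np.fft.rfft(original)
    mag = np.maximum(np.abs(inf) - np.abs(org), 0.0)  # clamp at 0
    return np.fft.irfft(mag * np.exp(1j * np.angle(inf)),
                        n=len(influenced))

# Toy demo: the "influence" is an added 250 Hz tone.
t = np.arange(8000) / 8000.0
original = np.sin(2 * np.pi * 100.0 * t)
influenced = original + 0.3 * np.sin(2 * np.pi * 250.0 * t)
residual = spectral_subtract(influenced, original)
```

In a real session the injected tone and the recording would drift in phase, so this would have to be done block-wise with overlapping windows rather than over the whole file.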

 



  • 3 months later...

Hi Keith,

I was contemplating your challenge, the need to mimic human voice for use by spirits to communicate, and also the theory that spirit would not be able to manipulate a clear and strong voice into their own voice patterns.

Recently, I had the idea to check a recorded reading I received from a channel Medium for EVPs in Audacity. There was a lot of really interesting information that came out of this, which I'll post another time, but I wanted to share the results that I thought related to your research here. This will not be a solution but could be the catalyst for other ideas. I hesitated about presenting this, but here goes...

 

The reading was given to me in a small office, about the size of a walk-in closet. The door was closed and there were no (noticeable) audible sounds coming from outside or inside the room other than our (the Medium's and my) clear, strong voices. I recorded using the Tape-a-Talk app on my cell phone. The recording was 12 minutes in length.

The Medium was a channel for her resident group of spirits. She asked them if she was blending, waited two seconds, then said they would just come in if they wanted to, and then immediately said, "OK, yes, they want to." Then the Medium's voice changed, and the group introduced themselves to me through the channel.

There were EVP voices, as evidenced in Audacity using only the magnifying tool and some amplification. Some were weak and required amplification but were still discernible; some were strong and clear. They never once interfered with or appeared to manipulate our voices for their use - other than in the physical mediumship sense, when they were speaking through the channel's vocal cords, but not to create EVPs. And, I guess you could also say, they never interrupted themselves. And remember, there was no other apparent sound present inside or coming from outside the room other than our voices.

I started focusing on trying to figure out how they were able to do this. Where were they getting the sound, the energy? The EVPs were heard only during the breaks in speech. I noticed that once the Medium blended with her team she had a different speech pattern. She was elongating her words slightly, especially the last word before a pause. I think it is possible that they are utilizing this residual sound in some way to create the EVP in the space of the pause in her voice that follows. If this is true, could they be in effect creating their own perpetual energy by speaking through her vocal cords and changing her speech pattern? Also, during the pauses, I heard multiple instances of what sounded like a door being slammed shut just prior to or at the beginning of an EVP. These were the stronger, clearer EVPs. It seems they might be gaining some type of boost from this sound and/or whatever is creating it. Also included in this reading was probably the clearest EVP I've ever heard so far, which utilized a sound like a metal ball dropping and rolling around (clip attached).

For what it's worth, I hope this is at least thought-provoking.

Andrea

 

 

 


  • 2 weeks later...

I think it depends on the medium. The voice manifestations need energy. If the energy from the medium or experimenter is low, the spirits take it from the sounds they remodel. If the Medium is strong, she can provide enough energy for the spirits to manifest in the speech pauses without any sound present.

This is my hypothesis.

