
What would be the value of using a microphone array (multiple mics)?


Recommended Posts

One technology that the commercial space has been exploring is microphone arrays for smart devices like the Amazon Echo. The idea is that multiple microphones can better cancel out environmental noise and reverberation, yielding a clearer voice signal for speech recognition.

What would be the benefit for ITC? Localized spirit voices? Improved signal-to-noise ratio? Microphone arrays are not easy to build, so the argument for pursuing this would have to be compelling.

BTW, I have played with two-microphone setups. This is easy to do and does help with sound cancellation when there is a localized audio source. A sketch of the idea is below.
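For anyone curious what the two-mic gain looks like in practice, here is a minimal numpy sketch of delay-and-sum beamforming: a synthetic tone stands in for the localized source, each mic adds its own independent noise, and aligning-then-averaging the channels buys roughly 3 dB of SNR. The sample rate, inter-mic delay, and noise level are illustrative assumptions, not measurements from my setup.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16_000                            # sample rate (Hz), illustrative
t = np.arange(fs) / fs                 # one second of audio
source = np.sin(2 * np.pi * 440 * t)   # localized source: a 440 Hz tone
d = 7                                  # inter-mic delay in samples (geometry-dependent)

# Each mic hears the source (mic2 hears it d samples later) plus its own noise.
mic1 = source + 0.5 * rng.standard_normal(fs)
mic2 = np.roll(source, d) + 0.5 * rng.standard_normal(fs)

# Delay-and-sum: advance mic2 by d so the source adds coherently while the
# uncorrelated noise partially cancels (about 3 dB of SNR gain for two mics).
beam = 0.5 * (mic1 + np.roll(mic2, -d))

def snr_db(clean, noisy):
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean**2) / np.sum(noise**2))

print(f"single mic SNR: {snr_db(source, mic1):.2f} dB")
print(f"beamformed SNR: {snr_db(source, beam):.2f} dB")  # ~3 dB higher
```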

 


Jeff, 

Although a little off topic from the original question, I agree with your understanding that spirits utilize the sounds available to them. This can actually be stated mathematically as convolution: roughly speaking, they can slightly change the volume of the individual frequency components of environmental sound.
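To make the convolution claim concrete, here is a tiny numpy check of the convolution theorem: convolving a sound with a kernel in the time domain is exactly the same as rescaling (and phase-shifting) each of its frequency components. The random signals are just stand-ins, and the "influence kernel" is a hypothetical name for illustration, not a measured quantity.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)   # stand-in for an environmental sound
h = rng.standard_normal(64)     # hypothetical short "influence" kernel
n = len(x) + len(h) - 1         # length needed for linear convolution

# Convolving the sound with the kernel in the time domain...
y = np.convolve(x, h)

# ...is identical to scaling (and phase-shifting) each frequency component
# of the sound by the kernel's spectrum at that frequency:
assert np.allclose(np.fft.fft(y, n), np.fft.fft(x, n) * np.fft.fft(h, n))
```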

I recently experimented with two microphones and several projected sounds from a speaker (testing one sound at a time), then used microphone cancellation and machine learning to disentangle the original voice.

I like to think of the played sound as serving two functions: 1) it provides a fixed sound field that we can accurately and mathematically (digitally) remove from the recording, and 2) it supplies energy for spirits to manipulate, in line with your thoughts and my convolution theory.
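On function 1), the standard signal-processing tool for removing a known played sound from a recording is an adaptive filter, as in echo cancellation. Below is a minimal normalized-LMS (NLMS) sketch along those lines. It is a generic illustration under assumed parameters (taps, mu), not necessarily the exact pipeline described above, and `reference`/`recorded` are hypothetical input arrays.

```python
import numpy as np

def nlms_cancel(reference, recorded, taps=256, mu=0.5, eps=1e-8):
    """Learn an FIR estimate of how the known played sound (`reference`)
    reaches the mic (the room response), subtract it from `recorded`, and
    return the residual, which is where any anomalous signal would remain."""
    w = np.zeros(taps)                      # adaptive filter weights
    residual = np.zeros_like(recorded)
    for n in range(taps, len(recorded)):
        x = reference[n - taps:n][::-1]     # most recent reference samples
        e = recorded[n] - w @ x             # part the reference can't explain
        w += (mu / (x @ x + eps)) * e * x   # normalized LMS weight update
        residual[n] = e
    return residual

# Hypothetical usage with a recording of a played test sound:
# cleaned = nlms_cancel(played_sound, mic_capture)
```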

I agree with you that locality may not be all that necessary for spirits. I'll tell you, though, it would help with isolating the sound pollution in my household. 😳

 


The general method is to figure out how a spirit voice is corrupted when we hear it directly from our noise-generating devices, and then to train a machine learning model to reverse the effect. Specifically, I have found at least three corruption processes:

1) The spirit signal is often heavily buried in noise (i.e., additive noise).

2) The spirit signal is "quantized": it sounds like 2- to 4-bit audio instead of clean 16-bit audio.

3) The signal is "sparse," i.e., largely missing in time: instead of a smooth waveform, we randomly receive only 10-20% of the time samples.

So what you do is train a machine learning model to convert clean English speech, corrupted by these three processes (or others), back into the uncorrupted speech. Then you apply the trained model to your favorite noisy ITC signal. A sketch of the corruption side of this pipeline is below.
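To make the training-data side concrete, here is a small numpy sketch that applies the three corruption processes to a clean waveform, producing the (corrupted, clean) pairs one could train any supervised denoising model on. The SNR, bit depth, and keep fraction are illustrative values drawn from the ranges above, not exact settings from the experiments described.

```python
import numpy as np

rng = np.random.default_rng(42)

def corrupt(clean, snr_db=0.0, bits=3, keep=0.15):
    """Apply the three corruption processes above to a clean waveform,
    yielding one half of a (corrupted, clean) training pair."""
    # 1) additive noise at the chosen signal-to-noise ratio
    noise = rng.standard_normal(len(clean))
    noise *= np.sqrt(np.mean(clean**2) / (np.mean(noise**2) * 10**(snr_db / 10)))
    x = clean + noise
    # 2) coarse quantization, e.g. ~3-bit instead of 16-bit audio
    half_levels = 2**bits / 2
    peak = np.max(np.abs(x)) + 1e-12
    x = np.round(x / peak * half_levels) / half_levels * peak
    # 3) temporal sparsity: randomly keep only ~15% of the time samples
    mask = rng.random(len(x)) < keep
    return x * mask

# Hypothetical usage: pair corrupted and clean utterances, then train any
# supervised denoising model to map corrupted -> clean.
# pairs = [(corrupt(utt), utt) for utt in clean_utterances]
```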

It's been a struggle, because I think these three processes together (and others we don't know about) are simply too destructive to the original audio that spirits are trying to convey. In other words, we lose too much information to restore it to intelligible speech. We are always on the lookout for more spirit-sensitive hardware.

 

 


Jeff, I should point out that spirits don't have to share their voice directly; they also appear capable of activating phonemes or even conveying their voice in frequency space. What they seem limited to, however, is spikes of energy, at least with the hardware we have given them.
