Julie sent this video to me, and I was confused, because I didn't think it should saturate into noise on the third iteration. Human voices are in the 1000 Hz range, so if the Carl doubles the frequency, three iterations only get it to 8000 Hz, which is still well sampled by a 44.1 kHz sound file (the standard rate). So, I did the sane thing when I got home, which was to download the video and do a spectral analysis of the audio.
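For reference, a minimal sketch of that kind of analysis in Python, assuming the audio has already been extracted from the video into per-speaker WAV clips (the filenames and segment labels below are placeholders, not the actual clips):

```python
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import welch

# Placeholder clip names: s00 = human, s01 = Carl A, s02 = Carl B.
clips = [("s00.wav", "human"), ("s01.wav", "Carl A"), ("s02.wav", "Carl B")]

for name, label in clips:
    rate, data = wavfile.read(name)              # rate should be ~44.1 kHz
    if data.ndim > 1:                            # fold stereo down to mono
        data = data.mean(axis=1)
    f, psd = welch(data, fs=rate, nperseg=4096)  # averaged power spectral density
    plt.semilogy(f, psd, label=label)

plt.xlabel("frequency (Hz)")
plt.ylabel("power spectral density")
plt.legend()
plt.show()
```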
The human (s00), Carl A (s01), and Carl B (s02).
The human speech is mostly that tiny red peak on the left side, at about 1-2 kHz. The spectrum gets messier beyond that, but I think that second red peak (around 2 kHz) can plausibly be seen shifted up in the other two.
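One way to go beyond eyeballing is to pull out the strongest peaks in each clip and see whether they roughly double from one iteration to the next. A sketch of that check (same placeholder filenames as above; the prominence cutoff is just a guess, not a tuned value):

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch, find_peaks

def dominant_peaks(filename, n=3):
    """Frequencies of the n strongest peaks in the clip's power spectrum."""
    rate, data = wavfile.read(filename)
    if data.ndim > 1:
        data = data.mean(axis=1)
    f, psd = welch(data, fs=rate, nperseg=4096)
    idx, _ = find_peaks(psd, prominence=psd.max() * 0.05)  # cutoff is a guess
    strongest = idx[np.argsort(psd[idx])[::-1][:n]]        # keep the n tallest
    return np.sort(f[strongest])

for name in ("s00.wav", "s01.wav", "s02.wav"):
    print(name, dominant_peaks(name))
```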
Plotting everything.
The interesting thing in this plot is that you can see two patterns; the dips around 11 kHz and 13 kHz are probably the easiest place to see it. They're caused by the response functions of the devices:
Carl A has the benefit of hearing the true voice on the first iteration.
Carl B.
So I don't think it's really about the speech frequency vs. the sampling rate. I think it's just the noise added on each pass through the microphone/speaker feedback loop.
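As a sanity check on that story, here's a toy model of the feedback loop (the filter shape, noise level, and 1 kHz tone are all made up purely for illustration): each pass applies the same device response and adds a fresh dose of broadband noise, so the SNR drops on every iteration even though nothing comes anywhere near the Nyquist limit.

```python
import numpy as np
from scipy.signal import butter, lfilter

rate = 44100
t = np.arange(0, 1.0, 1.0 / rate)
voice = np.sin(2 * np.pi * 1000 * t)          # stand-in for the human voice

# Crude stand-in for the speaker+microphone response: a band-pass filter.
b, a = butter(4, [300, 10000], btype="bandpass", fs=rate)

clean, noisy = voice, voice
for i in range(1, 4):
    clean = lfilter(b, a, clean)                                      # response only
    noisy = lfilter(b, a, noisy) + np.random.normal(0, 0.05, t.size)  # response + fresh noise
    snr = 10 * np.log10(np.mean(clean**2) / np.mean((noisy - clean)**2))
    print(f"iteration {i}: SNR ~ {snr:.1f} dB")
```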