
Audio without changes in pitch [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 7 years ago.


I'm trying to find a library, sample code, or at least a pointer in the right direction that could help me change the speed of audio while maintaining its normal pitch. I need this functionality in an open-source application, so preferably the library would be open source itself. Any ideas to get me on the right track?


If you need to reproduce a signal in the audio domain at a different speed, on time but not shifted in pitch, you have to know what your signal is composed of, so that you can synthesize the right frequency at the right moment.

1/ All the parameters are known, as in analog synthesis: you know you want to synthesize one note, so you tune every oscillator frequency you can to that value. I guess this is not your case; any virtual or virtual-analog synth can do this on demand.

2/ You have a source sound you want to control. You have to decompose it into elements you can manipulate so as to fulfil your harmonic, timing and rhythmic constraints. Three solutions:

a. FFT (fast Fourier transform), which gives you the amount of power in every harmonic of your source sound; it is then up to you to stretch the time scale of one harmonic or another (really a cookbook recipe, but well worth the experiment).

b. Wavelets, close to the FFT but focusing on harmonic details wherever they happen and however precisely they happen (imagine an FFT that concentrates on the meaningful frequencies at each moment in time).

c. Granular synthesis, which I think is the easiest: it applies windows (a sort of Gaussian bell over each short fragment of sound), like clouds of windows laid over your original sound, decomposing it into many small grains whose pitch and duration (the speed and spacing of the windows applied to the sound) you can manage independently; see the sketch after this list.

There may be plenty of other techniques, but these are the ones I am aware of.
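
As a rough illustration of option (c), here is a minimal overlap-add (granular-style) time-stretch sketch in Python with NumPy. The grain size, hop sizes and Hann window are assumptions chosen for illustration, not something specified above.

import numpy as np

def ola_time_stretch(x, stretch=1.5, grain=2048, analysis_hop=512):
    """Stretch mono signal x by `stretch` (>1 = slower) without resampling,
    by re-spacing windowed grains of the original signal."""
    window = np.hanning(grain)
    synthesis_hop = int(round(analysis_hop * stretch))
    n_grains = max(1, (len(x) - grain) // analysis_hop + 1)
    out_len = (n_grains - 1) * synthesis_hop + grain
    out = np.zeros(out_len)
    norm = np.zeros(out_len)              # window-overlap normalisation
    for i in range(n_grains):
        a = i * analysis_hop              # where the grain is read from
        s = i * synthesis_hop             # where the grain is written to
        out[s:s + grain] += x[a:a + grain] * window
        norm[s:s + grain] += window
    norm[norm < 1e-8] = 1.0               # avoid division by zero at the edges
    return out / norm

# Usage on a synthetic 440 Hz tone: the result is ~1.5x longer, same pitch.
sr = 22050
t = np.arange(sr) / sr
slower = ola_time_stretch(np.sin(2 * np.pi * 440 * t), stretch=1.5)

A naive re-spacing like this produces audible phasing and amplitude-modulation artefacts; practical granular or WSOLA implementations additionally align or slightly shift the grains to hide the seams.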


The Wikipedia article on Audio timescale-pitch modification may be helpful.


The basic idea is that you need to convert a signal along a time axis into a signal over time and frequency axes. Then you modify that signal appropriately, then convert back again.

Windowed fast Fourier transforms are a common approach: take a short segment of the signal, convert it to the frequency domain, and repeat at periodic steps through the signal. Modifying the signal basically means rescaling your frequency and/or time axis before applying the inverse transforms. The windows will probably overlap a little, so you can blend (cross-fade) from one block to another.
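
As a hedged sketch of this windowed-FFT / phase-vocoder approach, the open-source librosa library (ISC-licensed) exposes it in Python. The function names below follow librosa's documented API as I recall it; exact signatures have shifted between releases, and the file names are placeholders.

import librosa
import soundfile as sf

y, sr = librosa.load("input.wav", sr=None, mono=True)   # placeholder input file

# High-level helper: rate > 1.0 plays faster, rate < 1.0 slower, pitch preserved.
y_fast = librosa.effects.time_stretch(y, rate=1.5)

# Roughly the same thing done by hand with an explicit STFT round trip:
D = librosa.stft(y)                          # time axis -> time/frequency axes
D_fast = librosa.phase_vocoder(D, rate=1.5)  # re-space the frames, fix up phases
y_fast_manual = librosa.istft(D_fast)

sf.write("output_fast.wav", y_fast, sr)

SoundTouch and Rubber Band are other open-source libraries often used for the same task from C or C++.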

Another possible approach is to use wavelet transforms, filter banks, or some other closely related multi-resolution approach. The basis of these is the use of integral transforms in which each frequency is treated on an appropriate scale (relative to its wavelength). A Morlet basis, for example, is very like a single-wavelength-limited variation of the sine + j.cosine combination that is the basis of the Fourier transform.
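
As a small analysis-only sketch of this multi-resolution idea, the open-source PyWavelets package can compute a continuous wavelet transform with a Morlet wavelet. This only produces the time/scale (roughly time/frequency) picture; modifying it and resynthesising audio from it is the hard part and is not shown. The scale range and wavelet name here are assumptions made for illustration.

import numpy as np
import pywt

sr = 8000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

scales = np.arange(1, 128)                   # coarse-to-fine analysis scales
coeffs, freqs = pywt.cwt(sig, scales, 'morl', sampling_period=1 / sr)

# coeffs has shape (len(scales), len(sig)): one row per scale, one column per
# sample, so every frequency band keeps its own full-resolution time axis.
print(coeffs.shape, freqs[:5])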

In theory, these should provide a better result. As the transforms naturally have both time and frequency axes, there is no need to generate the time axis "artificially" by windowing. This may avoid the sometimes obvious crossfade-between-blocks issues with the windowed Fourier transform approach. I'm going to guess that there may be other artefacts instead, but I don't know enough to know what they are.

Sorry if my terminology is misleading or wrong about multi-resolution stuff - I'm very far from being an expert.
