Music Synthesizing

So I've been playing around a bit with music synthesizing and I've got a fundamental issue I'd like to get feedback on.

Assume we're only working with sine waves. To generate music all we do is add up a bunch of sines evaluated at the current playtime. So far so good. Now comes the problem: I'd like to be able to gradually change the pitch of a wave. Just lerping between the pitches will cause artifacts during the entirety of the lerp. Example: one second in, a 440 Hz wave and a 660 Hz wave are 220 periods apart. Lerping over one second will then temporarily raise the pitch by an extra 220 Hz to 440 Hz during the lerp. At later playtimes this number will be bigger and all we'll hear is clicking.
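To put numbers on that, here is a small sketch of the arithmetic. It assumes the naive generator is sin(2*pi * f(t) * t) with f(t) lerped linearly; the function and parameter names are made up for illustration and are not from any library.

```c
#include <assert.h>

/* Frequency the lerp is meant to produce at time t (seconds). */
double intended_hz(double t, double start_hz, double end_hz,
                   double lerp_start, double lerp_len)
{
    assert(lerp_len > 0.0);
    double u = (t - lerp_start) / lerp_len;
    if (u < 0.0) u = 0.0;
    if (u > 1.0) u = 1.0;
    return start_hz + (end_hz - start_hz) * u;
}

/* Frequency actually heard from sin(2*pi * f(t) * t).
   The phase in cycles is f(t) * t, and its derivative is
   f(t) + t * f'(t), so the lerp adds an extra t * f'(t) term. */
double heard_hz(double t, double start_hz, double end_hz,
                double lerp_start, double lerp_len)
{
    double slope = 0.0; /* f'(t) in Hz per second, zero outside the lerp */
    if (t >= lerp_start && t <= lerp_start + lerp_len) {
        slope = (end_hz - start_hz) / lerp_len;
    }
    return intended_hz(t, start_hz, end_hz, lerp_start, lerp_len) + t * slope;
}
```

For a 440 Hz to 660 Hz lerp running from t = 1 s to t = 2 s, the extra term goes from 220 Hz to 440 Hz. The same lerp started one minute in would be off by more than 13 kHz.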

One obvious solution is to save the current phase and add an appropriate amount at each sample, which works just fine if we've got just one wave. However, we get a problem when we're generating harmonics, especially at distances of one octave, for example 440 Hz and 880 Hz. The thing is, if we're adding a bit of phase shift each sample we quickly get errors that accumulate. And so the frequencies won't really be 440 and 880 but, say, 438 and 883, at which point we get really noticeable tremolo effects.

I'd like to be able to play harmonics while changing the pitch so we can't just swap between the two.

Does anybody have any good ideas or know how this is usually done?




The ear is highly sensitive to pitch, and I agree that something like

DeltaPhase += DeltaDeltaPhase; // change in delta phase per step
Phase += DeltaPhase;


is not numerically stable enough to sound perfect (it might sound ok). What you want to do instead is make your DeltaPhase a function of time, so you can precisely compute it for every step if you wish:

DeltaPhase = ComputeDeltaPhase(CurrentTime);
Phase += DeltaPhase;


Note that DeltaPhase is just another way to express frequency here.
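As a rough sketch of what a ComputeDeltaPhase could look like for a simple linear glide: the `Glide` struct, its fields, and the extra `sample_rate` parameter are assumptions made up for this example. The phase is measured in cycles, so the per-sample step is just frequency divided by sample rate.

```c
#include <assert.h>

typedef struct {
    double start_hz;   /* frequency before the glide */
    double end_hz;     /* frequency after the glide */
    double start_time; /* seconds */
    double duration;   /* seconds, must be > 0 */
} Glide;

/* Target frequency at time t, computed from t directly, not accumulated. */
double glide_frequency(const Glide *g, double t)
{
    assert(g->duration > 0.0);
    double u = (t - g->start_time) / g->duration;
    if (u < 0.0) u = 0.0;
    if (u > 1.0) u = 1.0;
    return g->start_hz + (g->end_hz - g->start_hz) * u;
}

/* Phase increment in cycles per sample at time t. */
double compute_delta_phase(const Glide *g, double t, double sample_rate)
{
    assert(sample_rate > 0.0);
    return glide_frequency(g, t) / sample_rate;
}
```

In the render loop this would be `phase += compute_delta_phase(&glide, current_time, sample_rate);`. Only the phase is accumulated; the frequency itself is recomputed from the clock each step, so it can't drift.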

Also note that linearly interpolating the frequency (and thereby the DeltaPhase) may give a result different from your expectations, since frequencies are not perceived linearly. What you may want to do instead is interpolate the pitch, and convert that to a frequency. This is relatively expensive to compute, so it's fine to only compute the pitch occasionally (e.g. 500 times a second), and linearly interpolate the frequency in between.

Yes! Perfect! That worked. Thank you.

Now this seems completely obvious, but I'm glad I decided to ask for help :)


As a side note: it was a couple of years ago that I last posted on Stack Overflow, but the difference is huge. Here I got the answer I was looking for and a friendly tip for the future. On Stack Overflow my question would have been marked as a duplicate of an old irrelevant question, three people would have told me to just use some library, and a couple of comments would have reminded me of the existence of Google.

Handmade Network deserves a pat on the back :)

Edited by Daniel Hesslow on