Day 20. Debugging the Audio Sync

Video Length (including Q&A): 2h46

Welcome to “Handmade Hero Notes”, the book where we follow the footsteps of Handmade Hero in making the complete game from scratch, with no external libraries. If you'd like to follow along, preorder the game on handmadehero.org, and you will receive access to the GitHub repository, containing complete source code (tagged day-by-day) as well as a variety of other useful resources.

If you've made it this far, congratulations! Debugging sound issues may well be among the most tedious tasks you will face (even if that's your cup of tea). However, we aren't out of the woods yet: the system we have right now still has a lot of latency, and there's room for improvement even at this early stage. Sure, with the default sound API and an integrated sound module, audio is laggy. But what about the sound junkies who have powerful cards with close to zero output latency? We want to support them as well! And today, we need to think really hard to make it happen.

You might have noticed a trend throughout this series: think first, and type after. If you understand the code you're about to write, at least at the level of its assumptions, it will be much easier to translate your thoughts into good code. And the benefits don't stop here! You will actually be able to read and edit it after a while. That said, you never want to go too deep and define every single minute detail in advance: at that point, you might as well write the code directly! Leave yourself some space for a good surprise.

Today, however, we will mix the two together. It's a complex topic that we can simplify by performing one step at a time instead of trying to do the whole thing at once. If we find out that we were incorrect initially, the fix shouldn't be too difficult to make.

Deep Dive into the Issue

Define the Problem

Let's go back to our favorite timeline chart. We have our video frames which update every “frame flip” until the game stops running.

Figure 1: Game lifetime. Everything on it should be familiar to you by now.

For the moment, we're aiming for a frame flip rate of 30 frames per second. This means that every second (or every 1000 milliseconds), we output 30 frames. Thus each frame only has $\frac{1000}{30} \approx 33.3$ milliseconds to do all the work.
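To make the frame budget concrete, here is a minimal sketch of the same arithmetic in code. The helper name FrameBudgetMs is ours, purely for illustration; it is not part of the Handmade Hero source.

```cpp
#include <cassert>

// Hypothetical helper: milliseconds available to a single frame
// at a given update rate. It only restates 1000 / fps from the text.
static double FrameBudgetMs(int FramesPerSecond)
{
    assert(FramesPerSecond > 0);
    return 1000.0 / (double)FramesPerSecond;
}
```

At 30 frames per second this gives roughly 33.3 ms; doubling the rate to 60 would halve the budget to about 16.7 ms.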

For each frame, we want to prepare the image and the sound of our next frame. In an ideal world, both would go out at the same time.

Figure 2: Frame flip workflow in an ideal world.

This work happens sometime after the frame begins: before we get started on rendering, we need to collect the user input, update our world state, etc. Even further still, we're currently writing audio just before the frame flip, at the last possible moment, in fact. The idea is that the moment we send something to the hardware, it will be reproduced to the user.

Figure 3: Breakdown of the work within a single frame. Proportions of game update/render times will differ. Of course, some of these tasks may run in parallel in the future.

Unfortunately for us, hardware lag exists, so things do not happen immediately. Furthermore, DirectSound (on our specific machine) is not cooperating with us. Last time, we determined that the API has a delay of 30ms! That's, in fact, almost a full frame! So this is what happens with our frames in reality:

Figure 4: Frame flip workflow in our reality.

As you can see, by the time audio starts playing, it's almost time for the next frame to flip! When we blindly assumed a low latency scenario, this created the audible clicks for us: information kept getting overwritten from under the Play Cursor.

This means that, when we were writing audio, we were writing to positions well before the first safe one. The DirectSound API works with two cursors: the Play Cursor and the Write Cursor. The API reads the audio bytes from the area around the Play Cursor, while the Write Cursor marks the first point the user can safely write to. For the time being, we're ignoring the write cursor and writing wherever we feel is appropriate for our needs.

That's why we needed to add the additional frames of latency: simply to allow the write cursor to advance. Even further still, currently, we're asking for the cursor's location at the end of the previous frame. This means that, by the time we get to compute audio, the cursors have advanced even further, resulting in a whopping 3 frames of latency!

Figure 5: Our current approach. 🕺 represents frame flips.
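To put a number on those three frames, the sketch below mirrors the LatencySampleCount formula the platform layer has used so far and converts the result to milliseconds. The 48000 samples per second and 30 Hz update rate are the values used in this chapter's listings; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same shape as FramesOfAudioLatency * (SamplesPerSecond / GameUpdateHz).
static uint32_t LatencySamples(uint32_t FramesOfLatency,
                               uint32_t SamplesPerSecond,
                               uint32_t UpdateHz)
{
    assert(UpdateHz > 0);
    return FramesOfLatency * (SamplesPerSecond / UpdateHz);
}

// Hypothetical helper: convert a sample count to milliseconds.
static double SamplesToMs(uint32_t Samples, uint32_t SamplesPerSecond)
{
    assert(SamplesPerSecond > 0);
    return 1000.0 * (double)Samples / (double)SamplesPerSecond;
}
```

With these inputs, three frames come out to 4800 samples, or about 100 ms between computing a sound and hearing it.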

If we want to avoid latency, we'd ideally have to send audio data before our rendering work on a frame even starts! This isn't necessarily impossible but would introduce frame lag anyway since the new user input and world state wouldn't account for the old sound... On the surface, this seems like chasing one's own tail.

Evaluate the Options

At the end of the day, under the current system, there's no way to compute sound for a frame and have it reproduced together with its image when we intend it to. This leaves us with the following options:

Accept the reality that our audio will always be a couple frames behind the image. After all, so many games already do this! This would simply regress our audio to be latent by design.
Have a “low latency mode” that would take over on a machine with a great sound card and low latency; do things the slow way otherwise.
Continue tinkering on the inner workings of our platform layer.
Fix latency issues by essentially getting input and world state a frame earlier. This, however, would introduce input lag, which seems like a worse tradeoff in an action game like ours.

The last option in our particular case seems to be the worst of all, so we can rule it out straight away. In fact, we want to push the input capture as close to the frame flip as possible so that the user doesn't feel that the game is unresponsive.

Let's start with the simplest solution we might imagine. We write our sound to the hardware as soon as we have it, without waiting for frame flip. If, for instance, audio is sent out 15ms after the frame starts, it will be reproduced ~15ms after the frame flip occurred.

The game won't know of this happening: the platform will lie to the game, saying that the audio will be reproduced at the frame flip.

Figure 6: Sending the audio out as soon as it's ready.

Unfortunately, this kind of unsynchronized audio loop is customary in the industry. It's a bit sloppy but better than the alternative of pushing audio to the next frame and having a much more significant delay.

This solution also opens the road for the “low latency mode” we proposed above. If we determine that we're playing on extremely performant hardware, the sound will be targeting the frame flip instead.

Tracking Audio Latency

We calculated our latency last time by subtracting the play cursor from the write cursor. We did it on a calculator, so let's capture the same calculation in our debug output:

#if HANDMADE_INTERNAL
DWORD PlayCursor;
DWORD WriteCursor;
GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor);
DWORD AudioLatencyBytes = WriteCursor - PlayCursor;

char TextBuffer[256];
sprintf_s(TextBuffer, sizeof(TextBuffer), "LPC:%u BTL:%u TC:%u BTW:%u - PC:%u WC:%u DELTA:%u\n",
            LastPlayCursor, ByteToLock, TargetCursor, BytesToWrite,
            PlayCursor, WriteCursor, AudioLatencyBytes);
OutputDebugStringA(TextBuffer);
#endif

Listing 1: [win32_handmade.cpp > WinMain] Tracking audio delta.

However, we need to remember that the audio buffer is circular, so the difference might sometimes come out negative (and, since DWORD is unsigned, show up as a huge bogus number). This happens when the write cursor “wraps” to the beginning of the buffer while the play cursor is still at the end. To circumvent this issue, we can “unwrap” the WriteCursor value in those cases by simply adding the length of the secondary buffer to it. It's a simple trick but quite effective. We'll use it in another instance later today.

Figure 7: If we add the circular buffer's length to the value of Write Cursor, we effectively unwrap it.

This is a circular buffer, so we need to account for when the write cursor is in front.

DWORD UnwrappedWriteCursor = WriteCursor;
if (UnwrappedWriteCursor < PlayCursor)
{
    UnwrappedWriteCursor += SoundOutput.SecondaryBufferSize;
}
DWORD AudioLatencyBytes = UnwrappedWriteCursor - PlayCursor;

Listing 2: [win32_handmade.cpp > WinMain] Accounting for circular buffer wrap.

Let's compile and run our program in the debugger. You should see a static DELTA value, which on our machine is 5760 bytes:

BTL:373332 TC:379093 BTW:5761 - PC:364800 WC:370560 DELTA:5760
BTL:379092 TC:2773 BTW:7681 - PC:372480 WC:378240 DELTA:5760
BTL:2772 TC:8533 BTW:5761 - PC:378240 WC:0 DELTA:5760
BTL:8532 TC:14293 BTW:5761 - PC:0 WC:5760 DELTA:5760
BTL:14292 TC:21973 BTW:7681 - PC:7680 WC:13440 DELTA:5760

Listing 3: [Debug Output] You will notice that delta is calculated correctly in case of buffer wrap, as we anticipated.
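The unwrap trick is easy to check in isolation. The sketch below pulls it into a standalone function (the name BytesBetweenCursors is ours, not the platform layer's) and assumes the 384000-byte secondary buffer that the initialization code in this chapter produces (two seconds of 48 kHz, 4 bytes per sample).

```cpp
#include <cassert>
#include <cstdint>

// Distance in bytes from the play cursor to the write cursor
// inside a circular buffer of BufferSize bytes.
static uint32_t BytesBetweenCursors(uint32_t PlayCursor,
                                    uint32_t WriteCursor,
                                    uint32_t BufferSize)
{
    assert(PlayCursor < BufferSize && WriteCursor < BufferSize);
    uint32_t UnwrappedWriteCursor = WriteCursor;
    if (UnwrappedWriteCursor < PlayCursor)
    {
        // The write cursor has wrapped past the end; undo the wrap.
        UnwrappedWriteCursor += BufferSize;
    }
    return UnwrappedWriteCursor - PlayCursor;
}
```

Feeding it the PC/WC pairs from the debug output above, including the wrapped line (PC:378240, WC:0), yields 5760 every time.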

We could go even further and calculate how many seconds this number of bytes translates to. Using some dimensional analysis, you can quickly derive the following:

$$ bytes \times (\frac{samples}{bytes}) \times (\frac{seconds}{samples}) = $$ $$\frac{\bcancel{bytes} \times \bcancel{samples} \times seconds}{1 \times \bcancel{bytes} \times \bcancel {samples}} =$$ $$ seconds $$

Since we don't have SamplesPerBytes and SecondsPerSample, we'll need to invert the values we have: divide by BytesPerSample and by SamplesPerSecond instead of multiplying. We will store the resulting value as a real (${\rm I\!R}$) number inside a float. When we pass it to the sprintf_s function, we can use the %f format specifier to print the float, or even %.3f to display only the first three decimal places (rounding the rest).

DWORD UnwrappedWriteCursor = WriteCursor;
if (UnwrappedWriteCursor < PlayCursor)
{
    UnwrappedWriteCursor += SoundOutput.SecondaryBufferSize;
}
DWORD AudioLatencyBytes = UnwrappedWriteCursor - PlayCursor;
f32 AudioLatencySeconds = ((f32)AudioLatencyBytes / (f32)SoundOutput.BytesPerSample) /
                           (f32)SoundOutput.SamplesPerSecond;
char TextBuffer[256];
sprintf_s(TextBuffer, sizeof(TextBuffer), "LPC:%u BTL:%u TC:%u BTW:%u - PC:%u WC:%u DELTA:%u (%.3fs)\n",
            LastPlayCursor, ByteToLock, TargetCursor, BytesToWrite,
            PlayCursor, WriteCursor, AudioLatencyBytes, AudioLatencySeconds);

Listing 4: [win32_handmade.cpp > WinMain] Displaying latency in seconds.
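As a sanity check on the conversion and the %.3f formatting, here is a portable sketch that does not depend on the Win32 layer. It uses std::snprintf in place of sprintf_s, and both function names are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// bytes / (bytes per sample) / (samples per second) = seconds
static float BytesToSeconds(uint32_t Bytes,
                            uint32_t BytesPerSample,
                            uint32_t SamplesPerSecond)
{
    assert(BytesPerSample > 0 && SamplesPerSecond > 0);
    return ((float)Bytes / (float)BytesPerSample) / (float)SamplesPerSecond;
}

// Formats the tail of the debug line; %.3f keeps three decimal places.
static std::string FormatLatency(uint32_t Bytes, float Seconds)
{
    char Text[64];
    std::snprintf(Text, sizeof(Text), "DELTA:%u (%.3fs)", (unsigned)Bytes, Seconds);
    return std::string(Text);
}
```

For our 5760-byte delta with 4 bytes per sample at 48000 samples per second, this gives 0.03 seconds, which agrees with the 30ms delay we measured last time.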

We've already encountered an operation where we need to divide by samples per second. In fact, it's quite prone to error, and we might want to pull out these calculations in the future. For the time being, let's place a todo for the future us to add the “bytes per second” field:

struct win32_sound_output
{
    int SamplesPerSecond;
    int BytesPerSample;
    DWORD SecondaryBufferSize;
    u32 RunningSampleIndex;
    int LatencySampleCount;
    // TODO(casey): Math gets simpler if we add a "bytes per second" field?
};

Listing 5: [win32_handmade.h] Adding a todo for the future.

Review Target Bytes calculation

If you remember, we define how many samples we want to send to the buffer in the following manner:

ByteToLock represents the point from which we should start writing.
TargetCursor defines where we're writing until.
BytesToWrite is the difference between the two (considering circular buffer wrap).

Until now, we were setting our ByteToLock to be whatever our RunningSampleIndex was. We would then calculate the target cursor, which would be the last known play cursor's location. We wouldn't really consider where the Write Cursor was. What mattered to us was just leaving enough room (LatencySampleCount) to surely get ahead of it.
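The relationship between the three values can be sketched as a small standalone function. The name BytesToWriteBetween is hypothetical; the logic is the same wrap-aware subtraction the platform layer already performs, and the 384000-byte buffer size is the one set up in this chapter's initialization code.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes from ByteToLock up to TargetCursor in a circular buffer.
static uint32_t BytesToWriteBetween(uint32_t ByteToLock,
                                    uint32_t TargetCursor,
                                    uint32_t BufferSize)
{
    assert(ByteToLock < BufferSize && TargetCursor < BufferSize);
    if (ByteToLock > TargetCursor)
    {
        // The region wraps: the tail of the buffer plus the head up to the target.
        return (BufferSize - ByteToLock) + TargetCursor;
    }
    return TargetCursor - ByteToLock;
}
```

Plugging in the BTL and TC pairs from the debug output earlier reproduces the BTW column, including the wrapped case (BTL:379092, TC:2773, BTW:7681).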

Now we will start taking DirectSound's write cursor into account. If it's within this frame's boundaries, we can assume that we are on low latency hardware and take the “Low Latency Path.” In this case, we presume that we can write starting from the frame flip, so we extend the TargetCursor to the end of the next frame (the upcoming frame boundary plus one frame's worth of audio) so that audio will always go out with the image.

Figure 8: Low latency path idea.

If the Write Cursor is beyond this frame, we assume that the latency is high. Then we won't wait until the frame flip, and the TargetCursor can be wherever our WriteCursor would be after one frame. We will only offset it by a small safety margin (a few milliseconds) to account for potential variability in the latency between now and the output.

Figure 9: High latency path.

All of this wouldn't impact our ByteToLock or BytesToWrite calculation (or, at least, not directly). However, given that TargetCursor will move depending on the hardware's latency, the rest of these values will move accordingly.

Implement the Two Audio Paths

Clean Up the Old Code

Alright, let's get cracking. First of all, let's get rid of the LastPlayCursor; it's gone. We'll need to remove it from the beginning of our WinMain and any usage we encounter along the way. We won't need it anymore because we'll do all the computations as soon as we get the cursor values.

int DebugTimeMarkerIndex = 0;
win32_debug_time_marker DebugTimeMarkers[GameUpdateHz / 2] = {};
// DWORD LastPlayCursor = 0; // removed
b32 SoundIsValid = false;

// ... 
// Towards the end of WinMain
// ... 

DWORD PlayCursor = 0;
DWORD WriteCursor= 0;
if (SUCCEEDED(GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor)))
{
    // LastPlayCursor = PlayCursor; // removed
    if (!SoundIsValid)
    {
        SoundOutput.RunningSampleIndex = WriteCursor / SoundOutput.BytesPerSample;
        SoundIsValid = true;
    }
}

// ...
f32 AudioLatencySeconds = ((f32)AudioLatencyBytes / (f32)SoundOutput.BytesPerSample) /
                           (f32)SoundOutput.SamplesPerSecond;
char TextBuffer[256];
sprintf_s(TextBuffer, sizeof(TextBuffer), "BTL:%u TC:%u BTW:%u - PC:%u WC:%u DELTA:%u (%.2fs)\n",
            ByteToLock, TargetCursor, BytesToWrite,
            PlayCursor, WriteCursor, AudioLatencyBytes, AudioLatencySeconds);
OutputDebugStringA(TextBuffer);

Listing 6: [win32_handmade.cpp > WinMain] Retiring LastPlayCursor.

Also, we can say goodbye to the FramesOfAudioLatency and LatencySampleCount:

// #define FramesOfAudioLatency 3 // removed
// TODO(casey): How do we reliably query on monitor refresh rate on Windows?
#define MonitorRefreshHz 60
#define GameUpdateHz (MonitorRefreshHz / 2)

// ... 

// NOTE(casey): Sound test
win32_sound_output SoundOutput = {};
SoundOutput.SamplesPerSecond = 48000;
SoundOutput.BytesPerSample = sizeof(s16) * 2;
SoundOutput.SecondaryBufferSize = 2 * SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample;
SoundOutput.RunningSampleIndex = 0;
// SoundOutput.LatencySampleCount = FramesOfAudioLatency * (SoundOutput.SamplesPerSecond / GameUpdateHz); // removed

Listing 7: [win32_handmade.cpp > WinMain] Removing latency tracking.

Clean up the struct definition in win32_sound_output:

struct win32_sound_output
{
    int SamplesPerSecond;
    int BytesPerSample;
    DWORD SecondaryBufferSize;
    u32 RunningSampleIndex;
    // int LatencySampleCount; // removed
    // TODO(casey): Math gets simpler if we add a "bytes per second" field?
};

Listing 8: [win32_handmade.h] Removing latency tracking.

We can now consolidate the blocks dealing with the audio. Our debug code, which captures the markers for visual display, will have its own position calculation. Since it will happen quite close to the frame flip, we might as well call these cursors the Flip Cursors.

The sound initialization block can be removed for now, and we'll move it to another section in a second.

// NOTE: this first block is the sound initialization that is removed for now.
DWORD PlayCursor = 0;
DWORD WriteCursor= 0;
if (SUCCEEDED(GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor)))
{
    if (!SoundIsValid)
    {
        SoundOutput.RunningSampleIndex = WriteCursor / SoundOutput.BytesPerSample;
        SoundIsValid = true;
    }
}
else
{
    SoundIsValid = false;
}
#if HANDMADE_INTERNAL
// NOTE(casey): This is debug code
{
    DWORD FlipPlayCursor = 0;
    DWORD FlipWriteCursor= 0;
    if (SUCCEEDED(GlobalSecondaryBuffer->GetCurrentPosition(&FlipPlayCursor, &FlipWriteCursor)))
    {
        Assert(DebugTimeMarkerIndex < ArrayCount(DebugTimeMarkers));
        win32_debug_time_marker *Marker = &DebugTimeMarkers[DebugTimeMarkerIndex++];
        if (DebugTimeMarkerIndex == ArrayCount(DebugTimeMarkers))
        {
            DebugTimeMarkerIndex = 0;
        }
        Marker->PlayCursor = FlipPlayCursor;
        Marker->WriteCursor = FlipWriteCursor;
    }
}
#endif

Listing 9: [win32_handmade.cpp > WinMain] Simplifying blocks structure.

Next, we will delete the whole block where the cursors are calculated. Not all of this code is obsolete, though: the ByteToLock part is still relevant. We will recreate it further down.

DWORD ByteToLock = 0;
DWORD TargetCursor = 0;
DWORD BytesToWrite = 0;

if (SoundIsValid)
{
    ByteToLock = ((SoundOutput.RunningSampleIndex * SoundOutput.BytesPerSample)
                    % SoundOutput.SecondaryBufferSize);
    
    TargetCursor = ((LastPlayCursor +
                        (SoundOutput.LatencySampleCount * SoundOutput.BytesPerSample))
                    % SoundOutput.SecondaryBufferSize);
    
    if(ByteToLock > TargetCursor)
    {
        BytesToWrite = SoundOutput.SecondaryBufferSize - ByteToLock;
        BytesToWrite += TargetCursor;
    }
    else
    {
        BytesToWrite = TargetCursor - ByteToLock;
    }
}

game_sound_output_buffer SoundBuffer = {};
SoundBuffer.SamplesPerSecond = SoundOutput.SamplesPerSecond;
SoundBuffer.SampleCount = BytesToWrite / SoundOutput.BytesPerSample;
SoundBuffer.Samples = Samples;

Listing 10: [win32_handmade.cpp > WinMain] Removing previous cursor calculation routine.

Finally, a bit further below, immediately after GameUpdateAndRender, we will house our new code for the audio calculation. We will add here all the pieces that were useful in the past, all in one place:

Capture position of the Play and Write cursors.
If the sound is not valid (we've just started, or something happened), initialize the RunningSampleIndex. This functionality is the same as we had before.
Calculate ByteToLock (same as before).
Calculate TargetCursor (same as before, for a moment, but already set up for eventual change).
Calculate BytesToWrite (same as before).
Fill the sound buffer with the samples we collected during GameUpdateAndRender.
If we're in debug mode (HANDMADE_INTERNAL), print out the various values we computed above.
Last, if by any chance GetCurrentPosition fails, we will do none of the above and simply mark SoundIsValid as false.

GameUpdateAndRender(...);
DWORD PlayCursor;
DWORD WriteCursor;
if (GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor) == DS_OK)
{
    if (!SoundIsValid)
    {
        SoundOutput.RunningSampleIndex = WriteCursor / SoundOutput.BytesPerSample;
        SoundIsValid = true;
    }
    
    DWORD ByteToLock = ((SoundOutput.RunningSampleIndex * SoundOutput.BytesPerSample)
                        % SoundOutput.SecondaryBufferSize);
        
    DWORD TargetCursor = 0;
    TargetCursor = ((LastPlayCursor +
                        (SoundOutput.LatencySampleCount * SoundOutput.BytesPerSample))
                    % SoundOutput.SecondaryBufferSize);
    
    DWORD BytesToWrite = 0;
    if(ByteToLock > TargetCursor)
    {
        BytesToWrite = SoundOutput.SecondaryBufferSize - ByteToLock;
        BytesToWrite += TargetCursor;
    }
    else
    {
        BytesToWrite = TargetCursor - ByteToLock;
    }
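    // The old if(SoundIsValid) wrapper is gone: we are already inside the
    // branch where GetCurrentPosition succeeded.
    Win32FillSoundBuffer(&SoundOutput, ByteToLock, BytesToWrite, &SoundBuffer);

#if HANDMADE_INTERNAL
    // PlayCursor and WriteCursor were captured at the top of this block,
    // so the debug code reuses them instead of querying again.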
    
    DWORD UnwrappedWriteCursor = WriteCursor;
    if (UnwrappedWriteCursor < PlayCursor)
    {
        UnwrappedWriteCursor += SoundOutput.SecondaryBufferSize;
    }
    DWORD AudioLatencyBytes = UnwrappedWriteCursor - PlayCursor;
    f32 AudioLatencySeconds = ((f32)AudioLatencyBytes / (f32)SoundOutput.BytesPerSample) /
                               (f32)SoundOutput.SamplesPerSecond;
    char TextBuffer[256];
    sprintf_s(TextBuffer, sizeof(TextBuffer),
                "BTL:%u TC:%u BTW:%u - PC:%u WC:%u DELTA:%u (%.2fs)\n",
                ByteToLock, TargetCursor, BytesToWrite,
                PlayCursor, WriteCursor, AudioLatencyBytes, AudioLatencySeconds);
    OutputDebugStringA(TextBuffer);
#endif
}
else
{
    // GetCurrentPosition didn't succeed
    SoundIsValid = false;
}

Listing 11: [win32_handmade.cpp > WinMain] Simplifying and combining audio blocks.

Last, we want to record our theory for the audio paths, so let's annotate what we decided:

DWORD PlayCursor;
DWORD WriteCursor;
if (GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor) == DS_OK)
{
    /* NOTE(casey):
       
       Here is how sound output computation works.
     
       We define a safety value that is the number 
       of samples we think our game update loop 
       may vary by (let's say up to 2ms). 
     
       When we wake up to write audio, we will look
       and see what the play cursor position is and we
       will forecast ahead where we think the
       play cursor will be on the next frame boundary.
     
       We will then look to see if the write cursor is
       before that by at least our safety value. If it is, the
       target fill position is that frame boundary
       plus one frame. This gives us perfect audio
       sync in the case of a card that has low enough
       latency.
     
       If the write cursor is _after_ that safety 
       margin, then we assume we can never sync the
       audio perfectly, so we will write one frame's
       worth of audio plus the safety margin's worth
       of guard samples.
    */
if (!SoundIsValid)
{
    SoundOutput.RunningSampleIndex = WriteCursor / SoundOutput.BytesPerSample;
    SoundIsValid = true;
}

Listing 12: [win32_handmade.cpp > WinMain] Noting down our theory.

Change `TargetCursor` Calculation

Everything is set up. Let's go ahead with the changes we need. We'll write how we imagine target cursor calculation first and think about making it happen later.

For the fast path (low latency card), we would position ourselves at the next frame's expected boundary.
For the slow approach, the write cursor position would be moved by one frame's length (with some added safety bytes).

In both cases, the resulting value may be greater than our buffer's size, so to bind it, we'll simply modulo it against the buffer size. As a reminder, modulo operator (%) keeps the remainder of a given division.

DWORD TargetCursor = 0;
if (AudioCardIsLowLatency)
{
    TargetCursor = ExpectedFrameBoundaryByte + ExpectedSoundBytesPerFrame;
}
else
{
    TargetCursor = WriteCursor + ExpectedSoundBytesPerFrame + SoundOutput.SafetyBytes;
}
TargetCursor = TargetCursor % SoundOutput.SecondaryBufferSize;

Listing 13: [win32_handmade.cpp > WinMain] Preparing two cases for TargetCursor calculation.
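To see the two branches and the final modulo at work, here is a standalone sketch that takes every input as a parameter. The function name PickTargetCursor is ours, and the sample numbers assume a 384000-byte buffer, 6400 sound bytes per frame and 2133 safety bytes (values that follow from the 48 kHz, 4 bytes per sample, 30 Hz setup used throughout this chapter).

```cpp
#include <cassert>
#include <cstdint>

// Chooses the target fill position for either audio path,
// then wraps it back into the circular buffer.
static uint32_t PickTargetCursor(bool AudioCardIsLowLatency,
                                 uint32_t ExpectedFrameBoundaryByte,
                                 uint32_t WriteCursor,
                                 uint32_t ExpectedSoundBytesPerFrame,
                                 uint32_t SafetyBytes,
                                 uint32_t BufferSize)
{
    assert(BufferSize > 0);
    uint32_t TargetCursor = 0;
    if (AudioCardIsLowLatency)
    {
        // Fast path: fill up to the end of the next frame.
        TargetCursor = ExpectedFrameBoundaryByte + ExpectedSoundBytesPerFrame;
    }
    else
    {
        // Slow path: one frame past the write cursor, plus guard bytes.
        TargetCursor = WriteCursor + ExpectedSoundBytesPerFrame + SafetyBytes;
    }
    return TargetCursor % BufferSize;
}
```

On the slow path, a write cursor of 378240 gives 386773 before the modulo and 2773 after it, which is the kind of wrap the % operator is there to handle.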

Now we just have to compute all those new values.

AudioCardIsLowLatency

Let's start with AudioCardIsLowLatency. We decide whether the card is low latency by checking if ExpectedFrameBoundaryByte is greater than the write cursor plus the safety margin.

b32 AudioCardIsLowLatency = true;
DWORD SafeWriteCursor = WriteCursor + SoundOutput.SafetyBytes;
if (SafeWriteCursor >= ExpectedFrameBoundaryByte)
{
    AudioCardIsLowLatency = false;
}
DWORD TargetCursor = 0;
if (AudioCardIsLowLatency)
//...

Listing 14: [win32_handmade.cpp > WinMain] Determining if our sound card is latent.

Or, in other words,

DWORD SafeWriteCursor = WriteCursor + SoundOutput.SafetyBytes;
b32 AudioCardIsLowLatency = SafeWriteCursor < ExpectedFrameBoundaryByte;

DWORD TargetCursor = 0;
if (AudioCardIsLowLatency)
//...

Listing 15: [win32_handmade.cpp > WinMain] A more concise notation of the above.

However, there's always the circular buffer issue that we need to take into account: if the write cursor is behind the play cursor, it means the buffer wrapped, and we need to “unwrap” the cursor as we've done earlier: simply add the sound buffer size to write cursor's value.

DWORD SafeWriteCursor = WriteCursor;
if (SafeWriteCursor < PlayCursor)
{
    SafeWriteCursor += SoundOutput.SecondaryBufferSize;
}
Assert(SafeWriteCursor >= PlayCursor);
SafeWriteCursor += SoundOutput.SafetyBytes;
b32 AudioCardIsLowLatency = SafeWriteCursor < ExpectedFrameBoundaryByte;

Listing 16: [win32_handmade.cpp > WinMain] Normalizing write cursor.
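The check above can be exercised on its own. This sketch wraps it in a function with every input passed explicitly; the name IsLowLatency and the sample numbers are illustrative assumptions, and it uses the standard assert in place of the project's Assert macro.

```cpp
#include <cassert>
#include <cstdint>

// True when the write cursor, padded by the safety margin, still lands
// before the expected frame boundary.
static bool IsLowLatency(uint32_t PlayCursor,
                         uint32_t WriteCursor,
                         uint32_t SafetyBytes,
                         uint32_t ExpectedFrameBoundaryByte,
                         uint32_t BufferSize)
{
    uint32_t SafeWriteCursor = WriteCursor;
    if (SafeWriteCursor < PlayCursor)
    {
        // Write cursor wrapped around the end of the buffer: unwrap it.
        SafeWriteCursor += BufferSize;
    }
    assert(SafeWriteCursor >= PlayCursor);
    SafeWriteCursor += SafetyBytes;
    return SafeWriteCursor < ExpectedFrameBoundaryByte;
}
```

With the cursors from our debug output (play cursor 364800, write cursor 370560) and a boundary one 6400-byte frame past the play cursor, the function reports high latency, as we would expect from a 30ms delay.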

ExpectedFrameBoundaryByte

ExpectedFrameBoundaryByte seems reasonably straightforward. We capture the play cursor already, which represents our current position in time. Therefore, we simply add whatever amount of sound bytes we expect per frame.

If you spotted a mistake in our logic here, well done! We will return to this at the end of the chapter as it will prove key to a bug.

DWORD ExpectedFrameBoundaryByte = PlayCursor + ExpectedSoundBytesPerFrame;
DWORD SafeWriteCursor = WriteCursor + SoundOutput.SafetyBytes;

Listing 17: [win32_handmade.cpp > WinMain] Calculating expected frame flip boundary.

ExpectedSoundBytesPerFrame

When it comes to calculating how many sound bytes we expect per frame, dimensional analysis can help us once again:

We want to know how many sound bytes are in a frame, or $\frac{bytes}{frame}$
We know how many frames we want to have in a second, $\frac{frames}{second}$. We currently capture this value in GameUpdateHz.
We also know how to calculate bytes per second. We haven't captured this value just yet, but we have done this multiple times in the past: it's $\frac{samples}{second} \times \frac{bytes}{samples} = \frac{bytes}{second}$.
So in order to calculate the $\frac{bytes}{frame}$, we should divide $\frac{bytes}{second}$ over $\frac{frames}{second}$:

$$ \frac { \frac{bytes}{second} }{ \frac{frames}{second} } = $$ $$ \frac {bytes}{second} \times \frac{seconds}{frame} = $$ $$ \frac {bytes \times \bcancel{seconds}}{\bcancel{second} \times frame} =$$ $$ \frac {bytes}{frame} $$

Let's put it in code:

DWORD ExpectedSoundBytesPerFrame = (SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample)
                            / GameUpdateHz;
DWORD ExpectedFrameBoundaryByte = PlayCursor + ExpectedSoundBytesPerFrame;
DWORD SafeWriteCursor = WriteCursor;

Listing 18: [win32_handmade.cpp > WinMain] Calculating expected sound bytes per frame.
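Plugging in this chapter's numbers shows what the expression evaluates to. The helper below is a hypothetical standalone version of the same formula.

```cpp
#include <cassert>
#include <cstdint>

// (samples/second * bytes/sample) / (frames/second) = bytes/frame
static uint32_t SoundBytesPerFrame(uint32_t SamplesPerSecond,
                                   uint32_t BytesPerSample,
                                   uint32_t UpdateHz)
{
    assert(UpdateHz > 0);
    return (SamplesPerSecond * BytesPerSample) / UpdateHz;
}
```

For 48000 samples per second, 4 bytes per sample and 30 frames per second, that is 192000 / 30 = 6400 bytes of audio per frame.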

SafetyBytes

Finally, we need to decide what we consider a safety margin. It's the potential variance in frame length, or how much a frame can stray from its intended frequency.

It should be relatively small but not negligible: we could say it's a third of a frame. We can adjust this value later on if necessary and maybe even compute the variance in the future.

We already know how to calculate sound bytes per frame, so let's simply say it's that value divided by three.

Also, we can incorporate the result as a part of the sound output structure.

win32_sound_output SoundOutput = {};
SoundOutput.SamplesPerSecond = 48000;
SoundOutput.BytesPerSample = sizeof(s16) * 2;
SoundOutput.SecondaryBufferSize = 2 * SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample;
SoundOutput.RunningSampleIndex = 0;
// TODO(casey): Actually compute this variance and see 
// what the lowest reasonable value is.
SoundOutput.SafetyBytes = ((SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample) 
    / GameUpdateHz) / 3;

Listing 19: [win32_handmade.cpp > WinMain] Calculating safety bytes.

struct win32_sound_output
{
    int SamplesPerSecond;
    int BytesPerSample;
    DWORD SecondaryBufferSize;
    u32 RunningSampleIndex;
    DWORD SafetyBytes;
    // TODO(casey): Math gets simpler if we add a "bytes per second" field?
};

Listing 20: [win32_handmade.h] Updating win32_sound_output structure.
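It helps to know how large this margin actually is. The sketch below evaluates the SafetyBytes expression for our settings and converts it to milliseconds; both helper names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// A third of one frame's worth of sound bytes (integer division throughout).
static uint32_t SafetyBytesFor(uint32_t SamplesPerSecond,
                               uint32_t BytesPerSample,
                               uint32_t UpdateHz)
{
    assert(UpdateHz > 0);
    return ((SamplesPerSecond * BytesPerSample) / UpdateHz) / 3;
}

// Hypothetical helper: how long a byte count lasts, in milliseconds.
static double BytesToMs(uint32_t Bytes,
                        uint32_t SamplesPerSecond,
                        uint32_t BytesPerSample)
{
    assert(SamplesPerSecond > 0 && BytesPerSample > 0);
    return 1000.0 * (double)Bytes / (double)(SamplesPerSecond * BytesPerSample);
}
```

For 48 kHz, 4 bytes per sample and 30 Hz, the margin is 6400 / 3 = 2133 bytes, or roughly 11 ms. Note that integer division truncates, so 2133 is not a whole number of 4-byte samples.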

We would now be mostly compilable, except we left one piece hanging in the SoundBuffer.SampleCount definition. Take your time to fix any other errors you might have before moving on.

Introduce `GameGetSoundSamples`

Until now, we've been working based on the assumption that we know the current frame's length. But we don't really; we only have an estimate based on the previous frames' data. We already moved the target cursor calculation after GameUpdateAndRender to mitigate this issue, but what this does is merely prepare the target cursor for the next frame (hoping that it will be as long as the current one). It doesn't really address the core issue.

This all should be fine since we know the expected frame time and we have our safety margins, and we're preparing ourselves for the victory, and it's all nice and straightforward.

Right?

Getting real audio sync is complicated. We need to sync two clocks: the game clock and the audio API clock. We also need to know the number of bytes to ask from the game, but to discover that, we need to prepare the samples... so it's getting into a bit of a vicious cycle.

Maybe it's time for a big decision. We will separate game writing audio from the game doing the world update and rendering!

Let's make it happen. First of all, we will introduce the new function declaration inside handmade.h. It will essentially be the same as GameUpdateAndRender, except it will take a sound buffer instead of input and render buffer.

We will also remove the sound buffer from GameUpdateAndRender, like so:

struct game_memory
{
    u64 PermanentStorageSize;
    void *PermanentStorage;
    u64 TransientStorageSize;
    void *TransientStorage;
    b32 IsInitialized;
};
internal void GameUpdateAndRender(game_memory *Memory, game_input *Input, game_offscreen_buffer* Buffer);
internal void GameGetSoundSamples(game_memory *Memory, game_sound_output_buffer *SoundBuffer);

Listing 21: [handmade.h] Splitting sound update from the rest of the game update.

Next, we will move the sound buffer preparation and call our new function once BytesToWrite is known, just before we write the audio out:

// NOTE: the SoundBuffer setup that used to sit here moves further down,
// after BytesToWrite has been computed.

game_offscreen_buffer Buffer = {};
Buffer.Memory = GlobalBackbuffer.Memory;
Buffer.Width = GlobalBackbuffer.Width;
Buffer.Height = GlobalBackbuffer.Height;
Buffer.Pitch = GlobalBackbuffer.Pitch;
GameUpdateAndRender(&GameMemory, NewInput, &Buffer);

DWORD PlayCursor;
DWORD WriteCursor;
if (GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor) == DS_OK)
{
    // ... 
    DWORD BytesToWrite = 0;
    if(ByteToLock > TargetCursor)
    {
        BytesToWrite = SoundOutput.SecondaryBufferSize - ByteToLock;
        BytesToWrite += TargetCursor;
    }
    else
    {
        BytesToWrite = TargetCursor - ByteToLock;
    }
    game_sound_output_buffer SoundBuffer = {};
    SoundBuffer.SamplesPerSecond = SoundOutput.SamplesPerSecond;
    SoundBuffer.SampleCount = BytesToWrite / SoundOutput.BytesPerSample;
    SoundBuffer.Samples = Samples;

    GameGetSoundSamples(&GameMemory, &SoundBuffer);
    Win32FillSoundBuffer(&SoundOutput, ByteToLock, BytesToWrite, &SoundBuffer);

Listing 22: [win32_handmade.cpp > WinMain] Using GameGetSoundSamples.

Now, let's look at what is happening inside handmade.cpp. We can see that we already set ourselves up for success by pulling the sound logic out of GameUpdateAndRender into a single function. Let's finish the job by introducing the new function definition:

internal void
GameUpdateAndRender(game_memory* Memory, game_input *Input, game_offscreen_buffer* Buffer)
{
    // ... 
    // GameOutputSound is no longer called from here; GameGetSoundSamples takes over.
    RenderWeirdGradient(Buffer, GameState->XOffset, GameState->YOffset);
}
internal void
GameGetSoundSamples(game_memory* Memory, game_sound_output_buffer *SoundBuffer)
{
    game_state *GameState = (game_state*)Memory->PermanentStorage;
    GameOutputSound(SoundBuffer, GameState->ToneHz);
}

Listing 23: [handmade.cpp] Defining GameGetSoundSamples.

And... that's all there was to it. Our game updates the world state, and then the sound system comes in and writes the sound based on it.

However, it's not over yet. We have written a lot of code by now, so there are probably some bugs left to catch.

Even More Debugging

Let's have some fun! We want to finish debugging sound without spending tons of time reading our logs. Fortunately, we have our debug line drawing routine that we can leverage.

Add Global Pause Key

The first thing that we will do, though, is not drawing. Our graph is running quite fast, so a pause button that would stop the simulation altogether would be great for this case and potentially for the future.

Currently, we have our main loop based on the global variable GlobalRunning. We could introduce a new global for pause purposes.

global_variable b32 GlobalRunning;
global_variable b32 GlobalPause;
global_variable win32_offscreen_buffer GlobalBackbuffer;
global_variable IDirectSoundBuffer *GlobalSecondaryBuffer;
global_variable s64 GlobalPerfCountFrequency;

Listing 24: [win32_handmade.cpp] Introducing Global Pause.

Now we can toggle this paused state on and off by pressing a key, say 'P'. If the pause is on, it will be toggled off, and vice versa.

else if (VKCode == VK_BACK)
{
    Win32ProcessKeyboardMessage(&KeyboardController->Back, IsDown);
}
#if HANDMADE_INTERNAL
else if (VKCode == 'P')
{
    if (IsDown)
    {
        GlobalPause = !GlobalPause;
    }
}
#endif
b32 AltKeyWasDown = ((Message.lParam & (1 << 29)) != 0);

Listing 25: [win32_handmade.cpp > Win32ProcessPendingMessages] Adding Global Pause Hotkey.
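The toggle fires only when IsDown is set, so it flips once per key press rather than on both press and release. This relies on the surrounding code already ignoring messages where the key state did not change, which is an assumption here since that part of the function is not shown. A tiny sketch of the behavior, with a hypothetical `key_event` struct and `ApplyPauseKey` helper invented for this example:

```cpp
#include <cassert>

// Hypothetical, simplified view of one keyboard message.
struct key_event
{
    bool WasDown; // key state before this message
    bool IsDown;  // key state after this message
};

// Returns the new pause state after processing one message for the pause key.
static bool
ApplyPauseKey(bool Paused, key_event Event)
{
    // Ignore auto-repeat messages, where the state did not actually change.
    if (Event.WasDown != Event.IsDown)
    {
        // Toggle on the press only; the release leaves the state alone.
        if (Event.IsDown)
        {
            Paused = !Paused;
        }
    }
    return Paused;
}
```

Feeding it a press, a repeat, and a release in sequence changes the pause state exactly once; the next press flips it back.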

As for the functionality, we will introduce a big if block immediately following our keyboard processing routine. If the pause is not on, we will execute the rest of the code. If it is, we will simply skip to the next iteration of the GlobalRunning loop, leaving our buffers in the same state as they were.

Win32ProcessPendingMessages(NewKeyboardController);
if (!GlobalPause)
{
    DWORD MaxControllerCount = XUSER_MAX_COUNT;
    if(MaxControllerCount > (ArrayCount(NewInput->Controllers) - 1))
    {
        MaxControllerCount = (ArrayCount(NewInput->Controllers) - 1);
    }
    // ...
    // All of the game update
    // ...
    
#if HANDMADE_INTERNAL
    ++DebugTimeMarkerIndex;
    if (DebugTimeMarkerIndex == ArrayCount(DebugTimeMarkers))
    {
        DebugTimeMarkerIndex = 0;
    }
#endif
}

Listing 26: [win32_handmade.cpp > WinMain] Implementing Global Pause.

This is not how you should make your final game pause state! Keep in mind that, with this pause, our sleep-based frame cycle goes straight out of the window. It's a quick and easy tool to help us with debugging.

Figure 10: If your WinMain block nesting is confusing you, this is what the end of the file should look like.

Highlight Latest Marker

Currently, we're visualizing all the markers containing play and write cursors at the same level. We can highlight the latest one to give it a bit more importance. Let's say there's a current marker index, and if we hit it, we draw that marker below the rest.

This is how it would look in our drawing routine:

internal void
Win32DebugSyncDisplay(win32_offscreen_buffer *Backbuffer,
                      int MarkerCount, win32_debug_time_marker *Markers,
                      int CurrentMarkerIndex,
                      win32_sound_output *SoundOutput, f32 TargetSecondsPerFrame)
{
    int PadX = 16;
    int PadY = 16;
    int LineHeight = 64;
    
    f32 C = (f32)(Backbuffer->Width - 2 * PadX) / (f32)SoundOutput->SecondaryBufferSize;
    for (int MarkerIndex = 0;
         MarkerIndex < MarkerCount;
         ++MarkerIndex)
    {
        DWORD PlayColor = 0xFFFFFFFF;
        DWORD WriteColor = 0xFFFF0000;
        
        int Top = PadY;
        int Bottom = PadY + LineHeight;
        
        if (MarkerIndex == CurrentMarkerIndex)
        {
            Top += LineHeight + PadY;
            Bottom += LineHeight + PadY;
        }
        
        win32_debug_time_marker *ThisMarker = &Markers[MarkerIndex];
        Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom, ThisMarker->PlayCursor, PlayColor);
        Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom, ThisMarker->WriteCursor, WriteColor);
    }
}

Listing 27: [win32_handmade.cpp] Improving Win32DebugSyncDisplay.

To get it to work, we need to pass the current marker index to the function. Since we're advancing DebugTimeMarkerIndex to write the markers, we can simply give it as is.

This introduces a small bug when the index is 0: DebugTimeMarkerIndex - 1 then no longer refers to a valid marker (ideally it would wrap around to the last one). Since this is debug code, we don't really care, so we'll leave a TODO for the time being.

#if HANDMADE_INTERNAL
// TODO(casey): Note, current is wrong on the zeroth index
Win32DebugSyncDisplay(&GlobalBackbuffer,
                        ArrayCount(DebugTimeMarkers), DebugTimeMarkers,
                        DebugTimeMarkerIndex - 1,
                        &SoundOutput, TargetSecondsPerFrame);
#endif

Listing 28: [win32_handmade.cpp > WinMain] Passing current marker index.
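If the zeroth-index quirk ever becomes annoying, one possible fix (not what the listing above does) is to wrap the index around manually. `PreviousMarkerIndex` is a hypothetical helper name used only in this sketch:

```cpp
#include <cassert>

// Hypothetical helper: index of the slot written just before Index
// in a ring of Count slots. Adding Count first keeps the value non-negative.
static int
PreviousMarkerIndex(int Index, int Count)
{
    assert(Count > 0);
    assert((Index >= 0) && (Index < Count));
    return (Index + Count - 1) % Count;
}
```

With, say, 15 markers (an arbitrary count for illustration), `PreviousMarkerIndex(0, 15)` is 14 and `PreviousMarkerIndex(7, 15)` is 6.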

Compile and run to make sure that everything is working.

Record More Values

We have this newfound space for our current marker that we can use to display more values. Heck, we can even go all in and show every value we have.

struct win32_debug_time_marker
{
    DWORD OutputPlayCursor;
    DWORD OutputWriteCursor;
    DWORD OutputLocation; 
    DWORD OutputByteCount;
    
    DWORD FlipPlayCursor;
    DWORD FlipWriteCursor;
};

Listing 29: [win32_handmade.h] Updating old values and adding some new ones.

We have our play cursor and write cursor at the frame flip time (that we'll call FlipPlayCursor and FlipWriteCursor), cursors at the output time, byte to lock, and how many bytes to write.

However, since we now utilize a marker in more than one place, we can't really advance the DebugTimeMarkerIndex when we access it. We'll do it at the very end of our frame before we're ready to flip.

#if HANDMADE_INTERNAL
// NOTE(casey): This is debug code
{
    DWORD FlipPlayCursor = 0;
    DWORD FlipWriteCursor = 0;
    if (SUCCEEDED(GlobalSecondaryBuffer->GetCurrentPosition(&FlipPlayCursor, &FlipWriteCursor)))
    {
        Assert(DebugTimeMarkerIndex < ArrayCount(DebugTimeMarkers));
        win32_debug_time_marker *Marker = &DebugTimeMarkers[DebugTimeMarkerIndex];
        Marker->FlipPlayCursor = FlipPlayCursor;
        Marker->FlipWriteCursor = FlipWriteCursor;
    }
}
#endif

// ... 

#if 0
// debug timing output
// ...
#endif
#if HANDMADE_INTERNAL
++DebugTimeMarkerIndex;
if (DebugTimeMarkerIndex == ArrayCount(DebugTimeMarkers))
{
    DebugTimeMarkerIndex = 0;
}
#endif

Listing 30: [win32_handmade.cpp > WinMain] Making sure time marker index is only updated once per frame.

Now that we have a more permanent marker index, we can access the marker from our sound code. Let's do it in the HANDMADE_INTERNAL section, where we print out the values to the Output console.

GameGetSoundSamples(&GameMemory, &SoundBuffer);
                        
Win32FillSoundBuffer(&SoundOutput, ByteToLock, BytesToWrite, &SoundBuffer);

#if HANDMADE_INTERNAL
win32_debug_time_marker *Marker = &DebugTimeMarkers[DebugTimeMarkerIndex];
Marker->OutputPlayCursor = PlayCursor;
Marker->OutputWriteCursor = WriteCursor;
Marker->OutputLocation = ByteToLock;
Marker->OutputByteCount = BytesToWrite;

//... 
OutputDebugStringA(TextBuffer);
#endif

Listing 31: [win32_handmade.cpp > WinMain] Recording output markers.

As for Win32DebugSyncDisplay, we first need to update the struct members it currently uses:

win32_debug_time_marker *ThisMarker = &Markers[MarkerIndex];
Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom,
                           ThisMarker->FlipPlayCursor, PlayColor);
Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom,
                           ThisMarker->FlipWriteCursor, WriteColor);

Listing 32: [win32_handmade.cpp > Win32DebugSyncDisplay] Making us compilable.

So far, we're recording the new values, but we don't display them. You can compile and test it out: no changes yet. Let's add the newly recorded values to our debug display. Instead of introducing more colors, we'll simply reuse the PlayColor and WriteColor we already have and draw each new pair one row further down.

win32_debug_time_marker *ThisMarker = &Markers[MarkerIndex];

DWORD PlayColor = 0xFFFFFFFF;
DWORD WriteColor = 0xFFFF0000;

int Top = PadY;
int Bottom = PadY + LineHeight;

if (MarkerIndex == CurrentMarkerIndex)
{
    Top += LineHeight + PadY;
    Bottom += LineHeight + PadY;

    Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom,
                               ThisMarker->OutputPlayCursor, PlayColor);
    Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom,
                               ThisMarker->OutputWriteCursor, WriteColor);

    Top += LineHeight + PadY;
    Bottom += LineHeight + PadY;

    Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom,
                               ThisMarker->OutputLocation, PlayColor);
    Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom,
                               ThisMarker->OutputLocation + ThisMarker->OutputByteCount, WriteColor);

    Top += LineHeight + PadY;
    Bottom += LineHeight + PadY;
}

Listing 33: [win32_handmade.cpp > Win32DebugSyncDisplay] Displaying new values (only on the current cursor).
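Each extra row is just the previous one pushed down by `LineHeight + PadY`. The same offsets can also be computed directly from a row number. `row_bounds` and `GetRowBounds` are hypothetical names used only in this sketch:

```cpp
#include <cassert>

struct row_bounds
{
    int Top;
    int Bottom;
};

// Row 0 is the topmost row; every following row starts one
// (LineHeight + PadY) step lower, mirroring the repeated
// "Top += LineHeight + PadY" in the listing above.
static row_bounds
GetRowBounds(int Row, int PadY, int LineHeight)
{
    assert(Row >= 0);
    row_bounds Result;
    Result.Top = PadY + Row * (LineHeight + PadY);
    Result.Bottom = Result.Top + LineHeight;
    return Result;
}
```

With `PadY = 16` and `LineHeight = 64`, row 0 spans 16 to 80, row 1 spans 96 to 160, and row 3 (where the flip cursors of the current marker end up) spans 256 to 320.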

Now that we're passing potentially out-of-bounds values, we must make sure that our program doesn't crash. We can do it by clamping the vertical limits and by skipping the line altogether if X falls outside the buffer.

internal void
Win32DebugDrawVertical(win32_offscreen_buffer *Backbuffer,
                       int X, int Top, int Bottom, u32 Color)
{
    if (Top <= 0)
    {
        Top = 0;
    }
    
    if (Bottom > Backbuffer->Height)
    {
        Bottom = Backbuffer->Height;
    }
    
    if ((X >= 0) && (X < Backbuffer->Width))
    {
        u8 *Pixel = (u8 *)Backbuffer->Memory +
            X * Backbuffer->BytesPerPixel +
            Top * Backbuffer->Pitch;
        for (int Y = Top;
             Y < Bottom;
             ++Y)
        {
            *(u32 *)Pixel = Color;
            Pixel += Backbuffer->Pitch;
        }
    }
}

Listing 34: [win32_handmade.cpp] Making sure we never go out of bounds.
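To see the clamping at work outside of the Win32 layer, here is a small self-contained sketch of the same idea. `pixel_buffer`, `MakeBuffer`, `DrawVerticalClamped` and `CountPixels` are hypothetical stand-ins invented for this example; the real code works on win32_offscreen_buffer with a pitch in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint32_t u32;

// Hypothetical stand-in for the backbuffer: one u32 per pixel, rows packed tightly.
struct pixel_buffer
{
    int Width;
    int Height;
    std::vector<u32> Pixels;
};

static pixel_buffer
MakeBuffer(int Width, int Height)
{
    pixel_buffer Result;
    Result.Width = Width;
    Result.Height = Height;
    Result.Pixels.assign((std::size_t)(Width * Height), 0);
    return Result;
}

// Clamp the vertical range to the buffer and skip columns that are off-screen.
static void
DrawVerticalClamped(pixel_buffer *Buffer, int X, int Top, int Bottom, u32 Color)
{
    if (Top < 0)
    {
        Top = 0;
    }
    if (Bottom > Buffer->Height)
    {
        Bottom = Buffer->Height;
    }
    if ((X >= 0) && (X < Buffer->Width))
    {
        for (int Y = Top; Y < Bottom; ++Y)
        {
            Buffer->Pixels[(std::size_t)(Y * Buffer->Width + X)] = Color;
        }
    }
}

// Counts how many pixels currently hold Color.
static int
CountPixels(const pixel_buffer *Buffer, u32 Color)
{
    int Count = 0;
    for (u32 Pixel : Buffer->Pixels)
    {
        if (Pixel == Color)
        {
            ++Count;
        }
    }
    return Count;
}
```

On a 4x3 buffer, a line requested from row -5 to row 100 touches exactly 3 pixels, and a line at X = 7 or X = -1 touches none.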

Since some of these values can now legitimately fall outside the buffer (OutputLocation + OutputByteCount, for example), a blanket assertion inside Win32DrawSoundBufferMarker would no longer hold. Instead, we assert on the individual marker values on a case-by-case basis.

inline void
Win32DrawSoundBufferMarker(...)
{
    // The blanket Assert(Value < SoundOutput->SecondaryBufferSize) is gone from here.
    f32 XReal = C * (f32)Value;
    int X = PadX + (int)XReal;
    
    Win32DebugDrawVertical(Backbuffer, X, Top, Bottom, Color);
}

internal void
Win32DebugSyncDisplay(...)
{
    int PadX = 16;
    int PadY = 16;
    
    int LineHeight = 64;
    
    f32 C = (f32)(Backbuffer->Width - 2 * PadX) / (f32)SoundOutput->SecondaryBufferSize;
    for (int MarkerIndex = 0;
         MarkerIndex < MarkerCount;
         ++MarkerIndex)
    {
        win32_debug_time_marker *ThisMarker = &Markers[MarkerIndex];
        Assert(ThisMarker->OutputPlayCursor < SoundOutput->SecondaryBufferSize);
        Assert(ThisMarker->OutputWriteCursor < SoundOutput->SecondaryBufferSize);
        Assert(ThisMarker->OutputLocation < SoundOutput->SecondaryBufferSize);
        Assert(ThisMarker->OutputByteCount < SoundOutput->SecondaryBufferSize);
        Assert(ThisMarker->FlipPlayCursor < SoundOutput->SecondaryBufferSize);
        Assert(ThisMarker->FlipWriteCursor < SoundOutput->SecondaryBufferSize);

Listing 35: [win32_handmade.cpp] Fixing the assertions.

If you compile and run now, you'll see something like this:

Figure 11: Current flip frame data moved to the very bottom, and the new values are inserted in the middle.

Display Expected Frame Flip Time

At this point, we want to see if our estimate lines up at all. We assume that the frame flip happens as close to the current frame's FlipPlayCursor as possible (white bottom line). Let's put our FrameBoundaryByte as the expected flip play cursor and see if they are even close.

Again, we first add the new member to the win32_debug_time_marker structure:

struct win32_debug_time_marker
{
    DWORD OutputPlayCursor;
    DWORD OutputWriteCursor;
    DWORD OutputLocation;
    DWORD OutputByteCount;
    DWORD ExpectedFlipPlayCursor;
    
    DWORD FlipPlayCursor;
    DWORD FlipWriteCursor;
};

Listing 36: [win32_handmade.h] Expanding win32_debug_time_marker.

We then capture the cursor:

win32_debug_time_marker *Marker = &DebugTimeMarkers[DebugTimeMarkerIndex];
Marker->OutputPlayCursor = PlayCursor;
Marker->OutputWriteCursor = WriteCursor;
Marker->OutputLocation = ByteToLock;
Marker->OutputByteCount = BytesToWrite;
Marker->ExpectedFlipPlayCursor = ExpectedFrameBoundaryByte;

Listing 37: [win32_handmade.cpp > WinMain] Recording calculated frame flip cursor.

Finally, we display it on screen. Let's say that this line will be a big yellow one:

DWORD PlayColor = 0xFFFFFFFF;         // White
DWORD WriteColor = 0xFFFF0000;        // Red
DWORD ExpectedFlipColor = 0xFFFFFF00; // Yellow

int Top = PadY;
int Bottom = PadY + LineHeight;

if (MarkerIndex == CurrentMarkerIndex)
{
    Top += LineHeight + PadY;
    Bottom += LineHeight + PadY;
    
    Win32DrawSoundBufferMarker(..., ThisMarker->OutputPlayCursor, PlayColor);
    Win32DrawSoundBufferMarker(..., ThisMarker->OutputWriteCursor, WriteColor);
    
    Top += LineHeight + PadY;
    Bottom += LineHeight + PadY;
    
    Win32DrawSoundBufferMarker(..., ThisMarker->OutputLocation, PlayColor);
    Win32DrawSoundBufferMarker(..., ThisMarker->OutputLocation + ThisMarker->OutputByteCount, WriteColor);
    
    Top += LineHeight + PadY;
    Bottom += LineHeight + PadY;

    Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, PadY, Bottom, 
                               ThisMarker->ExpectedFlipPlayCursor, ExpectedFlipColor);
}

Listing 38: [win32_handmade.cpp > Win32DebugSyncDisplay] Displaying expected flip play cursor.

Figure 12: Expected Flip Play Cursor.

Display Play Window

The actual flip play cursor and the expected one don't really line up perfectly. What's going on?

One possibility is that the update lag of DirectSound itself doesn't allow correct visualization. We calculated that the cursors move every 480 samples or ~3.3 times per frame, if you recall. It implies that the cursor we get is more of a window in time than a point in time. Let's visualize it as well, this time in a lovely magenta.

DWORD PlayColor = 0xFFFFFFFF;
DWORD WriteColor = 0xFFFF0000;
DWORD ExpectedFlipColor = 0xFFFFFF00;
DWORD PlayWindowColor = 0xFFFF00FF;

int Top = PadY;
int Bottom = PadY + LineHeight;

if (MarkerIndex == CurrentMarkerIndex)
{
    // ...
}

Win32DrawSoundBufferMarker(..., ThisMarker->FlipPlayCursor, PlayColor);
Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom, 
                           ThisMarker->FlipPlayCursor + (480 * SoundOutput->BytesPerSample), 
                           PlayWindowColor);
Win32DrawSoundBufferMarker(Backbuffer, SoundOutput, C, PadX, Top, Bottom, ThisMarker->FlipWriteCursor, WriteColor);

Listing 39: [win32_handmade.cpp > Win32DebugSyncDisplay] Displaying the boundaries of the play window.
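As a sanity check on the ~3.3 figure: assuming 48000 samples per second and a 30 Hz game update (both are assumptions here; substitute your own SoundOutput.SamplesPerSecond and GameUpdateHz), one frame spans 1600 samples, and 1600 / 480 ≈ 3.33. The sketch below also wraps the far edge of the window back into the circular buffer, which the listing above does not do. `CursorUpdatesPerFrame` and `PlayWindowEnd` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t u32;
typedef float f32;

// How many cursor updates fit into one frame.
static f32
CursorUpdatesPerFrame(u32 SamplesPerSecond, u32 GameUpdateHz, u32 SamplesPerCursorUpdate)
{
    assert(GameUpdateHz > 0);
    assert(SamplesPerCursorUpdate > 0);
    f32 SamplesPerFrame = (f32)SamplesPerSecond / (f32)GameUpdateHz;
    return SamplesPerFrame / (f32)SamplesPerCursorUpdate;
}

// Far edge of the play window in bytes, wrapped into a buffer of BufferSize bytes.
static u32
PlayWindowEnd(u32 PlayCursor, u32 WindowSamples, u32 BytesPerSample, u32 BufferSize)
{
    assert(BufferSize > 0);
    return (PlayCursor + WindowSamples * BytesPerSample) % BufferSize;
}
```

With 4 bytes per sample and a 192000-byte buffer (again, illustrative numbers), a play cursor at byte 1000 gives a window ending at byte 2920, while a cursor at byte 191000 wraps around to byte 920.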

Fixing the ExpectedFrameBoundaryByte Calculation

If our data is correct, ExpectedFrameBoundaryByte should always fall between those two lines. But it doesn't. If you look closely, it's very rarely inside the window, and more often than not it drifts away. What's going on?

Taking a closer look at ExpectedFrameBoundaryByte, we can see that the computation is wrong:

DWORD ExpectedFrameBoundaryByte = PlayCursor + ExpectedSoundBytesPerFrame;

That's not what we set out to do! This doesn't take into account whatever amount of time has already elapsed since the frame started. Let's fix this.

First, we need an exact wall clock snapshot taken right at the frame flip. Our current LastCounter is more liberal in this regard: it only measures one loop iteration against the next, so it doesn't need precise placement at the frame flip. The flip wall clock should be sampled right after we send the render buffer to Windows:

// Before the main loop
b32 SoundIsValid = false;

LARGE_INTEGER LastCounter = Win32GetWallClock();
LARGE_INTEGER FlipWallClock = Win32GetWallClock();

u64 LastCycleCount = __rdtsc();
GlobalRunning = true;
win32_window_dimension Dimension = Win32GetWindowDimension(Window);

// ... 
// At the end of the main loop
// ... 

#if HANDMADE_INTERNAL
Win32DebugSyncDisplay(...);
#endif
Win32DisplayBufferInWindow(...);
FlipWallClock = Win32GetWallClock();

#if HANDMADE_INTERNAL
// ... 
#endif

Listing 40: [win32_handmade.cpp > WinMain] Getting the actual frame flip timing.

This allows us to measure the time elapsed since the beginning of the frame. Once we know how many seconds have passed since the beginning of the frame, we can derive the correct bytes until flip using a simple proportion.

GameUpdateAndRender(&GameMemory, NewInput, &Buffer);

LARGE_INTEGER AudioWallClock = Win32GetWallClock();
f32 FromBeginToAudioSeconds = Win32GetSecondsElapsed(FlipWallClock, AudioWallClock);

// Just before the main audio block
DWORD PlayCursor;
DWORD WriteCursor;
if (GlobalSecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor) == DS_OK)
{
    // ... 
    DWORD ByteToLock = ((SoundOutput.RunningSampleIndex * SoundOutput.BytesPerSample)
                                                    % SoundOutput.SecondaryBufferSize);
                                
    DWORD ExpectedSoundBytesPerFrame = (SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample)
        / GameUpdateHz;
    f32 SecondsLeftUntilFlip = TargetSecondsPerFrame - FromBeginToAudioSeconds;
    DWORD ExpectedBytesUntilFlip = (DWORD)((SecondsLeftUntilFlip / TargetSecondsPerFrame) * (f32)ExpectedSoundBytesPerFrame);
    DWORD ExpectedFrameBoundaryByte = PlayCursor + ExpectedBytesUntilFlip;
}

Listing 41: [win32_handmade.cpp > WinMain] Calculating correct frame boundary byte.
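To make the proportion concrete, here is a minimal sketch of the same calculation as a standalone function. The function name is hypothetical, and clamping a late frame to zero is an addition of this example (the listing above does not do it); it avoids converting a negative float to an unsigned value when the audio code runs after the flip deadline.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t u32;
typedef float f32;

// Scales the per-frame byte budget by the fraction of the frame that is still ahead of us.
static u32
ExpectedBytesUntilFlip(f32 TargetSecondsPerFrame, f32 FromBeginToAudioSeconds,
                       u32 ExpectedSoundBytesPerFrame)
{
    assert(TargetSecondsPerFrame > 0.0f);

    f32 SecondsLeftUntilFlip = TargetSecondsPerFrame - FromBeginToAudioSeconds;
    if (SecondsLeftUntilFlip < 0.0f)
    {
        // Assumption of this sketch: if we are already past the flip, treat it as "now".
        SecondsLeftUntilFlip = 0.0f;
    }

    return (u32)((SecondsLeftUntilFlip / TargetSecondsPerFrame) *
                 (f32)ExpectedSoundBytesPerFrame);
}
```

For example, with a budget of 6400 bytes per frame (48000 samples per second, 4 bytes per sample, 30 Hz, all illustrative), reaching the audio code a quarter of the way into the frame leaves three quarters of the budget, or 4800 bytes, until the flip. Reaching it halfway through leaves roughly 3200.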

If we compile and run now, we'll see that the flip play cursor and the calculated flip play cursor line up almost perfectly! Mission accomplished.

Recap

Audio is tricky to get right, and few topics in programming are harder than this. However, if you got this far, congratulations! From here on, it should be a breeze.

Side Considerations

Bugfix: Sound Changing Pitch

If you listen to the game output for long enough, you'll eventually hear it change in pitch over time, without any user input.

This is due to floating-point precision loss. tSine keeps growing with every sample we output, and the larger a floating-point value gets, the less precisely it can represent small increments. We can fix this by pulling tSine back by one full period (2π) whenever it exceeds it:

f32 SineValue = sinf(tSine);
s16 SampleValue = (s16)(SineValue * ToneVolume);

*SampleOut++ = SampleValue;
*SampleOut++ = SampleValue;
tSine += 2.0f * Pi32 * 1.0f / (f32)WavePeriod;
if (tSine > 2.0f * Pi32)
{
    tSine -= 2.0f * Pi32;
}

Listing 42: [handmade.cpp > GameOutputSound] Normalizing tSine.
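Why does an ever-growing tSine hurt? A 32-bit float carries about 24 bits of precision, so the bigger the value, the coarser the steps it can take. The sketch below illustrates both the problem and the fix; `Tau32`, `AdvancePhase` and `StepIsRepresentable` are hypothetical names introduced for this example only.

```cpp
#include <cassert>

typedef float f32;

static const f32 Tau32 = 2.0f * 3.14159265359f; // one full sine period

// Advances the phase by one sample and wraps it back once it passes a full period.
static f32
AdvancePhase(f32 tSine, int WavePeriod)
{
    assert(WavePeriod > 0);
    tSine += Tau32 / (f32)WavePeriod;
    if (tSine > Tau32)
    {
        tSine -= Tau32;
    }
    return tSine;
}

// True if adding Step to Value actually changes Value in 32-bit float arithmetic.
static bool
StepIsRepresentable(f32 Value, f32 Step)
{
    volatile f32 Sum = Value + Step; // volatile forces the result to be rounded to f32
    return Sum != Value;
}
```

Near 1.0f, a step of 0.1f registers just fine, but at 100000000.0f the same step is rounded away entirely, so the phase would stop advancing by the right amount. With the wrap in place, tSine never leaves the range from 0 to 2π, no matter how many samples are produced.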

Navigation

Previous: Day 19. Improving Audio Synchronization

Up Next: Day 21. Loading Game Code Dynamically

Back to Index

Glossary

Frame flip
Dimensional analysis
Input lag

