Day 25. Finishing the Win32 Prototyping Layer

Video Length (including Q&A): 02h30

Welcome to “Handmade Hero Notes”, the book where we follow the footsteps of Handmade Hero in making the complete game from scratch, with no external libraries. If you'd like to follow along, preorder the game on handmadehero.org, and you will receive access to the GitHub repository, containing complete source code (tagged day-by-day) as well as a variety of other useful resources.

As you have seen, building a Win32 prototyping layer is not that hard. Of course, building a solid shippable platform layer takes more time, so there will be more Win32 work down the line. Even then, it won't be a tremendous amount of work.

Today marks the last day of Win32-specific coding. Moving forward, we will no longer be thinking about Win32-specific code. We will revisit it, of course, but only as part of our cross-platform work. For instance, we still don't have a logging service. We will implement it in the Win32 layer, as we did with the other services, and then provide it to the game.

Get Actual Monitor Refresh Rate
Prepare for Future Multi-Threading
Pass Mouse Input to the Game
Revisit Recording Code
  4.1  Keep Game Memory in Memory
  4.2  Poor Man's Profiling
  4.3  Map the Replay File to Memory
  4.4  Split the File Output
  4.5  Interrupt Playback
Clean Up Debug Code
  5.1  Comment Out DebugSyncDisplay
  5.2  Remove Framerate Printout
Recap
Navigation

   

Get Actual Monitor Refresh Rate

Currently, we're hard-coding MonitorRefreshHz to 60, but we need to know the actual refresh rate. We can query it with the GetDeviceCaps function. We looked into this function in the past and decided against it because the VREFRESH value can come back as 0 or 1, which only means “the display hardware's default rate” and tells us nothing useful. That said, most of the time the function returns a usable value, so we will use it when we can and keep 60 as the fallback for MonitorRefreshHz.

// TODO(casey): How do we reliably query on monitor refresh rate on Windows?
// #define MonitorRefreshHz 60
int MonitorRefreshHz = 60;
int Win32RefreshRate = GetDeviceCaps(DeviceContext, VREFRESH);
if (Win32RefreshRate > 1)
{
    MonitorRefreshHz = Win32RefreshRate;
}
#define GameUpdateHz (MonitorRefreshHz / 2)
 Listing 1: [win32_handmade.cpp > WinMain] Getting actual refresh rate.

We no longer have a device context of our own at this point, so let's quickly get one and release it when we're done. Make sure you already have the window handle (if you don't, simply move the block below to a point where you do).

int MonitorRefreshHz = 60;
HDC RefreshDC = GetDC(Window);
int Win32RefreshRate = GetDeviceCaps(RefreshDC, VREFRESH);
ReleaseDC(Window, RefreshDC);
if (Win32RefreshRate > 1) { MonitorRefreshHz = Win32RefreshRate; }
 Listing 2: [win32_handmade.cpp > WinMain] Retrieving and releasing a device context.
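The fallback rule itself can be separated from the Win32 call. The sketch below uses a hypothetical helper, ChooseRefreshHz (not part of the Handmade Hero source), that takes whatever value GetDeviceCaps reported and applies the same rule as the listing above: 0 and 1 are treated as "no usable answer", anything larger is trusted.

```cpp
#include <cassert>

// Hypothetical helper, not from the original source: applies the same
// fallback rule as the listing above to a value reported by the OS.
static int ChooseRefreshHz(int ReportedHz, int FallbackHz)
{
    assert(FallbackHz > 1); // the fallback itself must be a usable rate

    // A VREFRESH value of 0 or 1 only means "hardware default rate",
    // so we keep the fallback in that case.
    int Result = FallbackHz;
    if (ReportedHz > 1)
    {
        Result = ReportedHz;
    }
    return Result;
}
```

With a fallback of 60, a reported value of 0 or 1 yields 60, while a reported 144 yields 144.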

Speaking of GameUpdateHz, we will eventually set it to the same value as MonitorRefreshHz so that the game runs at the monitor's rate. For now, because of performance concerns, we will leave it at half the monitor refresh rate. That said, this value is currently an integer, which becomes an issue if the monitor refresh rate is odd (say, 59 instead of 60): the division by two silently drops the fraction. We can define GameUpdateHz as a float instead and fix any code that relies on it being an integer.

f32 GameUpdateHz = (MonitorRefreshHz / 2.0f);
f32 TargetSecondsPerFrame = 1.0f / GameUpdateHz;

// NOTE(casey): Sound test
win32_sound_output SoundOutput = {};
SoundOutput.SamplesPerSecond = 48000;
SoundOutput.BytesPerSample = sizeof(s16) * 2;
SoundOutput.SecondaryBufferSize = 2 * SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample;
SoundOutput.RunningSampleIndex = 0;
// TODO(casey): Actually compute this variance and see
// what the lowest reasonable value is.
SoundOutput.SafetyBytes = (int)(((f32)SoundOutput.SamplesPerSecond * (f32)SoundOutput.BytesPerSample / GameUpdateHz) / 3.0f);

// ...

int DebugTimeMarkerIndex = 0;
win32_debug_time_marker DebugTimeMarkers[30] = {};

// ...

DWORD ByteToLock = ((SoundOutput.RunningSampleIndex * SoundOutput.BytesPerSample) % SoundOutput.SecondaryBufferSize);
DWORD ExpectedSoundBytesPerFrame = (DWORD)((f32)(SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample) / GameUpdateHz);
f32 SecondsLeftUntilFlip = TargetSecondsPerFrame - FromBeginToAudioSeconds;
DWORD ExpectedBytesUntilFlip = (DWORD)((SecondsLeftUntilFlip / TargetSecondsPerFrame) * (f32)ExpectedSoundBytesPerFrame);
 Listing 3: [win32_handmade.cpp > WinMain] Changing GameUpdateHz to a floating-point value.
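To see why the integer version matters, here is a small standalone sketch comparing the two ways of computing the expected sound bytes per frame. The function names are made up for this illustration; the arithmetic mirrors the listing above, assuming 48000 samples per second and 4 bytes per sample.

```cpp
#include <cassert>

// Old behavior: the update rate is an integer, so 59 / 2 truncates to 29.
static int BytesPerFrameInt(int SamplesPerSecond, int BytesPerSample, int MonitorRefreshHz)
{
    assert(MonitorRefreshHz >= 2);
    int GameUpdateHz = MonitorRefreshHz / 2;
    return (SamplesPerSecond * BytesPerSample) / GameUpdateHz;
}

// New behavior: the update rate is a float, so 59 / 2.0f stays 29.5f.
static int BytesPerFrameFloat(int SamplesPerSecond, int BytesPerSample, int MonitorRefreshHz)
{
    assert(MonitorRefreshHz >= 2);
    float GameUpdateHz = MonitorRefreshHz / 2.0f;
    return (int)((float)(SamplesPerSecond * BytesPerSample) / GameUpdateHz);
}
```

On a 59 Hz monitor, the integer version expects 6620 bytes per frame while the float version expects 6508, so the truncated rate overestimates every frame by more than a hundred bytes. At 60 Hz both agree on 6400.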

The code compiles again, and the game runs with no visible changes. However, if you set a breakpoint right after the Win32RefreshRate query, you can see that we're now picking up the actual monitor refresh rate.

   

Prepare for Future Multi-Threading

There's one thing that we aren't going to touch right now. But, while we're at these early stages, it's a good idea to start preparing. We'll introduce a structure to hold any information relating to the current thread context, i.e., what thread you're in when you're running multi-threaded. This structure will contain absolutely nothing for the time being but will become essential in the future.

inline u32
SafeTruncateUInt64(u64 Value)
{
    Assert(Value <= 0xFFFFFFFF);
    u32 Result = (u32)Value;
    return (Result);
}
struct thread_context
{
    int Placeholder;
};
 Listing 4: [handmade.h] Introducing the thread_context structure.

We will pass a thread_context to every function that crosses the boundary between the game and the platform layer (so all the functions currently listed in handmade.h):

#define DEBUG_PLATFORM_FREE_FILE_MEMORY(name) void name (thread_context *Thread, void *Memory)
typedef DEBUG_PLATFORM_FREE_FILE_MEMORY(debug_platform_free_file_memory);
#define DEBUG_PLATFORM_READ_ENTIRE_FILE(name) debug_read_file_result name (thread_context *Thread, char *Filename)
typedef DEBUG_PLATFORM_READ_ENTIRE_FILE(debug_platform_read_entire_file);
#define DEBUG_PLATFORM_WRITE_ENTIRE_FILE(name) b32 name (thread_context *Thread, char *Filename, u32 MemorySize, void *Memory)
typedef DEBUG_PLATFORM_WRITE_ENTIRE_FILE(debug_platform_write_entire_file);

// ...

#define GAME_UPDATE_AND_RENDER(name) void name(thread_context *Thread, game_memory *Memory, game_input *Input, game_offscreen_buffer* Buffer)
typedef GAME_UPDATE_AND_RENDER(game_update_and_render);

// NOTE(casey): At the moment, this has to be a very fast function, it cannot be
// more than a millisecond or so.
// TODO(casey): Reduce the pressure on this function's performance by measuring it
// or asking about it, etc.
#define GAME_GET_SOUND_SAMPLES(name) void name(thread_context *Thread, game_memory *Memory, game_sound_output_buffer *SoundBuffer)
typedef GAME_GET_SOUND_SAMPLES(game_get_sound_samples);
 Listing 5: [handmade.h] Passing the thread context everywhere in the platform API.
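The macro-plus-typedef pattern is what makes this change cheap: the signature is written once, so adding the Thread parameter in the macro updates the function-pointer type and every implementation at the same time. Below is a minimal standalone sketch of the pattern. PLATFORM_ADD, platform_add, AddStub, and platform_api are hypothetical names invented for this illustration.

```cpp
#include <cassert>

struct thread_context
{
    int Placeholder;
};

// The macro spells out the signature exactly once...
#define PLATFORM_ADD(name) int name(thread_context *Thread, int A, int B)
// ...the typedef turns that signature into a function type...
typedef PLATFORM_ADD(platform_add);

// ...and each implementation reuses the macro, so it always matches.
static PLATFORM_ADD(AddStub)
{
    assert(Thread);   // every call is expected to carry a context
    return A + B;
}

// The function is handed around through a pointer, as the game does
// with the debug file services.
struct platform_api
{
    platform_add *Add;
};
```

A caller would fill in platform_api with AddStub and invoke it as Api.Add(&Thread, 2, 3).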

We will thread the new parameter through the few places in handmade.cpp where we currently call the platform layer:

debug_read_file_result FileData = Memory->DEBUGPlatformReadEntireFile(Thread, __FILE__);
if (FileData.Contents)
{
    Memory->DEBUGPlatformWriteEntireFile(Thread, "test.out", FileData.ContentsSize, FileData.Contents);
    Memory->DEBUGPlatformFreeFileMemory(Thread, FileData.Contents);
}
 Listing 6: [handmade.cpp] Passing the thread context inside the platform-independent layer.

The same goes for a single place inside win32_handmade.cpp:

DWORD BytesRead;
if (ReadFile(FileHandle, Result.Contents, FileSize32, &BytesRead, 0) &&
    (FileSize32 == BytesRead))
{
    // NOTE(casey): File read successfully
    Result.ContentsSize = BytesRead;
}
else
{
    // Error: Read failed
    // TODO(casey): Logging
    DEBUGPlatformFreeFileMemory(Thread, Result.Contents);
    Result.Contents = 0;
}
 Listing 7: [win32_handmade.cpp > DEBUGPlatformReadEntireFile] More thread context usage.

Finally, we will introduce the thread context inside WinMain. Again, it will contain absolutely nothing for the time being but will become useful in the future.

thread_context Thread = {};
game_offscreen_buffer Buffer = {};

// ...

if (Game.UpdateAndRender)
{
    Game.UpdateAndRender(&Thread, &GameMemory, NewInput, &Buffer);
}

// ...

if (Game.GetSoundSamples)
{
    Game.GetSoundSamples(&Thread, &GameMemory, &SoundBuffer);
}
 Listing 8: [win32_handmade.cpp > WinMain] Thread context initialization in the Win32 layer.

We made this addition because you might want to know which thread you're in, or access some data pertinent to the current one. On Windows, most of the time, you can use something called thread-local storage (TLS), which gives you a global variable whose value is specific to the current thread. While this facility can occasionally give decent results, we cannot rely on it for all uses. Additionally, other platforms we might want to support might not have this feature.

So with the introduction of the thread_context structure, we're getting into a habit of having it around. That's it for now.

   

Pass Mouse Input to the Game

While our game will be more of a keyboard/gamepad type of game, we would definitely benefit from having mouse input data. For instance, if we develop some debug overlay systems, using the mouse might be more appropriate.

Let's expand our game_input structure to include mouse input data. We already have the game_button_state structure, which we can reuse. Let's say our mouse has 5 buttons, and we want to store the state of each. We also want to capture the mouse X, Y, and Z axis movement, the latter capturing the state of the mouse wheel.

struct game_input
{
    game_button_state MouseButtons[5];
    s32 MouseX, MouseY, MouseZ;

    game_controller_input Controllers[5];
};
 Listing 9: [handmade.h] Storing mouse input.

Let's say we'll visualize our mouse position by reusing the handy RenderPlayer function. As the input, we'll pass the mouse position instead of the game state.

RenderWeirdGradient(Buffer, GameState->XOffset, GameState->YOffset);
RenderPlayer(Buffer, GameState->PlayerX, GameState->PlayerY);
RenderPlayer(Buffer, Input->MouseX, Input->MouseY);
 Listing 10: [handmade.cpp > GameUpdateAndRender] Drawing Mouse Cursor.

Now, how would we capture the mouse data? Inside win32_handmade.cpp, we'll need to add a few more things to pass the mouse position to the game. It will look something like this:

if (!GlobalPause)
{
    NewInput->MouseX = ;
    NewInput->MouseY = ;
    NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
    NewInput->MouseButtons[0] = ;
    NewInput->MouseButtons[1] = ;
    NewInput->MouseButtons[2] = ;
    // ...
}
 Listing 11: [win32_handmade.cpp > WinMain] Sketching out mouse capture.

Mouse X and Y can be retrieved using the GetCursorPos function. This function tells you where the cursor is supposed to be at any given time. The result is stored in a POINT structure, a simple struct containing only an x and a y coordinate. Unfortunately, you cannot simply use these coordinates as they are, like this:

POINT MouseP;
GetCursorPos(&MouseP);
NewInput->MouseX = MouseP.x;
NewInput->MouseY = MouseP.y;
NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
/*
NewInput->MouseButtons[0] = ;
NewInput->MouseButtons[1] = ;
NewInput->MouseButtons[2] = ;
*/
 Listing 12: [win32_handmade.cpp > WinMain] Retrieving mouse position.

If you compile and run this, you'll notice that the player position doesn't match the mouse cursor position. It's flat-out wrong. What's going on there? The thing is, the GetCursorPos function returns a position relative to the top-left corner of the screen, while the coordinate system of our window starts from the top-left corner of the window itself.

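[Figure: the physical monitor border encloses the screen, whose origin is Screen (0,0); the Handmade Hero window sits inside it with its own origin, Window (0,0).]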

 Figure 1: Screen coordinates and window coordinates have different origin points.

So what we need to do is to map the mouse coordinates to the window coordinates. We can do this using the ScreenToClient function. You pass it a window and a point in screen coordinates, and it will convert said point to window coordinates.

POINT MouseP;
GetCursorPos(&MouseP);
ScreenToClient(Window, &MouseP);
NewInput->MouseX = MouseP.x;
NewInput->MouseY = MouseP.y;
NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
/*
NewInput->MouseButtons[0] = ;
NewInput->MouseButtons[1] = ;
NewInput->MouseButtons[2] = ;
*/
 Listing 13: [win32_handmade.cpp > WinMain] Translating mouse position to the client space.
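Conceptually, the conversion is a subtraction of the window's origin from the cursor position. The sketch below is a simplified stand-in, not the real ScreenToClient: the point struct and ScreenToClientSketch are hypothetical names, and the real function may account for details this version ignores.

```cpp
#include <cassert>

struct point
{
    int X;
    int Y;
};

// Simplified model of the conversion: ClientOriginOnScreen is where the
// window's (0,0) sits, expressed in screen coordinates.
static point ScreenToClientSketch(point ScreenP, point ClientOriginOnScreen)
{
    point Result;
    Result.X = ScreenP.X - ClientOriginOnScreen.X;
    Result.Y = ScreenP.Y - ClientOriginOnScreen.Y;
    return Result;
}
```

If the window's origin sits at (400, 250) on the screen and the cursor is at (500, 300), the cursor is at (100, 50) in window coordinates. A cursor to the left of or above the window produces negative coordinates.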

We will revisit the coordinate systems in-depth in the future. Don't worry too much if you don't understand some of this now.

This system is not something you want to use in a shipped game; for instance, if you move the mouse to another monitor in a multi-monitor setup, the mouse coordinates will be wrong. But since we don't plan to use this code in the actual game, we can ignore it.

To read the state of the mouse buttons, we could track the mouse-related Windows messages. But we won't do that. Instead, we'll use the GetKeyState function, which reports whether a given key is currently up or down.

GetKeyState handles both keyboard keys and mouse buttons. If you want to see the list of all the supported keys, check out the Virtual-Key Codes article. The state of the Left, Middle, and Right buttons is available at the very top of the page (VK_LBUTTON, VK_MBUTTON, and VK_RBUTTON, respectively). You can even see the so-called “XButtons” right there (VK_XBUTTON1 and VK_XBUTTON2), so let's code them in as well.

There's a critical detail here. If you check the documentation of GetKeyState, you'll notice that the up/down state is stored in the high-order bit of the returned SHORT (a 16-bit integer). This means that to find out whether the key is down, we need to mask out the 16th bit (1 << 15): a non-zero result means the key is down. We don't care about the low-order bit; it only reports the toggle state of keys like CAPS LOCK (which can be “on” or “off”).

POINT MouseP;
GetCursorPos(&MouseP);
ScreenToClient(Window, &MouseP);
NewInput->MouseX = MouseP.x;
NewInput->MouseY = MouseP.y;
NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[0], GetKeyState(VK_LBUTTON) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[1], GetKeyState(VK_MBUTTON) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[2], GetKeyState(VK_RBUTTON) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[3], GetKeyState(VK_XBUTTON1) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[4], GetKeyState(VK_XBUTTON2) & (1 << 15));
 Listing 14: [win32_handmade.cpp > WinMain] Capturing mouse buttons.
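The bit test can be tried in isolation. The following sketch decodes a 16-bit value laid out the way GetKeyState's result is described above. The helper names are hypothetical, and the input values are made up rather than obtained from Windows.

```cpp
#include <cassert>

// High-order bit (bit 15): set while the key or button is held down.
static bool IsDownFromKeyState(short KeyState)
{
    return (KeyState & (1 << 15)) != 0;
}

// Low-order bit (bit 0): the toggle state, relevant for keys like CAPS LOCK.
static bool IsToggledFromKeyState(short KeyState)
{
    return (KeyState & 1) != 0;
}
```

Note that a value of 1 means "toggled on, but not held down", which is why testing the result for simple truthiness would give wrong answers.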

Let's see whether we are capturing our buttons! To do that... we'll use our RenderPlayer function again. This is how programming works: once you have functions that you can reuse, once you have something working, it builds on itself so quickly.

So our mouse testing code will look something like this:

RenderPlayer(Buffer, GameState->PlayerX, GameState->PlayerY);
    
if (Input->MouseButtons[0].EndedDown) { RenderPlayer(Buffer, 10, 10); }
RenderPlayer(Buffer, Input->MouseX, Input->MouseY);
 Listing 15: [handmade.cpp > GameUpdateAndRender] Visualizing left click.

This means that a white rectangle will be visible at (10, 10) if the left mouse button is pressed. Nice and easy.

Let's compile and test it... Unfortunately, we crash immediately. We triggered an assertion that we put in place some time ago; it requires the new button state to differ from the previous one. The assertion fires legitimately: we're polling the mouse state every frame rather than being notified of changes via a system message, so most of the time the state is unchanged. That's actually fine. Let's make this system more robust and change the Assert to an if statement. If the requested key state has changed, record the new state.

if(NewState->EndedDown != IsDown)
{
    NewState->EndedDown = IsDown;
    ++NewState->HalfTransitionCount;
}
 Listing 16: [win32_handmade.cpp > Win32ProcessKeyboardMessage] Modifying key capturing routine.

This will fix the error, and now we're running correctly.
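To see how the if-based version behaves under polling, here is a self-contained sketch. It uses a simplified game_button_state with a bool instead of the project's b32, and ProcessButton is a stand-in name for Win32ProcessKeyboardMessage.

```cpp
#include <cassert>

// Simplified copy of the structure; the real one lives in handmade.h.
struct game_button_state
{
    int HalfTransitionCount;
    bool EndedDown;
};

// Safe to call every frame: a transition is recorded only when the
// polled state differs from the stored one.
static void ProcessButton(game_button_state *NewState, bool IsDown)
{
    assert(NewState);
    if (NewState->EndedDown != IsDown)
    {
        NewState->EndedDown = IsDown;
        ++NewState->HalfTransitionCount;
    }
}
```

Polling a held button for several frames in a row counts one half transition, not one per frame.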

 Figure 2: While you're holding the left mouse button, you'll see a new rectangle in the top-left corner.

Now, if we want to test all buttons, we could write more RenderPlayer calls. But we've written it once, no need to write it again. Let's use a for loop. We'll offset the horizontal position for each mouse button to test multiple buttons at once.

RenderPlayer(Buffer, GameState->PlayerX, GameState->PlayerY);
for(int ButtonIndex = 0; ButtonIndex < ArrayCount(Input->MouseButtons); ++ButtonIndex)
{
    if(Input->MouseButtons[ButtonIndex].EndedDown)
    {
        RenderPlayer(Buffer, 10 + 20 * ButtonIndex, 10);
    }
}
 Listing 17: [handmade.cpp > GameUpdateAndRender] Testing all the mouse buttons.

Compile, run... Looks good! The game now responds correctly to both the cursor position and the state of the mouse buttons.

That's it for mouse code.

   

Revisit Recording Code

Last time, we did some additional work on our “live replay” code. However, we're still far from ideal here. Let's quickly review the API of our recording code:

internal void
Win32BeginRecordingInput(...)
{
    // Open the file to write to.
    // As the first thing, dump the whole memory block we're currently using (as a starting snapshot).
}

internal void
Win32EndRecordingInput(...)
{
    // Close whatever file we were writing to.
}

internal void 
Win32RecordInput(...)
{
    // Write the user's input for this frame to the recording file.
    // This should happen while the file is open.
}

internal void
Win32BeginInputPlayback(...)
{
    // Open the file to read from.
    // As the first thing, load the memory snapshot as our current state.
}

internal void
Win32EndInputPlayback(...)
{
    // Close the file we were reading from.
}

internal void 
Win32PlaybackInput(...)
{
    // As long as there is some data inside the file, interpret it as subsequent frames of user input.
    // Read one at a time, restart from the beginning once we've reached the end.
}

Today, we'll try to achieve two goals:

   

Keep Game Memory in Memory

In terms of supporting structures... we don't have any. We simply store the recording/playback handles and recording/playback indices inside win32_state. Let's introduce the concept of a win32_replay_buffer. It will store the name of the file and a pointer to the memory block that holds the data for that file. Let's say we'll have four of these:

#define WIN32_STATE_FILENAME_COUNT MAX_PATH
struct win32_replay_buffer
{
    char ReplayFilename[WIN32_STATE_FILENAME_COUNT];
    void *MemoryBlock;
};

struct win32_state
{
    u64 TotalSize;
    void *GameMemoryBlock;
    win32_replay_buffer ReplayBuffers[4];

    HANDLE RecordingHandle;
    int InputRecordingIndex;

    HANDLE PlaybackHandle;
    int InputPlayingIndex;

    char EXEFilename[WIN32_STATE_FILENAME_COUNT];
    char *OnePastLastEXEFilenameSlash;
};
 Listing 18: [win32_handmade.h] Introducing win32_replay_buffer.

So far, nothing changes in behavior: we compile and run normally. What we'll try now is to VirtualAlloc, for each replay buffer, a block of memory the same size as our game memory.

Win32State.TotalSize = (GameMemory.PermanentStorageSize + GameMemory.TransientStorageSize);
Win32State.GameMemoryBlock = VirtualAlloc(BaseAddress, (size_t)Win32State.TotalSize,
                                            MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

GameMemory.PermanentStorage = Win32State.GameMemoryBlock;
GameMemory.TransientStorage = ((u8 *)GameMemory.PermanentStorage +
                                GameMemory.PermanentStorageSize);
for(int ReplayIndex = 0; ReplayIndex < ArrayCount(Win32State.ReplayBuffers); ++ReplayIndex)
{
    win32_replay_buffer *ReplayBuffer = &Win32State.ReplayBuffers[ReplayIndex];
    ReplayBuffer->MemoryBlock = VirtualAlloc(0, (size_t)Win32State.TotalSize,
                                             MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (ReplayBuffer->MemoryBlock)
    {
        // All good
    }
    else
    {
        // TODO(casey): Diagnostic
    }
}
 Listing 19: [win32_handmade.cpp > WinMain] Allocating memory for the replay buffers.

Then, instead of writing (and subsequently reading) our game state to a file, we'll copy it to this memory block instead. We'll then write our input as before, after offsetting the file pointer as if we've written the memory to disk.

We didn't really go into the specifics of how file operations in Windows work.

Windows uses a “File Pointer” concept, which is similar to how you'd have a cursor in a notepad. When you open a file handle for, let's say, writing, this pointer is automatically positioned at the beginning of the file. Each time you call WriteFile, the data is written at its position. Internally, the pointer then advances. The next time you call WriteFile, the data will be added immediately after (and not overwrite what you wrote last time). Of course, you can manipulate the pointer's position (using the SetFilePointerEx function), and that's what we're going to do.
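The same idea can be tried out with the C standard library, whose file position indicator behaves much like the Windows file pointer described above. This is only an analogy: the sketch uses fwrite, ftell, and fseek rather than WriteFile and SetFilePointerEx, and FilePointerDemo is a made-up name.

```cpp
#include <cassert>
#include <cstdio>

// Writes two chunks, then seeks back and overwrites the first byte.
// Positions receives the file position after each write; FinalContents
// receives the resulting six bytes plus a terminator.
static bool FilePointerDemo(long Positions[3], char FinalContents[7])
{
    std::FILE *File = std::tmpfile();
    if (!File)
    {
        return false;
    }

    std::fwrite("abc", 1, 3, File);
    Positions[0] = std::ftell(File);   // the pointer advanced past "abc"

    std::fwrite("def", 1, 3, File);    // lands right after, nothing is overwritten
    Positions[1] = std::ftell(File);

    std::fseek(File, 0, SEEK_SET);     // move the pointer manually
    std::fwrite("X", 1, 1, File);      // overwrites 'a'
    Positions[2] = std::ftell(File);

    std::fseek(File, 0, SEEK_SET);
    std::size_t Read = std::fread(FinalContents, 1, 6, File);
    FinalContents[Read] = 0;

    std::fclose(File);
    return Read == 6;
}
```

The recorded positions are 3, 6, and 1, and the file ends up containing "Xbcdef".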

Thus, we will rewrite Win32BeginRecordingInput in the following way:

// Removed: the game memory is no longer written to the file here.
// DWORD BytesToWrite = (DWORD)State->TotalSize;
// Assert(State->TotalSize == BytesToWrite);
// DWORD BytesWritten;
// WriteFile(State->RecordingHandle, State->GameMemoryBlock, BytesToWrite, &BytesWritten, 0);

LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
 Listing 20: [win32_handmade.cpp > Win32BeginRecordingInput] Manually offsetting game state.

Instead of writing the game state to the file, we'll CopyMemory it to our ReplayBuffer:

LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
CopyMemory(RecordBlock, State->GameMemoryBlock, State->TotalSize);
 Listing 21: [win32_handmade.cpp > Win32BeginRecordingInput] Copying the game state to the dedicated buffer.

This, of course, won't compile since we don't have a RecordBlock. We'll fetch it based on the index we get. Since we'll be repeating this operation in Win32BeginInputPlayback, let's add a utility function to do so.

internal win32_replay_buffer *
Win32GetReplayBuffer(win32_state *State, int unsigned Index)
{
    Assert(Index < ArrayCount(State->ReplayBuffers));
    win32_replay_buffer *Result = &State->ReplayBuffers[Index];
    return (Result);
}

internal void
Win32BeginRecordingInput(win32_state *State, int InputRecordingIndex)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputRecordingIndex = InputRecordingIndex;

        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);

        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);

        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
    }
}
 Listing 22: [win32_handmade.cpp] Loading the replay buffer based on given index.

Speaking of Win32BeginInputPlayback, let's rewrite it to use the same workflow. We'll load the game state from the replay buffer instead of the file, move the cursor and start reading.

internal void
Win32BeginInputPlayback(win32_state *State, int InputPlayingIndex)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputPlayingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputPlayingIndex = InputPlayingIndex;

        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputPlayingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        State->PlaybackHandle = CreateFileA(Filename, GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0);

        // Removed: the game memory is no longer read back from the file here.
        // DWORD BytesToRead = (DWORD)State->TotalSize;
        // Assert(State->TotalSize == BytesToRead);
        // DWORD BytesRead;
        // ReadFile(State->PlaybackHandle, State->GameMemoryBlock, BytesToRead, &BytesRead, 0);

        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);

        CopyMemory(State->GameMemoryBlock, ReplayBuffer->MemoryBlock, State->TotalSize);
    }
}
 Listing 23: [win32_handmade.cpp] Mirroring changes for playback.
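Stripped of the file handling, the snapshot-and-restore idea fits in a few lines. The sketch below is a portable model, not the Win32 code: state_sketch, replay_buffer, SnapshotGameMemory, and RestoreGameMemory are hypothetical names, and memcpy stands in for CopyMemory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct replay_buffer
{
    void *MemoryBlock;
};

struct state_sketch
{
    std::size_t TotalSize;
    void *GameMemoryBlock;
    replay_buffer ReplayBuffers[4];
};

static replay_buffer *GetReplayBuffer(state_sketch *State, unsigned int Index)
{
    assert(Index < sizeof(State->ReplayBuffers) / sizeof(State->ReplayBuffers[0]));
    return &State->ReplayBuffers[Index];
}

// Start of recording: copy the live game memory into the chosen slot.
static void SnapshotGameMemory(state_sketch *State, unsigned int Index)
{
    replay_buffer *Buffer = GetReplayBuffer(State, Index);
    if (Buffer->MemoryBlock)
    {
        std::memcpy(Buffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
    }
}

// Start of playback: copy the slot back over the live game memory.
static void RestoreGameMemory(state_sketch *State, unsigned int Index)
{
    replay_buffer *Buffer = GetReplayBuffer(State, Index);
    if (Buffer->MemoryBlock)
    {
        std::memcpy(State->GameMemoryBlock, Buffer->MemoryBlock, State->TotalSize);
    }
}
```

Anything the game changes after the snapshot is undone by the restore, which is what makes the loop replay from the same starting state every time.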

Ok, let's compile and try it out... It might actually take even longer than before.

   

Poor Man's Profiling

What exactly is happening? Something is slowing us down, but what?

To answer this question, we'd need a profiler, a tool that lets us analyze how long each call takes to execute. We could use the one available inside Visual Studio, but a more straightforward way is to simply pepper our code with OutputDebugStringA calls. Whichever message takes the longest to show up in the Output window points to the culprit.

The slowdown evidently happens inside Win32BeginRecordingInput, so let's “profile” it:

OutputDebugStringA("SPAM 0\n");
State->InputRecordingIndex = InputRecordingIndex;
char Filename[WIN32_STATE_FILENAME_COUNT];
Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
OutputDebugStringA("SPAM 1\n");
State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
OutputDebugStringA("SPAM 2\n");
LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
OutputDebugStringA("SPAM 3\n");
CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
OutputDebugStringA("SPAM 4\n");
 Listing 24: [win32_handmade.cpp > Win32BeginRecordingInput] Spamming the code.
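If eyeballing the Output window isn't precise enough, the same "poor man's" approach can print actual durations. The sketch below is a portable variant using std::chrono as a stand-in for the Win32 timing calls. TimeStepMs is a hypothetical helper, and printf takes the place of OutputDebugStringA.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Runs Step, prints how long it took, and returns the duration in milliseconds.
template <typename step>
static double TimeStepMs(const char *Label, step Step)
{
    assert(Label);
    auto Begin = std::chrono::steady_clock::now();
    Step();
    auto End = std::chrono::steady_clock::now();

    double Ms = std::chrono::duration<double, std::milli>(End - Begin).count();
    std::printf("%s: %.3f ms\n", Label, Ms);
    return Ms;
}
```

Each step of the function under investigation would be wrapped in its own call, for example TimeStepMs("copy", [&] { /* CopyMemory call goes here */ });, and the step with the largest number is the one to look at.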

Let's compile and run this. If you pay close attention to the Output window when you press L, you'll see that SPAM 0-3 appear almost instantly, while SPAM 4 takes a short while longer but still nowhere near the overall delay. You will also notice that our debug timing string takes time to appear. This means that the lag spike happens somewhere else, but before the end of the frame. Let's step through the code and try to find the culprit this way.

  1. Set the breakpoint at the start of Win32BeginRecordingInput and run the program.
  2. When you hit L, the program will stop at the breakpoint.
  3. Continue going with F10, and you'll see the SPAM strings appear.
  4. Step out of Win32BeginRecordingInput and out of Win32ProcessInput.
  5. Skip all the parts where we read input from the controllers.
  6. Continue until you reach the part where we write our first input snapshot to the file... what? It takes seconds!

What is going on here?

The reason for the delay is that moving the file pointer doesn't, by itself, write anything to the file. Windows only allocates the skipped bytes when the first write happens. Therefore, if we skip past a large number of bytes, that first write has to put a lot of zeros on disk.
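The effect of writing after a skipped region can be observed with the C standard library as well. This is an analogy using fseek and fwrite instead of the Win32 calls, with a made-up function name; how much physical disk work the gap causes depends on the operating system and file system, so the sketch only checks the logical result.

```cpp
#include <cassert>
#include <cstdio>

// Skips Offset bytes in a fresh temporary file, writes a single byte, and
// returns the resulting file size (or -1 on failure). GapIsZero reports
// whether the skipped region reads back as zeros.
static long SizeAfterSkippingAndWriting(long Offset, bool *GapIsZero)
{
    std::FILE *File = std::tmpfile();
    if (!File)
    {
        return -1;
    }

    std::fseek(File, Offset, SEEK_SET);   // moving the pointer writes nothing yet
    std::fwrite("!", 1, 1, File);         // the first write makes the gap part of the file

    std::fseek(File, 0, SEEK_END);
    long Size = std::ftell(File);

    *GapIsZero = true;
    std::fseek(File, 0, SEEK_SET);
    for (long Index = 0; Index < Offset; ++Index)
    {
        if (std::fgetc(File) != 0)
        {
            *GapIsZero = false;
            break;
        }
    }

    std::fclose(File);
    return Size;
}
```

Skipping 4096 bytes and writing one byte produces a 4097-byte file whose first 4096 bytes are zeros. Scale the gap up to the size of the whole game memory and the cost of that first write becomes noticeable.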

We can verify this if we simply #if 0 the parts where we set pointer position.

internal void
Win32BeginRecordingInput(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    
    if (ReplayBuffer->MemoryBlock)
    {
        OutputDebugStringA("SPAM 0\n");
        State->InputRecordingIndex = InputRecordingIndex;
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        OutputDebugStringA("SPAM 1\n");
        State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
        OutputDebugStringA("SPAM 2\n");
#if 0
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
#endif
        OutputDebugStringA("SPAM 3\n");
        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
        OutputDebugStringA("SPAM 4\n");
    }
}

// ...

internal void
Win32BeginInputPlayback(...)
{
    // ...
#if 0
    LARGE_INTEGER FilePosition;
    FilePosition.QuadPart = State->TotalSize;
    SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);
#endif
    // ...
}
 Listing 25: [win32_handmade.cpp] Removing the pointer offset.

Indeed. If we recompile and run, we'll see that the lag is almost gone, if it's noticeable at all.

   

Map the Replay File to Memory

Let's try something different and see what happens. Instead of opening the file and eventually closing it, we'll simply map it into memory. The MapViewOfFile function tells Windows to associate a region of memory with the file. Every time you touch (update) the memory, the file will be updated accordingly. A mapping between the two, if you will.

We can call MapViewOfFile at the point where we allocate the memory for our replay buffers. In fact, we will call it instead of VirtualAlloc, and MemoryBlock will become the mapped view of the file. Sort of; let's see how it works.

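MapViewOfFile takes the following arguments:

  1. The handle to a file mapping object (our MemoryMap, which we don't have yet).
  2. The desired access; FILE_MAP_ALL_ACCESS allows both reading and writing.
  3. The high 32 bits of the file offset at which the view starts.
  4. The low 32 bits of that offset.
  5. The number of bytes to map.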

win32_replay_buffer *ReplayBuffer = &Win32State.ReplayBuffers[ReplayIndex];
// Removed: ReplayBuffer->MemoryBlock = VirtualAlloc(0, (size_t)Win32State.TotalSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
ReplayBuffer->MemoryMap = ;
ReplayBuffer->MemoryBlock = MapViewOfFile(ReplayBuffer->MemoryMap, FILE_MAP_ALL_ACCESS, 0, 0, Win32State.TotalSize);
if (ReplayBuffer->MemoryBlock)
{
    // All good
}
else
{
    // TODO(casey): Diagnostic
}
 Listing 26: [win32_handmade.cpp > WinMain] Mapping the input files to memory.

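To get the MemoryMap handle, we can call the CreateFileMapping function. This function requires the following:

  1. The handle of the file to map.
  2. Optional security attributes (0 for the defaults).
  3. The page protection; we want PAGE_READWRITE.
  4. The high 32 bits of the maximum size of the mapping.
  5. The low 32 bits of the maximum size.
  6. An optional name for the mapping object (0 for an unnamed one).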

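Concerning the MaximumSize, Windows wants us to pass the high and low 32-bit parts of the 64-bit value separately. We can do it in two ways:

  1. Shift and mask the 64-bit value by hand to extract its upper and lower halves.
  2. Store the value in a LARGE_INTEGER and read its HighPart and LowPart members.

You can use whichever option you prefer. Below, we will showcase the second way, as it's less error-prone.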

LARGE_INTEGER MaxSize;
MaxSize.QuadPart = Win32State.TotalSize;
ReplayBuffer->MemoryMap = CreateFileMapping(ReplayBuffer->FileHandle, 0, PAGE_READWRITE, MaxSize.HighPart, MaxSize.LowPart, 0);
ReplayBuffer->MemoryBlock = MapViewOfFile(ReplayBuffer->MemoryMap, FILE_MAP_ALL_ACCESS, 0, 0, Win32State.TotalSize);
 Listing 27: [win32_handmade.cpp > WinMain] Getting map handle.
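For comparison, the manual alternative to LARGE_INTEGER presumably looks like the shift-and-mask sketch below (an assumption, since the listing above only shows the LARGE_INTEGER route). The split_u64 struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct split_u64
{
    std::uint32_t High;
    std::uint32_t Low;
};

// Splits a 64-bit size into the two 32-bit halves an API might ask for.
static split_u64 SplitByShifting(std::uint64_t Value)
{
    split_u64 Result;
    Result.High = (std::uint32_t)(Value >> 32);
    Result.Low = (std::uint32_t)(Value & 0xFFFFFFFF);
    return Result;
}

// Reassembles the halves; handy for checking that nothing was lost.
static std::uint64_t JoinParts(split_u64 Parts)
{
    return ((std::uint64_t)Parts.High << 32) | Parts.Low;
}
```

For a size of 5 GiB (0x140000000), the high part is 1 and the low part is 0x40000000. It's easy to get the shift width or the mask wrong when doing this by hand, which is the usual argument for letting LARGE_INTEGER do the work.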

Of course, we should also expand the win32_replay_buffer structure to include the two handles.

struct win32_replay_buffer
{
    HANDLE FileHandle;
    HANDLE MemoryMap;
    char Filename[WIN32_STATE_FILENAME_COUNT];
    void *MemoryBlock;
};
 Listing 28: [win32_handmade.h] Expanding win32_replay_buffer.

Let's set this aside for a second and think about the file names. At this point, we can expand our file names and have them follow the ReplayIndex value, 0 through 3. Luckily for us, we already extracted Win32GetInputFileLocation, and it already receives the SlotIndex, so we only need to modify the logic there.

internal void
Win32GetInputFileLocation(win32_state *State, int SlotIndex, int DestCount, char *Dest)
{
    // Assert(SlotIndex == 1);   (removed: every slot index is valid now)
    char Name[64];
    wsprintf(Name, "loop_edit_%d.hmi", SlotIndex);
Win32BuildEXEPathFilename(State, Name, DestCount, Dest);
}
 Listing 29: [win32_handmade.cpp] Building replay file name based on its index.
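The name-building step can be tested on its own. The sketch below uses the standard snprintf as a portable stand-in for wsprintf; BuildReplayName is a hypothetical helper that only produces the bare file name and doesn't prepend the EXE path the way Win32BuildEXEPathFilename does.

```cpp
#include <cassert>
#include <cstdio>

// Builds "loop_edit_<SlotIndex>.hmi" into Dest.
// Returns true if the whole name (plus terminator) fit into DestCount bytes.
static bool BuildReplayName(int SlotIndex, std::size_t DestCount, char *Dest)
{
    assert(Dest && DestCount > 0);
    int Written = std::snprintf(Dest, DestCount, "loop_edit_%d.hmi", SlotIndex);
    return (Written >= 0) && ((std::size_t)Written < DestCount);
}
```

Slot 2 yields "loop_edit_2.hmi". Unlike the fixed 64-byte buffer in the listing, this version reports when the destination is too small instead of assuming the name fits.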

At this point, we're opening the file handle only when we start recording. Let's change that: we will open all the file handles directly on startup, with read/write access, and we'll all be happy. We also want to restore the SetFilePointerEx calls, even if they slow us down.

internal void
Win32BeginRecordingInput(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputRecordingIndex = InputRecordingIndex;
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        State->RecordingHandle = ReplayBuffer->FileHandle;

        // The #if 0 guard is gone: SetFilePointerEx is active again.
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);

        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
    }
}

// ...

internal void
Win32BeginInputPlayback(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputPlayingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputPlayingIndex = InputPlayingIndex;
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputPlayingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        State->PlaybackHandle = ReplayBuffer->FileHandle;

        // The #if 0 guard is gone here as well.
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);

        CopyMemory(State->GameMemoryBlock, ReplayBuffer->MemoryBlock, State->TotalSize);
    }
}

int CALLBACK
WinMain(...)
{
    // ...
    Win32GetInputFileLocation(&Win32State, ReplayIndex, sizeof(ReplayBuffer->Filename), ReplayBuffer->Filename);
    ReplayBuffer->FileHandle = CreateFileA(ReplayBuffer->Filename, GENERIC_READ | GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);

    LARGE_INTEGER MaxSize;
    MaxSize.QuadPart = Win32State.TotalSize;
    ReplayBuffer->MemoryMap = CreateFileMapping(ReplayBuffer->FileHandle, 0, PAGE_READWRITE, MaxSize.HighPart, MaxSize.LowPart, 0);
    ReplayBuffer->MemoryBlock = MapViewOfFile(ReplayBuffer->MemoryMap, FILE_MAP_ALL_ACCESS, 0, 0, Win32State.TotalSize);
    // ...
}
 Listing 30: [win32_handmade.cpp] Consolidating file handle opening.

At this point, you can compile and verify that each ReplayBuffer ends up with a valid FileHandle (anything other than INVALID_HANDLE_VALUE, which is what CreateFileA returns on failure) and with a non-null MemoryMap and MemoryBlock. You will also notice that four new files, loop_edit_0.hmi through loop_edit_3.hmi, have been created inside the build directory. However, we still face a big delay when we start recording, and our playback seems to be broken. Let's fix all of this.

   

Split the File Output

The biggest issue we're facing right now is that we're writing the entire memory state to the file when input recording starts. Maybe if we split the two outputs, game state and input records, we can reduce the lag? Let's try it.

Win32GetInputFileLocation will take one more argument, which determines whether the file holds the input stream or the game state. ReplayBuffer will store the handle to the state file, while RecordingHandle/PlaybackHandle will store the handle to the input stream. This means that we need to roll back some of the changes we just made, namely comment out the SetFilePointerEx calls once again.

internal void
Win32GetInputFileLocation(win32_state *State, b32 InputStream,
                          int SlotIndex, int DestCount, char *Dest)
{
    char Name[64];
    wsprintf(Name, "loop_edit_%d_%s.hmi", SlotIndex,
             InputStream ? "input" : "state");
    Win32BuildEXEPathFilename(State, Name, DestCount, Dest);
}

// ...

internal void
Win32BeginRecordingInput(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputRecordingIndex = InputRecordingIndex;

        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, true, InputRecordingIndex, sizeof(Filename), Filename);
        State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);

#if 0
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
#endif

        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
    }
}

// ...

internal void
Win32BeginInputPlayback(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputPlayingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputPlayingIndex = InputPlayingIndex;

        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, true, InputPlayingIndex, sizeof(Filename), Filename);
        State->PlaybackHandle = CreateFileA(Filename, GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0);

#if 0
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);
#endif

        CopyMemory(State->GameMemoryBlock, ReplayBuffer->MemoryBlock, State->TotalSize);
    }
}

// ...

int CALLBACK
WinMain(...)
{
    // ...
    Win32GetInputFileLocation(&Win32State, false, ReplayIndex,
                              sizeof(ReplayBuffer->Filename), ReplayBuffer->Filename);
    ReplayBuffer->FileHandle = CreateFileA(ReplayBuffer->Filename,
                                           GENERIC_READ | GENERIC_WRITE, 0, 0,
                                           CREATE_ALWAYS, 0, 0);
    // ...
}
 Listing 31: [win32_handmade.cpp] Separating game state and input streams.
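
The split between the state snapshot and the input stream can be pictured without any Win32 calls. The following is a minimal sketch under stated assumptions: replay_slot, toy_input and the three functions are hypothetical names, std::vector stands in for both the memory-mapped state file and the input file, and memcpy plays the role of CopyMemory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one frame of game input.
struct toy_input
{
    int ButtonDown;
};

// Hypothetical replay slot: the state snapshot and the input stream live in
// separate containers, mirroring the separate "state" and "input" files.
struct replay_slot
{
    std::vector<std::uint8_t> StateSnapshot; // stands in for the mapped state file
    std::vector<toy_input> InputStream;      // stands in for the input file
};

// Start recording: snapshot the whole game memory once, reset the input stream.
static void
BeginRecording(replay_slot *Slot, const void *GameMemory, std::size_t TotalSize)
{
    Slot->StateSnapshot.resize(TotalSize);
    if (TotalSize)
    {
        std::memcpy(Slot->StateSnapshot.data(), GameMemory, TotalSize);
    }
    Slot->InputStream.clear();
}

// While recording: only the small per-frame input is appended.
static void
RecordInput(replay_slot *Slot, toy_input Input)
{
    Slot->InputStream.push_back(Input);
}

// Start playback: restore game memory from the snapshot. The caller then
// feeds InputStream back to the game, one entry per frame.
static void
BeginPlayback(const replay_slot *Slot, void *GameMemory)
{
    if (!Slot->StateSnapshot.empty())
    {
        std::memcpy(GameMemory, Slot->StateSnapshot.data(), Slot->StateSnapshot.size());
    }
}
```

The large copy happens once per recording, while each frame only adds a small record to a separate stream. That is the same division of labour as between ReplayBuffer->MemoryBlock and RecordingHandle/PlaybackHandle.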

OK, that was a lot of back and forth, but if you compile and run, you'll notice that the lag is much, much smaller! It's a bit odd that setting the file pointer causes such a lag, so that might be something to investigate one day. Let's leave a note for posterity:

win32_replay_buffer *ReplayBuffer = &Win32State.ReplayBuffers[ReplayIndex];
                
// TODO(casey): Recording system still seems to take too long
// on record start - find out what Windows is doing and if
// we can speed up / defer some of that processing.
Win32GetInputFileLocation(&Win32State, false, ReplayIndex,
                          sizeof(ReplayBuffer->Filename), ReplayBuffer->Filename);
ReplayBuffer->FileHandle = CreateFileA(ReplayBuffer->Filename,
                                       GENERIC_READ | GENERIC_WRITE, 0, 0,
                                       CREATE_ALWAYS, 0, 0);
 Listing 32: [win32_handmade.cpp > WinMain] Adding a note for the future.
   

Interrupt Playback

OK, that's one issue (somewhat) out of the way. Let's look at the other: how do we interrupt playback?

Well, this is actually very simple. Whenever the user presses L, we check whether InputPlayingIndex is zero. If it isn't, playback is in progress, so we stop it. If it is, we fall back to the recording logic we already had: start recording if nothing is being recorded, otherwise end the recording and begin playback.

else if (VKCode == 'L')
{
    if (IsDown)
    {
        if (State->InputPlayingIndex == 0)
        {
            if (State->InputRecordingIndex == 0)
            {
                Win32BeginRecordingInput(State, 1);
            }
            else
            {
                Win32EndRecordingInput(State);
                Win32BeginInputPlayback(State, 1);
            }
        }
        else
        {
            Win32EndInputPlayback(State);
        }
    }
}
 Listing 33: [win32_handmade.cpp > Win32ProcessPendingMessages] Terminating playback.
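
Seen from a distance, pressing L now walks through a three-state cycle: idle, recording, playback, and back to idle. Here is a minimal sketch of that cycle with the Win32 calls stripped out. loop_state and OnLPressed are hypothetical names, and the way the indices are set and cleared is an assumption about what the Begin/End functions do to them.

```cpp
// Hypothetical, stripped-down mirror of the two indices kept in win32_state.
struct loop_state
{
    int InputRecordingIndex; // non-zero while recording
    int InputPlayingIndex;   // non-zero while playing back
};

// One press of L. The real code calls the Win32Begin*/Win32End* functions;
// this sketch only updates the indices those functions are assumed to
// set or clear.
static void
OnLPressed(loop_state *State)
{
    if (State->InputPlayingIndex == 0)
    {
        if (State->InputRecordingIndex == 0)
        {
            State->InputRecordingIndex = 1; // idle -> recording
        }
        else
        {
            State->InputRecordingIndex = 0; // recording -> playback
            State->InputPlayingIndex = 1;
        }
    }
    else
    {
        State->InputPlayingIndex = 0;       // playback -> idle
    }
}
```

Starting from a zeroed loop_state, three consecutive presses return to the initial state. This is the behaviour we want from the L key.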

This should do the trick. It's certainly not shippable quality, but it will suffice for our debugging purposes.

   

Clean Up Debug Code

We're at the end of the last day of the initial platform prototype layer. Next time, we will start diving into the architecture of the game proper, so we won't be spending so much time in Windows. Let's quickly make a few changes to make sure the platform layer doesn't get in our way.

   

Comment Out DebugSyncDisplay

Win32DebugSyncDisplay served us well when we were debugging audio. For now, audio seems to be in a good place, so let's comment the function out and delete the call.

#if 0
internal void
Win32DebugSyncDisplay(...)
{
    // ...
}
#endif

// ...

int CALLBACK
WinMain(...)
{
    // ...
    // The HANDMADE_INTERNAL block that called Win32DebugSyncDisplay here is gone.
    HDC DeviceContext = GetDC(Window);
    Win32DisplayBufferInWindow(&GlobalBackbuffer, DeviceContext,
                               Dimension.Width, Dimension.Height);
    ReleaseDC(Window, DeviceContext);
    // ...
}
 Listing 34: [win32_handmade.cpp] Removing audio debug lines.
   

Remove Framerate Printout

Another thing we can get rid of is the spam in the Output window. We had two printouts: audio data and the framerate.

#if HANDMADE_INTERNAL
win32_debug_time_marker *Marker = &DebugTimeMarkers[DebugTimeMarkerIndex];
Marker->OutputPlayCursor = PlayCursor;
Marker->OutputWriteCursor = WriteCursor;
Marker->OutputLocation = ByteToLock;
Marker->OutputByteCount = BytesToWrite;
Marker->ExpectedFlipPlayCursor = ExpectedFrameBoundaryByte;

#if 0
DWORD UnwrappedWriteCursor = WriteCursor;
if (UnwrappedWriteCursor < PlayCursor)
{
    UnwrappedWriteCursor += SoundOutput.SecondaryBufferSize;
}
DWORD AudioLatencyBytes = UnwrappedWriteCursor - PlayCursor;
f32 AudioLatencySeconds = ((f32)AudioLatencyBytes / (f32)SoundOutput.BytesPerSample) /
                          (f32)SoundOutput.SamplesPerSecond;

char TextBuffer[256];
sprintf_s(TextBuffer, sizeof(TextBuffer),
          "BTL:%u TC:%u BTW:%u - PC:%u WC:%u DELTA:%u (%.3fs)\n",
          ByteToLock, TargetCursor, BytesToWrite,
          PlayCursor, WriteCursor, AudioLatencyBytes, AudioLatencySeconds);
OutputDebugStringA(TextBuffer);
#endif
#endif

// ...

#if 0
// debug timing output
f32 FPS = 0.0f;
f32 MegaCyclesPerFrame = (f32)CyclesElapsed / (1000.0f * 1000.0f);

char FPSBuffer[256];
sprintf_s(FPSBuffer, sizeof(FPSBuffer),
          "%.02fms/f, %.02ff/s, %.02fMc/f\n", MSPerFrame, FPS, MegaCyclesPerFrame);
OutputDebugStringA(FPSBuffer);
#endif
 Listing 35: [win32_handmade.cpp] Removing console output logs.

Now we no longer print something every frame, so that's good. In the future, we'll have a proper debug logging system that we can use to inspect this data.
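
For reference, the audio printout we just disabled reported how far the write cursor runs ahead of the play cursor in the circular sound buffer. The following sketch isolates that arithmetic. AudioLatencySeconds is written here as a hypothetical free function taking plain values rather than the fields of SoundOutput, and the figures in the note after it are assumed purely for illustration.

```cpp
#include <cstdint>

// Hypothetical free function wrapping the arithmetic from the disabled block.
static float
AudioLatencySeconds(std::uint32_t PlayCursor, std::uint32_t WriteCursor,
                    std::uint32_t BufferSize, std::uint32_t BytesPerSample,
                    std::uint32_t SamplesPerSecond)
{
    // The sound buffer is circular: if the write cursor has wrapped past the
    // end, push it one buffer length ahead so the subtraction stays positive.
    std::uint32_t UnwrappedWriteCursor = WriteCursor;
    if (UnwrappedWriteCursor < PlayCursor)
    {
        UnwrappedWriteCursor += BufferSize;
    }
    std::uint32_t LatencyBytes = UnwrappedWriteCursor - PlayCursor;

    // Bytes -> samples -> seconds.
    return ((float)LatencyBytes / (float)BytesPerSample) / (float)SamplesPerSecond;
}
```

With a 192000-byte buffer, 4 bytes per sample and 48000 samples per second, a play cursor at 190000 and a write cursor at 2000 are 4000 bytes apart after unwrapping. That is 1000 samples, or roughly 0.021 seconds of latency.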

Everything else seems relatively good.

   

Recap

This marks the end of day 25 and, with it, the initial work on the Win32 platform layer. There's still more that we need to learn, so we might revisit our existing systems in the future.

But for now, this is all behind us. Next time, we'll start playing around with the game's architecture.

   

Navigation

Previous: Day 24. Win32 Platform Layer Cleanup

Up Next: Day 26. Introduction to Game Architecture

Back to Index

Glossary


MSDN

Virtual-Key Codes

Win32 API

CopyMemory

CreateFileMapping

GetCursorPos

GetDeviceCaps

GetKeyState

MapViewOfFile

ScreenToClient

SetFilePointerEx
