Welcome to “Handmade Hero Notes”, the book where we follow the footsteps of Handmade Hero in making the complete game from scratch, with no external libraries. If you'd like to follow along, preorder the game on handmadehero.org, and you will receive access to the GitHub repository, containing complete source code (tagged day-by-day) as well as a variety of other useful resources.
As you have seen, building a Win32 prototyping layer is not that hard. Of course, building a solid shippable platform layer takes more time, so there will be more Win32 work down the line. Even then, it won't be a tremendous amount of work.
Today marks the last day of Win32-specific coding. Moving forward, we will no longer be thinking about Win32-specific code. We will be revisiting it, of course, but only as a part of our cross-platform work. For instance, we still don't have a logging service. We will implement it in the Win32 layer, as we did with the other services, and then provide it to the game.
1 Get Actual Monitor Refresh Rate
2 Prepare for Future Multi-Threading
3 Pass Mouse Input to the Game
4 Revisit Recording Code
4.1 Keep Game Memory in Memory
4.2 Poor Man's Profiling
4.3 Map the Replay File to Memory
4.4 Split the File Output
4.5 Interrupt Playback
5 Clean Up Debug Code
5.1 Comment Out DebugSyncDisplay
5.2 Remove Framerate Printout
6 Recap
7 Navigation
Currently, we're hard-coding MonitorRefreshHz to 60. We need to know what the actual refresh rate is. We can do that by using the GetDeviceCaps function.
We looked into this function in the past, and we didn't use it because the VREFRESH value can be 0 or 1, which is not what we want. That said, most of the time, the function will return a value we can use. We will still initialize MonitorRefreshHz to 60 as a fallback.
// TODO(casey): How do we reliably query on monitor refresh rate on Windows?
// #define MonitorRefreshHz 60 (removed, replaced by the variable below)
int MonitorRefreshHz = 60;
int Win32RefreshRate = GetDeviceCaps(DeviceContext, VREFRESH);
if (Win32RefreshRate > 1)
{
    MonitorRefreshHz = Win32RefreshRate;
}
#define GameUpdateHz (MonitorRefreshHz / 2)
We don't have our own device context anymore, so let's quickly get one and release it when we're done. Make sure that you already have the handle to the window (and if you don't, simply move the block below to the point where you have it).
int MonitorRefreshHz = 60;
HDC RefreshDC = GetDC(Window);
int Win32RefreshRate = GetDeviceCaps(RefreshDC, VREFRESH);
ReleaseDC(Window, RefreshDC);
if (Win32RefreshRate > 1)
{
    MonitorRefreshHz = Win32RefreshRate;
}
Speaking of GameUpdateHz, we will eventually set it to the same value as MonitorRefreshHz to run at the same speed as the monitor. For now, though, we will leave it at half of the monitor refresh rate for performance reasons. That said, this value is currently an integer, and it might become an issue if the monitor refresh rate is odd (say, 59 instead of 60), since the integer division would drop the fractional part. We can define GameUpdateHz as a float and fix any code that relies on it being an integer.
f32 GameUpdateHz = (MonitorRefreshHz / 2.0f);
f32 TargetSecondsPerFrame = 1.0f / GameUpdateHz;
// NOTE(casey): Sound test
win32_sound_output SoundOutput = {};
SoundOutput.SamplesPerSecond = 48000;
SoundOutput.BytesPerSample = sizeof(s16) * 2;
SoundOutput.SecondaryBufferSize = 2 * SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample;
SoundOutput.RunningSampleIndex = 0;
// TODO(casey): Actually compute this variance and see
// what the lowest reasonable value is.
SoundOutput.SafetyBytes = (int)(((f32)SoundOutput.SamplesPerSecond * (f32)SoundOutput.BytesPerSample / GameUpdateHz) / 3.0f);
// ...
int DebugTimeMarkerIndex = 0;
win32_debug_time_marker DebugTimeMarkers[30] = {};
// ...
DWORD ByteToLock = ((SoundOutput.RunningSampleIndex * SoundOutput.BytesPerSample)
                    % SoundOutput.SecondaryBufferSize);
DWORD ExpectedSoundBytesPerFrame = (DWORD)((f32)(SoundOutput.SamplesPerSecond * SoundOutput.BytesPerSample)
                                           / GameUpdateHz);
f32 SecondsLeftUntilFlip = TargetSecondsPerFrame - FromBeginToAudioSeconds;
DWORD ExpectedBytesUntilFlip = (DWORD)((SecondsLeftUntilFlip / TargetSecondsPerFrame) * (f32)ExpectedSoundBytesPerFrame);
The code compiles again, and we can rerun the game with no visible changes. However, if you set a breakpoint just before the Win32RefreshRate calculation, you can see that we're now using the actual monitor refresh rate.
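To make the fallback and the odd-refresh-rate issue concrete, here is a minimal, platform-independent sketch. The helper names (ChooseMonitorRefreshHz, GameUpdateHzAsInt, GameUpdateHzAsFloat) are made up for this illustration and are not part of the Handmade Hero code; the Reported parameter simply stands in for whatever GetDeviceCaps would return for VREFRESH.

```cpp
#include <cassert>

// Hypothetical helper mirroring the fallback logic from the text.
// Reported stands in for the value returned by GetDeviceCaps(..., VREFRESH).
static int ChooseMonitorRefreshHz(int Reported)
{
    int MonitorRefreshHz = 60; // fallback value
    if (Reported > 1)          // 0 and 1 are not real refresh rates
    {
        MonitorRefreshHz = Reported;
    }
    return MonitorRefreshHz;
}

// Integer division silently drops the fractional part...
static int GameUpdateHzAsInt(int MonitorRefreshHz)
{
    return MonitorRefreshHz / 2;
}

// ...while dividing by 2.0f keeps it.
static float GameUpdateHzAsFloat(int MonitorRefreshHz)
{
    return MonitorRefreshHz / 2.0f;
}
```

With a 59 Hz monitor, the integer version yields 29 while the float version yields 29.5, which is why the frame timing code is better off with a float.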
There's one thing that we aren't going to touch right now. But, while we're at these early stages, it's a good idea to start preparing. We'll introduce a structure to hold any information relating to the current thread context, i.e., what thread you're in when you're running multi-threaded. This structure will contain absolutely nothing for the time being but will become essential in the future.
inline u32
SafeTruncateUInt64(u64 Value)
{
Assert(Value <= 0xFFFFFFFF);
u32 Result = (u32)Value;
return (Result);
}
struct thread_context
{
int Placeholder;
};
We will pass a thread_context to any function speaking with the Platform layer (so all the functions currently listed in handmade.h):
#define DEBUG_PLATFORM_FREE_FILE_MEMORY(name) void name (thread_context *Thread, void *Memory)
typedef DEBUG_PLATFORM_FREE_FILE_MEMORY(debug_platform_free_file_memory);
#define DEBUG_PLATFORM_READ_ENTIRE_FILE(name) debug_read_file_result name (thread_context *Thread, char *Filename)
typedef DEBUG_PLATFORM_READ_ENTIRE_FILE(debug_platform_read_entire_file);
#define DEBUG_PLATFORM_WRITE_ENTIRE_FILE(name) b32 name (thread_context *Thread, char *Filename, u32 MemorySize, void *Memory)
typedef DEBUG_PLATFORM_WRITE_ENTIRE_FILE(debug_platform_write_entire_file);
// ...
#define GAME_UPDATE_AND_RENDER(name) void name(thread_context *Thread, game_memory *Memory, game_input *Input, game_offscreen_buffer* Buffer)
typedef GAME_UPDATE_AND_RENDER(game_update_and_render);
// NOTE(casey): At the moment, this has to be a very fast function, it cannot be
// more than a millisecond or so.
// TODO(casey): Reduce the pressure on this function's performance by measuring it
// or asking about it, etc.
#define GAME_GET_SOUND_SAMPLES(name) void name(thread_context *Thread, game_memory *Memory, game_sound_output_buffer *SoundBuffer)
typedef GAME_GET_SOUND_SAMPLES(game_get_sound_samples);
We will propagate the usage of our thread context to the few places in handmade.cpp where we currently call these functions:
debug_read_file_result FileData = Memory->DEBUGPlatformReadEntireFile(Thread, __FILE__);
if (FileData.Contents)
{
Memory->DEBUGPlatformWriteEntireFile(Thread, "test.out", FileData.ContentsSize, FileData.Contents);
Memory->DEBUGPlatformFreeFileMemory(Thread, FileData.Contents);
}
We also do it in a single place inside win32_handmade.cpp:
DWORD BytesRead;
if (ReadFile(FileHandle, Result.Contents, FileSize32, &BytesRead, 0) &&
(FileSize32 == BytesRead))
{
// NOTE(casey): File read successfully
Result.ContentsSize = BytesRead;
}
else
{
// Error: Read failed
// TODO(casey): Logging
DEBUGPlatformFreeFileMemory(Thread, Result.Contents);
Result.Contents = 0;
}
Finally, we will introduce the thread context inside WinMain. Again, it will contain absolutely nothing for the time being but will become useful in the future.
thread_context Thread = {};
game_offscreen_buffer Buffer = {};
// ...
if (Game.UpdateAndRender)
{
    Game.UpdateAndRender(&Thread, &GameMemory, NewInput, &Buffer);
}
// ...
if (Game.GetSoundSamples)
{
    Game.GetSoundSamples(&Thread, &GameMemory, &SoundBuffer);
}
We made this addition because you might want to know which thread you're in or access some data pertinent to the current one. On Windows, most of the time, you can use something called thread-local storage (TLS), which gives you global variables that are specific to the current thread. While this mechanism can occasionally give decent results, we cannot rely on it for all uses. Additionally, other platforms we might want to support might not have this feature.
So with the introduction of the thread_context structure, we're getting into a habit of having it around. That's it for now.
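The macro-then-typedef pattern used for these signatures can be sketched in isolation. Everything named "example" below is hypothetical and only exists for this illustration; the point is that adding a parameter such as thread_context *Thread to the macro updates the typedef and every function declared through it in one go.

```cpp
#include <cassert>

struct thread_context
{
    int Placeholder;
};

// Same pattern as in handmade.h, applied to a made-up service:
// the macro spells out the signature once...
#define EXAMPLE_PLATFORM_SERVICE(name) int name(thread_context *Thread, int Value)
// ...and the typedef turns it into a function type we can point to.
typedef EXAMPLE_PLATFORM_SERVICE(example_platform_service);

// A stub declared through the same macro, so its signature cannot drift.
EXAMPLE_PLATFORM_SERVICE(ExampleServiceStub)
{
    (void)Thread; // unused for now, just like in the real code
    return Value * 2;
}

// The caller owns a thread_context and hands it to whatever it calls.
static int CallThroughPointer(example_platform_service *Service, int Value)
{
    thread_context Thread = {};
    return Service(&Thread, Value);
}
```

Calling CallThroughPointer(ExampleServiceStub, 21) goes through the function pointer in the same way the game calls the platform services stored in game_memory.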
While our game will be more of a keyboard/gamepad type of game, we can definitely benefit from having mouse input data. For instance, if we develop some debug overlay systems, using the mouse might be more appropriate.
Let's expand our game_input structure to include mouse input data. We already have the game_button_state structures, which we can reuse. Let's say our mouse has 5 buttons, and we want to store the state of each. We also want to capture the mouse X, Y, and Z axis movement, the latter capturing the state of the mouse wheel.
struct game_input
{
    game_button_state MouseButtons[5];
    s32 MouseX, MouseY, MouseZ;
    game_controller_input Controllers[5];
};
Let's say we'll visualize our mouse position by reusing the handy RenderPlayer function. As the input, we'll pass the mouse position instead of the game state.
RenderWeirdGradient(Buffer, GameState->XOffset, GameState->YOffset);
RenderPlayer(Buffer, GameState->PlayerX, GameState->PlayerY);
RenderPlayer(Buffer, Input->MouseX, Input->MouseY);
Now, how would we capture the mouse data? Inside win32_handmade.cpp, we'll need to add a few more things to pass the mouse position to the game. It will look something like this:
if (!GlobalPause)
{
    NewInput->MouseX = ;
    NewInput->MouseY = ;
    NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
    NewInput->MouseButtons[0] = ;
    NewInput->MouseButtons[1] = ;
    NewInput->MouseButtons[2] = ;
    // ...
}
Mouse X and Y can be retrieved using the GetCursorPos function. This function tells you where the cursor is supposed to be at any given time. The result is stored in a POINT structure, a simple struct only containing an x and a y coordinate. Unfortunately, you cannot use these values directly. Suppose we tried something like this:
POINT MouseP;
GetCursorPos(&MouseP);
NewInput->MouseX = MouseP.x;
NewInput->MouseY = MouseP.y;
NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
/*
NewInput->MouseButtons[0] = ;
NewInput->MouseButtons[1] = ;
NewInput->MouseButtons[2] = ;
*/
If you compile and run this, you'll notice that the player position doesn't match the mouse cursor position. It's flat-out wrong. What's going on there? The thing is, the GetCursorPos function returns a position in screen coordinates, relative to the top left corner of the screen, while the coordinate system of our window starts from the top left corner of the window itself.
So what we need to do is to map the mouse coordinates to the window coordinates. We can do this using the ScreenToClient function. You pass it a window and a point in screen coordinates, and it will convert said point to window coordinates.
POINT MouseP;
GetCursorPos(&MouseP);
ScreenToClient(Window, &MouseP);
NewInput->MouseX = MouseP.x;
NewInput->MouseY = MouseP.y;
NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
/*
NewInput->MouseButtons[0] = ;
NewInput->MouseButtons[1] = ;
NewInput->MouseButtons[2] = ;
*/
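Conceptually, the conversion is just a subtraction of the window's origin from the cursor position. The sketch below is a platform-independent stand-in, not the real ScreenToClient: the point type and the ClientOrigin parameter (where the top left corner of the window's client area sits on the screen) are assumptions made for this illustration.

```cpp
#include <cassert>

struct point
{
    int X;
    int Y;
};

// Hypothetical stand-in for ScreenToClient.
// ClientOrigin is the screen position of the client area's top left corner.
static point ScreenToClientSketch(point Screen, point ClientOrigin)
{
    point Result;
    Result.X = Screen.X - ClientOrigin.X;
    Result.Y = Screen.Y - ClientOrigin.Y;
    return Result;
}
```

If the window's client area starts at (100, 50) on the screen, a cursor at screen position (500, 300) maps to (400, 250) inside the window. A cursor to the left of or above the window produces negative coordinates.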
This system is not something you want to use in a shipped game; for instance, if you move the mouse to another monitor in a multi-monitor setup, the mouse coordinates will be wrong. But since we don't plan to use this code in the actual game, we can ignore it.
To see the state of the mouse buttons, we could track the mouse-related Windows events. But we won't do that. Instead, we'll use the GetKeyState function. Conceptually, this function tells us whether a key is down (1) or up (0).
GetKeyState processes both keyboard and mouse keys. If you want to see the list of all the supported keys, check out the Virtual-Key Codes article. The state of the Left, Middle, and Right buttons is available at the very top of the page (VK_LBUTTON, VK_MBUTTON, and VK_RBUTTON, respectively). You can even see the so-called “XButtons” right away, so let's code them in as well.
There's a critical note to this. If you check the documentation of GetKeyState, you'll notice that the up/down state is stored in the high-order bit of the returned SHORT (a 16-bit integer). This means that, to find out whether the key is down (1) or up (0), we need to mask out the 16th bit. We don't care about the low bit; it only matters for toggle keys like CAPS LOCK (which can be “on” or “off”).
POINT MouseP;
GetCursorPos(&MouseP);
ScreenToClient(Window, &MouseP);
NewInput->MouseX = MouseP.x;
NewInput->MouseY = MouseP.y;
NewInput->MouseZ = 0; // TODO(casey): Support mousewheel?
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[0], GetKeyState(VK_LBUTTON) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[1], GetKeyState(VK_MBUTTON) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[2], GetKeyState(VK_RBUTTON) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[3], GetKeyState(VK_XBUTTON1) & (1 << 15));
Win32ProcessKeyboardMessage(&NewInput->MouseButtons[4], GetKeyState(VK_XBUTTON2) & (1 << 15));
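To see why the mask matters, here is a small sketch that works on a plain 16-bit value instead of calling GetKeyState. The helper names are made up for this illustration; KeyState stands in for the SHORT that GetKeyState returns.

```cpp
#include <cassert>
#include <cstdint>

// High-order bit (bit 15) set: the key is down.
static bool IsDownFromKeyState(int16_t KeyState)
{
    return (KeyState & (1 << 15)) != 0;
}

// Low-order bit set: the key is toggled (relevant for keys like CAPS LOCK).
static bool IsToggledFromKeyState(int16_t KeyState)
{
    return (KeyState & 1) != 0;
}
```

A value of 1 means "toggled but not pressed". Testing the raw return value for truthiness would wrongly report such a key as down, whereas the masked test does not.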
Let's see whether we are capturing our buttons! To do that... we'll use our RenderPlayer function again. This is how programming works: once you have functions that you can reuse, once you have something working, it builds on itself so quickly.
So our mouse testing code will look something like this:
RenderPlayer(Buffer, GameState->PlayerX, GameState->PlayerY);
if (Input->MouseButtons[0].EndedDown)
{
RenderPlayer(Buffer, 10, 10);
}
RenderPlayer(Buffer, Input->MouseX, Input->MouseY);
This means that a white rectangle will be visible at (10, 10) if the left mouse button is pressed. Nice and easy.
Let's compile and test it... Unfortunately, we crash immediately. We triggered an assertion that we put in place some time ago, which expects the new button state to differ from the previous one. Here, the state is the same as before: we're simply polling the mouse input every frame instead of being notified of a change via a system message. That's actually fine. Let's make this system more robust and change the Assert to an if statement. If the requested key state has changed, record the new state.
if(NewState->EndedDown != IsDown)
{
    NewState->EndedDown = IsDown;
    ++NewState->HalfTransitionCount;
}
This will fix the error, and now we're running correctly.
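Here is a small standalone sketch of that logic being fed polled values. It uses bool instead of b32 and a simplified game_button_state, so treat it as an illustration rather than the actual Win32ProcessKeyboardMessage; the helper CountTransitions is hypothetical.

```cpp
#include <cassert>

// Simplified version of the structure used in the text (bool instead of b32).
struct game_button_state
{
    int HalfTransitionCount;
    bool EndedDown;
};

// Only record a transition when the polled state actually changed.
static void ProcessButtonSketch(game_button_state *NewState, bool IsDown)
{
    if (NewState->EndedDown != IsDown)
    {
        NewState->EndedDown = IsDown;
        ++NewState->HalfTransitionCount;
    }
}

// Hypothetical driver: feed a sequence of polled states, count the transitions.
static int CountTransitions(const bool *Polls, int PollCount)
{
    game_button_state Button = {};
    for (int PollIndex = 0; PollIndex < PollCount; ++PollIndex)
    {
        ProcessButtonSketch(&Button, Polls[PollIndex]);
    }
    return Button.HalfTransitionCount;
}
```

Polling "up, down, down, down, up" yields two half transitions: the repeated "down" readings are simply ignored instead of tripping an assertion.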
Now, if we want to test all buttons, we could write more RenderPlayer calls. But we've written it once, no need to write it again. Let's use a for loop. We'll offset the horizontal position for each mouse button to test multiple buttons at once.
RenderPlayer(Buffer, GameState->PlayerX, GameState->PlayerY);
for(int ButtonIndex = 0;
    ButtonIndex < ArrayCount(Input->MouseButtons);
    ++ButtonIndex)
{
    if(Input->MouseButtons[ButtonIndex].EndedDown)
    {
        RenderPlayer(Buffer, 10 + 20 * ButtonIndex, 10);
    }
}
Compile, run.... Looks good! The game now responds correctly to both the cursor position and the state of the mouse buttons.
That's it for mouse code.
Last time, we did some additional work on our “live replay” code. However, we're still far from ideal here. Let's quickly review our recording code API:
internal void
Win32BeginRecordingInput(...)
{
// Open the file to write to.
// As the first thing, dump the whole memory block we're currently using (as a starting snapshot).
}
internal void
Win32EndRecordingInput(...)
{
// Close whatever file we were writing to.
}
internal void
Win32RecordInput(...)
{
// Write the user's input for this frame to the recording file.
// This should happen while the file is open.
}
internal void
Win32BeginInputPlayback(...)
{
// Open the file to read from.
// As the first thing, load the memory snapshot as our current state.
}
internal void
Win32EndInputPlayback(...)
{
// Close the file we were reading from.
}
internal void
Win32PlaybackInput(...)
{
// As long as there is some data inside the file, interpret it as subsequent frames of user input.
// Read one at a time, restart from the beginning once we've reached the end.
}
We start recording by pressing a key (L).
We stop recording by pressing L again and immediately start playback.
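The overall record/playback cycle can be sketched entirely in memory. This is a toy model with made-up types (sketch_state, sketch_input, sketch_recording), not the Win32 code: a snapshot of the state is taken when recording begins, inputs are appended frame by frame, and playback restores the snapshot and feeds the inputs back. It assumes that looping back to the start also restores the snapshot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct sketch_state { int PlayerX; };
struct sketch_input { int MoveX; };

struct sketch_recording
{
    sketch_state Snapshot;            // state at the moment recording began
    std::vector<sketch_input> Inputs; // one entry per recorded frame
    std::size_t PlayIndex;
};

static void UpdateSketch(sketch_state *State, sketch_input Input)
{
    State->PlayerX += Input.MoveX;
}

static void BeginRecording(sketch_recording *Rec, const sketch_state &State)
{
    Rec->Snapshot = State;
    Rec->Inputs.clear();
    Rec->PlayIndex = 0;
}

static void RecordInput(sketch_recording *Rec, sketch_input Input)
{
    Rec->Inputs.push_back(Input);
}

static void BeginPlayback(sketch_recording *Rec, sketch_state *State)
{
    *State = Rec->Snapshot;
    Rec->PlayIndex = 0;
}

static sketch_input PlaybackInput(sketch_recording *Rec, sketch_state *State)
{
    assert(!Rec->Inputs.empty());
    if (Rec->PlayIndex == Rec->Inputs.size())
    {
        BeginPlayback(Rec, State); // reached the end: loop from the snapshot
    }
    return Rec->Inputs[Rec->PlayIndex++];
}

// Records three frames, then plays back the requested number of frames.
static int RunSketch(int PlaybackFrames)
{
    sketch_state State = {10};
    sketch_recording Rec = {};
    BeginRecording(&Rec, State);
    sketch_input Frames[] = {{1}, {2}, {3}};
    for (sketch_input Input : Frames)
    {
        RecordInput(&Rec, Input);
        UpdateSketch(&State, Input);
    }
    BeginPlayback(&Rec, &State);
    for (int Frame = 0; Frame < PlaybackFrames; ++Frame)
    {
        sketch_input Input = PlaybackInput(&Rec, &State);
        UpdateSketch(&State, Input);
    }
    return State.PlayerX;
}
```

Playing back three frames reproduces the recorded result, and a fourth frame wraps around: the state jumps back to the snapshot and the first input is applied again.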
Today, we'll try to achieve two goals:
In terms of supporting structures... we don't have any. We simply store the recording/playback handles and recording/playback indices inside win32_state. Let's introduce the concept of win32_replay_buffer. This will store a pointer to the memory block that will contain data for our file and the name of the file. Let's say we'll have four of these:
#define WIN32_STATE_FILENAME_COUNT MAX_PATH
struct win32_replay_buffer
{
    char ReplayFilename[WIN32_STATE_FILENAME_COUNT];
    void *MemoryBlock;
};
struct win32_state
{
    u64 TotalSize;
    void *GameMemoryBlock;
    win32_replay_buffer ReplayBuffers[4];
    HANDLE RecordingHandle;
    int InputRecordingIndex;
    HANDLE PlaybackHandle;
    int InputPlayingIndex;
    char EXEFilename[WIN32_STATE_FILENAME_COUNT];
    char *OnePastLastEXEFilenameSlash;
};
So far, no changes to the behavior of the code. We compile and run normally. What we'll try now is to VirtualAlloc a block of memory corresponding to our game memory size.
Win32State.TotalSize = (GameMemory.PermanentStorageSize + GameMemory.TransientStorageSize);
Win32State.GameMemoryBlock = VirtualAlloc(BaseAddress, (size_t)Win32State.TotalSize,
MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
GameMemory.PermanentStorage = Win32State.GameMemoryBlock;
GameMemory.TransientStorage = ((u8 *)GameMemory.PermanentStorage +
GameMemory.PermanentStorageSize);
for(int ReplayIndex = 0;
ReplayIndex < ArrayCount(Win32State.ReplayBuffers);
++ReplayIndex)
{
win32_replay_buffer *ReplayBuffer = &Win32State.ReplayBuffers[ReplayIndex];
ReplayBuffer->MemoryBlock = VirtualAlloc(0, (size_t)Win32State.TotalSize,
MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
if (ReplayBuffer->MemoryBlock)
{
// All good
}
else
{
// TODO(casey): Diagnostic
}
}
Then, instead of writing (and subsequently reading) our game state to a file, we'll copy it to this memory block. We'll then write our input as before, after offsetting the file pointer as if we had written the memory to disk.
We didn't really go into the specifics of how file operations in Windows work.
Windows uses a “File Pointer” concept, which is similar to how you'd have a cursor in a notepad. When you open a file handle for, let's say, writing, this pointer is automatically positioned at the beginning of the file. Each time you call WriteFile, the data is written at its position. Internally, the pointer then advances. The next time you call WriteFile, the data will be added immediately after (and not overwrite what you wrote last time). Of course, you can manipulate the pointer's position (using the SetFilePointerEx function), and that's what we're going to do.
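The same "cursor" idea exists in the C++ standard library, so we can illustrate it without touching Win32. The sketch below uses std::ostringstream as an analogy only: tellp plays the role of querying the file pointer and seekp the role of SetFilePointerEx. The function names are made up for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Each write lands at the current position and advances it.
static long PositionAfterTwoWrites()
{
    std::ostringstream File;
    File.write("abc", 3); // position 0 -> 3
    File.write("de", 2);  // appended right after; position 3 -> 5
    return static_cast<long>(File.tellp());
}

// Moving the position back makes the next write overwrite existing data.
static std::string OverwriteAfterSeek()
{
    std::ostringstream File;
    File.write("abc", 3);
    File.write("de", 2);
    File.seekp(1);       // move the "file pointer" to offset 1
    File.write("X", 1);  // overwrites 'b'
    return File.str();
}
```

After the two writes, the position is 5; after seeking back to offset 1 and writing a single character, only that one byte of the existing contents is replaced.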
Thus, we will rewrite Win32BeginRecordingInput in the following way:
// Removed: we no longer write the game state to the file.
// DWORD BytesToWrite = (DWORD)State->TotalSize;
// Assert(State->TotalSize == BytesToWrite);
// DWORD BytesWritten;
// WriteFile(State->RecordingHandle, State->GameMemoryBlock, BytesToWrite,
//           &BytesWritten, 0);
LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
Instead of writing the game state to the file, we'll CopyMemory it to our ReplayBuffer:
LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
CopyMemory(RecordBlock, State->GameMemoryBlock, State->TotalSize);
This, of course, won't compile since we don't have a RecordBlock. We'll fetch it based on the index we get. Since we'll be repeating this operation in Win32BeginInputPlayback, let's add a utility function to do so.
internal win32_replay_buffer *
Win32GetReplayBuffer(win32_state *State, int unsigned Index)
{
Assert (Index < ArrayCount(State->ReplayBuffers));
win32_replay_buffer *Result = &State->ReplayBuffers[Index];
return (Result);
}
internal void
Win32BeginRecordingInput(win32_state *State, int InputRecordingIndex)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputRecordingIndex = InputRecordingIndex;
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
    }
}
Speaking of Win32BeginInputPlayback, let's rewrite it to use the same workflow. We'll load the game state from the replay buffer instead of the file, move the file pointer, and start reading.
internal void
Win32BeginInputPlayback(win32_state *State, int InputPlayingIndex)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputPlayingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputPlayingIndex = InputPlayingIndex;
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputPlayingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        State->PlaybackHandle = CreateFileA(Filename, GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0);
        // Removed: we no longer read the game state from the file.
        // DWORD BytesToRead = (DWORD)State->TotalSize;
        // Assert(State->TotalSize == BytesToRead);
        // DWORD BytesRead;
        // ReadFile(State->PlaybackHandle, State->GameMemoryBlock, BytesToRead, &BytesRead, 0);
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);
        CopyMemory(State->GameMemoryBlock, ReplayBuffer->MemoryBlock, State->TotalSize);
    }
}
Ok, let's compile and try it out... It might actually take even longer than before.
What exactly is happening? Something is slowing us down, but what?
To answer this question, we'd need a profiler, a tool allowing us to analyze how long each call takes to execute. We could use the one available inside Visual Studio, but a more straightforward way would be to simply pepper our code with OutputDebugStringA calls. Whichever spot takes the longest to print out in the console is the culprit.
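If eyeballing the output window isn't precise enough, the same "poor man's" idea can be extended by printing elapsed time next to each marker. The sketch below swaps in std::chrono and printf so that it runs anywhere; it is not what the chapter does (which only uses OutputDebugStringA), and both function names are made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Seconds elapsed since Start, measured with a monotonic clock.
static double SecondsSince(std::chrono::steady_clock::time_point Start)
{
    std::chrono::duration<double> Elapsed = std::chrono::steady_clock::now() - Start;
    return Elapsed.count();
}

// Runs some busy work and prints a SPAM-style marker with the time it took.
static double TimeBusyWork(int Iterations)
{
    std::chrono::steady_clock::time_point Start = std::chrono::steady_clock::now();
    volatile unsigned Sink = 0; // volatile so the loop is not optimized away
    for (int Index = 0; Index < Iterations; ++Index)
    {
        Sink = Sink + (unsigned)Index;
    }
    double Seconds = SecondsSince(Start);
    std::printf("SPAM: %d iterations took %f s\n", Iterations, Seconds);
    return Seconds;
}
```

Wrapping each suspicious call this way gives you numbers to compare instead of a gut feeling about which line appeared late.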
The slowdown evidently happens inside Win32BeginRecordingInput, so let's “profile” it:
OutputDebugStringA("SPAM 0\n");
State->InputRecordingIndex = InputRecordingIndex;
char Filename[WIN32_STATE_FILENAME_COUNT];
Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
OutputDebugStringA("SPAM 1\n");
State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
OutputDebugStringA("SPAM 2\n");
LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
OutputDebugStringA("SPAM 3\n");
CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
OutputDebugStringA("SPAM 4\n");
Let's compile and run this. If you pay close attention to the Output window when you press L, you'll see that SPAM 0-3 will appear almost instantly, while SPAM 4 will take a short while longer but still nowhere near the overall delay. You will also notice that our debug timing string takes time to appear. This means that the lag spike happens somewhere else but before the end of the frame. Let's step through and maybe find the culprit this way.
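Set a breakpoint at the beginning of Win32BeginRecordingInput and run the program.
Once you press L, the program will stop at the breakpoint.
Step through the function with F10, and you'll see the SPAM strings appear.
Keep stepping until you are out of Win32BeginRecordingInput and out of Win32ProcessInput.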
What is going on here?
The reason for such a delay is that, by itself, moving the file pointer doesn't write anything to the file. Windows only reserves all those skipped bytes when the first write happens. Therefore, if we skip past many bytes, a lot of zeros will have to be written to disk.
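A toy in-memory model makes the cost visible. This is not how Windows implements files; it is only a sketch of the behavior described above, with made-up names (sim_file, SimSetPointer, SimWrite, BytesMaterialized). Moving the pointer is free, but the first write past the end has to materialize every skipped byte as a zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a file: its stored bytes plus a current position.
struct sim_file
{
    std::vector<unsigned char> Bytes;
    std::size_t Position;
};

// Moving the pointer stores nothing.
static void SimSetPointer(sim_file *File, std::size_t NewPosition)
{
    File->Position = NewPosition;
}

// Writes Data at the current position and returns how many bytes had to be
// newly stored, including the zero fill for any gap that was skipped over.
static std::size_t SimWrite(sim_file *File, const unsigned char *Data, std::size_t Size)
{
    std::size_t OldSize = File->Bytes.size();
    std::size_t End = File->Position + Size;
    if (End > OldSize)
    {
        File->Bytes.resize(End, 0); // the gap is filled with zeros here
    }
    for (std::size_t Index = 0; Index < Size; ++Index)
    {
        File->Bytes[File->Position + Index] = Data[Index];
    }
    File->Position = End;
    return File->Bytes.size() - OldSize;
}

// Skip SkipBytes from the start of an empty file, then write InputSize bytes.
static std::size_t BytesMaterialized(std::size_t SkipBytes, std::size_t InputSize)
{
    sim_file File = {};
    std::vector<unsigned char> Input(InputSize, 0xAB);
    SimSetPointer(&File, SkipBytes);
    return SimWrite(&File, Input.data(), Input.size());
}
```

Writing 8 bytes at the start costs 8 bytes; writing the same 8 bytes after skipping 1024 costs 1032. Replace 1024 with the size of the whole game memory and the lag on the first recorded frame is no longer surprising.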
We can verify this if we simply #if 0 the parts where we set the pointer position.
internal void
Win32BeginRecordingInput(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        OutputDebugStringA("SPAM 0\n");
        State->InputRecordingIndex = InputRecordingIndex;
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
        OutputDebugStringA("SPAM 1\n");
        State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
        OutputDebugStringA("SPAM 2\n");
#if 0
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
#endif
        OutputDebugStringA("SPAM 3\n");
        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
        OutputDebugStringA("SPAM 4\n");
    }
}
// ...
internal void
Win32BeginInputPlayback(...)
{
    // ...
#if 0
    LARGE_INTEGER FilePosition;
    FilePosition.QuadPart = State->TotalSize;
    SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);
#endif
    // ...
}
Indeed. If we recompile and run, we'll see that the lag is almost gone if it's even noticeable.
Let's try a different thing and see what happens. Instead of opening the file and eventually closing it, we'll simply map it into memory. The function MapViewOfFile tells Windows to associate a piece of memory with the file. Every time you touch (update) the memory, the file will be updated accordingly. A mapping between the two, if you will.
We can call the MapViewOfFile function when we allocate the memory for our replay buffers. In fact, we will call it instead of VirtualAlloc, and MemoryBlock will become the mapped view of the file. Sort of; let's see how it works.
MapViewOfFile takes the following arguments:
hFileMappingObject: let's say we have a MemoryMap handle in our win32_replay_buffer structure, and we'll think about actually getting it later.
dwDesiredAccess: the access to the file mapping object. We want to read and write to it, so it's FILE_MAP_ALL_ACCESS.
dwFileOffsetHigh: the high DWORD of the file offset. We don't need this, so we'll just pass 0.
dwFileOffsetLow: the low DWORD of the file offset. Again, 0 here.
dwNumberOfBytesToMap: the number of bytes to map.
win32_replay_buffer *ReplayBuffer = &Win32State.ReplayBuffers[ReplayIndex];
// Removed: the view of the file replaces the VirtualAlloc'ed block.
// ReplayBuffer->MemoryBlock = VirtualAlloc(0, (size_t)Win32State.TotalSize,
//                                          MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
ReplayBuffer->MemoryMap = ;
ReplayBuffer->MemoryBlock = MapViewOfFile(ReplayBuffer->MemoryMap, FILE_MAP_ALL_ACCESS,
                                          0, 0, Win32State.TotalSize);
if (ReplayBuffer->MemoryBlock)
{
    // All good
}
else
{
    // TODO(casey): Diagnostic
}
To get the MemoryMap handle, we can call the CreateFileMapping function. This function requires the following:
hFile: the file handle. Again, something to add to win32_replay_buffer.
lpFileMappingAttributes: the attributes of the file mapping object. We don't need this, so we'll just pass 0.
flProtect: the protection flags. We want to read and write to the file, so it's PAGE_READWRITE.
dwMaximumSizeHigh: the high DWORD of the maximum size. We need to calculate it from our TotalSize.
dwMaximumSizeLow: the low DWORD of the maximum size. Same as above.
lpName: the name of the file mapping object. We can and will skip this.
Concerning the MaximumSize, Windows wants us to pass the high and low 32-bit parts of the 64-bit value separately. We can do it in two ways:
By hand: the high part can be retrieved by shifting TotalSize right by 32 bits (Win32State.TotalSize >> 32), while the low part can be retrieved by masking with 0xFFFFFFFF (Win32State.TotalSize & 0xFFFFFFFF).
By using the LARGE_INTEGER structure. This structure contains two 32-bit fields, HighPart and LowPart, as well as the corresponding 64-bit field, QuadPart. We can use the QuadPart field to store our 64-bit TotalSize, and then pass HighPart and LowPart to the CreateFileMapping function.
You can use whichever option you prefer. Below we will showcase the second way as it's less error-prone.
LARGE_INTEGER MaxSize;
MaxSize.QuadPart = Win32State.TotalSize;
ReplayBuffer->MemoryMap = CreateFileMapping(ReplayBuffer->FileHandle, 0, PAGE_READWRITE,
                                            MaxSize.HighPart, MaxSize.LowPart, 0);
ReplayBuffer->MemoryBlock = MapViewOfFile(ReplayBuffer->MemoryMap, FILE_MAP_ALL_ACCESS,
                                          0, 0, Win32State.TotalSize);
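For comparison, the manual shift-and-mask way of splitting the size looks like this. It is a portable sketch using fixed-width integer types; split64, SplitByShifting and Rejoin are hypothetical names for this illustration, not Win32 API.

```cpp
#include <cassert>
#include <cstdint>

struct split64
{
    uint32_t High;
    uint32_t Low;
};

// High part: shift right by 32 bits. Low part: mask with 0xFFFFFFFF.
static split64 SplitByShifting(uint64_t Value)
{
    split64 Result;
    Result.High = (uint32_t)(Value >> 32);
    Result.Low = (uint32_t)(Value & 0xFFFFFFFF);
    return Result;
}

// Putting the halves back together, to check that nothing was lost.
static uint64_t Rejoin(split64 Parts)
{
    return ((uint64_t)Parts.High << 32) | Parts.Low;
}
```

A size of exactly 5 GiB (0x140000000) splits into a high part of 1 and a low part of 0x40000000. Forgetting the cast before shifting back, or mixing up the two halves, are the typical mistakes that make the LARGE_INTEGER route less error-prone.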
Of course, we should also expand the win32_replay_buffer structure to include the two handles.
struct win32_replay_buffer
{
    HANDLE FileHandle;
    HANDLE MemoryMap;
    char Filename[WIN32_STATE_FILENAME_COUNT];
    void *MemoryBlock;
};
Let's set this aside for a second and think of the names. At this point, we can expand our file names and have them follow the ReplayIndex value, 0 through 3. Luckily for us, we already extracted Win32GetInputFileLocation, and it already receives the SlotIndex, so we only need to modify the logic there.
internal void
Win32GetInputFileLocation(win32_state *State, int SlotIndex, int DestCount, char *Dest)
{
    // Removed: slots 0 through 3 are now valid.
    // Assert(SlotIndex == 1);
    char Name[64];
    wsprintf(Name, "loop_edit_%d.hmi", SlotIndex);
    Win32BuildEXEPathFilename(State, Name, DestCount, Dest);
}
At this point, we're opening the file handle only when we start recording. But let's say we don't want any of this; we will open all the file handles directly on startup. They will have read/write permissions, and we'll all be happy. We also want to restore SetFilePointerEx even if it slows us down.
internal void
Win32BeginRecordingInput(...)
{
win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
if (ReplayBuffer->MemoryBlock)
{
State->InputRecordingIndex = InputRecordingIndex;
char Filename[WIN32_STATE_FILENAME_COUNT];
Win32GetInputFileLocation(State, InputRecordingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
State->RecordingHandle = ReplayBuffer->FileHandle;
LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
}
}
// ...
internal void
Win32BeginInputPlayback(...)
{
win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputPlayingIndex);
if (ReplayBuffer->MemoryBlock)
{
State->InputPlayingIndex = InputPlayingIndex;
char Filename[WIN32_STATE_FILENAME_COUNT];
Win32GetInputFileLocation(State, InputPlayingIndex, WIN32_STATE_FILENAME_COUNT, Filename);
State->PlaybackHandle = ReplayBuffer->FileHandle;
LARGE_INTEGER FilePosition;
FilePosition.QuadPart = State->TotalSize;
SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);
CopyMemory(State->GameMemoryBlock, ReplayBuffer->MemoryBlock, State->TotalSize);
}
}
int CALLBACK
WinMain(...)
{
// ...
Win32GetInputFileLocation(&Win32State, ReplayIndex,
                          sizeof(ReplayBuffer->Filename), ReplayBuffer->Filename);
ReplayBuffer->FileHandle = CreateFileA(ReplayBuffer->Filename, GENERIC_READ | GENERIC_WRITE,
                                       0, 0, CREATE_ALWAYS, 0, 0);
LARGE_INTEGER MaxSize;
MaxSize.QuadPart = Win32State.TotalSize;
ReplayBuffer->MemoryMap = CreateFileMapping(ReplayBuffer->FileHandle, 0, PAGE_READWRITE,
                                            MaxSize.HighPart, MaxSize.LowPart, 0);
ReplayBuffer->MemoryBlock = MapViewOfFile(ReplayBuffer->MemoryMap, FILE_MAP_ALL_ACCESS,
                                          0, 0, Win32State.TotalSize);
// ...
}
At this point, you can compile and make sure that FileHandle, MemoryMap and MemoryBlock are returned correctly (non-zero) for each ReplayBuffer. You will also notice that new files, loop_edit_0.hmi through loop_edit_3.hmi, have been created inside the build directory. However, we still face the issue of a big delay when we start recording. Also, our playback seems to be broken. Let's fix all of this.
The biggest issue we're facing right now is that we're writing the whole memory state to file during the first input recording. Maybe if we split the two outputs, game state and input records, we can improve the lag? Let's try it.
For Win32GetInputFileLocation, we will start requesting another argument. This argument will determine whether the file is the input stream or the game state. ReplayBuffer will store the handle to the state file, while the RecordingHandle/PlaybackHandle will store the handle to the input stream. This means that we need to roll back some changes we just made, namely comment out the SetFilePointerEx calls once again.
internal void
Win32GetInputFileLocation(win32_state *State, b32 InputStream, int SlotIndex, int DestCount, char *Dest)
{
    char Name[64];
    wsprintf(Name, "loop_edit_%d_%s.hmi", SlotIndex, InputStream ? "input" : "state");
    Win32BuildEXEPathFilename(State, Name, DestCount, Dest);
}
// ...
internal void
Win32BeginRecordingInput(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputRecordingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputRecordingIndex = InputRecordingIndex;
        // The input stream goes to its own file, separate from the state snapshot.
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, true, InputRecordingIndex, sizeof(Filename), Filename);
        State->RecordingHandle = CreateFileA(Filename, GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
#if 0
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->RecordingHandle, FilePosition, 0, FILE_BEGIN);
#endif
        // The game state snapshot is copied into the memory-mapped replay buffer.
        CopyMemory(ReplayBuffer->MemoryBlock, State->GameMemoryBlock, State->TotalSize);
    }
}
// ...
internal void
Win32BeginInputPlayback(...)
{
    win32_replay_buffer *ReplayBuffer = Win32GetReplayBuffer(State, InputPlayingIndex);
    if (ReplayBuffer->MemoryBlock)
    {
        State->InputPlayingIndex = InputPlayingIndex;
        // Open the input stream file that the recording step wrote.
        char Filename[WIN32_STATE_FILENAME_COUNT];
        Win32GetInputFileLocation(State, true, InputPlayingIndex, sizeof(Filename), Filename);
        State->PlaybackHandle = CreateFileA(Filename, GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0);
#if 0
        LARGE_INTEGER FilePosition;
        FilePosition.QuadPart = State->TotalSize;
        SetFilePointerEx(State->PlaybackHandle, FilePosition, 0, FILE_BEGIN);
#endif
        // Restore the game state snapshot from the replay buffer.
        CopyMemory(State->GameMemoryBlock, ReplayBuffer->MemoryBlock, State->TotalSize);
    }
}
// ...
int CALLBACK
WinMain(...)
{
    // ...
    Win32GetInputFileLocation(&Win32State, false, ReplayIndex, sizeof(ReplayBuffer->Filename), ReplayBuffer->Filename);
    ReplayBuffer->FileHandle = CreateFileA(ReplayBuffer->Filename, GENERIC_READ | GENERIC_WRITE,
                                           0, 0, CREATE_ALWAYS, 0, 0);
    // ...
}
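To see what the new naming scheme produces, here is a small portable sketch of the formatting step alone. It assumes the same format string as above but uses snprintf instead of wsprintf and skips the EXE-path prefix that Win32BuildEXEPathFilename adds. BuildReplayFilename is a hypothetical helper, not part of the actual platform layer.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical portable stand-in for the naming part of
// Win32GetInputFileLocation. It only formats the bare file name.
static void
BuildReplayFilename(int InputStream, int SlotIndex, int DestCount, char *Dest)
{
    assert(Dest && (DestCount > 0));

    // snprintf truncates if needed and always null-terminates
    // when the destination size is greater than zero.
    snprintf(Dest, (size_t)DestCount, "loop_edit_%d_%s.hmi",
             SlotIndex, InputStream ? "input" : "state");
}
```

With this scheme, slot 1 would end up with two files, loop_edit_1_input.hmi and loop_edit_1_state.hmi, so the input stream and the state snapshot never share a file.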
Ok, that was a lot of back and forth, but if you compile and run, you'll notice that the lag is much, much smaller! It's a bit odd that setting the file pointer creates such a lag, so that might be something to investigate one day. Let's leave a note for posterity:
win32_replay_buffer *ReplayBuffer = &Win32State.ReplayBuffers[ReplayIndex];
// TODO(casey): Recording system still seems to take too long
// on record start - find out what Windows is doing and if
// we can speed up / defer some of that processing.
Win32GetInputFileLocation(&Win32State, false, ReplayIndex,
sizeof(ReplayBuffer->Filename), ReplayBuffer->Filename);
ReplayBuffer->FileHandle = CreateFileA(ReplayBuffer->Filename, GENERIC_READ | GENERIC_WRITE,
0, 0, CREATE_ALWAYS, 0, 0);
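The core idea of the split, keeping the state snapshot in memory and only streaming input to its own file, can be sketched without any Win32 calls. The version below is a simplified illustration under stated assumptions: platform_state and replay_buffer are hypothetical stand-ins for win32_state and win32_replay_buffer, plain heap allocations replace the mapped view, and memcpy plays the role of CopyMemory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical, simplified stand-in for win32_replay_buffer.
typedef struct
{
    void *MemoryBlock;  // snapshot storage (a mapped view in the real layer)
} replay_buffer;

// Hypothetical, simplified stand-in for win32_state.
typedef struct
{
    size_t TotalSize;
    void *GameMemoryBlock;
    replay_buffer ReplayBuffer;
} platform_state;

// Allocate zeroed game memory and a snapshot buffer of the same size.
// Returns nonzero on success.
static int
InitState(platform_state *State, size_t TotalSize)
{
    State->TotalSize = TotalSize;
    State->GameMemoryBlock = calloc(1, TotalSize);
    State->ReplayBuffer.MemoryBlock = malloc(TotalSize);
    return (State->GameMemoryBlock != 0) && (State->ReplayBuffer.MemoryBlock != 0);
}

// What happens at the start of recording: game memory -> snapshot.
static void
SnapshotGameMemory(platform_state *State)
{
    memcpy(State->ReplayBuffer.MemoryBlock, State->GameMemoryBlock, State->TotalSize);
}

// What happens at the start of playback: snapshot -> game memory.
static void
RestoreGameMemory(platform_state *State)
{
    memcpy(State->GameMemoryBlock, State->ReplayBuffer.MemoryBlock, State->TotalSize);
}

static void
FreeState(platform_state *State)
{
    free(State->GameMemoryBlock);
    free(State->ReplayBuffer.MemoryBlock);
}
```

Any change made to game memory after the snapshot is undone by the restore, which is exactly what lets the loop start again from the same state each time.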
Ok, that's one issue (somewhat) out of the way, let's look at the other: how do we interrupt playback?
Well, this is actually very simple. Whenever the user presses L, we will check if InputPlayingIndex is zero. If it's not, we stop the playback. If it is, we fall through to the same recording check we had before.
else if (VKCode == 'L')
{
    if (IsDown)
    {
        if (State->InputPlayingIndex == 0)
        {
            // Not playing back: toggle between starting and finishing a recording.
            if (State->InputRecordingIndex == 0)
            {
                Win32BeginRecordingInput(State, 1);
            }
            else
            {
                Win32EndRecordingInput(State);
                Win32BeginInputPlayback(State, 1);
            }
        }
        else
        {
            // Already playing back: pressing L interrupts the loop.
            Win32EndInputPlayback(State);
        }
    }
}
This should do the trick. It's certainly not shippable quality, but it will suffice for our debugging purposes.
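The resulting behavior is a three-step cycle: idle, recording, playback, and back to idle. Here is a minimal sketch of just that toggle logic, with the Win32 calls replaced by direct changes to the two indices. The loop_state type and HandleLoopKey function are hypothetical names for illustration, and the slot index is fixed to 1 as in the code above.

```c
#include <assert.h>

// Hypothetical reduced version of the recording-related fields of win32_state.
typedef struct
{
    int InputRecordingIndex;  // 0 means "not recording"
    int InputPlayingIndex;    // 0 means "not playing back"
} loop_state;

// Mirrors the branch structure of the L key handler.
static void
HandleLoopKey(loop_state *State, int IsDown)
{
    if (!IsDown)
    {
        return;  // key releases are ignored
    }

    if (State->InputPlayingIndex == 0)
    {
        if (State->InputRecordingIndex == 0)
        {
            State->InputRecordingIndex = 1;  // begin recording
        }
        else
        {
            State->InputRecordingIndex = 0;  // end recording
            State->InputPlayingIndex = 1;    // begin playback
        }
    }
    else
    {
        State->InputPlayingIndex = 0;        // interrupt playback
    }
}
```

Starting from idle, three presses of L walk through recording and playback and return to idle, while key releases leave the state untouched.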
We're at the end of the last day of the initial platform prototype layer. Next time, we will start diving into the architecture of the game proper, so we won't be spending so much time in Windows. Let's quickly make a few changes to make sure the platform layer doesn't get in our way.
Win32DebugSyncDisplay served us well when we were debugging audio. For now, audio seems to be in a good place, so let's comment the function out and delete the call.
#if 0
internal void
Win32DebugSyncDisplay(...)
{
    // ...
}
#endif
// ...
int CALLBACK
WinMain(...)
{
    // ...
    // The following debug block is deleted:
#if HANDMADE_INTERNAL
    // TODO(casey): Note, current is wrong on the zero'th index
    Win32DebugSyncDisplay(&GlobalBackbuffer, ArrayCount(DebugTimeMarkers), DebugTimeMarkers,
                          DebugTimeMarkerIndex - 1, &SoundOutput, TargetSecondsPerFrame);
#endif
    HDC DeviceContext = GetDC(Window);
    Win32DisplayBufferInWindow(&GlobalBackbuffer, DeviceContext, Dimension.Width, Dimension.Height);
    ReleaseDC(Window, DeviceContext);
    // ...
}
Another thing that we can get rid of is the spam in the Output window. We had two printouts: audio data and the framerate.
#if HANDMADE_INTERNAL
win32_debug_time_marker *Marker = &DebugTimeMarkers[DebugTimeMarkerIndex];
Marker->OutputPlayCursor = PlayCursor;
Marker->OutputWriteCursor = WriteCursor;
Marker->OutputLocation = ByteToLock;
Marker->OutputByteCount = BytesToWrite;
Marker->ExpectedFlipPlayCursor = ExpectedFrameBoundaryByte;
#if 0
// Audio latency printout, disabled to keep the Output window quiet.
DWORD UnwrappedWriteCursor = WriteCursor;
if (UnwrappedWriteCursor < PlayCursor)
{
    UnwrappedWriteCursor += SoundOutput.SecondaryBufferSize;
}
DWORD AudioLatencyBytes = UnwrappedWriteCursor - PlayCursor;
f32 AudioLatencySeconds = ((f32)AudioLatencyBytes / (f32)SoundOutput.BytesPerSample) /
(f32)SoundOutput.SamplesPerSecond;
char TextBuffer[256];
sprintf_s(TextBuffer, sizeof(TextBuffer), "BTL:%u TC:%u BTW:%u - PC:%u WC:%u DELTA:%u (%.3fs)\n",
ByteToLock, TargetCursor, BytesToWrite,
PlayCursor, WriteCursor, AudioLatencyBytes, AudioLatencySeconds);
OutputDebugStringA(TextBuffer);
#endif
#endif
// ...
#if 0
// debug timing output
f32 FPS = 0.0f;
f32 MegaCyclesPerFrame = (f32)CyclesElapsed / (1000.0f * 1000.0f);
char FPSBuffer[256];
sprintf_s(FPSBuffer, sizeof(FPSBuffer), "%.02fms/f, %.02ff/s, %.02fMc/f\n", MSPerFrame, FPS, MegaCyclesPerFrame);
OutputDebugStringA(FPSBuffer);
#endif
Now we no longer print something every frame, so that's good. In the future, we'll have our own debug logging system that we can use to inspect this data.
Everything else seems relatively good.
This marks the end of day 25 and, with it, the initial work on the Win32 platform layer. There's still more to learn, so we may revisit our existing systems in the future.
But for now, this is all behind us. Next time, we'll start playing around with the game's architecture.
Previous: Day 24. Win32 Platform Layer Cleanup
Up Next: Day 26. Introduction to Game Architecture