A Minimal Direct3D 12 Playground
Direct3D 12 literature is sparse, so this is my take on a "Hello Triangle" guide. I am focusing purely on Direct3D. This means that interacting with the window, such as mouse and keyboard input and window resizing, is not discussed. We will work purely on generating frames.
This guide assumes basic familiarity with graphics programming concepts and C++.
We Forget About The Window
This post works with the Direct3D 12 API only. In other words, the actual application window that the graphics API draws on is an afterthought. There are many windowing libraries out there, like SDL and GLFW.
You can always swap windowing libraries and keep the graphics code that you already wrote. Therefore, I will work with a minimal window here, implemented using the Win32 API. The window has a fixed size and takes no mouse or keyboard input. This is deliberate, so you can understand the fundamentals and work your way up towards more advanced features.
This guide is not tied to any windowing system at all. I will assume you have a window that can call:
- Init(): You can use a constructor for this, but we need a way for functions to be called at least once at start up.
- Load(): This is covered in later posts, but we need a function that is called after initialization to process data, like vertices, only after all Direct3D objects are ready.
- Update(): The main function that is called every frame using a while loop.
- Shutdown(): The clean up function. You can use a destructor for this.
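As a rough sketch of how those four hooks fit together, the portable C++ below drives a hypothetical App base class through its lifecycle. The names App, Run, and TraceApp are assumptions made for illustration (the real base class used in this series is not shown here), and a fixed frame count stands in for the window's message loop.

```cpp
#include <string>
#include <vector>

// Hypothetical base class: it only mirrors the four hooks listed above.
class App {
public:
    virtual ~App() = default;
    virtual void Init() {}      // called once at start up
    virtual void Load() {}      // called once, after Init
    virtual void Update() {}    // called every frame
    virtual void Shutdown() {}  // called once at exit

    // Drives the lifecycle for a fixed number of frames. A real window
    // would keep looping until it is asked to close.
    void Run(int frameCount) {
        Init();
        Load();
        for (int frame = 0; frame < frameCount; ++frame) {
            Update();
        }
        Shutdown();
    }
};

// Records the order in which the hooks fire.
class TraceApp : public App {
public:
    std::vector<std::string> calls;
    void Init() override { calls.push_back("Init"); }
    void Load() override { calls.push_back("Load"); }
    void Update() override { calls.push_back("Update"); }
    void Shutdown() override { calls.push_back("Shutdown"); }
};
```

Calling Run(2) on a TraceApp records Init, Load, two Update calls, and Shutdown, in that order. The Basic class in the rest of this post overrides the same hooks.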
Creating the Device
The ID3D12Device represents your display adapter. On modern hardware, this will be your graphics card. If you do not have dedicated display hardware, you can use WARP (the Windows Advanced Rasterization Platform), a software adapter where the CPU does the rendering work instead of a GPU.
As you create the device, pay attention to the feature level parameter of the D3D12CreateDevice function. It states the minimum feature level the adapter must support for device creation to succeed, so it decides how many of the latest and greatest features you can rely on. For example, shader model 6.0 is only supported on D3D_FEATURE_LEVEL_12_0 or above.
So, let's create it! We start by adding the device as a member of our derived class.
class Basic : public App {
    // ... other functions
private:
    Microsoft::WRL::ComPtr<ID3D12Device14> m_device;
};
We can now create the device.
void Init() override {
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_12_2, IID_PPV_ARGS(&m_device));
}
A macro you will use often when creating objects is IID_PPV_ARGS(T **ppType). The Direct3D API is full of [in] and [out] parameters where you need to give the API a GUID and a pointer to a pointer of the object you are trying to create. This can be a bit confusing at first, so just let the IID_PPV_ARGS macro handle the inner workings: it expands to both arguments, deriving the GUID from the type of the pointer you pass in.
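To make the "ID plus pointer to a pointer" pattern less abstract, here is a small portable model of it. Everything in it is a stand-in invented for this sketch: FakeDevice, FakeQueue, CreateObject, and FAKE_PPV_ARGS are not part of Direct3D, and plain integers play the role of GUIDs. The real macro does something similar with __uuidof, which is a Microsoft compiler extension.

```cpp
#include <type_traits>

// Stand-ins for COM interfaces. Each type carries an integer ID that plays
// the role of the interface GUID (hypothetical, for illustration only).
struct FakeDevice { static constexpr int kTypeId = 1; };
struct FakeQueue  { static constexpr int kTypeId = 2; };

// Same shape as a COM creation function: an [in] type ID plus an [out]
// untyped pointer-to-pointer that receives the newly created object.
inline bool CreateObject(int typeId, void** ppObject) {
    if (ppObject == nullptr) return false;
    *ppObject = nullptr;
    if (typeId == FakeDevice::kTypeId) { *ppObject = new FakeDevice{}; return true; }
    if (typeId == FakeQueue::kTypeId)  { *ppObject = new FakeQueue{};  return true; }
    return false;  // unknown ID: nothing was created
}

// Expands to BOTH arguments. The ID is derived from the pointer's own type,
// so the ID and the out pointer can never disagree.
#define FAKE_PPV_ARGS(pp) \
    std::remove_reference_t<decltype(**(pp))>::kTypeId, reinterpret_cast<void**>(pp)
```

With this in place, `FakeDevice* device = nullptr; CreateObject(FAKE_PPV_ARGS(&device));` fills in device without the caller ever spelling out the ID. In this toy version the caller owns the object and must delete it; in real code a ComPtr handles the release.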
Creating the Command Queue
Modern graphics APIs no longer execute commands immediately when they are called. You now record a bunch of commands into a command list, and later submit that list to the ID3D12CommandQueue, which executes them. You can think of it like a recipe book where we list the steps in the order we want the queue to execute them. For now, let's just create the queue.
Add it to the private members of our derived class.
class Basic : public App {
    // ... other functions
private:
    Microsoft::WRL::ComPtr<ID3D12Device14> m_device;
    Microsoft::WRL::ComPtr<ID3D12CommandQueue> m_commandQueue;
};
And initialize it with our newly created ID3D12Device14. We wrap the call in a curly brace scope since we don't need the desc anymore after we create the command queue.
void Init() override {
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_12_2, IID_PPV_ARGS(&m_device));
    {
        const D3D12_COMMAND_QUEUE_DESC desc = {
            .Type = D3D12_COMMAND_LIST_TYPE_DIRECT,
            .Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL,
            .Flags = D3D12_COMMAND_QUEUE_FLAG_NONE,
            .NodeMask = 0,
        };
        m_device->CreateCommandQueue(&desc, IID_PPV_ARGS(&m_commandQueue));
    }
}
You will also encounter many parameters named *pDesc, which is just a pointer to a struct containing the data the function needs. This keeps functions from getting too crazy with the number of parameters they need to do their job. As a bonus, some structs can be reused!
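One thing the snippets in this post skip for brevity is error checking: creation calls such as D3D12CreateDevice and CreateCommandQueue return an HRESULT, and a negative value means failure. A common pattern is a small helper that throws on failure. The sketch below is a portable approximation, assuming a hypothetical ToyHResult alias so it compiles without windows.h; in a real project you would take the SDK's HRESULT type and FAILED macro instead.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Portable stand-in for HRESULT (a 32-bit signed integer in the Windows SDK).
// The alias name is an assumption made for this sketch.
using ToyHResult = std::int32_t;

// Same rule as the SDK's FAILED macro: negative values signal failure.
constexpr bool Failed(ToyHResult hr) { return hr < 0; }

// Throws if a call failed, so a broken device or queue is noticed at the
// call site instead of crashing later through a null pointer.
inline void ThrowIfFailed(ToyHResult hr, const char* what) {
    if (Failed(hr)) {
        throw std::runtime_error(std::string(what) + " failed, hr=" + std::to_string(hr));
    }
}
```

In the real Init(), usage would look like wrapping each creation call, for example passing the result of CreateCommandQueue and a label such as "CreateCommandQueue" to the helper.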
Creating The Swapchain
The swapchain is a bit complex and verbose, as it is the object that presents everything you see on your monitor. We will take this one step at a time.
Let's first add the swapchain as a private member of our class.
class Basic : public App {
    // ... other functions
private:
    Microsoft::WRL::ComPtr<ID3D12Device14> m_device;
    Microsoft::WRL::ComPtr<ID3D12CommandQueue> m_commandQueue;
    Microsoft::WRL::ComPtr<IDXGISwapChain4> m_swapChain;
};
A frame is a single image that Direct3D draws on screen. Games usually target 60 frames per second (fps) or more.
Modern graphics cards are incredibly fast, but even then, an intense scene can slow things down as the application draws whatever is on screen. You don't want to watch the frame get drawn line by line. The swapchain solves this problem.
Your graphics card will have access to two images internally, but you only see one at a time. The front buffer contains the complete frame that is displayed on your monitor. The back buffer is the next frame, which is currently being drawn behind the scenes. Once the back buffer is complete, it becomes the front buffer, and the process repeats, possibly over a hundred times per second. This swap is what we call presenting the frame.
Therefore, it is a good idea to track the back buffer index using a plain UINT.
class Basic : public App {
    // ... other functions
private:
    Microsoft::WRL::ComPtr<ID3D12Device14> m_device;
    Microsoft::WRL::ComPtr<ID3D12CommandQueue> m_commandQueue;
    Microsoft::WRL::ComPtr<IDXGISwapChain4> m_swapChain;
    UINT m_frameIndex = 0;
};
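The front and back buffer dance can be modelled in a few lines of portable C++. This is only a toy under stated assumptions: ToySwapChain is a made-up class, each "image" is a single 32-bit color, and Present flips instantly, whereas the real swapchain coordinates with the GPU and the display.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy model of a two-buffer swapchain. Each "image" is one color value.
class ToySwapChain {
public:
    static constexpr std::size_t kBufferCount = 2;

    // Index of the buffer we are allowed to draw into right now.
    std::size_t GetCurrentBackBufferIndex() const { return backIndex_; }

    // The image being drawn behind the scenes.
    std::uint32_t& BackBuffer() { return buffers_[backIndex_]; }

    // The image currently "on screen": the buffer presented most recently.
    std::uint32_t FrontBuffer() const {
        return buffers_[(backIndex_ + kBufferCount - 1) % kBufferCount];
    }

    // Presenting turns the finished back buffer into the front buffer and
    // moves the back buffer index on to the next image.
    void Present() { backIndex_ = (backIndex_ + 1) % kBufferCount; }

private:
    std::array<std::uint32_t, kBufferCount> buffers_{};
    std::size_t backIndex_ = 0;
};
```

Writing a color into BackBuffer() leaves FrontBuffer() unchanged until Present() is called, and the back buffer index alternates 0, 1, 0, 1. That alternating index is exactly what m_frameIndex keeps track of.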
Detour: What is an IDXGIFactory?
We need to pause briefly. Previously, we used a method on our ID3D12Device to create our command queue. Naturally, you would think we can create our swapchain the same way! Alas, it is not that easy. For our swapchain, we need an IDXGIFactory. Also, note how the factory's name has no mention of ID3D12 at all! Curious.
In extremely simple terms, during our boilerplate initialization for Direct3D, anything that needs information from the operating system goes through the IDXGIFactory. The swapchain needs access to an hwnd, which is a unique handle to a window that the operating system manages. In contrast, the command queue we created does not need anything from the operating system to operate, but the swapchain needs something to draw on, thus it needs the hwnd. Let's create the factory.
void Init() override {
    // previous code omitted
    Microsoft::WRL::ComPtr<IDXGIFactory4> dxgiFactory;
    CreateDXGIFactory2(DXGI_CREATE_FACTORY_DEBUG, IID_PPV_ARGS(&dxgiFactory));
}
Back To Our Swapchain!
We need another description struct (desc) for our swapchain. There are multiple versions you can use, but DXGI_SWAP_CHAIN_DESC1 is less verbose. This struct has quite a lot of members, so do look at the documentation. Let's now add our swapchain desc in some scoped braces.
void Init() override {
    // previous code omitted
    Microsoft::WRL::ComPtr<IDXGIFactory4> dxgiFactory;
    CreateDXGIFactory2(DXGI_CREATE_FACTORY_DEBUG, IID_PPV_ARGS(&dxgiFactory));
    {
        const DXGI_SWAP_CHAIN_DESC1 swapChainDesc = {
            .Width = static_cast<UINT>(GetWidth()),
            .Height = static_cast<UINT>(GetHeight()),
            .Format = DXGI_FORMAT_R8G8B8A8_UNORM,
            .Stereo = FALSE,
            .SampleDesc = {.Count = 1, .Quality = 0},
            .BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT,
            .BufferCount = 2, // 2 frames
            .Scaling = DXGI_SCALING_STRETCH,
            .SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD,
            .AlphaMode = DXGI_ALPHA_MODE_UNSPECIFIED,
            .Flags = DXGI_SWAP_CHAIN_FLAG_ALLOW_TEARING,
        };
    }
}
This next part is a bit tricky due to how Microsoft's COM works and how backwards compatibility works across the DirectX lineup. We need to create a temporary IDXGISwapChain1 using CreateSwapChainForHwnd.
Notice how it is an older swapchain version than the m_swapChain we declared as a private member. We ask the IDXGIFactory4 we created earlier for this older swapchain, and if our version of Windows supports the newer IDXGISwapChain4, we get that instead through ComPtr's As() method, which queries the object for the newer interface.
In short, we ask the factory for the older, backwards compatible version of the swapchain, and right after, ask for an upgrade.
void Init() override {
    // previous code omitted
    Microsoft::WRL::ComPtr<IDXGIFactory4> dxgiFactory;
    CreateDXGIFactory2(DXGI_CREATE_FACTORY_DEBUG, IID_PPV_ARGS(&dxgiFactory));
    {
        const DXGI_SWAP_CHAIN_DESC1 swapChainDesc = {
            .Width = static_cast<UINT>(GetWidth()),
            .Height = static_cast<UINT>(GetHeight()),
            .Format = DXGI_FORMAT_R8G8B8A8_UNORM,
            .Stereo = FALSE,
            .SampleDesc = {.Count = 1, .Quality = 0},
            .BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT,
            .BufferCount = 2, // 2 frames
            .Scaling = DXGI_SCALING_STRETCH,
            .SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD,
            .AlphaMode = DXGI_ALPHA_MODE_UNSPECIFIED,
            .Flags = DXGI_SWAP_CHAIN_FLAG_ALLOW_TEARING,
        };
        // Ask for the older interface first, then upgrade it to IDXGISwapChain4.
        Microsoft::WRL::ComPtr<IDXGISwapChain1> swapChain1;
        dxgiFactory->CreateSwapChainForHwnd(m_commandQueue.Get(), GetHwnd(), &swapChainDesc, nullptr, nullptr, &swapChain1);
        swapChain1.As(&m_swapChain);
    }
}
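If the "create old, then upgrade" step feels odd, the following portable model may help. It is an analogy only: the ToySwapChainV1 and ToySwapChainV4 types, CreateToySwapChain, and QueryAs are invented for this sketch, and dynamic_cast stands in for COM's QueryInterface, which is what ComPtr's As() calls in the real code.

```cpp
#include <memory>

// Stand-ins for two generations of the same interface.
struct ToySwapChainV1 {
    virtual ~ToySwapChainV1() = default;
    virtual int Version() const { return 1; }
};

struct ToySwapChainV4 : ToySwapChainV1 {
    int Version() const override { return 4; }
    // Only the newer interface exposes this method.
    unsigned GetCurrentBackBufferIndex() const { return 0; }
};

// The factory only promises the OLD interface, even when the object it
// returns is actually the newer one.
inline std::unique_ptr<ToySwapChainV1> CreateToySwapChain(bool runtimeHasV4) {
    if (runtimeHasV4) return std::make_unique<ToySwapChainV4>();
    return std::make_unique<ToySwapChainV1>();
}

// Plays the role of the upgrade request: returns nullptr when the object
// does not implement the requested interface.
template <typename Interface>
Interface* QueryAs(ToySwapChainV1* object) {
    return dynamic_cast<Interface*>(object);
}
```

When the upgrade succeeds you get a pointer with the newer methods; when it fails you get nullptr, much like As() returning a failing HRESULT on a system that lacks IDXGISwapChain4. The upgrade matters because GetCurrentBackBufferIndex is only available from IDXGISwapChain3 onward.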
We have a working swapchain now, but the front and back buffers that hold our precious frames need to be displayed on our monitors somehow. Cue the Render Target Views. But before we move on, recall m_frameIndex, the UINT that holds the current back buffer index from our swapchain. Let's initialize it after the scope.
m_frameIndex = m_swapChain->GetCurrentBackBufferIndex();
Render Target View (RTV)
Let's say our swapchain produced the frame we need. For us, the developers, to see or do anything with it, we have to go through an ID3D12Resource. You can think of it as just a block of data that exists on your GPU.
Remember how many buffers we created in our swapchain? We need one ID3D12Resource per buffer, so we add an array of them to our class.
class Basic : public App {
private:
    // previous members omitted
    Microsoft::WRL::ComPtr<ID3D12Resource> m_renderTargets[2];
};
Let's set the render target resources aside for now, since we first need a way to tell the GPU what they are. Recall that an ID3D12Resource exists on the GPU, and ideally we want all rendering operations to happen on the GPU only and never cross the boundary to the CPU.
Direct3D 12 uses the concept of a descriptor, also called a view, which tells us what that generic ID3D12Resource is and where it is. We can potentially have many descriptors that the GPU needs access to, so we place them inside a heap.
Therefore, we need an ID3D12DescriptorHeap (not to be confused with ID3D12Heap, which backs resource memory). This heap is a contiguous block of memory that holds descriptors. Unlike with the new keyword, we don't have the luxury of the compiler constructing objects and placing them in memory for us, so we start from an empty ID3D12DescriptorHeap and fill in its slots ourselves.
We also need to keep track of the size of each descriptor we place in the heap, as different GPUs may have different descriptor sizes. This size has to be managed by the developer, as it is our way to navigate the ID3D12DescriptorHeap like an array.
When you write int n = myArray[1], it is syntactic sugar for pointer arithmetic. Under the hood, you are doing int n = *(myArray + 1). The compiler knows each int is sizeof(int) bytes (typically 4), so it advances the address by that many bytes. We do the same thing by hand when we work with an ID3D12DescriptorHeap.
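A quick way to convince yourself of the pointer arithmetic claim is to write it out. The two helper functions below are names made up for this sketch; they show that indexing and explicit pointer arithmetic read the same element, and that the byte distance is the index times sizeof(int).

```cpp
#include <cstddef>

// Reads element `index` with explicit pointer arithmetic.
// This is what values[index] means under the hood.
inline int ElementAt(const int* values, std::size_t index) {
    return *(values + index);
}

// Byte distance between element 0 and element `index`. The compiler scales
// the index by sizeof(int) for us; with descriptor heaps we do that scaling
// ourselves using the descriptor size.
inline std::size_t ByteOffset(const int* values, std::size_t index) {
    const auto* first = reinterpret_cast<const unsigned char*>(values);
    const auto* target = reinterpret_cast<const unsigned char*>(values + index);
    return static_cast<std::size_t>(target - first);
}
```

For an array such as `int a[3] = {10, 20, 30};`, ElementAt(a, 1) returns the same value as a[1], and ByteOffset(a, 2) equals 2 * sizeof(int).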
So, let's add the heap and a UINT for our descriptor size to our class. UINT is an unsigned integer typedef that the Windows API likes to use.
class Basic : public App {
private:
    // previous members omitted
    Microsoft::WRL::ComPtr<ID3D12Resource> m_renderTargets[2];
    Microsoft::WRL::ComPtr<ID3D12DescriptorHeap> m_rtvHeap;
    UINT m_rtvDescriptorSize = 0;
};
We are now ready to create our ID3D12DescriptorHeap. We ask our device to call the CreateDescriptorHeap method, which needs a D3D12_DESCRIPTOR_HEAP_DESC. The desc is generic, so we need to describe what kind of heap it is and how we will use it. Once again, let's scope it.
void Init() override {
    // previous code omitted
    {
        // FrameCount is assumed to be a constant equal to the swapchain's BufferCount (2).
        D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {
            .Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV,
            .NumDescriptors = FrameCount,
            .Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE,
            .NodeMask = 0, // single adapter
        };
    }
}
We will also create the actual descriptor heap with our desc and ask our device for the size of each descriptor. These API names can get quite long!
void Init() override {
    // previous code omitted
    {
        D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {
            .Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV,
            .NumDescriptors = FrameCount,
            .Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE,
            .NodeMask = 0, // single adapter
        };
        m_device->CreateDescriptorHeap(&rtvHeapDesc, IID_PPV_ARGS(&m_rtvHeap));
        m_rtvDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);
    }
}
Okay, remember the m_renderTargets we made earlier, before taking a detour? We now need to create the descriptors for them. Recall that a descriptor only gives information about a resource; it is not the resource itself.
First, let us get the start of our heap. We are using CD3DX12_CPU_DESCRIPTOR_HANDLE, one of Microsoft's helper types from the d3dx12.h header that is part of the repo. This handle points to the start of the heap. Scoped, as always.
void Init() override {
    // previous code omitted
    {
        CD3DX12_CPU_DESCRIPTOR_HANDLE rtvHandle(
            m_rtvHeap->GetCPUDescriptorHandleForHeapStart());
    }
}
The next step is filling in our heap, manually indexing into it with our rtvHandle. In our loop, we are basically doing:
- Ask the swapchain to GetBuffer into our m_renderTargets[n]
- Ask the device to CreateRenderTargetView inside of our heap
- Tell our rtvHandle to offset by the size of our descriptor, ready for the next iteration
In other words, the swapchain does not hand out its buffers up front, so we ask it to GetBuffer into our ID3D12Resource render target.
void Init() override {
    // previous code omitted
    {
        CD3DX12_CPU_DESCRIPTOR_HANDLE rtvHandle(
            m_rtvHeap->GetCPUDescriptorHandleForHeapStart());
        for (UINT n = 0; n < FrameCount; n++)
        {
            m_swapChain->GetBuffer(n, IID_PPV_ARGS(&m_renderTargets[n]));
            m_device->CreateRenderTargetView(m_renderTargets[n].Get(), nullptr, rtvHandle);
            rtvHandle.Offset(1, m_rtvDescriptorSize);
        }
    }
}
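To see what the handle is doing during that loop, here is a portable model of the walk. ToyCpuHandle and SlotAddresses are hypothetical names for this sketch, and the start address and increment used in the usage note are made-up numbers; the real increment comes from GetDescriptorHandleIncrementSize and varies between GPUs.

```cpp
#include <cstddef>
#include <vector>

// Toy version of a CPU descriptor handle: just an address we can advance.
struct ToyCpuHandle {
    std::size_t ptr = 0;

    // Moves the handle forward by `descriptorCount` slots, each
    // `incrementSize` bytes wide, and returns itself for chaining.
    ToyCpuHandle& Offset(int descriptorCount, unsigned incrementSize) {
        ptr += static_cast<std::size_t>(descriptorCount) * incrementSize;
        return *this;
    }
};

// Walks a heap the same way the RTV loop does and returns the address
// that each descriptor slot would be written to.
inline std::vector<std::size_t> SlotAddresses(std::size_t heapStart,
                                              unsigned descriptorCount,
                                              unsigned incrementSize) {
    std::vector<std::size_t> addresses;
    ToyCpuHandle handle{heapStart};
    for (unsigned n = 0; n < descriptorCount; ++n) {
        addresses.push_back(handle.ptr);   // "create the view" at this slot
        handle.Offset(1, incrementSize);   // step to the next slot
    }
    return addresses;
}
```

With a heap that starts at address 0x1000 and an increment of 32 bytes, two descriptors land at 0x1000 and 0x1020. It is the same idea as array indexing, except we supply the element size ourselves.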
If you were curious why we need to bother with views and descriptors, it's because an ID3D12Resource is completely generic. We only say it is a render target because we created a specific descriptor that tells the GPU what that resource is; otherwise it would have no idea.
Create Command Allocator and List
Direct3D 12 gives us a lot of control over how we want things rendered, so a lot of the responsibility is on the developer to drive where data needs to go for a successful frame. This gives us a lot of power if you explore multithreading.
We created an ID3D12CommandQueue earlier, but on its own it does not do anything. It needs an ID3D12CommandList so it can draw a frame step by step. But keep in mind that, unlike in older graphics APIs, the commands are not immediate. They are only executed once we submit the list to the queue, which is done in Update().
Since our ID3D12CommandList is not immediate, we need a backing ID3D12CommandAllocator to own the memory the recorded commands live in. Also, a command list is created in the recording state, so it is good practice to close it right after creating it; it can then be reset at the start of each frame. Therefore, we will create both of those now.
class Basic : public App {
private:
    // previous code omitted
    Microsoft::WRL::ComPtr<ID3D12CommandAllocator> m_commandAllocator;
    Microsoft::WRL::ComPtr<ID3D12GraphicsCommandList> m_commandList;
};
void Init() override {
    // previous code omitted
    m_device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(&m_commandAllocator));
    m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, m_commandAllocator.Get(), nullptr, IID_PPV_ARGS(&m_commandList));
    // The list starts out open for recording; close it until Update() needs it.
    m_commandList->Close();
}
Fences
If you have ever worked with concurrency, you know you will run into race conditions if you do not synchronize access to shared data. Direct3D 12 handles this between the CPU and the GPU with ID3D12Fence objects. In other words, while our GPU is busy in the middle of its work, we do not want another operation to meddle in and disrupt the data it is currently working on.
To put it simply, a fence holds a large integer, a UINT64, that the GPU can update. We compare the fence's internal value to an integer that we keep on the CPU side. This acts like a checkpoint, where we ask the fence to update its own value to the one we set on the CPU only when the GPU is done with some work. This number only increases from here on out.
Let's create these as members.
class Basic : public App {
private:
    // previous code omitted
    Microsoft::WRL::ComPtr<ID3D12Fence> m_fence;
    UINT64 m_fenceValue = 0;
};
But once the GPU is done with the work and updates the fence to our m_fenceValue, we need to know about it on the CPU side. We could busy poll, asking the GPU repeatedly if it is done yet, but it is far more efficient to have the fence signal an event.
class Basic : public App {
private:
    // previous code omitted
    Microsoft::WRL::ComPtr<ID3D12Fence> m_fence;
    UINT64 m_fenceValue = 0;
    HANDLE m_fenceEvent = nullptr;
};
In our Init() method, all we have to do is create the fence and set m_fenceValue and m_fenceEvent accordingly.
void Init() override {
    // previous code omitted
    m_device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&m_fence));
    m_fenceValue = 1;
    m_fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
}
A Small Fence Helper
We will create a small helper function, which you can add as a private function of our derived class. For our basic Direct3D playground, we are not creating a ring buffer. We let the GPU complete all of its commands while the CPU waits, and only then do we proceed to the next loop iteration. This is not the most efficient approach, as you want the GPU to always keep working ahead, but it works for learning purposes.
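void WaitForPreviousFrame() {
    // Ask the queue to set the fence to this value once prior work is done.
    const UINT64 fence = m_fenceValue;
    m_commandQueue->Signal(m_fence.Get(), fence);
    m_fenceValue++;
    // If the GPU has not reached that value yet, sleep until it does.
    if (m_fence->GetCompletedValue() < fence) {
        m_fence->SetEventOnCompletion(fence, m_fenceEvent);
        WaitForSingleObject(m_fenceEvent, INFINITE);
    }
}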
This function asks the queue to Signal the fence with the current value once all previously submitted work is done, then immediately increases m_fenceValue so the next call uses a new value. If the fence has not reached that value yet, we call SetEventOnCompletion and then the Win32 function WaitForSingleObject to block until the GPU catches up. This function will be used in our Update() function, where we fill in our command list.
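The fence handshake is easier to follow in a model where we control the "GPU" ourselves. The sketch below is single-threaded and entirely hypothetical: ToyFence, ToyGpuQueue, and ToyRenderer are invented for illustration, Step() simulates the GPU retiring one queued item, and the loop inside WaitForPreviousFrame stands in for sleeping on the event.

```cpp
#include <cstdint>
#include <deque>

// Toy fence: a counter that only the "GPU" side writes to.
struct ToyFence {
    std::uint64_t completed = 0;
    std::uint64_t GetCompletedValue() const { return completed; }
};

// Toy GPU queue: items retire strictly in submission order.
class ToyGpuQueue {
public:
    void SubmitWork() { pending_.push_back(Item{nullptr, 0}); }
    // Queues a fence update that takes effect after all earlier items.
    void Signal(ToyFence& fence, std::uint64_t value) {
        pending_.push_back(Item{&fence, value});
    }
    // Simulates GPU progress by retiring the oldest pending item.
    bool Step() {
        if (pending_.empty()) return false;
        const Item item = pending_.front();
        pending_.pop_front();
        if (item.fence != nullptr) item.fence->completed = item.value;
        return true;
    }

private:
    struct Item { ToyFence* fence; std::uint64_t value; };
    std::deque<Item> pending_;
};

class ToyRenderer {
public:
    void SubmitFrame() { queue_.SubmitWork(); }

    // Same shape as the helper above. Returns how many GPU steps we had
    // to wait for, purely so the behaviour can be observed.
    int WaitForPreviousFrame() {
        const std::uint64_t fence = fenceValue_;
        queue_.Signal(fence_, fence);
        ++fenceValue_;
        int steps = 0;
        while (fence_.GetCompletedValue() < fence) {
            queue_.Step();  // stand-in for blocking until the GPU catches up
            ++steps;
        }
        return steps;
    }

    std::uint64_t CompletedValue() const { return fence_.GetCompletedValue(); }
    std::uint64_t NextFenceValue() const { return fenceValue_; }

private:
    ToyGpuQueue queue_;
    ToyFence fence_;
    std::uint64_t fenceValue_ = 1;  // matches m_fenceValue = 1 in Init()
};
```

After one SubmitFrame(), the first wait needs two steps: one for the frame's work and one for the signal itself. The fence then reads 1 while the CPU-side counter has already moved on to 2, which is the "checkpoint" behaviour described earlier.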
Recap
In this post, we initialized a minimal setup for Direct3D 12. We have everything we need to get something displayed in our window, but that will be for a later post. For now, we can be confident that our foundation is set and ready to be used later in our Update().