Rendering Terrain Part 2 – Loading a height map and Initializing a Pipeline

Loading a Height Map

Let’s start off with the question, “What is a height map?”
That’s a pretty basic place to start, but I think it’s important.
A height map is an array of values that define the relative heights of each vertex in a mesh. I say relative because you may scale the values or, depending on the format you store the data in, you may even need to do a calculation to come up with the final height values.
Shamus Young gives a great example of the latter in his own terrain engine project. Here’s an example of a height map composed of RGB values from that post:
[Image: terrain_composite]
This is a really handy method, as you can store your height data in a standardized format like BMP or PNG and grab a ready-to-use library to load the files.
The downside is that you have to calculate the actual height of the terrain from 3 or 4 values (depending on whether you use RGB or RGBA). This isn’t a ton of extra calculation, however, and you do get a fairly large range of possible values this way.
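As a rough sketch of the kind of calculation involved, the snippet below treats the three colour channels as one 24-bit integer and maps it onto a height range. The packing order (red as the high byte), the name DecodeHeight, and the minHeight/maxHeight parameters are all assumptions made for illustration. This is not necessarily the scheme used in Shamus Young's project.

```cpp
#include <cstdint>

// Hypothetical helper: combine R, G and B into a single 24-bit value
// (R is assumed to be the most significant byte) and map it linearly
// onto the range [minHeight, maxHeight].
float DecodeHeight(std::uint8_t r, std::uint8_t g, std::uint8_t b,
                   float minHeight, float maxHeight) {
	const std::uint32_t packed = (static_cast<std::uint32_t>(r) << 16) |
	                             (static_cast<std::uint32_t>(g) << 8) |
	                             static_cast<std::uint32_t>(b);
	// 16777215 is 2^24 - 1, the largest value three 8-bit channels can hold.
	const float t = static_cast<float>(packed) / 16777215.0f;
	return minHeight + t * (maxHeight - minHeight);
}
```

With three channels you get roughly 16.7 million distinct height steps instead of the 256 a single 8-bit channel would give you.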
For the purposes of this project, I just downloaded a couple of PNG height maps from a Google search. To load the image data, I chose LodePNG. LodePNG is super simple to use. Just download and include lodepng.h and lodepng.cpp in your project, and you’re good to go.
Opening a PNG file is done with one command:

 unsigned error = lodepng::decode(maImage, mWidth, mHeight, filename);

It gets you not only the image stored in a std::vector of unsigned chars (four elements per vertex), but the height and width of the image, as well.
There are, however, 2 issues that I’ve run into:
Firstly, from what I’ve read, OpenGL defines images bottom-up, while every image format I’ve seen, including PNG, defines them top-down. (Direct3D’s texture coordinates put the origin at the top-left, so whether you see a flip there depends on how your texture coordinates are set up.) So you can wind up with an inverted image. This isn’t a big deal, as long as you’re aware of it. You can either invert the image in an image editor or write a routine to flip everything. For my purposes, I don’t actually care whether or not the image is upside down. When I do, I just invert the image in an image editor.
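
If you'd rather flip in code than in an image editor, a routine along these lines should do it. This is a minimal sketch that assumes the tightly packed RGBA layout (4 bytes per pixel) that lodepng::decode produces; the name FlipVertically is made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Flip a tightly packed RGBA8 image (4 bytes per pixel) upside down, in place,
// by swapping each row in the top half with its mirror row in the bottom half.
void FlipVertically(std::vector<unsigned char>& pixels, unsigned width, unsigned height) {
	const std::size_t rowBytes = static_cast<std::size_t>(width) * 4;
	for (unsigned y = 0; y < height / 2; ++y) {
		unsigned char* top = pixels.data() + y * rowBytes;
		unsigned char* bottom = pixels.data() + (height - 1 - y) * rowBytes;
		std::swap_ranges(top, top + rowBytes, bottom);
	}
}
```

For an image with an odd number of rows, the middle row is simply left where it is.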

The second issue I ran into is one of speed. I fully expected that loading a 2048×2048 or 4096×4096 file would not be very fast, but I didn’t expect that shutdown would also be slower. To be sure it wasn’t something weird with DirectX, I tried creating a regular C array filled with random values. It took a hell of a lot longer to generate random values using the standard C++ random number library, but shutdown was instant instead of taking a few seconds. I’m pretty sure it’s because the LodePNG function above uses std::vectors and there is some overhead that becomes apparent when you’re dealing with millions of values.
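After searching through lodepng.h, I found the function lodepng_decode32_file(). It takes the same arguments as decode() does, except it uses a C array (passed as pointers, since it is a C function) instead of a std::vector. Now the program shuts down quickly and very little else had to be changed, besides releasing the buffer in my Terrain object’s destructor. Note that LodePNG allocates this buffer with malloc(), so it should be released with free() rather than delete.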
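
One way to make sure a malloc'd buffer like this always gets freed the right way is to hand it to a std::unique_ptr with a custom deleter. The sketch below uses a stand-in decoder (FakeDecode, invented for this example) so that it compiles without LodePNG. In the real project you would call lodepng_decode32_file(&raw, &w, &h, filename) at that point instead. MallocBuffer and LoadPixels are hypothetical names too.

```cpp
#include <cstdlib>
#include <memory>

// Deleter that calls std::free, for buffers that were allocated with malloc().
struct FreeDeleter {
	void operator()(unsigned char* p) const noexcept { std::free(p); }
};
using MallocBuffer = std::unique_ptr<unsigned char[], FreeDeleter>;

// Stand-in for a C-style decoder: allocates a zeroed 2x2 RGBA image with
// calloc and returns 0 on success, non-zero on failure.
unsigned FakeDecode(unsigned char** out, unsigned* w, unsigned* h) {
	*w = 2;
	*h = 2;
	*out = static_cast<unsigned char*>(std::calloc(*w * *h * 4, 1));
	return *out ? 0u : 1u;
}

// Take ownership of the raw pointer so it is freed automatically.
// Returns an empty buffer if decoding fails.
MallocBuffer LoadPixels(unsigned& w, unsigned& h) {
	unsigned char* raw = nullptr;
	if (FakeDecode(&raw, &w, &h) != 0) {
		std::free(raw);
		return MallocBuffer{};
	}
	return MallocBuffer(raw);
}
```

With this in place the destructor doesn't need an explicit cleanup call at all, and raw access is still available through get() when an API wants a plain pointer.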

The PNG file just contains a series of RGBA values. My example images are both grayscale, so the colour channels all hold the same value and there is nothing extra to gain from combining them. When I move to 3D rendering, I’ll need to use math, as mentioned above, to convert the channel values to height values that make sense. Since it’s grayscale, probably just scaling the values will be fine. Hell, since the values are between 0 and 255, it might be fine to use them as is. We’ll see. With only 256 possible height levels, I’d expect most terrains generated from this data to be fairly blocky looking, though. That’s ok for now.

In the future, I’d like to replace this RGBA data with a single float per vertex, with the values scaled to between -1 and 1. Noise functions, particularly Perlin Noise, produce values in this range, so having my terrain use the same range makes it easy to incorporate noise. While you can convert those values and store them in RGBA format, I find it more useful to keep them as floats. Direct3D can handle either format just fine.
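
For what it's worth, the conversion from the current 8-bit data to that range is tiny. Here is a minimal sketch, assuming a grayscale image stored as tightly packed RGBA so that reading the red channel alone is enough; ToSignedHeights is a name invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Convert tightly packed RGBA8 pixels to one float per pixel in [-1, 1],
// reading only the red channel (assumes a grayscale image where R == G == B).
std::vector<float> ToSignedHeights(const unsigned char* rgba, std::size_t pixelCount) {
	std::vector<float> heights(pixelCount);
	for (std::size_t i = 0; i < pixelCount; ++i) {
		// 0 maps to -1, 255 maps to +1.
		heights[i] = static_cast<float>(rgba[i * 4]) / 255.0f * 2.0f - 1.0f;
	}
	return heights;
}
```

On the Direct3D side, a single-float texture like this would most likely use a format such as DXGI_FORMAT_R32_FLOAT in place of DXGI_FORMAT_R8G8B8A8_UNORM.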

Initializing The Pipeline

We’ve already initialized the Device, but now we are talking about creating the resources necessary for the shaders, and the shaders themselves, to render the terrain.
So exactly what do we need to do?

  • Create a Shader Resource View Descriptor Heap that will contain references to all the textures and constants the shader will require.
    /* In Terrain constructor */
    D3D12_DESCRIPTOR_HEAP_DESC srvHeapDesc = {};
    srvHeapDesc.NumDescriptors = 1;
    srvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
    // the heap must be shader visible so it can be bound when drawing.
    srvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
    GFX->CreateDescriptorHeap(srvHeapDesc, mpSRVHeap);
    /* Graphics object method */
    // Create and return a pointer to a Descriptor Heap.
    void CreateDescriptorHeap(D3D12_DESCRIPTOR_HEAP_DESC heapDesc, ID3D12DescriptorHeap*& heap) {
    	if (FAILED(mpDev->CreateDescriptorHeap(&heapDesc, IID_PPV_ARGS(&heap)))) {
    		throw GFX_Exception("Failed to create descriptor heap.");
    	}
    }
  • Load the Height map, store it in an ID3D12Resource*, and create a Shader Resource View.
    /* In Terrain constructor */
    // loads the heightmap texture into memory
    void Terrain::PrepareHeightmap(Graphics *GFX) {
    	D3D12_RESOURCE_DESC texDesc = {};
    	D3D12_SUBRESOURCE_DATA texData = {};
    	D3D12_SHADER_RESOURCE_VIEW_DESC	srvDesc = {};
    	// create the texture.
    	texDesc.MipLevels = 1;
    	texDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    	texDesc.Width = mWidth;
    	texDesc.Height = mHeight;
    	texDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
    	texDesc.DepthOrArraySize = 1;
    	texDesc.SampleDesc.Count = 1;
    	texDesc.SampleDesc.Quality = 0;
    	texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
    	// create shader resource view for the heightmap.
    	srvDesc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING;
    	srvDesc.Format = texDesc.Format;
    	srvDesc.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2D;
    	srvDesc.Texture2D.MipLevels = texDesc.MipLevels;
    	texData.pData = maImage;
    	texData.RowPitch = mWidth * 4;
    	texData.SlicePitch = mHeight * mWidth * 4;
    	GFX->CreateCommittedBuffer(mpHeightmap, mpUploadHeightmap, &texDesc);
    	// copy the data to the upload heap.
    	const unsigned int subresourceCount = texDesc.DepthOrArraySize * texDesc.MipLevels;
    	ID3D12GraphicsCommandList *cmdList = GFX->GetCommandList();
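    	UpdateSubresources(cmdList, mpHeightmap, mpUploadHeightmap, 0, 0, subresourceCount, &texData);
    	// transition the texture from a copy destination to a state the pixel shader can read.
    	const auto barrier = CD3DX12_RESOURCE_BARRIER::Transition(mpHeightmap, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE);
    	cmdList->ResourceBarrier(1, &barrier);
    	GFX->CreateSRV(mpHeightmap, &srvDesc, mpSRVHeap);
    	mpHeightmap->SetName(L"Heightmap Resource Heap");
    	mpUploadHeightmap->SetName(L"Heightmap Upload Resource Heap");
    }
    /* Graphics object method */
    // Create a committed default buffer and an upload buffer for copying data to it.
    void Graphics::CreateCommittedBuffer(ID3D12Resource*& buffer, ID3D12Resource*& upload, D3D12_RESOURCE_DESC* texDesc) {
    	// Create the resource heap on the gpu. It starts out as a copy destination.
    	const CD3DX12_HEAP_PROPERTIES defaultProps(D3D12_HEAP_TYPE_DEFAULT);
    	if (FAILED(mpDev->CreateCommittedResource(&defaultProps, D3D12_HEAP_FLAG_NONE, texDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr, IID_PPV_ARGS(&buffer)))) {
    		throw GFX_Exception("Failed to create default heap on CreateSRV.");
    	}
    	// Create an upload heap.
    	const auto buffSize = GetRequiredIntermediateSize(buffer, 0, 1);
    	const CD3DX12_HEAP_PROPERTIES uploadProps(D3D12_HEAP_TYPE_UPLOAD);
    	const auto uploadDesc = CD3DX12_RESOURCE_DESC::Buffer(buffSize);
    	if (FAILED(mpDev->CreateCommittedResource(&uploadProps, D3D12_HEAP_FLAG_NONE, &uploadDesc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&upload)))) {
    		throw GFX_Exception("Failed to create upload heap on CreateSRV.");
    	}
    }
    // Create a Shader Resource view for the supplied resource.
    void Graphics::CreateSRV(ID3D12Resource*& tex, D3D12_SHADER_RESOURCE_VIEW_DESC* srvDesc, ID3D12DescriptorHeap* heap) {
    	mpDev->CreateShaderResourceView(tex, srvDesc, heap->GetCPUDescriptorHandleForHeapStart());
    }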

    You’ll notice that to get the height map into memory, we have to create a resource heap for it, PLUS an upload heap. Every example I’ve found works this way. As we’ll see in the next part, you can use just an upload heap, but a default heap lives in memory on the GPU, which makes it more efficient for the GPU to access. The upload heap appears to actually be in system memory, so the GPU has to pull the data across the bus every time it reads it, wasting bandwidth. You’d only use the upload heap directly for resources that need to change often (every frame, for instance). Otherwise, we upload to the default heap THROUGH the upload heap.
    Of the example code I’ve been able to find, every example keeps the upload heap in memory until shut down for the sake of simplicity. The only reference I could find to freeing this resource is a topic from the middle of 2015. We need to wait until the GPU has completely uploaded the data before we can release the upload heap. But the only way I’ve found to do so is by setting a fence and waiting for it. If you went through the Braynzar Soft tutorial on initializing DirectX, you’ll remember that we use fences every frame to confirm that the GPU is done rendering to a given back buffer before we reset it and start preparing the next frame that will be rendered to it.
    A fence is just a value we tell the GPU to set as the last command in its queue. Once it has set that value, we know it is done with everything queued before it. We can set an event to fire when the fence reaches that value with the ID3D12Fence::SetEventOnCompletion() method. We can then call WaitForSingleObject() or WaitForSingleObjectEx() to wait for the GPU to finish. The Ex version sounds like it would be useful in a multi-threaded application, but we’re just using the normal version.
    So we have to wait for the GPU to finish before we can actually release the upload buffer. And we need to do this for every upload buffer that we’re not going to use again. For every object. I can see why the examples I’ve found don’t bother dealing with it.
    But I want to. I know it doesn’t make sense to wait for the GPU for every single buffer. Hell, it doesn’t make sense to wait for every object, even though we only have the one so far. So we create our Terrain object in the constructor for the Scene as part of initialization. Once we’re done, we tell the GPU to execute all of the instructions it was given during initialization. Then we wait for it to finish. When it’s done, we can let the Terrain object know to delete its unneeded upload heaps.

    /* In Scene constructor */
    // after creating and initializing the heightmap for the terrain, we need to close the command list
    // and tell the Graphics object to execute the command list to actually finish the subresource init.
    CloseCommandLists();
    mpGFX->LoadAssets();
    /* Scene method */
    // Close all command lists. Currently there is only the one.
    void Scene::CloseCommandLists() {
    	// close the command list.
    	if (FAILED(mpGFX->GetCommandList()->Close())) {
    		throw GFX_Exception("CommandList Close failed.");
    	}
    }
    /* Graphics object method */
    void Graphics::LoadAssets() {
    	// load the command list.
    	ID3D12CommandList* lCmds[] = { mpCmdList };
    	// execute
    	mpCmdQ->ExecuteCommandLists(_countof(lCmds), lCmds);
    	// Add Signal command to set fence to the fence value that indicates the GPU is done with that buffer. maFenceValues[i] contains the frame count for that buffer.
    	if (FAILED(mpCmdQ->Signal(maFences[mBufferIndex], maFenceValues[mBufferIndex]))) {
    		throw GFX_Exception("CommandQueue Signal Fence failed on Render.");
    	}
    	// ask the fence to fire our event once it reaches the fence value for this buffer, then block until the GPU gets there.
    	if (FAILED(maFences[mBufferIndex]->SetEventOnCompletion(maFenceValues[mBufferIndex], mFenceEvent))) {
    		throw GFX_Exception("Failed to SetEventOnCompletion for fence in WaitForGPU.");
    	}
    	WaitForSingleObject(mFenceEvent, INFINITE);
    }

    Luckily, we don’t need to deal with dynamically created and destroyed objects. There’s so much I don’t know about DirectX 12 yet, I’m not sure how we’d handle this in that case.

  • Next, we compile the shaders. We need to compile each stage of the shader pipeline that we are using. In our case, so far, we only need a Vertex shader and a Pixel shader.
    /* In Terrain constructor */
    ID3DBlob* VS = nullptr;
    ID3DBlob* PS = nullptr;
    ID3DBlob* err = nullptr;
    D3D12_SHADER_BYTECODE PSBytecode = {};
    D3D12_SHADER_BYTECODE VSBytecode = {};
    // compile our shaders.
    // vertex shader.
    if (FAILED(D3DCompileFromFile(L"RenderTerrainVS.hlsl", NULL, NULL, "main", "vs_5_0", D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION, 0, &VS, &err))) {
    	if (VS) VS->Release();
    	if (err) {
    		throw GFX_Exception((char *)err->GetBufferPointer());
    	}
    	else {
    		throw GFX_Exception("Failed to compile terrain Vertex Shader. No error returned from compiler.");
    	}
    }
    VSBytecode.BytecodeLength = VS->GetBufferSize();
    VSBytecode.pShaderBytecode = VS->GetBufferPointer();
    // pixel shader.
    if (FAILED(D3DCompileFromFile(L"RenderTerrainPS.hlsl", NULL, NULL, "main", "ps_5_0", D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION, 0, &PS, &err))) {
    	if (PS) PS->Release();
    	if (err) {
    		throw GFX_Exception((char *)err->GetBufferPointer());
    	}
    	else {
    		throw GFX_Exception("Failed to compile Pixel Shader. No error returned from compiler.");
    	}
    }
    PSBytecode.BytecodeLength = PS->GetBufferSize();
    PSBytecode.pShaderBytecode = PS->GetBufferPointer();
  • Create a Root Signature. The Root Signature describes the resources that you are binding to the pipeline. In our case, this will be our texture and a simple linear sampler for sampling from our texture. We don’t include the texture itself in the Root Signature. We simply inform the GPU how many resources, what kinds, and in what order they’ll be on the heap.
    /* In Terrain constructor */
    CD3DX12_ROOT_PARAMETER				paramsRoot[1];
    CD3DX12_DESCRIPTOR_RANGE			rangeRoot;
    CD3DX12_STATIC_SAMPLER_DESC			descSamplers[1];
    CD3DX12_ROOT_SIGNATURE_DESC			rootDesc;
    // set up the Root Signature.
    // create a descriptor table with 1 entry for the descriptor heap containing our SRV to the heightmap.
    rangeRoot.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 1, 0);
    paramsRoot[0].InitAsDescriptorTable(1, &rangeRoot);
    // create our sampler.
    descSamplers[0].Init(0, D3D12_FILTER_MIN_MAG_MIP_LINEAR);
    rootDesc.Init(1, paramsRoot, 1, descSamplers, D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
    GFX->CreateRootSig(&rootDesc, mpRootSig);
    /* Graphics object method */
    // Create and pass back a pointer to a new root signature matching the provided description.
    void CreateRootSig(CD3DX12_ROOT_SIGNATURE_DESC* rootDesc, ID3D12RootSignature*& rootSig) {
    	ID3DBlob*	err = nullptr;
    	ID3DBlob*	sig = nullptr;
    	if (FAILED(D3D12SerializeRootSignature(rootDesc, D3D_ROOT_SIGNATURE_VERSION_1, &sig, &err))) {
    		throw GFX_Exception((char *)err->GetBufferPointer());
    	}
    	if (FAILED(mpDev->CreateRootSignature(0, sig->GetBufferPointer(), sig->GetBufferSize(), IID_PPV_ARGS(&rootSig)))) {
    		sig->Release();
    		throw GFX_Exception("Failed to create Root Signature.");
    	}
    	// the serialized blob is no longer needed once the root signature exists.
    	sig->Release();
    }
  • Create a Pipeline State Object. This is the compiled pipeline for rendering our 2D height map. It will contain pointers to each of our compiled shaders and our Root Signature. It also contains other pipeline configurations such as the number of render targets, cull mode for back face culling of polygons, whether we’re using a depth or stencil buffer, what the input layout for vertex data looks like, and what sorts of primitives we’re assembling.
    /* In Terrain constructor */
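    D3D12_INPUT_LAYOUT_DESC inputLayoutDesc = {};
    D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {};
    DXGI_SAMPLE_DESC sampleDesc = {};
    sampleDesc.Count = 1; // assumed: no multisampling.
    // create input layout.
    D3D12_INPUT_ELEMENT_DESC inputLayout[] = {
    	// (the element descriptions are missing from this listing)
    };
    inputLayoutDesc.NumElements = sizeof(inputLayout) / sizeof(D3D12_INPUT_ELEMENT_DESC);
    inputLayoutDesc.pInputElementDescs = inputLayout;
    // create the pipeline state object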
    psoDesc.InputLayout = inputLayoutDesc;
    psoDesc.pRootSignature = mpRootSig;
    psoDesc.VS = VSBytecode;
    psoDesc.PS = PSBytecode;
    psoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE;
    psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM;
    psoDesc.SampleDesc = sampleDesc;
    psoDesc.SampleMask = UINT_MAX;
    psoDesc.RasterizerState = CD3DX12_RASTERIZER_DESC(D3D12_DEFAULT);
    psoDesc.RasterizerState.CullMode = D3D12_CULL_MODE_NONE;
    psoDesc.BlendState = CD3DX12_BLEND_DESC(D3D12_DEFAULT);
    psoDesc.NumRenderTargets = 1;
    psoDesc.DepthStencilState.DepthEnable = false;
    psoDesc.DepthStencilState.StencilEnable = false;
    GFX->CreatePSO(&psoDesc, mpPSO);
    /* Graphics object method */
    // Create and pass back a pointer to a new Pipeline State Object matching the provided description.
    void CreatePSO(D3D12_GRAPHICS_PIPELINE_STATE_DESC* psoDesc, ID3D12PipelineState*& PSO) {
    	if (FAILED(mpDev->CreateGraphicsPipelineState(psoDesc, IID_PPV_ARGS(&PSO)))) {
    		throw GFX_Exception("Failed to CreateGraphicsPipeline.");
    	}
    }
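
Coming back to the upload heap cleanup for a moment: one common way to avoid stalling on the GPU for every buffer is to remember which fence value each buffer is waiting on and release it later, once the fence has passed that value. The sketch below is plain C++ with no Direct3D calls. DeferredReleaseQueue and its methods are names I made up for illustration, and it isn't what this project currently does. In a real renderer, the value passed to Collect() would come from ID3D12Fence::GetCompletedValue(), and each callback would call Release() on an upload heap.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: remembers cleanup callbacks together with the fence
// value that must be reached before each one is safe to run.
class DeferredReleaseQueue {
public:
	// Schedule 'release' to run once the GPU has passed 'fenceValue'.
	// Fence values are assumed to be enqueued in non-decreasing order.
	void Enqueue(std::uint64_t fenceValue, std::function<void()> release) {
		mPending.emplace_back(fenceValue, std::move(release));
	}

	// Run every callback whose fence value has been reached.
	// Returns the number of callbacks that were executed.
	std::size_t Collect(std::uint64_t completedFenceValue) {
		std::size_t released = 0;
		while (!mPending.empty() && mPending.front().first <= completedFenceValue) {
			mPending.front().second();
			mPending.pop_front();
			++released;
		}
		return released;
	}

	// Number of callbacks still waiting on the GPU.
	std::size_t PendingCount() const { return mPending.size(); }

private:
	std::deque<std::pair<std::uint64_t, std::function<void()>>> mPending;
};
```

Calling Collect() once per frame would never block. Anything the GPU hasn't finished with just stays in the queue until a later frame, which might also be one way to cope with dynamically created and destroyed objects.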

My Terrain object’s constructor takes care of all of this, with help from a pointer to the Graphics object. I didn’t want the Terrain object to need to worry about directly talking to the device, but that means the Graphics object has to act as an intermediary.

I had intended to talk about the actual code to draw the height map in 2D today, but we’re already over 2000 words and I have another 1000+ on drawing, so I guess we’ll end it here and I’ll post about drawing next time.

The latest code for this project can be found on GitHub.