Rendering Terrain Part 11 – A Bit of Refactoring

At the moment I don’t think anyone is actually reading this blog, so no one will really notice the changes I’ve made to the code. I still think it’s valuable to talk about them, though.
Until now, the Terrain object has been responsible for creating and managing the Pipeline State Object, Root Signature, and resources necessary for rendering it, both in 2D and 3D. There are a couple of problems with doing it this way:

  1. The Terrain object needs to know and talk to the Graphics object. I’m not trying to make the Terrain completely independent of DirectX 12, but I’d rather not have objects talking directly to the Graphics layer. That’s kind of the point of having a Scene object to manage this stuff. Also, it means I would need to give a pointer to the Graphics object to every object in the scene. That’s kind of ridiculous.
  2. There are a lot of different resources needed to render the Terrain. We’ve got a height map texture, a vertex buffer, an index buffer, and a constant buffer. On top of that is the Root Signature and the Pipeline State Object. Later, we’ll be adding displacement/detail maps and some sort of material definition to determine the colour of the terrain. In a real engine, we’d have a memory budget, both in terms of system memory and video memory. Letting our objects initialize their resources themselves doesn’t really lend itself to memory management.
  3. Speaking of managing resources, one of the benefits of DirectX 12 is that you can manage everything yourself in the engine, rather than letting a generic driver decide for you. This means better control over optimization, leading to improvements in performance. Some of these optimizations revolve around the Scene controller being able to decide how to divide the workload between threads. Another involves ensuring objects using the same shader pipelines are rendered at the same time to minimize swapping the pipeline states. These both imply that the Scene should be managing the pipelines.
  4. In order to let the Terrain manage its own pipeline, the Terrain has to be aware of things it shouldn’t have to be aware of. In my old code, I had to pass the Terrain the position of the camera, the view frustum, and the view/projection matrix so that that data could be added to the constant buffer. Why should the Terrain care about the camera? Shouldn’t it only care about how to draw itself? And this problem only gets worse as we move away from a hard-coded light source and add shadows. That’s even more stuff that has nothing to do with the Terrain that we’d need to pass it.
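
To illustrate the batching idea from point 3, here is a minimal sketch (not the project's code) of grouping draws by pipeline so that each pipeline state only needs to be bound once per group. DrawItem, pipelineId, CountPipelineSwitches, and SortByPipeline are hypothetical names, and a plain integer stands in for a real ID3D12PipelineState pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a renderable: only the pipeline it needs matters here.
struct DrawItem {
	int pipelineId; // stands in for an ID3D12PipelineState*
	int objectId;
};

// Count how many times the pipeline would have to be bound when the
// items are drawn in the given order.
std::size_t CountPipelineSwitches(const std::vector<DrawItem>& items) {
	std::size_t switches = 0;
	for (std::size_t i = 0; i < items.size(); ++i) {
		if (i == 0 || items[i].pipelineId != items[i - 1].pipelineId) {
			++switches;
		}
	}
	return switches;
}

// Group items that share a pipeline; stable_sort keeps the original
// order within each group.
void SortByPipeline(std::vector<DrawItem>& items) {
	std::stable_sort(items.begin(), items.end(),
		[](const DrawItem& a, const DrawItem& b) { return a.pipelineId < b.pipelineId; });
}
```

With four items that alternate between two pipelines, drawing in the original order needs four binds; after SortByPipeline it needs two.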

Those are the issues I can think of, anyway. There may be more. Suffice it to say I had hacked together a bunch of code that was quickly becoming difficult to maintain and extend. I needed to make a change now, before continuing any further. Adding shadows and a proper light source requires new objects that, as I mentioned, really shouldn’t need to be passed to the Terrain.
To that end, I’ve torn the guts of initializing the pipeline and resources out of the Terrain object and moved them to the Scene object. This isn’t perfect. There probably should be some sort of memory manager object that controls every pointer everywhere, I probably shouldn’t have the file i/o code directly accessed by the Terrain, and the Terrain probably shouldn’t be directly storing the vertex and index buffer arrays. Hell, I probably shouldn’t even be giving the Terrain pointers to the height map, vertex buffer, and index buffer resources (ID3D12Resource*). It doesn’t need them. It doesn’t even use them any more. It just still feels to me like the Terrain should be cleaning up those resources itself on shut down. I’ll have to revisit that in future projects when I have more than one object to deal with, or if I decide to do a project with an actual memory budget.

The Terrain Object

So what does the Terrain object look like now? As I mentioned above, it still handles the file input to load the height map. That code hasn’t really changed. I was messing around with passing normals in with the height values, all in the same texture, but since I’m now calculating normals on the fly, I’ve reduced memory usage by cutting the height map down from four floats to one. We also don’t create the Shader Resource View, Descriptor Heap, or the Texture in the Terrain object.
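
For a sense of scale, here is a rough sketch of the arithmetic behind that reduction. It only counts raw texel data (width × depth × channels × sizeof(float)) and ignores any padding the driver may add. HeightMapBytes and the two constants are hypothetical names, not part of the project.

```cpp
#include <cstddef>

// Raw size in bytes of a float height map with the given number of
// channels per texel.
constexpr std::size_t HeightMapBytes(std::size_t width, std::size_t depth, std::size_t channels) {
	return width * depth * channels * sizeof(float);
}

// 4096x4096 map: four floats per texel (old) vs. one float per texel (new).
constexpr std::size_t kOldSize = HeightMapBytes(4096, 4096, 4);
constexpr std::size_t kNewSize = HeightMapBytes(4096, 4096, 1);
static_assert(kOldSize == 4 * kNewSize, "one channel is a quarter of the data");
```

Assuming 4-byte floats, that is 256 MiB before and 64 MiB after for the largest map.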
Building the terrain mesh, the vertex and index buffers, still occurs in the Terrain object, but now the code for setting up the buffers on the GPU has been moved to the Scene, as has the constant buffer.
Because we’re not doing any graphics resource acquisition or uploading, there are fewer member variables and fewer pointers to worry about cleaning up, so that’s nice. As I said above, I could probably eliminate a few more, but I left them here for now.
Our Draw code has been simplified significantly. I had three different draw calls for different modes. I’ve eliminated one mode as it was redundant and combined the other two into one draw method.

void Terrain::Draw(ID3D12GraphicsCommandList* cmdList, bool Draw3D) {
	if (Draw3D) {
		cmdList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_4_CONTROL_POINT_PATCHLIST); // describe how to read the vertex buffer.
		cmdList->IASetVertexBuffers(0, 1, &mVBV);
		cmdList->IASetIndexBuffer(&mIBV);

		cmdList->DrawIndexedInstanced(mIndexCount, 1, 0, 0, 0);
	} else {
		// draw in 2D
		cmdList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); // describe how to read the vertex buffer.
		cmdList->DrawInstanced(3, 1, 0, 0);
	}
}

There’s also a bunch of new Get and Set methods to facilitate the Scene initializing buffers for us.

size_t GetSizeOfVertexBuffer() { return mVertexCount * sizeof(Vertex); }
size_t GetSizeOfIndexBuffer() { return mIndexCount * sizeof(UINT); }
size_t GetSizeOfHeightMap() { return mWidth * mDepth * sizeof(float); }
UINT GetHeightMapWidth() { return mWidth; }
UINT GetHeightMapDepth() { return mDepth; }
float* GetHeightMapTextureData() { return maImage; }
Vertex* GetVertexArray() { return maVertices; }
UINT* GetIndexArray() { return maIndices; }
float GetScale() { return mHeightScale; }
	
void SetVertexBufferView(D3D12_VERTEX_BUFFER_VIEW vbv) { mVBV = vbv; }
void SetIndexBufferView(D3D12_INDEX_BUFFER_VIEW ibv) { mIBV = ibv; }
void SetHeightmapResource(ID3D12Resource* tex) { mpHeightmap = tex; }
void SetVertexBufferResource(ID3D12Resource* vb) { mpVertexBuffer = vb; }
void SetIndexBufferResource(ID3D12Resource* ib) { mpIndexBuffer = ib; }

The Scene Object

We make the assumption that the Scene knows about every possible object that could ever be rendered (in this case, literally just the Terrain) and what the possible pipelines for those objects would be. As this is meant to be the controller for the world, this assumption makes sense.
I did think about the possibility of making something general that wouldn’t know or care how to render a specific object. It would just ask the object for a pipeline specification. But that meant the object once again needed to know every possible way it would be rendered. I’d also need a way to express all of the possible variables. This was probably even messier than my original code and there was no way to optimize it as the Scene wouldn’t know what it was rendering or how. So I abandoned that approach pretty quickly.
I added a couple of new structs to represent our shader constant data. Before, we had something like this:

struct ConstantBuffer {
	XMFLOAT4X4	viewproj;
	XMFLOAT4	eye;
	XMFLOAT4	frustum[6];
	float		scale;
	float		width;
	float		depth;
};

Now, we’ve split these up into two separate structures. One for Terrain shader specific constants, and one for the more general constants.

struct PerFrameConstantBuffer {
	XMFLOAT4X4	viewproj;
	XMFLOAT4	eye;
	XMFLOAT4	frustum[6];
};

struct TerrainShaderConstants {
	float scale;
	float width;
	float depth;

	TerrainShaderConstants(float s, float w, float d) : scale(s), width(w), depth(d) {}
};

I’ve separated the constants into two structures because the perhaps poorly named PerFrameConstantBuffer will likely be needed for most shaders and changes every frame, whereas the TerrainShaderConstants never change and so only need to be uploaded to the GPU once.
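
One practical consequence of the split: Direct3D 12 requires the size of a constant buffer view to be a multiple of 256 bytes, so each struct gets rounded up when its buffer is created. The sketch below shows that arithmetic, with plain float structs standing in for XMFLOAT4 and XMFLOAT4X4 so it compiles without DirectXMath. Float4, Float4x4, and AlignTo256 are hypothetical names.

```cpp
#include <cstddef>

// Plain stand-ins with the same size as XMFLOAT4 and XMFLOAT4X4.
struct Float4 { float x, y, z, w; };
struct Float4x4 { float m[4][4]; };

struct PerFrameConstantBuffer {
	Float4x4 viewproj;
	Float4 eye;
	Float4 frustum[6];
};

struct TerrainShaderConstants {
	float scale;
	float width;
	float depth;
};

// Round a size up to the next multiple of 256 bytes.
constexpr std::size_t AlignTo256(std::size_t size) {
	return (size + 255) & ~static_cast<std::size_t>(255);
}

static_assert(sizeof(PerFrameConstantBuffer) == 176, "64 + 16 + 96 bytes");
static_assert(sizeof(TerrainShaderConstants) == 12, "three floats");
```

Both structs round up to a single 256-byte block, so splitting them costs one extra block of memory in exchange for only rewriting the per-frame part each frame.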

We’ve also created two methods for initializing the pipeline. One for 2D, and one for 3D. Other than taking into account these new buffers, they are identical to what used to be in the Terrain object. Initializing the Descriptor Heap and constant buffer are the same as before as well.
For the resources that are only going to the GPU once, I decided to try using a single upload buffer for all four resources (height map, vertex buffer, index buffer, terrain shader constant buffer). This wound up working pretty easily. I stuck it in a try/catch block so that if there isn’t enough contiguous memory for the combined buffer, the program falls back to creating a separate upload buffer for each resource.

ID3D12GraphicsCommandList *cmdList = mpGFX->GetCommandList();
try {
	ID3D12Resource* upload;
	mpGFX->CreateUploadBuffer(upload, &CD3DX12_RESOURCE_DESC::Buffer(sizeofHeightmap + sizeofVertexBuffer + sizeofIndexBuffer + sizeofConstantBuffer));
	mlTemporaryUploadBuffers.push_back(upload);

	// upload heightmap data
	UpdateSubresources(cmdList, heightmap, upload, 0, 0, 1, &dataTex);

	// upload vertex buffer data
	UpdateSubresources(cmdList, vertexbuffer, upload, sizeofHeightmap, 0, 1, &dataVB);

	// upload index buffer data
	UpdateSubresources(cmdList, indexbuffer, upload, sizeofHeightmap + sizeofVertexBuffer, 0, 1, &dataIB);

	// upload the constant buffer data
	UpdateSubresources(cmdList, constantbuffer, upload, sizeofHeightmap + sizeofVertexBuffer + sizeofIndexBuffer, 0, 1, &dataCB);
} catch (const GFX_Exception&) { // catch by reference to avoid copying the exception
	// create 4 separate upload buffers
	ID3D12Resource *uploadHeightmap, *uploadVB, *uploadIB, *uploadCB;
	mpGFX->CreateUploadBuffer(uploadHeightmap, &CD3DX12_RESOURCE_DESC::Buffer(sizeofHeightmap));
	mpGFX->CreateUploadBuffer(uploadVB, &CD3DX12_RESOURCE_DESC::Buffer(sizeofVertexBuffer));
	mpGFX->CreateUploadBuffer(uploadIB, &CD3DX12_RESOURCE_DESC::Buffer(sizeofIndexBuffer));
	mpGFX->CreateUploadBuffer(uploadCB, &CD3DX12_RESOURCE_DESC::Buffer(sizeofConstantBuffer));
	mlTemporaryUploadBuffers.push_back(uploadHeightmap);
	mlTemporaryUploadBuffers.push_back(uploadVB);
	mlTemporaryUploadBuffers.push_back(uploadIB);
	mlTemporaryUploadBuffers.push_back(uploadCB);

	// upload heightmap data
	UpdateSubresources(cmdList, heightmap, uploadHeightmap, 0, 0, 1, &dataTex);

	// upload vertex buffer data
	UpdateSubresources(cmdList, vertexbuffer, uploadVB, 0, 0, 1, &dataVB);

	// upload index buffer data
	UpdateSubresources(cmdList, indexbuffer, uploadIB, 0, 0, 1, &dataIB);

	// upload constant buffer data
	UpdateSubresources(cmdList, constantbuffer, uploadCB, 0, 0, 1, &dataCB);
}

// set resource barriers to inform GPU that data is ready for use.
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(heightmap, D3D12_RESOURCE_STATE_COPY_DEST,	D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE | D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE));
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(vertexbuffer, D3D12_RESOURCE_STATE_COPY_DEST,	D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER));
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(indexbuffer, D3D12_RESOURCE_STATE_COPY_DEST,	D3D12_RESOURCE_STATE_INDEX_BUFFER));
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(constantbuffer, D3D12_RESOURCE_STATE_COPY_DEST,	D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER));
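
The intermediate offsets passed to UpdateSubresources in the single-buffer path are just running sums of the resource sizes. Here is a minimal sketch of that bookkeeping, using made-up sizes and hypothetical names (UploadLayout, PackUploads) that are not from the project:

```cpp
#include <cstddef>
#include <vector>

// Byte offsets of several resources packed back to back in one upload buffer.
struct UploadLayout {
	std::vector<std::size_t> offsets; // offsets[i] is where resource i starts
	std::size_t totalSize = 0;        // size the upload buffer needs to be
};

// Pack the resources in the order given; each one starts where the
// previous one ends.
UploadLayout PackUploads(const std::vector<std::size_t>& sizes) {
	UploadLayout layout;
	for (std::size_t size : sizes) {
		layout.offsets.push_back(layout.totalSize);
		layout.totalSize += size;
	}
	return layout;
}
```

With the order used above (height map, vertex buffer, index buffer, constants), offsets[1] corresponds to sizeofHeightmap, offsets[2] to sizeofHeightmap + sizeofVertexBuffer, and so on. One thing to keep in mind if the order ever changes: as far as I know, D3D12 wants texture data in an upload buffer to start at a 512-byte-aligned offset, which the height map gets for free by sitting at offset 0.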

The rest of the resource initialization remains the same.

As far as actually rendering the Terrain, I’ve created a separate Draw method called by the main Draw() that specifically handles the Terrain.

void Scene::DrawTerrain(ID3D12GraphicsCommandList* cmdList) {
	cmdList->SetPipelineState(mlPSOs[mDrawMode]);
	cmdList->SetGraphicsRootSignature(mlRootSigs[mDrawMode]);

	ID3D12DescriptorHeap* heaps[] = { mlDescriptorHeaps[0] };
	cmdList->SetDescriptorHeaps(_countof(heaps), heaps);

	// set the srv buffer.
	CD3DX12_GPU_DESCRIPTOR_HANDLE handleSRV(mlDescriptorHeaps[0]->GetGPUDescriptorHandleForHeapStart(), 1, msizeofDescHeapIncrement);
	cmdList->SetGraphicsRootDescriptorTable(0, handleSRV);

	if (mDrawMode == 1) {
		// set the per frame constant buffer.
		XMFLOAT4 frustum[6];
		C.GetViewFrustum(frustum);
		
		PerFrameConstantBuffer constants;
		constants.viewproj = C.GetViewProjectionMatrixTransposed();
		constants.eye = C.GetEyePosition();
		// copy all six frustum planes
		for (int i = 0; i < 6; ++i) {
			constants.frustum[i] = frustum[i];
		}
		memcpy(mpPerFrameConstantsMapped, &constants, sizeof(PerFrameConstantBuffer));
				
		cmdList->SetGraphicsRootDescriptorTable(1, mlDescriptorHeaps[0]->GetGPUDescriptorHandleForHeapStart());
	
		// set the terrain shader constants
		CD3DX12_GPU_DESCRIPTOR_HANDLE handleCBV(mlDescriptorHeaps[0]->GetGPUDescriptorHandleForHeapStart(), 2, msizeofDescHeapIncrement);
		cmdList->SetGraphicsRootDescriptorTable(2, handleCBV);
	}
	
	// mDrawMode = 0/false for 2D rendering and 1/true for 3D rendering
	T.Draw(cmdList, (bool)mDrawMode);
}

The mDrawMode variable just tracks whether I am drawing in 2D or 3D. I chose to use an integer so that I could use it to index the Pipeline State and Root Signature lists. Initialization puts the 2D pipeline in slot 0 and the 3D pipeline in slot 1. We can then pass the integer as a boolean to T.Draw() and get the correct behaviour. This may change if I add more modes, but I think it will work, even with additional objects, as long as the Terrain pipelines are always the first and second in the list.
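
If more modes do show up, one option is a small enum whose values double as list indices, so the 2D/3D decision no longer leans on an int-to-bool cast. This is an alternative sketched under that assumption, not what the project currently does. DrawMode, ToIndex, and Is3D are hypothetical names.

```cpp
#include <cstddef>

// Values match the slots used at initialization: 2D pipeline in 0, 3D in 1.
enum class DrawMode : std::size_t {
	Mode2D = 0,
	Mode3D = 1,
	Count // number of modes, handy for sizing the pipeline lists
};

// Index into the pipeline state / root signature lists.
constexpr std::size_t ToIndex(DrawMode mode) {
	return static_cast<std::size_t>(mode);
}

// Explicit replacement for casting the mode to bool.
constexpr bool Is3D(DrawMode mode) {
	return mode == DrawMode::Mode3D;
}
```

The draw call would then read something like T.Draw(cmdList, Is3D(mode)), and adding a third mode would not silently turn into "true".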

Performance

When I started refactoring the project, I wasn’t expecting any performance gains. In fact, I was somewhat worried I might inadvertently make things slower. My goal, as stated before, was just to make the code more manageable and more extensible moving forward. Still, because I was worried about making it slower, I did retest my frame rate after making the changes. I was quite surprised to see a slight improvement in frame rate.
Our 1024×1024 height map was rendering in Part 10 at its slowest at 533fps. That has improved to 556fps, or about 1.8ms per frame. Probably not more than a 0.1ms improvement, really. The average is more like 1.5ms, or around 640fps.
For the 2048×2048 height map, I struggled to eventually find a worst case location rendering at 3.6ms per frame, or 279fps. Compare that to about 3.8ms per frame, 266fps before. On average, we haven’t really gained. The frame rate fluctuates slightly as we move around, but stays around 3ms, 320fps, the same as last time.
We see the best improvements on the 4096×4096 map. Previously, we had a worst case of 10.9ms, 91fps. Now, this has improved to 9.6ms, or 104fps. The average is a bit closer. Our new average is about 7.6ms/130fps, vs 8ms/120fps.
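
The frame times and frame rates quoted here are two views of the same number: milliseconds per frame = 1000 / fps. A tiny sketch (FpsToMs and MsToFps are hypothetical helper names) makes the figures easy to check:

```cpp
// Convert between frames per second and milliseconds per frame.
constexpr double FpsToMs(double fps) { return 1000.0 / fps; }
constexpr double MsToFps(double ms) { return 1000.0 / ms; }
```

For example, FpsToMs(533) - FpsToMs(556) comes out at a little under 0.08ms, which matches the "not more than 0.1ms" estimate for the 1024×1024 map.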

I’m not entirely certain what I owe the improvement to. The biggest change was probably reducing the height map texture from four floats to one float. I would have thought that would only matter for our memory footprint, but perhaps it affected performance as well. Perhaps the texture fits in faster memory now. I don’t see how anything else I changed would account for any improvement at all, no matter how minor.

Anyway, that’s enough for now. I still have other changes to make. I’d like to change a few aspects of the Graphics class, and I need to create a new Light class. The Graphics changes aren’t a big thing. I’ll do that when I feel like it. I’d like to move on to shadows next post. I’ll need the Light for that. I’ve been looking at Shadow Maps. I think they may work. I’m a little concerned about using them across the whole Terrain, but we’ll see if it’ll work. Otherwise, I have some other ideas.

Until next time, the latest code is available at GitHub.

Traagen