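LARGE_INTEGER freq, begin, end;
QueryPerformanceFrequency( &freq ); // QPF = freq.QuadPart, in ticks per second
QueryPerformanceCounter( &begin );
// Code that you are going to measure.
....
QueryPerformanceCounter( &end );
# ticks = end - begin // i.e. end.QuadPart - begin.QuadPart
clock cycles = CPU speed * # ticks / QPF // CPU speed in Hz
time elapsed = # ticks / QPF // time scale is SECONDS.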
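The two conversion formulas above can be written as small helpers. This is a minimal sketch in portable C++ that only does the arithmetic; the function names and the CPU speed parameter are assumptions for illustration, and on Windows the tick values would come from QueryPerformanceCounter and QueryPerformanceFrequency.

```cpp
#include <cassert>
#include <cstdint>

// Converts a tick delta (end - begin) into seconds.
// ticks_per_second is the value reported by the frequency query (QPF).
double ticks_to_seconds(std::int64_t ticks, std::int64_t ticks_per_second) {
    assert(ticks_per_second > 0);  // a zero frequency would mean no usable counter
    return static_cast<double>(ticks) / static_cast<double>(ticks_per_second);
}

// Converts a tick delta into CPU clock cycles.
// cpu_hz is the CPU speed in Hz; it is an input you supply, not something
// the performance counter reports.
double ticks_to_cycles(std::int64_t ticks, std::int64_t ticks_per_second, double cpu_hz) {
    return cpu_hz * ticks_to_seconds(ticks, ticks_per_second);
}
```

As a worked example with made-up numbers: 5000 ticks at a QPF of 10,000,000 ticks per second is 0.0005 seconds, which on a 2 GHz CPU corresponds to roughly 1,000,000 clock cycles.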
--------------------------------------------------------------------------------------------------------------------
// Be careful with the following situations; each of them causes the command buffer to be emptied.
● When one of the lock methods (IDirect3DVertexBuffer9::Lock) is called on a vertex
buffer, index buffer, or texture (under certain conditions with certain flags).
● When a device or vertex buffer, index buffer, or texture is created.
● When a device or vertex buffer, index buffer, or texture is destroyed by the last
release.
● When IDirect3DDevice9::ValidateDevice is called.
● When IDirect3DDevice9::Present is called.
● When the command buffer fills up.
● When IDirect3DQuery9::GetData is called with D3DGETDATA_FLUSH.
--------------------------------------------------------------------------------------------------------------------
Summary
This paper demonstrates how to control the command buffer so that individual calls can
be accurately profiled. The profiling numbers can be generated in ticks, cycles, or
absolute time. They represent the amount of runtime and driver work associated with each
API call.
Start by profiling a Draw*Primitive call in a render sequence. Remember to:
1. Use QueryPerformanceCounter to measure the number of ticks per API call. Use
QueryPerformanceFrequency to convert the results to cycles or time if you like.
2. Use the query mechanism to empty the command buffer before starting.
3. Include the render sequence in a loop to minimize the impact of the mode
transition.
4. Use the query mechanism to measure when the GPU has completed its work.
5. Watch out for runtime concatenation that will have a major impact on the amount
of work done.
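The checklist above can be sketched as a small timing harness. This is an illustrative skeleton, not the paper's code: `profile_sequence`, `ProfileResult`, and the two callables are hypothetical names, and `std::chrono::steady_clock` stands in for QueryPerformanceCounter so the sketch stays portable. In a D3D9 program, `flush_and_wait` would typically issue an event query (IDirect3DQuery9::Issue with D3DISSUE_END) and spin on IDirect3DQuery9::GetData with D3DGETDATA_FLUSH.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Result of profiling one render sequence.
struct ProfileResult {
    std::int64_t total_ns;    // wall time for all iterations, in nanoseconds
    double ns_per_iteration;  // average time per pass through the sequence
};

// flush_and_wait:  hypothetical callable that empties the command buffer and
//                  blocks until the GPU has finished its work.
// render_sequence: hypothetical callable holding the API calls to measure.
template <typename Flush, typename Sequence>
ProfileResult profile_sequence(Flush flush_and_wait, Sequence render_sequence,
                               int iterations) {
    assert(iterations > 0);

    flush_and_wait();  // start from an empty command buffer
    const auto begin = std::chrono::steady_clock::now();

    // Looping amortizes the cost of the mode transition over many calls.
    for (int i = 0; i < iterations; ++i) {
        render_sequence();
    }

    flush_and_wait();  // make sure the submitted work has completed
    const auto end = std::chrono::steady_clock::now();

    const auto total = static_cast<std::int64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin).count());
    return ProfileResult{total, static_cast<double>(total) / iterations};
}
```

Passing the flush and the render sequence as callables keeps the timing logic separate from the graphics API, so the same harness can time the baseline sequence and a modified one.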
This gives you a baseline performance for IDirect3DDevice9::DrawPrimitive that can be
used to build from. To profile one state change, follow these additional tips:
1. Add the state change to a known render sequence and profile the new sequence. Since
the testing is done in a loop, this requires setting the state twice, to opposite
values (enable and disable, for instance).
2. Compare the difference in cycle times between the two sequences.
3. For state changes that significantly change the pipeline (like
IDirect3DDevice9::SetTexture), take the difference between the two sequences to get
the time for the state change.
4. For state changes that require toggling states inside the loop (like
IDirect3DDevice9::SetRenderState), take the difference between the render sequences
and divide by 2. This gives the average number of cycles for each state change.
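The arithmetic in tips 3 and 4 can be sketched as two helpers. The function names are hypothetical, and the inputs are per-iteration costs (in cycles or any other consistent unit) measured for the baseline sequence and for the sequence with the state change added.

```cpp
// Cost of a state change that appears once per loop iteration:
// the plain difference between the two profiled sequences.
double single_state_change_cost(double with_change, double baseline) {
    return with_change - baseline;
}

// Cost of a state change that is toggled (set to one value, then to the
// opposite value) in each iteration: the difference covers two changes,
// so dividing by 2 gives the average cost of one change.
double toggled_state_change_cost(double with_toggle, double baseline) {
    return (with_toggle - baseline) / 2.0;
}
```

With made-up numbers: if the baseline sequence costs 1100 cycles per iteration and the sequence with an enable/disable pair costs 1500, the average cost per state change is (1500 - 1100) / 2 = 200 cycles.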
------------------------------------------------------------------------------------------------------------------------------------------------------
There are two important things to realize:
1) There is no single correct way to design or build an engine; if there were, we'd all be using that.
The architecture needs to fit the requirements and serve your needs.
2) Figuring out those needs is much easier if you're actually building a game, and not just an engine.
Building engines in isolation is folly unless one is already quite experienced (and even then it can be a chore).
By focusing on building games and refactoring the engine components out from underneath those games
(they can be pretty simple), you end up with something much more robust that feels much better to use
because it's proven to be usable, as opposed to engines designed in isolation, which often are not as
usable as one would think.
Eberly's books are a good read.
posted on 2007-10-04 10:04
Konami wiki