I did a little work over the weekend to optimize the pipeline, specifically optimizing glDrawArrays() when GL_LIGHTING is enabled.

Neoblast wrote: How much do you think it'd be possible by improving the pipeline?

First of all, when GL_LIGHTING is enabled, the Vertex Position and the Vertex Normal must both be transformed into Eye Space for lighting, and the Position and the Normal each need to be transformed by a separate Matrix.

The standard Vertex Submission Pipeline does this Per Vertex, meaning the Matrix Register gets re-loaded 2 times per Vertex. Each reload costs ~11 cycles, so the standard Vertex Submission Pipeline has an overhead of ~22 cycles Per Vertex just re-loading the Matrix Register.

I realized that can be avoided when Submitting Arrays, by making 3 separate passes over the data:

-Pass 1: Load Modelview Rotation Matrix, Transform all Vertex Normals

-Pass 2: Load Modelview Matrix, Transform all Vertex Positions

-Pass 3: Load Render Matrix, then Light then Transform Vertices

This optimization saves ~22 cycles Per Vertex when submitting Lit Vertices through glDrawArrays().
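To illustrate the idea, here is a minimal sketch in plain C, not the actual pipeline code. The names `load_matrix`, `submit_per_vertex` and `submit_batched` are hypothetical, the Matrix Register is simulated with a pointer and a counter, and Pass 3 (lighting and the Render Matrix) is left out since it follows the same pattern. It only shows how the number of Matrix loads changes between the two approaches.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } vec3;
typedef struct { float m[16]; } mat4;      /* column-major, like OpenGL */

/* Stand-in for the hardware Matrix Register (hypothetical). */
static const mat4 *loaded_matrix = NULL;
static unsigned matrix_loads = 0;          /* counts the costly reloads */

static void load_matrix(const mat4 *m) { loaded_matrix = m; matrix_loads++; }

/* Multiply (x, y, z, w) by the loaded matrix: w = 1 for positions, 0 for normals. */
static vec3 transform(vec3 v, float w)
{
    const float *m;
    vec3 r;
    assert(loaded_matrix != NULL);
    m = loaded_matrix->m;
    r.x = m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12] * w;
    r.y = m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13] * w;
    r.z = m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14] * w;
    return r;
}

/* Interleaved path: both matrices are reloaded for every vertex. */
unsigned submit_per_vertex(const mat4 *modelview, const mat4 *rotation,
                           const vec3 *pos, const vec3 *nrm,
                           vec3 *out_pos, vec3 *out_nrm, size_t count)
{
    matrix_loads = 0;
    for (size_t i = 0; i < count; i++) {
        load_matrix(rotation);
        out_nrm[i] = transform(nrm[i], 0.0f);
        load_matrix(modelview);
        out_pos[i] = transform(pos[i], 1.0f);
    }
    return matrix_loads;                   /* 2 loads per vertex */
}

/* Batched path: one load per pass, however many vertices there are. */
unsigned submit_batched(const mat4 *modelview, const mat4 *rotation,
                        const vec3 *pos, const vec3 *nrm,
                        vec3 *out_pos, vec3 *out_nrm, size_t count)
{
    matrix_loads = 0;
    load_matrix(rotation);                 /* Pass 1: all normals */
    for (size_t i = 0; i < count; i++)
        out_nrm[i] = transform(nrm[i], 0.0f);
    load_matrix(modelview);                /* Pass 2: all positions */
    for (size_t i = 0; i < count; i++)
        out_pos[i] = transform(pos[i], 1.0f);
    return matrix_loads;                   /* 2 loads in total */
}
```

Both functions write the same transformed data. Only the returned load count differs: it grows with the vertex count in the interleaved path and stays constant in the batched one.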

Another optimization I made over the weekend was reducing the number of Vertices output by my ZClipping algorithm.

By making 2 passes over the data, I am able to determine if the entire strip is inside or outside of the Clipping window.

If the strip does not cross the Clipping window boundary, the original strip is output intact, instead of being broken down into single triangles, which destroyed the efficiency of the strip, as was being done previously.
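A rough sketch of that 2-pass check, again in plain C with hypothetical names (`classify_strip`, `emit_strip`), not the real ZClipping code. It assumes a vertex counts as "inside" when its z is at or beyond a single clip plane `clip_z`, and it assumes a strip that is entirely outside can simply be dropped. The actual clipping of straddling triangles against the plane is omitted. The fallback only shows how many more Vertices the per-triangle path emits.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } vec3;

typedef enum { STRIP_ALL_INSIDE, STRIP_ALL_OUTSIDE, STRIP_STRADDLES } strip_class;

/* Pass 1: classify every vertex against the clip plane (assumed convention:
 * z >= clip_z means inside). */
strip_class classify_strip(const vec3 *v, size_t count, float clip_z)
{
    size_t inside = 0;
    for (size_t i = 0; i < count; i++)
        if (v[i].z >= clip_z)
            inside++;
    if (inside == count) return STRIP_ALL_INSIDE;
    if (inside == 0)     return STRIP_ALL_OUTSIDE;
    return STRIP_STRADDLES;
}

/* Pass 2: emit vertices. Returns the number of vertices written to out. */
size_t emit_strip(const vec3 *v, size_t count, float clip_z,
                  vec3 *out, size_t out_cap)
{
    size_t written = 0;
    if (count < 3)
        return 0;                          /* not a valid strip */

    switch (classify_strip(v, count, clip_z)) {
    case STRIP_ALL_INSIDE:                 /* fast path: keep the strip whole */
        assert(out_cap >= count);
        for (size_t i = 0; i < count; i++)
            out[written++] = v[i];
        return written;
    case STRIP_ALL_OUTSIDE:                /* assumption: nothing is visible */
        return 0;
    default:                               /* slow path: single triangles */
        assert(out_cap >= 3 * (count - 2));
        for (size_t i = 0; i + 2 < count; i++) {
            /* Swap the first two vertices on odd triangles to keep winding. */
            out[written++] = (i & 1) ? v[i + 1] : v[i];
            out[written++] = (i & 1) ? v[i]     : v[i + 1];
            out[written++] = v[i + 2];
            /* A real clipper would clip this triangle against clip_z here. */
        }
        return written;
    }
}
```

For a strip of n Vertices the fast path emits n Vertices, while the per-triangle fallback emits 3 * (n - 2), which is where the strip's efficiency was being lost.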

Some new screens, testing all of the changes I made over the weekend:

(1 dynamic light on the terrain)

(3 dynamic lights on the terrain)