开发者

Vertex shader world transform, why do we use 4 dimensional vectors?

From this site: http://www.toymaker.info/Games/html/vertex_shaders.html

We have the following code snippet:

// transformations provided by the app, constant Uniform data
float4x4 matWorldViewProj: WORLDVIEWPROJECTION;

// the format of our vertex data
struct VS_OUTPUT
{
  float4 Pos  : POSITION;
};

// Simple Vertex Shader - carry out transformation
VS_OUTPUT VS(float4 Pos  : POSITION)
{
  VS_OUTPUT Out = (VS_OUTPUT开发者_如何学编程)0;
  Out.Pos = mul(Pos,matWorldViewProj);
  return Out;
}

My question is: why does the struct VS_OUTPUT have a 4 dimensional vector as its position? Isn't position just x, y and z?


Because you need the w coordinate for perspective calculation. After you output from the vertex shader than DirectX performs a perspective divide by dividing by w.

Essentially if you have 32768, -32768, 32768, 65536 as your output vertex position then after w divide you get 0.5, -0.5, 0.5, 1. At this point the w can be discarded as it is no longer needed. This information is then passed through the viewport matrix which transforms it to usable 2D coordinates.

Edit: If you look at how a matrix multiplication is performed using the projection matrix you can see how the values get placed in the correct places.

Taking the projection matrix specified in D3DXMatrixPerspectiveLH

2*zn/w  0       0              0
0       2*zn/h  0              0
0       0       zf/(zf-zn)     1
0       0       zn*zf/(zn-zf)  0

And applying it to a random x, y, z, 1 (Note for a vertex position w will always be 1) vertex input value you get the following

x' = ((2*zn/w) * x) + (0 * y) + (0 * z) + (0 * w)
y' = (0 * x) + ((2*zn/h) * y) + (0 * z) + (0 * w)
z' = (0 * x) + (0 * y) + ((zf/(zf-zn)) * z) + ((zn*zf/(zn-zf)) * w)
w' = (0 * x) + (0 * y) + (1 * z) + (0 * w)

Instantly you can see that w and z are different. The w coord now just contains the z coordinate passed to the projection matrix. z contains something far more complicated.

So .. assume we have an input position of (2, 1, 5, 1) we have a zn (Z-Near) of 1 and a zf (Z-Far of 10) and a w (width) of 1 and a h (height) of 1.

Passing these values through we get

x' = (((2 * 1)/1) * 2
y' = (((2 * 1)/1) * 1
z' = ((10/(10-1)  * 5 + ((10 * 1/(1-10)) * 1)
w' = 5

expanding that we then get

x' = 4
y' = 2
z' = 4.4
w' = 5

We then perform final perspective divide and we get

x'' = 0.8
y'' = 0.4
z'' = 0.88
w'' = 1

And now we have our final coordinate position. This assumes that x and y ranges from -1 to 1 and z ranges from 0 to 1. As you can see the vertex is on-screen.

As a bizarre bonus you can see that if |x'| or |y'| or |z'| is larger than |w'| or z' is less than 0 that the vertex is offscreen. This info is used for clipping the triangle to the screen.

Anyway I think thats a pretty comprehensive answer :D

Edit2: Be warned i am using ROW major matrices. Column major matrices are transposed.


Rotation is specified by a 3 dimensional matrix and translation by a vector. You can perform both transforms in a "single" operation by combining them into a single 4 x 3 matrix:

rx1 rx2 rx3 tx1
ry1 ry2 ry3 ty1
rz1 rz2 rz3 tz1

However as this isn't square there are various operations that can't be performed (inversion for one). By adding an extra row (that does nothing):

0   0   0   1

all these operations become possible (if not easy).

As Goz explains in his answer by making the "1" a non identity value the matrix becomes a perspective transformation.


Clipping is an important part of this process, as it helps to visualize what happens to the geometry. The clipping stage essentially discards any point in a primitive that is outside of a 2-unit cube centered around the origin (OK, you have to reconstruct primitives that are partially clipped but that doesn't matter here).

It would be possible to construct a matrix that directly mapped your world space coordinates to such a cube, but gradual movement from the far plane to the near plane would be linear. That is to say that a move of one foot (towards the viewer) when one mile away from the viewer would cause the same increase in size as a move of one foot when several feet from the camera.

However, if we have another coordinate in our vector (w), we can divide the vector component-wise by w, and our primitives won't exhibit the above behavior, but we can still make them end up inside the 2-unit cube above.

For further explanations see http://www.opengl.org/resources/faq/technical/depthbuffer.htm#0060 and http://en.wikipedia.org/wiki/Transformation_matrix#Perspective_projection.

A simple answer would be to say that if you don't tell the pipeline what w is then you haven't given it enough information about your projection. This can be verified directly without understanding what the pipeline does with it...

As you probably know the 4x4 matrix can be split into parts based on what each part does. The 3x3 matrix at the top left is altered when you do rotation or scale operations. The fourth column is altered when you do a translation. If you ever inspect a perspective matrix, it alters the bottom row of the matrix. If you then look at how a Matrix-Vector multiplication is done, you see that the bottom row of the matrix ONLY affects the resultant w component of the vector. So if you don't tell the pipeline about w it won't have all your information.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜