Describing and Rendering a Scene

In the last few chapters, we’ve developed algorithms to draw 2D triangles on the canvas given their 2D coordinates, and we’ve explored the math required to transform the 3D coordinates of points in the scene to the 2D coordinates of points on the canvas.

At the end of the previous chapter, we cobbled together a program that used both to render a 3D cube on the 2D canvas. In this chapter, we’ll formalize and extend that work with the goal of rendering a whole scene containing an arbitrary number of objects.

Representing a Cube

Let’s think again about how to represent and manipulate a cube, this time with the goal of finding a more general approach. The edges of our cube are 2 units long and are parallel to the coordinate axes, and it’s centered on the origin, as shown in Figure 10-1.

Figure 10-1: Our standard cube

These are the coordinates of its vertices:

$A = (1, 1, 1)$

$B = (-1, 1, 1)$

$C = (-1, -1, 1)$

$D = (1, -1, 1)$

$E = (1, 1, -1)$

$F = (-1, 1, -1)$

$G = (-1, -1, -1)$

$H = (1, -1, -1)$

The sides of the cube are square, but the algorithms we have developed work with triangles. One of the reasons we chose triangles in the first place is that any other polygon, including squares, can be decomposed into triangles. So we’ll represent each square side of the cube using two triangles.

However, we can’t take any three vertices of the cube and expect them to describe a triangle on its surface (for example, ADG is inside the cube). This means that the vertex coordinates, by themselves, don’t fully describe the cube: we also need to know which sets of three vertices describe the triangles that make up its sides.

Here’s a possible list of triangles for our cube:

A, B, C
A, C, D
E, A, D
E, D, H
F, E, H
F, H, G
B, F, G
B, G, C
E, F, B
E, B, A
C, G, H
C, H, D

This suggests a generic structure we can use to represent any object made of triangles: a Vertices list, holding the coordinates of each vertex; and a Triangles list, specifying which sets of three vertices describe triangles on the surface of the object.

Each entry in the Triangles list may include additional information besides the vertices that make it up; for example, this would be the perfect place to specify the color of each triangle.

Since the most natural way to store this information is in two lists, we’ll use list indices to refer to the vertices in the vertex list. So our cube would be represented like this:

Vertices
0 = ( 1,  1,  1)
1 = (-1,  1,  1)
2 = (-1, -1,  1)
3 = ( 1, -1,  1)
4 = ( 1,  1, -1)
5 = (-1,  1, -1)
6 = (-1, -1, -1)
7 = ( 1, -1, -1)

Triangles
 0 = 0, 1, 2, red
 1 = 0, 2, 3, red
 2 = 4, 0, 3, green
 3 = 4, 3, 7, green
 4 = 5, 4, 7, blue
 5 = 5, 7, 6, blue
 6 = 1, 5, 6, yellow
 7 = 1, 6, 2, yellow
 8 = 4, 5, 1, purple
 9 = 4, 1, 0, purple
10 = 2, 6, 7, cyan
11 = 2, 7, 3, cyan
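
As a minimal sketch, the two lists above could be written in Python as follows. The `Triangle` class and the `lies_on_one_face` helper are illustrative names that don't appear in the original text; the helper only checks the earlier observation that three arbitrary vertices (such as A, D, G) need not lie on the same side of the cube.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triangle:
    v: tuple      # three indices into the vertex list
    color: str


# Vertices A..H of the standard cube, indexed 0..7.
CUBE_VERTICES = [
    ( 1,  1,  1), (-1,  1,  1), (-1, -1,  1), ( 1, -1,  1),
    ( 1,  1, -1), (-1,  1, -1), (-1, -1, -1), ( 1, -1, -1),
]

CUBE_TRIANGLES = [
    Triangle((0, 1, 2), "red"),    Triangle((0, 2, 3), "red"),
    Triangle((4, 0, 3), "green"),  Triangle((4, 3, 7), "green"),
    Triangle((5, 4, 7), "blue"),   Triangle((5, 7, 6), "blue"),
    Triangle((1, 5, 6), "yellow"), Triangle((1, 6, 2), "yellow"),
    Triangle((4, 5, 1), "purple"), Triangle((4, 1, 0), "purple"),
    Triangle((2, 6, 7), "cyan"),   Triangle((2, 7, 3), "cyan"),
]


def lies_on_one_face(indices, vertices):
    """Return True if the three vertices share the same x, y, or z value,
    which for this axis-aligned cube means they lie on the same side."""
    a, b, c = (vertices[i] for i in indices)
    return any(a[axis] == b[axis] == c[axis] for axis in range(3))
```

Every entry in `CUBE_TRIANGLES` passes this check, while the indices `(0, 3, 6)` (that is, A, D, G) do not.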

Rendering an object with this representation is quite simple: we first project every vertex, storing the results in a temporary projected vertices list (since each vertex of the cube is shared by four or five triangles, this avoids a lot of repeated work); then we go through the triangle list, rendering each individual triangle. A first approximation would look like this:

RenderObject(vertices, triangles) {
    projected = []
    for V in vertices {
        projected.append(ProjectVertex(V))
    }
    for T in triangles {
        RenderTriangle(T, projected)
    }
}

RenderTriangle(triangle, projected) {
    DrawWireframeTriangle(projected[triangle.v[0]],
                          projected[triangle.v[1]],
                          projected[triangle.v[2]],
                          triangle.color)
}

Listing 10-1: An algorithm to render any object made of triangles

We can go ahead and apply this directly to the cube as defined above, but the results won’t look good. This is because some of its vertices are behind the camera, which, as we discussed in the previous chapter, is a recipe for weird things. And if you look at the vertex coordinates and Figure 10-1, you’ll notice the coordinate origin, the position of our camera, is inside the cube.

To work around this problem, we’ll just move the cube. To do this, we need to move each vertex of the cube in the same direction. Let’s call this direction $\vec{T}$, for “translation.” We’ll translate the cube 7 units forward to make sure it’s completely in front of the camera. We’ll also translate it 1.5 units to the left to make it look more interesting. Since “forward” is the direction of $\vec{Z_+}$ and “left” is $\vec{X_-}$, the translation vector is simply

$\vec{T} = \begin{pmatrix} -1.5 \\ 0 \\ 7 \end{pmatrix}$

To compute the translated version $V'$ of each vertex $V$ in the cube, we just need to add the translation vector to it:

$V' = V + \vec{T}$

At this point, we can take the cube, translate each vertex, and then apply the algorithm in Listing 10-1 to get our first 3D cube (Figure 10-2).
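
To make these steps concrete, here is a minimal Python sketch of Listing 10-1 applied to the translated cube. The values of `D`, the viewport size, and the canvas size are assumptions chosen for illustration, and `draw_line` is a hypothetical callback that stands in for the line-drawing routine; here it only records the segments it is asked to draw.

```python
D = 1.0                          # camera-to-viewport distance (assumed)
VIEWPORT_W = VIEWPORT_H = 1.0    # viewport size (assumed)
CANVAS_W = CANVAS_H = 600        # canvas size in pixels (assumed)

CUBE_VERTICES = [
    ( 1,  1,  1), (-1,  1,  1), (-1, -1,  1), ( 1, -1,  1),
    ( 1,  1, -1), (-1,  1, -1), (-1, -1, -1), ( 1, -1, -1),
]

# Each entry is (v0, v1, v2, color).
CUBE_TRIANGLES = [
    (0, 1, 2, "red"),    (0, 2, 3, "red"),
    (4, 0, 3, "green"),  (4, 3, 7, "green"),
    (5, 4, 7, "blue"),   (5, 7, 6, "blue"),
    (1, 5, 6, "yellow"), (1, 6, 2, "yellow"),
    (4, 5, 1, "purple"), (4, 1, 0, "purple"),
    (2, 6, 7, "cyan"),   (2, 7, 3, "cyan"),
]


def translate(v, t):
    return (v[0] + t[0], v[1] + t[1], v[2] + t[2])


def project_vertex(v):
    """Perspective projection followed by the viewport-to-canvas mapping."""
    x, y, z = v
    vx, vy = x * D / z, y * D / z
    return (vx * CANVAS_W / VIEWPORT_W, vy * CANVAS_H / VIEWPORT_H)


def render_object(vertices, triangles, draw_line):
    projected = [project_vertex(v) for v in vertices]   # project each vertex once
    for a, b, c, color in triangles:
        pa, pb, pc = projected[a], projected[b], projected[c]
        draw_line(pa, pb, color)
        draw_line(pb, pc, color)
        draw_line(pc, pa, color)


T = (-1.5, 0, 7)
moved = [translate(v, T) for v in CUBE_VERTICES]

segments = []   # stand-in for a canvas: record each line instead of drawing it
render_object(moved, CUBE_TRIANGLES,
              lambda p, q, color: segments.append((p, q, color)))
```

After the translation every vertex has a positive $z$ coordinate, so the division in `project_vertex` is safe for this scene.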

Figure 10-2: Our cube, translated in front of the camera, rendered with wireframe triangles

Models and Instances

What if we want to render two cubes? A naive approach would be to create a new set of vertices and triangles describing a second cube. This would work, but it would waste a lot of memory. What if we wanted to render one million cubes?

A better approach is to think in terms of models and instances. A model is a set of vertices and triangles that describes a certain object in a generic way (think “a cube has eight vertices and six sides”). An instance of a model, on the other hand, describes a concrete occurrence of that model within the scene (think “there’s a cube at (0, 0, 5)”).

How do we apply this idea in practice? We can have a single description of each unique object in the scene and then place multiple copies of it by specifying their coordinates. Informally, it would be like saying, “This is what a cube looks like, and there’s cubes here, here and there.”

This is a rough approximation of how we’d describe a scene using this approach:

model {
    name = cube
    vertices {
        ...
    }
    triangles {
        ...
    }
}

instance {
    model = cube
    position = (-1.5, 0, 7)
}

instance {
    model = cube
    position = (1.25, 2, 7.5)
}

In order to render this, we just go through the list of instances; for each instance, we make a copy of the model’s vertices, translate them according to the position of the instance, and then render them as before (Listing 10-2):

RenderScene() {
    for I in scene.instances {
        RenderInstance(I)
    }
}

RenderInstance(instance) {
    projected = []
    model = instance.model
    for V in model.vertices {
        V' = V + instance.position
        projected.append(ProjectVertex(V'))
    }
    for T in model.triangles {
        RenderTriangle(T, projected)
    }
}

Listing 10-2: An algorithm to render a scene that can contain multiple instances of several objects, each in a different position
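
The memory argument is easiest to see in code. In the following Python sketch (the `Model` and `Instance` classes and the `world_vertices` helper are illustrative names, not part of the original text), both instances refer to the same `Model` object, and translated vertices are computed only when an instance is rendered.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    vertices: list
    triangles: list


@dataclass
class Instance:
    model: Model      # a reference to the shared model, not a copy
    position: tuple


def world_vertices(instance):
    """Translate the model's vertices to the instance position."""
    px, py, pz = instance.position
    return [(x + px, y + py, z + pz) for x, y, z in instance.model.vertices]


cube = Model(
    name="cube",
    vertices=[
        ( 1,  1,  1), (-1,  1,  1), (-1, -1,  1), ( 1, -1,  1),
        ( 1,  1, -1), (-1,  1, -1), (-1, -1, -1), ( 1, -1, -1),
    ],
    triangles=[(0, 1, 2), (0, 2, 3)],   # only one side listed, for brevity
)

scene = [
    Instance(cube, (-1.5, 0, 7)),
    Instance(cube, (1.25, 2, 7.5)),
]
```

Adding a third cube costs one more `Instance` (a reference plus a position), not another copy of the vertex and triangle lists.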

If we want this to work as expected, the coordinates of the vertices on the model should be defined in a coordinate system that “makes sense” for the object; we’ll call this coordinate system model space. For example, we defined our cube such that its center was (0, 0, 0); this means that when we say “a cube located at (1, 2, 3),” we mean “a cube centered around (1, 2, 3).”

After applying the instance translation to the vertices defined in model space, the transformed vertices are now expressed in the coordinate system of the scene; we’ll call this coordinate system world space.

There are no hard and fast rules to define a model space; it depends on the needs of your application. For example, if you have the model of a person, it might be sensible to place the origin of the coordinate system at their feet.

Figure 10-3 shows a simple scene with two instances of our cube.

Figure 10-3: A scene with two instances of the same cube model, placed in different positions

Model Transform

The scene definition we described above doesn’t give us a lot of flexibility. Since we can only specify the position of a cube, we could instantiate as many cubes as we wanted, but they would all be facing the same direction. In general, we want to have more control over the instances: we also want to specify their orientation and possibly their scale.

Conceptually, we can define a model transform with these three elements: a scaling factor, a rotation around the origin in model space, and a translation to a specific point in the scene:

instance {
    model = cube
    transform {
        scale = 1.5
        rotation = <45 degrees around the Y axis>
        translation = (1, 2, 3)
    }
}

We can extend the algorithm in Listing 10-2 to accommodate the new transforms. However, the order in which we apply the transforms is important; in particular, the translation must be done last. This is because most of the time we want to rotate and scale the instances around their origin in model space, so we need to do that before they’re transformed into world space.

To understand the difference in the results, take a look at Figure 10-4, which shows a $45^\circ$ rotation around the origin followed by a translation along the Z axis.

Figure 10-4: Applying rotation and then translation

Figure 10-5 shows the translation applied before the rotation.

Figure 10-5: Applying translation and then rotation

Strictly speaking, given a rotation followed by a translation, we can find a translation followed by a rotation (perhaps not around the origin) that achieves the same result. However, it’s far more natural to express this kind of transform using the first form.
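
A small numeric experiment shows why the order matters. The Python sketch below tracks the model's origin through both orderings; the `rotate_y` helper and its sign convention are assumptions made for illustration.

```python
import math


def rotate_y(p, degrees):
    """Rotate p around the Y axis; the sign convention here is an assumption."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    x, y, z = p
    return (x * c + z * s, y, -x * s + z * c)


def translate(p, t):
    return (p[0] + t[0], p[1] + t[1], p[2] + t[2])


center = (0.0, 0.0, 0.0)     # the model's origin in model space
move = (0.0, 0.0, 5.0)       # a translation along the Z axis

rotate_then_translate = translate(rotate_y(center, 45), move)
translate_then_rotate = rotate_y(translate(center, move), 45)
```

Rotating first leaves the center exactly where the translation puts it, at $(0, 0, 5)$. Translating first swings the whole object around the world origin, so its center ends up at roughly $(3.54, 0, 3.54)$: still 5 units away from the origin, but no longer on the Z axis.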

We can write a new version of RenderInstance that supports scale, rotation, and position:

RenderInstance(instance) {
    projected = []
    model = instance.model
    for V in model.vertices {
        V' = ApplyTransform(V, instance.transform)
        projected.append(ProjectVertex(V'))
    }
    for T in model.triangles {
        RenderTriangle(T, projected)
    }
}

Listing 10-3: An algorithm to render a scene that can contain multiple instances of several objects, each with a different transform

The ApplyTransform method looks like this:

ApplyTransform(vertex, transform) {
    scaled = Scale(vertex, transform.scale)
    rotated = Rotate(scaled, transform.rotation)
    translated = Translate(rotated, transform.translation)
    return translated
}

Listing 10-4: A function that applies transforms to a vertex in the correct order
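
As a hedged illustration of Listing 10-4, here is one way the same function might look in Python, using the example transform from earlier in this section (scale 1.5, a 45-degree rotation around the Y axis, translation $(1, 2, 3)$). The `Transform` class, the `rotation_y` helper, and the rotation's sign convention are assumptions; the rotation is stored as a $3 \times 3$ matrix.

```python
import math
from dataclasses import dataclass


@dataclass
class Transform:
    scale: float
    rotation: list        # 3x3 rotation matrix, as a list of rows
    translation: tuple


def rotation_y(degrees):
    """3x3 matrix for a rotation around the Y axis (sign convention assumed)."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s],
            [0, 1, 0],
            [-s, 0, c]]


def apply_transform(vertex, transform):
    # Scale and rotate around the model-space origin first...
    scaled = tuple(transform.scale * k for k in vertex)
    rotated = tuple(sum(m * k for m, k in zip(row, scaled))
                    for row in transform.rotation)
    # ...and only then move the result into world space.
    return tuple(r + t for r, t in zip(rotated, transform.translation))


t = Transform(scale=1.5, rotation=rotation_y(45), translation=(1, 2, 3))
result = apply_transform((1, 1, 1), t)
```

For the vertex $(1, 1, 1)$, `result` is approximately $(3.12, 3.5, 3.0)$.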

Camera Transform

The previous sections explored how we can position instances of models at different points in the scene. In this section, we’ll explore how to move and rotate the camera within the scene.

Imagine you’re a camera floating in the middle of a completely empty coordinate system. Suddenly, a red cube appears exactly in front of you (Figure 10-6).

Figure 10-6: A red cube appears in front of the camera.

A second later, the cube moves 1 unit toward you (Figure 10-7).

Figure 10-7: The red cube moves toward the camera . . . or does it?

But did the cube really move 1 unit toward you? Or did you move 1 unit toward the cube? Since there are no points of reference at all, and the coordinate system isn’t visible, there’s no way to tell just by looking at what you see, because the relative position of the cube and the camera is identical in both cases (Figure 10-8).

Figure 10-8: Without the coordinate system, we can’t tell whether it was the object or the camera that moved.

Now the cube rotates around you $45^\circ$ clockwise. Or does it? Maybe it was you who rotated $45^\circ$ counterclockwise? Again, there’s no way to tell (Figure 10-9).

Figure 10-9: Without the coordinate system, we can’t tell whether it was the object or the camera that rotated.

What this thought experiment shows is that there’s no difference between moving the camera around a fixed scene and keeping the camera fixed while rotating and translating the scene around it!

The advantage of this clearly self-centered vision of the universe is that by keeping the camera fixed at the origin and pointing at $\vec{Z_+}$, we can use the projection equations derived in the previous chapter without any modification. The coordinate system of the camera is called camera space.

Let’s assume the camera also has a transform attached to it, consisting of translation and rotation. In order to render the scene from the point of view of the camera, we need to apply the opposite transforms to each vertex of the scene:

$V_{translated} = V_{scene} - camera.translation$

$V_{cam\_space} = inverse(camera.rotation) \cdot V_{translated}$

$V_{projected} = perspective\_projection(V_{cam\_space})$

Note that we represent rotations using rotation matrices. Please refer to the Linear Algebra appendix for more details about this.
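
The first two equations can be sketched in a few lines of Python. The function names below are illustrative, and the code relies on a standard property of rotation matrices: the inverse of a rotation matrix is its transpose. The `LOOK_ALONG_X` matrix describes a camera turned 90 degrees around the Y axis so that it faces $\vec{X_+}$; its sign convention is an assumption.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    return tuple(sum(a * b for a, b in zip(row, v)) for row in m)


def to_camera_space(v_scene, cam_translation, cam_rotation):
    """Undo the camera's translation, then undo its rotation."""
    translated = tuple(a - b for a, b in zip(v_scene, cam_translation))
    # For a rotation matrix, the inverse is its transpose.
    return mat_vec(transpose(cam_rotation), translated)


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Camera turned 90 degrees around Y so that it looks along +X
# (exact integer entries; the sign convention is an assumption).
LOOK_ALONG_X = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]

moved_camera = to_camera_space((0, 0, 5), (0, 0, 1), IDENTITY)
turned_camera = to_camera_space((5, 0, 0), (0, 0, 0), LOOK_ALONG_X)
```

In the first case, a camera that has moved 1 unit forward sees the point at $(0, 0, 4)$, exactly as if the point had moved 1 unit toward a camera at the origin. In the second case, a point on the world's X axis ends up straight ahead of the turned camera, at $(0, 0, 5)$ in camera space.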

The Transform Matrix

Now that we can move both the camera and the instances around the scene, let’s take a step back and consider everything that happens to a vertex $V_{model}$ in model space until it’s projected into the canvas point $(c_x, c_y)$.

We first apply the model transform to go from model space to world space:

$V_{model\_scaled} = instance.scale \cdot V_{model}$

$V_{model\_rotated} = instance.rotation \cdot V_{model\_scaled}$

$V_{world} = V_{model\_rotated} + instance.translation$

Then we apply the camera transform to go from world space to camera space:

$V_{translated} = V_{world} - camera.translation$

$V_{camera} = inverse(camera.rotation) \cdot V_{translated}$

Next, we apply the perspective equations to get viewport coordinates:

$v_x = {{V_{camera}.x \cdot d} \over {V_{camera}.z}}$

$v_y = {{V_{camera}.y \cdot d} \over {V_{camera}.z}}$

And finally we map the viewport coordinates to canvas coordinates:

$c_x = {{v_x \cdot c_w} \over {v_w}}$

$c_y = {{v_y \cdot c_h} \over {v_h}}$

As you can see, it’s a lot of computation and a lot of intermediate values for each vertex. Wouldn’t it be nice if we could reduce all of that to a more compact and efficient form?

Let’s express the transforms as functions that take a vertex and return a transformed vertex. Let $C_T$ and $C_R$ be the camera translation and rotation; $I_R$, $I_S$, and $I_T$ the instance rotation, scale, and translation; $P$ the perspective projection; and $M$ the viewport-to-canvas mapping. If $V$ is the original vertex and $V'$ is the point on the canvas, we can express all the equations above like this:

$V' = M(P(C_R^{-1}(C_T^{-1}(I_T(I_R(I_S(V)))))))$

Ideally, we’d like a single transform $F$ that does whatever the series of original transforms does, but that has a much simpler expression:

$F = M \cdot P \cdot C_R^{-1} \cdot C_T^{-1} \cdot I_T \cdot I_R \cdot I_S$

$V' = F(V)$

Finding a simple way to represent $F$ isn’t trivial. Our main obstacle is that we express each transform in a different way: we express translation as the sum of a point and a vector, rotation as the multiplication of a matrix and a point, scaling as the multiplication of a real number and a point, and perspective projection as real number multiplications and divisions. But if we could express all the transforms in the same way, and if such a way had a mechanism to compose transforms, we’d get the simple transform we want.

Homogeneous Coordinates

Consider the expression $A = (1, 2, 3)$. Does $A$ represent a 3D point or a 3D vector? If we don’t know the context in which $A$ is used, there’s no way to know.

But let’s add a fourth value, called $w$, to mark $A$ as a point or a vector. If $w = 0$, it’s a vector; if $w = 1$, it’s a point. So the point $A$ is unambiguously represented as $A = (1, 2, 3, 1)$ and the vector $\vec{A}$ is represented as $(1, 2, 3, 0)$.

Since points and vectors share the same representation, these four-component coordinates are called homogeneous coordinates. Homogeneous coordinates have a far deeper and far more involved geometric interpretation, but that’s outside the scope of this book; here, we’ll just use them as a convenient tool.

Manipulating points and vectors expressed in homogeneous coordinates is compatible with their geometric interpretation. For example, subtracting two points produces a vector:

$(8, 4, 2, 1) - (3, 2, 1, 1) = (5, 2, 1, 0)$

Adding two vectors produces another vector:

$(0, 0, 1, 0) + (1, 0, 0, 0) = (1, 0, 1, 0)$

In the same way, it’s easy to see that adding a point and a vector produces a point, multiplying a vector by a scalar produces a vector, and so on, just as we expect.
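
These rules are easy to check mechanically. In the following Python sketch (the helper names are illustrative), homogeneous coordinates are plain 4-tuples, and the $w$ component of each result tells us whether we got a point or a vector.

```python
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(k, a):
    return tuple(k * x for x in a)


def kind(h):
    """Classify a homogeneous 4-tuple by its w component."""
    return "vector" if h[3] == 0 else "point"


p, q = (8, 4, 2, 1), (3, 2, 1, 1)       # two points
u, v = (0, 0, 1, 0), (1, 0, 0, 0)       # two vectors

results = {
    "point - point": sub(p, q),         # w: 1 - 1 = 0
    "vector + vector": add(u, v),       # w: 0 + 0 = 0
    "point + vector": add(q, sub(p, q)),  # w: 1 + 0 = 1
    "scalar * vector": scale(2, add(u, v)),  # w: 2 * 0 = 0
}
```

The arithmetic on $w$ does the bookkeeping for us: no operation needs a special case for points or vectors.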

So what do coordinates with a $w$ value other than $0$ or $1$ represent? They also represent points. In fact, any point in 3D has an infinite number of representations in homogeneous coordinates. What matters is the ratio between the coordinates and the $w$ value. For example, $(1, 2, 3, 1)$ and $(2, 4, 6, 2)$ represent the same point, as does $(-3, -6, -9, -3)$.

Of all of these representations, we call the one with $w = 1$ the canonical representation of the point in homogeneous coordinates; converting any other representation to its canonical representation or to its Cartesian coordinates is trivial:

$\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} x \over w \\[6pt] y \over w \\[6pt] z \over w \\[6pt] 1 \end{pmatrix} \rightarrow \begin{pmatrix} x \over w \\[6pt] y \over w \\[6pt] z \over w \end{pmatrix}$

So we can convert Cartesian coordinates to homogeneous coordinates, and back to Cartesian coordinates. But how does this help us find a single representation for all the transforms?
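
The conversions in both directions fit in a few lines of Python. The function names are illustrative; rejecting $w = 0$ is a choice made for this sketch, since such coordinates represent a vector rather than a point.

```python
def to_homogeneous(p):
    """Cartesian point -> canonical homogeneous coordinates (w = 1)."""
    return (p[0], p[1], p[2], 1)


def to_canonical(h):
    """Any homogeneous representation of a point -> the one with w = 1."""
    x, y, z, w = h
    if w == 0:
        raise ValueError("w = 0 represents a vector, not a point")
    return (x / w, y / w, z / w, 1)


def to_cartesian(h):
    """Homogeneous point -> Cartesian coordinates (divide by w, drop w)."""
    return to_canonical(h)[:3]


same_point = [to_cartesian(h) for h in
              [(1, 2, 3, 1), (2, 4, 6, 2), (-3, -6, -9, -3)]]
```

All three entries of `same_point` are the Cartesian point $(1, 2, 3)$.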

Homogeneous Rotation Matrix

Let’s begin with a rotation matrix. Converting a $3 \times 3$ rotation matrix in Cartesian coordinates to a $4 \times 4$ rotation matrix in homogeneous coordinates is trivial; since the $w$ coordinate of the point shouldn’t change, we add a column to the right, a row to the bottom, fill them with zeros, and place a $1$ in the lower-right element to keep the value of $w$:

$\begin{pmatrix} A & B & C \\ D & E & F \\ G & H & I \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} \rightarrow \begin{pmatrix} A & B & C & 0 \\ D & E & F & 0 \\ G & H & I & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix}$
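
The padding step can be sketched in Python as follows. The helper names are illustrative, and the sample matrix (a 90-degree rotation around the Y axis, with an assumed sign convention) was chosen because its entries are exact integers.

```python
def to_homogeneous_matrix(m3):
    """Embed a 3x3 matrix in a 4x4 matrix that leaves w untouched."""
    return [list(row) + [0] for row in m3] + [[0, 0, 0, 1]]


def mat_vec(m, v):
    return tuple(sum(a * b for a, b in zip(row, v)) for row in m)


ROT_Y_90 = [[0, 0, 1],
            [0, 1, 0],
            [-1, 0, 0]]
R4 = to_homogeneous_matrix(ROT_Y_90)

cartesian = mat_vec(ROT_Y_90, (1, 2, 3))        # 3x3 times a Cartesian point
rotated_point = mat_vec(R4, (1, 2, 3, 1))       # 4x4 times the same point
rotated_vector = mat_vec(R4, (1, 2, 3, 0))      # 4x4 times a vector
```

The first three components agree with the Cartesian result in both cases, and $w$ comes out unchanged: a rotated point is still a point, and a rotated vector is still a vector.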

Homogeneous Scale Matrix

A scaling matrix is also trivial in homogeneous coordinates, and it’s constructed in the same way as the rotation matrix:

$\begin{pmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & S_z \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \cdot S_x \\ y \cdot S_y \\ z \cdot S_z \end{pmatrix} \rightarrow \begin{pmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x \cdot S_x \\ y \cdot S_y \\ z \cdot S_z \\ 1 \end{pmatrix}$
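
A short Python sketch of this matrix follows; `scale_matrix` is an illustrative name. Note that the matrix form allows a different factor per axis, while the single scaling factor used in the model transform earlier corresponds to $S_x = S_y = S_z$.

```python
def scale_matrix(sx, sy, sz):
    """4x4 homogeneous scaling matrix; w is left untouched."""
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]


def mat_vec(m, v):
    return tuple(sum(a * b for a, b in zip(row, v)) for row in m)


per_axis = mat_vec(scale_matrix(2, 3, 4), (1, 1, 1, 1))
uniform = mat_vec(scale_matrix(1.5, 1.5, 1.5), (2, 4, 6, 1))
```

Here `per_axis` is $(2, 3, 4, 1)$ and `uniform` is $(3, 6, 9, 1)$; in both cases $w$ stays at $1$.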

Homogeneous Translation Matrix

The rotation and scale matrices were easy: they were already expressed as matrix multiplications in Cartesian coordinates, so we just had to add a $1$ to preserve the $w$ coordinate. But what can we do with a translation, which we had expressed as an addition in Cartesian coordinates?

We’re looking for a $4 \times 4$ matrix such that

$\begin{pmatrix} T_x \\ T_y \\ T_z \\ 0 \end{pmatrix} + \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} A & B & C & D \\ E & F & G & H \\ I & J & K & L \\ M & N & O & P \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x + T_x \\ y + T_y \\ z + T_z \\ 1 \end{pmatrix}$

Let’s focus on getting $x + T_x$ first. This value is the result of multiplying the first row of the matrix and the point—that is,

$\begin{pmatrix} A & B & C & D \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = x + T_x$

If we expand the vector multiplication, we get

$Ax + By + Cz + D = x + T_x$

From here we can deduce that $A = 1$, $B = C = 0$, and $D = T_x$.

Following a similar reasoning for the rest of the coordinates, we arrive at the following matrix expression for the translation:

$\begin{pmatrix} T_x \\ T_y \\ T_z \\ 0 \end{pmatrix} + \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & T_x \\ 0 & 1 & 0 & T_y \\ 0 & 0 & 1 & T_z \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x + T_x \\ y + T_y \\ z + T_z \\ 1 \end{pmatrix}$

Homogeneous Projection Matrix

Sums and multiplications are easy to express as multiplications of matrices and vectors because they involve, after all, sums and multiplications. But the perspective projection equations have a division by $z$. How can we express that?

You may be tempted to think that dividing by $z$ is the same as multiplying by $1/z$, and you may want to solve this problem by putting $1/z$ in the matrix. However, which $z$ coordinate would we put there? We want this projection matrix to work for every input point, so hardcoding the $z$ coordinate of any point would not give us what we want.

Fortunately, homogeneous coordinates do have one instance of a division: the division by the $w$ coordinate when converting back to Cartesian coordinates. If we can manage to make the $z$ coordinate of the original point appear as the $w$ coordinate of the “projected” point, we’ll get the projected $x$ and $y$ once we convert the point back to Cartesian coordinates:

$\begin{pmatrix} A & B & C & D \\ E & F & G & H \\ I & J & K & L \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x \cdot d \\ y \cdot d \\ z \end{pmatrix} \rightarrow \begin{pmatrix} x \cdot d \over z \\[6pt] y \cdot d \over z \end{pmatrix}$

Note that this matrix is $3 \times 4$; it can be multiplied by a four-element vector (the transformed 3D point in homogeneous coordinates) and it will yield a three-element vector (the projected 2D point in homogeneous coordinates), which is then converted to 2D Cartesian coordinates by dividing by $w$. This gives us exactly the values of $x'$ and $y'$ we were looking for. The missing element here is $z'$, which we know is equal to $d$ by definition.

Applying the same reasoning we used to deduce the translation matrix, we can express the perspective projection as follows:

$\begin{pmatrix} d & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x \cdot d \\ y \cdot d \\ z \end{pmatrix} \rightarrow \begin{pmatrix} x \cdot d \over z \\[6pt] y \cdot d \over z \end{pmatrix}$
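
To see the division by $w$ do the work of the division by $z$, here is a small Python sketch under stated assumptions: the helper names are hypothetical, the value $d = 2$ and the sample point are arbitrary, and the point is assumed to have $z \neq 0$ (otherwise the conversion would divide by zero).

```python
def make_projection_matrix(d):
    """Return the 3x4 perspective projection matrix for viewport distance d."""
    return [
        [d, 0, 0, 0],
        [0, d, 0, 0],
        [0, 0, 1, 0],
    ]


def multiply_mv(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (flat list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def homogeneous_to_cartesian(h):
    """Divide every coordinate by the last one (w) and drop w."""
    *coords, w = h
    return [c / w for c in coords]


P = make_projection_matrix(2)
point = [3, 6, 12, 1]                      # (x, y, z) = (3, 6, 12), w = 1

projected_h = multiply_mv(P, point)        # [x*d, y*d, z] = [6, 12, 12]
projected = homogeneous_to_cartesian(projected_h)  # [x*d/z, y*d/z] = [0.5, 1.0]
```

The original $z$ ends up in the $w$ slot of the three-element result, so the generic homogeneous-to-Cartesian conversion performs the perspective divide without the matrix needing to know $z$ in advance.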

Homogeneous Viewport-to-Canvas Matrix

The last step is mapping the projected point on the viewport to the canvas. This is just a 2D scaling transform with $S_x = {c_w \over v_w}$ and $S_y = {c_h \over v_h}$. This matrix is thus

$\begin{pmatrix} c_w \over v_w & 0 & 0 \\ 0 & c_h \over v_h & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \cdot c_w \over v_w \\[6pt] y \cdot c_h \over v_h \\[6pt] z \end{pmatrix}$

In fact, it’s easy to combine this with the projection matrix to get a simple 3D-to-canvas matrix:

$\begin{pmatrix} d \cdot c_w \over v_w & 0 & 0 & 0 \\ 0 & d \cdot c_h \over v_h & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} x \cdot d \cdot c_w \over v_w \\[6pt] y \cdot d \cdot c_h \over v_h \\[6pt] z \end{pmatrix} \rightarrow \begin{pmatrix} ({x \cdot d \over z})({c_w \over v_w}) \\[6pt] ({y \cdot d \over z})({c_h \over v_h}) \end{pmatrix}$
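
The combined matrix is simply the product of the viewport-to-canvas matrix and the projection matrix. The Python sketch below checks this numerically; the helper name `multiply_mm`, the variable names, and the sample values ($d = 2$, an $800 \times 600$ canvas, a $4 \times 3$ viewport) are illustrative assumptions, not values from the text.

```python
def multiply_mm(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def multiply_mv(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (flat list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


d, cw, ch, vw, vh = 2, 800, 600, 4, 3   # assumed sample values

viewport_to_canvas = [[cw / vw, 0, 0], [0, ch / vh, 0], [0, 0, 1]]
projection = [[d, 0, 0, 0], [0, d, 0, 0], [0, 0, 1, 0]]

# The 3D-to-canvas matrix written out directly, as in the equation above.
combined = [[d * cw / vw, 0, 0, 0], [0, d * ch / vh, 0, 0], [0, 0, 1, 0]]
product = multiply_mm(viewport_to_canvas, projection)

# Project one point and convert back to Cartesian canvas coordinates.
x, y, w = multiply_mv(combined, [1, 0.5, 4, 1])   # [400.0, 200.0, 4]
canvas_point = (x / w, y / w)                      # (100.0, 50.0)
```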

The Transform Matrix Revisited

After all this work, we can express every transform we need to convert a model vertex $V$ into a canvas pixel $V\, '$ as a matrix. Moreover, we can compose these transforms by multiplying their corresponding matrices. So we can express the whole sequence of transforms as a single matrix:

$F = M \cdot P \cdot C_R^{-1} \cdot C_T^{-1} \cdot I_T \cdot I_R \cdot I_S$

Now transforming a vertex is just a matter of computing the following matrix-by-point multiplication:

$V\, ' = F \cdot V$

Furthermore, we can decompose the transform into three parts:

$M_{Projection} = M \cdot P$

$M_{Camera} = C_R^{-1} \cdot C_T^{-1}$

$M_{Model} = I_T \cdot I_R \cdot I_S$

$F = M_{Projection} \cdot M_{Camera} \cdot M_{Model}$

These matrices don’t need to be computed from scratch for every vertex (that’s the point of using a matrix after all). Because matrix multiplication is associative, we can reuse the parts of the expression that don’t change.

$M_{Projection}$ should rarely change; it only depends on the size of the viewport and the size of the canvas. The size of the canvas changes when, for example, the application goes from windowed to fullscreen. The size of the viewport would only change if the field of view of the camera changes; this doesn’t happen very often.

$M_{Camera}$ may change every frame; it depends on the camera position and orientation, so if the camera is moving or turning, it needs to be recomputed. Once computed, though, it remains constant for every object drawn in the frame, so it would be computed at most once per frame.

$M_{Model}$ will be different for each instance in the scene; however, it will remain constant over time for instances that don’t move (for example, trees and buildings), so it can be computed once and stored in the scene itself. For objects that do move (for example, cars in a racing game) it needs to be computed every time they move (which is likely to be every frame).
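
The reuse described above relies only on associativity: multiplying the matrices together first and then applying the result to a vertex gives the same answer as applying each transform in turn. A minimal Python sketch follows, assuming hypothetical helper names, an unrotated camera at $(0, 0, -5)$, and arbitrary sample values for the other matrices.

```python
def multiply_mm(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def multiply_mv(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (flat list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def make_translation_matrix(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def make_scale_matrix(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]


# Assumed sample values: d * c_w / v_w = d * c_h / v_h = 2.
m_projection = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 1, 0]]
# Inverse translation of a camera at (0, 0, -5) with no rotation.
m_camera = make_translation_matrix(0, 0, 5)
# Instance transform: scale by 2, then translate by (1, 2, 3).
m_model = multiply_mm(make_translation_matrix(1, 2, 3), make_scale_matrix(2))

# Computed at most once per frame and shared by every instance.
m_projection_camera = multiply_mm(m_projection, m_camera)
# Computed once per instance and shared by all of its vertices.
f = multiply_mm(m_projection_camera, m_model)

vertex = [1, 1, 1, 1]
one_at_a_time = multiply_mv(
    m_projection, multiply_mv(m_camera, multiply_mv(m_model, vertex))
)
all_at_once = multiply_mv(f, vertex)   # same result with one multiplication per vertex
```

With the combined matrix, each vertex costs a single matrix-by-point multiplication, regardless of how many transforms went into it.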

At a very high level, the scene-rendering pseudocode would look like Listing 10-5.

RenderModel(model, transform) {
    projected = []
    for V in model.vertices {
        projected.append(ProjectVertex(transform * V))
    }
    for T in model.triangles {
        RenderTriangle(T, projected)
    }
}

RenderScene() {
    M_camera = MakeCameraMatrix(camera.position, camera.orientation)

    for I in scene.instances {
        M = M_camera * I.transform
        RenderModel(I.model, M)
    }
}

Listing 10-5: An algorithm to render a scene using transform matrices
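
For readers who want to run the listing, here is one possible Python rendition. It is a sketch under several assumptions that are not in the original: matrices are lists of rows, the camera orientation is a 4x4 rotation matrix (so its inverse is its transpose), the viewport and canvas sizes are arbitrary defaults, and instead of drawing on a canvas, `render_model` collects the projected triangle edges into a list.

```python
def multiply_mm(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def multiply_mv(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (flat list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def make_translation_matrix(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


IDENTITY = make_translation_matrix(0, 0, 0)


def make_camera_matrix(position, orientation):
    """Return C_R^-1 * C_T^-1: undo the camera translation, then its rotation."""
    inverse_rotation = [list(column) for column in zip(*orientation)]  # transpose
    inverse_translation = make_translation_matrix(*[-p for p in position])
    return multiply_mm(inverse_rotation, inverse_translation)


def project_vertex(v, d=1.0, cw=600, ch=600, vw=1.0, vh=1.0):
    """Perspective projection plus viewport-to-canvas mapping (assumes z != 0)."""
    x, y, z, _w = v
    return (x * d * cw / (z * vw), y * d * ch / (z * vh))


def render_model(model, transform, segments):
    projected = [project_vertex(multiply_mv(transform, list(v) + [1]))
                 for v in model["vertices"]]
    for a, b, c in model["triangles"]:
        # Stand-in for RenderTriangle: record the three wireframe edges.
        segments.extend([(projected[a], projected[b]),
                         (projected[b], projected[c]),
                         (projected[c], projected[a])])


def render_scene(scene, camera):
    segments = []
    m_camera = make_camera_matrix(camera["position"], camera["orientation"])
    for instance in scene["instances"]:
        m = multiply_mm(m_camera, instance["transform"])
        render_model(instance["model"], m, segments)
    return segments


triangle = {"vertices": [(0, 0, 0), (1, 0, 0), (0, 1, 0)], "triangles": [(0, 1, 2)]}
scene = {"instances": [
    {"model": triangle, "transform": IDENTITY},
    {"model": triangle, "transform": make_translation_matrix(1, 0, 5)},
]}
camera = {"position": (0, 0, -5), "orientation": IDENTITY}
segments = render_scene(scene, camera)   # six edges: three per triangle instance
```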

We can now draw a scene containing several instances of different models, possibly moving around and rotating, and we can move the camera throughout the scene. Figure 10-10 shows two instances of our cube model, each with a different transform (including translation and rotation), rendered from a translated and rotated camera.

Figure 10-10: A scene with two instances of the same cube model, having different instance transforms, and a transformed camera

Source code and live demo >>

Summary

We covered a lot of ground in this chapter. We first explored how to represent models made out of triangles. Then we figured out how to apply the perspective projection equation we derived in the previous chapter to entire models, so we can go from an abstract 3D model to its representation on the screen.

Next we developed a way to have multiple instances of the same model in the scene without having multiple copies of the model itself. Then we found out how to lift one of the limitations we had been working with so far: our camera no longer needs to be fixed at the origin of the coordinate system or pointing toward $\vec{Z+}$.

Finally, we explored how to represent all the transforms we need to apply to a vertex as matrix multiplications in homogeneous coordinates, and this allowed us to reduce the computations required to render a scene by condensing many of the consecutive transforms into just three matrices: one for the perspective projection and viewport-to-canvas mapping, one for the instance transform, and one for the camera transform.

This has given us a lot of flexibility in terms of what we can represent in a scene, and it also allows us to move the camera around the scene. But we still have two important limitations. First, moving the camera means we can end up with objects behind it, which causes all sorts of problems. Second, the rendering doesn’t look so great: it’s still a wireframe image.

Note that for practical reasons we won’t be using the full projection matrix in the rest of this book. Instead, we’ll use the model and camera transforms separately and then convert their results back to Cartesian coordinates as follows:

$x' = {x \cdot d \cdot c_w \over z \cdot v_w}$

$y' = {y \cdot d \cdot c_h \over z \cdot v_h}$

This lets us do some more operations in 3D that can’t be expressed as matrix transforms before we project the points.

In the next chapter, we’ll deal with objects that shouldn’t be visible, and then we’ll spend the rest of this book making the rendered objects look better.