I got my break as a graphics engineer at Acclaim working on the first Shadowman title, starting before I graduated (though I did graduate). Shadowman 2 was my first full game, and the studio’s first PS2 title. Development was troubled from the outset: a sorry tale of redundancies, management changes, toning down the content to avoid a mature rating, a script rewrite and engine change halfway through, senior staff leaving, and so on. I stuck it out, writing all of the core rendering code, mostly in ASM on the vector units. However, I then spent almost a year as lead, a time when the rendering code remained static because I was firefighting so many other fires. We never did implement triangle strips, an essential optimisation for PS2, though the world geometry did support quads. This was a first generation PS2 title, unfortunately delayed by a year.
Early on, character artist Robert Nash wanted an increase in poly budget to open up Shadowman’s chest, to which we added a point light and effect. This enabled the night to be truly dark without being unplayable, while still allowing monsters to lurk unseen. Additionally we wanted to support environment and weapon based point lights, with a maximum of four lights affecting any one ‘VU packet’ of world geometry at once. The problem was that we could not afford the required divide and sqrt instructions. (And this is per-vertex; the PS2 did not have pixel shaders.) Four lights is also a natural number for the PS2 VUs, since floating point SIMD instructions typically had a latency of 4 cycles between issue and being able to use the result, but you could issue an instruction every cycle.

The PlayStation 2 VUs had two pipes; the upper was a SIMD4 single precision floating point unit, which operated on each of .xyzw in parallel. From memory, divides were very expensive at 7 or 13 clock cycles (the general divide instruction being faster than the reciprocal instruction!), and sqrt and rsqrt were similarly expensive. Worse, these operated on a single floating point value; there was no vector4 divide or vector4 sqrt instruction.
The textbook point light implementation for four point lights would have required well over 100 clock cycles, and of course we also had to transform the vertex, apply 1/w to the UVs, etc. I needed something which ran an order of magnitude faster!
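For a sense of the baseline, here’s roughly what that textbook version looks like as scalar C. This is a reconstruction for illustration only (the names and the linear falloff are my assumptions, not the game’s code); the point is the sqrt and divides needed per light:

#include <math.h>

typedef struct { float x, y, z; } vec3;
typedef struct { vec3 pos; vec3 colour; float radius; } PointLight;

/* Textbook per-vertex point lighting: a sqrt and divides per light,
   exactly the instructions the VUs could not afford. */
vec3 lightVertexTextbook(vec3 p, vec3 n, const PointLight *lights, int count)
{
    vec3 result = { 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < count; ++i) {
        vec3 L = { lights[i].pos.x - p.x,
                   lights[i].pos.y - p.y,
                   lights[i].pos.z - p.z };
        float dist  = sqrtf(L.x * L.x + L.y * L.y + L.z * L.z);   /* sqrt   */
        float NdotL = (n.x * L.x + n.y * L.y + n.z * L.z) / dist; /* divide */
        float atten = 1.0f - dist / lights[i].radius;             /* divide */
        if (NdotL < 0.0f) NdotL = 0.0f;
        if (atten < 0.0f) atten = 0.0f;
        result.x += lights[i].colour.x * NdotL * atten;
        result.y += lights[i].colour.y * NdotL * atten;
        result.z += lights[i].colour.z * NdotL * atten;
    }
    return result;
}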
The first ‘trick’ was not to store vectors as .xyz, as it was almost impossible to use the w component of the 4-component registers effectively. (And this remained true throughout the next console generation as well, X360/PS3.) So the four point light positions were stored in three vector registers by packing all four lights’ position X components into one register, all Y into the next and all Z into the last (.xxxx, .yyyy, .zzzz). The PS2 VUs had a great instruction set which allowed you to code the following very efficiently:
float4 Lx = lightPositionX.xyzw - worldPos.x; // each light's X minus the vertex X, one light per lane
float4 Ly = lightPositionY.xyzw - worldPos.y;
float4 Lz = lightPositionZ.xyzw - worldPos.z;
float4 distToLightSqr = Lx * Lx + Ly * Ly + Lz * Lz; // squared distance to all four lights at once
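The packing itself was trivial CPU-side work done when building the light data for upload; a minimal C sketch, with hypothetical names:

typedef struct { float x, y, z; } vec3;
typedef struct { float x, y, z, w; } float4;

/* Pack four light positions into the .xxxx / .yyyy / .zzzz layout the
   VU code expects, so each SIMD lane handles one light. */
void packLightPositions(const vec3 pos[4], float4 *lx, float4 *ly, float4 *lz)
{
    lx->x = pos[0].x; lx->y = pos[1].x; lx->z = pos[2].x; lx->w = pos[3].x;
    ly->x = pos[0].y; ly->y = pos[1].y; ly->z = pos[2].y; ly->w = pos[3].y;
    lz->x = pos[0].z; lz->y = pos[1].z; lz->z = pos[2].z; lz->w = pos[3].z;
}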
The VU fragment above is quick, with no wasted register lanes. The next question was how to produce the attenuation term as cheaply as possible. This is easiest explained with a code fragment (though this isn’t what shipped):
float4 attenuation = 1.0 - distToLightSqr.xyzw * light.invRadius2.xyzw; // 1 - d^2/r^2 for all four lights
vertexColour += attenuation.x * lightColour0;
vertexColour += attenuation.y * lightColour1;
vertexColour += attenuation.z * lightColour2;
vertexColour += attenuation.w * lightColour3;
Here we pre-calculate 1/radius^2 on the CPU. In fact the above is a simplification for understanding. The final step was to multiply the attenuation calculation through by light.colour and fold the negation into the pre-computed term, so that we could use the multiply-add instruction:
vertexColour += distToLightSqr.x * lightColour0DivRadius2 + lightColour0; // = lightColour0 * (1 - d^2/r^2), one multiply-add
vertexColour += distToLightSqr.y * lightColour1DivRadius2 + lightColour1;
vertexColour += distToLightSqr.z * lightColour2DivRadius2 + lightColour2;
vertexColour += distToLightSqr.w * lightColour3DivRadius2 + lightColour3;
The CPU pre-calculated: -(lightColour / radius^2).
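So the CPU-side setup per light set amounted to something like this (a sketch, reusing the vec3 and PointLight types from the textbook fragment above):

/* Fold the attenuation denominator and the negation into the light
   colour once on the CPU, so the VU pays one multiply-add per light. */
void precomputeLightConstants(const PointLight lights[4], vec3 colourDivR2[4])
{
    for (int i = 0; i < 4; ++i) {
        float negInvR2 = -1.0f / (lights[i].radius * lights[i].radius);
        colourDivR2[i].x = lights[i].colour.x * negInvR2;
        colourDivR2[i].y = lights[i].colour.y * negInvR2;
        colourDivR2[i].z = lights[i].colour.z * negInvR2;
    }
}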
Just for completeness, here’s the whole ‘shader’ run on VU1. This of course looks nothing like modern point lighting code:
float4 Lx = lightPositionX.xyzw - worldPos.x;
float4 Ly = lightPositionY.xyzw - worldPos.y;
float4 Lz = lightPositionZ.xyzw - worldPos.z;
float4 distToLightSqr = Lx * Lx + Ly * Ly + Lz * Lz;
vertexColour += distToLightSqr.x * lightColour0DivRadius2 + lightColour0;
vertexColour += distToLightSqr.y * lightColour1DivRadius2 + lightColour1;
vertexColour += distToLightSqr.z * lightColour2DivRadius2 + lightColour2;
vertexColour += distToLightSqr.w * lightColour3DivRadius2 + lightColour3;
The eagle-eyed will have noticed there’s also no N.L term; I simply couldn’t afford that, so point lights attenuated with distance only, not with surface angle. But that was it: four point lights without divide or sqrt instructions, efficiently utilising all four SIMD lanes and the multiply-add instruction.
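For anyone who wants it in one runnable piece, here’s a scalar C equivalent of the VU1 lighting step, reusing vec3 from the textbook sketch (a reconstruction, not the shipped assembly). Note the contribution goes negative beyond a light’s radius; lights were chosen per VU packet, and I’d expect the final packed colour to be clamped, though that detail is an assumption:

/* Scalar equivalent of the VU1 'shader': distance-only attenuation,
   no divide, no sqrt, no N.L. colourDivR2[i] holds -(colour_i / r_i^2). */
vec3 lightVertexShipped(vec3 p, const vec3 lightPos[4],
                        const vec3 colour[4], const vec3 colourDivR2[4])
{
    vec3 result = { 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < 4; ++i) {
        float dx = lightPos[i].x - p.x;
        float dy = lightPos[i].y - p.y;
        float dz = lightPos[i].z - p.z;
        float d2 = dx * dx + dy * dy + dz * dz;
        /* colour * (1 - d^2/r^2), rearranged as a single multiply-add */
        result.x += d2 * colourDivR2[i].x + colour[i].x;
        result.y += d2 * colourDivR2[i].y + colour[i].y;
        result.z += d2 * colourDivR2[i].z + colour[i].z;
    }
    return result;
}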
This super cheap attenuation curve (1 - d^2/r^2) is not a great model of reality, being somewhat the inverse of a real light’s 1/d^2 falloff, but it worked well enough for magical ‘voodoo’ effects, with small/medium sized triangles, and with the rasterizer’s non-perspective-correct(!) interpolation of these radiance values across the triangle. (Only textures were perspective correct on the PS2; vertex colour was not.)
I shared the point lighting code with Acclaim Studios Cheltenham, who used it in the 60fps racer Extreme G3 on PS2. I’m not aware of many other PS2 games doing point lighting.
A personal bugbear was that the collision system was extremely ropey, and there was no separate collision skin, so the artists kept the world geometry simple in areas the player could traverse, with a lot of the triangle budget often going to the ceiling! I eventually rewrote the collision system entirely prior to shipping, but far too late to affect the art content.
Dynamic lighting for instanced objects (as opposed to world geometry) worked differently. Here the CPU converted point lights into directional lights before uploading to VU1; that is, the CPU computed a single distance attenuation value and a single light direction vector for the whole object. This meant you could compute the N.L angle attenuation very cheaply on the vector unit, but could not do per-vertex distance attenuation. I assume a lot of games did this.
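A sketch of that conversion, reusing the types from the earlier sketches; the names are hypothetical and I’m assuming the same attenuation curve as the world lighting. The sqrt and divides are fine here because they run once per object on the CPU, not per vertex:

#include <math.h>

/* Collapse a point light to a directional light at the object's centre:
   one direction and one pre-attenuated colour for the whole object.
   Assumes the light is not exactly at the object centre. */
void pointToDirectional(vec3 objCentre, const PointLight *l,
                        vec3 *outDir, vec3 *outColour)
{
    float dx = l->pos.x - objCentre.x;
    float dy = l->pos.y - objCentre.y;
    float dz = l->pos.z - objCentre.z;
    float d2 = dx * dx + dy * dy + dz * dz;
    float d  = sqrtf(d2);
    outDir->x = dx / d;  outDir->y = dy / d;  outDir->z = dz / d;

    float atten = 1.0f - d2 / (l->radius * l->radius);
    if (atten < 0.0f) atten = 0.0f;
    outColour->x = l->colour.x * atten;
    outColour->y = l->colour.y * atten;
    outColour->z = l->colour.z * atten;
}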
Shadowman’s chest light didn’t look great automatically converted to a directional light: being inside the model, the direction vector flicked wildly as the animation played. So I forced the light to point straight down, which also helped with his shadow, the visual aid for making jumps. The undesirable feature of using the render geometry as the collision geometry had a silver lining: I was able to use the collision geometry to project a disc between Shadowman’s feet and draw a shadow. The CPU found the subset of collision triangles overlapping the bounds of the shadow, then rendered them with UVs computed to form a disc, darkening the underlying geometry between Shadowman’s feet and following any undulations in the geometry perfectly. (Unlike Shadowman 1, which rendered Shadowman squashed onto a single plane in black, without any transparency.)
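The UV computation is just a planar projection around the feet; a minimal sketch of the idea (my reconstruction, not the shipped code):

/* Map a collision vertex into disc texture space: the disc texture is
   centred on the feet and spans 2 * radius in world units (X/Z plane).
   UVs outside 0..1 fall outside the disc. */
void shadowDiscUV(vec3 vert, vec3 foot, float radius, float *u, float *v)
{
    *u = (vert.x - foot.x) / (2.0f * radius) + 0.5f;
    *v = (vert.z - foot.z) / (2.0f * radius) + 0.5f;
}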
Something I wish I’d had time to code was an option to light large instanced objects the same way as world geometry, for the cases where the conversion of a point light to a directional light didn’t work well. Again, I was firefighting other problems all the way to shipping.
The liveside levels (as opposed to deadside) had a sun/moon, and a real time day/night cycle was essential to the game. Instanced objects periodically raycast to the sun, I think every 16 frames, and would fade their sun/moon contribution up or down over time. Liveside world geometry vertices had precalculated visibility of the sun for 16 different times of day, stored as individual bits; this same visibility was used at night for the moon. Something which really helped our RAM and storage problems was that the only world geometry that needed vertex normals was liveside outdoor sectors, so deadside levels and indoor sectors didn’t store vertex normals. We also had height fog, which was pretty unusual for the time; I don’t think it was even possible on PC back then with DirectX 7.
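Sampling those bits at runtime is then trivial; a sketch of how the 16 bits might have been laid out (the storage details are my assumption):

/* One bit per time-of-day slot: bit i set means this vertex can see
   the sun (or, at night, the moon) during slot i. 16 slots = 16 bits. */
typedef unsigned short SunVisBits;

int vertexSeesSun(SunVisBits bits, int timeOfDaySlot /* 0..15 */)
{
    return (bits >> timeOfDaySlot) & 1;
}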

I had a prototype of just-in-time world lighting on VU0 which was a lot faster, but it didn’t ship due to some edge cases I never found time to handle, including adding in the day/night cycle shadows. We did, however, do just-in-time vertex skinning on VU0, so skinning and rendering ran in parallel on the two vector units.
Something I prototyped after Shadowman 2 shipped was ‘dot3’ bump mapping on the PS2. This was not the full screen multi-pass algorithm Sony developed. Instead I uploaded 256 normals to VU0, computed N.L, and point-rendered the results into a texture palette set as the render target. The idea was to do distance attenuation per vertex as in Shadowman 2, then modulate it per texel with the N.L term from the texture. This did look good for the time (I wish I had a screenshot!), but the problems were that because the PS2 could only single-texture, you needed a pass per light, and then you had to render the geometry again to blend the albedo texture on top. We were going to try this for Forsaken 2, perhaps only for the ships and select models, avoiding organic shapes, so the 256 normal limitation was likely going to work out fine.
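In spirit, the per-light palette update looked something like this scalar C sketch, reusing vec3 from the earlier sketches (the real version computed N.L on VU0 and point-rendered into the palette; the 8-bit quantisation and the clamp here are my assumptions):

/* Rebuild the 256-entry palette (CLUT) for one light: each slot holds
   N.L for one of the 256 quantised normals, so an 8-bit indexed texture
   of normal indices becomes a per-texel lighting term when drawn. */
void buildDot3Palette(const vec3 normals[256], vec3 lightDir,
                      unsigned char palette[256])
{
    for (int i = 0; i < 256; ++i) {
        float NdotL = normals[i].x * lightDir.x
                    + normals[i].y * lightDir.y
                    + normals[i].z * lightDir.z;
        if (NdotL < 0.0f) NdotL = 0.0f;
        palette[i] = (unsigned char)(NdotL * 255.0f + 0.5f);
    }
}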
Sadly Forsaken 2 was canned, Acclaim folded, and I never found the opportunity to implement this tech in another title. Of course ‘dot3’ didn’t really take off, and was superseded by the far superior and now ubiquitous tangent space normal mapping.