A3: Lights by Dennis Sun (shaoqias)

For this assignment I implemented a forward-rendering lighting system on top of the existing Vulkan pipeline. Light data (sphere, spot, and sun lights) is loaded from the LIGHT entries in the s72 scene file into a flat Light struct array (position/type, direction/shadow-slot, tint and power, and per-type parameters) and uploaded every frame as a storage buffer. The fragment shader loops over all lights and accumulates contributions using a Lambertian diffuse term and a GGX/Smith PBR BRDF, with environment-map IBL as a base. For shadows, a dedicated ShadowPipeline renders depth to an R32_SFLOAT color attachment for up to 8 spot lights per frame; the shadow maps are then sampled with 4-sample PCF and a constant depth bias. The extra-credit features A3x-sort, A3x-soft, A3x-cascade, and A3x-cube were not completed due to time constraints.

My Lit Scene A3-create

Describe the scene you lit, the light count and placement approach, and include a screen recording showing it running in real-time:

Credit + cite your sources for textures and models, if you did not create them yourself.

Command-line Arguments (Viewer)

Document the command-line arguments that can be used to control your viewer. This will probably be a copy of the section from your A2 report with a few new flags relating to lights (if you added any).

--scene scene.s72 -- required -- load scene from scene.s72
--camera name -- optional -- view the scene through the camera name
--physical-device name -- optional -- use the physical device whose VkPhysicalDeviceProperties::deviceName matches name
--drawing-size w h -- optional -- set the initial size of the drawable part of the window in physical pixels of width w and height h
--culling mode -- optional -- sets the culling mode to be none|frustum|bvh
--headless -- optional -- if specified, run in headless mode (no windowing system connection), and read frame times and events from standard input. In headless mode, the flag --drawing-size specifies the size of the offscreen canvas that is rendered into
--test mode -- optional -- runs the A1-test section; mode can be cpu or gpu. Under cpu mode, the program automatically loads the CPU test scene "cpu-bottleneck.s72". Under gpu mode, the program loads the scene given by the --scene scene.s72 argument
--culled-count number -- optional -- combined with --test cpu argument to set the number of culled objects (outside the camera frustum) in the scene
--csv-file-name file_name -- optional -- the user-chosen log file name for the A1-test section
--exposure E -- optional -- scale computed radiance by 2^E before tone mapping. This simulates camera exposure adjustment and is always applied prior to the tone-mapping operator
--tone-map mode -- optional -- select the tone-mapping operator used to convert HDR radiance to displayable color. Supported modes are linear (no additional mapping after exposure) and reinhard (nonlinear compression of high luminance values)
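
As a rough illustration of how --exposure and --tone-map combine, the sketch below scales a radiance value by 2^E and then applies the selected operator. The function name tone_map is hypothetical, and the Reinhard form x / (1 + x) is an assumption, since the report only describes reinhard as a nonlinear compression of high luminance values.

```cpp
#include <cmath>
#include <string>

// Sketch only: apply exposure, then the selected tone-mapping operator,
// to a single radiance channel.
// ASSUMPTION: "reinhard" is the basic x / (1 + x) operator; the report
// does not state the exact formula used by the viewer.
inline float tone_map(float radiance, float exposure_E, const std::string &mode) {
    // Exposure is always applied first: scale by 2^E.
    float x = radiance * std::exp2(exposure_E);
    if (mode == "reinhard") {
        return x / (1.0f + x); // compresses large values toward 1
    }
    return x; // "linear": no additional mapping after exposure
}
```

With E = 1, a radiance of 1.0 becomes 2.0 under linear; with E = 0, a radiance of 1.0 maps to 0.5 under this Reinhard form.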

Evidence

The purpose of this section is to get you to think critically about your code by providing evidence sufficient to demonstrate to course staff that it works. These thoughts may also help you improve the code as you work on it in the Final project and beyond.

Loading light objects A3-load

During scene graph traversal in SceneViewer.cpp, each node that references a "LIGHT" object is parsed and stored as a LightInfo struct in a flat std::vector<LightInfo>. The struct records the light type (sun / sphere / spot), world-space position and direction (computed from the node's transform), the tint color, and the type-specific intensity value — strength for sun lights and power for sphere and spot lights — kept as separate fields. At render time these are repacked into GPU-side Light structs (one storage buffer per frame) where TINT_STRENGTH.xyz holds the raw tint and TINT_STRENGTH.w holds the raw power/strength; tint and power are multiplied together in the fragment shader. The one exception is the sky/sun environment energy stored in the World uniform, where tint and strength are premultiplied into SUN_ENERGY / SKY_ENERGY at update time since those values are consumed directly without a per-light loop.

Adding Direct Lighting to Materials A3-materials

All lights are passed to materials through a single GPU-side storage buffer bound at set 0, binding 4, shared across all draw calls in a frame. Each entry in the buffer is a 64-byte Light struct composed of four vec4s: POSITION_TYPE (xyz = world-space position, w = type: 0=sun, 1=sphere, 2=spot), DIRECTION_SHADOW (xyz = normalized direction, w = shadow-map slot index, or −1 if unshadowed), TINT_STRENGTH (rgb = tint, w = power/strength), and PARAMS (per-type parameters: cone angle, radius, range limit, blend factor). The total light count is carried in the World uniform buffer (set 0, binding 0) as LIGHT_COUNT, so the fragment shader knows how many entries to iterate over. Every frame, Tutorial.cpp converts the flat std::vector<LightInfo> from the scene viewer into this packed array, uploads it via a host-coherent staging buffer (Lights_src), and copies it to a device-local buffer (Lights); the descriptor is updated once when the buffer is (re)allocated and remains valid for subsequent frames until the light count grows.
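
The 64-byte layout described above can be sketched on the CPU side as follows. The field names POSITION_TYPE, DIRECTION_SHADOW, TINT_STRENGTH, and PARAMS come from the report; the Vec4 helper, the LightInfo field names, the pack_light function, and the order of the four PARAMS components are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

struct Vec4 { float x, y, z, w; };

// GPU-side entry: four vec4s, 64 bytes total.
struct Light {
    Vec4 POSITION_TYPE;    // xyz = world-space position, w = type (0=sun, 1=sphere, 2=spot)
    Vec4 DIRECTION_SHADOW; // xyz = normalized direction, w = shadow slot, or -1 if unshadowed
    Vec4 TINT_STRENGTH;    // rgb = raw tint, w = raw power/strength
    Vec4 PARAMS;           // per-type parameters (component order is an assumption)
};
static_assert(sizeof(Light) == 64, "Light must match the 64-byte buffer stride");
static_assert(offsetof(Light, PARAMS) == 48, "PARAMS must be the fourth vec4");

// Hypothetical CPU-side record; the real LightInfo fields may differ.
struct LightInfo {
    int type;           // 0=sun, 1=sphere, 2=spot
    float position[3];  // world space
    float direction[3]; // world space, normalized
    float tint[3];
    float intensity;    // strength (sun) or power (sphere/spot)
    int shadow_slot;    // -1 if unshadowed
    float params[4];
};

// Repack one LightInfo; tint and intensity stay separate and are
// multiplied later in the fragment shader.
inline Light pack_light(const LightInfo &li) {
    Light l;
    l.POSITION_TYPE = {li.position[0], li.position[1], li.position[2], float(li.type)};
    l.DIRECTION_SHADOW = {li.direction[0], li.direction[1], li.direction[2], float(li.shadow_slot)};
    l.TINT_STRENGTH = {li.tint[0], li.tint[1], li.tint[2], li.intensity};
    l.PARAMS = {li.params[0], li.params[1], li.params[2], li.params[3]};
    return l;
}

// Repack the whole flat list; the result's size is what LIGHT_COUNT would carry.
inline std::vector<Light> pack_lights(const std::vector<LightInfo> &infos) {
    std::vector<Light> out;
    out.reserve(infos.size());
    for (const LightInfo &li : infos) out.push_back(pack_light(li));
    return out;
}
```

The static_asserts guard against accidental padding, which would desynchronize the CPU struct from the stride the shader expects.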

Recording 1: lambertian and pbr materials under sphere, spot, and sun lights.

Recording 2: lambertian and pbr materials under sphere, spot, and sun lights.

Recording 3: lambertian and pbr materials under sphere, spot, and sun lights.

For the light-count performance test, the scene consists of a single 70×70 lambertian floor plane (checker texture) with N unshadowed sphere lights (radius=0, power=200, limit=20) arranged in a uniform grid 3 units above the surface, and a fixed overhead camera at [0, 0, 50] looking straight down. Each data point is the average gpu_draw_ms over 190 frames (first 10 warmup frames discarded), measured on Apple M3 Max via Vulkan timestamp queries. The increase is nearly linear for N ≥ 20, as expected from the per-light loop in the fragment shader. At the low end (N < 20) the fixed GPU overhead dominates and the curve appears flat. Even at 1000 lights the GPU draw time is only ~4.8 ms, well under the 33 ms budget for 30 fps, so the viewer handles at least 1000 lights at a comfortable frame rate on this hardware.
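
The averaging step used for each data point can be sketched as below. The function name mean_after_warmup is hypothetical; it takes per-frame gpu_draw_ms samples and discards the first warmup entries before averaging.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Sketch only: mean of the samples that remain after skipping the first
// `warmup` entries. Returns 0 if no samples remain.
inline double mean_after_warmup(const std::vector<double> &gpu_draw_ms, std::size_t warmup) {
    if (gpu_draw_ms.size() <= warmup) return 0.0;
    auto first = gpu_draw_ms.begin() + static_cast<std::ptrdiff_t>(warmup);
    double sum = std::accumulate(first, gpu_draw_ms.end(), 0.0);
    return sum / static_cast<double>(gpu_draw_ms.size() - warmup);
}
```

For a 200-frame run with warmup = 10, this averages the remaining 190 frames, matching the procedure described above.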

GPU draw time (ms) as a function of the number of sphere lights, measured on Apple M3 Max. Shadows disabled. Scene: 70×70 lambertian floor, overhead camera, N unshadowed sphere lights in a grid.

Adding Shadows for Spot Lights A3-shadows

Shadow maps are rendered at the very beginning of each frame's command buffer, before the main color pass. For each spot light whose "shadow" field is non-zero, a dedicated ShadowPipeline render pass draws all scene geometry into a square R32_SFLOAT color attachment whose side length is the light's "shadow" value (plus a D32_SFLOAT depth attachment for hardware depth testing). Writing depth to a float color attachment, rather than sampling a depth-format image, avoids macOS/MoltenVK restrictions on depth texture sampling. The shadow camera is set up each frame with a perspective projection whose field of view matches the spot light's fov, near plane = max(radius, 1.0), far plane = limit (or 1000 if unlimited), and a lookAt aimed along the spot's emission direction. The world-to-clip matrix for each shadow slot is also uploaded to a ShadowMatricesUniform buffer (set 0, binding 6) so the main pass can reproject surface points into shadow space.
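
The near/far selection and the projection for one shadow camera might look like the sketch below. The names ShadowFrustum, spot_shadow_frustum, shadow_projection, and projected_depth are hypothetical. It assumes that a limit of 0 or less encodes "unlimited", a square shadow map (aspect ratio 1), a right-handed eye space looking down -z, and a [0, 1] clip-space depth range; the report does not state these conventions.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

struct ShadowFrustum {
    float vfov;   // full field of view in radians, taken from the spot light
    float z_near;
    float z_far;
};

// near = max(radius, 1.0), far = limit (or 1000 if unlimited).
// ASSUMPTION: limit <= 0 means the light has no range limit.
inline ShadowFrustum spot_shadow_frustum(float fov, float radius, float limit) {
    ShadowFrustum f;
    f.vfov = fov;
    f.z_near = std::max(radius, 1.0f);
    f.z_far = (limit > 0.0f) ? limit : 1000.0f;
    return f;
}

// Column-major 4x4 perspective matrix (index = column * 4 + row),
// aspect ratio 1, depth mapped to [0, 1].
inline std::array<float, 16> shadow_projection(const ShadowFrustum &f) {
    float e = 1.0f / std::tan(f.vfov * 0.5f);
    float n = f.z_near;
    float fa = f.z_far;
    std::array<float, 16> m{}; // all zeros
    m[0] = e;
    m[5] = e;
    m[10] = fa / (n - fa);
    m[11] = -1.0f;
    m[14] = -(fa * n) / (fa - n);
    return m;
}

// Depth after perspective divide for a point `d` units in front of the light.
inline float projected_depth(const std::array<float, 16> &m, float d) {
    float z_eye = -d;
    float clip_z = m[10] * z_eye + m[14];
    float clip_w = m[11] * z_eye;
    return clip_z / clip_w;
}
```

Under these assumptions, a point on the near plane projects to depth 0 and a point on the far plane to depth 1; the full world-to-clip matrix would be this projection multiplied by the lookAt view matrix.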

The shadow pipeline culls back faces (VK_CULL_MODE_BACK_BIT), so only the light-facing front surfaces of occluders are stored in the map. Because those are the same surfaces that are later shaded, culling alone does not prevent self-shadowing, and shadow acne is handled with a depth bias instead. No hardware depth bias is applied at rasterization time; a constant bias of 0.001 is subtracted from the surface's shadow-space depth at sample time in objects.frag before comparing against the stored value.

The shadow maps are provided to materials as a fixed-size array sampler2D SHADOW_MAPS[8] bound at set 0, binding 5. Unused slots are filled with a 1×1 dummy white texture so all 8 descriptor slots are always valid. Each spot light stores its assigned slot index in DIRECTION_SHADOW.w; a value of −1 means unshadowed and the shadow lookup is skipped entirely.
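
One way to hand out the 8 slots is sketched below. The first-come, first-served policy, the SpotLight struct, and the function name assign_shadow_slots are assumptions; the report only states that up to 8 spot lights receive a slot and that -1 marks an unshadowed light.

```cpp
#include <vector>

constexpr int kMaxShadowMaps = 8; // size of SHADOW_MAPS[8]

// Hypothetical per-light record used only for this sketch.
struct SpotLight {
    int shadow_resolution; // the s72 "shadow" field; 0 = no shadow requested
    int shadow_slot;       // output: index into SHADOW_MAPS, or -1
};

// Assign slots in scene order until all 8 are used.
// Returns the number of slots in use; the remaining slots would be
// bound to the 1x1 dummy white texture.
inline int assign_shadow_slots(std::vector<SpotLight> &spots) {
    int next = 0;
    for (SpotLight &s : spots) {
        if (s.shadow_resolution != 0 && next < kMaxShadowMaps) {
            s.shadow_slot = next++;
        } else {
            s.shadow_slot = -1; // unshadowed: shader skips the lookup
        }
    }
    return next;
}
```

The resulting shadow_slot value is what would be written into DIRECTION_SHADOW.w for each light.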

The PCF filter takes 4 samples arranged in a 2×2 sub-pixel kernel with offsets (±0.5, ±0.5) texels. Each sample performs a manual depth comparison ((ref_depth − bias) ≤ stored_depth ? 1.0 : 0.0) and the four results are averaged with equal weight (multiplied by 0.25), producing a soft penumbra about one texel wide along shadow edges.
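
The filter can be mirrored on the CPU as a small sketch, which is handy for reasoning about edge cases. The function name pcf4 and the fetch callback (which stands in for the shadow-map texture read, taking coordinates in texels) are hypothetical; the offsets, comparison, bias, and weighting follow the description above.

```cpp
#include <functional>

// Sketch only: 4-sample PCF with a 2x2 kernel at (+-0.5, +-0.5) texels.
// `fetch(u, v)` returns the stored shadow-map depth at texel coordinates
// (u, v); in the shader this would be a texture lookup.
inline float pcf4(const std::function<float(float, float)> &fetch,
                  float u, float v, float ref_depth, float bias = 0.001f) {
    const float offsets[4][2] = {
        {-0.5f, -0.5f}, {0.5f, -0.5f}, {-0.5f, 0.5f}, {0.5f, 0.5f}};
    float lit = 0.0f;
    for (const auto &o : offsets) {
        float stored = fetch(u + o[0], v + o[1]);
        // Manual depth comparison: 1 if the surface is not behind the occluder.
        lit += ((ref_depth - bias) <= stored) ? 1.0f : 0.0f;
    }
    return lit * 0.25f; // equal-weight average
}
```

The result takes one of five values (0, 0.25, 0.5, 0.75, 1), which is what produces the narrow gradient along a shadow edge.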

Spot light shadows on both lambertian (checker) and pbr (stone tiles) materials — wide view.

Spot light shadows on both lambertian and pbr materials — close-up view.

Shadow map resolution: 32×32 — heavily pixelated shadow edges.

Shadow map resolution: 512×512 — smooth shadow edges.

Include a graph showing the performance impact of adding shadowing to lights to a scene. Attempt to separate the performance impact of shadow map rendering and shadow map sampling (e.g., by testing the same scenes with per-frame rendered shadow maps and pre-rendered shadow maps). Which is larger?

Extra: Light Sort A3x-sort

Cover, at least: your chosen method for sorting lights to meshes.

Build a scene in which your light sorting technique provides a performance improvement over rendering all meshes with all lights. Include a screen shot (and the scene itself).

Include data demonstrating the performance improvement in rendering this scene. E.g., a graph of the frame times for an animated fly-through of the scene with and without your sorting code enabled.

Extra: PCSS Soft Shadows A3x-soft

Cover, at least: your implementation of PCSS (sampling pattern, counts)

Include images showing the same shadows rendered with and without PCSS, showing the spreading behavior as the shadow stretches further from the light.

Extra: Shadows for Sun Lights A3x-cascade

Cover, at least: your choice of shadow map cascade levels and layout; how your cascade is packed into a texture; how [if at all] your cascade avoids "boiling" as the camera moves

Include images showing a scene rendered with a shadow-casting distant directional light. Include images with a modified shader color-coding pixels by what cascade level they are sampling. Include images from a debug camera, showing how the shadow map cascade fits the camera frustum.

Extra: Shadows for Sphere Lights A3x-cube

Cover, at least: how you chose to set up cameras for cube map rendering

Include images or video showing a scene rendered with a shadow-casting sphere light. Show that there are no artifacts at the edges or corners of the shadow map cube.

Extra: Other A3x-*

If you have received instructor permission to pursue another extra credit activity, include information about that activity here.


Feedback

This is the end of the structured report. Feel free to add feedback about A1 to this section.