Fast Fourier transform (FFT) height-field ocean simulations, popularized by Jerry Tessendorf, have been used in the industry for many years.

We developed a CUDA-optimized version capable of rendering a 512×512 ocean patch at 600 fps on an Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz with an NVIDIA GeForce GTX 1060.

Using CUDA optimization techniques such as streaming, dynamic shared memory, privatization, and memory coalescing, we sped up the naive CUDA implementation by more than 400%.

This project was done in collaboration with Daryl Teo: https://www.linkedin.com/in/dteowm/

There are four parts in the ocean simulation:

## 1. Generating the Modified Phillips Spectrum

This step generates a wave spectrum in frequency space using complex-number arithmetic and sine and cosine functions. The output is a frequency-domain image.
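As a rough illustration of this step, here is a CPU-side C++ sketch rather than the project's CUDA kernel. It assumes the commonly used Tessendorf form of the Phillips spectrum with an extra term that suppresses very short waves; the exact modification used in this project is not stated here. `OceanParams`, `phillips`, `generateH0`, `evolve`, and all default values are hypothetical names and numbers chosen for the example. In a CUDA port, each grid point would typically map to one thread.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <random>
#include <vector>

// Hypothetical parameter bundle; names and defaults are illustrative only.
struct OceanParams {
    int   n         = 64;      // grid resolution (n x n)
    float patchSize = 100.0f;  // world-space patch length
    float amplitude = 4e-4f;   // overall spectrum scale A
    float windSpeed = 20.0f;   // wind speed V
    float windDirX  = 1.0f;    // normalized wind direction
    float windDirY  = 0.0f;
    float smallWave = 0.1f;    // suppression length for very short waves
};

constexpr float kGravity = 9.81f;

// Phillips spectrum P(k), with exponential suppression of very short waves.
float phillips(float kx, float ky, const OceanParams& p) {
    const float k2 = kx * kx + ky * ky;
    if (k2 < 1e-12f) return 0.0f;  // no energy in the DC component
    const float L = p.windSpeed * p.windSpeed / kGravity;  // largest wind-driven wave
    const float kDotW = (kx * p.windDirX + ky * p.windDirY) / std::sqrt(k2);
    return p.amplitude * std::exp(-1.0f / (k2 * L * L)) / (k2 * k2)
         * kDotW * kDotW
         * std::exp(-k2 * p.smallWave * p.smallWave);
}

// Initial amplitudes h0(k) = (xi_r + i*xi_i) * sqrt(P(k) / 2), with xi ~ N(0, 1).
std::vector<std::complex<float>> generateH0(const OceanParams& p, unsigned seed) {
    assert(p.n > 0 && p.patchSize > 0.0f);
    std::mt19937 rng(seed);
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    std::vector<std::complex<float>> h0(static_cast<std::size_t>(p.n) * p.n);
    const float twoPi = 6.28318530718f;
    for (int y = 0; y < p.n; ++y) {
        for (int x = 0; x < p.n; ++x) {
            // Centre the wave vectors so that k = 0 sits at (n/2, n/2).
            const float kx = twoPi * static_cast<float>(x - p.n / 2) / p.patchSize;
            const float ky = twoPi * static_cast<float>(y - p.n / 2) / p.patchSize;
            const float s = std::sqrt(0.5f * phillips(kx, ky, p));
            const float xr = gauss(rng);
            const float xi = gauss(rng);
            h0[static_cast<std::size_t>(y) * p.n + x] = std::complex<float>(xr * s, xi * s);
        }
    }
    return h0;
}

// Animate one mode: h(k, t) = h0(k) e^{i w t} + conj(h0(-k)) e^{-i w t}.
// This is where the sine and cosine evaluations come in.
std::complex<float> evolve(std::complex<float> h0k, std::complex<float> h0MinusK,
                           float kLength, float t) {
    const float w = std::sqrt(kGravity * kLength);  // deep-water dispersion relation
    const std::complex<float> e(std::cos(w * t), std::sin(w * t));
    return h0k * e + std::conj(h0MinusK) * std::conj(e);
}
```

Waves travelling perpendicular to the wind get zero energy in this form, which is what gives the spectrum image its directional look.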

## 2. Inverse Fast Fourier Transform

Using the inverse fast Fourier transform in cuFFT, we convert the Modified Phillips Spectrum from the frequency domain to the spatial domain, producing a spatial-domain image.
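The project relies on cuFFT for this step. To show what the transform does, the sketch below is a small CPU-side radix-2 inverse FFT in C++, applied to rows and then columns. It is a stand-in for illustration, not the cuFFT code path, and `inverseFft` and `inverseFft2d` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// In-place inverse FFT (iterative radix-2). The size must be a power of two.
void inverseFft(std::vector<Complex>& a) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }

    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        // A positive exponent sign selects the inverse transform.
        const Complex wLen = std::polar(1.0, 2.0 * pi / static_cast<double>(len));
        for (std::size_t i = 0; i < n; i += len) {
            Complex w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const Complex u = a[i + k];
                const Complex v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
                w *= wLen;
            }
        }
    }
    for (Complex& x : a) x /= static_cast<double>(n);  // 1/n normalization
}

// 2D inverse FFT on a row-major n x n grid: transform every row, then every column.
void inverseFft2d(std::vector<Complex>& grid, std::size_t n) {
    assert(grid.size() == n * n);
    std::vector<Complex> line(n);
    for (std::size_t y = 0; y < n; ++y) {
        for (std::size_t x = 0; x < n; ++x) line[x] = grid[y * n + x];
        inverseFft(line);
        for (std::size_t x = 0; x < n; ++x) grid[y * n + x] = line[x];
    }
    for (std::size_t x = 0; x < n; ++x) {
        for (std::size_t y = 0; y < n; ++y) line[y] = grid[y * n + x];
        inverseFft(line);
        for (std::size_t y = 0; y < n; ++y) grid[y * n + x] = line[y];
    }
}
```

One difference worth keeping in mind: cuFFT transforms are unnormalized, so with cuFFT the 1/n scaling shown here would have to be applied separately or folded into the spectrum amplitude.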

## 3. Generating Wave Geometry

Applying the output from the previous step as a displacement on the ocean grid gives us an ocean-like planar geometry.
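A minimal C++ sketch of this displacement step follows, assuming the spatial-domain output is a row-major array of heights and the grid lies in the XZ plane. `Vec3`, `buildOceanVertices`, and `buildGridIndices` are hypothetical names, and the triangle winding is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Lay out an n x n grid in the XZ plane and lift each vertex by its height sample.
std::vector<Vec3> buildOceanVertices(const std::vector<float>& heights,
                                     std::size_t n, float patchSize) {
    assert(n >= 2 && heights.size() == n * n);
    const float spacing = patchSize / static_cast<float>(n - 1);
    std::vector<Vec3> verts(n * n);
    for (std::size_t z = 0; z < n; ++z) {
        for (std::size_t x = 0; x < n; ++x) {
            verts[z * n + x] = Vec3{static_cast<float>(x) * spacing,
                                    heights[z * n + x],
                                    static_cast<float>(z) * spacing};
        }
    }
    return verts;
}

// Two triangles per grid cell; the winding order here is an arbitrary choice.
std::vector<unsigned> buildGridIndices(std::size_t n) {
    assert(n >= 2);
    std::vector<unsigned> indices;
    indices.reserve((n - 1) * (n - 1) * 6);
    for (std::size_t z = 0; z + 1 < n; ++z) {
        for (std::size_t x = 0; x + 1 < n; ++x) {
            const unsigned i0 = static_cast<unsigned>(z * n + x);
            const unsigned i1 = i0 + 1;
            const unsigned i2 = i0 + static_cast<unsigned>(n);
            const unsigned i3 = i2 + 1;
            indices.insert(indices.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return indices;
}
```

Since the grid topology never changes, the index buffer only needs to be built once; only the vertex heights change per frame.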

## 4. Generating Vertex & Face Normals

At this point, however, the vertex and face normals are incorrect, which makes the ocean look unrealistic. We fix this by recalculating the vertex and face normals using tri-linear slope interpolation for each face.
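The details of the slope interpolation are not spelled out here, so the C++ sketch below swaps in a simpler and widely used alternative: per-vertex normals from central differences of the height field. It assumes the patch tiles periodically, as FFT output does, and `Normal` and `computeVertexNormals` are hypothetical names. Treat it as an illustration of the idea of deriving normals from surface slopes, not as the project's method.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Normal { float x, y, z; };

// Per-vertex normals from central differences on a periodic n x n height field.
// For a surface y = h(x, z), the unnormalized normal is (-dh/dx, 1, -dh/dz).
std::vector<Normal> computeVertexNormals(const std::vector<float>& h,
                                         std::size_t n, float spacing) {
    assert(n >= 1 && h.size() == n * n && spacing > 0.0f);
    std::vector<Normal> normals(n * n);
    for (std::size_t z = 0; z < n; ++z) {
        for (std::size_t x = 0; x < n; ++x) {
            // Wrap neighbours around the patch edges (assumes a tileable patch).
            const std::size_t xl = (x + n - 1) % n, xr = (x + 1) % n;
            const std::size_t zd = (z + n - 1) % n, zu = (z + 1) % n;
            const float dhdx = (h[z * n + xr] - h[z * n + xl]) / (2.0f * spacing);
            const float dhdz = (h[zu * n + x] - h[zd * n + x]) / (2.0f * spacing);
            const float len = std::sqrt(dhdx * dhdx + 1.0f + dhdz * dhdz);
            normals[z * n + x] = Normal{-dhdx / len, 1.0f / len, -dhdz / len};
        }
    }
    return normals;
}
```

A flat height field yields normals pointing straight up, and steeper slopes tilt the normal away from the vertical, which is what the lighting needs to shade wave faces correctly.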

With this step, we are done and have a realistic approximation of an ocean surface.

Optimization was an iterative process for us: we started with a naive CPU version and gradually optimized it into the fast CUDA-based simulation it is now.

**CPU Simulation**
- The initial simulation was done mainly on the CPU, with only the rendering and shading done on the GPU. This build used the naive Fourier transform method, not the IFFT.
- As such, rendering a 512×512 ocean grid took around 160 ms per frame, a mere 6 frames per second.

**First Iteration: Naïve Port to CUDA**
- We then developed a basic CUDA version of the Fast Fourier Transform, which allowed us to achieve around 150 fps per patch.

**Second Iteration: Inverse Fast Fourier Transform Optimizations**
- Next, we moved the generation of the Phillips Spectrum in complex-number frequency space to CUDA and switched from the traditional Fast Fourier Transform to the Inverse Fast Fourier Transform.
- This gave us a huge performance boost, allowing us to render a much larger ocean patch.

**Third Iteration: CUDA Optimizations**
- Lastly, to optimize the last 20%, we implemented CUDA optimization techniques such as memory coalescing, dynamic shared memory, streaming, and privatization. This gave us a speedup of more than 400% compared with the naïve CUDA implementation.

With all the optimizations in place, we were able to generate and render a 512×512 ocean patch at 600 fps on an Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz with an NVIDIA GeForce GTX 1060.

I was in charge of generating the ocean geometry. As such, I took charge of the first two steps in the algorithm: computing the Modified Phillips Spectrum and applying the Inverse Fast Fourier Transform.

Furthermore, I was tasked with making the simulation look more realistic, so I created simple ocean shaders to simulate light passing through the crests of the waves, where the water is not as thick and opaque. I also applied custom shaders to certain parts of the waves to help them “catch the light” at certain angles, making them sparkle.

In summary:

- Generating Phillips Spectrum in Complex Number Frequency Space in CUDA.
- Calculating the Displacement Map using Inverse Fast Fourier Transform in CUDA.
- CUDA Optimizations:
  - Memory coalescing
  - Dynamic shared memory
  - Streaming
  - Privatization
- Ocean Shaders for simulated translucency and sparkling.