Cycles X

Today it’s been exactly 10 years since Cycles was announced. In the past decade Cycles has developed into a full-fledged production renderer, used by many artists and studios. We learned a lot in those 10 years, things that worked well, but also things that didn’t work well, or became outdated as rendering algorithms and hardware evolved.

We’re keen to make bigger improvements to core Cycles rendering. However, some decisions made in the past are holding back performance and making the code difficult to maintain. To address that, Sergey and I started a research project named Cycles X, with the aim of refreshing the architecture and preparing it for the next 10 years. Rather than finding quick fixes or optimizations that solve only part of the problem, we’re rethinking the architecture as a whole.

The Project

Broadly speaking, the goals of the project are to:

  • Improve the architecture for future development
  • Improve usability of viewport and batch rendering
  • Improve performance on modern CPUs and GPUs
  • Introduce more advanced rendering algorithms

Our first target was to validate the new architecture. To that end, we’ve implemented a prototype of a new GPU kernel, and new scheduling algorithms for viewport and batch renders. There’s just enough functionality to render some of our benchmark scenes now.

Current Cycles X kernel graph

Today we’re sharing some initial performance results, and publishing the code to collaborate with Cycles contributors. A technical presentation for developers on the new architecture is available, and the code can be found in the cycles-x branch on git.blender.org.

There is much to be done. We expect it will take at least 6 months until this work is part of an official Blender release.

Initial Results

First, some results from GPU rendering with well-known benchmark scenes. Scenes have been modified to remove features like volume rendering, which are not implemented yet.

Be aware that the numbers will change as we keep working on the new architecture. OptiX support was added just a few days ago by Patrick Mours.

The most significant improvements are in interior scenes with many light bounces and shaders, where the new kernels can achieve higher occupancy and coherence.

CPU rendering performance is approximately the same as before at this point, but the new architecture opens up new possibilities there as well.

Secondly, we’ve been working to improve viewport rendering. Faster rendering kernels help, but we also found that improving the scheduling, timing, and display mechanisms can make the viewport feel more interactive. New viewport support for adaptive sampling and sample batching means the image cleans up faster once the first few samples are done.
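To give a feel for why batching samples can help, here is a minimal Python sketch of one possible schedule: the first few samples are rendered one at a time so the display responds quickly, and later updates cover progressively larger batches so less time goes to per-update overhead. The function name `sample_batches`, its parameters, and the doubling rule are assumptions made for illustration; this is not the actual Cycles X scheduler.

```python
def sample_batches(total_samples, interactive_samples=4, max_batch=64):
    """Yield (start_sample, batch_size) pairs that cover total_samples.

    Hypothetical schedule: the first `interactive_samples` samples are
    rendered one per update; after that the batch size doubles on every
    update, capped at `max_batch`.
    """
    done = 0
    batch = 1
    while done < total_samples:
        if done >= interactive_samples:
            # Past the interactive phase: grow the batch, up to the cap.
            batch = min(batch * 2, max_batch)
        # Never schedule more samples than remain.
        size = min(batch, total_samples - done)
        yield done, size
        done += size


if __name__ == "__main__":
    for start, size in sample_batches(20, interactive_samples=4, max_batch=8):
        print(f"update renders samples {start}..{start + size - 1}")
```

With 20 samples, 4 interactive samples, and a cap of 8, the batch sizes come out as 1, 1, 1, 1, 2, 4, 8, 2. A real scheduler would more likely pick batch sizes from measured render and display times than from a fixed doubling rule.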

Looking Forward

In the coming months we will try more optimization ideas, and restore missing functionality. When functionality is missing, it’s usually because we want to take a different approach in the new architecture. Some examples:

  • Volume rendering: we plan to implement ray-marching and light sampling with more modern algorithms
  • Shadow catchers: we’ll try a different algorithm that can take into account indirect light
  • Multi-device rendering: we’ll experiment with more fine-grained load balancing without tiles
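The load balancing scheme for multi-device rendering is still to be designed. As a hedged illustration of what "fine-grained" could mean, the Python sketch below simulates devices that repeatedly pull small chunks of image rows from a shared pool, so a faster device naturally ends up with more of the work. The function `balance_rows`, the device names, and the speeds are hypothetical and do not describe how Cycles X will work.

```python
import heapq


def balance_rows(height, device_speeds, chunk_rows=8):
    """Simulate devices pulling small row chunks from a shared pool.

    `device_speeds` maps a (hypothetical) device name to rows rendered
    per second. Returns (rows_per_device, finish_time_in_seconds).
    """
    rows_done = {name: 0 for name in device_speeds}
    # Priority queue of (time at which the device becomes free, name).
    free_at = [(0.0, name) for name in sorted(device_speeds)]
    heapq.heapify(free_at)

    next_row = 0
    finish = 0.0
    while next_row < height:
        # The device that becomes free earliest takes the next chunk.
        now, name = heapq.heappop(free_at)
        rows = min(chunk_rows, height - next_row)
        next_row += rows
        rows_done[name] += rows
        done_at = now + rows / device_speeds[name]
        finish = max(finish, done_at)
        heapq.heappush(free_at, (done_at, name))
    return rows_done, finish


if __name__ == "__main__":
    rows, seconds = balance_rows(1080, {"fast_gpu": 300.0, "slow_gpu": 100.0})
    print(rows, f"finished after {seconds:.2f}s")
```

Smaller chunks let the split follow device speed more closely, at the cost of more scheduling overhead per chunk.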

Beyond this, the new architecture should make it easier to fit in rendering algorithms like path guiding. We will experiment with these and research how they can be made GPU friendly.

Deprecation

As part of the new architecture, we are removing some functionality. Most notably:

  • OpenCL rendering kernels. The combination of the limited Cycles split kernel implementation, driver bugs, and stalled OpenCL standard has made maintenance too difficult. We can only make the kinds of bigger changes we are working on now by starting from a clean slate.
    We are working with AMD and Intel to get the new kernels working on their GPUs, possibly using different APIs (such as SYCL, HIP, Metal, …). This will not necessarily be ready for the first release, because the implementation needs to reach a higher quality bar than what is there now. Long term, supporting all major GPU hardware vendors remains an important goal, and we think that with this new architecture we’ll be able to deliver better performance and stability. It is just a matter of time until more GPUs are supported in Cycles X again.
  • Branched path tracing. We are working to improve sampling algorithms to make this obsolete, and more automatically assign samples where needed. Improved adaptive sampling and light importance sampling are key here.
  • NLM denoiser. AI denoising algorithms and in particular OpenImageDenoise generally yield better results, and we will optimize the architecture and workflow for them.

These features will remain available and supported in the 2.83 and 2.93 LTS releases.

Source: Blender