Performance Optimization Guidelines
The more complex your Substance graphs are, the more processing power you need to render them. You should try to strike a balance between complexity and rendering speed.
This is especially important if you plan to use them in real-time graphics applications, such as games.
Generally speaking, nodes that expose custom parameters (which can be modified at run time) should be placed as close to the end of the graph as possible.
This is because the output of each node is cached wherever possible. Therefore, the further up the graph your tweakable node is, the more outputs will need to be processed whenever one of those exposed parameters is modified. If your exposed node is close to the end of the graph, only the few nodes between it and the output nodes will need to be recomputed.
For example, if you tweak a Uniform Color node at the beginning of your graph, all the following nodes will be recomputed. If you tweak an HSL node placed right before the output, only this node will be recomputed, greatly improving the performance of the graph.
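The effect of node placement can be illustrated with a small sketch. It is a toy model only, not how the Substance engine is implemented: the graph, the node names and the `nodes_to_recompute` helper are all hypothetical, and the model simply assumes that a node is recomputed when it, or anything upstream of it, changes.

```python
from collections import deque

# Hypothetical toy graph: each key feeds the nodes listed in its value.
# Node names are illustrative and not taken from a real Substance graph.
GRAPH = {
    "uniform_color": ["blend"],
    "noise": ["blend"],
    "blend": ["blur"],
    "blur": ["hsl"],
    "hsl": ["output"],
    "output": [],
}


def nodes_to_recompute(graph, changed):
    """Return the changed node plus every node downstream of it.

    Assumption: cached outputs stay valid unless the node itself or one
    of its upstream nodes has changed.
    """
    dirty = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in graph[node]:
            if child not in dirty:
                dirty.add(child)
                queue.append(child)
    return dirty


# Tweaking a parameter at the start of the graph invalidates the whole chain.
early = nodes_to_recompute(GRAPH, "uniform_color")
# Tweaking the HSL node invalidates only itself and the output it feeds.
late = nodes_to_recompute(GRAPH, "hsl")

print(len(early), sorted(early))
print(len(late), sorted(late))
```

In this model the output node is counted as a node too, so the late tweak touches two entries rather than one. The "noise" branch is never invalidated in either case, because it sits upstream of the change and its cached result can be reused.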
Keep the following guidelines in mind:
General performance-related settings:
- The GPU engine is much faster than the CPU engine. Unless you have an unsupported (integrated) graphics card, use the GPU Substance engine (switch engines with the F9 hotkey).
- Switching the parent resolution of the graph is slow because it recomputes the graph, the cache and all thumbnails. It is better to use the Batch tab of the export dialog, which avoids extensive, unneeded recalculation (for example when exporting at 8192 resolution).
- In extreme cases, a larger memory cache might be needed. The application limits the amount of RAM that can be used for the image cache, but you can override and increase this limit (with care).
- Pay careful attention to the resolution and relative-to-parent resolution settings!
High values will seriously affect performance, so consider how the material is likely to be used and whether you can reduce the data sizes involved.
- Use grayscale when no color is needed.
Color operations take four times longer than grayscale operations. Try to minimize type conversions: avoid chains such as converting to grayscale, applying a gradient, then converting to grayscale again, and so on.
- Use 8-bit when 16-bit is not needed.
Note: The CPU version of the Substance Engine (SSE2) does not actually support 16-bit color or 8-bit greyscale. The GPU engine supports all 4 combinations of 8/16 bits and greyscale/color. Currently, only the CPU engine is used in Unity and Unreal Engine 4.
- Minimize node output size whenever possible.
Sometimes, downsizing some nodes doesn't affect the final result but does improve performance. For example, using a Uniform Color node set to the same output size as the document is pointless: the Uniform Color should be set to Absolute [16px x 16px] and the subsequent node to Relative to Parent. Generally this trick works well for low-frequency images, such as Perlin noise.
- Do not use images smaller than 16*16 pixels.
This slows rendering performance.
- When using the Blend node, disable Alpha blending when it's not required.
- Blurs and Warps are the most processor-intensive nodes.
- Some noise generators are affected by the number of patterns drawn.
For instance, the Tile Generator node will get slower to process the more patterns you add to it.
- Some noises are affected by a scale factor.
A higher scale factor actually draws more patterns. Affected nodes include Noise, Cells, etc. If you want a white noise pattern, don't use a Noise node with a very high scale value; use the White Noise node instead.
- Conversely, there are some very fast noise generators.
These include White Noise, Fractal Sum Base, and Anisotropic Noise.
- Watch out for heavy image sampling functions in some cases.
Functions are executed on the CPU engine, except in Pixel Processors. If you do a lot of heavy image sampling (changing $pos coordinates) in Value Processors or FX-Maps, there will be a lot of swapping between VRAM and CPU RAM, causing performance delays.
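The resolution, grayscale and bit-depth guidelines above can be put in rough numbers. The sketch below estimates the raw data size of a single node output. It is only a proxy for processing cost, not a measurement of engine time, and it assumes that a color image carries four channels (consistent with the "four times" figure above) against one for grayscale. The `image_bytes` helper and the 2048 document size are hypothetical.

```python
def image_bytes(width, height, color=True, bits=8):
    """Estimate the raw size in bytes of one node output.

    Assumptions: color means 4 channels, grayscale means 1 channel,
    and bits is the depth per channel (8 or 16).
    """
    if bits not in (8, 16):
        raise ValueError("bits must be 8 or 16")
    channels = 4 if color else 1
    return width * height * channels * (bits // 8)


# A 16-bit color output at a hypothetical 2048 x 2048 document size.
full = image_bytes(2048, 2048, color=True, bits=16)
# The same resolution in 8-bit grayscale.
gray8 = image_bytes(2048, 2048, color=False, bits=8)
# A Uniform Color node reduced to Absolute 16 x 16, in 8-bit color.
tiny = image_bytes(16, 16, color=True, bits=8)

print(full, gray8, tiny)
print("16-bit color vs 8-bit grayscale:", full // gray8)
```

Under these assumptions, dropping from 16-bit color to 8-bit grayscale divides the data by eight at the same resolution, and a 16 x 16 Uniform Color is negligible next to a full-size one.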
Optimizations for Mobile usage:
- Using Warps and FX-Maps is not recommended.
They are very costly in terms of performance.
- Avoid Blur nodes.
Use downscale transformations instead.
- Work as much as possible in greyscale.
Switch to color mode at the end of the graph.
- Share nodes as much as possible between outputs.
Size optimizations for Embedded Bitmaps:
- Bitmaps have their Output Size set to Absolute by default. This means that if the bitmap is connected through the node chain to an output, it will then force the final output to be the size of the embedded bitmap.
A node you insert after the bitmap will have its Output Size set to Relative to Input. This means that the node will also inherit the size of the bitmap and carry this size down the node chain to the outputs.
To correct this, you need to set the node after the bitmap to have its Output Size set to Relative to Parent.
If the graph is set to have a dynamic resolution, you can change the Output Size on the embedded bitmap to be Relative To Parent.
This way, the bitmap size will change based on the parent graph and you won't get into a situation where the graph is processing higher resolution in the bitmap than what is needed.
Note: Setting a bitmap node to Relative to Parent and cooking the graph (exporting it as an SBSAR) will save the bitmap at a resolution of 256x256 instead of its original size. It is advised instead to keep the bitmap node set to Absolute and place a Transform 2D node just after it with a Relative to Parent resolution.
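The way an embedded bitmap's size propagates down a chain can be sketched as follows. This is a simplified model under stated assumptions, not the engine's actual resolution logic: it handles a single linear chain, ignores the relative size offsets a node can apply, and uses hypothetical names (`resolve_sizes`, the node labels, the 4096 and 1024 sizes).

```python
def resolve_sizes(chain, parent_size):
    """Resolve the output size of each node in a linear chain.

    chain is a list of (name, mode, absolute_size) tuples, where mode is
    "absolute", "relative_to_input" or "relative_to_parent".
    Simplification: relative modes copy the reference size unchanged.
    """
    sizes = []
    input_size = None
    for name, mode, absolute_size in chain:
        if mode == "absolute":
            size = absolute_size
        elif mode == "relative_to_parent":
            size = parent_size
        elif mode == "relative_to_input":
            # A node without an input falls back to the parent size.
            size = input_size if input_size is not None else parent_size
        else:
            raise ValueError(f"unknown mode: {mode}")
        sizes.append((name, size))
        input_size = size
    return sizes


# Default setup: the 4096 bitmap forces its size onto everything after it.
default_chain = [
    ("bitmap", "absolute", 4096),
    ("transform_2d", "relative_to_input", None),
    ("output", "relative_to_input", None),
]
# Fix: the node right after the bitmap follows the parent graph instead.
fixed_chain = [
    ("bitmap", "absolute", 4096),
    ("transform_2d", "relative_to_parent", None),
    ("output", "relative_to_input", None),
]

print(resolve_sizes(default_chain, parent_size=1024))
print(resolve_sizes(fixed_chain, parent_size=1024))
```

In the default chain every node resolves to 4096 even though the parent graph is set to 1024. In the fixed chain the bitmap keeps its original size, as the note above recommends, while the Transform 2D node and the output follow the parent resolution.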