1. Build Configuration
To enable SIMD instruction sets beyond SSE in Unreal Engine, the following entry must first be added to the "[...].Build.cs":
MinCpuArchX64 = MinimumCpuArchitectureX64.AVX512;
You can choose between various options:
- AVX
- AVX2
- AVX512
Important: When enabling this in a plugin (UPLUGIN), for example, the main project's setting will always override the plugin's setting. The best approach is therefore to enable in the plugin whatever you want to support, and then set in the project what the current build should actually target.
2. Preprocessor directives/Macros
With the above done, let's look at some compiler-specific macros. The following is dummy code, but it should get across what we want to do.
#if defined(__AVX2__)
// Do AVX2 logic
#elif defined(__AVX__)
// Do AVX logic
#elif defined(__SSE4_1__)
// Do SSE logic
#else
// Non-vector logic
#endif
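"__AVX__" and "__AVX2__" are predefined by the Microsoft Visual Studio C++ compiler (and by GCC/Clang) when the matching instruction set is enabled. "__SSE4_1__" is, as far as I know, only defined by GCC and Clang, so with MSVC that branch is skipped and the code falls through to the "#else" case.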
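To make the dummy code above more concrete, here is a minimal sketch of the same idea with real intrinsics. "SumFloats" is a hypothetical helper, not something from Unreal. The sketch also treats "_M_X64" as implying SSE2, since every x64 CPU has SSE2 but MSVC does not define an SSE macro there; that choice is an assumption of this example.

```cpp
#include <cstddef>
#if defined(__AVX__) || defined(__SSE2__) || defined(_M_X64)
#include <immintrin.h>
#endif

// Hypothetical helper: sums Count floats using the widest path
// that the compiler flags enabled at build time.
inline float SumFloats(const float* Data, std::size_t Count)
{
    std::size_t i = 0;
    float Total = 0.0f;
#if defined(__AVX__)
    // AVX path: 8 floats per iteration.
    __m256 Acc = _mm256_setzero_ps();
    for (; i + 8 <= Count; i += 8)
        Acc = _mm256_add_ps(Acc, _mm256_loadu_ps(Data + i));
    alignas(32) float Lanes[8];
    _mm256_store_ps(Lanes, Acc);
    for (float Lane : Lanes) Total += Lane;
#elif defined(__SSE2__) || defined(_M_X64)
    // SSE path: 4 floats per iteration.
    __m128 Acc = _mm_setzero_ps();
    for (; i + 4 <= Count; i += 4)
        Acc = _mm_add_ps(Acc, _mm_loadu_ps(Data + i));
    alignas(16) float Lanes[4];
    _mm_store_ps(Lanes, Acc);
    for (float Lane : Lanes) Total += Lane;
#endif
    // Scalar tail (and the complete loop when no vector path is compiled in).
    for (; i < Count; ++i) Total += Data[i];
    return Total;
}
```

Whichever branch is compiled in, the function returns the same kind of result, so callers don't need to know which instruction set was picked.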
Now Unreal has its own macros for that, but I have found them to be rather confusing. Additionally, you'll always get a warning (at least in Rider) about a "macro redefinition". I wasn't yet able to fix that and there's no real documentation there. In short: instead of the default compiler macros, you could also use:
#if PLATFORM_ALWAYS_HAS_AVX_2
// AVX2
#elif PLATFORM_ALWAYS_HAS_AVX
// AVX
#elif PLATFORM_ALWAYS_HAS_SSE4_2
// SSE4.2
#else
// Non-vector logic
#endif
Yet I have run into many issues with them. For example, after switching the "MinimumCpuArchitectureX64" value, the code sometimes wouldn't compile with certain macros. Hence for now I am using the default compiler macros.
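One way to avoid scattering both kinds of macros through the code is to fold them into project-local flags in a single place. The following is only a sketch: the "SKETCH_SIMD_*" names and "SketchFloatLanes" are made up for this example, and it assumes the Unreal macros are either undefined (outside Unreal) or defined to 0/1.

```cpp
// Project-local feature flags (hypothetical names, not part of Unreal).
// A flag becomes 1 if either the compiler macro or the Unreal macro
// reports the feature. Outside Unreal the PLATFORM_* macros are simply
// undefined and only the compiler macros count.
#if defined(__AVX2__) || (defined(PLATFORM_ALWAYS_HAS_AVX_2) && PLATFORM_ALWAYS_HAS_AVX_2)
    #define SKETCH_SIMD_AVX2 1
#else
    #define SKETCH_SIMD_AVX2 0
#endif

// AVX2 implies AVX, so the AVX flag is also set whenever AVX2 is.
#if SKETCH_SIMD_AVX2 || defined(__AVX__) || (defined(PLATFORM_ALWAYS_HAS_AVX) && PLATFORM_ALWAYS_HAS_AVX)
    #define SKETCH_SIMD_AVX 1
#else
    #define SKETCH_SIMD_AVX 0
#endif

// Number of floats that fit in one register on the widest enabled path.
constexpr int SketchFloatLanes()
{
#if SKETCH_SIMD_AVX
    return 8;   // 256-bit AVX registers
#elif defined(__SSE2__) || defined(_M_X64)
    return 4;   // 128-bit SSE registers
#else
    return 1;   // scalar fallback
#endif
}
```

The rest of the code then only tests "SKETCH_SIMD_AVX2" and "SKETCH_SIMD_AVX", which keeps the decision about which macro family to trust in one header.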
Tip: When using FMA instructions, you could also check for just that:
#if defined(__FMA__)
// FMA reported by the compiler (GCC/Clang; as far as I know, MSVC defines no FMA macro)
#elif PLATFORM_ALWAYS_HAS_FMA3
// FMA reported by Unreal
#else
// Non-FMA logic
#endif
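As a rough illustration of an FMA check with a fallback, here is a small sketch. "MulAdd" is a hypothetical helper. The sketch checks "__FMA__", the macro GCC and Clang define when FMA is enabled; on compilers that don't define it, the scalar loop is used, which is an assumption of this example rather than how Unreal handles it.

```cpp
#include <cstddef>
#if defined(__FMA__)
#include <immintrin.h>
#endif

// Hypothetical helper: Out[i] = A[i] * B[i] + C[i] for Count elements.
inline void MulAdd(const float* A, const float* B, const float* C,
                   float* Out, std::size_t Count)
{
    std::size_t i = 0;
#if defined(__FMA__)
    // FMA path: 8 fused multiply-adds per iteration.
    for (; i + 8 <= Count; i += 8)
    {
        const __m256 Result = _mm256_fmadd_ps(_mm256_loadu_ps(A + i),
                                              _mm256_loadu_ps(B + i),
                                              _mm256_loadu_ps(C + i));
        _mm256_storeu_ps(Out + i, Result);
    }
#endif
    // Non-FMA logic, also used for the leftover elements.
    // Note: a fused multiply-add rounds once instead of twice, so the
    // last bits can differ from this loop for non-trivial inputs.
    for (; i < Count; ++i)
        Out[i] = A[i] * B[i] + C[i];
}
```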
Note: I am aware that you can choose between "PLATFORM_ALWAYS_HAS_[...]" and "PLATFORM_MAYBE_HAS_[...]". But I have yet to figure out the actual difference.
3. Compiling the code
Let's say, for example, that you have enabled AVX512 and your function has AVX512, AVX2 and SSE4.1 implementations. If the project is set to support AVX512, it will correctly compile using the AVX512 implementation. If you set it to AVX2 instead, it will compile for that.
In theory, if you try to compile for a system that does not support AVX512, it would only compile for AVX2 by default, regardless of the "MinimumCpuArchitectureX64" setting.
Once everything is set up correctly and you want to build your project for different instruction sets, you can either:
1. Compile with the correct "MinimumCpuArchitectureX64" setting.
2. Compile on the system that you want to support directly (using the default compiler macros this should work fine).
A cool note: you could also add macros like "PLATFORM_64BITS" or "LINUX_ARM64" to support different SIMD instruction sets depending on the platform. For example, first check the CPU architecture and then implement both AVX instructions for x86-64 and NEON instructions for ARM64.
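The architecture check described above could look roughly like the following sketch. It uses the compilers' own architecture macros ("__x86_64__", "_M_X64", "__aarch64__", "_M_ARM64") as stand-ins for the Unreal platform macros, and "Add4" plus the "SKETCH_ARCH_*" names are hypothetical; all of that is an assumption of this example.

```cpp
// Pick one vector backend per CPU architecture.
#if defined(__x86_64__) || defined(_M_X64)
    #include <xmmintrin.h>   // SSE intrinsics, always available on x86-64
    #define SKETCH_ARCH_X64 1
#elif defined(__aarch64__) || defined(_M_ARM64)
    #include <arm_neon.h>    // NEON intrinsics, always available on ARM64
    #define SKETCH_ARCH_ARM64 1
#endif

// Hypothetical helper: adds four floats, Out[i] = A[i] + B[i].
inline void Add4(const float* A, const float* B, float* Out)
{
#if defined(SKETCH_ARCH_X64)
    _mm_storeu_ps(Out, _mm_add_ps(_mm_loadu_ps(A), _mm_loadu_ps(B)));
#elif defined(SKETCH_ARCH_ARM64)
    vst1q_f32(Out, vaddq_f32(vld1q_f32(A), vld1q_f32(B)));
#else
    // Non-vector logic for every other architecture.
    for (int i = 0; i < 4; ++i)
        Out[i] = A[i] + B[i];
#endif
}
```

Inside the x86-64 branch you can then nest the AVX/AVX2 checks from step 2, so the instruction-set selection only happens where it makes sense.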
4. Issues/anomalies
In Rider, you may see your actual implementation greyed out when using these macros. Say you have enabled AVX512 and the build compiles for it, yet the AVX512 branch is greyed out: this seems to be normal behavior for now. If you're not sure which branch has been compiled, look at step 5.
Also, when enabling "MinCpuArchX64", you'll get the following warnings when compiling (UE 5.5):
11>command line: Warning C5106 : macro redefined with different parameter names
11>WindowsPlatform.h(77): Reference C5106 : see previous definition of 'PLATFORM_ENABLE_VECTORINTRINSICS'
11>command line: Warning C4005 : 'PLATFORM_MAYBE_HAS_AVX': macro redefinition
11>Platform.h(199): Reference C4005 : see previous definition of 'PLATFORM_MAYBE_HAS_AVX'
11>command line: Warning C4005 : 'PLATFORM_ALWAYS_HAS_AVX': macro redefinition
11>Platform.h(202): Reference C4005 : see previous definition of 'PLATFORM_ALWAYS_HAS_AVX'
11>command line: Warning C4005 : 'PLATFORM_ALWAYS_HAS_AVX_2': macro redefinition
11>Platform.h(205): Reference C4005 : see previous definition of 'PLATFORM_ALWAYS_HAS_AVX_2'
I haven't been able to identify the cause yet; it appears to come from how Unreal defines these macros. Hence the tip to use the default compiler macros instead.
5. Compiler Tips
If you want to make sure that the correct implementation is used, there are two key things you can use.
1. Use compiler messages.
#pragma message ("SIMD: SSE - ENABLED")
This way you'll see what is being used for compiling the code. Important: if you happen to see multiple messages when compiling, despite you having correctly used "#if", "#elif" etc., that is normal. In that case, the last entry is the one that counts. This behavior stems from Unreal compiling more than once which can especially occur when using plugins.
2. Use Unreal Engine on-screen messages.
If the compiler messages get too confusing or you want to be 100% sure, simply use the on-screen debug messages Unreal Engine offers at runtime.
GEngine->AddOnScreenDebugMessage(
-1, 25.f, FColor::White, TEXT("SIMD: SSE - ENABLED"));
Now when running the game, you can easily see which instruction set is being used. Nice, eh?
BUT: don't compare performance while printing something to the screen. The FString's character data is allocated on the heap, which makes everything MUCH slower (not to mention the actual printing of it)!
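Both tips can share one compile-time label, so the compiler message and the runtime message never disagree. The following is a sketch in plain C++: "SKETCH_SIMD_LABEL", "ActiveSimdLabel" and "ReportSimdPathOnce" are hypothetical names, and "std::printf" stands in for the on-screen message call, which is an assumption of this example.

```cpp
#include <cstdio>

// One label per build, chosen by the same macros as the implementation.
#if defined(__AVX2__)
    #define SKETCH_SIMD_LABEL "AVX2"
#elif defined(__AVX__)
    #define SKETCH_SIMD_LABEL "AVX"
#elif defined(__SSE2__) || defined(_M_X64)
    #define SKETCH_SIMD_LABEL "SSE"
#else
    #define SKETCH_SIMD_LABEL "SCALAR"
#endif

// Tip 1: shown in the build output each time this file is compiled.
#pragma message("SIMD: " SKETCH_SIMD_LABEL " - ENABLED")

// Tip 2: the same label, available at runtime without any allocation.
inline const char* ActiveSimdLabel()
{
    return SKETCH_SIMD_LABEL;
}

// Prints the label only on the first call and returns whether it printed,
// so the message stays out of any code you are timing.
inline bool ReportSimdPathOnce()
{
    static bool bReported = false;
    if (bReported)
        return false;
    std::printf("SIMD: %s - ENABLED\n", ActiveSimdLabel());
    bReported = true;
    return true;
}
```

Call "ReportSimdPathOnce" during startup rather than inside the function you are measuring; in Unreal you would swap the "std::printf" line for the on-screen message call shown above.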