AVX2 code generation can be incorrect for some functions with the __vectorcall calling convention.
Please find attached a C++ source file "vectorcall_gather.cpp" that reproduces the problem.
In an x64 command prompt, compile the source file with the following options:
> cl /O2 /MD /W3 /fp:fast /EHsc /Zi /arch:AVX2 vectorcall_gather.cpp
This test program calls two simple image convolution functions "apply" and "vapply" that are identical except that the former has the ordinary __cdecl calling ...
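For readers without the attachment, the shape of such a comparison can be sketched as follows. This is not the contents of "vectorcall_gather.cpp": the kernel taps, the helper names `convolve_row` and `outputs_match`, the `REPRO_*` macros, and the use of integer pixels are all assumptions made for illustration. Only the names "apply" and "vapply" and the difference in calling convention come from the report. Integer arithmetic is used so that /fp:fast cannot account for any mismatch.

```cpp
#include <cstddef>
#include <vector>

// __vectorcall and __cdecl are MSVC keywords; expand to nothing on other
// compilers so the sketch still builds there (assumption: hypothetical macros).
#if defined(_MSC_VER)
#define REPRO_CDECL __cdecl
#define REPRO_VECTORCALL __vectorcall
#else
#define REPRO_CDECL
#define REPRO_VECTORCALL
#endif

// Hypothetical 3-tap kernel; the attached file's kernel may differ.
static const int kTaps[3] = {1, 2, 1};

// Shared body so both entry points perform exactly the same arithmetic.
static inline void convolve_row(const int* src, int* dst, std::size_t n) {
    for (std::size_t i = 1; i + 1 < n; ++i)
        dst[i] = kTaps[0] * src[i - 1] + kTaps[1] * src[i] + kTaps[2] * src[i + 1];
}

// Ordinary calling convention.
void REPRO_CDECL apply(const int* src, int* dst, std::size_t n) {
    convolve_row(src, dst, n);
}

// Same function, differing only in the calling convention.
void REPRO_VECTORCALL vapply(const int* src, int* dst, std::size_t n) {
    convolve_row(src, dst, n);
}

// Runs both variants on the same input and reports whether the results agree.
bool outputs_match() {
    const std::size_t n = 64;
    std::vector<int> src(n), a(n, 0), b(n, 0);
    for (std::size_t i = 0; i < n; ++i)
        src[i] = static_cast<int>(i);
    apply(src.data(), a.data(), n);
    vapply(src.data(), b.data(), n);
    return a == b;
}
```

With a correct compiler, `outputs_match()` returns true; a miscompilation of the kind described here would show up as a difference between the two output buffers.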
Status: Active, 1 Up-Vote, 0 Down-Votes, 0 validations, 0 workarounds, 1 comment, feedback id: 1230845