I finally ported my vectorized code over to Intel and AMD chips! And with time to spare, because my midterm evaluations are coming up. (I’d like to thank my mentors: they have helped me so much, and I wouldn’t have done all of this so quickly without their help.) So yeah, where to go from here? My current plans are to port the code to PowerPC’s AltiVec extensions and to make an AVX2 version of the SSE2 code I wrote for x86_64 processors. Other than that, here are some pictures, with comments, that I took of weird bugs while porting the Arm NEON code to x86 SSE2. (Pictures are dearly needed; this blog has been quite boring without them.)
Now don’t worry, I fixed all the bugs. In fact, you will be able to see for yourself that everything works once my PR (#5144) gets accepted. Hopefully it makes your games run a bit faster (even if your computer doesn’t have vector extensions).