Re: c++ optimised vector class
01-08-2004
Instead of using SSE/SSE2, did you try just using plain old assembly (486)?
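One thing you should know: MMX relies on something called register aliasing. Its MM0-MM7 registers are mapped onto the same physical registers the FPU uses, so the two share state. Mixing MMX and FPU instructions can therefore cause a slowdown, because the register state has to be cleared (with EMMS) every time you switch from MMX back to FPU math. I don't know whether this is also true of SSE.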
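To make the aliasing point concrete, here is a rough sketch of what mixing the two looks like with compiler intrinsics. The helper names `add4_i16` and `scale_sum` are made up for illustration, and the `__MMX__` check assumes a GCC/Clang-style compiler; anything else falls back to a plain loop. The important part is the `_mm_empty()` call (the EMMS instruction), which hands the shared registers back before any floating-point code runs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__MMX__)
#include <mmintrin.h>  // MMX intrinsics: __m64, _mm_add_pi16, _mm_empty
#endif

// Hypothetical helper: add four 16-bit integers lane by lane.
void add4_i16(const std::int16_t* a, const std::int16_t* b, std::int16_t* out) {
#if defined(__MMX__)
    __m64 va, vb;
    std::memcpy(&va, a, sizeof va);    // load 4 x int16 into an MMX value
    std::memcpy(&vb, b, sizeof vb);
    __m64 vr = _mm_add_pi16(va, vb);   // PADDW: four 16-bit adds at once
    std::memcpy(out, &vr, sizeof vr);
    _mm_empty();                       // EMMS: release the aliased FPU registers
#else
    // Portable fallback when MMX intrinsics are not available.
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<std::int16_t>(a[i] + b[i]);
#endif
}

// Hypothetical helper: floating-point work that follows the integer step.
// It is only safe to run this after the MMX state has been cleared.
float scale_sum(const std::int16_t* v, float s) {
    assert(v != nullptr);
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i)
        sum += static_cast<float>(v[i]);
    return sum * s;
}
```

If every small vector operation has to end with an EMMS like this before the next bit of float math, the switching cost can eat whatever the MMX add saved, which is the slowdown I mean.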