Upon further testing, the OpenGL lighting is indeed faster (especially when running in debug mode). I'm curious whether the same holds true on my laptop with an integrated Intel card. I'll have to test that later.
I came across a little article about why it's difficult to optimize with SIMD, and how compiler-friendly idioms are often the better way to go.
http://www.altdevblogaday.com/2011/12/2 ... ode-idiom/
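For anyone wondering what a "compiler-friendly idiom" might look like in practice, here is a minimal sketch of my own (not taken from the article). The function name `scale_add` and its signature are made up for illustration. The idea is to write a plain loop over contiguous floats with no branches and no dependency between iterations, and leave it to the optimizer to decide whether to emit SIMD instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration, not from the article):
// computes out[i] = a * x[i] + y[i] for every element.
void scale_add(float a, const std::vector<float>& x,
               const std::vector<float>& y, std::vector<float>& out)
{
    // Both inputs are expected to have the same length.
    assert(x.size() == y.size());

    const std::size_t n = x.size();
    out.resize(n);

    // Raw pointers to contiguous storage keep the loop body simple.
    const float* px = x.data();
    const float* py = y.data();
    float* po = out.data();

    // Simple indexed loop: no branches, no cross-iteration dependencies.
    // Optimizing compilers can often vectorize this form on their own.
    for (std::size_t i = 0; i < n; ++i) {
        po[i] = a * px[i] + py[i];
    }
}
```

Whether this actually gets vectorized depends on the compiler and optimization flags, so it's worth checking the generated assembly before assuming anything. The appeal is that the same source stays portable and readable, with no intrinsics to maintain.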