[putting the pxor after the loop helps performance, but makes the fn bigger
peter@cordes.ca**20080315060359] {
hunk ./rshift.asm 135
- C seems to make no diff where we put pxor, so move it to function start if that helps alignment
- pxor %xmm6, %xmm6 C we need this for later, in L(out).
hunk ./rshift.asm 173
+ C seems to make no diff where we put pxor, so move it to function start if that helps alignment
+ pxor %xmm6, %xmm6 C we need this for later, in L(out).
}