[fix comments on alternatives to shufpd
peter@cordes.ca**20080318201949] {
hunk ./rshift.asm 141
+C movdqa %xmm3, %xmm7
hunk ./rshift.asm 150
-C good: pshufd works, and gave 2.0 cycles/limb on conroe, but 2.5 on Harpertown. (?!). 3.0 on K8
+C good: 2.5c/l on Conroe. w/pshufd ..,%xmm7, %xmm6 before the load, 2.0 cycles/limb on conroe, but 2.5 on Harpertown. (?!), and 3.0 on K8.
hunk ./rshift.asm 152
+C punpcklqdq %xmm7, %xmm6 C dest=dest[0],src[0]
hunk ./rshift.asm 154
-C the best alternative for getting limb1 in the low part of a register
hunk ./rshift.asm 160
+C the best alternative for getting limb1 in the low part of a register
}