[TAG fastest working version with unaligned loads but aligned stores
peter@cordes.ca**20080319052545] 
<
[adjust lea offsets to simply addressing modes
peter@cordes.ca**20080319044041] 
[new version of rshift using unaligned loads and not overshooting. 2.0c/l Conroe, but 2.5c/l Harpertown
peter@cordes.ca**20080319040110] 
[use little-endian notation for register contents
peter@cordes.ca**20080318230601] 
[fix comments on alternatives to shufpd
peter@cordes.ca**20080318201949] 
[update benchmark comments
peter@cordes.ca**20080318190152] 
[update benchmark comments
peter@cordes.ca**20080318181416] 
[remove stale comment
peter@cordes.ca**20080317044629] 
[TAG good working version, 16byte alignment req. 2cycle core2
peter@cordes.ca**20080316233700] 
> {
}