threefish_512_avx2.cpp |
Behold. The key schedule progresses as shown below. The values
return to the originals after the ninth step, when the rounds are
complete, so nothing needs reloading before the next block starts.
          R0         R1         R2
K1,K2,K3  (7,5,3,1)  (8,6,4,2)  (0,7,5,3)
K3,K4,K5  (0,7,5,3)  (1,8,6,4)  (2,0,7,5)
K5,K6,K7  (2,0,7,5)  (3,1,8,6)  (4,2,0,7)
K7,K8,K0  (4,2,0,7)  (5,3,1,8)  (6,4,2,0)
K0,K1,K2  (6,4,2,0)  (7,5,3,1)  (8,6,4,2)
K2,K3,K4  (8,6,4,2)  (0,7,5,3)  (1,8,6,4)
K4,K5,K6  (1,8,6,4)  (2,0,7,5)  (3,1,8,6)
K6,K7,K8  (3,1,8,6)  (4,2,0,7)  (5,3,1,8)
K8,K0,K1  (5,3,1,8)  (6,4,2,0)  (7,5,3,1)
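The table follows a closed form, which makes it easy to regenerate and check. A minimal sketch, assuming the tuples are written highest lane first (so the last number in each tuple is lane 0) and that row t starts at key word (1 + 2t) mod 9. The names `Reg`, `Row`, `schedule_row` and `print_row` are illustrative and do not come from this file.

```cpp
#include <array>
#include <cassert>
#include <cstdio>

// Key-word indices held by one register; index 0 is the lowest lane,
// i.e. the rightmost number in the table's tuples (assumed convention).
using Reg = std::array<int, 4>;
// The three registers R0, R1, R2 for one row of the table.
using Row = std::array<Reg, 3>;

// Row t of the table: register j, lane l holds K[(1 + 2t + j + 2l) mod 9].
Row schedule_row(int t) {
    assert(t >= 0);
    Row row{};
    for (int j = 0; j < 3; ++j)
        for (int l = 0; l < 4; ++l)
            row[j][l] = (1 + 2 * t + j + 2 * l) % 9;
    return row;
}

// Print one row in the table's layout: label, then each register with
// its highest lane first.
void print_row(const Row& row) {
    std::printf("K%d,K%d,K%d ", row[0][0], row[1][0], row[2][0]);
    for (int j = 0; j < 3; ++j)
        std::printf("(%d,%d,%d,%d)%s", row[j][3], row[j][2], row[j][1],
                    row[j][0], j < 2 ? "," : "\n");
}
```

Calling `print_row(schedule_row(t))` for t = 0..8 should reproduce the nine rows above, and `schedule_row(9)` equals `schedule_row(0)`, which is the wrap-around the comment relies on.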
To compute the values for the next round (X0..X2 are the registers
R0..R2 above; elements are numbered 1 to 4 in the order listed):
X0 becomes X2 from the last round
X1 becomes (X0[4],X1[1:3])
X2 becomes (X1[4],X2[1:3])
This uses 3 permutes and 2 blends. Is there a faster way?
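For reference, one way to reach that count of 3 permutes and 2 blends is sketched below. It is not necessarily the instruction sequence this file uses. It assumes lane 0 is the last number in each tuple, rotates every register down by one lane with `_mm256_permute4x64_epi64`, then patches the top lane with `_mm256_blend_epi32`. `Lanes`, `next_key_regs` and `steps_until_repeat` are hypothetical names, and the scalar branch is only there so the rule can be checked on builds without AVX2.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

using Lanes = std::array<std::uint64_t, 4>;  // index 0 = lowest lane

// Advance (x0, x1, x2) to the next row of the table.
void next_key_regs(Lanes& x0, Lanes& x1, Lanes& x2) {
#if defined(__AVX2__)
    const __m256i a = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(x0.data()));
    const __m256i b = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(x1.data()));
    const __m256i c = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(x2.data()));
    // Three permutes: rotate down one lane, giving [l1, l2, l3, l0].
    const __m256i ra = _mm256_permute4x64_epi64(a, _MM_SHUFFLE(0, 3, 2, 1));
    const __m256i rb = _mm256_permute4x64_epi64(b, _MM_SHUFFLE(0, 3, 2, 1));
    const __m256i rc = _mm256_permute4x64_epi64(c, _MM_SHUFFLE(0, 3, 2, 1));
    // Two blends: mask 0xC0 takes the top 64-bit lane from the second operand.
    const __m256i nb = _mm256_blend_epi32(rb, ra, 0xC0);  // (X0[4], X1[1:3])
    const __m256i nc = _mm256_blend_epi32(rc, rb, 0xC0);  // (X1[4], X2[1:3])
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(x0.data()), c);  // X0 = old X2
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(x1.data()), nb);
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(x2.data()), nc);
#else
    // Scalar model of the same rule, used when AVX2 is not enabled.
    const Lanes n1{x1[1], x1[2], x1[3], x0[0]};
    const Lanes n2{x2[1], x2[2], x2[3], x1[0]};
    x0 = x2;
    x1 = n1;
    x2 = n2;
#endif
}

// Count how many steps it takes for the first table row to come back.
// Key-word indices stand in for the key words themselves.
int steps_until_repeat() {
    const Lanes s0{1, 3, 5, 7}, s1{2, 4, 6, 8}, s2{3, 5, 7, 0};
    Lanes x0 = s0, x1 = s1, x2 = s2;
    int steps = 0;
    do {
        assert(steps < 64);  // guard against a non-terminating loop
        next_key_regs(x0, x1, x2);
        ++steps;
    } while (x0 != s0 || x1 != s1 || x2 != s2);
    return steps;
}
```

Starting from the first row, one call to `next_key_regs` yields the second row, and `steps_until_repeat()` returns the number of rows in the table.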
|
20891 |