Revision history
Edit by Harald (current version):
g++-8.2.0 -o clamptest -O2 clamp.cpp
./clamptest
initialized source buffer
Test #1, random data:
clampByte(): 68991 clocks clampsum 8555683262
clampByteSimple(): 248676 clocks clampsum 8555683262
clampByte_tsar(): 65709 clocks clampsum 8555683262
orig vs simple: 3.60447x, tsar vs simple: 3.7845x
Test #2, -1,127,257,0:
clampByte(): 69517 clocks clampsum 3213152209
clampByteSimple(): 219396 clocks clampsum 3213152209
clampByte_tsar(): 65299 clocks clampsum 3213152209
orig vs simple: 3.156x, tsar vs simple: 3.35987x
Test #3, preclamped data:
clampByte(): 69517 clocks clampsum 17112727680
clampByteSimple(): 62822 clocks clampsum 17112727680
clampByte_tsar(): 65924 clocks clampsum 17112727680
orig vs simple: 0.903693x, tsar vs simple: 0.952946x
Test #4, 257:
clampByte(): 69978 clocks clampsum 17112760320
clampByteSimple(): 62370 clocks clampsum 17112760320
clampByte_tsar(): 65213 clocks clampsum 17112760320
orig vs simple: 0.89128x, tsar vs simple: 0.956404x
Test #4, -1:
clampByte(): 68893 clocks clampsum 0
clampByteSimple(): 41736 clocks clampsum 0
clampByte_tsar(): 65508 clocks clampsum 0
orig vs simple: 0.605809x, tsar vs simple: 0.637113x
All of the above was measured on an Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz.
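The gap between Test #1 (random data) and Test #3 (preclamped data) for clampByteSimple() is what one would expect if it uses conditional branches while the other two functions avoid them. clamp.cpp is not shown here, so that is only a guess. The sketch below illustrates the two styles, assuming a clamp of an int to 0..255, which is consistent with the clampsum of 0 for the all -1 input. The names clampBranching, clampBranchless and speedup are hypothetical and are not taken from clamp.cpp. speedup mirrors how the "vs simple" ratios in the log appear to be computed: the clocks of the simple version divided by the clocks of the other version.

```cpp
#include <cstdint>

// Hypothetical branching clamp to [0, 255]; NOT the code from clamp.cpp.
inline std::uint8_t clampBranching(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return static_cast<std::uint8_t>(v);
}

// Hypothetical clamp written without if statements (assumption, not the
// author's clampByte). Whether the compiler emits branches is up to it.
inline std::uint8_t clampBranchless(int v) {
    v &= -static_cast<int>(v >= 0);   // negative input -> 0, otherwise unchanged
    v |= -static_cast<int>(v > 255);  // input above 255 -> all bits set
    return static_cast<std::uint8_t>(v);  // low 8 bits: 255 when all bits are set
}

// Ratio as it appears in the log lines "orig vs simple" and "tsar vs simple":
// clocks of the simple version divided by clocks of the other version.
inline double speedup(long simpleClocks, long otherClocks) {
    return static_cast<double>(simpleClocks) / static_cast<double>(otherClocks);
}
```

With the Test #1 numbers, speedup(248676, 68991) gives about 3.604, matching the "orig vs simple" figure in the log. A value below 1, as in Tests #3 and later, means the simple version was faster on that input.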
Original version by Harald: the command and output were identical to the current version above; the edit only added the CPU line.