Edit history
Revision by qnikst (current version):
code:
{-# LANGUAGE LambdaCase #-}
import Criterion.Main
import qualified Data.Vector.Unboxed as V
import System.Environment (getArgs)

-- Sum of the triangular numbers T_0..T_n, generated with V.unfoldrN.
-- The state (c, s) holds the next index and the previous triangular number.
magic :: Int -> Int
magic n = V.sum p1
  where
    p1 = V.unfoldrN (n+1) (\(c,s) -> let k = c+s in k `seq` Just (k,(c+1,k))) (0,0)

-- tail recursion version
fermaPyr' n = let fAcc 0 a = a
                  fAcc n a = fAcc (n - 1) (fermaTria' n + a)
              in fAcc n 0

-- plain recursion over the tail-recursive fermaTria'
-- (this one is not itself tail-recursive)
fermaPyr'' :: Int -> Int
fermaPyr'' 0 = 0
fermaPyr'' n = fermaTria' n + fermaPyr'' (n - 1)

-- recursion version
fermaTria 0 = 0
fermaTria n = n + fermaTria (n - 1)

-- tail recursion version
fermaTria' n = let fAcc 0 a = a
                   fAcc n a = fAcc (n - 1) (n + a)
               in fAcc n 0

main :: IO ()
main = getArgs >>= \case
    ("mem":x:_) -> membench (read x)
    _           -> criterion
  where
    membench x = print $ magic x
    criterion = defaultMain
      [ bench "magic" $ nf magic 10000
      , bench "fermaPyr''" $ nf fermaPyr'' 10000
      ]
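As a cross-check of what magic computes: the sum of the triangular numbers T_0..T_n has the closed form n(n+1)(n+2)/6. The sketch below is not part of the original post; the names pyramidLoop and pyramidClosed are hypothetical, and it uses only base (no vector) so that it runs on its own. It compares a strict fold that carries the same kind of state as magic against the closed form.

```haskell
import Data.List (foldl')

-- Hypothetical helper: strict fold carrying (current triangular number,
-- running sum), analogous to the state threaded through V.unfoldrN in magic.
pyramidLoop :: Int -> Int
pyramidLoop n = snd (foldl' step (0, 0) [1 .. n])
  where
    step (tri, acc) c =
      let tri' = tri + c      -- T_c = T_(c-1) + c
          acc' = acc + tri'   -- running sum of T_1..T_c
      in tri' `seq` acc' `seq` (tri', acc')

-- Hypothetical helper: closed form n(n+1)(n+2)/6.
-- The product of three consecutive integers is always divisible by 6.
-- Assumes a 64-bit Int; the intermediate product overflows for much larger n.
pyramidClosed :: Int -> Int
pyramidClosed n = n * (n + 1) * (n + 2) `div` 6

main :: IO ()
main = do
  print (pyramidClosed 1000000)
  print (all (\n -> pyramidLoop n == pyramidClosed n) [0 .. 1000])
```

For n = 1000000 the closed form gives 166667166667000000, which is the value the mem run prints.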
build:
$ ghc -O2 -fllvm -optc-O2 -optc-mtune=native -optc-march=native 1.hs -fforce-recomp
[1 of 1] Compiling Main ( 1.hs, 1.o )
You are using a new version of LLVM that hasn't been tested yet!
We will try though...
Linking 1 ...
result (speed + memory):
\time -v ./1 mem 1000000
166667166667000000
Command being timed: "./1 mem 1000000"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 7632
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 546
Voluntary context switches: 2
Involuntary context switches: 1
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
benchmark:
qnikst@thinkpad ~/tmp/lor/pyr $ ./1
warming up
estimating clock resolution...
mean is 2.012293 us (320001 iterations)
found 24393 outliers among 319999 samples (7.6%)
15929 (5.0%) low severe
8464 (2.6%) high severe
estimating cost of a clock call...
mean is 51.33679 ns (13 iterations)
found 1 outliers among 13 samples (7.7%)
1 (7.7%) high severe
benchmarking magic
mean: 29.38982 ns, lb 29.00721 ns, ub 30.33777 ns, ci 0.950
std dev: 3.062961 ns, lb 1.591688 ns, ub 5.057607 ns, ci 0.950
found 9 outliers among 100 samples (9.0%)
4 (4.0%) high mild
5 (5.0%) high severe
variance introduced by outliers: 81.033%
variance is severely inflated by outliers
benchmarking fermaPyr''
mean: 119.3138 us, lb 119.1146 us, ub 119.6163 us, ci 0.950
std dev: 1.240734 us, lb 925.3786 ns, ub 1.972768 us, ci 0.950
qnikst@thinkpad ~/tmp/lor/pyr $ ./1 mem 1000000 +RTS -sstderr
166667166667000000
68,648 bytes allocated in the heap
3,512 bytes copied during GC
44,416 bytes maximum residency (1 sample(s))
17,024 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 0 colls, 0 par 0.00s 0.00s 0.0000s 0.0000s
Gen 1 1 colls, 0 par 0.00s 0.00s 0.0002s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.00s ( 0.00s elapsed)
GC time 0.00s ( 0.00s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 0.00s ( 0.00s elapsed)
%GC time 10.0% (31.6% elapsed)
Alloc rate 1,105,727,723 bytes per MUT second
Productivity 84.0% of total user, 266.1% of total elapsed
offtop: for fun, you could put "-fllvm -optc-O2 -optc-march=native" into HCFLAGS. By the way, this should be tested and written up in a blog post, since a lot of Haskell programs carry the llvm flag for no good reason.