Edit history
Edit by Nervous (current version):
Approaching performance through mutability from another angle turned out to be even more useless:
;; using transients and arrays of primitives
;; profiler shows that updating stats maps (station and total) consumes most
;; of the time
;; - representing total stats as a transient map and updating it in-place
;; - representing station stats as an array of doubles and updating it in-place
;; - using type hints to avoid slow reflection calls
(defn update-station-stats!
  "Returns an array of doubles containing stats of some station updated
  with the given measured value."
  [^doubles [count minimum maximum average :as station-stats] ^double value]
  (doto ^doubles station-stats
    (aset 0 ^double (unchecked-inc count))
    (aset 1 ^double (min minimum value))
    (aset 2 ^double (max maximum value))
    (aset 3 ^double (moving-average average count value))))
(defn update-total-stats!
  "Returns the (transient) map of station stats updated with the result
  of updating the stats of the given station with the given measured
  value."
  ([] (transient {}))
  ([acc] acc)
  ([acc [key value]]
   (assoc! acc
           key
           ((fnil update-station-stats!
                  (double-array [0.0 value value value]))
            (acc key)
            value))))
(defn transducing-stats!
  "Returns the map of station stats keyed by station names, obtained by
  transducing (reducing with transformation of the reducing function)
  the sequence of lines."
  [lines]
  (persistent! (transduce line->measurement update-total-stats! lines)))
(defn format-stats-1
  "Returns the map of station stats (converted to maps from arrays of doubles),
  sorted lexicographically by key."
  [stats]
  (into (sorted-map) (map (fn [[key [count minimum maximum average]]]
                            [key {:count count
                                  :minimum minimum
                                  :maximum maximum
                                  :average average}]) stats)))
(comment
  ;; the same as the baseline (or slightly worse), worse than the simple
  ;; transducing variant by ~30%
  ;; ~1.3 sec
  (time (format-stats-1
         (transducing-stats! (read-file "dev/resources/measurements1M.txt"))))
  )
Of course, I may have misread the profiler output (few people manage to profile lazy sequences correctly), so the next thing to try is applying mutability to eager iteration.
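For illustration only, here is a minimal sketch of what mutability on top of eager iteration might look like. It is not the author's code: the name `eager-stats!`, the use of `java.util.HashMap`, and the inlined incremental mean (standing in for `moving-average`, whose definition is not shown in this post) are all assumptions. It also allocates a stats array only when a station is first seen. In `update-total-stats!` above, the default passed to `fnil` is an ordinary function argument, so `(double-array ...)` is evaluated on every call; whether that explains the measured time is not established here.

```clojure
(import '(java.util HashMap))

(defn eager-stats!
  "Hypothetical eager variant: reduces over [station value] pairs and
  mutates a java.util.HashMap of double arrays laid out as
  [count minimum maximum average]. Returns the HashMap."
  [measurements]
  (reduce
   (fn [^HashMap m [station value]]
     (let [v (double value)]
       (if-let [^doubles stats (.get m station)]
         ;; known station: update its array in place
         (let [n (inc (aget stats 0))]
           (aset stats 0 n)
           (aset stats 1 (Math/min (aget stats 1) v))
           (aset stats 2 (Math/max (aget stats 2) v))
           ;; incremental mean: mean + (v - mean) / n
           (aset stats 3 (+ (aget stats 3) (/ (- v (aget stats 3)) n))))
         ;; new station: the only place an array is allocated
         (.put m station (double-array [1.0 v v v])))
       m))
   (HashMap.)
   measurements))
```

The function accepts any reducible collection of `[station value]` pairs, for example `(eager-stats! [["a" 1.0] ["b" 5.0] ["a" 3.0]])`. How it would be wired to `read-file` and `line->measurement` depends on definitions that are not part of this post, and no timing is claimed for it.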
Original version by Nervous:
Approaching performance through mutability from another angle turned out to be even more useless:
;; using transients and arrays of primitives
;; profiler shows that updating stats maps (station and total) consumes most
;; of the time
;; - representing total stats as a transient map and updating it in-place
;; - representing station stats as an array of doubles and updating it in-place
;; - using type hints to avoid slow reflection calls
(defn update-station-stats!
  "Returns an array of doubles containing stats of some station updated
  with the given measured value."
  [^doubles [count minimum maximum average :as station-stats] ^double value]
  (doto ^doubles station-stats
    (aset 0 ^double (unchecked-inc count))
    (aset 1 ^double (min minimum value))
    (aset 2 ^double (max maximum value))
    (aset 3 ^double (moving-average average count value))))
(defn update-total-stats!
  "Returns the (transient) map of station stats updated with the result
  of updating the stats of the given station with the given measured
  value."
  ([] (transient {}))
  ([acc] acc)
  ([acc [key value]]
   (assoc! acc
           key
           ((fnil update-station-stats!
                  (double-array [0.0 value value value]))
            (acc key)
            value))))
(defn transducing-stats!
  "Returns the map of station stats keyed by station names, obtained by
  transducing (reducing with transformation of the reducing function)
  the sequence of lines."
  [lines]
  (persistent! (transduce line->measurement update-total-stats! lines)))
(defn format-stats-1
  "Returns the map of station stats (represented by arrays of doubles),
  sorted lexicographically by key."
  [stats]
  (into (sorted-map) (map (fn [[key [count minimum maximum average]]]
                            [key {:count count
                                  :minimum minimum
                                  :maximum maximum
                                  :average average}]) stats)))
(comment
  ;; the same as the baseline (or slightly worse), worse than the simple
  ;; transducing variant by ~30%
  ;; ~1.3 sec
  (time (format-stats-1
         (transducing-stats! (read-file "dev/resources/measurements1M.txt"))))
  )
Of course, I may have misread the profiler output (few people manage to profile lazy sequences correctly), so the next thing to try is applying mutability to eager iteration.