LINUX.ORG.RU

Edit history

Revision by Nervous (current version):

Coming at performance from the other direction, through mutability, turned out to be even less useful:

;; using transients and arrays of primitives

;; profiler shows that updating stats maps (station and total) consumes most
;; of the time

;; - representing total stats as a transient map and updating it in-place
;; - representing station stats as an array of doubles and updating it in-place
;; - using type hints to avoid slow reflection calls

(defn update-station-stats!
  "Returns an array of doubles containing stats of some station updated
  with the given measured value."
  [^doubles [count minimum maximum average :as station-stats] ^double value]
  (doto ^doubles station-stats
    (aset 0 ^double (unchecked-inc count))
    (aset 1 ^double (min minimum value))
    (aset 2 ^double (max maximum value))
    (aset 3 ^double (moving-average average count value))))

(defn update-total-stats!
  "Returns the (transient) map of station stats updated with the result
  of updating the stats of the given station with the given measured
  value."
  ([] (transient {}))
  ([acc] acc)
  ([acc [key value]]
   (assoc! acc
           key
           ((fnil update-station-stats!
                  (double-array [0.0 value value value]))
            (acc key)
            value))))

(defn transducing-stats!
  "Returns the map of station stats keyed by station names, obtained by
  transducing (reducing with transformation of the reducing function)
  the sequence of lines."
  [lines]
  (persistent! (transduce line->measurement update-total-stats! lines)))

(defn format-stats-1
  "Returns the map of station stats (converted to maps from arrays of doubles),
  sorted lexicographically by key."
  [stats]
  (into (sorted-map) (map (fn [[key [count minimum maximum average]]]
                            [key {:count   count
                                  :minimum minimum
                                  :maximum maximum
                                  :average average}]) stats)))

(comment

  ;; the same as the baseline (or slightly worse), worse than the simple
  ;; transducing variant by ~30%

  ;; ~1.3 sec
  (time (format-stats-1
         (transducing-stats! (read-file "dev/resources/measurements1M.txt"))))

)
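
Two details in the code above may be worth checking before writing the approach off. The `(double-array [0.0 value value value])` form is an argument to `fnil`, so a fresh array (and a fresh `fnil` wrapper) is created on every call, including for stations that already have one. Destructuring the array in the parameter vector also reads its elements through `nth`, which returns boxed values, so the `^doubles` hint there probably does not buy much. Below is a minimal sketch of a variant that reads fields with `aget` and allocates only for new stations. `update-station-stats-aget!` and `update-total-stats-2!` are hypothetical names, `moving-average` is a guessed stand-in for the helper the post does not show, and no timing is claimed for it.

```clojure
(defn moving-average
  "Guessed stand-in for the post's helper (an assumption): the mean after
  adding value to a mean computed over count samples."
  ^double [^double average ^double count ^double value]
  (/ (+ (* average count) value) (inc count)))

(defn update-station-stats-aget!
  "Updates the stats array [count minimum maximum average] in place and
  returns it. Fields are read with aget instead of destructuring."
  [^doubles stats ^double value]
  (let [n (aget stats 0)]
    ;; the average has to be updated while n is still the old count
    (aset stats 3 (moving-average (aget stats 3) n value))
    (aset stats 0 (inc n))
    (aset stats 1 (Math/min (aget stats 1) value))
    (aset stats 2 (Math/max (aget stats 2) value))
    stats))

(defn update-total-stats-2!
  "Reducing function with the same arities as update-total-stats!.
  A new array is allocated only when a station is seen for the first time."
  ([] (transient {}))
  ([acc] acc)
  ([acc [key value]]
   (if-let [stats (get acc key)]
     ;; existing station: mutate its array, the transient map is unchanged
     (do (update-station-stats-aget! stats value) acc)
     ;; new station: one sample, so count = 1 and min = max = average = value
     (assoc! acc key (double-array [1.0 value value value])))))
```

It has the same shape as `update-total-stats!`, so it could be passed to `transduce` in `transducing-stats!` in its place for a side-by-side measurement.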

Of course, I may have misread something while profiling (profiling lazy sequences correctly is not something just anyone can do). The next thing to try is putting mutability on top of eager iteration.
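
The eager variant mentioned above could look roughly like the sketch below: a `loop` over `BufferedReader.readLine` that mutates double arrays stored in a `java.util.HashMap`, with no lazy sequence in between. It is an untimed sketch under assumptions: `eager-stats` and `format-eager-stats` are hypothetical names, each line is assumed to look like `name;value` (the post does not show the format), and a running sum is stored instead of a moving average.

```clojure
(import '(java.io BufferedReader StringReader)
        '(java.util HashMap))

(defn eager-stats
  "Reads lines eagerly from rdr and returns a HashMap from station name to
  a double array [count minimum maximum sum].
  Assumes every line looks like \"name;value\" (an assumption, see above)."
  [^BufferedReader rdr]
  (let [acc (HashMap.)]
    (loop []
      (when-let [^String line (.readLine rdr)]
        (let [sep            (.indexOf line ";")
              key            (subs line 0 sep)
              value          (Double/parseDouble (subs line (inc sep)))
              ^doubles stats (.get acc key)]
          (if stats
            ;; existing station: update its array in place
            (do (aset stats 0 (inc (aget stats 0)))
                (aset stats 1 (Math/min (aget stats 1) value))
                (aset stats 2 (Math/max (aget stats 2) value))
                (aset stats 3 (+ (aget stats 3) value)))
            ;; new station: count = 1, min = max = sum = value
            (.put acc key (double-array [1.0 value value value]))))
        (recur)))
    acc))

(defn format-eager-stats
  "Converts the HashMap of arrays into a sorted map of stats maps,
  deriving the average from the stored sum and count."
  [^HashMap stats]
  (into (sorted-map)
        (for [[key ^doubles s] stats]
          [key {:count   (aget s 0)
                :minimum (aget s 1)
                :maximum (aget s 2)
                :average (/ (aget s 3) (aget s 0))}])))
```

With a real file, the reader could come from `clojure.java.io/reader` inside `with-open`, for example on the `dev/resources/measurements1M.txt` path used in the post.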

Original version by Nervous:

Identical to the current version except for the docstring of format-stats-1, which read "Returns the map of station stats (represented by arrays of doubles), sorted lexicographically by key."