benchmarks
Utilities for computing statistics on benchmark data.
Translated from https://github.com/jupyterlab/jupyterlab/blob/82df0b635dae2c1a70a7c41fe7ee7af1c1caefb2/galata/src/benchmarkReporter.ts#L150-L244 which was originally added in https://github.com/jupyterlab/benchmarks/blob/f55db969bf4d988f9d627ba187e28823a50153ba/src/compare.ts#L136-L213
Distribution
dataclass
Statistical description of a distribution
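As an illustration only, a statistical description of this kind could be a dataclass holding a mean and a variance. The sketch below is an assumption, not the library's code: the class name `DistributionSketch`, its two fields, and the `from_data` helper are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, variance
from typing import List


@dataclass
class DistributionSketch:
    """Hypothetical stand-in for a statistical description of a sample."""

    mean: float
    variance: float  # sample variance (n - 1 in the denominator)

    @classmethod
    def from_data(cls, data: List[float]) -> "DistributionSketch":
        # statistics.variance needs at least two data points
        return cls(mean(data), variance(data))


d = DistributionSketch.from_data([7.8, 12.2, 11.5])
print(d)
```

Storing only the mean and variance is enough for interval formulas that depend on those two moments and the sample size.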
Source code in lineapy/utils/benchmarks.py
DistributionChange
dataclass
Change between two distributions
Source code in lineapy/utils/benchmarks.py
__str__()
Format a performance change as text, for example `between 20.1% slower and 30.3% faster (95% CI)`.
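A minimal sketch of how such a string could be produced, assuming the change is stored as a new/old timing ratio with a symmetric half-width around it. The names `ChangeSketch`, `format_ratio`, and `confidence_interval_level` are hypothetical and may not match the library.

```python
from dataclasses import dataclass


def format_ratio(ratio: float) -> str:
    """Describe a new/old timing ratio as a percentage slower or faster."""
    if ratio >= 1:
        return f"{(ratio - 1) * 100:.1f}% slower"
    return f"{(1 - ratio) * 100:.1f}% faster"


@dataclass
class ChangeSketch:
    """Hypothetical stand-in for a change between two distributions."""

    mean: float                       # center of the ratio estimate
    confidence_interval: float        # half-width around the center
    confidence_interval_level: float  # e.g. 0.95 for a 95% interval

    def __str__(self) -> str:
        upper = self.mean + self.confidence_interval
        lower = self.mean - self.confidence_interval
        level = self.confidence_interval_level * 100
        return (
            f"between {format_ratio(upper)} and {format_ratio(lower)} "
            f"({level:.0f}% CI)"
        )


text = str(
    ChangeSketch(mean=0.949, confidence_interval=0.252, confidence_interval_level=0.95)
)
print(text)  # between 20.1% slower and 30.3% faster (95% CI)
```

In this sketch a ratio above 1 means the new timings are larger, so the upper bound is reported as "slower" and the lower bound as "faster".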
Source code in lineapy/utils/benchmarks.py
distribution_change(old_measures, new_measures, confidence_interval=0.95)
Compute the performance change based on a number of old and new measurements.
Based on the work by Tomas Kalibera and Richard Jones. See their paper "Quantifying Performance Changes with Effect Size Confidence Intervals", section 6.2, formula "Quantifying Performance Change".
Note: The two lists of measurements must have the same length. As a fallback, you could truncate both lists to the length of the shorter one.
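The computation can be sketched as a confidence interval for the ratio of two means. The code below is an assumption about the form of the formula, not the library's implementation. The function name `ratio_of_means_ci` is hypothetical, and the Student's t quantile is passed in by hand (4.303 is the tabulated 0.975 quantile for 2 degrees of freedom) so that the sketch needs only the standard library. It also applies the fallback from the note by truncating both lists to the shorter length.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Tuple


def ratio_of_means_ci(
    old: List[float], new: List[float], t_quantile: float
) -> Tuple[float, float]:
    """Return (center, half_width) of an interval for mean(new) / mean(old)."""
    # Fallback from the note: use only as many points as the shorter list has.
    n = min(len(old), len(new))
    old, new = old[:n], new[:n]

    y_old, y_new = mean(old), mean(new)
    s_old, s_new = variance(old), variance(new)  # sample variances

    # Squared means, each reduced by its squared interval half-width.
    old_factor = y_old**2 - t_quantile**2 * s_old / n
    new_factor = y_new**2 - t_quantile**2 * s_new / n
    if old_factor <= 0:
        raise ValueError("old mean is too close to zero for a bounded interval")

    center = y_old * y_new / old_factor
    half_width = sqrt((y_old * y_new) ** 2 - old_factor * new_factor) / old_factor
    return center, half_width


T_975_DOF_2 = 4.303  # two-sided 95% interval, n - 1 = 2 degrees of freedom
center, half_width = ratio_of_means_ci(
    [7.8, 12.2, 11.5], [8.8, 6.2, 4.5], T_975_DOF_2
)
print(round(center, 2), round(half_width, 2))
```

With these inputs the center and half-width come out close to 0.90 and 0.80. In practice the quantile would be computed from the requested confidence level and the sample size rather than hard-coded.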
Parameters:
Name | Type | Description | Default |
---|---|---|---|
old_measures | List[float] | The list of timings from the old system | required |
new_measures | List[float] | The list of timings from the new system | required |
confidence_interval | float | The confidence interval for the results. The default is a 95% confidence interval (95% of the time the true mean will be between the resulting mean ± the resulting CI) | 0.95 |
Test against the example in the paper, from Table V, on pages 18-19
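```python
from math import isclose
from statistics import mean

from lineapy.utils.benchmarks import distribution_change

# Each measurement is the rounded mean of one group of four raw timings.
res = distribution_change(
    old_measures=[
        round(mean([9, 11, 5, 6]), 1),
        round(mean([16, 13, 12, 8]), 1),
        round(mean([15, 7, 10, 14]), 1),
    ],
    new_measures=[
        round(mean([10, 12, 6, 7]), 1),
        round(mean([9, 1, 11, 4]), 1),
        round(mean([8, 5, 3, 2]), 1),
    ],
    confidence_interval=0.95,
)

# Compare with the values quoted from the paper, within a 5% relative tolerance.
assert isclose(res.mean, 68.3 / 74.5, rel_tol=0.05)
assert isclose(res.confidence_interval, 60.2 / 74.5, rel_tol=0.05)
```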
Source code in lineapy/utils/benchmarks.py