class Sampler
Represents a statistical sampler for benchmarking, collecting timing samples and computing statistics.
Definitions
def initialize(minimum: 8, confidence: 0.95, margin_of_error: 0.02)
Initializes a new Sus::Fixtures::Benchmark::Sampler instance with an optional minimum sample size, confidence level, and margin of error.
Signature
-
parameter
minimum
Integer
The minimum number of samples required before convergence can be determined (default: 8).
-
parameter
confidence
Float
The confidence level (default: 0.95). If we repeated the measurement process many times, this proportion of the calculated intervals would include the true value we're trying to estimate.
-
parameter
margin_of_error
Float
The acceptable margin of error relative to the mean (default: 0.02, e.g. ±2%).
Implementation
def initialize(minimum: 8, confidence: 0.95, margin_of_error: 0.02)
@minimum = minimum
@confidence = confidence
@margin_of_error = margin_of_error
# Calculate the z-score for the given confidence level:
@z_score = quantile((1 + @confidence) / 2.0)
# Welford's algorithm for calculating mean and variance in a single pass:
@count = 0
@mean = 0.0
@variance_accumulator = 0.0
end
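For example, with the default confidence of 0.95, quantile is evaluated at (1 + 0.95) / 2.0 = 0.975, and the 0.975 quantile of the standard normal distribution is approximately 1.96, the familiar z-score for a 95% confidence interval. This assumes quantile (defined elsewhere in this class) is the inverse CDF of the standard normal distribution, as the z-score comment suggests.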
attr :minimum
The minimum number of samples required for convergence.
Signature
-
returns
Integer
def add(value)
Adds a new timing value to the sample set.
Signature
-
parameter
value
Float
The timing value to add (in seconds).
Implementation
def add(value)
@count += 1
delta = value - @mean
@mean += delta / @count
@variance_accumulator += delta * (value - @mean)
end
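For example, a minimal sketch (with hypothetical timing values) showing the running statistics update as samples are added:
sampler = Sus::Fixtures::Benchmark::Sampler.new
[0.010, 0.012, 0.011].each do |duration|
	sampler.add(duration)
end
sampler.size # => 3
sampler.mean # => 0.011 (approximately, subject to floating point rounding)
Because each call to add folds the value into the running mean and variance accumulator, the sampler never retains the individual samples.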
def size
Returns the number of samples collected.
Signature
-
returns
Integer
Implementation
def size
@count
end
def mean
Returns the mean (average) of the collected samples.
Signature
-
returns
Float
Implementation
def mean
@mean
end
def variance
Returns the variance of the collected samples.
Signature
-
returns
Float | Nil
Returns nil if not enough samples.
Implementation
def variance
if @count > 1
@variance_accumulator / (@count - 1)
end
end
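To make this concrete (hypothetical numbers): for the samples 1.0, 2.0 and 3.0, Welford's accumulator ends at 2.0, so the sample variance is 2.0 / (3 - 1) = 1.0. Dividing by @count - 1 rather than @count (Bessel's correction) compensates for the mean itself being estimated from the same samples.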
def standard_deviation
Returns the standard deviation of the collected samples.
Signature
-
returns
Float | Nil
Returns nil if not enough samples.
Implementation
def standard_deviation
v = self.variance
v ? Math.sqrt(v) : nil
end
def standard_error
Returns the standard error of the mean for the collected samples.
Signature
-
returns
Float | Nil
Returns nil if not enough samples.
Implementation
def standard_error
sd = self.standard_deviation
sd ? sd / Math.sqrt(@count) : nil
end
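For example (hypothetical numbers): with 16 samples and a standard deviation of 0.004 seconds, the standard error is 0.004 / Math.sqrt(16) = 0.001 seconds. Quadrupling the sample count halves the standard error, which is why continued sampling tightens the estimate of the mean.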
def to_s
Returns a summary string of the sample statistics.
Signature
-
returns
String
Implementation
def to_s
"#{self.size} samples, mean: #{format_duration(self.mean)}, standard deviation: #{format_duration(self.standard_deviation)}, standard error: #{format_duration(self.standard_error)}"
end
def converged?
Determines if the sample size has converged based on the confidence level and margin of error.
Sampling data is always subject to some degree of uncertainty, and we want to ensure that our sample size is sufficient to provide a reliable estimate of the true value we're trying to measure (e.g. the mean execution time of a block).
The mean of the data is the average of all samples, and the standard error tells us how much the sampled mean is expected to vary from the true value. A large standard error indicates that the mean is not very reliable, while a small standard error indicates that it is more reliable.
We could use the standard error alone to determine convergence, but we also want the uncertainty to be within an acceptable range relative to the mean. This is where the confidence level and margin of error come into play: the configured margin of error is a relative measure (a fraction of the mean), so we scale it by the mean to get an absolute threshold by which to determine convergence.
The typical way to express this is to say that we are "confident" that the true mean is within a certain margin of error relative to the measured mean. For example, if we have a mean of 100 seconds and a margin of error of 0.02, then we are saying that we are confident (with the specified confidence level) that the true mean is within ±2 seconds (2% of the mean) of our measured mean.
When we say "confident", we are referring to a statistical confidence level: a measure of how likely it is that the true value lies within the margin of error if we repeated the benchmark many times. A common confidence level is 95%, which means that if we repeated the measurement process many times, 95% of the calculated intervals would include the true value we're trying to estimate; in other words, there is a 5% chance that the true value lies outside the margin of error.
Assuming we are measuring in seconds, this tells us: "Based on my current data, I am confident, at the configured confidence level, that the true mean is within ± the current margin of error (in seconds) of my measured mean." Increasing the number of samples makes the margin of error smaller for the same confidence level, so given a required confidence level and an acceptable margin of error, we can calculate whether we have reached it yet (i.e. whether we need to keep sampling).
The only caveat is that we need at least @minimum samples before we can make this determination; with fewer than @minimum samples, this method returns false. The reason is that a minimum number of samples is required to calculate a meaningful standard error and margin of error.
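Continuing the example above (hypothetical numbers): with a measured mean of 100 seconds, confidence 0.95 (z-score ≈ 1.96), margin of error 0.02 and a standard error of 0.9 seconds, the current margin of error is 1.96 × 0.9 ≈ 1.76 seconds, which is within the acceptable 0.02 × 100 = 2 seconds, so sampling has converged. Had the standard error been 1.2 seconds, the margin would be ≈ 2.35 seconds and sampling would continue.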
Signature
-
returns
Boolean
true if the sample size has converged, false otherwise.
Implementation
def converged?
return false if @count < @minimum
# Calculate the current mean and standard error
mean = self.mean
standard_error = self.standard_error
if mean && standard_error
# Calculate the margin of error:
current_margin_of_error = @z_score * standard_error
# Scale the configured relative margin of error by the mean to get an absolute threshold:
relative_margin_of_error = @margin_of_error * mean.abs
# Check if the margin of error is within the acceptable range:
current_margin_of_error <= relative_margin_of_error
end
end
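A minimal usage sketch (the timed work here is a hypothetical stand-in) showing how a caller might drive the sampler until convergence:
sampler = Sus::Fixtures::Benchmark::Sampler.new(minimum: 8, confidence: 0.95, margin_of_error: 0.02)
until sampler.converged?
	start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
	sleep(0.001) # Stand-in for the code being benchmarked.
	sampler.add(Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time)
end
puts sampler # Uses to_s, which relies on the format_duration helper.
Using a monotonic clock for the timing values avoids spurious samples from wall-clock adjustments.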