Sus::Fixtures::Benchmark::Sampler

class Sampler

Represents a statistical sampler for benchmarking, collecting timing samples and computing statistics.

Definitions

def initialize(minimum: 8, confidence: 0.95, margin_of_error: 0.02)

Initializes a new Sus::Fixtures::Benchmark::Sampler instance with an optional minimum sample size, confidence level, and margin of error.

Signature

parameter minimum Integer

The minimum number of samples required before convergence can be determined (default: 8).

parameter confidence Float

The confidence level (default: 0.95). If we repeated this measurement process many times, this proportion of the calculated intervals (e.g. 95% of them) would include the true value we're trying to estimate.

parameter margin_of_error Float

The acceptable margin of error relative to the mean (default: 0.02, e.g. ±2%).

Implementation

def initialize(minimum: 8, confidence: 0.95, margin_of_error: 0.02)
	@minimum = minimum
	@confidence = confidence
	@margin_of_error = margin_of_error
	
	# Calculate the z-score for the given confidence level:
	@z_score = quantile((1 + @confidence) / 2.0)
	
	# Welford's algorithm for calculating mean and variance in a single pass:
	@count = 0
	@mean = 0.0
	@variance_accumulator = 0.0
end
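
A rough usage sketch (not taken from the library's own documentation): do_work is a hypothetical placeholder for whatever is being measured, and Process.clock_gettime provides a monotonic timer.

sampler = Sus::Fixtures::Benchmark::Sampler.new(minimum: 8, confidence: 0.95, margin_of_error: 0.02)

until sampler.converged?
	start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
	do_work # Hypothetical: the code being benchmarked.
	finish = Process.clock_gettime(Process::CLOCK_MONOTONIC)
	
	sampler.add(finish - start)
end

# Print a summary of the collected samples:
puts sampler.to_s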

attr :minimum

The minimum number of samples required for convergence.

Signature

returns Integer

def add(value)

Adds a new timing value to the sample set.

Signature

parameter value Float

The timing value to add (in seconds).

Implementation

def add(value)
	@count += 1
	delta = value - @mean
	@mean += delta / @count
	@variance_accumulator += delta * (value - @mean)
end
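
The update above is Welford's online algorithm: each new value shifts the running mean by delta / n and accumulates delta * (value - new_mean) into a sum of squared deviations. A minimal standalone sketch (illustrative values only, independent of this class) showing that the single-pass accumulator matches a direct two-pass calculation:

values = [0.012, 0.011, 0.013, 0.012]

count = 0
mean = 0.0
variance_accumulator = 0.0

values.each do |value|
	count += 1
	delta = value - mean
	mean += delta / count
	variance_accumulator += delta * (value - mean)
end

# Direct two-pass calculation, for comparison:
direct_mean = values.sum / values.size
direct_accumulator = values.sum {|value| (value - direct_mean) ** 2}

# mean ≈ direct_mean and variance_accumulator ≈ direct_accumulator,
# up to floating point rounding.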

def size

Returns the number of samples collected.

Signature

returns Integer

Implementation

def size
	@count
end

def mean

Returns the mean (average) of the collected samples.

Signature

returns Float

Implementation

def mean
	@mean
end

def variance

Returns the variance of the collected samples.

Signature

returns Float | Nil

Returns nil if not enough samples.

Implementation

def variance
	if @count > 1
		# Sample variance, using Bessel's correction (dividing by n - 1):
		@variance_accumulator / (@count - 1)
	end
end
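
As a quick illustration (assuming the sample variance shown above, i.e. dividing by n − 1): three samples of 1.0, 2.0 and 3.0 have a mean of 2.0 and squared deviations of 1, 0 and 1, so the variance is 2 / (3 − 1) = 1.0.

sampler = Sus::Fixtures::Benchmark::Sampler.new
[1.0, 2.0, 3.0].each {|value| sampler.add(value)}

sampler.mean     # => 2.0
sampler.variance # => 1.0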

def standard_deviation

Returns the standard deviation of the collected samples.

Signature

returns Float | Nil

Returns nil if not enough samples.

Implementation

def standard_deviation
	v = self.variance
	v ? Math.sqrt(v) : nil
end

def standard_error

Returns the standard error of the mean for the collected samples.

Signature

returns Float | Nil

Returns nil if not enough samples.

Implementation

def standard_error
	sd = self.standard_deviation
	sd ? sd / Math.sqrt(@count) : nil
end
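
For example (illustrative numbers only): with 64 samples and a standard deviation of 4 milliseconds, the standard error of the mean is 4 / √64 = 0.5 milliseconds.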

def to_s

Returns a summary string of the sample statistics.

Signature

returns String

Implementation

def to_s
	"#{self.size} samples, mean: #{format_duration(self.mean)}, standard deviation: #{format_duration(self.standard_deviation)}, standard error: #{format_duration(self.standard_error)}"
end

def converged?

Determines if the sample size has converged based on the confidence level and margin of error.

Sampling data is always subject to some degree of uncertainty, and we want to ensure that our sample size is sufficient to provide a reliable estimate of the true value we're trying to measure (e.g. the mean execution time of a block).

The mean of the data is the average of all samples, and the standard error tells us how much the sampled mean is expected to vary from the true value. So a big standard error indicates that the mean is not very reliable, and a small standard error indicates that the mean is more reliable.

We could use the standard error alone to decide convergence, but we also want to ensure that the margin of error is within an acceptable range relative to the mean. This is where the confidence level and margin of error come into play: the configured margin of error is a relative measure, so we scale it by the mean to obtain an absolute threshold against which to determine convergence.

The typical way to express this is to say that we are "confident" that the true mean is within a certain margin of error relative to the measured mean. For example, if we have a mean of 100 seconds and a margin of error of 0.02, then we are saying that we are confident (with the specified confidence level) that the true mean is within ±2 seconds (2% of the mean) of our measured mean.

When we say "confident", we are referring to a statistical confidence level, which is a measure of how likely it is that the true value lies within the margin of error if we repeated the benchmark many times. A common confidence level is 95%, which means that if we repeated the measurement process many times, 95% of the calculated intervals would include the true value we're trying to estimate; in other words, there is a 5% chance that the true value lies outside the margin of error.

Assuming we are measuring in seconds, this tells us: "Based on my current data, I am confident (at the configured confidence level) that the true mean is within ±current_margin_of_error seconds of my measured mean." Increasing the number of samples makes the margin of error smaller for the same confidence level. Given that we are asked to meet at least a given confidence level and margin of error, we can calculate whether that is true yet (i.e. whether we need to keep sampling).

The only caveat is that we need at least @minimum samples before we can make this determination. If we have fewer than @minimum samples, we will return false. The reason for this is that we need a minimum number of samples to calculate a meaningful standard error and margin of error.
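
To make this concrete (illustrative numbers only): at a 95% confidence level the z-score is approximately 1.96 (the standard normal quantile at (1 + 0.95) / 2 = 0.975). Suppose the measured mean is 100 milliseconds, the standard deviation is 4 milliseconds, and 64 samples have been collected:

standard_error = 4.0 / Math.sqrt(64)                    # => 0.5 milliseconds
current_margin_of_error = 1.96 * standard_error         # => 0.98 milliseconds
acceptable_margin_of_error = 0.02 * 100.0               # => 2.0 milliseconds

current_margin_of_error <= acceptable_margin_of_error   # => true, so the sampler has converged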

Signature

returns Boolean

true if the sample size has converged, false otherwise.

Implementation

def converged?
	return false if @count < @minimum
	# Calculate the current mean and standard error
	mean = self.mean
	standard_error = self.standard_error
	if mean && standard_error
		# Calculate the margin of error:
		current_margin_of_error = @z_score * standard_error
		# Normalize the margin of error relative to the mean:
		relative_margin_of_error = @margin_of_error * mean.abs
		# Check if the margin of error is within the acceptable range:
		current_margin_of_error <= relative_margin_of_error
	end
end