Evenly spaced sample
While building a trivia application for 2020’s Video Conferenced Christmas, I had need of a function that would provide an evenly spaced sample from a given list.
I considered random.choices, but the results aren’t evenly spaced. I considered numpy.linspace, but the conversion of its floats to integer list indices results in odd edge-cases that can bunch up values.
In need of a holiday puzzle, I sat down to take care of it myself. If you’re interested in the same sort of thing, read only the docstring and try to build something that achieves the same thing yourself!
import random
def evenly_spaced_sample(lst, total):
"""Return evenly spaced out elements of the given list
## Deterministic outcomes
>>> evenly_spaced_sample([0,1, 2, 3,4,5,6, 7, 8,9], 2)
[2, 7]
>>> evenly_spaced_sample([0, 1, 2,3, 4, 5], 2)
[1, 4]
>>> evenly_spaced_sample([0,1,2], 1)
[1]
>>> evenly_spaced_sample(['a', 'b'], 2)
['a', 'b']
>>> evenly_spaced_sample(['a', 'b', 'c'], 2)
['a', 'c']
## Non-deterministic outcomes
>>> sorted(set(
... tuple(evenly_spaced_sample([0,1,2,3,4,5,6,7], 4))
... for x in range(100)
... ))
[(0, 2, 4, 6), (1, 3, 5, 7)]
>>> sorted(set(
... tuple(evenly_spaced_sample([0, 1, 2,3, 4,5, 6,7, 8, 9], 3))
... for x in range(100)
... ))
[(1, 4, 8), (1, 5, 8)]
>>> sorted(set(
... tuple(evenly_spaced_sample(['a', 'b'], 1))
... for x in range(100)
... ))
[('a',), ('b',)]
"""
# Handle the edge cases
len_lst = len(lst)
if len_lst == total: # Requesting all items
return lst
elif total == 0: # Requesting no items
return []
elif total < 0 or len_lst < total: # Requesting too many items
raise ValueError("Sample larger than population or is negative")
"""
Imagine a list like [0,1,2,3,4,5,6,7,8,9] and we want 3 evenly spaced
items. The list length (10) divided by the number of items (3) gives
us 3.3~, so we know our evenly spaced items must be at least 3 apart
from one another. There's also one more space we need to assign to one
of the spaces.
"""
# Determine the space between items and how many gaps of each size we need
gap_min = int(len_lst / total)
gap_max = gap_min + 1
gap_max_count = len_lst - gap_min * total
gap_min_count = total - gap_max_count
# Generate a list of offsets, ensuring that the tail has a min gap
offsets = [gap_min] * (gap_min_count - 1) + [gap_max] * gap_max_count
random.shuffle(offsets)
offsets.append(gap_min)
"""
Next, we need to make sure that we're not returning [0,3,7] as that's
what you might call "left aligned". We want our evenly spaced sample
to be somewhat centered. We can do this by shifting our starting index
over by half of the last offset (considering the head and the tail to
be one "space" between items, as though the given list were a ring.
"""
# The starting index is half the last offset, adjusted if it's not an int
shift = (offsets[-1] - 1) / 2
index = int(shift)
if not shift.is_integer() and shift - index < random.random():
index += 1
# Compose and return the response list
response = []
for offset in offsets:
response.append(lst[index])
index += offset
return response
if __name__ == "__main__":
import doctest
doctest.testmod()
Looking around after writing this, I come across some similar problems (e.g. StackOverflow’s Random list of numbers in a range keeping with a minimum distance, but nothing quite like what I’m trying to achieve. I have no doubt someone out there has written something similar, but sometimes it’s difficult to bring these things to the surface.
Worth noting that the more complex the problem, the more time you’d want to spend on investigating prior work instead of just building it from scratch.