Elegant Random Dates, Balanced Partitions, Python

I often work with time series data and find it useful to have several ways to generate random dates. This example produces evenly distributed date partitions: every month of the year receives the same number of dates. Running the script below with the default arguments outputs a list of random dates, one for each month of the year, for example:

[datetime.date(2024, 1, 18), datetime.date(2024, 2, 20)...]

import calendar
from random import randint
from datetime import date

# the default arguments
year = 2024
num = 1


def generate_random_date(year: int, month: int) -> date:
    """ Return a random date in a given year and month """

    days_in_month = calendar.monthrange(year, month)[1]
    random_day = randint(1, days_in_month)
    return date(year, month, random_day)


def generate_random_days_each_month(year: int, n: int = 1) -> list[date]:
    """ Return a list of 'n' random dates for every month in the given year """

    r_dates = [generate_random_date(year, i) for i in range(1, 13) for _ in range(n)]
    return r_dates


if __name__ == "__main__":

    r_dates = generate_random_days_each_month(year, num)
    print(r_dates)

A few aspects of this script are worth a closer look.

First, the generate_random_date() function takes year and month parameters and returns a random date within that month. The calendar module in the standard library includes the monthrange() function, which provides an elegant way to find the number of days in a month. Here is an example of how it works:

import calendar

year = 2024
month = 6
result = calendar.monthrange(year, month)
print(result)

# > (weekday of first day of the month, number of days in month)
# > (calendar.SATURDAY, 30) on Python 3.12+; (5, 30) on earlier versions (Monday is 0)

Taking the second element of the tuple, calendar.monthrange(year, month)[1], gives the number of days in the month. That value becomes the upper bound in randint(1, days_in_month). Because randint() includes both endpoints, the two bounds correspond to the first and last day of the month.
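
As a quick illustration of why monthrange() is preferable to a hard-coded day count, the sketch below looks at February in a common year and a leap year, then draws many random days to confirm they all land inside the month. The sample size of 1000 and the variable names are arbitrary choices for this illustration, not part of the original script.

```python
import calendar
from datetime import date
from random import randint

# February is the month where a hard-coded day count would go wrong.
for year in (2023, 2024):
    _, days_in_month = calendar.monthrange(year, 2)
    print(year, days_in_month)  # 2023 28, then 2024 29

# randint() includes both endpoints, so every draw is a valid day of the month.
days_in_feb_2024 = calendar.monthrange(2024, 2)[1]
draws = [date(2024, 2, randint(1, days_in_feb_2024)) for _ in range(1000)]  # 1000 is arbitrary
earliest, latest = min(draws), max(draws)
print(earliest, latest)  # both fall between 2024-02-01 and 2024-02-29
```

If the upper bound were ever too large, date() would raise a ValueError, so the absence of an exception is itself a sanity check.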

Then, the generate_random_days_each_month() function iterates over each month and, for each one, calls generate_random_date() n times. The list comprehension [generate_random_date(year, i) for i in range(1, 13) for _ in range(n)] expresses both loops in a single line of code.

def generate_random_days_each_month(year: int, n: int = 1) -> list[date]:
    """ Return a list of 'n' random dates for every month in the given year """

    r_dates = [generate_random_date(year, i) for i in range(1, 13) for _ in range(n)]
    return r_dates

To illustrate the behavior of the list comprehension, I’ve rewritten the code as standard nested for loops:

r_dates = []
for i in range(1, 13):
    for _ in range(n):
        r_dates.append(generate_random_date(year, i))

The list comprehension is the more compact form of the same logic, which makes it a great feature of Python.
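
One way to convince yourself that the two forms are equivalent is to seed the random number generator before each run and compare the results. This is a minimal sketch: the seed value 42 and n = 2 are arbitrary, and generate_random_date() is repeated here only so the snippet runs on its own.

```python
import calendar
import random
from datetime import date


def generate_random_date(year: int, month: int) -> date:
    """Same logic as in the script, repeated so this snippet is self-contained."""
    days_in_month = calendar.monthrange(year, month)[1]
    return date(year, month, random.randint(1, days_in_month))


year, n = 2024, 2

random.seed(42)  # arbitrary seed, chosen only for this demonstration
from_comprehension = [generate_random_date(year, i) for i in range(1, 13) for _ in range(n)]

random.seed(42)  # reset so the loops see the same random sequence
from_loop = []
for i in range(1, 13):
    for _ in range(n):
        from_loop.append(generate_random_date(year, i))

print(from_comprehension == from_loop)  # True
```

The outer for clause in the comprehension corresponds to the outer loop, so the dates come out grouped by month in both versions.
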
To generate the data you need, the process is simple. Decide how many rows you want per monthly partition, set year and num at the top of the script accordingly, and run it.
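
To check that the partitions really are balanced, the generated dates can be counted per month. The sketch below is an illustration under assumptions: rows_per_partition = 3 is a made-up value, and the function is a compact restatement of the one in the script so that the snippet is self-contained.

```python
import calendar
from collections import Counter
from datetime import date
from random import randint


def generate_random_days_each_month(year: int, n: int = 1) -> list[date]:
    """Compact restatement of the script's function, for a standalone example."""
    return [
        date(year, month, randint(1, calendar.monthrange(year, month)[1]))
        for month in range(1, 13)
        for _ in range(n)
    ]


rows_per_partition = 3  # hypothetical value for this illustration
r_dates = generate_random_days_each_month(2024, rows_per_partition)

# Count how many dates landed in each month.
rows_by_month = Counter(d.month for d in r_dates)
print(len(r_dates))  # 36
print(all(count == rows_per_partition for count in rows_by_month.values()))  # True
```

Each date is drawn independently, so the same day can appear more than once within a month when n is greater than 1.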


Further Reading