fastymd

Overview

fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast parsing of character strings and numeric values to date objects, as well as fast decomposition of these into their year, month and day components. The underlying algorithms follow the approach of Howard Hinnant for converting Gregorian calendar dates to days since the UNIX epoch and vice versa.
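
To give a feel for the arithmetic involved, here is a minimal base R sketch of the two conversions Hinnant describes (his days_from_civil and civil_from_days). It is a transcription for illustration only, not fastymd's actual implementation; the function names are borrowed from Hinnant's write-up and the inputs are assumed to be valid dates.

```r
# Days since 1970-01-01 for a proleptic Gregorian date (after Hinnant's
# days_from_civil). Vectorised over y, m, d; inputs are assumed to be valid.
days_from_civil <- function(y, m, d) {
    y   <- y - (m <= 2)                  # count Jan/Feb as the end of the previous year
    era <- y %/% 400                     # 400-year cycle; %/% floors, so negative years work
    yoe <- y - era * 400                 # year of era, in [0, 399]
    mp  <- (m + 9) %% 12                 # March = 0, ..., February = 11
    doy <- (153 * mp + 2) %/% 5 + d - 1  # day of (March-based) year, in [0, 365]
    doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy  # day of era, in [0, 146096]
    era * 146097 + doe - 719468          # shift so that 1970-01-01 is day 0
}

# The inverse: year, month and day for a count of days since 1970-01-01
# (after Hinnant's civil_from_days).
civil_from_days <- function(z) {
    z   <- z + 719468
    era <- z %/% 146097
    doe <- z - era * 146097
    yoe <- (doe - doe %/% 1460 + doe %/% 36524 - doe %/% 146096) %/% 365
    doy <- doe - (365 * yoe + yoe %/% 4 - yoe %/% 100)
    mp  <- (5 * doy + 2) %/% 153
    d   <- doy - (153 * mp + 2) %/% 5 + 1
    m   <- ifelse(mp < 10, mp + 3, mp - 9)
    y   <- yoe + era * 400 + (m <= 2)
    data.frame(year = y, month = m, day = d)
}

days_from_civil(1970, 1, 1)
#> [1] 0
days_from_civil(2025, 4, 16) == as.numeric(as.Date("2025-04-16"))
#> [1] TRUE
civil_from_days(0)
#>   year month day
#> 1 1970     1   1
```

Shifting the start of the year to March puts the leap day at the very end of the year, which is what lets both directions get by with integer arithmetic and no lookup tables.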

The API won’t give any surprises:

library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#>   year month day
#> 1 2025     4  16
#> 2 2025     4  17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE

Invalid dates return NA with a warning:

fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
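
The kind of check behind this can be sketched in base R. The helper below, is_valid_ymd(), is hypothetical and not part of fastymd; it assumes whole-number inputs of equal length and the proleptic Gregorian leap year rule.

```r
# Hypothetical helper (not part of fastymd): TRUE where year, month and day
# form a real proleptic Gregorian date. Inputs are assumed to be whole
# numbers in vectors of equal length.
is_valid_ymd <- function(y, m, d) {
    # The month must be usable as an index before the day can be checked
    ok <- !is.na(y) & !is.na(m) & !is.na(d) & m >= 1 & m <= 12

    # Leap years: divisible by 4, except centuries not divisible by 400
    leap  <- (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
    mdays <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

    # Last day of each month, with an extra day for February in leap years
    last <- rep(NA_real_, length(ok))
    last[ok] <- mdays[m[ok]] + (m[ok] == 2 & leap[ok])

    ok & d >= 1 & d <= last
}

is_valid_ymd(c(2021, 2024), c(2, 2), c(29, 29))
#> [1] FALSE  TRUE
```

A constructor could use such a mask to replace invalid entries with NA and raise a single warning, which is the behaviour shown above.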

More interesting is the handling of any input that follows a valid date. Consider the following timestamp:

timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt, "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2025-10-03T10:38:29+0000"

By default the time element is ignored:

(res <- fymd(timestamp))
#> [1] "2025-10-03"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE

Ignoring trailing content is both good and bad. For timestamps it makes perfect sense, but if you have plain dates and are concerned that some are corrupted, silently dropping the rest of the string can hide the problem. For these cases we can use the strict argument:

cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA

Benchmarks

The character method of fymd() parses input strings in a fixed year, month, day order. The values must be digits but can be separated by any non-digit character. This is similar in spirit to the fastDate() function in Simon Urbanek’s fasttime package, which uses pure text parsing and no system calls for maximum speed.
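
The parsing rule can be mimicked in a few lines of base R. The sketch below uses a regular expression for brevity, which is not how a hand-written text parser works and is not fastymd's implementation. The function parse_ymd_sketch() is hypothetical, and the field widths it accepts (a four-digit year, one or two digits for month and day, and a single separator character) are assumptions made for illustration.

```r
# Hypothetical regex-based sketch (not fastymd's implementation): extract
# year, month and day from the start of each string. The accepted field
# widths are assumptions made for illustration only.
parse_ymd_sketch <- function(x, strict = FALSE) {
    # digits, one non-digit separator, digits, one non-digit separator, digits
    pattern <- "^([0-9]{4})[^0-9]([0-9]{1,2})[^0-9]([0-9]{1,2})"
    # strict mode: nothing may follow the day
    if (strict) pattern <- paste0(pattern, "$")

    hits <- regmatches(x, regexec(pattern, x))
    out <- t(vapply(hits, function(h) {
        # a successful match has the full match plus three captured groups
        if (length(h) == 4L) as.integer(h[-1L]) else rep(NA_integer_, 3L)
    }, integer(3)))
    colnames(out) <- c("year", "month", "day")
    out
}

x <- c("2025-04-16", "2025/04/16", "2025-04-16nonsense ")
parse_ymd_sketch(x)                 # all three rows give 2025, 4, 16
parse_ymd_sketch(x, strict = TRUE)  # the third row becomes NA
```

This sketch only extracts the components. Checking that they form a real date is a separate step.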

For extremely fast parsing of POSIX-style timestamps you will struggle to beat the performance of fasttime. It works fantastically for timestamps that do not need validation and fall within the date range supported by the package (currently 1970-01-01 through to the year 2199).

fymd() fills the, admittedly small, niche where you want fast parsing of YMD strings along with date validation and support for a wider range of dates from the Proleptic Gregorian calendar (currently we support years in the range [-9999, 9999]). This additional capability does come with a small performance penalty but, hopefully, this has been kept to a minimum and the implementation remains competitive.

library(microbenchmark)

# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")

# comparison timings for fymd (character method)
cdates  <- format(dates)
(res_c <- microbenchmark(
    fasttime  = fasttime::fastDate(cdates),
    fastymd   = fymd(cdates),
    ymd       = ymd::ymd(cdates),
    lubridate = lubridate::ymd(cdates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min       lq      mean    median        uq       max neval
#>   fasttime  533.370  538.009  567.7949  541.1745  544.9115  2132.778   100
#>    fastymd  816.601  821.840  831.0653  824.7610  829.4945   952.866   100
#>        ymd 4208.248 4272.809 4380.0580 4297.0345 4344.4440  6314.957   100
#>  lubridate 5526.730 5609.514 6560.1853 5673.3795 7213.5065 37612.231   100
# comparison timings for fymd (numeric method)
ymd  <- get_ymd(dates)
(res_n <- microbenchmark(
    fastymd   = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
    lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
    check     = "equal"
))
#> Unit: microseconds
#>       expr     min       lq     mean   median      uq      max neval
#>    fastymd 324.839 326.5620 349.9121 328.1800 331.692 1983.287   100
#>  lubridate 717.224 724.3125 829.1032 727.2135 730.059 2535.663   100
# comparison timings for year getter
(res_get_year <- microbenchmark(
    fastymd   = get_year(dates),
    ymd       = ymd::year(dates),
    lubridate = lubridate::year(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min       lq      mean   median       uq       max neval
#>    fastymd  358.732  360.235  388.8129  364.413  370.935  1778.674   100
#>        ymd  381.235  396.884  432.0702  407.058  427.922   769.292   100
#>  lubridate 7580.309 7597.230 8180.8564 7627.503 8876.418 10873.061   100
# comparison timings for month getter
(res_get_month <- microbenchmark(
    fastymd   = get_month(dates),
    ymd       = ymd::month(dates),
    lubridate = lubridate::month(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean    median        uq       max neval
#>    fastymd  325.190  326.5575  333.3184  328.3455  332.2375   433.984   100
#>        ymd  417.693  424.2205  445.4195  429.6400  438.6725   715.761   100
#>  lubridate 8227.723 8266.6150 9278.9854 8291.0255 9739.5565 40420.224   100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
    fastymd   = get_mday(dates),
    ymd       = ymd::mday(dates),
    lubridate = lubridate::day(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min       lq      mean   median        uq       max neval
#>    fastymd  361.437  362.269  393.5199  363.872  367.4185  2605.303   100
#>        ymd  420.608  426.069  476.8840  431.113  439.7145  2323.335   100
#>  lubridate 7517.121 7540.965 7867.9210 7562.520 7630.1770 10291.621   100