fastymd

Overview

fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast parsing of character strings and numeric values to date objects, as well as fast decomposition of these into their year, month and day components. The underlying algorithms follow the approach of Howard Hinnant for converting Gregorian calendar dates to a count of days from the UNIX epoch and vice versa.
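
The day-count conversion can be sketched in plain R. This is an illustrative transcription of the days_from_civil and civil_from_days routines published by Hinnant, not the package's own compiled code; the function names are borrowed from his write-up and are not part of the fastymd API.

```r
# Days since 1970-01-01 for a proleptic Gregorian date (vectorised).
# R's %/% and %% already floor, so negative years need no special casing.
days_from_civil <- function(year, month, day) {
    y   <- year - (month <= 2)                    # treat Jan/Feb as end of previous year
    era <- y %/% 400                              # 400-year cycle index
    yoe <- y %% 400                               # year of era, 0..399
    mp  <- ifelse(month > 2, month - 3, month + 9) # March = 0, ..., February = 11
    doy <- (153 * mp + 2) %/% 5 + day - 1         # day of (March-based) year
    doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy
    era * 146097 + doe - 719468                   # shift so that 1970-01-01 is day 0
}

# Inverse: year, month and day for a count of days since 1970-01-01.
civil_from_days <- function(days) {
    z   <- days + 719468
    era <- z %/% 146097
    doe <- z %% 146097                            # day of era, 0..146096
    yoe <- (doe - doe %/% 1460 + doe %/% 36524 - doe %/% 146096) %/% 365
    doy <- doe - (365 * yoe + yoe %/% 4 - yoe %/% 100)
    mp  <- (5 * doy + 2) %/% 153
    d   <- doy - (153 * mp + 2) %/% 5 + 1
    m   <- ifelse(mp < 10, mp + 3, mp - 9)
    data.frame(year = yoe + era * 400 + (m <= 2), month = m, day = d)
}

# Cross-check against base R's own day count.
days_from_civil(2025, 4, 16) == as.numeric(as.Date("2025-04-16"))
civil_from_days(days_from_civil(2025, 4, 16))
```

Both functions use only integer-style arithmetic, with no lookup tables or system calls, which is what makes this family of algorithms attractive for fast date handling.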

The API should hold no surprises:

library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#>   year month day
#> 1 2025     4  16
#> 2 2025     4  17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE

Invalid dates return NA with a warning:

fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
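
The kind of check that rejects such a date can be sketched in base R. This is only an assumption about what "invalid month and/or day combinations" covers; is_leap_year() and is_valid_ymd() are hypothetical helpers written for illustration, not functions exported by fastymd, and they do no special handling of NA input.

```r
# Gregorian leap-year rule: divisible by 4, except centuries not divisible by 400.
is_leap_year <- function(year) {
    (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

# TRUE where (year, month, day) names a real calendar date (vectorised).
is_valid_ymd <- function(year, month, day) {
    month_lengths <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
    month_ok   <- month >= 1 & month <= 12
    safe_month <- ifelse(month_ok, month, 1)   # avoid out-of-range indexing
    last_day   <- month_lengths[safe_month] +
        (safe_month == 2 & is_leap_year(year)) # February gains a day in leap years
    month_ok & day >= 1 & day <= last_day
}

is_valid_ymd(c(2021, 2024), 2, 29)
```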

More interesting is the handling of input that continues after a valid date. Consider the following timestamp:

timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt, "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2026-02-27T14:14:28+0000"

By default the time element is ignored:

(res <- fymd(timestamp))
#> [1] "2026-02-27"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE

Ignoring the time element is both good and bad. For timestamps it makes perfect sense, but perhaps you have simple dates and are concerned that some are corrupted. For these cases we can use the strict argument:

cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA
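
The difference between the two modes can be imitated with regular expressions in base R. This is a rough sketch under stated assumptions (four-digit year, two-digit month and day, a single non-digit separator); parse_sketch() is a hypothetical helper, it is far slower than fymd(), and it is not how the package parses strings.

```r
# Assumed input shape: 4 digits, separator, 2 digits, separator, 2 digits.
ymd_prefix <- "^([0-9]{4})[^0-9]([0-9]{2})[^0-9]([0-9]{2})"

parse_sketch <- function(x, strict = FALSE) {
    # Lenient mode only requires a valid-looking prefix;
    # strict mode also requires that nothing follows it.
    pattern <- if (strict) paste0(ymd_prefix, "$") else ymd_prefix
    ok  <- grepl(pattern, x)
    out <- rep(NA_character_, length(x))
    out[ok] <- sub(paste0(ymd_prefix, ".*$"), "\\1-\\2-\\3", x[ok])
    as.Date(out, format = "%Y-%m-%d")          # NA for non-matching strings
}

parse_sketch("2025-04-16nonsense ")
parse_sketch("2025-04-16nonsense ", strict = TRUE)
```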

Benchmarks

The character method of fymd() parses input strings in a fixed year, month, day order. These values must be digits but can be separated by any non-digit character. This is similar in spirit to the fastDate() function in Simon Urbanek’s fasttime package, using pure text parsing and no system calls for maximum speed.

For extremely fast parsing of POSIX style timestamps you will struggle to beat the performance of fasttime. It works fantastically for timestamps that do not need validation and that fall within the date range supported by the package (currently 1970-01-01 through to the year 2199).

fymd() fills the, admittedly small, niche where you want fast parsing of YMD strings along with date validation and support for a wider range of dates from the Proleptic Gregorian calendar (currently we support years in the range [-9999, 9999]). This additional capability does come with a small performance penalty but, hopefully, this has been kept to a minimum and the implementation remains competitive.

library(microbenchmark)

# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")

# comparison timings for fymd (character method)
cdates  <- format(dates)
(res_c <- microbenchmark(
    fasttime  = fasttime::fastDate(cdates),
    fastymd   = fymd(cdates),
    ymd       = ymd::ymd(cdates),
    lubridate = lubridate::ymd(cdates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min       lq      mean   median        uq       max neval
#>   fasttime  540.835  546.361  558.2403  549.111  555.6535   869.112   100
#>    fastymd  845.458  852.371  940.4753  857.531  867.2185  7280.352   100
#>        ymd 4155.624 4203.313 4366.6950 4232.058 4291.9950  6627.016   100
#>  lubridate 5482.736 5727.907 6812.1883 5927.512 7281.9700 40156.682   100
# comparison timings for fymd (numeric method)
ymd  <- get_ymd(dates)
(res_n <- microbenchmark(
    fastymd   = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
    lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
    check     = "equal"
))
#> Unit: microseconds
#>       expr     min       lq     mean   median       uq      max neval
#>    fastymd 325.341 327.0790 334.8967 328.3415 331.6380  425.048   100
#>  lubridate 666.872 711.4865 845.6369 715.1130 721.0945 2644.436   100
# comparison timings for year getter
(res_get_year <- microbenchmark(
    fastymd   = get_year(dates),
    ymd       = ymd::year(dates),
    lubridate = lubridate::year(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean   median        uq       max neval
#>    fastymd  358.774  359.8005  379.7869  361.504  364.8855  1996.249   100
#>        ymd  386.536  405.1010  418.9190  413.877  422.1580   852.470   100
#>  lubridate 7560.067 7575.0760 8277.1717 7592.699 7620.9320 43778.964   100
# comparison timings for month getter
(res_get_month <- microbenchmark(
    fastymd   = get_month(dates),
    ymd       = ymd::month(dates),
    lubridate = lubridate::month(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean    median        uq       max neval
#>    fastymd  324.920  326.5785  334.0995  329.3530  331.9635   506.882   100
#>        ymd  417.193  424.9580  481.9050  433.5695  443.5280  2038.328   100
#>  lubridate 8208.485 8259.5410 9011.2101 8314.5845 9826.9045 12247.831   100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
    fastymd   = get_mday(dates),
    ymd       = ymd::mday(dates),
    lubridate = lubridate::day(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean   median        uq       max neval
#>    fastymd  361.539  362.5660  382.5129  364.645  367.7400  1694.722   100
#>        ymd  421.522  428.0485  480.2683  433.875  440.4825  1730.570   100
#>  lubridate 7492.580 7509.7475 7964.2169 7529.800 8597.7065 10327.484   100
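
To put the character-method timings in perspective, the medians reported above can be compared directly. The snippet below only does arithmetic on numbers already printed in that table, which come from a single run on one machine; it is a reading aid, not a new measurement.

```r
# Median timings (microseconds) copied from the character-method table above.
med <- c(fasttime = 549.111, fastymd = 857.531, ymd = 4232.058, lubridate = 5927.512)

# Each median relative to fastymd; values above 1 are slower than fymd().
round(med / med[["fastymd"]], 2)
```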