fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast parsing of character strings and numeric values to date objects, as well as fast decomposition of these into their year, month and day components. The underlying algorithms follow the approach of Howard Hinnant for converting Gregorian calendar dates to a count of days since the UNIX epoch and vice versa.
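To give a feel for the conversion mentioned above, here is a minimal base R sketch of Hinnant's published `days_from_civil` and `civil_from_days` algorithms. The function names are taken from his write-up and are not part of the fastymd API. The package's own implementation may well differ in its details, so treat this as an illustration only.

```r
# Sketch of Hinnant's algorithms in base R (illustrative only, not fastymd code).
# R's %/% floors towards -Inf, so the negative-year adjustment in the C++
# original is not needed here.
days_from_civil <- function(y, m, d) {
  y   <- y - (m <= 2)                       # treat Jan/Feb as end of previous year
  era <- y %/% 400                          # 400-year cycle index
  yoe <- y - era * 400                      # year of era, [0, 399]
  mp  <- (m + 9) %% 12                      # month index with March = 0
  doy <- (153 * mp + 2) %/% 5 + d - 1       # day of (March-based) year
  doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy
  era * 146097 + doe - 719468               # shift so 1970-01-01 is day 0
}

civil_from_days <- function(z) {
  z   <- z + 719468
  era <- z %/% 146097
  doe <- z - era * 146097                   # day of era, [0, 146096]
  yoe <- (doe - doe %/% 1460 + doe %/% 36524 - doe %/% 146096) %/% 365
  doy <- doe - (365 * yoe + yoe %/% 4 - yoe %/% 100)
  mp  <- (5 * doy + 2) %/% 153
  d   <- doy - (153 * mp + 2) %/% 5 + 1
  m   <- ifelse(mp < 10, mp + 3, mp - 9)
  data.frame(year = yoe + era * 400 + (m <= 2), month = m, day = d)
}

days_from_civil(1970, 1, 1)  # 0, the UNIX epoch
days_from_civil(2025, 4, 16) == as.numeric(as.Date("2025-04-16"))
```

Both functions are vectorised and involve only integer arithmetic, which is a large part of why this approach is fast.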
The API should hold no surprises:
library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#> year month day
#> 1 2025 4 16
#> 2 2025 4 17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE
Invalid dates will return NA and a warning:
fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
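The kind of check behind that warning can be sketched in base R as follows. `is_valid_ymd()` is a hypothetical helper written for this illustration and is not exported by fastymd. It assumes the proleptic Gregorian leap year rule and the supported year range of [-9999, 9999] stated in this README.

```r
# Hypothetical helper (not part of fastymd): is each y/m/d triple a real date?
is_valid_ymd <- function(y, m, d) {
  # Gregorian leap year rule, applied proleptically to all years
  leap <- (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
  year_ok  <- !is.na(y) & y >= -9999 & y <= 9999
  month_ok <- !is.na(m) & m >= 1 & m <= 12
  month_lengths <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
  # index only with valid months; invalid ones give NA and fail below
  max_day <- month_lengths[ifelse(month_ok, m, NA_integer_)] + (m == 2 & leap)
  year_ok & month_ok & !is.na(d) & d >= 1 & d <= max_day
}

is_valid_ymd(2021, 2, 29)  # FALSE, 2021 is not a leap year
is_valid_ymd(2024, 2, 29)  # TRUE
```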
More interesting is the handling of any characters that follow a valid date. Consider the following timestamp:
timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt , "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2025-04-25T12:58:48+0000"
By default the time element is ignored:
(res <- fymd(timestamp))
#> [1] "2025-04-25"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE
Ignoring the trailing time element is both good and bad. For timestamps it makes perfect sense, but perhaps you have simple dates and are concerned that some are corrupted. For these you can use the strict argument:
cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA
The character method of fymd() parses input strings in a fixed year, month and day order. These values must be digits but can be separated by any non-digit character. This is similar in spirit to the fastDate() function in Simon Urbanek’s fasttime package, using pure text parsing and no system calls for maximum speed.
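As a rough illustration of that parsing rule, the sketch below mimics it with a regular expression in base R. `parse_ymd_sketch()` is a hypothetical function and not how fastymd works internally (the package avoids regular expressions for speed). It also assumes a four-digit year and one- or two-digit month and day, which may be narrower than what fymd() accepts.

```r
# Hypothetical regex-based sketch of the parsing rule (not fastymd internals).
parse_ymd_sketch <- function(x, strict = FALSE) {
  # digits, one non-digit separator, digits, one separator, digits, then the rest
  pattern <- "^([0-9]{4})[^0-9]([0-9]{1,2})[^0-9]([0-9]{1,2})(.*)$"
  parts <- regmatches(x, regexec(pattern, x))
  out <- vapply(parts, function(p) {
    if (length(p) == 0L) return(NA_character_)              # no YMD prefix
    if (strict && nzchar(p[5L])) return(NA_character_)      # trailing characters
    sprintf("%s-%02d-%02d", p[2L], as.integer(p[3L]), as.integer(p[4L]))
  }, character(1))
  # an explicit format makes impossible dates NA instead of raising an error
  as.Date(out, format = "%Y-%m-%d")
}

parse_ymd_sketch("2025/04/16") == as.Date("2025-04-16")
parse_ymd_sketch("2025-04-16nonsense", strict = TRUE)  # NA
```

Anything after the day is kept in the final capture group, so the strict check is just a test for whether that group is empty.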
For extremely fast parsing of POSIX style timestamps you will struggle to beat the performance of fasttime. This works fantastically for timestamps that do not need validation and are within the date range supported by the package (currently 1970-01-01 through to the year 2199).
fymd() fills the, admittedly small, niche where you want fast parsing of YMD strings along with date validation and support for a wider range of dates from the Proleptic Gregorian calendar (currently we support years in the range [-9999, 9999]). This additional capability does come with a small performance penalty but, hopefully, this has been kept to a minimum and the implementation remains competitive.
library(microbenchmark)
# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")
# comparison timings for fymd (character method)
cdates <- format(dates)
(res_c <- microbenchmark(
fasttime = fasttime::fastDate(cdates),
fastymd = fymd(cdates),
ymd = ymd::ymd(cdates),
lubridate = lubridate::ymd(cdates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fasttime 529.212 534.0410 614.6670 537.833 548.0680 3176.584 100
#> fastymd 775.403 780.3025 821.9901 783.288 788.7785 3701.388 100
#> ymd 4449.140 4525.8585 4643.8929 4551.756 4597.2325 7874.310 100
#> lubridate 5015.090 5166.2885 7039.9450 5314.832 6771.9085 37672.764 100
# comparison timings for fymd (numeric method)
ymd <- get_ymd(dates)
(res_n <- microbenchmark(
fastymd = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 342.993 344.7160 380.7935 347.467 349.5500 2044.251 100
#> lubridate 537.447 544.7915 616.5864 547.742 551.2075 2400.349 100
# comparison timings for year getter
(res_get_year <- microbenchmark(
fastymd = get_year(dates),
ymd = ymd::year(dates),
lubridate = lubridate::year(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 481.482 496.976 535.5263 501.881 504.3110 2243.586 100
#> ymd 498.755 507.937 543.8808 511.980 515.5915 2648.965 100
#> lubridate 7583.374 7600.496 7872.6538 7606.718 7629.4710 10319.503 100
# comparison timings for month getter
(res_get_month <- microbenchmark(
fastymd = get_month(dates),
ymd = ymd::month(dates),
lubridate = lubridate::month(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 455.033 463.5890 510.1937 467.1205 471.7345 1949.854 100
#> ymd 532.889 537.4675 584.1824 539.7115 542.1165 2039.753 100
#> lubridate 8230.087 8265.2170 8872.8169 8281.1420 8319.9645 38371.485 100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
fastymd = get_mday(dates),
ymd = ymd::mday(dates),
lubridate = lubridate::day(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 447.238 463.3640 504.0715 467.7415 471.8895 1838.937 100
#> ymd 536.435 540.2325 550.2922 542.3120 544.5660 867.186 100
#> lubridate 7529.523 7543.0935 7696.1561 7547.6175 7557.5015 8946.419 100