fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast parsing of character strings and numeric values to date objects as well as fast decomposition of these into their year, month and day components. The underlying algorithms follow the approach of Howard Hinnant for calculating days from the UNIX Epoch of Gregorian Calendar dates and vice versa.
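To make the idea concrete, here is a minimal base R sketch of the two conversions as Hinnant describes them (days from the epoch for a given year, month and day, and the reverse). This is an illustrative transcription, not the package's own source; the function names days_from_civil() and civil_from_days() are borrowed from Hinnant's write-up and are not part of the fastymd API.

```r
# Sketch only: an R transcription of Hinnant's public-domain algorithms.
# Not the fastymd implementation; no input validation is performed here.

# Days since 1970-01-01 for a proleptic Gregorian year/month/day.
days_from_civil <- function(y, m, d) {
  y   <- y - (m <= 2)                     # treat Jan/Feb as part of the previous year
  era <- y %/% 400                        # 400-year cycle (floor division)
  yoe <- y - era * 400                    # year of era, 0..399
  mp  <- ifelse(m > 2, m - 3, m + 9)      # month index with March = 0
  doy <- (153 * mp + 2) %/% 5 + d - 1     # day of (March-based) year, 0..365
  doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy  # day of era
  era * 146097 + doe - 719468             # shift so that 1970-01-01 is day 0
}

# Year, month and day for a count of days since 1970-01-01.
civil_from_days <- function(z) {
  z   <- z + 719468
  era <- z %/% 146097
  doe <- z - era * 146097
  yoe <- (doe - doe %/% 1460 + doe %/% 36524 - doe %/% 146096) %/% 365
  doy <- doe - (365 * yoe + yoe %/% 4 - yoe %/% 100)
  mp  <- (5 * doy + 2) %/% 153
  d   <- doy - (153 * mp + 2) %/% 5 + 1
  m   <- ifelse(mp < 10, mp + 3, mp - 9)
  y   <- yoe + era * 400 + (m <= 2)
  data.frame(year = y, month = m, day = d)
}

days_from_civil(1970, 1, 1)
civil_from_days(as.numeric(as.Date("2025-04-16")))
```

Both functions use only integer-style arithmetic and are vectorised, so no calendar lookups or system calls are needed.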
The API won’t give any surprises:
library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#> year month day
#> 1 2025 4 16
#> 2 2025 4 17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE
Invalid dates return NA with a warning:
fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
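As a rough illustration of what an "invalid month and/or day combination" means, the check can be sketched in base R as follows. The helpers is_leap_year() and is_valid_ymd() are hypothetical names used for this example only; they are not fastymd functions and the package's internal check may differ.

```r
# Sketch only: hypothetical helpers, not part of fastymd.

# Gregorian leap year rule.
is_leap_year <- function(y) (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)

# TRUE where year/month/day form a real calendar date.
# Assumes y, m and d are numeric vectors of the same length.
is_valid_ymd <- function(y, m, d) {
  month_ok <- !is.na(m) & m >= 1 & m <= 12
  m_safe   <- ifelse(month_ok, m, 1)      # avoid out-of-range indexing below
  month_lengths <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
  last_day <- month_lengths[m_safe] + (m_safe == 2 & is_leap_year(y))
  month_ok & !is.na(y) & !is.na(d) & d >= 1 & d <= last_day
}

is_valid_ymd(2021, 2, 29)  # FALSE: 2021 is not a leap year
is_valid_ymd(2024, 2, 29)  # TRUE
```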
More interesting is the handling of any text that follows a valid date. Consider the following timestamp:
timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt, "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2025-05-12T19:22:45+0000"
By default the time element is ignored:
(res <- fymd(timestamp))
#> [1] "2025-05-12"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE
Ignoring the time element is both good and bad. For timestamps it makes
perfect sense, but perhaps you have simple dates and are concerned that some are
corrupted. For these we can use the strict argument:
cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA
The character method of fymd() parses input strings in a fixed year, month,
day order. These values must be digits but can be separated by any non-digit
character. This is similar in spirit to the fastDate() function in Simon
Urbanek’s fasttime package, using pure text parsing and no system calls for
maximum speed.
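The parsing rule described above can be approximated with a regular expression in base R. This is only a sketch for intuition: parse_ymd_sketch() is a hypothetical helper, it assumes a four-digit year, two-digit month and day, and a single separator character, and it does no date validation. The real fymd() is implemented differently and may accept other layouts.

```r
# Sketch only: hypothetical helper, not the fastymd parser.
# Assumes "YYYY<sep>MM<sep>DD" where <sep> is any single non-digit character.
parse_ymd_sketch <- function(x, strict = FALSE) {
  pattern <- "^([0-9]{4})[^0-9]([0-9]{2})[^0-9]([0-9]{2})"
  # In strict mode nothing may follow the day component.
  if (strict) pattern <- paste0(pattern, "$")
  matches <- regmatches(x, regexec(pattern, x))
  out <- t(vapply(matches, function(p) {
    # p holds the full match plus three groups, or is empty when nothing matched.
    if (length(p) == 4L) as.integer(p[-1L]) else rep(NA_integer_, 3L)
  }, integer(3)))
  colnames(out) <- c("year", "month", "day")
  out
}

parse_ymd_sketch("2025-05-12T19:22:45+0000")             # trailing time is ignored
parse_ymd_sketch("2025-04-16nonsense ", strict = TRUE)   # all NA
```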
For extremely fast parsing of POSIX style timestamps you will struggle to beat the performance of fasttime. It works fantastically for timestamps that do not need validation and are within the date range supported by the package (currently 1970-01-01 through to the year 2199).
fymd() fills the admittedly small niche where you want fast parsing of YMD
strings along with date validation and support for a wider range of dates from
the proleptic Gregorian calendar (currently we support years in the range
[-9999, 9999]). This additional capability does come with a small performance
penalty but, hopefully, this has been kept to a minimum and the implementation
remains competitive.
library(microbenchmark)
# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")
# comparison timings for fymd (character method)
cdates <- format(dates)
(res_c <- microbenchmark(
fasttime = fasttime::fastDate(cdates),
fastymd = fymd(cdates),
ymd = ymd::ymd(cdates),
lubridate = lubridate::ymd(cdates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fasttime 529.073 534.7990 576.1368 539.618 545.3535 1901.278 100
#> fastymd 775.796 780.8305 788.5262 783.881 788.9855 959.180 100
#> ymd 4384.888 4492.1355 4554.7300 4521.390 4561.2445 5976.746 100
#> lubridate 4955.098 5070.1950 5999.4278 5150.251 6527.8450 37595.723 100
# comparison timings for fymd (numeric method)
ymd <- get_ymd(dates)
(res_n <- microbenchmark(
fastymd = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 340.839 342.167 365.6678 345.413 347.5220 1761.325 100
#> lubridate 535.485 541.316 640.3393 545.138 550.2875 2482.258 100
# comparison timings for year getter
(res_get_year <- microbenchmark(
fastymd = get_year(dates),
ymd = ymd::year(dates),
lubridate = lubridate::year(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 488.537 497.4885 540.4405 501.6010 506.5205 1720.088 100
#> ymd 497.003 504.5170 549.3673 509.8665 514.8515 2041.231 100
#> lubridate 7589.773 7604.6310 7896.1416 7612.0955 7629.4425 10253.322 100
# comparison timings for month getter
(res_get_month <- microbenchmark(
fastymd = get_month(dates),
ymd = ymd::month(dates),
lubridate = lubridate::month(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 452.900 464.857 572.1960 469.7465 474.3150 1960.229 100
#> ymd 531.428 536.847 566.3219 539.0115 544.1915 2074.063 100
#> lubridate 8215.207 8247.988 8876.5144 8271.0365 8307.0845 39568.495 100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
fastymd = get_mday(dates),
ymd = ymd::mday(dates),
lubridate = lubridate::day(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 449.464 460.855 501.8638 466.430 469.852 1802.673 100
#> ymd 535.535 539.848 577.7772 542.463 545.624 1841.536 100
#> lubridate 7531.354 7546.001 7755.3286 7553.971 7568.017 9465.392 100
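To put the character-method timings in relative terms, the short sketch below divides each median by the fastymd median. The values are copied by hand from the printed output above, so they only describe the machine that produced those timings; the variable names are illustrative.

```r
# Median timings (microseconds) copied from the character-method benchmark above.
median_us <- c(
  fasttime  = 539.618,
  fastymd   = 783.881,
  ymd       = 4521.390,
  lubridate = 5150.251
)

# Each median relative to fastymd (values below 1 are faster than fastymd).
relative <- round(median_us / median_us[["fastymd"]], 2)
relative
```

On these figures fymd() is somewhat slower than fasttime::fastDate() and several times faster than the ymd and lubridate parsers.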