R/02-dates.R
guess_date_format.Rd
This function takes a tibble and a column of interest. Each value in the
column is evaluated in turn, and the function returns the date format that
best matches the column as a whole. Candidate formats are drawn from seven
formats provided by the lubridate library. The output tibble reports the best
matching format together with the percentage of values that match it. This
information can then be used to convert the column with as_any_date(). The
default format is yyyy-mm-dd.
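The idea of a per-format matching percentage can be sketched with lubridate alone. This is a minimal illustration, not the function's actual implementation: `match_rates()` is a hypothetical helper, and the list of orders is an assumed subset of the formats the function tests.

```r
library(lubridate)

# Hypothetical helper (not part of the package): for each candidate order,
# compute the percentage of values that lubridate can parse under that order.
match_rates <- function(x, orders = c("ymd", "ydm", "dmy", "dym", "mdy", "myd")) {
  vapply(orders, function(ord) {
    # quiet = TRUE suppresses the "failed to parse" warnings;
    # values that do not fit the order come back as NA
    parsed <- parse_date_time(x, orders = ord, quiet = TRUE)
    100 * mean(!is.na(parsed))
  }, numeric(1))
}

x <- c("1983-07-19", "2003-01-14", "2010-09-29")
rates <- match_rates(x)
rates
```

The result is a named numeric vector with one percentage per order; the order with the highest rate plays the role of the best matching format.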
guess_date_format(tbl, col = NULL)
tbl: R object (data frame or tibble) to be evaluated.
col: A character string specifying the column of interest.
A tibble describing the best matching date format for the evaluated column.
Contrary to the lubridate library or as.Date(), the function evaluates the
column as a whole and does not cast the column if there is ambiguity between
values. For example, in ('19-07-1983', '02-03-1982'), 02 is taken as the day
and 03 as the month, because day-month-year is the only order that also works
for the first element (19 cannot be a month).
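The whole-column reasoning in that example can be checked by hand with lubridate's own parsers. This is only a sketch of the idea, assuming that an order is acceptable when every value parses under it; it is not the package's code.

```r
library(lubridate)

x <- c("19-07-1983", "02-03-1982")

# Parse the whole vector under each candidate order; unparseable values are NA
dmy_ok <- !is.na(dmy(x, quiet = TRUE))
mdy_ok <- !is.na(mdy(x, quiet = TRUE))

# An order is kept only if it works for every element of the column
all(dmy_ok)
all(mdy_ok)
```

Taken alone, '02-03-1982' could be read either way, but 19 is not a valid month, so month-day-year is expected to fail on the first value and day-month-year is the order that fits the column.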
{
library(tidyr)

##### Example 1 -------------------------------------------------------------
# Non-ambiguous dates ----------------------------------------------------
time <-
  tibble(time = c(
    "1983-07-19",
    "2003-01-14",
    "2010-09-29",
    "2023-12-12",
    "2009-09-03",
    "1509-11-30",
    "1809-01-01"))
guess_date_format(time)

##### Example 2 -------------------------------------------------------------
# Ambiguous dates ----------------------------------------------------
time <-
  tibble(time = c(
    "1983-19-07",
    "1983-10-13",
    "2009-09-03",
    "1509-11-30"))
guess_date_format(time)

##### Example 3 -------------------------------------------------------------
# Non date format dates --------------------------------------------------
time <-
  tibble(time = c(
    "198-07-19",
    "200-01-14",
    "201-09-29",
    "202-12-12",
    "2000-09-03",
    "150-11-3d0",
    "180-01-01"))
guess_date_format(time)
}
#> # A tibble: 1 × 4
#>   name_var `Date format` `% values formated` `Date match`
#>   <chr>    <chr>                       <dbl> <chr>
#> 1 time     ymd, ydm                     14.3 Ambiguous match