8.1 Non-text headers e.g. dates

At the time of writing, readxl doesn’t convert Excel dates to R dates when they are in the header row.

Using tidyxl and unpivotr, you can choose to make a cell of any data type into a tidy ‘header’, and you can reformat it as text before spatter() turns it into the header of a data frame. Another way to format headers as part of the behead() will be shown later.

path <- system.file("extdata", "worked-examples.xlsx", package = "unpivotr")
xlsx_cells(path, sheet = "non-text headers") %>%
  behead("left", name) %>%
  behead("up", `academic-year`) %>%
  mutate(`academic-year` = strftime(`academic-year`, "%Y")) %>%
  select(row, data_type, `academic-year`, name, numeric) %>%
  spatter(`academic-year`) %>%
  select(-row)
## # A tibble: 2 x 3
##   name     `2017` `2018`
##   <chr>     <dbl>  <dbl>
## 1 Matilda       4      2
## 2 Nicholas      3      1

When a single set of headers is of mixed data types, e.g. some character and some date, behead() chooses the correct ones using the data_type column, before converting them all to text via format().

xlsx_cells(path, sheet = "non-text headers") %>%
  select(row, col, data_type, character, numeric, date) %>%
  behead("up", header)
## # A tibble: 6 x 7
##     row   col data_type character numeric date                header    
##   <int> <int> <chr>     <chr>       <dbl> <dttm>              <chr>     
## 1     2     1 character Matilda        NA NA                  Name      
## 2     2     2 numeric   <NA>            2 NA                  2018-01-01
## 3     2     3 numeric   <NA>            4 NA                  2017-01-01
## 4     3     1 character Nicholas       NA NA                  Name      
## 5     3     2 numeric   <NA>            1 NA                  2018-01-01
## 6     3     3 numeric   <NA>            3 NA                  2017-01-01

To format a header when a single set of headers are of mixed data types, you can specify a function for each data type in the call to behead().

xlsx_cells(path, sheet = "non-text headers") %>%
  select(row, col, data_type, character, numeric, date) %>%
  behead("up", header, formatters = list(date = ~ strftime(.x, "%Y"),
                                        character = toupper))
## # A tibble: 6 x 7
##     row   col data_type character numeric date                header
##   <int> <int> <chr>     <chr>       <dbl> <dttm>              <chr> 
## 1     2     1 character Matilda        NA NA                  NAME  
## 2     2     2 numeric   <NA>            2 NA                  2018  
## 3     2     3 numeric   <NA>            4 NA                  2017  
## 4     3     1 character Nicholas       NA NA                  NAME  
## 5     3     2 numeric   <NA>            1 NA                  2018  
## 6     3     3 numeric   <NA>            3 NA                  2017