4.2 Same table in several worksheets/files (using the sheet/file name)

sheet: humanities sheet: performance

Because tidyxl() imports cells from multiple sheets into the same data frame, tables on separate sheets can be imported by mapping over the different sheets. Just name each sheet in the xlsx_cell() call, or don’t name any to import them all.

As far as tidyxl() is concerned, the particular sheet (aka ‘tab’) that a cell is on is another coordinate like row and col, so the full location of a cell is its row, its col, and its sheet.

path <- system.file("extdata", "worked-examples.xlsx", package = "unpivotr")
all_cells <-
  xlsx_cells(path, sheets = c("humanities", "performance")) %>%
  dplyr::filter(!is_blank) %>%
  select(sheet, row, col, data_type, character, numeric)
all_cells
## # A tibble: 16 x 6
##    sheet         row   col data_type character numeric
##    <chr>       <int> <int> <chr>     <chr>       <dbl>
##  1 humanities      1     2 character Matilda        NA
##  2 humanities      1     3 character Nicholas       NA
##  3 humanities      2     1 character Classics       NA
##  4 humanities      2     2 numeric   <NA>            1
##  5 humanities      2     3 numeric   <NA>            3
##  6 humanities      3     1 character History        NA
##  7 humanities      3     2 numeric   <NA>            3
##  8 humanities      3     3 numeric   <NA>            5
##  9 performance     1     2 character Matilda        NA
## 10 performance     1     3 character Nicholas       NA
## 11 performance     2     1 character Music          NA
## 12 performance     2     2 numeric   <NA>            5
## 13 performance     2     3 numeric   <NA>            9
## 14 performance     3     1 character Drama          NA
## 15 performance     3     2 numeric   <NA>            7
## 16 performance     3     3 numeric   <NA>           12

To prepare the sheets to be mapped over, use tidyr::nest(). The data column contains the cells of each sheet.

all_cells %>%
  nest(-sheet)
## Warning: All elements of `...` must be named.
## Did you want `data = c(row, col, data_type, character, numeric)`?
## # A tibble: 2 x 2
##   sheet       data            
##   <chr>       <list>          
## 1 humanities  <tibble [8 × 5]>
## 2 performance <tibble [8 × 5]>

The function to unpivot each table in this case will be a couple of behead() statements. Further clean-up can be saved until the end.

unpivot <- function(cells) {
  cells %>%
    behead("up", name) %>%
    behead("left", subject)
}

After mapping the unpivot function over each sheet of cells, use tidyr::unnest() to show every row of data again.

all_cells %>%
  nest(-sheet) %>%
  mutate(data = map(data, unpivot)) %>%
  unnest()
## Warning: All elements of `...` must be named.
## Did you want `data = c(row, col, data_type, character, numeric)`?
## Warning: `cols` is now required.
## Please use `cols = c(data)`
## # A tibble: 8 x 8
##   sheet         row   col data_type character numeric name     subject 
##   <chr>       <int> <int> <chr>     <chr>       <dbl> <chr>    <chr>   
## 1 humanities      2     2 numeric   <NA>            1 Matilda  Classics
## 2 humanities      2     3 numeric   <NA>            3 Nicholas Classics
## 3 humanities      3     2 numeric   <NA>            3 Matilda  History 
## 4 humanities      3     3 numeric   <NA>            5 Nicholas History 
## 5 performance     2     2 numeric   <NA>            5 Matilda  Music   
## 6 performance     2     3 numeric   <NA>            9 Nicholas Music   
## 7 performance     3     2 numeric   <NA>            7 Matilda  Drama   
## 8 performance     3     3 numeric   <NA>           12 Nicholas Drama

Finally, do the clean-up operations that were saved until now.

all_cells %>%
  nest(-sheet) %>%
  mutate(data = map(data, unpivot)) %>%
  unnest() %>%
  transmute(field = sheet,
            name,
            subject,
            score = numeric)
## Warning: All elements of `...` must be named.
## Did you want `data = c(row, col, data_type, character, numeric)`?
## Warning: `cols` is now required.
## Please use `cols = c(data)`
## # A tibble: 8 x 4
##   field       name     subject  score
##   <chr>       <chr>    <chr>    <dbl>
## 1 humanities  Matilda  Classics     1
## 2 humanities  Nicholas Classics     3
## 3 humanities  Matilda  History      3
## 4 humanities  Nicholas History      5
## 5 performance Matilda  Music        5
## 6 performance Nicholas Music        9
## 7 performance Matilda  Drama        7
## 8 performance Nicholas Drama       12