4.3 Same table in several worksheets/files but in different positions

sheet: female sheet: male

This is almost the same as the section “Same table in several worksheets/files (using the sheet/file name)”. The only difference is that the function you write to unpivot the table must also find the table in the first place, and be robust to differences in the placement and context of the table on each sheet.

In this example, both tables begin in the same column, but there is an extra row of notes above one of the tables. There are a few ways to tackle this problem. Here, we filter for the Subject cell, which is either A3 or A4, and then extend the selection to include the whole table.

path <- system.file("extdata", "worked-examples.xlsx", package = "unpivotr")
all_cells <-
  xlsx_cells(path, sheets = c("female", "male")) %>%
  dplyr::filter(!is_blank) %>%
  select(sheet, row, col, data_type, character, numeric)
all_cells
## # A tibble: 21 x 6
##    sheet    row   col data_type character       numeric
##    <chr>  <int> <int> <chr>     <chr>             <dbl>
##  1 female     1     1 character Table of scores      NA
##  2 female     3     1 character Subject              NA
##  3 female     3     2 character Matilda              NA
##  4 female     3     3 character Olivia               NA
##  5 female     4     1 character Classics             NA
##  6 female     4     2 numeric   <NA>                  1
##  7 female     4     3 numeric   <NA>                  2
##  8 female     5     1 character History              NA
##  9 female     5     2 numeric   <NA>                  3
## 10 female     5     3 numeric   <NA>                  4
## # … with 11 more rows
unpivot <- function(cells) {
  cells %>%
    dplyr::filter(character == "Subject") %>%
    pull(row) %>%
    {dplyr::filter(cells, row >= .)} %>%
    behead("up", name) %>%
    behead("left", subject)
}

all_cells %>%
  nest(-sheet) %>%
  mutate(data = map(data, unpivot)) %>%
  unnest() %>%
  select(sex = sheet, name, subject, score = numeric)
## Warning: All elements of `...` must be named.
## Did you want `data = c(row, col, data_type, character, numeric)`?
## Warning: `cols` is now required.
## Please use `cols = c(data)`
## # A tibble: 8 x 4
##   sex    name     subject  score
##   <chr>  <chr>    <chr>    <dbl>
## 1 female Matilda  Classics     1
## 2 female Olivia   Classics     2
## 3 female Matilda  History      3
## 4 female Olivia   History      4
## 5 male   Nicholas Classics     3
## 6 male   Paul     Classics     0
## 7 male   Nicholas History      5
## 8 male   Paul     History      1