2.5 Layered meaningful formatting
Sometimes different kinds of formatting relate to clearly different aspects of an observation, e.g. yellow highlight for “uncertain data” and red text for “product no longer available”. Both yellow highlighting and red text in the same row would indicate uncertain data and unavailability of the product at the same time.
Deal with it by reading each kind of formatting into a separate column, e.g. fill colour into one column, font colour into another, bold/not-bold into a another, etc.
# Step 1: import the table taking only cell values and ignoring the formatting
path <- system.file("extdata", "worked-examples.xlsx", package = "unpivotr")
x <- read_excel(path, sheet = "combined-highlights")
# Step 2: import one kind of formatting of one column of the table
# `formats` is a pallette of fill colours that can be indexed by the
# `local_format_id` of a given cell to get the fill colour of that cell
fill_colours <- xlsx_formats(path)$local$fill$patternFill$fgColor$rgb
font_colours <- xlsx_formats(path)$local$font$color$rgb
# Import all the cells, filter out the header row, filter for the first column,
# and create a new column `fill` of the fill colours, by looking up the
# local_format_id of each cell in the `formats` pallette.
formats <-
xlsx_cells(path, sheet = "combined-highlights") %>%
dplyr::filter(row >= 2, col == 1) %>% # Omit the header row
mutate(fill_colour = fill_colours[local_format_id],
font_colour = font_colours[local_format_id]) %>%
select(fill_colour, font_colour)
# Step 3: append the `fill` column to the rest of the data
bind_cols(x, formats)
## # A tibble: 4 x 5
## Name Weight Price fill_colour font_colour
## <chr> <dbl> <dbl> <chr> <chr>
## 1 Knife 7 8 <NA> FF000000
## 2 Fork 5 6 FFFFFF00 FF000000
## 3 Spoon 3 4 <NA> FFFF0000
## 4 Teaspoon 1 2 FFFFFF00 FFFF0000
Here’s the same thing, but using only tidyxl and unpivotr.
fill_colours <- xlsx_formats(path)$local$fill$patternFill$fgColor$rgb
font_colours <- xlsx_formats(path)$local$font$color$rgb
cells <-
xlsx_cells(path, sheet = "combined-highlights") %>%
mutate(fill_colour = fill_colours[local_format_id],
font_colour = font_colours[local_format_id]) %>%
select(row, col, data_type, character, numeric, fill_colour, font_colour) %>%
behead("up", header) %>%
behead("left", Name) %>%
select(-col, -character)
values <-
cells %>%
select(-fill_colour, -font_colour) %>%
spread(header, numeric)
formats <- distinct(cells, row, fill_colour, font_colour)
left_join(values, formats, by = "row") %>%
select(-row)
## # A tibble: 4 x 6
## data_type Name Price Weight fill_colour font_colour
## <chr> <chr> <dbl> <dbl> <chr> <chr>
## 1 numeric Knife 8 7 <NA> FF000000
## 2 numeric Fork 6 5 FFFFFF00 <NA>
## 3 numeric Spoon 4 3 <NA> FFFF0000
## 4 numeric Teaspoon 2 1 FFFFFF00 FFFF0000