2.5 Layered meaningful formatting

Sometimes different kinds of formatting relate to clearly different aspects of an observation, e.g. yellow highlight for “uncertain data” and red text for “product no longer available”. Both yellow highlighting and red text in the same row would indicate uncertain data and unavailability of the product at the same time.

Deal with it by reading each kind of formatting into a separate column, e.g. fill colour into one column, font colour into another, bold/not-bold into a another, etc.

# Step 1: import the table taking only cell values and ignoring the formatting
path <- system.file("extdata", "worked-examples.xlsx", package = "unpivotr")
x <- read_excel(path, sheet = "combined-highlights")

# Step 2: import one kind of formatting of one column of the table

# `formats` is a pallette of fill colours that can be indexed by the
# `local_format_id` of a given cell to get the fill colour of that cell
fill_colours <- xlsx_formats(path)$local$fill$patternFill$fgColor$rgb
font_colours <- xlsx_formats(path)$local$font$color$rgb

# Import all the cells, filter out the header row, filter for the first column,
# and create a new column `fill` of the fill colours, by looking up the
# local_format_id of each cell in the `formats` pallette.
formats <-
  xlsx_cells(path, sheet = "combined-highlights") %>%
  dplyr::filter(row >= 2, col == 1) %>% # Omit the header row
  mutate(fill_colour = fill_colours[local_format_id],
         font_colour = font_colours[local_format_id]) %>%
  select(fill_colour, font_colour)

# Step 3: append the `fill` column to the rest of the data
bind_cols(x, formats)
## # A tibble: 4 x 5
##   Name     Weight Price fill_colour font_colour
##   <chr>     <dbl> <dbl> <chr>       <chr>      
## 1 Knife         7     8 <NA>        FF000000   
## 2 Fork          5     6 FFFFFF00    FF000000   
## 3 Spoon         3     4 <NA>        FFFF0000   
## 4 Teaspoon      1     2 FFFFFF00    FFFF0000

Here’s the same thing, but using only tidyxl and unpivotr.

fill_colours <- xlsx_formats(path)$local$fill$patternFill$fgColor$rgb
font_colours <- xlsx_formats(path)$local$font$color$rgb

cells <-
  xlsx_cells(path, sheet = "combined-highlights") %>%
  mutate(fill_colour = fill_colours[local_format_id],
         font_colour = font_colours[local_format_id]) %>%
  select(row, col, data_type, character, numeric, fill_colour, font_colour) %>%
  behead("up", header) %>%
  behead("left", Name) %>%
  select(-col, -character)

values <-
  cells %>%
  select(-fill_colour, -font_colour) %>%
  spread(header, numeric)

formats <- distinct(cells, row, fill_colour, font_colour)

left_join(values, formats, by = "row") %>%
  select(-row)
## # A tibble: 4 x 6
##   data_type Name     Price Weight fill_colour font_colour
##   <chr>     <chr>    <dbl>  <dbl> <chr>       <chr>      
## 1 numeric   Knife        8      7 <NA>        FF000000   
## 2 numeric   Fork         6      5 FFFFFF00    <NA>       
## 3 numeric   Spoon        4      3 <NA>        FFFF0000   
## 4 numeric   Teaspoon     2      1 FFFFFF00    FFFF0000