Pack cell values from separate columns per data type into one list-column
pack( cells, types = data_type, name = "value", drop_types = TRUE, drop_type_cols = TRUE ) unpack(cells, values = value, name = "data_type", drop_packed = TRUE)
cells | A data frame of cells, one row per cell. For |
---|---|
types | For |
name | A string. For |
drop_types | For |
drop_type_cols | For |
values | For |
drop_packed | For |
When cells are represented by rows of a data frame, the values of the cells
will be in different columns according to their data type. For example, the
value of a cell containing text will be in a column called chr
(or
character
if it came via tidyxl). A column called data_type
names, for
each cell, which column its value is in.
pack()
rearranges the cell values in a different way, so that they are all
in one column, by
taking each cell value, from whichever column.
making it an element of a list.
naming each element according to the column it came from.
making the list into a new list-column of the original data frame.
By default, the original columns are dropped, and so is the data_type
column.
unpack()
is the complement.
This can be useful for dropping all columns of cells
except the ones that
contain data. For example, tidyxl::xlsx_cells()
returns a very wide data
frame, and to make it narrow you might do:
select(cells, row, col, character, numeric, date)
But what if you don't know in advance that the data types you need are
character
, numeric
and date
? You might also need logical
and
error
.
Instead, pack()
all the data types into a single column, select it, and
then unpack.
pack(cells) %>% select(row, col, value) %>% unpack()
unpack
: Unpack cell values from one list-column into separate columns per data type
#> foo bar #> 1 1 a #> 2 2 b# The same data, represented by one row per cell, with integer values in the # `int` column and character values in the `chr` column. x <- as_cells(w) x#> # A tibble: 4 x 5 #> row col data_type chr int #> <int> <int> <chr> <chr> <int> #> 1 1 1 int NA 1 #> 2 2 1 int NA 2 #> 3 1 2 chr a NA #> 4 2 2 chr b NA# pack() and unpack() are complements pack(x)#> # A tibble: 4 x 3 #> row col value #> <int> <int> <named list> #> 1 1 1 <int [1]> #> 2 2 1 <int [1]> #> 3 1 2 <chr [1]> #> 4 2 2 <chr [1]>unpack(pack(x))#> # A tibble: 4 x 5 #> row col data_type chr int #> <int> <int> <chr> <chr> <int> #> 1 1 1 int NA 1 #> 2 2 1 int NA 2 #> 3 1 2 chr a NA #> 4 2 2 chr b NA# Drop non-data columns from a wide data frame of cells from tidyxl if (require(tidyxl)) { cells <- tidyxl::xlsx_cells(system.file("extdata", "purpose.xlsx", package = "unpivotr")) cells pack(cells) %>% dplyr::select(row, col, value) %>% unpack() }#>#> # A tibble: 1,721 x 7 #> row col data_type blank character error numeric #> <int> <int> <chr> <lgl> <chr> <chr> <dbl> #> 1 1 1 character NA The data is 'Sense of purpose by h… NA NA #> 2 2 1 character NA It can be found in the section 'Pe… NA NA #> 3 3 1 character NA The description provided by Statis… NA NA #> 4 5 1 character NA The 2014 New Zealand General Socia… NA NA #> 5 6 1 character NA It provides new and redeveloped da… NA NA #> 6 7 1 character NA Symbols used in this table: NA NA #> 7 8 1 character NA S - Data has been suppressed. NA NA #> 8 9 1 character NA * - Relative sampling error of 50 … NA NA #> 9 10 1 character NA For more tables using the NZGSS 20… NA NA #> 10 11 1 character NA Data quality NA NA #> # … with 1,711 more rows