This is an unofficial client for the GOV.UK Registers API.

Registers are authoritative lists of things, built and maintained by the UK government. For example, the country register is a list of countries.

It doesn’t really wrap the API. Instead, it downloads the ‘raw’ registers in RSF (Register Serialisation Format – not yet documented publicly), and parses that.

Installation

To install a very early version of the package for running old scripts:

Examples

Download registers

Download a single register.

Download all registers.

By default, the ‘beta’ (‘ready to use’) versions of registers are downloaded. If you need alpha (‘open for feedback’) registers, use phase = "alpha".

Explore register schema and data

The schema and data are in $schema and $data.

country <- registers$country
country$schema
#> $ids
#> # A tibble: 1 x 6
#>   `entry-number` type   key   timestamp           hash               name 
#>            <int> <chr>  <chr> <dttm>              <chr>              <chr>
#> 1              1 system name  2017-07-17 10:59:47 d3d8e15fbd410e08b… coun…
#> 
#> $names
#> # A tibble: 0 x 5
#> # ... with 5 variables: `entry-number` <int>, type <chr>, key <chr>,
#> #   timestamp <dttm>, hash <chr>
#> 
#> $custodians
#> # A tibble: 2 x 6
#>   `entry-number` type   key       timestamp           hash       custodian
#>            <int> <chr>  <chr>     <dttm>              <chr>      <chr>    
#> 1              2 system custodian 2017-07-17 10:59:47 6bdb76b1c… Tony Wor…
#> 2             12 system custodian 2017-11-02 11:18:00 aa98858fc… David de…
#> 
#> $fields
#> # A tibble: 8 x 11
#>   `entry-number` type   key    timestamp           hash     field datatype
#>            <int> <chr>  <chr>  <dttm>              <chr>    <chr> <chr>   
#> 1              3 system field… 2017-01-10 17:16:07 a303d05… coun… string  
#> 2              4 system field… 2017-01-10 17:16:07 a7a9f22… name  string  
#> 3              5 system field… 2017-01-10 17:16:07 5c4728f… offi… string  
#> 4              6 system field… 2017-01-10 17:16:07 494f6fa… citi… string  
#> 5              7 system field… 2017-01-10 17:16:07 1cff4c6… star… datetime
#> 6              8 system field… 2017-01-10 17:16:07 a557fa9… end-… datetime
#> 7             10 system field… 2017-08-29 11:30:00 f09c439… star… datetime
#> 8             11 system field… 2017-08-29 11:31:00 c5845bf… end-… datetime
#> # ... with 4 more variables: phase <chr>, register <chr>,
#> #   cardinality <chr>, text <chr>
country$data
#> # A tibble: 207 x 11
#>    `entry-number` type  key   timestamp           hash       country name 
#>             <int> <chr> <chr> <dttm>              <chr>      <chr>   <chr>
#>  1             13 user  SU    2016-04-05 13:23:05 e94c4a9ab… SU      USSR 
#>  2             14 user  DE    2016-04-05 13:23:05 e03f97c28… DE      West…
#>  3             15 user  DD    2016-04-05 13:23:05 e1357671d… DD      East…
#>  4             16 user  YU    2016-04-05 13:23:05 a074752a7… YU      Yugo…
#>  5             17 user  CS    2016-04-05 13:23:05 0031f311f… CS      Czec…
#>  6             18 user  GB    2016-04-05 13:23:05 6b1869387… GB      Unit…
#>  7             19 user  AF    2016-04-05 13:23:05 6bf7f01f2… AF      Afgh…
#>  8             20 user  AL    2016-04-05 13:23:05 9d04a7e04… AL      Alba…
#>  9             21 user  DZ    2016-04-05 13:23:05 3548cdf52… DZ      Alge…
#> 10             22 user  AD    2016-04-05 13:23:05 14fcb5099… AD      Ando…
#> # ... with 197 more rows, and 4 more variables: `official-name` <chr>,
#> #   `citizen-names` <chr>, `start-date` <chr>, `end-date` <chr>

You probably want to take a snapshot first. A snapshot keeps only the latest version of the schema and the latest version of each record (e.g. the most recent name of a country).
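To make the idea of "latest version of each record" concrete, here is a minimal base-R sketch. It is not the package's own implementation: the `entries` data frame and the `latest_entries()` helper are hypothetical, and the sketch assumes that a higher entry number means a more recent entry.

```r
# Hypothetical entries: two versions of the record "DE", one of "GB".
entries <- data.frame(
  entry_number = c(1L, 2L, 3L),
  key = c("DE", "GB", "DE"),
  name = c("West Germany", "United Kingdom", "Germany"),
  stringsAsFactors = FALSE
)

# latest_entries() is a hypothetical helper, not part of the package.
# It keeps the entry with the highest entry number for each key.
latest_entries <- function(x) {
  # Sort newest first so that duplicated() flags the older entries.
  ordered <- x[order(x$entry_number, decreasing = TRUE), ]
  latest <- ordered[!duplicated(ordered$key), ]
  # Restore ascending entry order and tidy the row names.
  latest <- latest[order(latest$entry_number), ]
  rownames(latest) <- NULL
  latest
}

snapshot <- latest_entries(entries)
```

Here `snapshot` holds one row per key, and the row for "DE" is the later of its two entries.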

A field can hold more than one value per entry if it has the property cardinality = 'n'. In that case the field is a list-column, where each element is a vector of values.
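If list-columns are unfamiliar, the following base-R sketch shows the shape of such a field and one way to flatten it. The `authorities` data frame and its values are made up for illustration and do not come from any register.

```r
# Hypothetical register data with one cardinality = 'n' field.
authorities <- data.frame(key = c("A", "B"), stringsAsFactors = FALSE)

# Each element of the list-column is a character vector of any length.
authorities$areas <- list(c("north", "south"), "east")

# Count the values held by each entry.
n_values <- lengths(authorities$areas)

# Flatten to one row per value, repeating the key as needed.
long <- data.frame(
  key = rep(authorities$key, times = n_values),
  area = unlist(authorities$areas),
  stringsAsFactors = FALSE
)
```

The flattened `long` data frame has one row per value, which is often easier to filter or join than the list-column itself.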

Linked registers

Registers link in two ways.

  • Via a field with the "register" property set in the schema. This is like a foreign/primary key relationship in a relational database.
  • Via CURIEs of the form "prefix:reference", where prefix is the name of a register, and reference is a value in the primary field of that register (the field with the same name as the register).
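As an illustration of the "prefix:reference" form, this base-R sketch splits CURIEs into their two parts. `parse_curie()` is a hypothetical helper written for this example, not a function of the package, and it assumes the prefix itself contains no colon.

```r
# parse_curie() is a hypothetical helper, not part of the package.
# It splits each CURIE at its first colon.
parse_curie <- function(curie) {
  # Position of the first ":" in each string.
  pos <- as.integer(regexpr(":", curie, fixed = TRUE))
  data.frame(
    prefix = substr(curie, 1, pos - 1),
    reference = substr(curie, pos + 1, nchar(curie)),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_curie(c("country:GB", "local-authority-eng:LBO"))
```

The `prefix` column then names the register to look in, and `reference` is the value to match against that register's primary field.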

Resolve links with the rr_resolve_*() family of functions. Because links refer to whole records, whole records are returned in a list-column of data frames.

If a matching record has multiple entries, every entry is returned in a multi-row data frame.
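The shape of the result can be sketched in base R without the package. The example below is an assumption-laden illustration, not how rr_resolve_*() works internally: both data frames and their values are hypothetical, and matching is done on a plain `key` column.

```r
# Hypothetical target register with two entries for the record "DE".
country <- data.frame(
  key = c("DE", "DE", "GB"),
  name = c("West Germany", "Germany", "United Kingdom"),
  stringsAsFactors = FALSE
)

# Hypothetical linking register whose `country` field refers to country$key.
territory <- data.frame(
  key = c("T1", "T2"),
  country = c("DE", "GB"),
  stringsAsFactors = FALSE
)

# For each link, collect every matching entry as its own data frame.
territory$resolved <- lapply(
  territory$country,
  function(k) country[country$key == k, , drop = FALSE]
)
```

Here the first element of `territory$resolved` has two rows, because the record "DE" has two entries, and the second has one.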

If a linking field has cardinality = 'n', a list of data frames is returned for each entry.

You can resolve to only the latest entry of each record by creating a registers object with snapshots.

Plot the links between registers with something like the ggraph package.

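library(tidygraph)
library(ggraph)

# Build a graph from the output of rr_links() and plot it.
registers$`statistical-geography` %>%
  rr_links() %>%
  as_tbl_graph() %>%
  ggraph(layout = "nicely") +
    # Fade each edge along its length to indicate direction.
    geom_edge_fan(aes(alpha = ..index..), show.legend = FALSE) +
    # Draw links from a register to itself.
    geom_edge_loop() +
    geom_node_label(aes(label = name)) +
    theme_void()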

Index registers

You can index registers by any column, using CURIE-like syntax.

country <- registers$country
rr_index(country, "start-date")
#> # A tibble: 50 x 2
#>    .curie             .data              
#>    <chr>              <list>             
#>  1 country:           <tibble [155 × 11]>
#>  2 country:1975-11-11 <tibble [1 × 11]>  
#>  3 country:1981-11-01 <tibble [1 × 11]>  
#>  4 country:1991-09-21 <tibble [1 × 11]>  
#>  5 country:1991-08-30 <tibble [1 × 11]>  
#>  6 country:1991-08-25 <tibble [1 × 11]>  
#>  7 country:1981-09-21 <tibble [1 × 11]>  
#>  8 country:1992-03-03 <tibble [1 × 11]>  
#>  9 country:1984-01-01 <tibble [1 × 11]>  
#> 10 country:1975-07-05 <tibble [1 × 11]>  
#> # ... with 40 more rows
rr_index(country, "end-date")
#> # A tibble: 5 x 2
#>   .curie             .data              
#>   <chr>              <list>             
#> 1 country:1991-12-25 <tibble [1 × 11]>  
#> 2 country:1990-10-02 <tibble [2 × 11]>  
#> 3 country:1992-04-28 <tibble [1 × 11]>  
#> 4 country:1992-12-31 <tibble [1 × 11]>  
#> 5 country:           <tibble [202 × 11]>
rr_index(country)
#> # A tibble: 199 x 2
#>    .curie     .data            
#>    <chr>      <list>           
#>  1 country:SU <tibble [1 × 11]>
#>  2 country:DE <tibble [2 × 11]>
#>  3 country:DD <tibble [1 × 11]>
#>  4 country:YU <tibble [1 × 11]>
#>  5 country:CS <tibble [1 × 11]>
#>  6 country:GB <tibble [1 × 11]>
#>  7 country:AF <tibble [1 × 11]>
#>  8 country:AL <tibble [1 × 11]>
#>  9 country:DZ <tibble [1 × 11]>
#> 10 country:AD <tibble [1 × 11]>
#> # ... with 189 more rows
rr_index(registers$`local-authority-eng`, "local-authority-type")
#> # A tibble: 7 x 2
#>   .curie                  .data              
#>   <chr>                   <list>             
#> 1 local-authority-eng:UA  <tibble [56 × 11]> 
#> 2 local-authority-eng:MD  <tibble [37 × 11]> 
#> 3 local-authority-eng:CTY <tibble [29 × 11]> 
#> 4 local-authority-eng:NMD <tibble [203 × 11]>
#> 5 local-authority-eng:LBO <tibble [32 × 11]> 
#> 6 local-authority-eng:CC  <tibble [1 × 11]>  
#> 7 local-authority-eng:SRA <tibble [1 × 11]>
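
The structure of an index can be approximated in base R. The sketch below is only an illustration of grouping rows under CURIE-like keys: `index_by()` and the `authority` data are hypothetical, and unlike the output above it drops rows whose value is missing.

```r
# Hypothetical register data.
authority <- data.frame(
  key = c("A", "B", "C"),
  type = c("UA", "MD", "UA"),
  stringsAsFactors = FALSE
)

# index_by() is a hypothetical helper, not part of the package.
# It groups rows by one column and labels each group "register:value".
index_by <- function(x, register, column) {
  # split() returns one data frame per distinct value, sorted by value.
  groups <- split(x, x[[column]])
  index <- data.frame(
    .curie = paste0(register, ":", names(groups)),
    stringsAsFactors = FALSE
  )
  # Store the grouped rows as a list-column of data frames.
  index$.data <- unname(groups)
  index
}

index <- index_by(authority, "authority", "type")
```

Each row of `index` pairs a CURIE-like key with the data frame of rows that share that value.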