Load Current Population Survey (CPS) microdata into R using the Census Bureau Data API, including basic monthly CPS and CPS ASEC microdata.
Note: This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
For a Python version of this package, check out PyCPS.
To install cpsR, run the following code:
install.packages("cpsR")
To install the development version of cpsR, run the following code:
# install.packages("devtools")
devtools::install_github("matt-saenz/cpsR")
In order to use cpsR functions, you must supply a Census API key in one of two ways:
1. Using the key argument (manually)
2. Using environment variable CENSUS_API_KEY (automatically)
Using environment variable (or env var, for short) CENSUS_API_KEY is strongly recommended because it keeps your key out of your scripts. This matters if you plan to share your code with others (like in the example below), since you should keep your key secret.
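The two ways of supplying a key can be sketched in base R as follows. Note that resolve_key() is a hypothetical helper written for illustration only; it is not a cpsR function, and cpsR's own lookup logic may differ.

```r
# Hypothetical helper (not part of cpsR): prefer an explicitly supplied key,
# otherwise fall back to the CENSUS_API_KEY environment variable.
resolve_key <- function(key = NULL) {
  if (!is.null(key)) {
    return(key)
  }

  # Sys.getenv() returns "" when the env var is unset
  env_key <- Sys.getenv("CENSUS_API_KEY")

  if (!nzchar(env_key)) {
    stop("No key supplied and env var CENSUS_API_KEY is not set", call. = FALSE)
  }

  env_key
}
```

With the env var set, resolve_key() returns the stored key without the key ever appearing in the script; resolve_key("your_key_here") mirrors passing the key argument manually.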
You can set up env var CENSUS_API_KEY in two steps:
First, open your .Renviron file. You can do so by running:
# install.packages("usethis")
usethis::edit_r_environ()
Second, add your Census API key to your .Renviron file like so:
CENSUS_API_KEY='your_key_here'
This enables cpsR functions to automatically look up your key by running:
Sys.getenv("CENSUS_API_KEY")

library(cpsR)
library(dplyr)
library(purrr)
# Simple use of the basic monthly CPS
sep21 <- get_basic(
  year = 2021,
  month = 9,
  vars = c("prpertyp", "prtage", "pemlr", "pwcmpwgt")
)
sep21
#> # A tibble: 103,858 × 4
#>    prpertyp prtage pemlr pwcmpwgt
#>       <int>  <int> <int>    <dbl>
#>  1        2     80     5    1361.
#>  2        2     85     5    1411.
#>  3        2     80     5    4619.
#>  4        2     80     5    4587.
#>  5        2     42     1    3677.
#>  6        2     42     1    3645.
#>  7        1      9    -1       0 
#>  8        2     41     1    3652.
#>  9        2     32     7    4117.
#> 10        2     67     1    2479.
#> # ℹ 103,848 more rows
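# Weighted employment-to-population ratio for the population 16 and older
sep21 %>%
  filter(prpertyp == 2 & prtage >= 16) %>%
  summarize(
    pop16plus = sum(pwcmpwgt),
    # pemlr values 1 and 2 are counted as employed
    employed = sum(pwcmpwgt[pemlr %in% 1:2])
  ) %>%
  mutate(epop_ratio = employed / pop16plus)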
#> # A tibble: 1 × 3
#>    pop16plus   employed epop_ratio
#>        <dbl>      <dbl>      <dbl>
#> 1 261765646. 154025931.      0.588
# Pulling multiple years of CPS ASEC microdata
asec <- map_dfr(2020:2021, get_asec, vars = c("h_year", "marsupwt"))
count(asec, h_year, wt = marsupwt)
#> # A tibble: 2 × 2
#>   h_year          n
#>    <int>      <dbl>
#> 1   2020 325268182.
#> 2   2021 326195440.
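
The weighted employment-to-population calculation in the example above can be traced by hand on a few rows. The sketch below uses base R and a made-up data frame (the values are illustrative assumptions, not real CPS records) so that it runs without an API key.

```r
# Toy data, made up for illustration; column names match the example above
toy <- data.frame(
  prpertyp = c(2, 2, 2, 1, 2),
  prtage   = c(30, 45, 70, 9, 16),
  pemlr    = c(1, 2, 5, -1, 3),
  pwcmpwgt = c(1000, 2000, 1500, 0, 500)
)

# Same filter as the example above: prpertyp == 2 and age 16 or older
adults <- toy[toy$prpertyp == 2 & toy$prtage >= 16, ]

# Sum of weights is the population estimate
pop16plus <- sum(adults$pwcmpwgt)

# Sum of weights for rows with pemlr 1 or 2 is the employed estimate
employed <- sum(adults$pwcmpwgt[adults$pemlr %in% 1:2])

epop_ratio <- employed / pop16plus
epop_ratio
#> [1] 0.6
```

Each row's weight stands for the number of people that row represents, so every estimate is a sum of weights rather than a count of rows.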