2  The Dataset

We use data from the National Morbidity and Mortality Air Pollution Study (NMMAPS), restricted to Chicago and the years 1997 to 2000 to keep the plots manageable. For more background on this dataset, see Roger Peng and Francesca Dominici’s book Statistical Methods for Environmental Epidemiology with R.

To import the data into our R session, we can use read_csv() from the {readr} package and store the result in a variable named chic using the assignment arrow <-. Just copy and paste the following code.

chic <- readr::read_csv("https://raw.githubusercontent.com/rana2hin/ggplot_guide/master/chicago_data.csv")
Rows: 1461 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): city, season, month
dbl  (6): temp, o3, dewpoint, pm10, yday, year
date (1): date

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Using the Namespace Directly

The :: operator calls a function from a package’s namespace without attaching the package to the search path. Alternatively, you could attach the readr package first with library(readr) and then run chic <- read_csv(...).
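
The two calling styles can be compared side by side. The sketch below uses a two-row inline CSV (values copied from the first rows of chic) instead of the remote file, so it runs offline. The names csv_text, a, and b are illustrative only, and passing literal data with I() assumes readr 2.0 or later.

```r
# Two rows of inline CSV data, wrapped in I() so readr treats it as literal text
csv_text <- I("city,date,temp\nchic,1997-01-01,36.0\nchic,1997-01-02,45.0\n")

# Style 1: call the function through its namespace; readr is not attached
a <- readr::read_csv(csv_text, show_col_types = FALSE)

# Style 2: attach the package, then call the function by its bare name
library(readr)
b <- read_csv(csv_text, show_col_types = FALSE)

# Both calls parse the same values
identical(a$temp, b$temp)
```

The namespace style is handy when only one or two functions from a package are needed; attaching with library() saves typing when a package is used throughout a script.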

tibble::glimpse(chic)
Rows: 1,461
Columns: 10
$ city     <chr> "chic", "chic", "chic", "chic", "chic", "chic", "chic", "chic…
$ date     <date> 1997-01-01, 1997-01-02, 1997-01-03, 1997-01-04, 1997-01-05, …
$ temp     <dbl> 36.0, 45.0, 40.0, 51.5, 27.0, 17.0, 16.0, 19.0, 26.0, 16.0, 1…
$ o3       <dbl> 5.659256, 5.525417, 6.288548, 7.537758, 20.760798, 14.940874,…
$ dewpoint <dbl> 37.500, 47.250, 38.000, 45.500, 11.250, 5.750, 7.000, 17.750,…
$ pm10     <dbl> 13.052268, 41.948600, 27.041751, 25.072573, 15.343121, 9.3646…
$ season   <chr> "Winter", "Winter", "Winter", "Winter", "Winter", "Winter", "…
$ yday     <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
$ month    <chr> "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan"…
$ year     <dbl> 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1…
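
The import message suggests specifying column types, and the glimpse output shows what readr guessed. As a hedged sketch of what an explicit specification could look like, the example below declares types for a subset of the columns on three inline rows copied from chic. Reading yday and year as integers is our own choice for illustration (readr guessed double for both), and csv_text and chic_small are hypothetical names.

```r
library(readr)

# Three rows copied from the first rows of chic, used as inline data
csv_text <- I("city,date,temp,season,yday,month,year
chic,1997-01-01,36.0,Winter,1,Jan,1997
chic,1997-01-02,45.0,Winter,2,Jan,1997
chic,1997-01-03,40.0,Winter,3,Jan,1997
")

# Declare each column's type up front instead of letting readr guess
chic_small <- read_csv(
  csv_text,
  col_types = cols(
    city   = col_character(),
    date   = col_date(),
    temp   = col_double(),
    season = col_character(),
    yday   = col_integer(),  # assumption: whole-number day of year
    month  = col_character(),
    year   = col_integer()   # assumption: whole-number year
  )
)

# Confirm the declared types were applied
sapply(chic_small, function(col) class(col)[1])
```

Because col_types is supplied, readr does not print the column specification message for this call.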
library(gt)
head(chic, 10) %>% gt()
city date temp o3 dewpoint pm10 season yday month year
chic 1997-01-01 36.0 5.659256 37.500 13.052268 Winter 1 Jan 1997
chic 1997-01-02 45.0 5.525417 47.250 41.948600 Winter 2 Jan 1997
chic 1997-01-03 40.0 6.288548 38.000 27.041751 Winter 3 Jan 1997
chic 1997-01-04 51.5 7.537758 45.500 25.072573 Winter 4 Jan 1997
chic 1997-01-05 27.0 20.760798 11.250 15.343121 Winter 5 Jan 1997
chic 1997-01-06 17.0 14.940874 5.750 9.364655 Winter 6 Jan 1997
chic 1997-01-07 16.0 11.920985 7.000 20.228428 Winter 7 Jan 1997
chic 1997-01-08 19.0 8.678477 17.750 33.134819 Winter 8 Jan 1997
chic 1997-01-09 26.0 13.355892 24.000 12.118381 Winter 9 Jan 1997
chic 1997-01-10 16.0 10.448264 5.375 24.761534 Winter 10 Jan 1997
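
As a quick sanity check on the import, the 1,461 rows reported by read_csv() match the number of calendar days from 1997-01-01 to 2000-12-31, which is what we would expect if the data hold one row per day. That layout is an assumption based on the date and yday columns. A minimal sketch in base R:

```r
# Count the calendar days in the four-year window (2000 is a leap year)
days <- seq(as.Date("1997-01-01"), as.Date("2000-12-31"), by = "day")
length(days)
#> [1] 1461

# With chic loaded, the equivalent check would be:
# nrow(chic) == length(days)
```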