Data Management with R
Welcome
Overview
How to Use This Book
Recommended Chapter Flow
Project Structure
Required R Packages
Building the Book
Data and Reproducibility Notes
Intended Audience
Licence
1 Project Setup and Reproducible Workflows
1.1 Organizing a reproducible project
1.2 Working with RStudio Projects and file paths
1.3 Naming files and managing outputs
1.4 R Markdown and bookdown workflows
1.5 Git, GitHub, and reproducible analysis
1.6 Chapter summary
2 Tidyverse Basics
2.1 Loading packages and reading data
2.2 Inspecting data frames
2.3 Data manipulation with dplyr
2.4 Practice exercise
2.5 Chapter summary
3 Joining Data
3.1 Understanding join keys
3.2 Performing joins with dplyr
3.3 Checking for duplicate keys
3.4 Practical workflow for joining data
3.5 Practice exercise
3.6 Chapter summary
4 Data Cleaning and Data Management
4.1 Preparing the R environment
4.2 Exploring project files
4.3 Description of the datasets
4.4 Importing data into R
4.5 Inspecting datasets
4.6 Exploring and validating the data
4.7 Cleaning variable names and reshaping data
4.8 Practical considerations for R Markdown workflows
4.9 Chapter summary
5 Strings and Regular Expressions
5.1 Working with character data in R
5.2 Cleaning and separating character variables
5.3 Extracting patterns with regular expressions
5.4 Creating reusable functions
5.5 Summarizing and visualizing categorical data
5.6 Reshaping complex datasets
5.7 Using regular expressions inside pivot_longer()
5.8 Missing values and interpretation
5.9 Additional practice
5.10 Chapter summary
6 Visualization and Advanced Data Cleaning
6.1 Research context and datasets
6.2 Reviewing datasets before cleaning
6.3 Cleaning and reshaping population data
6.4 Visualizing population data
6.5 Identifying join keys
6.6 Cleaning mortality data
6.7 Handling missing and inconsistent values
6.8 Creating dates and calculating age
6.9 Imputing missing values
6.10 Working with ICD-10 codes
6.11 Visualizing cleaned mortality data
6.12 Saving cleaned datasets
6.13 Joining datasets and validating postal codes
6.14 Final assignment guidance
6.15 Chapter summary
7 Exploratory Analysis Project
7.1 Loading libraries and importing datasets
7.2 Reviewing dataset structure
7.3 Working with missing values
7.4 Duplicate records and cleaning decisions
7.5 Exploratory visualization
7.6 Organizing an exploratory analysis report
7.7 Practice activity
7.8 Chapter summary
8 Storyboarding and Reporting
8.1 Why storyboarding is important
8.2 Organizing the analytical narrative
8.3 Writing interpretation instead of only showing output
8.4 Separating technical details from reader-focused explanations
8.5 Building a polished R Markdown report
8.6 Example interpretation workflow
8.7 Practice activity
8.8 Chapter summary
References
Licence
For educational use. See LICENSE for details.