Chapter 3 Joining Data
3.1 Learning objectives
By the end of this chapter, you should be able to:
- identify keys used to join datasets;
- understand one-to-one, one-to-many, and many-to-many joins;
- use left_join(), inner_join(), and related join functions;
- check for duplicate keys before joining.
3.3 Left join
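A left join keeps all rows from the first dataset and brings in matching information from the second dataset. Rows in the first dataset that have no match in the second get NA in the columns that come from the second.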
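The code for this example did not survive alongside the output, so what follows is only a minimal sketch of a left join with dplyr. The tibbles x and y, their column names, and their values are assumptions chosen for illustration; they are not confirmed as the chapter's original data.

```r
# Minimal sketch: x and y are assumed example data, not the chapter's original code.
library(dplyr)

# First (left) dataset: keys 1 and 2 each appear twice.
x <- tibble(
  key   = c(1, 2, 2, 1),
  val_x = c("x1", "x2", "x3", "x4")
)

# Second (right) dataset: each key appears exactly once.
y <- tibble(
  key   = c(1, 2),
  val_y = c("y1", "y2")
)

# Keep every row of x, in its original order, and attach val_y where key matches.
joined <- left_join(x, y, by = "key")
joined
```

Because every key in y is unique, the result has the same number of rows as x: each row of x picks up exactly one val_y.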
## # A tibble: 4 x 3
##     key val_x val_y
##   <dbl> <chr> <chr>
## 1     1 x1    y1
## 2     2 x2    y2
## 3     2 x3    y2
## 4     1 x4    y1
3.4 Check duplicate keys
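Before joining real datasets, always check whether the join key is unique in each dataset. If a key appears more than once in the second dataset, a left join repeats the matching rows of the first dataset, so the result can have more rows than the first dataset had.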
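One common way to run this check is to count the rows per key and keep only the keys that occur more than once. The sketch below assumes two small hypothetical tibbles, x and y, joined on a column named key; the names and values are illustrative and are not confirmed as the chapter's original code.

```r
# Minimal sketch: x and y are assumed example data for illustration.
library(dplyr)

x <- tibble(key = c(1, 2, 2, 1), val_x = c("x1", "x2", "x3", "x4"))
y <- tibble(key = c(1, 2), val_y = c("y1", "y2"))

# Count rows per key, then keep only keys that appear more than once.
dup_x <- x %>% count(key) %>% filter(n > 1)
dup_y <- y %>% count(key) %>% filter(n > 1)

dup_x  # keys duplicated in x
dup_y  # zero rows means key is unique in y
```

Under these assumptions, the first result lists both keys of x with n equal to 2, and the second is an empty tibble, so joining y onto x with left_join() would not add rows.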
## # A tibble: 2 x 2
##     key     n
##   <dbl> <int>
## 1     1     2
## 2     2     2

## # A tibble: 0 x 2
## # i 2 variables: key <dbl>, n <int>