Chapter 36 6. Identify possible join keys
A join requires one or more common variables between datasets. Before joining, always check whether the keys are clean and whether the relationship is one-to-one, one-to-many, or many-to-many.
## $mort
## [1] "ID" "Sex" "Cause_of_death" "dbuid2016"
## [5] "Location_of_death" "Marital_status" "Postalcode" "B_year"
## [9] "B_month" "B_day" "D_year" "D_month"
## [13] "D_day"
##
## $pop
## [1] "...1" "Health Service Delivery Area" "Year"
## [4] "Gender" "<1" "04-Jan"
## [7] "09-May" "14-Oct" "15-19"
## [10] "20-24" "25-29" "30-34"
## [13] "35-39" "40-44" "45-49"
## [16] "50-54" "55-59" "60-64"
## [19] "65-69" "70-74" "75-79"
## [22] "80-84" "85-89" "90+"
## [25] "Total"
##
## $corr
## [1] "dbuid2016" "csduid2016" "hruid2017" "hrname_english" "hrname_french"
## [6] "dbpop2016"
##
## $env
## [1] "POSTALCODE12" "WTHNRC12_01" "WTHNRC12_02" "WTHNRC12_03" "WTHNRC12_04" "WTHNRC12_05"
## [7] "WTHNRC12_06" "WTHNRC12_07"
Possible join paths:
mortcan be joined tocorrusingdbuid2016.popcan potentially be joined tocorrusing health region or health service delivery area variables, but the names and codes need checking first.envcontains postal code data, but the link to mortality records needs postal code cleaning and validation.