CoRe.Rmd
The costs of reoffending (CoRe) package exposes five functions as its API. These are:
rebase_reoffences
estimate_multipliers
estimate_wider_offences
estimate_unit_costs
estimate_total_costs
rebase_reoffences converts reoffence counts from a number_reoffences by index_offence_group basis to a number_reoffences by reoffence_group basis. The latter is the basis used to calculate the costs of reoffending.
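To make the change of basis concrete, here is a minimal sketch in base R. It is not the package's implementation: the shares table (the proportion of reoffences falling in each reoffence group, for each index offence group) and all of the numbers are hypothetical assumptions made purely for illustration.

```r
# Hypothetical counts of reoffences on an index_offence_group basis.
by_index <- c(Theft = 100, Violence = 50)

# Hypothetical shares: each row is an index offence group, each column a
# reoffence group, and each row sums to 1.
shares <- matrix(
  c(0.8, 0.2,
    0.3, 0.7),
  nrow = 2, byrow = TRUE,
  dimnames = list(index = c("Theft", "Violence"),
                  reoffence = c("Theft", "Violence"))
)

# Spread each index group's reoffences across the reoffence groups, then
# total by reoffence group.
by_reoffence <- drop(by_index %*% shares)
by_reoffence
```

Under these assumptions the overall number of reoffences is unchanged; only the grouping it is reported against changes.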
estimate_multipliers is a convenience function that estimates the multipliers used to convert the number of crimes recorded by police into an estimate of wider crime for a given 12-month period, as defined in the Home Office economic and social costs of crime report. It is already called in estimate_wider_offences and estimate_unit_costs.
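As a rough illustration of what a multiplier does (a sketch, not the package's internals), the example below derives one as the ratio of estimated wider crime to police-recorded crime and applies it to a count of proven reoffences. All of the counts are hypothetical.

```r
# Hypothetical 12-month totals for one offence type (assumed values).
recorded_crimes <- 50000  # crimes recorded by police
wider_crimes    <- 96000  # estimated wider crime

# The multiplier converts recorded crime into the wider estimate.
multiplier <- wider_crimes / recorded_crimes

# Apply it to a hypothetical count of proven reoffences.
proven_reoffences <- 1200
wider_reoffences  <- proven_reoffences * multiplier
```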
estimate_wider_offences takes user_data and target_date as input and estimates the number of wider (re)offences for a given cohort in a 12-month follow-up period. User data should have at least the columns year_ending, reoffence_group and number_offences. The function will group_by all non-numeric columns, so include only the columns you want to group by.
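The effect of grouping by every non-numeric column can be sketched in base R. Here aggregate() stands in for the package's grouping step, which is an assumption about behaviour rather than a copy of its code, and the data frame and its extra age column are hypothetical.

```r
# Hypothetical user data with one extra non-numeric column (age).
user_data <- data.frame(
  year_ending     = as.Date(c("2016-12-31", "2016-12-31", "2016-12-31")),
  reoffence_group = c("Robbery", "Robbery", "Robbery"),
  age             = c("Adult", "Adult", "Juvenile"),
  number_offences = c(10, 5, 2),
  stringsAsFactors = FALSE
)

# Sum number_offences within every combination of the non-numeric columns.
sum_by_non_numeric <- function(df) {
  is_num     <- vapply(df, is.numeric, logical(1))
  group_cols <- names(df)[!is_num]
  aggregate(df["number_offences"], by = df[group_cols], FUN = sum)
}

# With age present, the counts are split by age as well.
grouped <- sum_by_non_numeric(user_data)

# Dropping age first gives a single total per year_ending and reoffence_group.
trimmed <- sum_by_non_numeric(user_data[, names(user_data) != "age"])
```

Leaving an unwanted character or date column in the input therefore changes the level at which results are reported.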
estimate_unit_costs takes target_fy and estimates the unit cost of a crime/reoffence, inflated to the relevant financial year using GDP deflators. This convenience function is already called in estimate_total_costs, so it needn’t be called separately to use that function.
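Inflating a cost with GDP deflators amounts to multiplying by the ratio of the target year's index to the base year's index. The sketch below shows that arithmetic only. The deflator values, the base cost, the "2015/16" label format and the inflate_cost helper are all hypothetical and are not taken from the package.

```r
# Hypothetical GDP deflator index values by financial year (not real data).
deflators <- c("2015/16" = 100, "2016/17" = 102, "2017/18" = 104)

# Hypothetical unit cost expressed in 2015/16 prices.
base_fy   <- "2015/16"
base_cost <- 5000

# Illustrative helper: rescale a cost from one financial year's prices
# to another's using the ratio of the two index values.
inflate_cost <- function(cost, from_fy, to_fy, index) {
  cost * index[[to_fy]] / index[[from_fy]]
}

cost_2017_18 <- inflate_cost(base_cost, base_fy, "2017/18", deflators)
```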
estimate_total_costs takes user_data, target_date and target_fy and estimates the total costs of crime/reoffences for a given cohort in a 12-month follow-up period, inflated to a specified financial year’s prices. User data should have at least the columns year_ending, reoffence_group and number_offences.
To calculate the cost of reoffending for the 2016 offender cohort, as in the official statistics, we can use the reoffending_stats data provided in the package.
estimate_total_costs(reoffending_stats, "2016-12-31")
#> Warning in validate_reoffending_data(user_data): You have multiple
#> 'year_endings' in your input data, only the closest to the 'target_date'
#> will be used.
#> Joining, by = c("offence", "multiplier")
#> Warning: Detecting old grouped_df format, replacing `vars` attribute by
#> `groups`
#> Joining, by = "offence_group"
#> Joining, by = "offence_group"
#> Joining, by = "offence"
#> Joining, by = "offence"
#> Joining, by = "coc_mapping"
#> Joining, by = "offence"
#> Joining, by = c("offence_group", "offence_subgroup", "offence", "year_ending", "number_offences", "coc_mapping")
#> Joining, by = "coc_mapping"
#> Joining, by = c("offence_group", "offence_subgroup", "offence", "year_ending", "number_offences", "coc_mapping", "crime_severity", "weighted_css", "multiplier", "expenditure_group", "cost")
#> Joining, by = c("offence_group", "number_offences", "multiplier", "expenditure_group", "cost")
#> Joining, by = c("offence_group", "offence_subgroup", "offence", "year_ending", "number_offences", "multiplier", "expenditure_group", "cost")
#> Warning: Grouping rowwise data frame strips rowwise nature
#> Joining, by = c("offence_group", "expenditure_group")
#> `mutate_if()` ignored the following grouping variables:
#> Column `offence_group`
#> Joining, by = "offence_group"
#> # A tibble: 3,250 x 14
#> offence_group proven_offences wider_offences number_offences
#> <chr> <dbl> <dbl> <dbl>
#> 1 Criminal dam… 31930 921189. 579254
#> 2 Criminal dam… 31930 921189. 579254
#> 3 Criminal dam… 31930 921189. 579254
#> 4 Criminal dam… 31930 921189. 579254
#> 5 Criminal dam… 31930 921189. 579254
#> 6 Criminal dam… 31930 921189. 579254
#> 7 Criminal dam… 31930 921189. 579254
#> 8 Criminal dam… 31930 921189. 579254
#> 9 Criminal dam… 31930 921189. 579254
#> 10 Criminal dam… 31930 921189. 579254
#> # … with 3,240 more rows, and 10 more variables: year_ending <dttm>,
#> # index_offence_group <chr>, number_offenders <chr>, age <chr>,
#> # number_reoffences <dbl>, reoffence_ratio <dbl>,
#> # wider_reoffences <dbl>, expenditure_group <chr>, cost <dbl>,
#> # total_cost <dbl>
The arguments used above are the function's defaults, and the result is a data frame with 14 columns. We can then sum the total cost using dplyr.
names(estimate_total_costs())
#> [1] "offence_group" "proven_offences" "wider_offences"
#> [4] "number_offences" "year_ending" "index_offence_group"
#> [7] "number_offenders" "age" "number_reoffences"
#> [10] "reoffence_ratio" "wider_reoffences" "expenditure_group"
#> [13] "cost" "total_cost"
estimate_total_costs() %>%
  dplyr::summarise(total_cost = sum(total_cost, na.rm = TRUE))
#> # A tibble: 1 x 1
#> total_cost
#> <dbl>
#> 1 19707377567.
We can see the multipliers that were used to obtain the number of wider reoffences…
estimate_multipliers("2016-12-31")
#> # A tibble: 196 x 2
#> offence multiplier
#> <chr> <dbl>
#> 1 Abandoning a child under the age of two years 1.11
#> 2 Abduction of female 12.3
#> 3 Abuse of children through sexual exploitation 12.3
#> 4 Abuse of position of trust of a sexual nature 12.3
#> 5 Actual bodily harm (ABH) and other injury 1.92
#> 6 Aggravated burglary - business and community 0.831
#> 7 Aggravated burglary -Residential 2.99
#> 8 Aggravated burglary in a building other than a dwelling 0.831
#> 9 Aggravated burglary in a dwelling 2.99
#> 10 Aggravated vehicle taking 0.831
#> # … with 186 more rows
…and the number of wider (re)offences.
estimate_wider_offences() %>%
  dplyr::group_by(offence_group) %>%
  dplyr::summarise(wider_offences = sum(wider_offences, na.rm = TRUE),
                   wider_reoffences = sum(wider_reoffences, na.rm = TRUE))
#> Warning in validate_reoffending_data(user_data): You have multiple
#> 'year_endings' in your input data, only the closest to the 'target_date'
#> will be used.
#> # A tibble: 13 x 3
#> offence_group wider_offences wider_reoffences
#> <chr> <dbl> <dbl>
#> 1 Criminal damage and arson 23950909. 62778.
#> 2 Drug offences 3476538 38304.
#> 3 Fraud offences 86290057. 934052.
#> 4 Miscellaneous crimes against society 2116374 35691.
#> 5 Other 0 0
#> 6 Possession of weapons offences 860444 15578.
#> 7 Public order offences 8191352 196218.
#> 8 Robbery 4834006. 120496.
#> 9 Sexual offences 30294448. 72387.
#> 10 Summary motoring 0 50817
#> 11 Summary non-motoring 0 148924
#> 12 Theft offences 73862997. 2626271.
#> 13 Violence against the person 45708852. 425946.
reoffending_stats also includes index_offence_group, so we can try to replicate our earlier result after discarding the reoffence breakdown: we aggregate the data to index_offence_group and then restore the reoffence basis using the rebase_reoffences function.
reoff <- reoffending_stats %>%
  dplyr::group_by(year_ending, index_offence_group) %>%
  dplyr::summarise(number_reoffences_by_index = sum(number_reoffences, na.rm = TRUE))
reoff <- rebase_reoffences(reoff, "2016-12-31")
estimate_total_costs(reoff, "2016-12-31") %>%
  dplyr::summarise(total_cost = sum(total_cost, na.rm = TRUE))
#> # A tibble: 1 x 1
#> total_cost
#> <dbl>
#> 1 19707377567.