Package Structure

The costs of reoffending (CoRe) package exposes five functions as its API:

  1. rebase_reoffences
  2. estimate_multipliers
  3. estimate_wider_offences
  4. estimate_unit_costs
  5. estimate_total_costs

rebase_reoffences converts counts of number_reoffences broken down by index_offence_group into counts broken down by reoffence_group. The reoffence_group basis is the one used to calculate the costs of reoffending.

estimate_multipliers is a convenience function that estimates the multipliers used to convert the number of crimes recorded by police into an estimate of wider crime for a given 12-month period, as defined in the Home Office economic and social costs of crime report. It is already called by estimate_wider_offences and estimate_unit_costs, so you do not need to call it before using them.
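As a rough illustration of what a multiplier represents, the sketch below divides a made-up estimate of wider crime by a made-up police recorded count for each offence. The figures and the column names recorded_crime and estimated_crime are assumptions for this example; it is not the package's internal calculation, only the ratio that the description above implies.

```r
# Toy figures only: these counts are invented for illustration.
crime <- data.frame(
  offence         = c("Offence A", "Offence B"),
  recorded_crime  = c(1000, 400),   # crimes recorded by police (hypothetical)
  estimated_crime = c(3000, 500),   # estimated wider crime (hypothetical)
  stringsAsFactors = FALSE
)

# A multiplier converts recorded crime into estimated wider crime.
crime$multiplier <- crime$estimated_crime / crime$recorded_crime

# Applying the multiplier to a count of proven reoffences gives an
# estimate of wider reoffences for the same offence.
proven_reoffences <- c(50, 20)
wider_reoffences  <- proven_reoffences * crime$multiplier
```

A multiplier can be below 1, as some of the values returned by estimate_multipliers are, in which case the wider estimate is smaller than the recorded count.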

estimate_wider_offences takes user_data and target_date as input and estimates the number of wider (re)offences for a given cohort over a 12-month follow-up period. user_data must contain at least the columns year_ending, reoffence_group and number_offences. The function groups by every non-numeric column, so remove any columns you do not want used as grouping variables before calling it.
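Because every non-numeric column becomes a grouping variable, it can help to check the input before passing it on. The following is a minimal base R sketch on invented data; the region column and all values are hypothetical, and the check only mirrors the behaviour described above rather than the package's own validation.

```r
# Hypothetical input with the three required columns plus one extra.
user_data <- data.frame(
  year_ending     = as.Date(c("2016-12-31", "2016-12-31")),
  reoffence_group = c("Robbery", "Theft offences"),
  number_offences = c(120, 450),
  region          = c("North", "South"),  # extra column (hypothetical)
  stringsAsFactors = FALSE
)

# Columns that are not numeric are the ones that would be grouped on.
is_num        <- vapply(user_data, is.numeric, logical(1))
grouping_cols <- names(user_data)[!is_num]
print(grouping_cols)

# Keep only the columns that should take part in the calculation.
user_data <- user_data[, c("year_ending", "reoffence_group", "number_offences")]
```

Here region would have been picked up as a grouping variable alongside year_ending and reoffence_group, so it is dropped in the last step.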

estimate_unit_costs takes target_fy and estimates the unit cost of a crime/reoffence, inflated to the prices of that financial year using GDP deflators. It is a convenience function that estimate_total_costs already calls, so you do not need to call it yourself before using estimate_total_costs.
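Inflating a cost with GDP deflators amounts to rescaling it by the ratio of the target year's index to the base year's index. The sketch below shows that arithmetic with invented index values; inflate_cost, the financial-year labels and the deflator figures are all hypothetical and are not taken from the package.

```r
# Hypothetical deflator index values by financial year (not real figures).
deflators <- c("2015/16" = 100, "2016/17" = 102, "2017/18" = 104)

# Rescale a cost from base-year prices to target-year prices.
inflate_cost <- function(cost, base_fy, target_fy, deflators) {
  cost * deflators[[target_fy]] / deflators[[base_fy]]
}

unit_cost_base   <- 5000  # hypothetical unit cost in 2015/16 prices
unit_cost_target <- inflate_cost(unit_cost_base, "2015/16", "2017/18", deflators)
print(unit_cost_target)
```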

estimate_total_costs takes user_data, target_date and target_fy and estimates the total costs of crime/reoffences for a given cohort over a 12-month follow-up period, inflated to the prices of the specified financial year. user_data must contain at least the columns year_ending, reoffence_group and number_offences.
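The output of estimate_total_costs includes the columns wider_reoffences, cost and total_cost. The sketch below assumes that total_cost is the product of the other two, which the source does not state explicitly, and shows how such a table could be totalled in base R. All figures and the expenditure group labels are invented.

```r
# Invented figures; column names mirror those in the package output.
costs <- data.frame(
  offence_group     = c("Robbery", "Robbery", "Theft offences"),
  expenditure_group = c("Group 1", "Group 2", "Group 1"),
  wider_reoffences  = c(100, 100, 400),
  cost              = c(2000, 500, 300),
  stringsAsFactors = FALSE
)

# Assumed relationship: total cost = wider reoffences x unit cost.
costs$total_cost <- costs$wider_reoffences * costs$cost

# Total per offence group, then overall.
by_group <- aggregate(total_cost ~ offence_group, data = costs, FUN = sum)
overall  <- sum(costs$total_cost)
```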

Examples

To calculate the cost of reoffending for the 2016 offender cohort, as in the official statistics, we can use the reoffending_stats data provided in the package.

estimate_total_costs(reoffending_stats, "2016-12-31")
#> Warning in validate_reoffending_data(user_data): You have multiple
#> 'year_endings' in your input data, only the closest to the 'target_date'
#> will be used.
#> Joining, by = c("offence", "multiplier")
#> Warning: Detecting old grouped_df format, replacing `vars` attribute by
#> `groups`
#> Joining, by = "offence_group"
#> Joining, by = "offence_group"
#> Joining, by = "offence"
#> Joining, by = "offence"
#> Joining, by = "coc_mapping"
#> Joining, by = "offence"
#> Joining, by = c("offence_group", "offence_subgroup", "offence", "year_ending", "number_offences", "coc_mapping")
#> Joining, by = "coc_mapping"
#> Joining, by = c("offence_group", "offence_subgroup", "offence", "year_ending", "number_offences", "coc_mapping", "crime_severity", "weighted_css", "multiplier", "expenditure_group", "cost")
#> Joining, by = c("offence_group", "number_offences", "multiplier", "expenditure_group", "cost")
#> Joining, by = c("offence_group", "offence_subgroup", "offence", "year_ending", "number_offences", "multiplier", "expenditure_group", "cost")
#> Warning: Grouping rowwise data frame strips rowwise nature
#> Joining, by = c("offence_group", "expenditure_group")
#> `mutate_if()` ignored the following grouping variables:
#> Column `offence_group`
#> Joining, by = "offence_group"
#> # A tibble: 3,250 x 14
#>    offence_group proven_offences wider_offences number_offences
#>    <chr>                   <dbl>          <dbl>           <dbl>
#>  1 Criminal dam…           31930        921189.          579254
#>  2 Criminal dam…           31930        921189.          579254
#>  3 Criminal dam…           31930        921189.          579254
#>  4 Criminal dam…           31930        921189.          579254
#>  5 Criminal dam…           31930        921189.          579254
#>  6 Criminal dam…           31930        921189.          579254
#>  7 Criminal dam…           31930        921189.          579254
#>  8 Criminal dam…           31930        921189.          579254
#>  9 Criminal dam…           31930        921189.          579254
#> 10 Criminal dam…           31930        921189.          579254
#> # … with 3,240 more rows, and 10 more variables: year_ending <dttm>,
#> #   index_offence_group <chr>, number_offenders <chr>, age <chr>,
#> #   number_reoffences <dbl>, reoffence_ratio <dbl>,
#> #   wider_reoffences <dbl>, expenditure_group <chr>, cost <dbl>,
#> #   total_cost <dbl>

These arguments are the defaults, so calling estimate_total_costs() with no arguments gives the same data frame, which has 14 columns. We can then sum the total_cost column with dplyr to estimate the total cost.

names(estimate_total_costs())
#>  [1] "offence_group"       "proven_offences"     "wider_offences"     
#>  [4] "number_offences"     "year_ending"         "index_offence_group"
#>  [7] "number_offenders"    "age"                 "number_reoffences"  
#> [10] "reoffence_ratio"     "wider_reoffences"    "expenditure_group"  
#> [13] "cost"                "total_cost"
library(magrittr)  # provides the %>% pipe if it is not already attached
estimate_total_costs() %>%
  dplyr::summarise(total_cost = sum(total_cost, na.rm = TRUE))
#> # A tibble: 1 x 1
#>     total_cost
#>          <dbl>
#> 1 19707377567.

We can see the multipliers that were used to obtain the number of wider reoffences…

estimate_multipliers("2016-12-31")
#> # A tibble: 196 x 2
#>    offence                                                 multiplier
#>    <chr>                                                        <dbl>
#>  1 Abandoning a child under the age of two years                1.11 
#>  2 Abduction of female                                         12.3  
#>  3 Abuse of children through sexual exploitation               12.3  
#>  4 Abuse of position of trust of a sexual nature               12.3  
#>  5 Actual bodily harm (ABH) and other injury                    1.92 
#>  6 Aggravated burglary - business and community                 0.831
#>  7 Aggravated burglary -Residential                             2.99 
#>  8 Aggravated burglary in a building other than a dwelling      0.831
#>  9 Aggravated burglary in a dwelling                            2.99 
#> 10 Aggravated vehicle taking                                    0.831
#> # … with 186 more rows

…and the number of wider (re)offences.

estimate_wider_offences() %>%
  dplyr::group_by(offence_group) %>%
  dplyr::summarise(wider_offences = sum(wider_offences, na.rm = TRUE),
                   wider_reoffences = sum(wider_reoffences, na.rm = TRUE))
#> Warning in validate_reoffending_data(user_data): You have multiple
#> 'year_endings' in your input data, only the closest to the 'target_date'
#> will be used.
#> # A tibble: 13 x 3
#>    offence_group                        wider_offences wider_reoffences
#>    <chr>                                         <dbl>            <dbl>
#>  1 Criminal damage and arson                 23950909.           62778.
#>  2 Drug offences                              3476538            38304.
#>  3 Fraud offences                            86290057.          934052.
#>  4 Miscellaneous crimes against society       2116374            35691.
#>  5 Other                                            0                0 
#>  6 Possession of weapons offences              860444            15578.
#>  7 Public order offences                      8191352           196218.
#>  8 Robbery                                    4834006.          120496.
#>  9 Sexual offences                           30294448.           72387.
#> 10 Summary motoring                                 0            50817 
#> 11 Summary non-motoring                             0           148924 
#> 12 Theft offences                            73862997.         2626271.
#> 13 Violence against the person               45708852.          425946.

reoffending_stats also includes index_offence_group. To try out rebase_reoffences, we can discard the breakdown by reoffence group by aggregating the data to index_offence_group, rebase the result, and check that we reproduce the earlier total.

reoff <- reoffending_stats %>%
  dplyr::group_by(year_ending, index_offence_group) %>%
  dplyr::summarise(number_reoffences_by_index = sum(number_reoffences, na.rm = TRUE))

reoff <- rebase_reoffences(reoff, "2016-12-31")

estimate_total_costs(reoff, "2016-12-31") %>%
  dplyr::summarise(total_cost = sum(total_cost, na.rm = TRUE))
#> # A tibble: 1 x 1
#>     total_cost
#>          <dbl>
#> 1 19707377567.