---
title: "Refinement building blocks"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Refinement building blocks}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

In many pricing analyses, model estimation is followed by a translation step. A fitted GLM may capture the structure of the portfolio well, while some fitted effects still need to be reviewed before they are used in a tariff.

Common reasons include:

- irregular local variation
- lack of monotonicity
- externally imposed tariff structures
- expert judgement not directly represented in the model
- implementation constraints in policy administration systems

For this reason, actuarial pricing work often distinguishes between:

1. model estimation
2. tariff refinement
3. final refit of the pricing structure

`insurancerating` provides a staged refinement interface:

1. fit an unrestricted model
2. initialise a refinement object with `prepare_refinement()`
3. add one or more refinement steps
4. inspect these steps before refit
5. call `refit()` to obtain the final fitted model

This separation can make tariff adjustments easier to understand, reproduce, and audit.

## When refinement can help

Refinement can help when the estimated model output is useful, but the fitted coefficient pattern needs additional structure before it is used in a tariff.

Typical use cases include:

- smoothing a rating factor derived from a continuous variable
- imposing monotonicity
- restricting coefficients to a predefined relativity structure
- introducing expert-based relativities within existing model levels
- simplifying the final tariff for practical implementation

In many workflows, refinement is applied to the model that represents the final pricing signal, such as a premium or pure-premium model. In other cases, it may also be useful for selected frequency or severity effects.
The relevant question is whether the adjusted coefficient pattern is intended to support the tariff structure that will be reviewed or implemented.

## Example setup

The example below starts from one common premium modelling setup:

- analyse a continuous variable with a GAM
- convert it to tariff segments
- fit frequency and severity models
- combine both into a premium proxy
- fit an unrestricted premium model

```{r, message = FALSE, warning = FALSE}
library(insurancerating)
library(dplyr)

age_policyholder_frequency <- risk_factor_gam(
  data = MTPL,
  claim_count = "nclaims",
  risk_factor = "age_policyholder",
  exposure = "exposure"
)

age_segments_freq <- derive_tariff_segments(age_policyholder_frequency)

dat <- MTPL |>
  add_tariff_segments(age_segments_freq, name = "age_policyholder_freq_cat") |>
  mutate(across(where(is.character), as.factor)) |>
  mutate(across(where(is.factor), ~ set_reference_level(., exposure)))

freq <- glm(
  nclaims ~ bm + age_policyholder_freq_cat,
  offset = log(exposure),
  family = poisson(),
  data = dat
)

sev <- glm(
  amount ~ zip,
  weights = nclaims,
  family = Gamma(link = "log"),
  data = dat |> filter(amount > 0)
)

premium_df <- dat |>
  add_prediction(freq, sev) |>
  mutate(premium = pred_nclaims_freq * pred_amount_sev)

burn_unrestricted <- glm(
  premium ~ zip + bm + age_policyholder_freq_cat,
  weights = exposure,
  family = Gamma(link = "log"),
  data = premium_df
)
```

Before refinement, inspect the unrestricted coefficient structure:

```{r}
rating_table(burn_unrestricted)
rating_table(burn_unrestricted) |> autoplot()
```

At this stage, the coefficients reflect the unrestricted model fit. This output is often informative by itself. If the pattern is too irregular, too granular or difficult to explain, a refinement step can be added explicitly.
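
For readers less familiar with log-link GLMs: the coefficients of such a model act multiplicatively once exponentiated, which is why they can be read as tariff relativities. The base-R sketch below illustrates this on simulated data. The data frame `sim`, the factor `region` and the assumed relativities in `true_rel` are invented for illustration and are not part of `MTPL` or `insurancerating`.

```{r}
# Simulated portfolio (illustration only, not package data)
set.seed(1)
n <- 2000
sim <- data.frame(
  region = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  exposure = runif(n, 0.5, 1)
)

# Assumed "true" relativities per region, with A as the reference level
true_rel <- c(A = 1.0, B = 1.3, C = 0.8)
sim$nclaims <- rpois(n, 0.1 * sim$exposure * true_rel[as.character(sim$region)])

# Poisson frequency model with a log link and an exposure offset
fit_sim <- glm(
  nclaims ~ region,
  offset = log(exposure),
  family = poisson(),
  data = sim
)

# Exponentiated coefficients: base frequency and relativities versus region A
exp(coef(fit_sim))
```

The exponentiated `regionB` and `regionC` coefficients are relativities against the reference level `A`. With enough data they should land in the neighbourhood of the assumed values.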
## The refinement object

Refinement begins with:

```{r}
ref <- prepare_refinement(burn_unrestricted)
ref
```

A `rating_refinement` object stores:

- the fitted base model
- the underlying model data
- the refinement steps added through the refinement interface

At this point, the model itself has not been refitted. The refinement object represents a proposed tariff adjustment structure, not yet the final fitted result.

This distinction is useful because refinement steps can be inspected before they are incorporated into the final model.

## Smoothing

### Purpose

Smoothing can be used when a rating factor derived from a continuous variable contains local variation that is hard to justify in a tariff.

For example, a coefficient pattern such as:

- age 30–34 lower
- age 34–38 higher
- age 38–42 lower again

may be statistically possible, but difficult to explain or maintain.

Smoothing adds a more stable structure to the rating factor.

### Adding smoothing

```{r}
ref <- ref |>
  add_smoothing(
    model_variable = "age_policyholder_freq_cat",
    source_variable = "age_policyholder",
    breaks = seq(18, 95, 5),
    weights = "exposure"
  )
```

The key arguments are:

- `model_variable`: the grouped variable present in the GLM
- `source_variable`: the original continuous portfolio variable
- `breaks`: the preferred commercial cut points
- `smoothing`: the smoothing specification (not set in the call above, so the package default applies)
- `weights`: optional weighting, typically exposure

### Inspecting smoothing before refit

```{r}
print(ref)
autoplot(ref, variable = "age_policyholder_freq_cat")
```

This plot belongs to the **pre-refit stage**. It shows:

- the original fitted coefficients
- the proposed smoothed structure

The purpose is to inspect the refinement step itself, before it is incorporated into the final fitted model.
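
To give a rough feel for what a polynomial-style smoothing step does, the base-R sketch below fits an exposure-weighted cubic polynomial through segment midpoints and evaluates it on 5-year bands. It is only an illustration under stated assumptions: the relativities, exposures and midpoints in `coef_df` are invented, and the code is not the implementation used by `add_smoothing()`.

```{r}
# Made-up fitted relativities per age segment (illustration only)
coef_df <- data.frame(
  age_mid    = c(21, 28, 36, 45, 55, 66, 80),
  relativity = c(1.60, 1.25, 1.05, 1.10, 0.95, 1.00, 1.20),
  exposure   = c(500, 1500, 2500, 3000, 2500, 1500, 400)
)

# Exposure-weighted cubic polynomial through the segment midpoints
poly_fit <- lm(relativity ~ poly(age_mid, 3), data = coef_df, weights = exposure)

# Evaluate the fitted curve at the midpoints of 5-year commercial bands
band_breaks <- seq(18, 83, 5)
band_mid <- head(band_breaks, -1) + 2.5
smoothed <- predict(poly_fit, newdata = data.frame(age_mid = band_mid))

data.frame(
  age_from = head(band_breaks, -1),
  age_to = band_breaks[-1],
  smoothed_relativity = round(unname(smoothed), 3)
)
```

Weighting by exposure means that sparsely populated segments, here the youngest and oldest, pull the curve less than the well-populated middle segments.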
### Choosing a smoothing method

Typical smoothing choices are:

- `"spline"`: polynomial-style smoothing
- `"gam"`: flexible smooth curve
- `"mpi"`: monotone increasing
- `"mpd"`: monotone decreasing

The appropriate choice depends on the pricing context.

For example:

- age may justify a flexible smooth
- insured value or power may require a monotonic relationship
- low-exposure tails may benefit from exposure weighting

## Restrictions

### Purpose

Restrictions can be used when coefficients need to follow a predefined structure.

Typical examples include:

- bonus-malus systems
- governance-approved relativities
- externally mandated tariff structures
- implementation constraints in policy systems

Restrictions differ from smoothing:

- smoothing reshapes the fitted pattern
- restriction imposes user-defined coefficients

### Adding restrictions

```{r}
zip_df <- data.frame(
  zip = c(0, 1, 2, 3),
  zip_adj = c(0.8, 0.9, 1.0, 1.2)
)

ref <- ref |>
  add_restriction(restrictions = zip_df)
```

The restriction table must contain exactly two columns:

- the original factor levels
- the adjusted coefficients

### Inspecting restrictions before refit

```{r}
autoplot(ref, variable = "zip")
```

This shows the proposed restricted structure relative to the original fitted model.

## Expert-based relativities

### Purpose

In some cases, the fitted model uses a broad factor level, while portfolio or business knowledge suggests that more granular differentiation may be useful.

For example, a model may estimate one coefficient for "construction", while pricing practice distinguishes between:

- residential construction
- commercial construction
- civil engineering

This can be relevant when subgroup exposure is too limited to estimate stable coefficients directly.
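
As a rough illustration of the arithmetic behind splitting one fitted level into subgroups, the base-R sketch below rescales two expert relativities so that their exposure-weighted average equals 1. The subgroup exposures are invented for illustration, and the code is a hand calculation, not the package's internal routine.

```{r}
# Hypothetical exposures for two subgroups of one fitted level (made-up numbers)
sub_exposure <- c(residential_construction = 600, commercial_construction = 400)

# Expert relativities before normalisation
sub_rel <- c(residential_construction = 1.00, commercial_construction = 1.15)

# Exposure-weighted average of the raw relativities: (600 * 1.00 + 400 * 1.15) / 1000
avg_rel <- sum(sub_exposure * sub_rel) / sum(sub_exposure)

# Rescale so that the exposure-weighted average equals 1
sub_rel_norm <- sub_rel / avg_rel

avg_rel
round(sub_rel_norm, 4)
weighted.mean(sub_rel_norm, sub_exposure)
```

After rescaling, the ratio between the two subgroups is unchanged (still 1.15), while the level as a whole carries the same average signal as the single fitted coefficient.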
### Adding relativities

```{r, eval = FALSE}
relativities_activity <- relativities(
  split_level(
    "construction",
    c("residential_construction", "commercial_construction"),
    c(1.00, 1.15)
  )
)

ref <- ref |>
  add_relativities(
    model_variable = "business_activity",
    split_variable = "business_activity_split",
    relativities = relativities_activity,
    exposure = "exposure",
    normalize = TRUE
  )
```

If `normalize = TRUE`, the relativities are scaled so that their exposure-weighted average remains equal to 1 within the original level.

This preserves the original model signal while introducing finer structure.

## Refit

### Why refit is required

Refinement steps alter part of the model structure. Once these changes are applied, the remaining coefficients may also adjust.

For that reason, the sequence does not end with `add_smoothing()` or `add_restriction()`. The final step is:

```{r}
burn_refined <- refit(ref)
```

This refits the model while incorporating the documented refinement steps.

### Inspecting the final fitted result

After refit, use `rating_table()`:

```{r}
rating_table(burn_refined)
```

At this point, the output no longer represents a proposed refinement plan. It represents the fitted coefficient structure after refinement.

The distinction is:

- before `refit()` --> inspect the refinement plan
- after `refit()` --> inspect the fitted tariff structure

If smoothing, restrictions, and relativities have been applied, they are now embedded in the fitted model output.
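
Why the remaining coefficients can move after a restriction is easiest to see in a small base-R sketch. It assumes that a restriction amounts to fixing a factor's relativities through an offset and re-estimating everything else. This is a common way to impose coefficients in a GLM, not necessarily what `refit()` does internally. The simulated data frame `sim2` and its factors `zone` and `fuel` are invented for illustration.

```{r}
# Simulated portfolio in which fuel type is correlated with zone (illustration only)
set.seed(42)
n2 <- 5000
sim2 <- data.frame(
  zone = factor(sample(c("z0", "z1"), n2, replace = TRUE)),
  exposure = runif(n2, 0.5, 1)
)
p_diesel <- ifelse(sim2$zone == "z1", 0.7, 0.3)
sim2$fuel <- factor(
  ifelse(runif(n2) < p_diesel, "diesel", "petrol"),
  levels = c("petrol", "diesel")
)
lambda <- 0.1 * sim2$exposure *
  ifelse(sim2$zone == "z1", 1.5, 1) *
  ifelse(sim2$fuel == "diesel", 1.2, 1)
sim2$nclaims <- rpois(n2, lambda)

# Unrestricted fit: both factors are estimated freely
free_fit <- glm(
  nclaims ~ zone + fuel,
  offset = log(exposure),
  family = poisson(),
  data = sim2
)

# Restricted fit: the zone relativity is imposed (1.2 for z1) through the offset,
# and only the intercept and the fuel effect are re-estimated
sim2$zone_fixed <- ifelse(sim2$zone == "z1", 1.2, 1.0)
restricted_fit <- glm(
  nclaims ~ fuel,
  offset = log(exposure) + log(zone_fixed),
  family = poisson(),
  data = sim2
)

exp(coef(free_fit))
exp(coef(restricted_fit))
```

Because `fuel` and `zone` are correlated in the simulated data, imposing a `zone` relativity that differs from its free estimate shifts the intercept and the `fuel` estimate. Inspecting the refitted coefficients is therefore still worthwhile, even for factors that were not refined.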
### Visualising the final structure

```{r}
rating_table(burn_refined) |> autoplot()
```

## Model data and rating grids

After refit, model structure can be extracted with `extract_model_data()`:

```{r}
md <- extract_model_data(burn_refined)
head(md)
```

Observed model-point combinations can be obtained with `rating_grid()`:

```{r}
grid <- rating_grid(burn_refined)
head(grid)
```

This is typically used for:

- tariff review
- portfolio summaries
- compact prediction input
- implementation support

## Complete example

One possible refinement sequence is:

```{r}
zip_df <- data.frame(
  zip = c(0, 1, 2, 3),
  zip_adj = c(0.8, 0.9, 1.0, 1.2)
)

burn_refined <- prepare_refinement(burn_unrestricted) |>
  add_smoothing(
    model_variable = "age_policyholder_freq_cat",
    source_variable = "age_policyholder",
    breaks = seq(18, 95, 5),
    weights = "exposure"
  ) |>
  add_restriction(zip_df) |>
  refit()

rating_table(burn_refined)
rating_table(burn_refined) |> autoplot()
```

## Legacy interface

Legacy entry points remain available:

```{r, eval = FALSE}
burn_refined_old <- burn_unrestricted |>
  smooth_coef(
    x_cut = "age_policyholder_freq_cat",
    x_org = "age_policyholder",
    breaks = seq(18, 95, 5)
  ) |>
  restrict_coef(zip_df) |>
  refit_glm()
```

These are primarily maintained for backward compatibility.

For new code, the recommended interface is:

```{r, eval = FALSE}
prepare_refinement() |>
  add_*() |>
  refit()
```

This keeps the sequence of tariff adjustments explicit.

## Summary

The refinement interface helps separate:

- model estimation
- tariff adjustments
- final fitted output

This makes it easier to document and inspect adjustments before the model is refitted.
In practice, this can support tariff structures that are:

- statistically grounded
- interpretable
- commercially usable
- easier to implement

## Next steps

For the underlying pricing concepts, see:

- [Pricing workflow building blocks](pricing-workflow-building-blocks.html)

For an example sequence from portfolio analysis to fitted tariff, see:

- [Getting started](getting-started.html)