simplified_slides

Session Objectives

In this workshop, we will discuss how rigor in applied epidemiological methods, combined with qualitative stakeholder involvement, can generate useful and realistic policy insights.

Modified Treatment Policies

Presenter: Upul

Demonstrates how to operationalize policy emulation using Modified Treatment Policies (MTPs). - Uses dynamic shifts in continuous exposures, providing a methodologically rigorous alternative to static-intervention causal frameworks.

Causal Mediation Analysis

Presenter: Sharon

Will focus on identifying the mechanisms through which an exposure translates to an observed outcome. - Demonstrates how causal mediation approaches decompose total effects to reveal the structural pathways driving oral health inequalities.

Contextual Effects

Presenter: Huihua

Will detail the methodological requirements for properly isolating contextual effects, demonstrating how higher-level ecological conditions systematically influence individual-level conditions and individual-level outcomes.

Stakeholder and Consumer Integration

Presenter: Ruby

Will bridge methodological simulation with applied policy formulation. By integrating consumer lived experiences and established stakeholder perspectives, we ensure that the resulting causal models directly map to actionable, culturally relevant policy priorities.

The Policymaker’s Dilemma

Imagine you are a policymaker with a $1M public health budget.

What kind of evidence would you demand before investing?

A. Associative Evidence: “Sugar and dental decay go hand in hand.”

B. Causal Evidence: “If we reduce sugar consumption by 20g per day, we avert 15% of all new dental disease.”

You need causal answers, not just correlations.

The Problem: Association ≠ Causation

Hernán MA, Robins JM (2020). Causal Inference: What If.

🔍 Association (Comparing Subgroups)

The Approach: We partition the observed data.
The Contrast: We compare the dental outcomes of the “high sugar” fraction directly against the “low sugar” fraction.
The Focus: Contrasting different subsets of people.

🎯 Causation (Contrasting the Whole Population)

The Approach: We construct counterfactual scenarios.
The Contrast: We compare the entire population’s actual outcomes against the entire population’s expected outcomes under a new policy.
The Focus: Contrasting the exact same population under different conditions.
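In symbols, the two contrasts can be sketched as follows. The notation is an assumption added here for illustration: \(Y\) is tooth decay, \(A\) is daily sugar intake, \(a\) and \(a'\) are two observed intake levels, and \(Y^{d}\) is the outcome that would be seen under a policy \(d\) (matching the \(E[Y^d]\) shorthand used in the contrast code comments later in these slides).

```latex
\[
\underbrace{E[Y \mid A = a] - E[Y \mid A = a']}_{\text{association: two different subsets of people}}
\qquad\text{versus}\qquad
\underbrace{E[Y^{d}] - E[Y]}_{\text{causation: the same population, two conditions}}
\]
```

The left-hand contrast conditions on who happened to receive which exposure; the right-hand contrast averages over everyone in both terms.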

What Can Regression Tell Us?

Regression CAN tell us:	Regression CANNOT tell us:
“People who eat more sugar tend to have more cavities”	“If we reduce sugar, fewer people will get cavities”
Variables are related	What would happen under a POLICY
Prediction	Causation

Key message: Regression models capture predictive patterns. However, because of confounding, including confounding by unmeasured structural factors, these statistical associations often fail to translate into causal policy impacts.

What is Exchangeability?

Exchangeability: To make valid causal claims, the groups we compare must be fundamentally similar across all relevant dimensions before the intervention. We require a true “apples to apples” comparison.

Confounding: When structural factors such as baseline income or education differ systematically between groups, exchangeability breaks down. The comparison becomes “apples to oranges,” and direct contrasts are biased.

https://en.wikipedia.org/wiki/False_equivalence

Our Dataset

Causal Evidence: “If we reduce sugar consumption by 20g per day, how much new dental caries do we avert in the population?”

A cross-sectional study of 1000 adults:

Variable	Description
`sugar_consumption`	Daily sugar intake (g/day)
`tooth_decay`	Cavities present? (1=Yes, 0=No)
`age`	Age in years
`sex`	0=Male, 1=Female
`income`	1 (low) → 4 (high)
`education`	1 (low) → 3 (high)

Overall decay prevalence: 10.4%
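The slides state that the data are simulated but do not show the generating code. The sketch below shows one way a dataset with this structure could be simulated. Every distribution and coefficient in it is a hypothetical choice made for illustration, so it will not reproduce the 10.4% prevalence or any other estimate reported in these slides.

```r
# Hypothetical data-generating process. All parameters are illustrative
# assumptions, NOT the values used to build the workshop dataset.
set.seed(2024)
n <- 1000

age       <- round(runif(n, 18, 70))
sex       <- rbinom(n, 1, 0.5)                 # 0 = Male, 1 = Female
income    <- sample(1:4, n, replace = TRUE)    # 1 (low) to 4 (high)
education <- sample(1:3, n, replace = TRUE)    # 1 (low) to 3 (high)

# Lower income -> higher sugar intake (confounder-to-exposure path)
sugar_consumption <- pmax(rnorm(n, mean = 60 - 5 * income, sd = 12), 0)

# Decay risk depends on sugar AND directly on income
# (confounder-to-outcome path)
lp <- -4 + 0.03 * sugar_consumption - 0.3 * income + 0.01 * age
tooth_decay <- rbinom(n, 1, plogis(lp))

dental_data <- data.frame(sugar_consumption, tooth_decay,
                          age, sex, income, education)

str(dental_data)
mean(dental_data$tooth_decay)  # decay prevalence in this simulated sample
```

Because income enters both the exposure and the outcome equations, this simulated sample carries the same kind of confounding the slides go on to describe.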

Structure of Confounding

Lower-income people consume more sugar AND have more tooth decay — for reasons beyond sugar alone
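A small simulation can make this structure concrete. In the sketch below, which uses hypothetical parameters chosen only for illustration, sugar has no effect on decay at all; income drives both. A crude regression still reports a positive association, while the income-adjusted coefficient should sit close to the true value of zero.

```r
# Illustrative simulation (hypothetical parameters): sugar has NO effect
# on decay here, yet a crude comparison suggests that it does.
set.seed(1)
n <- 5000
income <- sample(1:4, n, replace = TRUE)
sugar  <- rnorm(n, mean = 70 - 8 * income, sd = 10)  # lower income, more sugar
decay  <- rbinom(n, 1, plogis(-0.5 - 0.6 * income))  # lower income, more decay

crude    <- glm(decay ~ sugar, family = binomial)
adjusted <- glm(decay ~ sugar + factor(income), family = binomial)

# The crude log-odds ratio is pulled away from 0 by income;
# adjusting for income removes that distortion in this setup.
c(crude    = unname(coef(crude)["sugar"]),
  adjusted = unname(coef(adjusted)["sugar"]))
```

Adjustment works here only because the single confounder is measured. With an unmeasured confounder the adjusted estimate would remain biased.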

The Tempting Approach

Fit a logistic regression and read off the odds ratio:

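library(dplyr)  # provides filter(), select(), mutate(), across()

# Naive approach: logistic regression adjusted for measured covariates
fit_naive <- glm(tooth_decay ~ sugar_consumption +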
                                age + sex +
                                income + education,
                 family = binomial, data = dental_data)

broom::tidy(fit_naive, exponentiate = TRUE, conf.int = TRUE) |>
  filter(term == "sugar_consumption") |>
  select(term, OR = estimate, conf.low, conf.high, p.value) |>
  mutate(across(where(is.numeric), \(x) round(x, 3))) |>
  knitr::kable()

term	OR	conf.low	conf.high	p.value
sugar_consumption	1.032	1.02	1.045	0

“Each additional gram of daily sugar increases the odds of tooth decay by ~3%.”

But this answers: “Among people who happen to differ in sugar intake, how do outcomes compare?”
Not: “If we intervened to change sugar intake, what would happen?”

The Fundamental Problem of Causal Inference

“For each person, we only see ONE reality: what actually happened. We never see what would have happened if things were different.”

age	income	sugar_consumption	tooth_decay_observed	What if sugar reduced?
46	2	42.8	0	?
38	2	58.8	0	?
48	1	32.0	0	?
55	4	43.1	0	?
43	1	76.4	1	?
36	4	47.0	0	?
43	2	48.0	0	?

“Modified Treatment Policies let us ESTIMATE the population average of these counterfactual outcomes without needing to see any of them directly.”

Static vs Modified Policy

Static (bad for continuous): “Set everyone to exactly 40g sugar” is unrealistic, and many people have no comparable data at that level.

Modified Treatment Policy (good): “For people eating more than 50g, reduce by 20g. For others, keep the same.”

“This is like a REALISTIC policy — we only ask heavy eaters to cut back, not everyone.”

\[d_1(a_t, h_t) = \begin{cases} a_t - 20 & \text{if } a_t > 50 \\ a_t & \text{otherwise} \end{cases}\]

What Does the Intervention Look Like?

policy_reduce_20 <- function(data, trt) {
  a <- data[[trt]]
  ifelse(a > 50, a - 20, a)
}

Mirrors a realistic public health intervention (e.g., sugar tax, labelling policy)
Stays within the support of the observed data ✅
Respects individual baseline levels ✅
Reduces the risk of positivity violations ✅
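Whether the shifted exposure stays within the observed support is something that can be checked rather than assumed. The sketch below runs a crude range check on hypothetical simulated data (the simulation parameters are illustrative assumptions): it reports the share of people the policy would affect and whether, within each income stratum, the shifted values fall inside the range of observed values. A range check is a rough diagnostic only; it cannot detect sparse regions inside the range.

```r
# Illustrative support check on hypothetical simulated data.
set.seed(42)
n <- 1000
income <- sample(1:4, n, replace = TRUE)
sugar  <- pmax(rnorm(n, mean = 60 - 5 * income, sd = 12), 0)
dental_data <- data.frame(income, sugar_consumption = sugar)

# Same shift function as in the slides
policy_reduce_20 <- function(data, trt) {
  a <- data[[trt]]
  ifelse(a > 50, a - 20, a)
}

dental_data$sugar_policy <- policy_reduce_20(dental_data, "sugar_consumption")

# Share of the population whose intake the policy would change
mean(dental_data$sugar_consumption > 50)

# Within each income stratum, do shifted values stay inside the
# range of intakes actually observed in that stratum?
by_income  <- split(dental_data, dental_data$income)
support_ok <- sapply(by_income, function(d) {
  obs <- range(d$sugar_consumption)
  all(d$sugar_policy >= obs[1] & d$sugar_policy <= obs[2])
})
support_ok
```

A `FALSE` for any stratum would mean the policy asks some people to take exposure values that nobody like them has in the data, which is where extrapolation begins.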

Defining the Shift in R

# Policy: Reduce sugar by 20 g/day for those consuming > 50 g/day
policy_reduce_20 <- function(data, trt) {
  a <- data[[trt]]
  ifelse(a > 50, a - 20, a)
}

# Preview what it does to a few observations
dental_data |>
  select(sugar_consumption) |>
  mutate(
    sugar_policy = policy_reduce_20(dental_data, "sugar_consumption")
  ) |>
  slice_sample(n = 7)

# A tibble: 7 × 2
  sugar_consumption sugar_policy
              <dbl>        <dbl>
1              75.6         55.6
2              56.6         36.6
3              25.6         25.6
4              61.0         41.0
5              44.5         44.5
6              45.9         45.9
7              68.0         48.0

How Does lmtp Work?

Split data → “Practice on some folds, test on the held-out fold”
Model treatment → “Learn who eats how much sugar”
Model outcomes → “Predict cavities based on sugar + other factors”
Combine & correct → “If either the treatment model or the outcome model is correct, the estimate is still valid, even when the other is imperfect”
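To give a feel for the "model outcomes" step in isolation, here is a minimal plug-in (g-computation) sketch in base R. It is deliberately simplified and is not what lmtp runs internally: there is no treatment model, no cross-fitting, and no bias-correction step, so it is valid only if the outcome model is correct. The data and all simulation parameters are hypothetical.

```r
# Plug-in (g-computation) sketch of the "model outcomes" step only.
# NOT the TMLE that lmtp performs; data and parameters are hypothetical.
set.seed(7)
n <- 5000
income <- sample(1:4, n, replace = TRUE)
age    <- round(runif(n, 18, 70))
sugar_consumption <- pmax(rnorm(n, mean = 60 - 5 * income, sd = 12), 0)
tooth_decay <- rbinom(n, 1, plogis(-4 + 0.04 * sugar_consumption -
                                     0.3 * income + 0.01 * age))
sim <- data.frame(tooth_decay, sugar_consumption, income, age)

# 1. Model the outcome given exposure and covariates
fit <- glm(tooth_decay ~ sugar_consumption + income + age,
           family = binomial, data = sim)

# 2. Apply the policy to everyone's exposure, keeping covariates fixed
shifted <- sim
shifted$sugar_consumption <- ifelse(sim$sugar_consumption > 50,
                                    sim$sugar_consumption - 20,
                                    sim$sugar_consumption)

# 3. Average the predicted risks over the whole population
risk_observed <- mean(predict(fit, newdata = sim,     type = "response"))
risk_policy   <- mean(predict(fit, newdata = shifted, type = "response"))

c(observed   = risk_observed,
  policy     = risk_policy,
  difference = risk_policy - risk_observed)
```

The doubly robust estimator adds the treatment-model step and a correction on top of this, which is what protects the estimate when the outcome model alone is misspecified.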

In R: Observed world estimate

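library(lmtp)

# Observed world: no shift, so this estimates risk under current intake
fit_obs <- lmtp_tmle(
  data         = dental_data,
  trt          = "sugar_consumption",
  outcome      = "tooth_decay",
  baseline     = c("age", "sex", "income", "education"),
  shift        = NULL,              # NULL = leave exposure as observed
  outcome_type = "binomial",
  learners_outcome = "SL.glm",      # learner library for the outcome model
  learners_trt     = "SL.glm",      # learner library for the treatment model
  folds = 3                         # number of cross-fitting folds
)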

tidy(fit_obs)

# A tibble: 1 × 4
  estimate std.error conf.low conf.high
     <dbl>     <dbl>    <dbl>     <dbl>
1    0.104   0.00966   0.0850     0.123

In R: Policy scenario estimate

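# Policy world: same call, but with the shift function applied
fit_mtp_20 <- lmtp_tmle(
  data         = dental_data,
  trt          = "sugar_consumption",
  outcome      = "tooth_decay",
  baseline     = c("age", "sex", "income", "education"),
  shift        = policy_reduce_20,  # the policy defined earlier
  mtp          = TRUE,              # continuous exposure; shift depends on observed value
  outcome_type = "binomial",
  learners_outcome = "SL.glm",
  learners_trt     = "SL.glm",
  folds = 3
)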

tidy(fit_mtp_20)

# A tibble: 1 × 4
  estimate std.error conf.low conf.high
     <dbl>     <dbl>    <dbl>     <dbl>
1   0.0756   0.00913   0.0577    0.0935

The Causal Contrasts

Absolute causal contrasts

# Additive causal contrasts: E[Y^d] - E[Y^obs]
contrast_20 <- lmtp_contrast(fit_mtp_20, ref = fit_obs, type = "additive")

contrast_20$estimates |>
  mutate(Policy = "Reduce 20g if >50g") |>
  select(Policy, estimate, conf.low, conf.high)

# A tibble: 1 × 4
  Policy             estimate conf.low conf.high
  <chr>                 <dbl>    <dbl>     <dbl>
1 Reduce 20g if >50g  -0.0283  -0.0392   -0.0174

Relative causal contrasts

# Relative causal contrasts: E[Y^d] / E[Y^obs]
contrast_20_rr <- lmtp_contrast(fit_mtp_20, ref = fit_obs, type = "rr")

contrast_20_rr$estimates |>
  mutate(Policy = "Reduce 20g if >50g") |>
  select(Policy, estimate, conf.low, conf.high)

# A tibble: 1 × 4
  Policy             estimate conf.low conf.high
  <chr>                 <dbl>    <dbl>     <dbl>
1 Reduce 20g if >50g    0.728    0.630     0.825
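The arithmetic linking the two fitted risks to these contrasts can be checked by hand. The sketch below uses only the rounded point estimates printed in the outputs above, so its results can differ from the reported contrasts in the third or fourth decimal place. The variable names are illustrative, and the "per 1000 adults" figure is simply the risk difference rescaled.

```r
# Point estimates as printed in the slide outputs (rounded)
risk_obs    <- 0.104    # observed-world risk (fit_obs)
risk_policy <- 0.0756   # risk under the policy (fit_mtp_20)
rr_reported <- 0.728    # relative contrast reported by lmtp_contrast()

risk_difference    <- risk_policy - risk_obs    # additive contrast
risk_ratio         <- risk_policy / risk_obs    # relative contrast
relative_reduction <- 1 - rr_reported           # proportion of risk removed
averted_per_1000   <- -risk_difference * 1000   # cases averted per 1000 adults

round(c(risk_difference    = risk_difference,
        risk_ratio         = risk_ratio,
        relative_reduction = relative_reduction,
        averted_per_1000   = averted_per_1000), 3)
```

Note that the relative reduction is one minus the risk ratio, not one minus the policy risk. Confidence intervals cannot be obtained this way; they come from `lmtp_contrast()`, which accounts for the correlation between the two estimates.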

Interpreting the Answers

In the real world: 10.4% (estimate of fit_obs = 0.104) have tooth decay

If we reduced sugar by 20g for heavy eaters: 7.6% (fit_mtp_20 estimate = 0.0756) would have tooth decay

That’s a 2.8 percentage-point DROP (or 1 - 0.728 = 0.272, i.e., a 27.2% relative reduction in risk) in tooth decay!

CAUTION: This causal interpretation holds only given exchangeability, positivity, and consistency.

Assumptions

Assumption	Simple explanation
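Positivity	“The policy is realistic: for everyone it would shift, people like them already eat the shifted amount of sugar”
No hidden factors (exchangeability)	“We measured the important common causes of sugar intake and tooth decay (income, age, etc.)”
No spillover (no interference)	“Your sugar doesn’t affect my teeth”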
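For readers who want the formal counterparts, one common way to write these conditions for a single-time-point exposure is sketched below. The notation is an assumption added here: \(A\) is sugar intake, \(W\) the measured covariates, \(Y\) tooth decay, \(Y^{a}\) the outcome under intake \(a\), and \(d(a, w)\) the policy. The published MTP literature states slightly different technical versions of these conditions.

```latex
\begin{align*}
\text{Positivity:}\quad
  & (a, w) \in \operatorname{supp}(A, W)
    \;\Longrightarrow\; \bigl(d(a, w),\, w\bigr) \in \operatorname{supp}(A, W) \\
\text{Exchangeability:}\quad
  & Y^{a} \perp A \mid W \quad \text{for every } a \text{ in the support} \\
\text{Consistency (with no interference):}\quad
  & A_i = a \;\Longrightarrow\; Y_i = Y_i^{a}
\end{align*}
```

The positivity line says that the policy never moves anyone to an intake level that is unobserved among people with the same covariates.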

Summary

Regression shows “what goes with what”
MTP answers “what if we did this policy?”
Works even for continuous exposures like sugar
Gives answers policy makers can actually use

Ask causal questions. Define feasible interventions. Use the right tools.

Resources

Key papers

Díaz et al. (2023). Nonparametric Causal Effects Based on Longitudinal Modified Treatment Policies. JASA.
Kennedy (2019). Nonparametric Causal Effects Based on Incremental Propensity Score Interventions. JASA.

Software

install.packages("lmtp")
# https://beyondtheate.com/

All analyses use a simulated dataset for illustration.