The R package `margins` provides average marginal effects for any model developed with `lm` or `glm`, among others. However, `margins` does not work with `brms`. Jack O'Bailey developed two functions for `brms` here. There are two functions in that text file: the first gets the average marginal effects for continuous independent variables, and the second gets the average marginal effects for categorical variables. The link compares the results of the two functions with the `margins` command from the `margins` package. If you run the example, you can see that the results are remarkably similar.
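For readers who have not used `margins`, here is a minimal sketch of the workflow it supports, using a toy logistic regression unrelated to the data below:

```r
library(margins)

# A simple logistic regression on a built-in dataset
m <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Average marginal effects of hp and wt on the probability that am = 1
summary(margins(m))
```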
Building on Jack O'Bailey's code, this post attempts to recreate the functionality of the `margins` command from the `margins` package with ordinal logistic regression models. The adapted function is below:
```r
# The function uses dplyr/tidyr verbs and tidybayes::add_fitted_draws()
library(dplyr)
library(tidyr)
library(tidybayes)

bayes_dydx_ordinal.factor <- function(model, data = NULL, variable, re_formula = NULL){

  # Get data from model where data = NULL
  if (is.null(data)) {
    d <- model$data
  } else {
    d <- data
  }

  # Get outcome name from model
  resp <- model$formula$resp

  # Omit outcome from data and collapse duplicate covariate rows,
  # keeping the count as a weight w
  d <-
    d %>%
    dplyr::select(-all_of(resp)) %>%
    group_by_all() %>%
    count(name = "w") %>%
    ungroup()

  # Get factor levels of the focal variable
  levs <- levels(as.factor(d[[variable]]))
  base <- levs[1L]
  cont <- levs[-1L]

  # Create empty list for fitted draws
  f <- list()

  # For each level add fitted draws
  for (i in seq_along(levs)){

    # Fix the focal variable at this factor level
    d[[variable]] <- levs[i]

    f[[i]] <-
      d %>%
      add_fitted_draws(model = model,
                       re_formula = re_formula,
                       value = "eff") %>%
      # Ordinal models return one fitted draw per outcome category, so the
      # grouping must include .category as well as .draw
      group_by_at(vars(".draw", ".category")) %>%
      summarise(eff_w = sum(eff * w) / sum(w)) %>%
      ungroup() %>%
      select(-.draw)

    # Compute contrast if not base level; f[[1]] has two columns
    # (.category and eff_w), so the contrast uses the second column
    if (i > 1){
      f[[i]]$eff_w <- f[[i]]$eff_w - f[[1]][[2]]
    }

    # Rename columns: outcome category plus the focal level
    names(f[[i]]) <- c(paste0("Category", levs[i]), levs[i])
  }

  # Remove data frame
  d <- NULL

  # Create output object
  out <- do.call(cbind, f)

  # Return AMEs (one row per posterior draw and outcome category)
  if (length(cont) == 1){
    out <- out %>% as_tibble() %>% select(1, all_of(cont))
    names(out)[2] <- "est"
    out %>%
      mutate(
        var = variable,
        resp = cont
      ) %>%
      dplyr::select(Category = 1, var, resp, est)
  } else {
    out %>%
      select(1, all_of(cont)) %>%
      tidyr::pivot_longer(cols = -1,
                          names_to = "variable", values_to = "value") %>%
      rename(resp = variable, est = value, Category = 1)
  }
}
```
The example uses the `nhanes2` dataset, one of the Stata example datasets, which can be loaded in R with the `webuse` package.
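I have not shown the data preparation that creates `nhanes2_a`, the data frame passed to `brm()` below. A rough sketch, assuming the raw `nhanes2` variable names (the exact recoding is an assumption), might look like this:

```r
library(dplyr)
library(webuse)

# Download the Stata example dataset (assumed access route)
nhanes2 <- webuse("nhanes2")

# Keep the variables used in the model, drop missing rows, and convert to
# factors; the outcome should be an ordered factor for cumulative("logit")
nhanes2_a <- nhanes2 %>%
  select(health, female, black, age) %>%
  na.omit() %>%
  mutate(
    health = factor(health, ordered = TRUE),
    female = factor(female),
    black  = factor(black)
  )
```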
We next run the ordinal Bayesian model using `brms`. The code is below:
```r
library(brms)

# Fit Bayesian ordinal (cumulative logit) model
bayes_ordinal <- brm(formula = health ~ female + black + age,
                     family  = cumulative("logit"),
                     prior   = c(prior(normal(0, 1), class = b)),
                     data    = nhanes2_a,
                     warmup  = 1000,
                     iter    = 2000,
                     chains  = 2,
                     cores   = 2)
```
```r
# Posterior median AME for one categorical variable
AMEByVar <- function(x) {
  bayes_dydx_ordinal.factor(bayes_ordinal, nhanes2_a, variable = x) %>%
    group_by(Category, resp) %>%
    summarise(p50 = median(est)) %>%
    ungroup() %>%
    mutate(Var = x)
}

BayesAME <- purrr::map(list("black", "female"), AMEByVar) %>%
  bind_rows()
```
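Because `bayes_dydx_ordinal.factor` returns one estimate per posterior draw and outcome category, we are not limited to the posterior median: the same draws can be summarised with credible intervals. For example:

```r
# 95% credible intervals for the AME of black, by outcome category
bayes_dydx_ordinal.factor(bayes_ordinal, nhanes2_a, variable = "black") %>%
  group_by(Category, resp) %>%
  summarise(
    p50   = median(est),
    lower = quantile(est, 0.025),
    upper = quantile(est, 0.975)
  ) %>%
  ungroup()
```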
As a first step, we can compare the log odds ratios between the Bayesian model with normally distributed priors and the previous model with flat priors. As the table below shows, the coefficients are very similar.
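The comparison table was assembled separately; a minimal sketch of how the two sets of estimates could be pulled (assuming the flat-prior fit is a `brmsfit` object called `bayes_ordinal_flat`, a placeholder name) is:

```r
library(broom.mixed)

# Posterior means and 95% intervals for the normal(0, 1) prior model
fixef(bayes_ordinal)[, c("Estimate", "Q2.5", "Q97.5")]

# Equivalent summaries for the flat-prior model (hypothetical object name)
tidy(bayes_ordinal_flat, conf.int = TRUE) %>%
  dplyr::select(term, estimate, conf.low, conf.high)
```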
**Coefficient Comparison**

| term | Bayesian Estimate | Bayesian Q2.5 | Bayesian Q97.5 | Flat Prior estimate | Flat Prior conf.low | Flat Prior conf.high |
|---|---|---|---|---|---|---|
| female1 | -0.117 | -0.189 | -0.046 | -0.117 | -0.187 | -0.047 |
| black1 | -0.882 | -0.996 | -0.771 | -0.884 | -0.999 | -0.770 |
| age | -0.041 | -0.043 | -0.039 | -0.041 | -0.043 | -0.039 |
We next use the adapted function `bayes_dydx_ordinal.factor` to compare the marginal effects. Below we compare the marginal effects from the `bayes_dydx_ordinal.factor` function to the `margins` function. Importantly, there is no documentation on how `margins` works with ordinal regressions. There is great documentation here and a great technical paper, but neither covers ordinal regression.
**Average Marginal Effects Comparison**

| Health Status | Response | AME | AME_Bayes |
|---|---|---|---|
| poor | black | 0.073849996 | 0.073619320 |
| poor | female | 0.007427055 | 0.007454756 |
| fair | black | 0.073849996 | 0.092447854 |
| fair | female | 0.007427055 | 0.011791804 |
| average | black | 0.073849996 | 0.020653600 |
| average | female | 0.007427055 | 0.006522333 |
| good | black | 0.073849996 | -0.067484024 |
| good | female | 0.007427055 | -0.006806429 |
| excellent | black | 0.073849996 | -0.119285137 |
| excellent | female | 0.007427055 | -0.018959347 |
As we can see, the marginal effects from the `margins` package do not match those from `bayes_dydx_ordinal.factor`: the `margins` estimates do not vary by outcome level. The two approaches are very similar at the first level, where health status equals poor, but differ considerably for the other outcomes.
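For reference, a frequentist AME column like the one above could be produced along these lines; I have not shown that step, so treat the model choice (`MASS::polr`) and the call as a sketch rather than the exact code behind the table:

```r
library(MASS)
library(margins)

# Frequentist ordinal logit on the same data (assumed specification)
freq_ordinal <- polr(health ~ female + black + age, data = nhanes2_a, Hess = TRUE)

# Average marginal effects; note this returns a single AME per regressor
# rather than one per outcome category
summary(margins(freq_ordinal))
```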
Looking at STATA
STATA users seem to use average marginal effects more. This forum post details the general process of [average marginal effects](https://www.statalist.org/forums/forum/general-stata-discussion/general/1465336-ordered-probit-marginal-effects) for ordered models.
The basic idea is that there should be a separate average marginal effect for each level of the dependent variable. The interpretation is then that a one-unit change in x is associated with a given change in the probability of being in a category.
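In other words, for a binary regressor $x$ and outcome category $j$, the quantity being estimated is roughly (my notation, not from the sources above):

$$
\mathrm{AME}_j = \frac{1}{N}\sum_{i=1}^{N}\Big[\Pr(y_i = j \mid x_i = 1) - \Pr(y_i = j \mid x_i = 0)\Big]
$$

This is what `bayes_dydx_ordinal.factor` computes within each posterior draw: it fixes the focal variable at each level, averages the fitted category probabilities over the observed data (weighted by `w`), and takes the contrast against the base level.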
There is a paper that uses the same `nhanes2` dataset and provides average marginal effects for each outcome. The paper is here. The third page provides the comparison. I put those estimates into a table and compared them with the output of `bayes_dydx_ordinal.factor`. The paper only provides the average marginal effects for the independent variable `black`.
**Average Marginal Effects Comparison - 2**

| Health Status | AME | p50 |
|---|---|---|
| poor | 0.074 | 0.074 |
| fair | 0.093 | 0.092 |
| average | 0.021 | 0.021 |
| good | -0.068 | -0.067 |
| excellent | -0.120 | -0.119 |
As we can see, our estimates line up with the estimates provided by Richard Williams’ paper discussed above.
Changes to the function:
There are very few changes to the original function:
The first change results from `add_fitted_draws`. When we use this function with an ordinal regression model, it adds a variable to the output called `.category`, which represents the levels of the dependent variable.
The change is simple:
- From: `group_by_at(vars(".draw"))`
- To: `group_by_at(vars(".draw", ".category"))`
This gives us fitted draws, for each outcome category, from the data used to fit the model. The relevant portion of the function is:
```r
    f[[i]] <-
      d %>%
      add_fitted_draws(model = model,
                       re_formula = re_formula,
                       value = "eff") %>%
      # Changed: group by .category as well as .draw, because ordinal models
      # return one fitted draw per outcome category
      group_by_at(vars(".draw", ".category")) %>%   # was: group_by_at(vars(".draw"))
      summarise(eff_w = sum(eff * w) / sum(w)) %>%
      ungroup() %>%
      select(-.draw)

    # Compute contrast if not base level; f[[1]] now has two columns
    # (.category and eff_w), so the contrast uses the second column
    if (i > 1){
      f[[i]]$eff_w <- f[[i]]$eff_w - f[[1]][[2]]
    }
```
The contrast code also needs to change because the data frame `f[[1]]` now has two columns: the first is the outcome category and the second holds the weighted fitted values, so the contrast subtracts the second column (`f[[1]][[2]]`) rather than the first.
There are other changes regarding the format of the final output, but they are small, simple changes made with `dplyr`.
Interpretation:
The interpretation of the average marginal effects for an ordinal model, as discussed earlier, is that a one-unit change in the independent variable is associated with an x percentage-point change in the probability of being in a given category. For example, using the table above, being black is associated with roughly a 7.4 percentage-point higher probability of reporting poor health and a 12 percentage-point lower probability of reporting excellent health.