Model

To analyze the factors associated with the likelihood of an arrest being made, we fit a Bayesian Bernoulli (logistic) regression model. The model estimates how subject sex and the date of the event are associated with the probability of an arrest. The formula used is:

arrest_made ∼ subject_sex + as.numeric(date)
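
Written out, the formula corresponds to the following logistic model. This is a sketch of the likelihood only: the symbols y_i, male_i, date_i and the beta names are notation introduced here for illustration, and the priors are omitted because the source does not report them.

```latex
\begin{align*}
y_i &\sim \operatorname{Bernoulli}(p_i), \\
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i}
  &= \beta_0 + \beta_1\,\mathrm{male}_i + \beta_2\,\mathrm{date}_i ,
\end{align*}
% y_i     : arrest_made for observation i (1 = arrest, 0 = no arrest)
% male_i  : 1 if subject_sex is male, 0 for the reference category (female)
% date_i  : as.numeric(date) for observation i
```

Here beta_0, beta_1 and beta_2 correspond to the Intercept, subject_sexmale and as.numericdate rows of the model summary.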

 Family: bernoulli 
  Links: mu = logit 
Formula: arrest_made ~ subject_sex + as.numeric(date) 
   Data: clean_data (Number of observations: 154640) 
  Draws: 4 chains, each with iter = 1000; warmup = 500; thin = 1;
         total post-warmup draws = 2000

Regression Coefficients:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           3.20      2.82    -2.06     5.92 1.54        7       23
subject_sexmale     0.76      0.35     0.48     1.37 1.60        7       11
as.numericdate     -0.00      0.00    -0.00    -0.00 1.59        7       11

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

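The model is a Bayesian Bernoulli regression of arrest_made on subject_sex and date, using the logit link function. The dataset includes 154,640 observations, and the model was fit using 4 chains of 1000 iterations each, with the first 500 iterations of each chain discarded as warmup, leaving 2000 post-warmup draws. The intercept estimate is 3.20 on the log-odds scale, and subject_sexmale has a coefficient of 0.76, indicating that males have higher log-odds of arrest than females. The date coefficient is very small per day (about -0.0004), although its 95% credible interval excludes zero. These estimates should be treated as unreliable, however, because the convergence diagnostics show that the chains have not converged: Rhat ranges from 1.54 to 1.60, well above the value of 1 expected at convergence, and the bulk effective sample size is only 7 for every parameter. Rhat and effective sample size describe how well the sampler explored the posterior, not how well the model fits the data.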
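
The Rhat and effective sample size values in the summary above can be checked against rule-of-thumb thresholds. The sketch below is illustrative only: the diagnostic values are copied from the summary, while the thresholds (Rhat above 1.05, or fewer than 400 effective draws across the 4 chains) and the helper name `flag_unconverged` are assumptions introduced here, not part of the original analysis.

```python
# Diagnostics copied from the model summary (Rhat, Bulk_ESS, Tail_ESS).
diagnostics = {
    "Intercept": {"rhat": 1.54, "bulk_ess": 7, "tail_ess": 23},
    "subject_sexmale": {"rhat": 1.60, "bulk_ess": 7, "tail_ess": 11},
    "as.numericdate": {"rhat": 1.59, "bulk_ess": 7, "tail_ess": 11},
}

# Assumed rule-of-thumb thresholds; they are not taken from the source.
RHAT_MAX = 1.05
ESS_MIN = 400  # roughly 100 effective draws per chain for 4 chains


def flag_unconverged(diag, rhat_max=RHAT_MAX, ess_min=ESS_MIN):
    """Return the parameters that fail the Rhat check or the ESS check."""
    flagged = []
    for name, d in diag.items():
        low_ess = min(d["bulk_ess"], d["tail_ess"]) < ess_min
        if d["rhat"] > rhat_max or low_ess:
            flagged.append(name)
    return flagged


flagged = flag_unconverged(diagnostics)
print("Parameters failing convergence checks:", flagged)
```

Under these assumed thresholds every parameter is flagged, so the posterior summaries should not be interpreted until a refit passes the checks.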

Summary of Bernoulli Model Coefficients

Term                       Beta   95% CI Lower   95% CI Upper
Intercept             3.1957956     -2.0599654      5.9249951
Subject Sex (Male)    0.7617674      0.4798693      1.3708080
Date                 -0.0004272     -0.0006160     -0.0000077

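Intercept: The estimate is 3.20 with a 95% credible interval ranging from -2.06 to 5.92. This value represents the log-odds of an arrest for the reference category (female) when the numeric date equals zero. If date is stored as an R Date, zero corresponds to 1970-01-01, so the intercept is an extrapolation rather than a baseline within the observed period.

Subject Sex (Male): The coefficient is 0.76 with a 95% credible interval from 0.48 to 1.37. This suggests that being male is associated with higher log-odds of an arrest compared to the reference category (female).

Date: The coefficient is -0.0004272 with a 95% credible interval from -0.0006160 to -0.0000077. The model summary shows these values as -0.00 only because it rounds to two decimal places. This indicates a very slight decrease in the log-odds of an arrest with each passing day, which can accumulate to a larger change over months or years.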
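
The log-odds coefficients above can be easier to read as odds ratios, and the per-day date effect as a per-year effect. The sketch below does that arithmetic with the values from the coefficient table. It assumes that date is measured in days and uses 365 days per year; the names `odds_ratio` and `DAYS_PER_YEAR` are introduced here for illustration. The same conversion in R is a call to `exp()`.

```python
import math

# Posterior means and 95% credible bounds copied from the coefficient table.
coefs = {
    "Subject Sex (Male)": (0.7617674, 0.4798693, 1.3708080),
    "Date (per day)": (-0.0004272, -0.0006160, -0.0000077),
}


def odds_ratio(log_odds):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(log_odds)


for term, (beta, lower, upper) in coefs.items():
    print(f"{term}: OR = {odds_ratio(beta):.4f} "
          f"(95% CI {odds_ratio(lower):.4f} to {odds_ratio(upper):.4f})")

# Assumption: date is in days, so one year is 365 units of the predictor.
DAYS_PER_YEAR = 365
beta_date = coefs["Date (per day)"][0]
per_year_log_odds = beta_date * DAYS_PER_YEAR
per_year_or = odds_ratio(per_year_log_odds)
print(f"Date (per year): log-odds change = {per_year_log_odds:.4f}, "
      f"OR = {per_year_or:.4f}")
```

With these inputs the odds ratio for males is a little above 2, and the date effect works out to a drop of roughly 0.16 in log-odds per year. These conversions inherit any unreliability in the underlying estimates.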