Technical and Programming

Forever Chemicals in the Water

Chemicals

Environment

Water

web-scraping

Exploring the concentration of PFOA and PFOS in the drinking water.

Jul 3, 2022

Moving to quarto

Quarto

Moving to Quarto from Distill

Jun 5, 2022

Pfizer BNT162b2 for Under 5s

Bayes

Covid-19

Clinical Trials

Stan

Causal Inference

A Bayesian reanalysis of estimates for the Pfizer Vaccine candidate for children under five years old. Frequentist statistics say it fails while Bayes would indicate that it should be approved.

Feb 11, 2022

Waste Water Monitoring and COVID-19

covid-19

mgcv

Investigating how well Waste Water predicts COVID-19 cases in North Carolina.

Jan 1, 2022

Can Bayesian Analysis Save Paxlovid?

Bayes

Covid-19

Clinical Trials

Stan

Causal Inference

Can Bayesian analysis be used to understand the impact of a treatment even though the frequentist results are not significant?

Dec 15, 2021

Advantage of Bayesian Hypothesis Testing

Bayes

brms

hypothesis

contrasts

Covid-19

Here I look at the differences between traditional linear contrasts and Bayesian hypothesis testing.

Nov 19, 2021

Neutralizing Antibody Titres

antibodies

nAbs

Covid-19

Immunology

Bayes

Stan

How long someone has detectable neutralizing antibodies after vaccination is important in understanding the impact of vaccination on disease transmission. In this post I step through several different models and re-examine data from a prior paper.

Nov 11, 2021

Thinking About Viral Evolution

musings

sars-cov-2

Covid-19

epidemiology

Exploring how contact rate and virulence affect the basic reproduction number of a pathogen.

Oct 9, 2021

RProfiles

workflow

rprofile

Functions and settings in your Rprofile can be dangerous for reproducibility, but can afford some nice workflow tools.

Oct 6, 2021

How Discerning is the Technical Challenge in GBBO?

IRT

baking

Are the technical challenges, which are judged blindly, good indicators of whether a baker will win overall?

Sep 23, 2021

Comparing Mortality Rates is Hard

mortality

demography

rates

delays

time series

Comparing crude mortality rates across NC during the COVID-19 pandemic shows differences, but fails to capture the nuance of potential sources of bias.

May 9, 2021

Time to Vaccinate

bayes

time series

public health

Covid-19

Using Bayesian structural time series to estimate when some North Carolina counties will have vaccinated a sufficient number of residents.

May 7, 2021

ShinyProxy Serving Websites

shiny

deploy

shinyproxy

This post discusses using the ShinyProxy framework to serve static HTML sites. These products could range from single R Markdown documents to entire websites. Serving these items in containers gives you all the benefits of containerising your work along with the ability to authenticate through ShinyProxy if desired.

Oct 3, 2020

Bayesian SIR

Bayes

SIR

Compartmental Model

Epidemiology

In this post I review how to build a compartmental model using the Stan probabilistic programming language. This is based largely on the case study Bayesian workflow for disease transmission modeling in Stan, which has been expanded to include a second compartment for exposed individuals and to use case incidence data rather than prevalence.

Sep 5, 2020
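
As a rough illustration of the compartmental structure described in the Bayesian SIR entry above, here is a minimal deterministic SEIR sketch in Python. The post itself uses Stan; this is not that model. The function name, the parameter names (beta, sigma, gamma), the Euler time step, and the choice to count incidence as the flow from exposed to infectious are all assumptions made for illustration.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infected, days, dt=0.1):
    """Euler-step SEIR model returning daily incidence and the final state.

    beta  : transmission rate per day (assumed name)
    sigma : rate of leaving the exposed compartment per day (assumed name)
    gamma : recovery rate per day (assumed name)
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    daily_incidence = []
    for _ in range(days):
        new_cases = 0.0
        for _ in range(steps_per_day):
            infections = beta * s * i / population * dt  # S -> E
            onsets = sigma * e * dt                      # E -> I
            recoveries = gamma * i * dt                  # I -> R
            s -= infections
            e += infections - onsets
            i += onsets - recoveries
            r += recoveries
            # Incidence is counted here as new infectious cases (an assumption).
            new_cases += onsets
        daily_incidence.append(new_cases)
    return daily_incidence, (s, e, i, r)


incidence, final_state = simulate_seir(
    beta=0.5, sigma=0.2, gamma=0.1,
    population=1000, initial_infected=1, days=60,
)
```

Fitting to incidence rather than prevalence means comparing observed new cases per day with a series like `incidence`, not with the number currently infectious.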

Negative Binomial Distribution and Epidemics

negative binomial

epidemics

Super-spreading events can be characterised by a single case spreading to a larger than expected number of people. This phenomenon can be well-represented by a negative binomial distribution versus a standard Poisson distribution. In this post I review the overdispersion factor and how it can be parameterised in a model.

Sep 1, 2020
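
To make the overdispersion idea in the entry above concrete, here is a small Python sketch comparing a Poisson and a negative binomial distribution with the same mean. It assumes the mean and dispersion (k) parameterisation, where the variance is mean + mean^2 / k; the post may parameterise the model differently. The function names and the example values (mean 2.5, k 0.1) are hypothetical.

```python
import math


def nb_variance(mean, k):
    """Variance of a negative binomial with the given mean and dispersion k."""
    return mean + mean ** 2 / k


def nb_prob_zero(mean, k):
    """Probability that a case infects nobody under the negative binomial."""
    return (1 + mean / k) ** (-k)


def poisson_prob_zero(mean):
    """Probability that a case infects nobody under the Poisson."""
    return math.exp(-mean)


mean_cases, dispersion = 2.5, 0.1  # illustrative values only
poisson_var = mean_cases           # Poisson variance equals its mean
negbin_var = nb_variance(mean_cases, dispersion)
share_no_spread_nb = nb_prob_zero(mean_cases, dispersion)
share_no_spread_pois = poisson_prob_zero(mean_cases)
```

With a small k the variance is far larger than the mean and most cases infect nobody, so a few cases must account for most of the spread. As k grows the negative binomial approaches the Poisson.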

Optimisation with Stan

Stan

Optimisation

Using Stan for optimisation.

Aug 27, 2020

ggdist and Epidemic Curves

pandemic

scenarios

curve statistics

This post explores using tools to summarise curves rather than fixed time summary methods. This includes using odin and ggdist to explore the risk of underestimating epidemic curves.

Aug 9, 2020

Sensitivity and Specificity

pandemic

bayes

sensitivity

Here I explore the implications of different levels of sensitivity and specificity in a Bayesian framework. All of this work is based on Gelman and Carpenter.

Aug 9, 2020
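
A point-estimate version of the sensitivity and specificity reasoning above can be sketched with Bayes' rule. This is not the full Bayesian model from Gelman and Carpenter, which treats sensitivity and specificity as unknown; it only shows how the three quantities combine. The function name and the example values are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of truly being positive given a positive test result."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


# Illustrative values only: a fairly good test applied at low prevalence.
ppv = positive_predictive_value(sensitivity=0.90, specificity=0.95, prevalence=0.01)
```

At low prevalence, false positives from the large uninfected group can outnumber true positives even when specificity looks high.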

julia ABM SIR

pandemic

scenarios

curve statistics

julia

agent based models

Run agent-based models in Julia and visualise them in R.

Aug 9, 2020

Michael DeWitt

Flatten the Curve

pandemic

exponential growth modeling

In this post I explore the potential growth rate of Covid-19 in Forsyth County, NC. This also includes looking at the kind of load that this virus could place on our existing healthcare systems. I strongly advocate for acting to delay the flood of potential community-acquired infections.

Mar 13, 2020

Airflow on Windows Linux Subsystem

airflow

wls

git

scheduling

apache airflow

In this post I detail the process for getting a working instance of Apache Airflow on Windows Linux Subsystem. This is a combination of several different posts spread across the internet. Apache Airflow is an exceptional program for scheduling and running tasks.

Mar 4, 2020

2020 Plans

Resolutions

Stan

A preview of some of the items that I will try to write about in 2020.

Jan 1, 2020

How About Impeachment?

Political

Bayes

State Space

In a previous blog post I looked at approval ratings. Now that impeachment is the topic of the day, I think it would be wise to try the same methodology with the public opinion surrounding impeachment. While the data are much more sparse, it will be fun to examine.

Oct 8, 2019

Approval Rating Now?

Political

Bayes

State Space

Given the current controversy regarding President Trump, let’s use a state-space Bayesian model to see what his approval rating currently is. As more surveys go into the field this will change, but let’s just look now.

Sep 26, 2019

Integrating Over Your Loss Function

Bayes

Loss Functions

Assessment

Stan

Often when doing an analysis, it is important to put the results in the context of the loss. For example, a small effect that is cheaply implemented might be the best use of resources. Using Bayesian modeling and loss functions we can better assess the impact and provide better information for decision-making when it comes to allocation of scarce resources (especially in the world of small effect sizes).

Sep 18, 2019
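
The idea of integrating over a loss function can be sketched by averaging a loss over posterior draws. The draws, the cost, and the value per unit of effect below are made up for illustration, and the function names are hypothetical; in practice the draws would come from a fitted model such as one written in Stan.

```python
def expected_loss(posterior_draws, loss):
    """Average a loss function over posterior draws of the effect size."""
    return sum(loss(draw) for draw in posterior_draws) / len(posterior_draws)


def loss_if_implemented(effect, cost=1.0, value_per_unit=2.0):
    """Net loss of acting: the cost paid minus the value of the effect gained."""
    return cost - value_per_unit * effect


# Hypothetical posterior draws of a small, uncertain effect.
draws = [-0.5, 0.0, 0.5, 1.0, 2.0]
loss_act = expected_loss(draws, loss_if_implemented)
loss_do_nothing = 0.0
decision = "implement" if loss_act < loss_do_nothing else "do nothing"
```

The decision depends on the whole posterior and on the costs, not only on whether the effect is distinguishable from zero.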

Remembering Apollo

Some ruminations about the legacy of Apollo and doing things when failure isn’t an option.

Jul 19, 2019

On the use of command line tools

CLI

awk

sed

bash

Using AWK to parse court calendars

Jun 22, 2019

Defining a Project Workflow

Tooling

GPP

Workflow

Having a defined project workflow is important for many reasons. Consistency of design allows for easier sharing (you or other collaborators don’t have to look for things) and reduces some cognitive load by allowing you to focus on content and less on form. This is my lightly opinionated project structure. Of course these views are ever evolving.

Jun 10, 2019

Finding the Needle in the Haystack

Sensitivity

Cost Benefit Analysis

Sometimes instead of accuracy we need to look at different metrics. One such metric is sensitivity, which measures how many of the actual targets the model correctly identifies. This can be the metric of choice over accuracy when you are dealing with a rare event such as a terrorist attack or even student retention. It is always important to understand what metrics you are optimising your models on.

Jun 9, 2019
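
A small Python sketch of why accuracy can mislead for rare events, using made-up confusion matrix counts. The function names and numbers are hypothetical and are not taken from the post.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all cases the model labels correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Share of actual targets the model correctly identifies."""
    return tp / (tp + fn)


# Hypothetical rare event: 10 targets among 1000 cases, and a model
# that always predicts "not a target".
tp, fp, tn, fn = 0, 0, 990, 10
model_accuracy = accuracy(tp, fp, tn, fn)
model_sensitivity = sensitivity(tp, fn)
```

Here the model is right 99% of the time yet finds none of the targets, which is why sensitivity can be the better metric to optimise for rare events.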

State Space Models for Poll Prediction

Political

Bayes

State Space

In this section I replicate some state space poll modeling that James Savage and Peter Ellis used in a few different scenarios. State space modeling provides a great way to model time series effects when the data are collected at irregular intervals (e.g. opinion polling).

May 18, 2019

Re-districting in Winston-Salem

Political

In this post I explore potential outcomes for the composition of the Winston-Salem city council.

Apr 8, 2019

Omitted Variable Bias

fake data

omitted variable

inference

Exploring the implications of omitted variables in analysis.

Apr 7, 2019

MRP Redux

Bayes

MRP

brms

Using fake data simulations to understand our MRP model.

Apr 5, 2019

Speeding Things Up with Rcpp

Rcpp

Bayes

Metropolis-Hastings samplers are typically slow in R because of the inability to parallelise or vectorise operations. The Rcpp package provides a way to use C++ to conduct these MCMC operations at a much greater speed. This post explores how one would do this, achieving a >20x speed up.

Apr 4, 2019

Latex in ggplot2

ggplot2

data visualisation

This is a quick overview of a trick to add LaTeX in ggplot2.

Apr 3, 2019

MRP using brms

Bayes

mrp

prediction

This post explores MRP using brms and tidyverse modeling.

Nov 7, 2018

Replicating gsynth

causal inference

synthetic controls

econometrics

The purpose of this post is to replicate the examples in the gsynth package for synthetic controls. This is a methodology for causal inference especially at the state level.

Oct 29, 2018

Hierarchical Time Series with hts

time series

This is just a quick reproduction of the items discussed in the hts package. This allows for hierarchical time series which is an important feature when looking at data that take a hierarchical format like counties within a state or precincts within counties within states.

Oct 28, 2018

the power of fake data simulations

Bayes

Hierarchical Modeling

Fake Data

Causal Inference

Looking at a blog post that Andrew Gelman posted on fake data simulations and HLM. The power of fake data simulations is that it really makes you think twice about the kind of effect you are looking for as well as the power of your research design to detect it. This illustrates a really good practice for anyone looking to do this kind of analysis.

Sep 24, 2018

a foray into network analysis

network analysis

Network analysis provides a way to analyse the interconnectedness of different networks. This can provide insight into social networks, interconnected groups of text, tweets, etc. Visualisations help to show these relationships, and numeric measures help to quantify them.

Sep 17, 2018

models of microeconomics

econometrics

modeling

Exploring the examples in Kleiber and Zeileis’ Applied Econometrics with R

Sep 16, 2018

Analysis of Short Time Series

time series

forecasting

Using Fourier Transform as coefficients in short time series data helps with prediction.

Jul 19, 2018

Michael DeWitt

make your own api

apis

packages

Exploring the concept of developing internal APIs. An API could also be an R package that can be used by people in your organisation to more easily connect to common data sources. This is a good example of some internal tooling that can make data access easier.

Jul 12, 2018

IRT and the Rasch Model

IRT

Constructs

Survey Analysis

Item Response Theory (IRT) is a method by which item difficulty is assessed and used to measure latent factors. Classical test theory has a shortcoming: the test-taker’s ability and the difficulty of the item cannot be separated. Thus there is a question of generalisability outside of the instrument. Additionally, the models make some assumptions that may not be mathematically justified. In comes IRT, which handles some of these issues.

Jul 11, 2018
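
The separation of ability and item difficulty described above is what the Rasch model makes explicit: the probability of a correct response depends only on the difference between the two. The sketch below uses the standard logistic form of the model; the function name and the example values are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch (one-parameter) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Illustrative values: the same test-taker facing an easy and a hard item.
p_easy_item = rasch_probability(ability=0.5, difficulty=-1.0)
p_hard_item = rasch_probability(ability=0.5, difficulty=2.0)
p_matched = rasch_probability(ability=1.0, difficulty=1.0)
```

When ability equals difficulty the probability of a correct response is one half, and shifting both by the same amount leaves the probability unchanged.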

Exploring forecast

timeseries

forecasting

Let’s examine some of the functions inside the forecast package

Jul 7, 2018

Speed it up!

programming

This post explores how to see opportunities to make your code run faster.

Jul 6, 2018

Bayesian Time Series Analysis with bsts

forecasting

Bayes

Exploring the bsts package and what it provides for Bayesian structural time series modeling

Jul 5, 2018

ggrough

ggplot2

data visualisation

ggrough is a great package that can be used to make graphs that look hand-drawn. This can be a great aesthetic choice when giving presentations and making handouts.

Jul 5, 2018

gghighlight for the win

ggplot2

data visualisation

Exploring the power of the gghighlight package to automatically highlight charts

Jul 4, 2018

Let’s Try Some Visualisation

ggplot2

data visualisation

An example of the value-suppressing uncertainty scale. Great uses include forecast uncertainty.

Jul 4, 2018
