Following Gavin’s investigation I decided to run a self-experiment. The goal is to understand caffeine’s effect on my cognition.

As he suggested, I’ll be using Cambridge Brain Sciences for outcome measures. It’s a quick series of 3 tests, totalling <10 minutes daily. The three tests assess Memory, Reasoning, and Verbal, and then aggregate into a mysterious “C-score”. I’ll use the C-score as the main outcome metric, and only dig into the sub-scores for suggestive insights. I’ll take the CBS test each day between 9 and 10 AM.

The study will be as follows:

  1. Begin with 3 weeks of normal caffeine usage.
  2. Go through a 3-week “withdrawal” period without caffeine.
  3. Spend 1 month at a “no-caffeine” baseline.
  4. Spend 1 month taking a 100mg caffeine pill daily.
  5. Spend 1 month taking 100mg caffeine + 100mg theanine daily.

Compared to Gavin’s protocol, I’m making a few changes, with the following goals:

  1. Include a “bake-in” time for CBS testing in case there are practice effects. As some quick data, I’ve done two days so far, and my second day scored 40% higher than my first. So either there’s a ton of variance or a meaningful practice effect, though I don’t expect day-to-day jumps to stay this large.
  2. Test before and during withdrawal, to see the magnitude of effect and the recovery time.
  3. Isolate caffeine vs theanine, since theanine is relatively inconvenient and I want to know if it has any additional benefit.

I’m concerned about the dosage. This post suggests that 100mg is approximately the contents of 10g coffee beans, and I regularly consume 2-3x that amount daily. So is 100mg a “realistic” dosage? And if we want to understand the dose-effect relationship, shouldn’t we measure a more extreme value? Gelman discusses the general question:

  1. For estimating a linear effect, you typically want to measure on the extremes.
  2. If there are possible nonlinearities, you might want to measure in the middle as well. He concludes:

    Under reasonable assumptions about nonlinearity, we find that, unless sample size is very large, the design with no interior measurements is best, because with moderate total sample sizes, any nonlinearity in the dose–response will be difficult to detect.

This suggests I should try to pick the highest dose in the “rising” part of the graph, maybe 200 or 300mg. I have a lot of time to think about this though, since it doesn’t kick in until phase (4).
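To make the first of Gelman's points concrete, here is a minimal sketch using the textbook formula for the standard error of an OLS slope, \(\sigma / \sqrt{\sum_i (x_i - \bar{x})^2}\). It assumes a truly linear dose-response and equal noise at every dose. The doses (0, 150, 300mg) and the 60 test days are hypothetical numbers chosen for illustration, not part of my plan, and this is not Gelman's own analysis.

```python
import numpy as np

def slope_se(doses, sigma=1.0):
    """Standard error of the OLS slope for a given set of dose assignments."""
    doses = np.asarray(doses, dtype=float)
    return sigma / np.sqrt(np.sum((doses - doses.mean()) ** 2))

n = 60  # hypothetical total number of test days

# Design A: half the days at each extreme (hypothetical doses in mg).
extremes = np.repeat([0.0, 300.0], n // 2)

# Design B: a third of the days at each of three evenly spaced doses.
with_middle = np.repeat([0.0, 150.0, 300.0], n // 3)

print("extremes only:", slope_se(extremes))
print("with midpoint:", slope_se(with_middle))
print("ratio:", slope_se(with_middle) / slope_se(extremes))
```

With the same number of days, the design that includes the midpoint gives a noisier slope estimate, which is the price paid for being able to look for curvature at all.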

Other caveats/concerns:

  1. If there is a “ritual” effect from drinking hot water, decaf, or herbal tea, should I avoid those?
  2. If there is a slow practice effect it may confound the results.
  3. Over the study period, many things in my life will be relatively unstable, so the effect, if small, may wash out, or I may get spurious results.

Analysis plan:

My primary analysis will be a linear regression: \(y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \beta_4x_4\). \(y\) is the C-score for a given day, and the \(x_i\) are indicators for periods 1, 2, 4, and 5 as described above. Since period 3 has no indicator, the baseline is the “no-caffeine” month, and each \(\beta_i\) is that period’s mean difference from it.

I also intend to run another regression with an additional day_since_start variable, which should tell me if there’s a practice effect.
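As a rough sketch of how these two regressions might look in statsmodels: the data below is made-up noise so the code runs end to end, and the column names (`period`, `c_score`), the 30-day reading of “1 month”, and the helper `make_fake_data` are all my own placeholders. The reference level is set to period 3, since that is the period with no indicator in the model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Assumed phase lengths in days; "1 month" is taken as 30 days.
PHASE_DAYS = {1: 21, 2: 21, 3: 30, 4: 30, 5: 30}

def make_fake_data(seed=0):
    """Build a placeholder data set with one made-up C-score per day."""
    rng = np.random.default_rng(seed)
    period = np.repeat(list(PHASE_DAYS), list(PHASE_DAYS.values()))
    days = np.arange(len(period))
    # Noise plus a small invented practice trend; these are not real results.
    c_score = 100 + 0.05 * days + rng.normal(0, 10, size=len(period))
    return pd.DataFrame(
        {"day_since_start": days, "period": period, "c_score": c_score}
    )

df = make_fake_data()

# Primary model: indicators for periods 1, 2, 4, 5 with period 3 as reference.
primary = smf.ols("c_score ~ C(period, Treatment(reference=3))", data=df).fit()

# Second model: the same indicators plus a linear practice term.
with_trend = smf.ols(
    "c_score ~ C(period, Treatment(reference=3)) + day_since_start", data=df
).fit()

print(primary.params)
print("practice slope per day:", with_trend.params["day_since_start"])
```

One thing this makes visible: because the periods run in sequence, `day_since_start` is strongly correlated with the period indicators, so the practice slope and the period effects will be hard to separate cleanly.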

To get a sense for the measurable effect size:

In [1]: import statsmodels.stats.power as ssp

In [2]: ssp.tt_ind_solve_power(effect_size=None, nobs1=30, alpha=.05, power=.8)
Out[2]: 0.7356198424871614

This output (0.74) is the smallest standardized effect I can expect to detect with 80% power at \(\alpha = 0.05\) when comparing two 30-day periods: about 3/4 of a standard deviation. This is a fairly large effect, and I don’t expect the true effect to be this large.
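The same function can be turned around to ask how many days per condition a smaller effect would need. This is a sketch under the same assumptions as the call above (a two-sample t-test with independent observations, which daily scores from one person may not be). The helper name `days_needed` and the effect sizes in the loop are just illustrative choices.

```python
import statsmodels.stats.power as ssp

def days_needed(effect_size, alpha=0.05, power=0.8):
    """Days per condition for a two-sample t-test to reach the given power."""
    return ssp.tt_ind_solve_power(
        effect_size=effect_size, nobs1=None, alpha=alpha, power=power
    )

# Illustrative standardized effect sizes (Cohen's d).
for d in (0.2, 0.5, 0.74):
    print(f"d = {d}: about {days_needed(d):.0f} days per condition")
```

Required sample size grows roughly with the inverse square of the effect size, so halving the detectable effect means roughly four times as many days.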

It would also be good to know the optimal cycle time, but I don’t expect to have nearly enough data to determine this. I’ll probably play around with some other exploratory analysis as well.