191 Comments

"No fun predictions" is a bad call! The reason to do fun ones is to keep yourself interested in the exercise and make it less painful to sit down and do. If you're worried about dragging down your calibration score, grade them separately.

author

That might be right


Do some more Apple gadget and pop culture ones on Twitter, I suspect your audience at least somewhat overlaps with your hobbies. Could always bucket them separately in the track record.


+1 for Apple predictions

founding

I would have thought the bigger worry was dragging *up* the calibration score!


One thing I realized reading this list is the challenge of calibration when the predictions are correlated. You got one major thing wrong, inflation, and it messed up a lot of questions. That means, more or less by necessity, that you’re either right or wrong about a lot of questions at once, so you’ll look over- or under-confident each year.

A reason to include “fun” questions is they tend to be uncorrelated and so help you track calibration better.
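To put rough numbers on the correlation point, here is a minimal sketch. It assumes n yes/no predictions that each come true with probability p and share a single pairwise correlation rho; the function name and the example values are illustrative assumptions, not figures from the post. It uses the standard variance formula for a sum of equally correlated Bernoulli variables.

```python
import math

def hits_sd(n, p, rho):
    """Standard deviation of the number of correct predictions out of n,
    when each is correct with probability p and every pair of predictions
    has correlation rho (hypothetical, simplified model)."""
    variance = n * p * (1 - p) * (1 + (n - 1) * rho)
    return math.sqrt(variance)

# Ten 80% predictions: independent vs. driven by one shared factor.
independent = hits_sd(10, 0.8, 0.0)
correlated = hits_sd(10, 0.8, 0.5)
print(f"independent: {independent:.2f} hits, correlated: {correlated:.2f} hits")
```

With these made-up inputs the spread of the yearly hit count more than doubles, which is why one bad call on a shared driver can make a whole year look miscalibrated.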


People are bad at correlations (see polling in the 2016 election for instance).


As one of Dr. Tetlock's Superforecasters®, I have to give you an A+ for your process. Here's a list of things you did that most people fail to do:

Made a lot of forecasts. (I'd recommend doing 100, but we all have time constraints)

Wrote your forecasts down.

The forecasts had clear resolution criteria.

You shared the list.

Attached a probability to each forecast. (This is huge.)

Revisited the results, especially the ones in which you were over-confident. (Essentially everyone is over-confident, but good forecasters are less so.)

Analyzed the results and learned from errors.

Wrote up a new list and did it again.

Of course, you can improve your scores by being part of a group of other smart people who you look to for reasons to change your mind, but as a sole forecaster, I'd say you've got a lot of promise, kid.


Cool that you're one of those people...not involved but I liked the book. How do you feel about the twin forecasts of 2022 COVID deaths less than 2020/2021 having exactly the same probability? Assuming the actual death totals aren't the same, isn't that sort of arithmetically broken?


Thanks, Hunter, though when someone refers to me as 'one of those people' they're usually referring to those of us that seem to have a tedium force field around us as we enter parties, which seems to force others to find someone more entertaining to talk to. :)

I'm not familiar with the question you raise, so I'm afraid I can't comment on that.


1. Matt Y will start podcasting again in 2022 (and not just as a guest). 95%


Podcast will have a time machine 60%


Regarding running, don't get discouraged! You don't compare yourself to the other runners you see (there will always be someone faster anyway). You compare yourself to all the people sitting on their couches at home who you don't see.

I used to be pretty overweight when I started running regularly in my mid-to-late 20s, but it put me in a better shape than I ever was when I was younger. I will never judge someone for getting out and moving at whatever pace works for them.


I lost 55 lbs cycling; it’s easy on the joints.


I can't speak for everyone's experience of course, but I've found that the most important thing for me is to listen to my body. The couple of times I pushed through the pain, I ended up taking a month or two off to heal. But finding whatever works for you is what's important!


Swimming's even easier on the joints if you can find a convenient body of water.


Swimming is great, but I don’t think it is sustainable, even for people in good shape, if you don’t know how to do it (i.e., keeping your face in the water, breathing to the side, keeping the top of your head pointed in the direction you want to go). You actually have to spend a little bit of time learning how to do it, or I think you’ll be so frustrated you won’t keep at it.


It’s actually really hard to do so


Swimming through the air in zero G is even easier on the joints than water, but also even less convenient to find.


Agree! Next thing with running is to sign up for some races. 5Ks (3.1 miles) are a great first experience. There will be runners of all different abilities, so there’ll always be someone faster and slower than you. You should check out the Crystal City 5Ks, usually in April. Spend the winter building up endurance and then do a few races; next thing you know you’ll be angling for the Cherry Blossom or Marine Corps Marathon :).


If Stephen Breyer doesn’t retire in 2022 then he might as well join the Federalist Society. He will certainly be no “liberal” if he’s so willing to place his ego over protecting his legacy.


Yeah, that one really jumped out at me. I've taken it absolutely for granted that he would retire in time to get a replacement confirmed by the Senate before the midterms.

Dec 30, 2021·edited Dec 30, 2021
Comment deleted

Every day he doesn’t retire is another day Joe Manchin’s houseboat could crash or something, and boom, Mitch McConnell decides that we are just too close to the 2024 election to confirm another SC justice.


Or an 85 year old Feinstein?

Comment removed

Good point.

Would the Dems lose control of the Senate for the period while the seat is unfilled?


Incidentally, accounting for your own predictions is an awesome exercise, and every pundit should do it! Thank you, Matt.

Comment removed

And accusations of "sexism" and "ageism" were mild. I recall either Slate or Salon ran a piece saying it was outright "misogynistic" to suggest she should retire!


Go Blue Team!

*SNAP* *SNAP* *SNAP* *SNAP*


Dark times.


1. I think your accounting of your past predictions is extraordinarily useful. Most professional predictors never perform this basic exercise, and I'd say your record is considerably better than most.

2. I took up running from nada quite a while ago; unexpectedly I stuck with it and found that my stamina/endurance/speed continued to improve over time. Good luck!

Dec 30, 2021·edited Dec 30, 2021

If you follow Michael Pettis you can get a sense of why China is so recalcitrant: they simply don't have the money. They see a very big crisis coming that they don't have a good solution for, and importing American goods would only exacerbate that crisis.

2022 is going to be a very important year for the future of China. They've got to make a decision about how they're going to address their housing/GDP/household income crisis and "build more highways, but this time in Africa" isn't a viable solution, as even Xi is now publicly admitting.

China is going to become more vulnerable to tariffs and international pressure, as the last thing they can afford is any slowdown in exports. It may not seem like it right now, but China is in a very vulnerable position.

Dec 30, 2021·edited Dec 30, 2021
Comment deleted

“As long as Chinese productivity continues to increase”

Define “productivity”. TFP is stagnant and has been for a good, long while. Labor inputs are falling as of this year, with an ever-steeper demographic bill sloping off towards the future. Capital inputs are continuing to increase (albeit more slowly), but the government has so many weights scattered around the scales that capital allocation is deeply distorted, even more so than a decade of QE in the US has produced.

There are some big-ass crises in the offing for the next decade or so in China.


I'd be interested in seeing information that could prove or disprove my theory that China recognized that US demand went up so much that the tariffs didn't actually impact their sales.


I remember back in August the comments section predicted that the fall of Afghanistan wouldn’t matter in 3 months. Now Google search results show that is the case: from 100m searches for Afghan in August to like 3 million now.

Dec 30, 2021·edited Dec 30, 2021

People might not be actively searching for Afghanistan anymore, but that does nothing to undercut the theory that the stain of incompetence on Biden, and his slide, started there.


By definition, withdrawing from Afghanistan was going to precipitate a run for the exits on the part of the various “allies” who were no longer being paid off.

Trying to leave in a politically palatable fashion would be as impossible as outrunning your shadow on a sunny day. It was always going to end with everyone who could negotiating their defection and American forces occupying a postage stamp at the mercy of the Taliban to cover their exit.

Biden may be incompetent, but Afghanistan ain’t evidence.

Dec 31, 2021·edited Dec 31, 2021

"This vast pile of evidence, Coll and Entous write, amounts to “a dispiriting record of misjudgment, hubris, and delusion.” They could have added to that list of adjectives: self-delusion, incompetence, and sheer mendacity. Not that getting out of Afghanistan was a bad idea. But the way our leaders got out should shock even a jaded observer of shady politics.

Presidents Donald Trump and Joe Biden, as well as some of their top aides, come off rather badly in the article....."

https://slate.com/news-and-politics/2021/12/afghanistan-biden-trump-taliban-zalmay-khalilzad.html

I think having to rush troops in to desperately keep Kabul Airport open but not having the foresight to keep the other airbase open is a sign of poor preparation.


“Presidents Donald Trump and Joe Biden, as well as some of their top aides, come off rather badly in the article.”

Lol. You’re *half* right, which is a sight better than usual.

Biden inherited a situation in which the US had 2,500 troops in-country and couldn’t give any sign that it was abandoning ship lest the local power structures shift even faster than they did.

Any attempt to prepare for a withdrawal provokes a general collapse in which you get caught.

Pull troops back to Kabul? Some of them get overrun when locals defect as pre-arranged.

Drag US citizens out the door? They start getting kidnapped for leverage by the Afghan Army units they’re supporting when those units try to defect.

Occupy airfields to facilitate a withdrawal? Mortar fire, yay!

Announce a schedule for leaving Bagram instead of pulling out unannounced in a single night? Suicide bombers and very probably harassing fire from local turncoats on the morning you’re due to leave!

Short of ending Afghan sovereignty and redeploying 200,000 troops to occupy the whole country, there was nothing to do but leave with our tails between our legs. Biden at least had the spine to admit defeat even though it was politically disastrous.


If you ask people what they are pissed about, they say inflation. If you ask hard-core progressives, they are mad BBB is dead. No one is saying we should go back in.


It doesn't help that the media doesn't really cover it anymore. But yes, in terms of what interests the median American, Afghanistan was never a high priority and that's only decreased with the withdrawal of US forces.


“The same party wins both Senate races in Georgia (95%) YES

This one was obvious.”

Was it? The actual results make 95% seem overconfident to me. Some very rough rule of thumb reasoning here: the two races ended up having a 0.8% difference in margin, and I expect the vast majority of possible outcomes for the race end up in a band that’s smaller than (20 * 0.8 = 16%), which leads me to think the chance of landing in that 0.8% range where the races split is higher than 5% (since the split outcomes are closer to the centre of that band).
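The rule of thumb above can be checked with a small sketch. Everything beyond the 0.8-point gap and the 16-point band is an assumption made for illustration: the two races are treated as moving together with a fixed 0.8-point gap, the shared margin is modelled either as uniform over the 16-point band or as a normal distribution centred on zero whose ±2 standard deviations span that band, and the function names are hypothetical.

```python
import math

GAP = 0.8          # observed difference in margin between the two races (points)
BAND = 20 * GAP    # 16-point band the comment treats as the plausible range

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def split_prob_uniform(gap=GAP, band=BAND):
    """Chance the shared margin lands in the gap-wide window where the
    races split, if every margin in the band is equally likely."""
    return gap / band

def split_prob_normal(gap=GAP, sd=BAND / 4):
    """Same window, but with the margin normally distributed around zero
    (assumption: the band covers roughly +/- 2 standard deviations)."""
    return normal_cdf(0.0, 0.0, sd) - normal_cdf(-gap, 0.0, sd)

print(f"uniform band: {split_prob_uniform():.3f}")
print(f"normal band:  {split_prob_normal():.3f}")
```

Under the uniform assumption the split chance is exactly the 5% implied by a 95% forecast; concentrating outcomes near the centre, as the comment argues, pushes it higher.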

Dec 30, 2021·edited Dec 30, 2021

I had the same reaction reading the Georgia prediction description. One thing to do is look at the base rate. A little bit of an unusual situation to have the election at the same time, but swing states do split and even during the race one of the candidates was less popular. The unemployment ones are also difficult, looking at the historical distributions would similarly help. In general when I make a forecast I like to think about if I would take a bet with those odds which helps check overconfidence.

I’d recommend, Matt, that you get your questions on Metaculus, see other people’s predictions, and then update your predictions. I think that you’ve read Tetlock, but two of the findings in Superforecasting are that (a) people do better individually when they work in groups and (b) people do better when they update regularly.


You’ve got through the crappy part of running! And there’s plenty of benefits to not taking it too seriously… namely not getting hurt

Happy New Year sir, a hurt runner

Dec 30, 2021·edited Dec 30, 2021

As a retired intel analyst, this is near and dear to my heart. Forecasting and prediction is, indeed, a very humbling activity made all the more difficult because of human cognition, our reliance on pattern matching and confirmation bias, and our tendency to assume we know more than we actually do. And one very important thing to consider is that accurate predictions (as opposed to lucky guesses) are very often not possible because so many things are emergent or driven by emergent factors.

In the intelligence world the most obvious example of this is estimating what other countries will do in the future. This is nigh impossible in the many cases when the countries in question haven't actually decided what they intend to do.

Anyway, before I descend into the weeds and start quoting Sherman Kent and Cynthia Grabo, I'll just say I appreciate this exercise. If nothing else, it's a useful tool to recalibrate one's ability for introspection.

I do have a couple of recommendations, though, if you want to up the rigor of your predictions:

First, I'd ditch the percentages, which imply a false precision. I would instead adopt estimative language ("words of estimative probability") similar to what the intelligence community or the IPCC uses. Or just adopt theirs.

Secondly, in addition to making predictions regarding the likelihood of something occurring, consider also providing the confidence level for your prediction. This is also standard practice in the IC, IPCC and others.

Third, and this is the time-consuming part, is to show your work and explain your reasoning, particularly when it comes to the confidence level. This is a good method for checking the validity of your assumptions and it will also let you know when you made a correct prediction, but for the wrong reasons.

So for example, in your first prediction, you might change it to something like this:

"1. Democrats lose both houses of Congress (virtually certain, high confidence). Historical patterns, current trends and seat vulnerability analysis all strongly support Democrats losing both houses of Congress."

And if you want to up the game even further, you can assess what indicators or factors would need to change for your prediction to be wrong. So in the case of Congress one might add:

"Democrats retaining control of one or both houses of Congress would likely require a fundamental shift in current political and economic trends as well as circumstances (such as an emergent crisis) that fundamentally shifts the political landscape prior to the election (low confidence)."

Now, all of that is a lot of work, but it can be a useful exercise to do on a few predictions. And, down the line, this is very helpful to determine why a prediction was right or wrong - and that, IMO, is more important than scoring for these types of predictions.


Please continue to use probabilities, Matt. Ordinary language is imprecise to the point of being impossible to score, and seeing how you do is one of the things that Tetlock finds helps forecasters improve.


I'm talking about defined terms, not vague ordinary language. For example, here is how the IPCC explains its probability terminology:

"The following terms have been used to indicate the assessed likelihood of an outcome or result:

- virtually certain 99–100% probability

- very likely 90–100%

- likely 66–100%

- about as likely as not 33–66%

- unlikely 0–33%

- very unlikely 0–10%

- exceptionally unlikely 0–1%.

Additional terms (extremely likely 95–100%; more likely than not >50–100%; and extremely unlikely 0–5%) are also used when appropriate."
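Because the quoted ranges overlap, a single probability can carry several of these labels at once. The sketch below encodes the quoted table as a lookup; the data structure and the `terms_for` helper are illustrative assumptions, not anything the IPCC publishes as code.

```python
# (term, low, high, low_is_exclusive), taken from the quoted IPCC list.
IPCC_TERMS = [
    ("virtually certain", 0.99, 1.00, False),
    ("extremely likely", 0.95, 1.00, False),
    ("very likely", 0.90, 1.00, False),
    ("likely", 0.66, 1.00, False),
    ("more likely than not", 0.50, 1.00, True),   # ">50-100%"
    ("about as likely as not", 0.33, 0.66, False),
    ("unlikely", 0.00, 0.33, False),
    ("very unlikely", 0.00, 0.10, False),
    ("extremely unlikely", 0.00, 0.05, False),
    ("exceptionally unlikely", 0.00, 0.01, False),
]

def terms_for(p):
    """Return every quoted term whose range contains probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    matches = []
    for term, low, high, low_exclusive in IPCC_TERMS:
        above_low = p > low if low_exclusive else p >= low
        if above_low and p <= high:
            matches.append(term)
    return matches

print(terms_for(0.95))
print(terms_for(0.50))
```

A 95% forecast matches four of the terms, while 50% matches only "about as likely as not", so going from a number to a term loses information but going from a term back to a number is ambiguous.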


"about as likely as not 33–66%" This is literally false: something that occurs with 66% probability is TWICE as likely as something that occurs with 33%. For the IPCC and climate change what really matters are end tail distributions where disaster strikes and this language is insufficiently precise to describe them. Lots of things have very low probability, but you still want much more precise predictions. For example, for a young person the probability of death from COVID was already <1 in 1000 prior to vaccination, so you can't communicate the benefits of vaccination by stating that the probability of death is <1%. We care about low probability events when the outcome is extreme. And even for less extreme more common things, 66% is different from 50% and really different from 33%.


Of course one wants more precise predictions. Ideally we want complete certainty. The issue is that many problems have ambiguous evidence, information gaps and other issues that result in a very high degree of uncertainty.

"We care about low probability events when the outcome is extreme."

That is a different issue that is very much separate from forecasting and forecasting accuracy.

Feb 9, 2022·edited Feb 9, 2022

Yeah, "twice as likely" is "about as likely". Idk what you're going on about.


There was a whole book with a lot of supporting data about how using the words is bad and using numbers is way better and how groups that think they are good at predicting (like the ones you mention) are not that good.

Superforecasting by Tetlock


Yes, I'm aware of that. Tetlock's methods rely on using precise numbers because they use crowdsourcing and big-data performance analysis of a large group of forecasters with a lot of historical forecasting data.


In Superforecasting, Tetlock says that using precise probabilities makes superforecasters more accurate:

"Barbara Mellers has shown that granularity predicts accuracy: the average forecaster who sticks with the tens – 20%, 30%, 40% – is less accurate than the finer-grained forecaster who uses fives – 20%, 25%, 30% – and still less accurate than the even finer-grained forecaster who uses ones – 20%, 21%, 22%. As a further test, she rounded forecasts to make them less granular, so a forecast at the greatest granularity possible in the tournament, single percentage points, would be rounded to the nearest five, and then the nearest ten. This way, all of the forecasts were made one level less granular. She then recalculated Bier scores and discovered that superforecasters lost accuracy in response to even the smallest-scale rounding, to the nearest 0.05, whereas regular forecasters lost little even from rounding four times as large, to the nearest 0.2."


That's because Tetlock's method relies on many forecasters doing lots of the same or similar forecasts for a long time. It's designed to evaluate a forecaster's overall performance over many predictions compared to other forecasters for a given question or set of questions. IOW, Brier scores mean nothing in isolation and can only show forecasting skill after a large number of forecasts are judged and compared to other forecasters. That large dataset is not optional to the method and it's what allows the sort of fidelity that Tetlock is talking about.

Let's assume, for instance, that one weather forecaster predicts a 60% chance of rain tomorrow and another forecaster predicts a 75% chance of rain. Now let's assume it actually does rain. We can calculate the Brier score for each, but that doesn't tell us who is the better forecaster because it's only one comparison and one data point. To see who is the better forecaster, we need a lot of rain predictions over time - the more the better.
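For concreteness, the rain example works out as follows. This is a minimal sketch using the single-outcome form of the score (mean squared difference between forecast and outcome, with 1 for rain and 0 for no rain); the helper name `brier_score` is just a label for this example.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and
    outcomes coded as 1 (happened) or 0 (did not happen)."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty lists")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# One day, and it rained (outcome = 1).
score_60 = brier_score([0.60], [1])
score_75 = brier_score([0.75], [1])
print(f"60% forecaster: {score_60:.4f}, 75% forecaster: {score_75:.4f}")

# Had it stayed dry (outcome = 0), the ranking would flip.
dry_60 = brier_score([0.60], [0])
dry_75 = brier_score([0.75], [0])
print(f"if dry -> 60%: {dry_60:.4f}, 75%: {dry_75:.4f}")
```

The 75% forecaster scores better on the rainy day and worse on a dry one, so a single outcome cannot separate skill from luck; the function accepts whole lists precisely because the score is only informative over many forecasts.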

That is how granularity can be achieved: by tracking lots of forecasts by lots of forecasters over time.

Secondly, Tetlock's method can't determine the accuracy of any individual prediction. All it can do is identify people who are better forecasters generally, or for specific topics and subjects. Because even the best forecaster is sometimes or even often wrong. And some subjects and topics have more uncertainty than others, which makes prediction harder and less accurate for everyone, even the best forecasters.

So Matt isn't doing any of that. He's not (as far as I'm aware) participating in the Good Judgment Project. If he were, then using granular percentages in predictions would be appropriate to start building his overall score so that it can be compared to others to see how good a forecaster he actually is. By contrast, using granular percentages in isolation on single predictions is false precision.


Readers may falsely interpret it as precision, but it is not actually false precision; it is assigning confidence to a prediction using numbers.


Suppose I say that the probability that country A will attack country B is 90%, another analyst says the probability is 88%, and a third analyst says 91%. Where is the meaningful distinction between these?


Love this—thanks for writing it. One of the countless things I don’t understand about the world right now is the certainty so many people seem to feel about so many things. People with a platform to express such certainty publicly rarely seem daunted by (or possibly even aware of) the actual track record of their proclamations.

I do, however, wish there were more fun ones!


I think that's simply because most people find measured, low-confidence predictions boring. You don't get a big platform by being boring to most people, so the only people with big platforms are people who are overconfident about how right they are, or at least play that part in their public posting.


That, and also people use predictions more to show their feelings about something than as an actual prediction. Anti-Biden people in 2020 loved to predict a Trump landslide, but they weren’t really predicting, they were saying “I hate Biden.”

author

Yeah, I think this is one of the biggest issues — a lot of emotive predicting that isn't even intended to be correct.


I'm sure that's right, but what I specifically don't understand is how people can maintain this posture over the longer term. I would be mortified if there were a public record of my innumerable dumbass erroneous predictions.


I agree. Would be fun to vote on what we'd like to see Matt predict.


I think adding the going rate from Metaculus, PredictIt, etc. as a point of comparison would be massively useful in evaluating your success next year. If the market says 99% but you say 60%, it is in some ways very impressive for you should the event not happen.


I recommend lifting, which is less hard to get into than it is commonly portrayed, and is also not just for dudes who want to get huge and impress other dudes. If you frequent places where people talk about lifting, they will talk almost exclusively about programs involving barbells, and there are a couple of popular beginner barbell programs, but those obviously require access to that equipment. Nerd Fitness has a beginner bodyweight program (no equipment) and a beginner kettlebell program and they are both fine. r/Fitness has a much more intensive beginner bodyweight program in their wiki as well.


This. I’d love for MattY to pull a ‘Do You Even Lift Bro?’ while totally destroying Ben Shapiro in a discussion.

But seriously, running has limits for many body types. Getting into a proper lifting routine (i.e., not CrossFit) has done much more for me at 40 than all the running I did at 30.


Dang these new predictions are much more depressing ones. Hopefully your poor track record from last year repeats itself and we have a better than expected 2022!


It was a little jarring seeing “here’s hoping I do better” and thinking, well, no!
