SBxSB
YAY Nate/Matt collaboration!
BOO mandatory apps!
I wish I could upvote this comment orders of magnitude more than once. Why on earth is there not a web option available?
It's easier to monetize and analyze user data from a native app than from the web. I don't understand why they still allow email, tbh. Not saying I like it, just that that's how you make money.
I am well aware that that is Substack's motive--but Nate and Matt don't have to play ball with that motive.
Will there be a replay available?
Seems bad for your business to be pushing people who want to subscribe to you into Substack's proprietary app. I'd love to listen; please put it into an RSS feed so I don't have to deal with their garbage.
I don't hate that it's on Substack, but maybe Substack could be encouraged to have it accessible on their website as well as their app?
They seem to be *really* pushing hard to get us to use the app - but without doing the basic usability upgrades, which is why I keep uninstalling it whenever I try again. They stopped sending me notifications of replies to comments, to try to get me to use the app, but the app still doesn't let me find replies to comments, or even properly open and close subthreads in comments, so I am staying with the awkward browser interface.
It is embarrassing how bad navigating the comment section is on mobile. Also, remember when Substack was billed as an alternative to social media? Now it's indistinguishable from Twitter when I open it, and I just got a notification that someone was "going live". I wish Matt would go make his own site.
Between the Substack (available on substack.com or the app), Politix (available on all podcast apps; paywall on politix.fm), Matt's Twitter (you need a Twitter account to use core features like seeing replies), and now this, Matt is increasingly unbundling himself.
+1 to disliking the app.
"Like everyone, I’m hoping that he’ll tell us definitively who’s going win the election."
There's that dry Yglesias wit we all love.
You’ve both expressed frustrations with the public health community and aspects of its response to the covid-19 pandemic. (Based on polling, you’re not alone in that; trust in the CDC and other agencies has fallen.)
As occasional critics, how do you think public health agencies and experts can restore public trust? And do you see any of that riding on the next election, given Trump and Harris’s starkly different approaches to public health and government oversight?
Since it's on the crappy, buggy Substack app whose video always crashes, I'll pass.
But if we can submit questions, I'd like Matt to ask Nate, "Why do you use three significant figures in your model outputs? E.g., Harris's chances of winning are 53.7%. The range of uncertainty is huge. Why not say, 'Harris's chances of winning are between 50 and 55%' (or even 50 and 60%)? Is it because that number will never move, and so you'd lose all the clicks of people who log in to see if the percentage went from 53.7 to 54.5?"
Back in maybe the mid-teens, FiveThirtyEight iterated through a series of presentation styles to try to get people to think of the model as much less precise than they wanted it to be ("three in seven" rather than 42.86%). Nothing seemed to work. At this point I assume he's just kind of given up and copy-pastes the number that the computer spits out.
"Nothing seemed to work."
In what way, I wonder. If he's an honest quantitative forecaster, he knows that the 3-figure significance is totally bogus. That he persists raises all kinds of questions about his motivations. I mean, look at the tiny shifts between late September and early October: https://www.natesilver.net/p/5545-is-a-really-close-race. Yes, it's a close race. So why publish those meaningless shifts daily?
It sounds like you've decided on the answer to your question already and you want someone to launder it for you.
Quite. Nate Silver won't answer the question from me, so I'm hoping someone with much more weight like Matt might get him to address his motivation here.
I think if you take a step back and don't start with the assumption Nate has some nefarious motivation, then your question just isn't that important. Even by your logic, there's no particular reason to stop at 0 decimals. Why not just say 50% instead of 54? It's just a convenient place to stop and accurately reflects the model's output. Doesn't seem like a big deal.
The question is how you come to trust an expert. You have to take a lot on faith. One way to do that is to look at things that are more transparent. Silver is obviously a good modeler and has expertise at it that I can't evaluate. But when I see the delight he takes in punching at Democrats and then I see him produce daily what can fairly be called clickbait because he *has* to know what excites his audience, then that can call into question his more opaque (to me) expertise.
Also, I just think it's unprofessional. He knows better but he does it anyway. I agree: there's so much uncertainty that he should only offer *one* significant figure and own up to it. Don't put that on his audience.
This is a super strange minute detail to focus on when the fact is simply that nate silver annoys you as a person.
Insert "why not both?" gif.
I think he has answered this - high error bars don't mean the number of significant digits is wrong.
What do significant digits mean, if not a measure of the error bars?
If I understand what Nate has written correctly in this instance the error bars are a measure of accuracy whereas the significant digits he's reporting are a measure of precision.
Take a single poll. We poll 100 people and 50 of them said they're voting for Harris. Unless the participants mumbled a lot or misunderstood the question, the precision of this poll is essentially infinite. There is no question what number we measured. The accuracy of the poll is uncertain because we aren't sure how representative of the total population our sample was, but I can put as many zeroes after 50% as I want.
Now Nate isn't actually measuring anything. His model simulates the election a bunch of times and says e.g. Trump wins 51% of the time. He can put as many significant digits after that as he wants, because it's just (# of times Trump won)/(# of runs). I'd guess he runs it enough times that those 3 digits settle down and additional runs only change things further to the right of the decimal point.
If I understand it correctly, the model doesn't have error bars, and it's wrong to interpret the distribution of probabilities that way. The inputs do, but those are used to generate the probability distribution.
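To make the "settle down" point concrete, here is a toy sketch. This is not Nate's actual model; the true win probability `p_true`, the run counts, and the seed are all made up for illustration. It just shows that the *precision* of (# of wins)/(# of runs) is a function of how many times you run the simulation, separate from whether `p_true` is accurate.

```python
import random

def simulate_elections(p_true: float, n_runs: int, seed: int = 0) -> float:
    """Toy stand-in for a forecast model: each run, the candidate
    wins with probability p_true; report the share of runs she won."""
    rng = random.Random(seed)
    wins = sum(rng.random() < p_true for _ in range(n_runs))
    return wins / n_runs

# With more runs, the reported share stabilizes past the third digit,
# even though that says nothing about whether p_true itself is right.
for n in (100, 10_000, 1_000_000):
    print(n, round(simulate_elections(0.537, n), 4))
```

The point of the sketch: with a million runs, the Monte Carlo noise in the third digit is tiny, so reporting 53.7% accurately describes the model's output, whatever you think of the model's inputs.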
To use a poker metaphor (which seems very Nate) I say the odds of drawing a full house are 0.1441% and you say it's 5%. We draw one hand and it is not a full house. Which of us is correct?
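For what it's worth, the 0.1441% figure for a full house can be checked directly by counting hands; a quick sketch (standard 5-card draw, no assumptions beyond a 52-card deck):

```python
from math import comb

# Full house: pick the rank of the trips and 3 of its 4 suits,
# then the rank of the pair and 2 of its 4 suits.
full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)  # 3744 hands
all_hands = comb(52, 5)                          # 2598960 hands
p = full_houses / all_hands
print(f"{p:.4%}")  # → 0.1441%
```

So the two claimed odds (0.1441% vs. 5%) differ by a factor of about 35, which is why many repeated hands would separate them quickly even though a single hand can't.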
But how do you validate a 53.7% probability on Oct. 14? It's not like the poker hand; there's no election today.
Let's say the model gives Harris 85% odds on Nov 4 and she wins the next day. What does that say about the 53.7% probability on Oct 14? Obviously it means things changed radically in the following 3 weeks, but we didn't need a model to tell us that.
As always the best model is GBACO.
I thought of a better explanation. Assume the model is a normal distribution. In order to generate a normal distribution I need 2 numbers: a mean and a standard deviation. Uncertainty in the input data will increase the standard deviation, but it doesn't affect the mean.
The model basically simulates the election a whole bunch of times and outputs the % of runs a given candidate won. High uncertainty data means you get more weird runs where Kamala wins by 20 points or something whereas low uncertainty data means most of the runs are clustered around the median outcome.
Your question is essentially "why doesn't the standard deviation affect the mean?" and the answer is because that's not how a probability distribution works.
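A minimal sketch of that idea, assuming normally distributed vote margins (the mean margin, sigma values, and run count below are illustrative, not anything from the actual model): hold the mean margin fixed, widen the input uncertainty, and watch the share of winning runs move toward a coin flip while the underlying mean stays put.

```python
import random

def win_prob(mean_margin: float, sigma: float,
             n_runs: int = 200_000, seed: int = 1) -> float:
    """Share of simulated elections where the candidate's margin is positive,
    drawing each run's margin from Normal(mean_margin, sigma)."""
    rng = random.Random(seed)
    return sum(rng.gauss(mean_margin, sigma) > 0 for _ in range(n_runs)) / n_runs

# Same 1-point mean lead; bigger sigma = more "weird" runs on both sides,
# which drags the win probability toward 50% without moving the mean.
for sigma in (2.0, 4.0, 8.0):
    print(sigma, round(win_prob(1.0, sigma), 3))
```

This is the sense in which the headline percentage is a summary of the whole distribution of runs, not a measurement with error bars attached to it.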
IMO, both your attempts at explaining this were great! I teach basic statistics and you’re nailing it.
Thank you. That's very kind of you.
I'm sorry but there's no normal distribution because it's all fake. You're not running the election 100 times on Oct. 14; you're running a model based on a bunch of inputs which have never been validated because there is no election on Oct. 14. I have no idea what the "distribution" looks like because there is no distribution. Maybe it's a negative binomial, I dunno.
By your logic, he could produce a current probability of Harris winning of 53.7183561% but of course that would look silly to everyone. So why is 53.7% less silly?
This by the way is very different from your poker hand example. If we differ in the odds of drawing one hand to a full house and say we get a full house, it doesn't mean either model is good. It's just one hand. If we play 1000 hands and we get one that draws to a full house, your estimate is far superior to mine. There's no world in which Harris wins 537 times out of 1000 elections except in the abstract imaginary world of his computer runs. Which again cannot be validated because there's no election today.
I think you are fundamentally misunderstanding probability. Just because the election isn’t actually being run multiple times in reality doesn’t mean there’s only one way it can turn out. It’s not deterministic, at least as far as our minds can yet conceive of. The model is just counting the different ways things can turn out. This is separate from the issue of how many digits to put after the decimal point—it’s called a point estimate and anyone who understands what variance is intuitively understands that the number of digits after the decimal may or may not be relevant.
Of course what you say is true. But I'm not sure how much it matters. Beyond the point of being unprofessional by offering clickbait to get people excited when these probabilities jiggle around a slow-moving mean like particles exhibiting Brownian motion,* it's the idea that the model output is telling us something significant about Nov. 5. I don't recall if Silver is doing his forecasting/nowcasting thing anymore, but it doesn't matter either way. What we know on Oct. 14 may or may not mean anything when the election actually happens. But I don't think the model can tell us anything more about Nov. 5 here on Oct. 14 than simpler tools -- like the current polling average or polls of the swing states -- can. It's pseudo-science: it gives the appearance of certainty without ever being put to the test (since we can never *prove* that Harris has a 53.7% chance of winning as of Oct. 14).
I would *so* much more respect an Yglesian forecasting Substack whose only output was to say each day "Gonna be a close one!" Even if by Nov. 4 it changes that to say, "Welp, look at those polls: looks like it's *not* gonna be a close one!"
*The fact that the polls are so close that the race is basically a tossup is the signal. The stupid perturbations of the 3-significant figure probabilities each day is the noise. We should promote the former and downgrade the latter. If Silver doesn't understand that, there's an excellent book on the topic I can recommend to him.
https://en.wikipedia.org/wiki/The_Signal_and_the_Noise
I will plan to hack into the discussion and yell at you both for being unwilling to ban fracking and all other fossil fuel production in North America. True believers know that Global Warming Existential Risk ™ can only be mitigated by pushing all oil and gas production to Venezuela, Saudi Arabia and Iran.
Why won't Biden simply declare a John Emergency?
This might not end up being the correct venue, and even if it is, there might not be enough time for the type of depth that I'm curious about. But even if this is better for a Nate/Matt discussion in the future, I want to at least put a pin on this in hopes that the two of them see it and consider it.
Anyway, I would be very eager to hear a point/counterpoint discussion on the topic of gambling between Nate and Matt, as for Nate it's obviously a big part of his life, whereas Matt has grown increasingly negative of it in recent times. My curiosity is what points of agreement and disagreement they'd have on gambling, and what type of equilibrium could result from their combined views.
Nate will say that he doesn’t gamble. Sure there is uncertainty in poker, and many people are gambling when they play poker, but people who take it seriously aren’t.
+1
Does Nate consider you a Riverian or a Villager?
Yglesias runs the water pump that supplies the village with water from the river.
Yglesias helped get rid of the environmental review and permitting requirements so the water pump could be built in the Village, though it had to be moved outside the Landmarked district.
This is a crime against graphic design.
When I saw this cover on the Slow Boring homepage feed, I genuinely thought Substack had bugged out when loading the image.
It’s worse than a crime, it’s a mistake.
Will replays be available later?
I do not have a smart phone (yes, I'm the one). Does that mean I'll be unable to view/hear the event live? If so, will there at least be a transcript available afterwards for "slow boring" folks like myself?
No product company is ever going to build anything to address your situation. Why would they? Not worth it.
Why not just let the "app" content be available on a lap/desktop?
Because that’s something you would like but the company doesn’t think it’s worth doing. They are probably focused on building other things they think are more valuable!
good reply - but it doesn't answer the questions I asked.
The answer is you should get an iPhone
Why would I? Not worth it. I've viewed dozens of live and archived events as well as transcripts of same without one. As to my questions, I'm hoping that perhaps Mr. Yglesias could either post a link to the event after the fact, if that is technically possible, or provide a transcript if it is not.
Cool story bro. That’s what Substack says about you!
A transcript is better than live event
Will the video be available for people who are unable to catch it live?
This feels like my two divorced dads are getting back together again!
With Slow Boring and Silver Bulletin being literally the only two Substacks I pay for, I think this is awesome. Some potential questions for Nate:
1) In retrospect, would he have left the convention bounce out of his model? It seems like it added a lot of unnecessary volatility to the forecast. (Note that, unlike a lot of unhinged folks on Twitter, I don't hold this against him, but I think it would still be worth hearing his perspective on.)
2) In Nate's opinion, is there a meaningful difference between a candidate being up 51-49 instead of, say, 47-45?
3) Seemingly everybody is assuming that the election will be very close, which makes me wonder if that means it won't actually be that close. On the one hand, both 2016 and 2020 were very close, but on the other hand, we're a normal polling error away from Harris winning the tipping-point state by 3-4 points or Trump winning it by 2-3 points. Does Nate think that "actually the election won't be that close" is a good contrarian prediction?
On (2) Pat Ruffini had a good recent blog post on his Substack about margin vs vote share.