12 Comments
badspeler

It seems to me like your analysis is assuming a low correlation between blog readers. If readership primarily drew from random internet traffic, that might make sense, but if it draws from social communities then it doesn't. We already have many social communities of high-IQ people, such as universities and tech campuses. If SSC readership comes from word-of-mouth recommendations from within those kinds of communities, or other communities that self-select for high IQ, then that's all you need.

People I know IRL who read the blog all have IQs around 137 or whatever, so it doesn't seem implausible to me.

Vilgot Huhn

I think that's a good point. Still, I'm not sure that thinking of the interest function as a combination of probability of finding the blog interesting and probability of being recommended the blog by a friend is enough to shift the function all the way that's required to get a mean that high. If we let the first interest function represent probability of interest and then have an identical interest function that represents "probability of being recommended", we can multiply these together to get a combined probability. This shifts the interest function a bit to the left, but not dramatically. We only shift the peak of the resulting distribution a bit less than 4 points on the IQ scale.
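
As a rough sketch of that kind of check (not the Shiny app's actual code): the logistic form, the midpoint of 130 and the slope of 0.2 below are made-up assumptions, and the general population is taken as normal with mean 100 and SD 15. The snippet compares where the reader distribution peaks with a single interest function versus the product of two identical ones. How far the peak moves depends on the parameters chosen.

```python
from statistics import NormalDist
import math

PRIOR = NormalDist(mu=100, sigma=15)  # assumed general-population IQ distribution

def interest(iq, midpoint=130.0, slope=0.2):
    """Hypothetical logistic interest function; parameters are illustrative only."""
    return 1.0 / (1.0 + math.exp(-slope * (iq - midpoint)))

def peak(weight, lo=40, hi=200, step=0.01):
    """IQ at which prior density times weight(iq) is largest (simple grid search)."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda x: PRIOR.pdf(x) * weight(x))

# One sigmoid: probability of finding the blog interesting.
peak_single = peak(interest)
# Two identical sigmoids multiplied: interested AND recommended by a friend.
peak_double = peak(lambda x: interest(x) ** 2)

print(f"peak, interest only:          {peak_single:.1f}")
print(f"peak, interest x recommended: {peak_double:.1f}")
print(f"shift in peak:                {peak_double - peak_single:.1f} points")
```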

If you add to this the fact that human social networks are small-world networks (six degrees of separation between me and the pope, and all that), this additional assumption doesn't shift my prior enough to get to where we need to be.

Glen Raphael

You're missing a confounder - people who *know their IQ* are likely to have high IQs. More specifically, in the US they're likely to have one that's at least 130.

When I was growing up, grade schools in the US had special programs for extra-smart kids - the one in California was called "Mentally Gifted Minors". The cutoff for being assigned to this program was an IQ of 130 (aka ">98% on standardized tests"). So in my experience (and also that of many other SSC readers) if your IQ is >130 you were asked (around 4th grade IIRC) to attend special classes because you have a high IQ. People who in their youth were told their IQ is notably high have a big incentive to later go get tested to see how high it is; people who have NOT been told that have much LESS incentive to do so. So survey-takers whose IQ is under 130 are likely to not know the number so they skip the question; survey-takers whose IQ is over 130 are much more likely to know the number (and if they don't know it, they can safely *guess* it's at least that high).

Vilgot Huhn

That's an interesting confounder. I didn't think of that. I'm working as a psychologist, so most valid IQ tests I come across are used in assessments within psychiatry. In my country they're most often used in schools via youth psychiatry when there's a problem (we don't really have gifted-kids programs), so for kids that's a pretty different selection criterion.

Malmesbury

I agree with you, but here's a possible confounder: for people to read Scott they must not only find it interesting, but also learn about it in the first place. A lot of ACX readers are coming from the LessWrong/Rat-sphere and this community probably has a much steeper interest function, since it can get really technical there. That might drive the IQ scores of Scott's readers up.

An analogy would be the Manhattan project's after-work happy hour (let's pretend it was a thing). Many people in the general population would be interested in having a beer after work, but I'm pretty confident the people who went to that particular happy hour had a really high IQ. (This is not a claim that LWers are as smart as the Manhattan project people.)

Vilgot Huhn

I can see what you mean, and similarly you could argue that the people that take the poll are perhaps more die-hard fans and that those fans are different from the general reader. However I think that SSC in 2020 was a pretty big blog with a pretty far reach. (I later checked and got similar results for the 2022 survey).

There's no answer key here, of course, and I get that people have different intuitions (which is why I made the Shiny app). But I do feel that the final interest function is truly extreme no matter how you look at it. Copying from a comment I wrote on reddit:

"To argue the case against the final interest function in the context of slatestarcodex readers, if you look at "probability of interest" for 130 IQ you have 0.002%, for 145 you have 3%, with the function flattening out/starting to get saturated at quite absurd levels.

Another example from that final interest function is if we look at relative rates of probability of interest (to get around the problem of the function saturating at levels that are unmeasurable):

-A person with 145 IQ is then supposed to be 1767 times as likely to find the blog interesting compared to someone with average IQ.

-Someone with 160 IQ should be 1413 times as likely to find the blog interesting as someone with 115 IQ."

Pearson

In my N=2 study of people I know who read SSC, both are more than 2 standard deviations above the mean on the SAT.

I did the math specifically on SATs using the dataset here in the replies.

https://x.com/zachariahschwab/status/1726823692779520168?s=46&t=2CJiNcQMwZ4NyiGuszE80g

If you look at the graph here: https://x.com/zachariahschwab/status/1727119616139485661?s=46&t=2CJiNcQMwZ4NyiGuszE80g

The distribution doesn’t seem to follow some kind of sigmoid function but rather a simple exponential.

I have my doubts that people are being accurate but the data definitely updated me towards thinking that SSC is extremely heavily selected or at least the people embroiled enough to take the survey are.

Vilgot Huhn

Interesting. Though I think the same processes that would make people over-report their IQ would probably make them over-report their SAT-score.

If you take the ratio between a normal distribution and a normal distribution shifted to the left you get an exponential and I think that's consistent with people just "on average overreporting".
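
A quick numerical check of that claim, as a sketch: the 5-point shift is a hypothetical amount of average over-reporting, and both distributions are assumed to share an SD of 15. For equal SDs the log of the density ratio is a straight line in the score, with slope shift / sigma^2, which is what makes the ratio itself exponential.

```python
from statistics import NormalDist
import math

SIGMA = 15.0
SHIFT = 5.0  # hypothetical average over-report, in IQ points

true_scores = NormalDist(mu=100.0, sigma=SIGMA)       # the left-shifted curve
reported = NormalDist(mu=100.0 + SHIFT, sigma=SIGMA)  # same shape, moved right

def log_ratio(x):
    """Log of the reported density over the true density at score x."""
    return math.log(reported.pdf(x)) - math.log(true_scores.pdf(x))

# If the ratio is exponential, log_ratio gains the same amount per IQ point everywhere.
expected_slope = SHIFT / SIGMA**2
points = list(range(70, 160, 10))
slopes = [log_ratio(x + 1) - log_ratio(x) for x in points]

for x, s in zip(points, slopes):
    print(f"IQ {x}: slope {s:.5f} (analytic {expected_slope:.5f})")
```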

Pearson

With the IQ tests, it seemed like there were statistical artifacts in the reporting such as heaping that weren’t present in the SAT data. The SAT mean was also shifted by fewer standard deviations than the IQ mean, though the sample still looks quite selected. I think it’s probable that mean IQ is closer to something like 120 or 125 than the 137 reported.

I didn’t know a shift in the mean would lead to exponential differences in relative likelihood, though; that’s a good point.

Ben

I think you're bringing too much "general population" statistics into this. If you start with the population of people who read *any* blog, I suspect you've already dropped most of the people below 100. I mean, I agree that it's probably not 137 -- as you note, people self-report high, and the number of 160+ is also likely inflated.

But I do think your "base rates" aren't really the base rates (a bit of a philosophical question). Whether that's due to reading a blog, being in the Bay Area or Rationalist, reading long articles, or being open to heterodoxy, I don't know.

Performative Bafflement

I personally think from the top down. Scott's blog is probably either THE best or at least one of the top 2-3 blogs in the entire world in terms of content quality, complexity, and volume for a certain type of mind. Nerd bait, in other words. If we assume that there are lonely half-starved high-IQ minds out there in the world, shambolic, shuffling from place to place in the howling endless wastelands of the internet, starving for intellectual stimulation, and that they then run across a high-complexity, high-interest, high-information blog that produces in *volume,* what are the odds of them preferentially aggregating and engaging there? Pretty high. If we assume that high-IQ people have a strong intrinsic selection drive for engaging with high-complexity, high-information blogs produced in volume, they'll preferentially find the blog, perhaps even along an IQ gradient. Scott's blog is top .00001%, and the preferentially clustered high-IQ minds merely top 2% when averaged with all the other readers. Seems reasonable.

But let's do the top-down math. Now I'd argue if you're on the internet to the extent that you're subscribing, commenting, or responding to surveys on nerdy Substacks or forums AT ALL, your base is probably at least 115 IQ to begin with, and there's probably selection effects along an IQ gradient such that the higher your IQ, the more likely you are to be engaged. My argument for this is that the higher your IQ, the less interesting, relevant, and engaging 99% of content is, so ACTUALLY interesting and engaging content produced at volume is more precious than gold, and you are irresistibly drawn to it like moths to IQ-illuminating flame, and the higher the IQ, the more you're drawn.

Last I remember, Scott had something like 200k subscribers, and the population of interest is the highly-engaged commenting / survey-answering / subscribing pop. Let's assume that's roughly 5x subscribers, so we need 1M highly engaged readers to populate. Let's assume they're all from the US, although this selection-effect argument would actually benefit from Scott's numerous high-IQ international readers / commenters / subscribers. There's around 6.6M IQ 130-145 folk in the USA, and 46.6M 115-130. Discounting for under 18's (22%), we're at 5.1M 130-145's and 36.3M 115-130 folk. This means there's at least 80k adults out on the far tail of 145+.
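
For anyone who wants to redo the head-count arithmetic, here is a minimal sketch. It assumes IQ is normal with mean 100 and SD 15, a US population of about 331M (an assumed round figure, not taken from the comment), the 22% under-18 discount used above, and that adults follow the same IQ distribution as everyone else. The results land in the same ballpark as the numbers quoted but will not match them exactly, since they depend on the population total used.

```python
from statistics import NormalDist

IQ = NormalDist(mu=100, sigma=15)
US_POPULATION = 331_000_000  # assumption: rough US total, not from the comment
ADULT_SHARE = 0.78           # from the comment: discount 22% for under-18s

def adults_in_band(lo, hi=None):
    """Expected number of US adults with IQ in [lo, hi) under the normal model."""
    upper = 1.0 if hi is None else IQ.cdf(hi)
    share = upper - IQ.cdf(lo)
    return share * US_POPULATION * ADULT_SHARE

bands = {
    "115-130": adults_in_band(115, 130),
    "130-145": adults_in_band(130, 145),
    "145+": adults_in_band(145),
}

for label, count in bands.items():
    print(f"{label:>8}: {count / 1e6:.2f}M adults")
```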

What distribution drawing from this reads-so-often-they're-answering-surveys population can plausibly average 137?

Let's assume the drive for engaging with high-quality-information-volume blogs gets higher with IQ, as posited above:

We'll get 90% of our far tail (72k), who average 153 IQ

We'll get 75% (3.1M) of our average 137 tail, who average 137.

And we've already more than allocated the relevant highly-engaged 1M population we were looking for, and can do it with numbers as low as 20% of the 5.1M 130-145 pop.

Seems suspicious: there's got to be enough lower-than-137 folk to round out Scott's large number of readers / subscribers. Substack claims 5-10% of free readers go to paid. There's also a cap on readers, because Substack also says they get roughly 25M readers a month over all of Substack. Going by that, Scott probably has something like 2-3M readers total with 200k paid whosits.

Knowing there's 5.1M average 137 IQ folk just in the US, could Scott capture half of them? Honestly, it still seems plausible to me given selection effects and the intrinsic drive for highly-interesting, complex content produced in volume within high IQ people themselves. The real question then is, how is Scott keeping lower IQ folk OUT, which is the bane and fundamental damnation of essentially all other internet sites everywhere?

Here I think the inherent information content of the blog itself, epistemic and cultural norms in the comment sections, founding effects from LW and the rat-sphere, and (formerly) robust moderation and banning of lower quality commenters came into play. Essentially, this is a selection-effects argument, with the relevant sub-population strongly self-selecting into the pop being measured.

If we assume highly engaged commenters / subscribers / survey answerers on nerdy substacks are 115+ to begin with, and then take into account Scott's blog literally being one of the best in the world for this pop, and posit an IQ-gradient selection effect for engagement, it seems at least back-of-the-envelope plausible to me for the 137 to be directionally correct.

Comment deleted (Dec 3, 2023, edited)
Vilgot Huhn

Thank you for your comment!

>Looking at your interest function: Do you really think 100% of people with IQs of 130 who stumble on SSC like the blog? I think the actual retention rate is much, much lower than that--like almost surely less than 5%.

I'm interested in the shape and mean of the posterior distribution, so only relative rates matter here. Scaling the interest function so that it maxes out at 5% instead of 100% would just change the y-axis on the resulting distribution graph, not the mean.
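
A small sketch of why the ceiling drops out, using a made-up logistic interest function (the midpoint of 130 and slope of 0.2 are illustrative assumptions, not the values in the app): the reader mean is a weighted average, so any constant factor on the weights cancels between numerator and denominator.

```python
from statistics import NormalDist
import math

PRIOR = NormalDist(mu=100, sigma=15)  # assumed general-population IQ distribution

def interest(iq, ceiling=1.0, midpoint=130.0, slope=0.2):
    """Hypothetical logistic interest function that saturates at `ceiling`."""
    return ceiling / (1.0 + math.exp(-slope * (iq - midpoint)))

def reader_mean(ceiling, lo=40, hi=200, step=0.1):
    """Mean IQ of readers: prior density weighted by interest, on a grid."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    weights = [PRIOR.pdf(x) * interest(x, ceiling) for x in grid]
    return sum(x * w for x, w in zip(grid, weights)) / sum(weights)

mean_full = reader_mean(ceiling=1.0)     # interest saturates at 100%
mean_capped = reader_mean(ceiling=0.05)  # interest saturates at 5%

print(f"mean with 100% ceiling: {mean_full:.3f}")
print(f"mean with   5% ceiling: {mean_capped:.3f}")
```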

I assumed that people have different ideas about what interest function is most likely. That's fair, and is why I made the Shiny app for people to play around in.

I think the problem for the idea of "continuing returns" is that it's probably not enough. To get a mean this high you also basically have to make sure the interest function is low even at 130.

The app is of course limited to this sigmoid function. The real function could be more complicated.

https://vilgothuhn.shinyapps.io/InterestFunction/
