Galaxy brain resistance

2025 Nov 07



One important property for a style of thinking and argumentation to have is what I call galaxy brain resistance: how difficult is it to abuse that style of thinking to argue for pretty much whatever you want - something that you already decided elsewhere for other reasons? The spirit here is similar to falsifiability in science: if your arguments can justify anything, then your arguments imply nothing.


You want to get to step 2 and then stop.


It's easiest to motivate the need to think about galaxy brain resistance by looking at what happens in its absence. You've probably heard many cases of people saying things like this:

We are building a new decentralized ____ marketplace that will revolutionize how ____ customers interact with their providers, and will allow creators to turn their audiences into digital nation states. $____ is a governance token that lets you play a direct role in this rapidly growing market. If we capture even 1% of the ___ market share, this will be a $___ billion market, and the governance token will be worth billions.

In politics, you can get much worse. Like, say:

[ethnic minority] is responsible for a lot of persistent social disruption and drains our resources. If we were to just completely remove them (and I do mean completely remove, so they don't come back), it would be a grim one-time act, but in the long term, if doing it makes our economy grow even half a percentage point faster, then in 500 years, our country will be twelve times richer than it otherwise would have been. That is a huge number of much happier and more fulfilled lives. It would be a grave injustice to condemn our descendants to vastly poorer lives simply out of cowardly fear of paying this one-time cost today.

One way to deal with arguments like the above is to treat them as mathematical objects in a philosophy class, try to identify the specific premises or steps that you disagree with, and come up with refutations. But a more realistic approach is to notice that, in the real world, arguments like the above are almost always not reasoning - they are rationalization.

The speaker first came to their conclusion, likely for self-interested or emotional reasons (being a bagholder of the token, or really really hating that ethnic group), and then they came up with fancy arguments that supposedly show why they are justified. The purpose of the fancy arguments is (i) to trick the speaker's own higher mind into following the lead of their base instincts, and (ii) to try to grow the movement by getting not just the deluded, but also vErY smArT pEopLe (or worse, actually smart people) on board.

In this post, I will argue that patterns of reasoning that are very low in galaxy brain resistance are a common phenomenon, some with consequences that are mild and others with consequences that are extreme. I will also describe some patterns that are high in galaxy brain resistance, and advocate for their use.

Patterns of argumentation and reasoning that are low in galaxy brain resistance

Inevitabilism

Consider this recent tweet, which is a pretty good archetypal example of a lot of Silicon Valley AI boosterism:



This is a clear example of the inevitability fallacy. The post starts from a (perhaps reasonable) claim that full automation of the economy is eventually bound to happen, and then skips straight to arguing that this eventuality (and therefore the disemployment of all human labor) should be actively hastened. Why should it be actively hastened? Well, we know why: because this tweet was written by a company whose entire business is actively hastening it.

Now, inevitabilism is a philosophical error, and we can refute it philosophically. If I had to refute it, I would focus on three counterarguments:

But in the real world, inevitabilism cannot be defeated purely as a logical construct because it was not created as a logical construct. Inevitabilism in our society is most often deployed as a way for people to retroactively justify things that they have already decided to do for other reasons - which often involve chasing political power or dollars. Simply understanding this fact is often the best mitigation: the moment when people have the strongest incentive to make you give up opposing them is exactly the moment when you have the most leverage.



Longtermism

Longtermism is the pattern of thought that emphasizes the very large stakes involved in the long-term future. These days, many people associate the term with effective altruist longtermism, e.g. this 80,000 Hours intro:

If we're just asking about what seems possible for the future population of humanity, the numbers are breathtakingly large. Assuming for simplicity that there will be 8 billion people for each century of the next 500 million years, our total population would be on the order of forty quadrillion ... And once we're no longer planet-bound, the potential number of people worth caring about really starts getting big.

But the general concept of appealing to the long term is much older. Appeals to sacrifice today for much larger benefits in the future have been done for centuries by personal financial planners, economists, philosophers, people talking about the best time to plant trees, and many others.

One reason that I hesitate to criticize longtermism is that, well, the long term is actually really important. The reason we don't hard fork a blockchain every time someone loses money to a hacker is that doing so may have a very visible one-time benefit, but it would permanently damage the blockchain's credibility. As Tyler Cowen argues in Stubborn Attachments, the reason why economic growth is so important is that it's one of very few things that reliably compounds forever into the future, instead of disappearing or going in cycles. Educating your children only pays off a decade or more later. If you're not longtermist to some extent, you will never build a road. And when you don't value the long term, you run into problems. A major one I fight against is technical debt: when software developers focus on short-term targets without a coherent view of the long-term picture, the result is that software turns into uglier and uglier junk over time (see: my push to simplify the Ethereum L1).

But there is a catch: longtermist arguments have very low galaxy brain resistance. After all, the long term is far away, and you can make beautiful stories about how if you do X, just about anything will happen. We see the downsides of this play out in the real world over and over again, when we look at the behavior of both markets and politics.

In a market, the variable that chooses between short-term and long-term modes is the prevailing interest rate. When interest rates are high, it only makes sense to invest in projects that show a clear near-term profit. But when interest rates are low, well, at this point the phrase "low-interest rate environment" is a well-understood byword for a situation that involves lots of people creating and chasing narratives that are ultimately unrealistic, leading to a bubble and then a crash.
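To make that concrete, here is a minimal sketch of the discounting mechanism (my own illustration, not from the original post): at high interest rates, a payoff far in the future is worth little today, so only near-term projects clear the bar; at low rates, almost any long-term narrative can pencil out.

```python
# Present value of a $100 payoff received ten years from now,
# discounted at the prevailing interest rate.
def present_value(amount, rate, years):
    return amount / (1 + rate) ** years

print(f"at 10% interest: ${present_value(100, 0.10, 10):.2f}")  # ~$38.55
print(f"at  1% interest: ${present_value(100, 0.01, 10):.2f}")  # ~$90.53
```

At 10%, the far-future story is worth less than half of what it is at 1%, so stories about distant benefits get much less slack.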

In politics, there are common complaints about politicians acting in short-term ways to impress voters, hiding problems under the rug so that they only reappear after the next election. But there is also the idea of a "bridge to nowhere": an infrastructure project justified by a story about long-term value that never ends up materializing.


Left: a bridge to nowhere in Latvia. Right: Dentacoin, "The Blockchain Solution for the Global Dental Industry", used to have a market cap of over $1.8 billion.


The core problem in both cases is that thinking about the long term enables disconnection from reality. In a short-term-favoring environment, sure, you are ignoring the long term, but at least there is a feedback mechanism: if a proposal is justified by a claim of near-future benefits, then in the near future everyone will be able to see if those benefits actually come to pass. In a long-term-favoring environment, an argument about benefits in the long term does not have to be correct, it just has to sound correct. And so even though the game everyone is claiming to play is choosing ideas based on what brings value in the long term, the game they're actually playing is choosing ideas based on what wins in an often dysfunctional and highly adversarial social environment.

If you can use stories about vague but extremely large positive consequences in the long term to justify anything, then a story about vague but extremely large positive consequences in the long term tells you nothing.

How do we get the benefits of long-term thinking without getting disconnected from reality? First of all, I would say it's a really hard problem. But getting beyond that, I do think there are some basic rules of thumb. The easiest is: does the thing you are doing in the name of long-term benefits actually have a solid long-term track record of achieving those benefits? Economic growth is like this. Not making species go extinct is also like this. Trying to install a one-world government does not - in fact, it's one of many examples of something that has a solid long-term track record of failing and causing lots of harm in the process. If an action you're considering has speculative long-term benefits but reliable known long-term harms, then... don't do it. This rule doesn't always apply, because sometimes we really are living in unprecedented times. But it's also important to keep in mind that "we really are living in unprecedented times" has very low galaxy brain resistance.

Bad excuses for banning things for personal aesthetic reasons


I find uni disgusting. You're literally eating sexual organs of a sea urchin. Sometimes, at omakases, this stuff even gets shoved in front of my face. But even still, I oppose banning it, as a matter of principle.


One thing that I despise is people using the coercive power of government to impose what are ultimately personal aesthetic preferences on the personal lives of millions of other people. Having aesthetics is fine. Keeping aesthetics in mind when designing public environments is good. Imposing your aesthetics on other people's personal lives is not - the cost you're imposing on others is vastly higher than any psychological benefit to yourself, and if everyone tries to do it, that inevitably leads to either cultural hegemony or a political war of all against all.

It's not hard to find obvious slam-dunk cases of politicians pushing to ban things for no better reason than "eww, I find it disgusting". An easy goldmine is anti-homosexuality crusades. Like St. Petersburg Duma deputy Vitaly Milonov:

LGBT have no rights. Their rights are not included in the socially significant list of protected values in our country. The so-called perverts have all the rights that they have as people, citizens of our country, but they are not included in some extended top list. We will remove them forever from the list of human rights issues in our country.

Or even Vladimir Putin himself, who tried to justify invading Ukraine by complaining about ... the United States having too much "satanism". A more recent and somewhat different case is movements in the United States to ban synthetic meat:

Cultured meat is not meat ... it is made by man, real meat is made by God Himself ... If you really want to try the nitrogen-based protein paste, go to California.

But many people are a step more cultured, and try to wrap this in some kind of excuse. Common ones are "the moral fabric of society", "social stability", and various similar appeals. Arguments like this are also often used to justify censorship. What's the problem? I'll let Scott Alexander handle this one:

The Loose Principle of Harm says that the government can get angry at complicated indirect harms, things that Weaken The Moral Fabric Of Society ... But allowing the Loose Principle Of Harm restores all of the old wars to control other people that liberalism was supposed to prevent. The one person says "Gay marriage will result in homosexuality becoming more accepted, leading to increased rates of STDs! That's a harm! We must ban gay marriage!" Another says "Allowing people to send their children to non-public schools could lead to kids at religious schools that preach against gay people, causing those children to commit hate crimes when they grow up! That's a harm! We must ban non-public schools!" And so on, forever.

The moral fabric of society is a real thing - some societies are much more moral than others in easily observable ways. But also, it's vague and undefined, which makes it so incredibly easy to say that just about anything contravenes the moral fabric of society. This applies also to more direct appeals to the "wisdom of repugnance", which have already been ruinous to the progress of science and medicine. And it also applies to newer catchall "ban it because I don't like it" wrappers, of which a common one is the desire to assert local culture against undefined "global elites". Some more quotes from the anti-synthetic-meat crusaders (remember, these people are not trying to explain why they are not going to eat synthetic meat personally, they are explaining why they are coercively imposing their choice on everyone else):

Global elites want to control our behavior and push a diet of petri dish meat and bugs on Americans.

Florida is saying no. I was proud to sign SB 1084 to keep lab grown meat out of Florida and prioritize our farmers and ranchers over the agenda of elites and the World Economic Forum.

Some folks probably like to eat bugs with Bill Gates, but not me.

This is a big part of my sympathy toward a moderate libertarianism. I want to live in a society where banning something requires a clear story about harm or risk imposed on clearly identified victims, and if that story is successfully challenged in court then the law is repealed. This greatly reduces the potential to capture government and use it to impose one culture's preferences over the personal lives of others, or fight a war of all against all as every team tries to do the same.

Apologia for bad finance

In crypto, you often hear bad arguments for why you should throw your money into various high-risk projects. Sometimes, they are smart-sounding arguments about how a project is "disrupting" (i.e. participating in) a trillion-dollar industry, and how this particular project is really unique and doing things everyone else is not. Other times, it's just "number go up because celebrity".

I am not opposed to people having fun, including having fun by risking some of their money. I am opposed to people being encouraged to put half their net worth into a token that the influencers all say will definitely go up, when the most realistic outcome is that two years later the token is worth nothing. But what I am even more opposed to is people arguing that speculative token games are morally righteous, because poor people need that rapid 10x gain to have a fair chance in the modern economy. Like, say, this:



This is a bad argument. One way to see why it's a bad argument is to approach it like any other argument, and deconstruct and refute the claim that this is actually a meaningful or helpful form of "class mobility".

The core problem with the argument is: casinos are zero-sum games. As a first approximation, for each person who goes up a social class, there's a person who goes down a social class. But if you dig deeper into the math, it gets worse. In any standard welfare economics textbook, one of the first ideas that you will see is that a person's utility function in money is concave. Each dollar is worth less to you the richer you already are.


An example of a utility curve. Notice how the slope (value per dollar) decreases the more dollars you have.


This model has an important conclusion: random coin flips, especially large ones, are on average bad for you. Losing $100,000 hurts you more than gaining $100,000 helps you. If we take a model where you currently have $200,000, and each 2x change in wealth pushes you up or down a social class, then if you win a $100,000-sized coin flip, you go up about half a social class, but if you lose the coin flip, you go down a full social class.
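Here is a minimal sketch of that arithmetic (my own illustration; it assumes logarithmic utility, which matches the "each 2x change in wealth is one social class" framing above):

```python
import math

def class_change(wealth_before, wealth_after):
    # With log utility, each 2x change in wealth is one "social class".
    return math.log2(wealth_after / wealth_before)

wealth, stake = 200_000, 100_000

win = class_change(wealth, wealth + stake)   # ~ +0.58 classes
loss = class_change(wealth, wealth - stake)  # exactly -1.00 classes

print(f"win : {win:+.2f} social classes")
print(f"lose: {loss:+.2f} social classes")
print(f"expected change from one fair flip: {(win + loss) / 2:+.2f}")  # ~ -0.21
```

Even a perfectly fair flip costs you about a fifth of a social class in expectation, and the larger the flip relative to your wealth, the worse it gets.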

Economic models created by people whose motivation is to, well, study human decision making and try to find ways to improve people's lives, pretty much always output conclusions like this. What kind of economic model outputs the opposite conclusion - that it's good to throw in all your money in search of a 10x? Stories told by people whose goal is to feel good about pumping coins.

Here, my goal is not to blame people who actually are poor and desperate and are looking for a way out of their situation. Rather, my goal is to blame people who are financially doing quite well, who are using "poor and desperate people who really need that 10x" as a meme to justify creating situations that encourage poor and desperate people to get into even deeper trouble.

This is a big part of why I have been pushing for the Ethereum ecosystem to focus on low-risk defi. Escaping having your money zeroed out by a political collapse, and getting first-world interest rates, is an excellent thing for people in the third world to have access to, and can work wonders at pushing people up social classes without pushing people down social classes. Recently, someone asked me: why not say "good defi" instead of "low-risk defi"? After all, not all high-risk defi is bad, and not all low-risk defi is good. My response was: if we focus on "good defi", then it's easy for anyone to make a galaxy-brain argument that any particular type of defi is "good". But if you say "low-risk defi", that's a categorization that has teeth - it's actually hard to make a galaxy-brain argument that a type of activity that is clearly and regularly causing people to go bankrupt in a day is "low-risk".

I certainly do not oppose high-risk defi existing - after all, I am a fan of prediction markets. But it's a healthier ecosystem when low-risk defi is the mainstay, and high-risk defi is the side dish - something fun or experimental, and not meant for people to put half of their life savings into.

Final question: is the idea that prediction markets are "not just gambling", because they benefit society by improving access to accurate information, itself just a galaxy-brained retroactive rationalization? Some people certainly think so:



I will offer my defense against this charge. The way that you can tell that this is not retroactive rationalization is that there is a thirty-year-old intellectual tradition of appreciating prediction markets and trying to bring them into existence, which long predated any possibility of making a serious profit off of them (either by creating such projects or by participating in them). This kind of pre-existing intellectual tradition is not something that exists for memecoins, or even more borderline cases like personal tokens. But, once again, prediction markets are not low-risk defi, and so they are a side dish, not something for you to put half your net worth into.

Power maximization

In the AI-related corners of the effective altruist community, there are many powerful people who, if you go up to them and ask them, will explicitly tell you that their strategy is to accumulate as much power as possible. Their goal is to be well-positioned so that, when some kind of "pivotal moment" comes, they can come out with guns blazing and lots of resources under their command and "do the right thing".

Power maximization is the ultimate galaxy brain tactic. "Give me power so I can do X" is as close as it gets to an argument that is equally convincing no matter what the X. All the way up until the critical moment (which, in AI eschatology, is the moment right before we either get utopia or all die and turn into paperclips), the actions that you would take to maximize power for altruistic reasons, and the actions that you would take to maximize power because you're a greedy egomaniac, are exactly the same. Hence, anyone trying to do the latter can, at zero cost, just tell you that they are trying to do the former and convince you that they are a good person.



From the outside view, this kind of argument is clearly crazy: everyone thinks they're more ethical than everyone else, and so it's easy to see how, even though each person thinks their own power maximization is net-good, on average it really is not. But from the inside, if you look at the world, and you see the hate on social media, the political corruption, the invasions, the other AI companies behaving unscrupulously, the idea that you personally are the good guy and you should just ignore the corrupt outside world and go solve things yourself certainly feels compelling. And this is exactly why it's healthy to take an outside view.

Alternatively, you can take a different and more humble inside view. Here's a fun argument from the effective altruism forums:

Arguably the largest advantage of investing is that it can exponentially grow financial resources, which can be used for good at a later point. The S&P 500 has had an inflation-adjusted annualized return of ~7% since its inception in 1926

...

The risk of value drift is even harder to estimate, but an important factor. For instance, these three sources (1,2,3) collectively suggest a yearly value drift rate of ~10% for individuals within the effective altruism community.

That is, while it's true that each year your wealth grows by 7%, it's also empirically true that if you believe in a cause today, a year from now you're likely to believe in it about 10% less. This matches up with an observation by Tanner Greer that public intellectuals tend to have a "shelf life" of about 10-15 years, after which their ideas stop being better than the surrounding background noise (I will let the reader decide the significance of the fact that I started publicly writing in 2011).

Hence, if you grow your wealth to act later, your future self may well do something with that extra wealth that your present self does not even support.
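A back-of-the-envelope sketch of why (my own illustration; it assumes the two rates simply compound against each other, which is a simplification of the sources quoted above):

```python
growth = 1.07      # inflation-adjusted annual market return (~7%)
retention = 0.90   # fraction of your current values that survive each year (~10% drift)

aligned = 1.0  # capital, weighted by how much your future self still agrees with you
for year in range(1, 21):
    aligned *= growth * retention
    if year in (5, 10, 20):
        print(f"after {year:2d} years: {aligned:.2f}x of today's value-aligned capital")
# after  5 years: 0.83x
# after 10 years: 0.69x
# after 20 years: 0.47x
```

If value drift outpaces investment returns, "earning to give later" quietly becomes "earning to give to someone else's values".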

I'm-doing-more-from-within-ism

One problem that repeatedly happens in AI safety is a sort of hybrid between power maximization and inevitabilism: people deciding that the best way to advance the cause of AI safety is to join companies making superintelligent AI happen even faster, and try to improve them from within. Here, you often get rationalizations like, say, this:



From the inside view, this seems reasonable. From the outside view, however, you basically end up with this:



Another good example of this school of thought is the modern Russian political establishment. Here, I'll just quote this Financial Times article:

The full-scale assault on Ukraine on February 24, three days after Putin recognised the Donbas separatists, exceeded their worst fears. They discovered Putin's true intentions along with the rest of the world: on television. Putin's failure to heed the technocrats' warnings devastated them. "I'd never seen [Gref] like that. He was completely bereft, in a state of total shock," says a former executive who saw Gref in the war's early days. "Everyone thinks this is a catastrophe, him more than anyone else."

...

Within the narrow confines of the Russian political elite, technocrats such as Gref and Nabiullina were once thought of as modernisers, a reformist counterbalance to the siloviki, the hardline security services veterans at Putin's other shoulder.

However, when faced with a historic chance to defend their belief in open markets and speak out against the war, they demurred. Instead of breaking with Putin, the technocrats have cemented their role as his enablers, using their expertise and tools to soften the blow of western sanctions and hold Russia's wartime economy together, according to former officials.

Again, the problem is that "I'm doing more from within" has very low galaxy brain resistance. It's easy to say "I'm doing more from within" regardless of the actual specific thing that you're doing from within. And so you end up just being a cog in the machine, with the same effect as the other cogs, who are there to help their family live in a beautiful mansion in a premium neighborhood and eat expensive dinners every day, but with a slightly better justification.

So how do you avoid galaxy braining yourself?

There are lots of different things you can do, but I will focus on two:

Have principles

Have hard rules of what you're not willing to do - don't kill innocent people, don't steal, don't defraud, respect people's personal freedom - and have a very high bar for considering any exceptions.

Philosophers generally call this deontological ethics. Deontological ethics confuses many people - surely, if your rules have some underlying reason behind them, you should just go straight to pursuing that underlying reason. If "don't steal" is a rule because stealing usually hurts the victim more than it benefits you, then you should just follow the rule of not doing things that hurt the victim more than they benefit you. If sometimes stealing benefits you more than it hurts the victim - then steal!


"The victims are faceless corporations with billionaire shareholders, therefore my shoplifting crusade is righteous"


The problem with this kind of consequentialist approach is that it has no galaxy brain resistance. Our brains are really good at coming up with arguments why, in this particular case, the thing that you already want for other reasons happens to also be great for humanity. Deontological ethics says: no, you can't do that.

One form of deontology that many people follow is rule utilitarianism: choose rules based on what leads to the greatest good, but when it comes time to choose individual actions, just follow the rules you've already chosen.

Hold the right bags

One other common theme above is that your actions are often set by your incentives - in crypto lingo, what bags you hold. This pressure is very difficult to resist. The easiest way to avoid this is to not give yourself bad incentives.

Another corollary is to avoid holding the wrong social bags: what friend cluster you're closely attached to. You should not try to avoid having social bags - doing so is counter to our most basic human instincts. But it is possible to at least diversify them. The easiest one-step action you can take to make a big difference here is to choose your physical location well.

This brings me to my own contribution to the already-full genre of recommendations for people who want to contribute to AI safety:

  1. Don't work for a company that's making frontier fully-autonomous AI capabilities progress even faster
  2. Don't live in the San Francisco Bay Area