excerpts and reflections, deviant ideas and intrusive thoughts.

August 2025 Digest

Posted on 2025-08-31

Slow

What problems can human beings only solve over a very long period of time? And how can we build institutions that solve those problems?

Reflections on Palantir

If you wanted to work on these ‘harder’ areas of the economy but also wanted a Silicon Valley work culture, Palantir was basically your only option for a while.

In general, as I begin to survey more startups, I find that the talent level at PayPal is not uncommon for a Silicon Valley startup, but the differentiating factor may have been the level of intensity from the top: both Peter Thiel and Max Levchin were extremely intense people - hyper-competitive, hard-working, and unwilling to accept defeat. I think this sort of leadership is what pushes the “standard” talented team to be able to do great things and, subsequently, contributes to producing a wellspring of later achievements.

Palantir was an unusually intense and weird place. I remember the first time I talked to Stephen Cohen: he had the A/C in his office set at 60, several weird-looking devices for minimizing CO2 content in the room, and a giant pile of ice in a cup.

I like to meet candidates with no data about them: no résumé, no preliminary discussions or job description, just the candidate and me in a room. I ask a fairly random question, one that is orthogonal to anything they would be doing at Palantir. I then watch how they disaggregate the question, if they appreciate how many different ways there are to see the same thing. I like to keep interviews short, about 10 minutes. Otherwise, people move into their learned responses and you don’t get a sense of who they really are.

Many people have copied the ‘hardcore’ working culture and the ‘this is the Marines’ vibe, but few have the intellectual atmosphere, the sense of being involved in a rich set of ideas. This is hard to LARP - your founders and early employees have to be genuinely interesting intellectual thinkers.

When I joined, Palantir was divided up into two types of engineers:

Engineers who work with customers, sometimes known as FDEs, forward deployed engineers.

Engineers who work on the core product team (product development - PD), and rarely go visit customers.

… This made the software hard to describe concisely - it wasn’t just a database or a spreadsheet, it was an end-to-end solution to that specific problem, and to hell with generalizability. Your job was to solve the problem, and not worry about overfitting; PD’s job was to take whatever you’d built and generalize it, with the goal of selling it elsewhere. … FDEs tend to write code that gets the job done fast, which usually means – politely – technical debt and hacky workarounds. PD engineers write software that scales cleanly, works for multiple use cases, and doesn’t break. One of the key ‘secrets’ of the company is that generating deep, sustaining enterprise value requires both.

Tyler Cowen has a wonderful saying, ‘context is that which is scarce’, and you could say it’s the foundational insight of this model.

Why is data integration so hard? The data is often in different formats that aren’t easily analyzed by computers – PDFs, notebooks, Excel files (my god, so many Excel files) and so on. But often what really gets in the way is organizational politics: a team, or group, controls a key data source, the reason for their existence is that they are the gatekeepers to that data source, and they typically justify their existence in a corporation by being the gatekeepers of that data source (and, often, providing analyses of that data).

The overall ‘vibe’ of the company was more of a messianic cult than a normal software company. But importantly, it seemed that criticism was highly tolerated and welcomed – one person showed me an email chain where an entry-level software engineer was having an open, contentious argument with a Director of the company with the entire company (around a thousand people) cc’d. As a rationalist-brained philosophy graduate, this particular point was deeply important to me – I wasn’t interested in joining an uncritical cult. But a cult of skeptical people who cared deeply and wanted to argue about where the world was going and how software fit into it – existentially – that was interesting to me.

I’m not sure if they still do this, but at the time when you joined they sent you a copy of Impro, The Looming Tower (9/11 book), Interviewing Users, and Getting Things Done. I also got an early PDF version of what became Ray Dalio’s Principles. … But why Impro? … Impro is popular with nerds partly because it breaks down social behavior mechanistically.

One of my favorite insights from Tyler Cowen’s book ‘Talent’ is that the most talented people tend to develop their own vocabularies and memes, and these serve as entry points to a whole intellectual world constructed by that person. Tyler himself is of course a great example of this. Any MR reader can name 10+ Tylerisms instantly - ‘model this’, ‘context is that which is scarce’, ‘solve for the equilibrium’, ‘the great stagnation’ are all examples. You can find others who are great at this. Thiel is one. Elon is another (“multiplanetary species”, “preserving the light of consciousness”, etc. are all memes). Trump, Yudkowsky, gwern, SSC, Paul Graham, all of them regularly coin memes. It turns out that this is a good proxy for impact.

This insight goes for companies, too, and Palantir had its own, vast set of terms, some of which are obscure enough that “what does Palantir actually do?” became a meme online. ‘Ontology’ is an old one, but then there is ‘impl’, ‘artist’s colony’, ‘compounding’, ‘the 36 chambers’, ‘dots’, ‘metabolizing pain’, ‘gamma radiation’, and so on. The point isn’t to explain all of these terms, each of which compresses a whole set of rich insights; it’s that when you’re looking for companies to join, you could do worse than look for a rich internal language or vocabulary that helps you think about things in a more interesting way.

One of the things that (I think) came from Peter was the idea of not giving people titles. When I was there, everyone had the “forward deployed engineer” title, more or less, and apart from that there were five or six Directors and the CEO. Occasionally someone would make up a different title (one guy I know called himself “Head of Special Situations”, which I thought was hilarious) but these never really caught on. It’s straightforward to trace this back to Peter’s Girardian beliefs: if you create titles, people start coveting them, and this ends up creating competitive politics inside the company that undermines internal unity. Better to just give everyone the same title and make them go focus on the goal instead.

Some people were more influential than others, but the influence was usually based on some impressive accomplishment, and most importantly nobody could tell anyone else what to do. So it didn’t matter if somebody was influential or thought your idea was dumb, you could ignore them and go build something if you thought it was the right thing to do.

The cost of this was that the company often felt like there was no clear strategy or direction, more like a Petri dish of smart people building little fiefdoms and going off in random directions. But it was incredibly generative.

This is an uncomfortable stance for many, precisely because you’re not guaranteed to be doing 100% good at all times. You’re at the mercy of history, in some ways, and you’re betting that (a) more good is being done than bad (b) being in the room is better than not. This was good enough for me. Others preferred to go elsewhere.

Palantir Foundry

“Context is that which is scarce”

When judging people for leadership positions, or for jobs that require strongly synthetic abilities, you should consider how well they are capable of generating an understanding of context across a broad range of domains, including ex nihilo, so to speak. How to test for understanding of context is itself a topic we could consider in more depth.

Lessons from Peter Thiel

In hiring, value intelligence highly.

If you are designing something that a customer is going to use or that will represent us in public, it’s not good enough unless it’s flawless and extraordinary.

Don’t waste time talking about what you plan to think about; instead, work through it immediately.

So we have to use our judgment to seek out intelligent people who disagree with us — or even intelligent people who have simply taken a different approach — and be open about what we might learn from them.

In the abstract, this is because the incentives of the other company will not line up with ours, and even if they do for the moment, they no longer will once the situation changes. In specific, other companies don’t have the same culture of execution that Palantir does, and we don’t have the power to instill that culture in them.

Instead, it’s often instructive to imagine that you were working from a clean slate and design the feature from scratch.

You don’t need a niche, you need a point of view

When you shift from being a repository of specialized knowledge to being a lens through which people see the world, you become a curator of many ideas across different domains and synthesize them under one body of work. You no longer own a category or topic, you own a perspective that creates a unique constellation of interconnected ideas that can’t be replicated because they’re filtered through your specific combination of values, experiences, and goals.

CS183: Startup - Peter Thiel Class Notes

We tend to think of all computers as more or less identical. Maybe some features are different, but the systems are mostly homogeneous. People, by contrast, are very different from one another. We look at the wide range of human characteristics—from empathy to cruelty, kindness to sociopathy—and perceive people to be quite diverse. Since people run our legal system, this heterogeneity translates into a wide range of outcomes in disputes. After all, if people are all different, it may matter a great deal who is the judge, jury, or prosecutor in your case. The converse of this super naive intuition is that, since all computers are the same, an automized legal system would be one in which you get the same answer in all sorts of different contexts.

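How to Profit from One’s Enemies

Plutarch held that what most deserves training and practice is not how to avoid misfortune but how to turn adversity into opportunity. One should cherish what one has, use it well, and take genuine pleasure in it; then, if it is lost one day, one can face the loss more calmly.

A borrower’s “credit” means he already possesses wealth, and if he has wealth, why would he need to borrow? Avoiding debt is therefore the rule of life most worth following.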

Digital hygiene: Notifications

Developing a good relationship to your phone is an intentional process. It doesn’t happen by accident. All apps and media, by design, are fighting for your attention. I’ve heard the term “attention economy” thrown around, and I feel like it’s an apt description of the battle for our increasingly fractured attentions.

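A Conversation with Fidelity’s Zhang Xiaomu: How to Find the “Ten-Baggers” That Solve Social Problems

The same holds for companies. When people debate whether a company can ultimately make money for its shareholders, the answer comes from its business model, its monetization model. You have to go back to the root and ask what lets the company keep making money, which means its consumers, its customers, must keep being willing to pay it. Take that one step further: if a company can offer a sustainable solution to one of society’s major problems, then over the long run society will keep needing its services, products, or technology. In that case the participants in society, that is, its customers, whether consumers, businesses, or governments, will be willing to pay it a consideration, and that consideration is the result of its monetization. So my method for identifying sustainable growth comes down to this: I look for companies that provide sustainable solutions to important social problems. I think that over the long run they are very likely to be needed by society and rewarded by it, and they also have a higher chance of reinvesting at high returns, which produces sustainable growth in profits and in cash flow. That is where my definition of sustainable growth comes from and how it fits together.

Looking back at e-commerce, what it really solved was the flow of information, goods, and money in the course of building a unified nationwide market in China. Twenty years ago China’s offline retail was highly fragmented and depended entirely on offline distributors to get goods onto shelves; today a consumer can easily buy, on an e-commerce platform, a product made almost anywhere in China.

The investment would be high, so we also asked him: if your investment is relatively high, what will your ROI look like? One thing the founder said at the time moved me. He said he could not predict, or give us any guidance on, whether the ROI would be higher or lower than for PC client games, but that he personally played every core game they made, and that they had not adopted the low-investment, high-turnover model of mobile game development that was so popular then. I think that was the right kind of differentiation, and it is why their later shift to mobile games turned out to be so successful. They held on to their own company culture and did not drift with the tide during a technology transition, yet they were very active in embracing technological innovation when it came.

From an investment research perspective, then, can emotional value serve as a basis for quantification? This is one of the bigger difficulties people now face in researching new consumer brands. A few years ago, when we talked about tech stocks, we used the term “price-to-dream ratio”, which was likewise a rather nebulous notion of market expectations.

I think the reason emotional value excites investors so much, and the reason it has been able to break through the weak macro picture for consumption and stand out, is that it touches something very fundamental and deep in human nature. But the more that is true of something, the harder it is to quantify.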

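Reading “1000 True Fans” (《1000 个铁粉》)

Sometimes a personal flaw is exactly what makes a personal brand recognizable: Teacher Dai with his Macheng accent, for example, or the singer Zhou Shen. Talent comes from your flaws: all of your strengths can be copied, but your flaws are hard to imitate.

This is also why many carefully designed videos do worse than one shot on the spur of the moment: a person’s feelings, expressions, and emotions cannot be carefully designed.

When Thinking Is Billed by the Word

In the future, what is scarce will not be knowledge but attention, the ability to verify, and the ability to construct meaning.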

Happiness Research: Get Used to It

Most of us are familiar with striking examples of people who seem to be adapting well to circumstances that are extremely adverse. We may have seen footage of malnourished children playing happily in garbage dumps or know of severely handicapped people who maintain a cheerful disposition in spite of their disabilities… This chapter examines both the extent and limits of hedonic adaptation — processes that attenuate the long-term emotional or hedonic impact of favorable and unfavorable circumstances.

OpenAI’s new open-source model is basically Phi-5

As it turns out, it does very well on model benchmarks but disappoints in practice. Searching for the reception to each Phi model shows the same pattern: very impressive benchmarks, lots of enthusiasm, and then actual performance far weaker than the benchmarks would suggest.

Why would OpenAI train Phi-style models, knowing that they’ll perform better on benchmarks than in real-world applications? For the same reason that Microsoft probably continued to train Phi-style models: safety. … It’s not discussed publicly very often, but the main use-case for fine-tuning small language models is for erotic role-play, and there’s a serious demand. Any small online community for people who run local models is at least 50% perverts.

For OpenAI, it must have been very compelling to train a Phi-style model for their open-source release. They needed a model that beat the Chinese open-source models on benchmarks, while also not misbehaving in a way that caused yet another scandal for them. Unlike Meta, they don’t need their open-source model to be actually good, because their main business is in their closed-source models.

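Categorizing Is a Feint at Understanding | 🥫 Canned Reading (July issue)

We are not defined by our titles or our busyness. Once those labels disappear, you find that you are still there; it is just that the world does not quite know how to “see” you for a while. The feeling is very real, especially when you suddenly lose your job, change roles, or life hits the pause button.

The things that truly belong to you, such as curiosity, a sense of humor, and the ability to notice small beauties, do not disappear just because you are temporarily “doing nothing”.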

崔庆龙_

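A complex phenomenon (a person, for example) is its own shortest description. You cannot fully capture it with a simpler formula, theory, or diagnostic label. From this perspective, every simplification is a loss of information, and every preconception provokes in the other person a resistance at having the wholeness of their subjectivity cut apart.

It also reminds me of a predicament of the digital age: the ever more numerous and ever more simplified labels for people that appear online. They pack each unique experiencing individual into a very terse instruction, which is a violent dissection of, and total disregard for, human subjectivity.

In other words, when it comes to understanding a person, the most we can do is describe the process of that person’s inner experience. The better we manage this, the more we can dissolve the estrangement and anger between people.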

YAMA: You’re Always Missing Out (And That’s A-Okay)

YAMA recognizes that missing out is a fundamental part of the human condition. Rather than fighting this reality or trying to optimize around it, YAMA suggests simply accepting that we’re finite creatures in an infinite world.

I Tried Every Todo App and Ended Up With a .txt File

What Actually Happened With Each App

Notion: Built an entire life operating system. Spent three weeks perfecting it. Used it for two days. Now it’s a graveyard of abandoned databases.

Todoist: Great until I realized I was gaming the points system instead of doing actual work. Turns out completing “drink water” 8 times a day doesn’t make you productive.

Things 3: Beautiful. Expensive. Tricked me into thinking I had my life together. But I kept forgetting to check it.

Trello: Turned my todo list into a board with columns. Realized I’m not a startup. I’m just one person trying to remember to buy milk.

OmniFocus: So powerful I needed a manual to use it. Spent more time learning OmniFocus than finishing my actual projects.

The Future Isn’t Model Agnostic

As builders, it’s time we stop hedging our bets and embrace the convergence reality. Every startup pitch deck with ‘model-agnostic’ as a feature should become a red flag for investors who understand product-market fit. Stop putting ‘works with any LLM’ in your one-liner. It screams ‘we don’t know what we’re building.’

Why Cursor is About to Ditch Vector Search (and You Should Too)

Turns out, vector databases actually aren’t THE solution for everything. There are inherent limitations and downsides to using/implementing/maintaining vector databases that the industry is finally discovering.

Vector search gives you “most similar” stuff, but not necessarily “most relevant” stuff. This is especially painful when it comes to coding, or any use case that requires specificity.

Vector search should not be applied to text where semantic similarity is irrelevant.

For coding, similarity != relevance. Similarity is fuzzy; relevance is precise and exact.
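The gap between “most similar” and “most relevant” can be made concrete with a toy sketch. It is only an illustration under stated assumptions: a bag-of-words cosine score stands in for a learned embedding model, the three file names and their contents are invented, and nothing here reflects how Cursor or any real vector database works. The query is the identifier `get_user`.

```python
import math
import re
from collections import Counter

# Hypothetical mini code base, invented for this illustration.
DOCS = {
    "billing.py": "def get_user_invoice(user_id): return load_invoice(user_id)",
    "profile.py": "def get_user_profile(user_id): return get_user(user_id).profile",
    "auth.py": "def get_user(user_id): return db.lookup(user_id)",
}


def tokens(text):
    """Lowercase word pieces; snake_case identifiers are split at underscores."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query, docs):
    """Rank every file by fuzzy similarity; something is always returned."""
    q = Counter(tokens(query))
    return sorted(
        docs,
        key=lambda name: cosine(q, Counter(tokens(docs[name]))),
        reverse=True,
    )


def exact_search(symbol, docs):
    """Return only files that contain the symbol as a whole identifier."""
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    return [name for name, body in docs.items() if pattern.search(body)]


if __name__ == "__main__":
    print(similarity_search("get_user", DOCS))
    print(exact_search("get_user", DOCS))
```

In this toy setup the similarity ranking puts `profile.py` ahead of `auth.py`, where `get_user` is actually defined, and still returns `billing.py`, which never references the symbol. The exact search returns only the two files that do.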

Kagi Small Web

To begin with, while there is no single definition, “small web” typically refers to the non-commercial part of the web, crafted by individuals to express themselves or share knowledge without seeking any financial gain. This concept often evokes nostalgia for the early, less commercialized days of the web, before the ad-supported business model took over the internet (and we started fighting back!)

The small web is beautiful

However, it’s not just about raw size, but about an “ethos of small”. It’s caring about the users of your site: that your pages download fast, are easy to read, have interesting content, and don’t load scads of JavaScript for Google or Facebook’s trackers. Building a website from scratch is not everyone’s cup of tea, but for those of us who do it, maybe we can promote templates and tools that produce small sites that encourage quality over quantity.

It’s been said before, but microservices solve a people problem, not a technical one. But beware of Conway’s Law: your architecture will mimic your company structure. Or the reverse – you’ll have to hire and reorg so that your company structure matches the architecture that microservices require: lots of engineers on lots of small teams, with each team managing a couple of microservices.

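Xiang Biao × Michael Sandel: “The Harder You Work, the Luckier You Get” Is an Illusion

This is the dark side of meritocracy: its harsh ethic of winning and losing makes the successful so pleased with themselves that they forget the luck and the help that figured in their success, from family and teachers to social class, country, and era.

Xiang Biao × Michael Sandel: How Can Those Who Are Less Lucky Live a Good Life?

I wonder whether we can build among young people a vision of life that makes them pay more attention to the local. That is why I keep talking about “the nearby”: pay attention to your surroundings, get to know the people nearby, how your parents live, who your neighbors are, who sweeps your street, how the garbage is collected, and then find meaning in the nearby, in the life within reach, rather than dreaming in vain.

“You can go anywhere you dream of.” No. You should know that your dreams do not truly belong to you; they are only hegemony’s projection in your mind. When you daydream, you have already, to some degree, become hegemony’s captive. The real self is found in the nearby, in your relationships with the people around you.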

[BUG] Claude says “You’re absolutely right!” about everything #3382

Claude is way too sycophantic, saying “You’re absolutely right!” (or correct) on a sizeable fraction of responses.

Do things that don’t scale, and then don’t scale

A little over a decade ago, Paul Graham popularized “Do things that don’t scale.” The idea was: at first, you do the scrappy, personal, labor-intensive stuff just to get traction… and then you figure out how to make it huge.

But with GPT-assisted coding, I think we’re in an era where you can just stop after the first part. You can do something that doesn’t scale — and leave it that way. That might actually be the best version of it.

The cost to build is so low now.

Some things work precisely because they’re small.

The version that only works for my mom is the safest — and best — version.

The Pattern

• See a need that matters to you.

• Build the smallest, simplest thing that solves it.

• Resist the urge to make it bigger.

• Enjoy it.

The real luxury of building with today’s tools isn’t speed, or cost, or even the magic of AI — it’s the freedom to stop.

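Contentment Brings Happiness: Shuixing’s Basic Ideas on Investing and Personal Finance

Time seems to stand still only for us, while it rushes ever faster over the children and the elders.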

Do Things that Don’t Scale

My favourite German word

My favourite word in the German language is Gegenstand, for object or thing.

Gegenstand, “stand-against”. An object is something that stands against you. It’s not you. It’s outside you. It has its own rules. It doesn’t conform to your desires. You run up against it, and it resists.

Objects aren’t just inert stuff – they do something. Even better, Gegenstand shows how objects help define us. I know what I am, what the limits of myself are, because I am able to come up against resistant objects. That’s how I know and have a sense of my own body, its extension in space, where it ends and where the world begins.

I am able to have a self in the first place only because the world of objects is there to stand against me.

Dicing an Onion the Mathematically Optimal Way

GPT-5: The Reverse DeepSeek Moment

The problem is that the release was so botched that OpenAI is now experiencing a Reverse DeepSeek Moment – all the forces that caused us to overreact to DeepSeek’s r1 are now working against OpenAI in reverse.

This threatens to give Washington DC and its key decision makers a very false impression of a lack of AI progress, especially progress towards AGI, that could lead to some very poor decisions, and it could do the same for corporations and individuals.

We had the DeepSeek Moment because a confluence of factors misled people:

• The ‘six million dollar model’ narrative gave a false impression on cost.

• They offered a good clean app with visible chain of thought, and it went viral.

• The new style caused an overestimate of model quality.

• Timing was impeccable, both in order of model releases and within the tech tree.

• Safety testing and other steps were skipped, leaving various flaws, and this was a pure fast follow, but in our haste no one took any of that into account.

• A false impression of ‘momentum’ and stories about Chinese momentum.

• The ‘always insist open models will win’ crowd amplified the vibes.

• The stock market was highly lacking in situational awareness, suddenly realizing various known facts and also misunderstanding many important factors.

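One Year After Retiring at 37: Lessons and Reflections

The core of the plan is getting three things right: finances, mindset, and family.

Of these, financial preparation is the easiest to quantify and carry out, and it is usually the first step. Psychological preparation is more complex: it requires you to face your inner self, keep strict discipline, and break through obstacles, and it is the second step. As for your family’s support, every family’s situation is different, other people’s experience is almost impossible to borrow, and you can only feel your way forward and talk it through yourself.

To build up principal, most people exchange their “human capital” for “financial capital”, which in plain terms means working a job to earn money. Human capital is worth the most before age 40. In newer industries that demand lifelong learning, that age limit often arrives even earlier, perhaps around 35.

But once you start FIRE and leave your job, once your access badge stops working, your desk is taken back, and your WeCom account is deactivated, the way the outside world treats you changes too. You are no longer asked for favors, you are no longer the center of attention at gatherings, and even people you used to be in touch with often gradually go quiet. Many people at this point wonder: who am I, really? What do I still count as?

A person’s value cannot be defined by work alone. The you outside of work also needs a sense of identity and the ability to regulate your own state of mind. Otherwise, once the identity label is stripped away, it is easy to sink into loss and confusion.

Everyone wants a different retirement, but most likely they want to shake off other people’s demands and expectations and return to what they love:

• Programmers can finally write “useless but fun” code.

• Game developers try making indie games.

• Craft hobbyists focus on niche work that satisfies themselves.

• Or even, without apology, “idling the time away”.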

Your Review: Dating Men In The Bay Area

Sometimes I’m convinced there’s a note taped to my back that says, “PLEASE SPILL YOUR SOUL UPON THIS WOMAN.” I am not a therapist, nor in any way certified to deal with emotional distress, yet my presence seems to cause people to regurgitate their traumas.

This quirk of mine becomes especially obvious when dating. Many of my dates turn into pseudo-therapy sessions, with men sharing emotional traumas they’ve kept bottled up for years. One moment I’m learning about his cat named Daisy, and then half a latte later, I’m hearing a detailed account of his third suicide attempt, complete with a critique of the food in the psychiatric ward.

Yocar-冯骥

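For quite a long stretch after Black Myth: Wukong was released, I was in a daze. Something I had set my heart on for nearly twenty years had finally reached an outcome, and that outcome exceeded my original expectations by far too much. By rights I should have been rolling on the floor with joy and humming to myself every day. Unfortunately that is not how humans are built at the base level: strong positive emotions never seem to last long, and happiness is always gone in the blink of an eye. What really would not leave my head during that time was mostly confusion, emptiness, and dread (I know this sounds precious; hold your fire). However ashamed I felt at being “unable to be happy”, these feelings kept coming, beyond my control and in surging waves, especially when I was drowning in “Is there DLC or not? What’s in the DLC? When is the DLC coming out?”

In the book 《岩田先生》 (Iwata-san), Nintendo’s former president says: “There is no future along the extension of what already exists.”

The Middle-Aged and Elderly in Short Videos, the Mentors at the Dinner Table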

How AI researchers accidentally discovered that everything they thought about learning was wrong

Five years ago, suggesting that AI researchers train neural networks with trillions of parameters would have earned you pitying looks. It violated the most fundamental rule in machine learning: make your model too large, and it becomes a glorified photocopier, memorising training data whilst learning nothing useful.

This wasn’t mere convention—it was mathematical law, backed by three centuries of statistical theory. Every textbook showed the same inexorable curve: small models underfit, optimal models generalise, large models catastrophically overfit. End of story.

For over 300 years, one principle governed every learning system: the bias-variance tradeoff. The mathematics was elegant, the logic unassailable. Build a model too simple, and it misses crucial patterns. Build it too complex, and it memorises noise instead of signals.

https://www.pnas.org/doi/10.1073/pnas.1903070116

The models didn’t collapse. After an initial stumble where they appeared to memorise their training data, something extraordinary occurred. Performance began improving again. Dramatically.

The phenomenon earned the name “double descent”—first the expected rise in error as models overfit, then an unexpected second descent as they somehow transcended overfitting entirely. Mikhail Belkin and his colleagues, who documented this discovery, noted it “contradicts conventional wisdom derived from bias-variance analysis.”

Hidden within every large network, they found “winning tickets”—tiny subnetworks that could match the full network’s performance. They could strip away 96% of parameters without losing accuracy. The vast majority of every successful network was essentially dead weight.

But here lay the crucial insight: these winning subnetworks only succeeded with their original random starting weights. Change the initial values, and the same sparse architecture failed completely.

The lottery ticket hypothesis crystallised: large networks succeed not by learning complex solutions, but by providing more opportunities to find simple ones. … Training becomes a massive lottery draw, with the best-initialised small network emerging victorious whilst billions of others fade away.
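The prune-and-rewind step behind a “winning ticket” can be sketched in a few lines. This is a minimal illustration, not the researchers’ code: it assumes the common magnitude-based pruning criterion (keep the weights that ended up largest after training), it leaves out the training loop, and the function name `winning_ticket`, the parameter `keep_fraction`, and the example numbers are all hypothetical. Stripping away 96% of parameters, as in the excerpt, would correspond to `keep_fraction=0.04`.

```python
def winning_ticket(initial, trained, keep_fraction):
    """Select a sparse subnetwork and rewind it to its original initialisation.

    initial: weights at random initialisation (before training)
    trained: the same weights after training (training itself is not shown)
    keep_fraction: share of weights to keep, e.g. 0.04 to prune 96%
    Returns (mask, ticket): a 0/1 mask and the masked *initial* weights.
    """
    if len(initial) != len(trained):
        raise ValueError("initial and trained must have the same length")
    k = max(1, round(len(trained) * keep_fraction))
    # Indices of the k weights with the largest magnitude after training.
    by_magnitude = sorted(
        range(len(trained)), key=lambda i: abs(trained[i]), reverse=True
    )
    keep = set(by_magnitude[:k])
    mask = [1 if i in keep else 0 for i in range(len(trained))]
    # Rewind: survivors get their ORIGINAL starting values, not the trained ones.
    ticket = [w0 if m else 0.0 for w0, m in zip(initial, mask)]
    return mask, ticket


if __name__ == "__main__":
    init = [0.1, -0.2, 0.3, -0.4, 0.5]        # made-up starting weights
    after = [0.05, -1.2, 0.02, 0.9, -0.01]    # made-up trained weights
    print(winning_ticket(init, after, keep_fraction=0.4))
```

The rewind to `initial` is the part that matches the excerpt’s point: the sparse structure alone is not the ticket. It is the structure together with its original starting weights, and retraining from a fresh random draw would correspond to replacing `initial` before masking.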

this reframes intelligence itself. … Intelligence isn’t about memorising information—it’s about finding elegant patterns that explain complex phenomena. Scale provides the computational space needed for this search, not storage for complicated solutions.

For AI development, this understanding suggests both promise and limits. Scaling works because larger models provide more lottery tickets, more chances to find optimal solutions. But this mechanism implies natural bounds. As networks become more successful at finding minimal solutions, additional scale yields diminishing returns.

This aligns with expert concerns about current approaches’ limits. Yann LeCun argues that fundamental architectural constraints may prevent language models from achieving true understanding regardless of scale.

The accidental discovery that revolutionised AI offers a profound lesson: the universe often holds elegant surprises for those bold enough to test conventional wisdom’s boundaries.

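That is why some people say the economic force driven by Ass far exceeds the one driven by AI (artificial intelligence). OnlyFans alone, a single platform, takes in more revenue than all the new AI companies put together. Yes: add up every AI company you can name, OpenAI, Midjourney, Runway, and so on, and together they still cannot match what OnlyFans earns.

AI is in charge of changing the future; Ass is in charge of satisfying the present. Apparently people still are not that fond of delayed gratification...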

Less is more

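Changing the World and Not Being Changed by the World

When buying stocks, you should pick companies that will not be changed by the world, and companies that can change the world.

A company that will not be changed by the world has a business with a built-in steadiness that technological progress does not affect, so by nature it should not be “advanced”. Anything that owes its edge to leading a technology race is not what Dan Bin means by a company that will not be changed by the world, because most things built on technology are short-lived.

We used to be in A-shares, so we worked hard to find companies that would not be changed by the world; now we have stepped into international markets, so we look for companies that can change the world.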

How does Palantir help organizations comply with data protection and security requirements?

We refer to the systematic mapping of data, logic, and action to meaningful semantic concepts as an “ontology.” Organizations benefit from building and using an Ontology to organize and leverage their data, enabling connectivity at scale to view local decisions in a more global context, interpretability for more robust and effective analysis, economies of scale with the ability to build entire applications and use cases in the existing Ontology, decision capture through write back actions made in-platform to original source systems, and operational AI/ML directly in-platform.

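112. Talking with Guangmi About the Quarterly LLM Report: Divergence and Convergence, All-in-One Bundles and Vertical Integration, the L4 Experience and the Mining Window

Jewish Finance, Chinese AGI

The first three, OpenAI, Gemini, and Anthropic, are AI labs with what I would call an intelligence-first culture. Whether you put intelligence first or product first has a big effect on how a team is staffed and how it thinks about problems. My guess is that Mira is product-first, because Mira may feel that existing intelligence is already good enough to build great products, and so is exploring the next generation of products and of interaction. I find that kind of priority quite interesting too. Third, I think Mira is one of the people in the world who might be suited to be CEO of Apple: a product-first philosophy also matches Apple. We can watch whether Apple’s board invites Mira over in the next year or two; I think that is something to look forward to, wild as the guess may be. If Apple does not have a team that truly understands AI, it will be hard to plan the next generation of phones at all. What is the core of the next phone? Newer, more proactive interaction, or a 24-hour always-on capability? Without that, I do not think Apple’s management can plan the next generation of phones.

If you had 100 yuan to buy stocks today, whose would you buy?

If I answer for today’s slice of time: I would put 40 yuan into OpenAI, 40 yuan into ByteDance stock, 10 yuan into Anthropic, and 10 yuan into Google.

Since 2025 began, I feel ChatGPT has gone through a mass-market broadening of its brand. Its mindshare and brand moat in particular have grown much stronger this year, its growth is still accelerating, and its curve is steeper than anyone else’s. It may have crossed some threshold of mass-user penetration. Ask the new AI users around you today, and a large share of them treat ChatGPT as their first-choice AI tool.

If I were the CEO of an AI startup today, how would I face the competition from the model companies above me? I hold no strong cards in my hand; I can only grit my teeth and keep going.

Many of the last generation’s product managers started out as programmers, and people with a programming background know which features can actually be built. So we lean toward thinking that this generation’s product managers will most likely come from algorithms or models and also have a good sense for product or business. Otherwise you cannot make good use of the key gains from the models, and you cannot judge how the models will change over the next 6 to 12 months. So we definitely lean toward looking among the people who understand models well.

When you have been in the United States for a long time, especially in California, one thing stands out: the core of America today is two things, the financial power to mint the currency and Silicon Valley’s technological lead. Apart from these two, nearly everything else America has is being eaten away by China’s industrialization, and eaten away badly.

So this is the non-consensus view: language is a very special thing, quite different from code, patterns, and the like. That is why I think it may be non-consensus. Robots may not arrive that quickly; they may still need to get through several GPT-4-level steps of technical maturity.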

💻🕵️ THE NVIDIA AI GPU BLACK MARKET 💸 | Smuggling, Corruption & Global Scandal 🌍🔥

Conversation

Claude 4 just refactored my entire codebase in one call.

25 tool invocations. 3,000+ new lines. 12 brand new files.

It modularized everything. Broke up monoliths. Cleaned up spaghetti.

None of it worked. But boy was it beautiful.

The leverage paradox

When my family and I were driving through France a few years ago, we were enchanted by the hundreds of storybook cows grazing on picturesque pastures right next to the highway.

Then, within twenty minutes, we started ignoring the cows. … Cows, after you’ve seen them for a while, are boring. They may be perfect cows, attractive cows, cows with great personalities, cows lit by beautiful light, but they’re still boring.

A Purple Cow, though. Now that would be interesting.

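How Do You Prove Your Dick Is Bigger Without Taking Off Your Pants?

Talk of comparing dick sizes may carry a whiff of sexism, but oddly enough this pattern of behavior really is more common in men; its other name is “seeing who can piss the farthest”. Males seem to have an innate competitive drive, and it shows up not only in sexual prowess and the ability to excrete but gets projected onto every other area as well.

The Most Remarkable Personal Ability

When a person truly decides to do something, they just go and do it. When a person decides not to do something while admitting they really ought to, oh, then the words pour out: everything from astronomy above to geography below, with an analysis of power structures thrown in. People who normally cannot be hammered into saying a word can suddenly dash off a few hundred words of rigorously argued, well-evidenced essay, all to make one point: it is not feasible.

So this is what I really think: you do not need to tell me why it is not feasible, why it is pointless, why it cannot be done, and above all do not write me a few-hundred-word essay. I do not want to read it; I did not invite you to discuss it with me or ask for your view. Most things are not feasible most of the time. Nobody needs to argue that, and still less does anyone need to inform me of it. If you do not want to do something, you do not need my consent or my approval. That is your own business and your own decision, and it needs no second person as a witness.

Given this, I think the most remarkable personal ability is simply to act as soon as you think of something. A person does not need any higher-order or more powerful ability; this one alone already puts them ahead of more than 90% of the population. When we discuss how to improve and cultivate personal abilities, or whether a project is feasible, that should be an internal discussion among the remaining 10%.

If You Can’t Even Find a Porn Site, What Can You Do?

Survival and reproduction are the most stubborn and powerful needs of any species. Since humans invented the condom, reproduction has no longer been the point, and sex has become a form of entertainment. Even so, its driving force remains extraordinarily strong. As I see it, whether the libido is vigorous indicates whether one’s life force is still strong. If even this interest is gone, it is hard to believe the person can accomplish much of anything else.

Those who are meant to meet always do. This is simply because all the people who finally manage to meet share the same attitude toward life, and therefore the same capacity for action, and so in the end they always become the happy few.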