AI · Web3 · Tech trends and insights at a glance
It was no accident that 'Learning to Reason with LLMs' and 'DeepSeek-R1' topped Hacker News side by side. As brute-force pretraining scaling hits diminishing returns, rewarding reasoning directly through reinforcement learning and letting models think longer at answer time have become the new battleground. The shift reshapes HBM demand curves, datacenter power design, and the odds of an open-weight catch-up.
When two posts climbed the Hacker News front page on the same day, the pairing read like a thesis statement. One described training a model to unfurl a long chain of thought before committing to an answer. The other was DeepSeek-R1, claiming that this very capability could be coaxed out through reinforcement learning rewards alone, without an elaborate scaffold of supervised reasoning traces. They pointed in the same direction: the formula that had governed the field for years (bigger models, more data, longer pretraining) had reached a point where it no longer pushed capability upward in a straight line, and a different lever was quietly taking its place.
The logic of pretraining scaling was seductive precisely because it was simple. Loss fell predictably as parameters and tokens grew together, and that empirical regularity became a coordinate system for pouring capital in a single direction. But the curve grew expensive on one axis and gentle on the other. High-quality text is finite, and doubling a cluster did not double a model's ability to reason through a hard problem. What DeepSeek-R1 demonstrated is that a base model of fixed size can leap forward on verifiable tasks like mathematics and code when arriving at a checkably correct answer is itself turned into the reward signal. Left to optimize against that signal, the model began generating longer deliberations on its own, second-guessing intermediate steps and revisiting its own conclusions. The locus of capability shifted from knowledge frozen into pretrained weights toward computation summoned in the moment of answering.
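To make "a correct answer turned into a reward signal" concrete, here is a minimal sketch of a rule-based, verifiable reward for a math task. It assumes the model is asked to put its final answer inside `\boxed{...}`. That convention and the function names are illustrative assumptions, not details taken from the posts discussed here.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in the text, or None.

    Assumption: answers contain no nested braces; this is a toy parser.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Score 1.0 when the extracted answer equals the reference, else 0.0.

    Only the final answer is checked. However long the chain of thought
    before it, the reward depends on the outcome alone.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0
```

A reinforcement learning loop would sample several completions per prompt, score each with a function like this, and push the policy toward the higher-scoring ones. Nothing in the reward asks for longer reasoning directly, so any lengthening of the deliberation would have to emerge because it raises the hit rate.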
The economics of this move run opposite to those of pretraining. Pretraining front-loads an enormous one-time cost, after which inference is comparatively cheap. Inference-time reasoning inverts that ledger: every time the model meets a hard question it generates thousands or tens of thousands of fresh tokens of thought, so cost accumulates not at the training stage but at the serving stage, scaling with usage rather than being paid once. A genuinely new design variable enters the picture — trade more thinking time for more accuracy — and with it the prospect that reasoning demand expands without a natural ceiling as adoption grows.
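The inverted ledger can be sketched as a toy cost model. Every number below is a placeholder chosen for round arithmetic, not a measured or reported figure, and the function names are hypothetical.

```python
def amortized_training_cost(train_cost_usd: float, total_queries: int) -> float:
    """One-time training cost spread over every query ever served."""
    return train_cost_usd / total_queries


def serving_cost(queries: int, tokens_per_query: int,
                 usd_per_million_tokens: float) -> float:
    """Total generation cost, which grows linearly with usage and with
    the number of tokens (thinking plus answer) spent on each query."""
    return queries * tokens_per_query * usd_per_million_tokens / 1_000_000


if __name__ == "__main__":
    TRAIN_COST = 100_000_000      # placeholder: one-time training bill, USD
    PRICE = 2.0                   # placeholder: USD per million generated tokens
    for queries in (10**6, 10**8, 10**10):
        per_query_train = amortized_training_cost(TRAIN_COST, queries)
        short = serving_cost(queries, 500, PRICE)       # direct answer
        long = serving_cost(queries, 20_000, PRICE)     # long deliberation
        print(f"{queries:>14,} queries | training/query ${per_query_train:,.4f} "
              f"| serving short ${short:,.0f} | serving long ${long:,.0f}")
```

Under these assumed inputs the training share of each query shrinks as usage grows, while the serving bill rises in proportion to both query volume and thinking length. That proportionality is the sense in which the cost is paid with usage instead of once.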
The first thing to wobble downstream is the character of memory demand. Producing a long chain of thought means holding a long context and piling key-value caches into memory, which strains bandwidth and capacity before it strains compute cores. If the story propping up HBM demand was once memory to train ever-larger models, the reasoning era reframes it as memory to serve the long deliberations of many concurrent users. Training demand concentrates in a handful of giant clusters; inference demand spreads broad and thick across the entire service frontier. The very shape of the demand curve changes.
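A back-of-the-envelope estimate shows why long deliberations press on memory first. The sketch below uses the usual sizing rule for a transformer with standard multi-head or grouped-query attention: one key and one value vector per layer, per KV head, per token. The model shape in the example is hypothetical, and the estimate ignores cache compression, quantization, and paging schemes that real serving stacks may apply.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2,
                   concurrent_sequences: int = 1) -> int:
    """Bytes needed to hold keys and values for every cached token.

    The leading factor of 2 accounts for storing both a key and a value.
    bytes_per_value=2 corresponds to 16-bit formats such as FP16 or BF16.
    """
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * concurrent_sequences


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(num_layers=32, num_kv_heads=8, head_dim=128)
    one_user = kv_cache_bytes(seq_len=32_768, **shape)
    many_users = kv_cache_bytes(seq_len=32_768, concurrent_sequences=64, **shape)
    print(f"one 32K-token sequence: {one_user / GIB:.0f} GiB")
    print(f"64 concurrent sequences: {many_users / GIB:.0f} GiB")
```

With this assumed shape a single 32K-token sequence needs 4 GiB of cache and 64 concurrent ones need 256 GiB, before counting the weights themselves. The cache grows linearly in both sequence length and concurrency, which is why serving many long chains of thought is a capacity and bandwidth problem before it is a compute problem.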
Datacenter power design feels the same pressure from a different angle. Training is close to a predictable, sustained load, whereas inference is a volatile one that swells and recedes with user traffic and question difficulty — and a model tuned to think harder spends more tokens and more watts on the same prompt. With power already the real bottleneck on datacenter expansion, operators now weigh the marginal kilowatt-hour against marginal accuracy. The better the answer you want, the larger the electricity bill and the carbon footprint that come attached.
The same shift, almost paradoxically, hands the open-weight camp a credible route to catch up. The pretraining scaling race was a game for the few who could marshal astronomical capital, but reinforcement-learned reasoning can be layered onto a published base model with comparatively modest resources. The proof arrived quickly: once DeepSeek released R1's weights and recipe, a wave of follow-on work reproduced and remixed the same approach. If the key to reasoning lies more in well-designed verifiable rewards than in a proprietary data pipeline, the moat the closed labs have enjoyed may be shallower than assumed. What DeepSeek-R1 opened was not merely one smarter model, but a new phase that relocated the axis of competition, the center of gravity of cost, and the height of the barrier to entry all at once.
The Land-Permit Paradox of Korea's Chip Belt: When the Cluster's Boom Prices Out Its Own Engineers
Dongtan, Giheung, and Guri have been folded into Korea's land-transaction permit regime just as the AI chip capex boom reshapes the property market around the country's largest fabs. The very prosperity the cluster generates is raising the cost for the engineers it depends on to settle nearby. The real test of agglomeration may lie not in siting megafabs but in housing and labor mobility.
The Collapse of the Closed AI Moat and the Supply-Chain Paradox of Unverifiable Weights
DeepSeek-R1's open reasoning weights and Llamafile's single-file distribution are eroding the performance and distribution moats that closed labs once charged a premium for. Yet the same openness collides head-on with the gap exposed by the "250 samples to break an LLM" research: weight distribution that no recipient can verify. Democratized competition and accumulated security debt now sit on the same scale.
Forty-Year Yen Lows as the Hidden Subsidy Behind Japan's Chip Revival
As the yen slides into its weakest territory in four decades, Takaichinomics has entered uncharted monetary terrain. A cheap yen functions as a silent subsidy for Rapidus, Kioxia, and TSMC's Kumamoto fabs—yet the same currency inflates the cost of imported tools and materials and intensifies the talent war with Korea. The question is whether monetary policy can stand in for industrial policy, and what that means for Korea's memory champions.