🎙️Podcast Review: 20VC w/ Groq Founder Jonathan Ross
Scaling Laws, AI Growth Constraints, and AI Economics
Key Takeaways:
Plan for Better, Faster, and Cheaper AI: Founders should build with the assumption that AI capabilities will continue improving exponentially. Scaling laws matter more than short-term optimizations.
AI Growth Bottlenecks: The three biggest constraints on AI growth are data, algorithms, and compute. High Bandwidth Memory (HBM) supply is a critical chokepoint today, and power availability will soon become the biggest limiting factor.
The Inference Wars: NVIDIA dominates training but Groq is leading inference, offering 5x lower cost and 1/3rd the energy per token. The real battlefield is tokens per dollar and tokens per watt, not raw TFLOPs.
Datacenters Are Energy, Not Just Real Estate: The AI race isn’t about who has the best chips, but about who controls the power. Inference demand is 20x larger than training, and in 3-4 years, power constraints will slow AI more than chip shortages.
AI Business Models Must Optimize for Scale, Not Spend: Startups shouldn’t focus on spending more; they should focus on how much they can scale efficiently. NVIDIA enjoys 70-80% margins, Groq operates at 20% but prioritizes speed and scale.
The Big O of Organization Scale and Hiring: AI startups should scale teams logarithmically, not linearly. If customer growth doubles, you shouldn’t need to double your headcount. Being able to scale logarithmically is a sign you are doing the right things, like investing in automation. Hire generalists, and favor people who lock in wins early (i.e. loss bias).
China and Europe: China is winning through brute force deployment of compute, but talent that wants to innovate is likely to leave due to government constraints. Europe lacks an AI “Silicon Valley” due to cultural risk aversion—it needs a "Risk-On" enclave to foster AI breakthroughs.
The Path to AGI? Solving Hallucinations is critical. Whoever fixes AI hallucinations first will define the industry. Agentic AI (autonomous reasoning systems) comes after this milestone, but hallucination errors must be solved before AI can truly replace high-risk human decision-making.
The Economics of AI: More money will be made than incinerated in the long-term. AI is in a Keynesian beauty contest, where marketing and hype matter as much as technology. Most AI startups will fail, but the winners will dominate markets for decades.
The Future of AI: The real economic opportunity isn’t in training models but in owning infrastructure, inference, and execution. Prompt engineering will drive the next wave of creativity and business innovation, opening value extraction from AI to people beyond just engineers.
The Freight Train vs. The Mopeds
Jonathan Ross might just be a cyborg. Listening to his conversation with Harry Stebbings on 20VC felt like downloading a GPU-accelerated stream of insights on AI, compute economics, and organizational scaling: Big O notation applied to scaling companies, Keynesian beauty contests, the looming bottlenecks in AI infrastructure, and the roadmap to AGI. One analogy that stuck with me is that AI compute is like transporting coal across a city using freight trains versus mopeds.
If you're using edge computing, you’re essentially moving coal via mopeds—expensive, inefficient, and constrained by individual vehicle capacity.
If you centralize compute in data centers, you’re running a freight train—economies of scale, lower energy per token, and higher throughput.
And if you’re NVIDIA, you're monopolizing the railway lines and charging premium freight rates.
The Big O of AI: Scaling Laws and Compute Bottlenecks
Ross highlighted one of the most under-appreciated truths in AI: Founders should build with the mindset that AI will get better, faster, and cheaper. That means thinking in terms of scaling laws, not just short-term optimizations.
Big O Notation in AI Scaling
The difference between O(log n) and O(n^2) is the difference between a company that scales effortlessly and one that gets buried in inefficiencies (a quick sketch follows below).
GPUs vs. LPUs: Traditional GPUs process AI workloads like O(n^2)—great for brute force, but inefficient at scale. LPUs (Language Processing Units) are designed for O(log n) efficiencies, dramatically improving throughput and cost per token.
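To make that gap concrete, here is a minimal Python sketch with purely illustrative constants (not any vendor’s actual cost model) comparing how a hypothetical cost metric grows under O(n^2) versus O(log n) as the workload size n increases.

```python
import math

# Illustrative only: compare how a hypothetical per-workload "cost" grows
# under O(n^2) vs. O(log n) scaling as the workload size n increases.
# The constants are made up; only the shapes of the curves matter.

def cost_quadratic(n: int, c: float = 1e-3) -> float:
    """Brute-force style scaling: cost grows with the square of n."""
    return c * n ** 2

def cost_logarithmic(n: int, c: float = 1.0) -> float:
    """Architecture-friendly scaling: cost grows with log n."""
    return c * math.log2(n)

for n in (1_000, 10_000, 100_000, 1_000_000):
    q = cost_quadratic(n)
    l = cost_logarithmic(n)
    print(f"n={n:>9,}  O(n^2)={q:>16,.0f}  O(log n)={l:>5.1f}  ratio={q / l:>14,.0f}x")
```

The curves start close together, but by a million-unit workload the quadratic cost is tens of millions of times larger, which is the whole point of designing for scaling laws rather than short-term optimizations.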
The Three Bottlenecks of AI:
Compute (Hardware) → Limited supply of High Bandwidth Memory (HBM) from Samsung & Micron.
Data (Training & Inference) → Synthetic data is proving more effective than human-generated data.
Algorithmic Efficiency → Moore’s Law is slowing, but architectural improvements (like keeping model parameters on-chip) are stepping up.
And yet, inference is 20x the market of training. So while NVIDIA dominates training, the real war is happening in inference efficiency.
Training + Inference: NVIDIA and Groq
NVIDIA has a 70-80% margin in AI compute. Groq, in contrast, operates with a 20% margin but makes up for it in speed and cost efficiency:
Groq’s inference is 5x lower in cost and uses 1/3rd the energy per token.
40% of NVIDIA’s profit comes from inference, yet their model isn’t optimized for it.
Enterprise buyers focus on specsmanship (TFLOPS and other headline numbers), yet what really matters is tokens per dollar and tokens per watt (a rough comparison follows below).
This is where business model innovation becomes more important than raw compute power. The limit isn’t how much we can spend, but how we scale our investments.
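As a rough illustration of what shopping on tokens per dollar and tokens per watt looks like, the sketch below ranks two entirely hypothetical accelerators by those two metrics. None of the throughput, cost, or power figures are published Groq or NVIDIA specifications.

```python
# Illustrative sketch: rank accelerators by tokens per dollar and tokens per
# watt rather than raw TFLOPS. All figures below are hypothetical placeholders,
# not published Groq or NVIDIA specifications.

from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    tokens_per_second: float   # sustained inference throughput
    cost_per_hour_usd: float   # amortized hardware + hosting cost
    power_watts: float         # average draw under load

    @property
    def tokens_per_dollar(self) -> float:
        tokens_per_hour = self.tokens_per_second * 3600
        return tokens_per_hour / self.cost_per_hour_usd

    @property
    def tokens_per_watt(self) -> float:
        return self.tokens_per_second / self.power_watts

fleet = [
    Accelerator("chip_a (hypothetical)", tokens_per_second=400,
                cost_per_hour_usd=2.50, power_watts=700),
    Accelerator("chip_b (hypothetical)", tokens_per_second=900,
                cost_per_hour_usd=1.80, power_watts=500),
]

for chip in sorted(fleet, key=lambda c: c.tokens_per_dollar, reverse=True):
    print(f"{chip.name}: {chip.tokens_per_dollar:,.0f} tokens/$, "
          f"{chip.tokens_per_watt:.2f} tokens/W")
```

A chip with modest headline specs can still win once throughput is divided by cost and power, which is exactly the battlefield Ross describes.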
The Economics of AI: Why Power Is the Next Bottleneck
There is a mismatch in economics across the AI value chain. Amortization schedules vary drastically by layer (a rough sketch follows the list):
Training model → 6 months
Buying chips → 3-5 years
Buying data centers → 10-15 years
Power plants → 15-20 years
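A back-of-the-envelope sketch of why that mismatch matters: the same capital outlay implies very different annual costs depending on the asset’s useful life. The $1B figure and the midpoint horizons below are placeholders, not numbers from the episode.

```python
# Rough sketch of why amortization horizons matter: the same capital spend
# implies very different annual cost depending on the asset's useful life.
# Horizons are midpoints of the ranges above; the $1B figure is a placeholder.

CAPEX_USD = 1_000_000_000  # hypothetical spend per layer of the stack

amortization_years = {
    "model training run": 0.5,   # ~6 months before the model is superseded
    "chips": 4,                  # 3-5 year refresh cycle
    "data center shell": 12,     # 10-15 years
    "power plant": 17,           # 15-20 years
}

for asset, years in amortization_years.items():
    annual_cost = CAPEX_USD / years
    print(f"{asset:<20} amortized over {years:>4} yrs -> ${annual_cost:,.0f}/yr")
```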
Ross pointed out that power is the next major constraint in AI infrastructure:
15 gigawatts currently powers all existing data centers.
An additional 20 gigawatts will be needed.
In 3-4 years, power—not chips—will be the limiting factor.
The solution? Diversifying workloads across generic infrastructure.
Data centers are not just real estate; they are energy hubs.
A GPU data center today can be repurposed for EV charging, edge computing, or other workloads.
The AI Talent Equation: How to Scale Without Over-hiring
Ross’s Big O complexity philosophy on hiring is a wake-up call for every founder scaling an AI startup:
If you double customers, do you need to double employees? Not if you’re scaling correctly.
Choose generalists over specialists at early stages.
Invest in automation over hiring at later stages.
The worst trap? Over-hiring before you hit PMF (Product-Market Fit).
This is Amazon’s approach to scaling versus Walmart’s. Amazon doesn’t need to build a second website every time revenue doubles; Walmart has to add stores, inventory, and headcount. AI startups need to scale like Amazon, not Walmart.
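Here is a minimal sketch of that doubling test, with made-up ratios: a team that scales linearly with customers versus one that scales logarithmically. Note that the log-scaling team carries more headcount up front (the automation investment) but barely grows as the customer base multiplies.

```python
import math

# Quick self-test for the "Big O of organization" idea: if the customer base
# doubles each period, compare a team that grows linearly with customers to
# one that grows logarithmically (e.g., through automation). All ratios and
# coefficients below are illustrative placeholders, not Groq's actual numbers.

customers = 1_000
linear_ratio = 1 / 100      # hypothetical: one employee per 100 customers
log_coefficient = 15        # hypothetical: headcount ~ 15 * log2(customers)

for period in range(11):    # ten doublings: 1,000 -> ~1,000,000 customers
    linear_team = customers * linear_ratio
    log_team = log_coefficient * math.log2(customers)
    print(f"period {period:>2}: customers={customers:>9,}  "
          f"linear team={linear_team:>7,.0f}  log team={log_team:>4,.0f}")
    customers *= 2  # customer base doubles each period
```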
AI Outside of the US
Two regions stand at critical junctures:
China’s Scaling Strategy
This is like Sputnik 2.0 for the US.
China is winning at model distillation.
China may not have chip efficiency, but they have sheer scale. If chips aren’t efficient, they deploy more.
Yet, top talent is fleeing China due to regulatory constraints (Jack Ma effect).
Europe’s AI Problem: Risk Aversion
The biggest barrier to AI innovation in Europe? A culture of risk aversion.
Ross’s suggestion? Create enclaves of “Risk-On” founders who think like Silicon Valley.
The Future: AI, Energy, and the Next 20 Years
This isn’t just another AI bubble—though a lot of money will be incinerated, even more will be made.
Who will define AI? The ones who solve hallucinations first.
Who will dominate inference? The ones who optimize for tokens per dollar and tokens per watt.
Where is the next breakthrough? Agentic AI that doesn’t just predict—but invents.
Ross’s closing thought?
“If I knew I couldn’t fail, I’d place 100% of orders for every chip. But scaling AI isn’t about spending—it’s about surviving.”
In the AI arms race, it’s not the best technology that wins—it’s the best execution.
My Ask
Thank you for reading this article. I would be very grateful if you complete one or all of the three actions below. Thank you!
Like this article by using the ♥︎ button at the top or bottom of this page.
Share this article with others.
Subscribe to the elastic tier newsletter! (Note: please check your junk mail if you cannot find it.)
References
Scaling Laws white paper (Kaplan et al., OpenAI): https://arxiv.org/abs/2001.08361
7 Powers (Cornered Resource): https://7powers.com/
LPUs: https://groq.com/wp-content/uploads/2024/07/GroqThoughts_WhatIsALPU-vF.pdf
$1.5bn Investment in Groq (Rev share): https://www.prnewswire.com/news-releases/saudi-arabia-announces-1-5-billion-expansion-to-fuel-ai-powered-economy-with-ai-tech-leader-groq-302372643.html
Monopsony: https://en.wikipedia.org/wiki/Monopsony
Jevons Paradox: https://books.google.com/books?id=gAAKAAAAIAAJ&q=editions:AAotKDT6KKcC&pg=PR3#v=onepage&q=editions%3AAAotKDT6KKcC&f=false