Why Enterprise AI Deployment Gets Stuck: Lessons from Mistral AI's Approach

(Illustration: Layer upon layer, every piece matters. Like this ham air-drying in the breeze, sacrificing itself for the greater good. Trust can’t be built all at once; it has to be stacked, one layer at a time. The first layer? Start by hitting that like button. Image source: Ernest.)

✳️ The Boring Plumbing Work

What blocks enterprise AI deployment is almost never model quality. Mistral AI CTO Timothée Lacroix put it bluntly: today’s model capabilities are already sufficient to unlock massive enterprise value, but the connectors, data formats, and permission management all have to be right first. Only once this “boring plumbing work” is done will enterprise token consumption truly take off. He used the word “plumbing” three times during the interview (and I’d stress the “system” part, not just the pipes). We’re still in the construction phase, he noted: most enterprises haven’t even gotten basic data connectivity right, let alone started running AI agents that execute tasks at scale in the background. (What we see on the ground is even bleaker: wishlists that don’t map to actual data. No wonder everyone jumps straight to word-chaining chatbots. Sarcasm mode engaged.)

The Man Who Built Claude Code Was Changed By It

Post Title Image (Illustration: Mt Rainier in Seattle, Reflection Lake inside the national park. Facing the mountain, facing time, what can we humans do in a post-AGI world? At the very least, being a decent human being should be the baseline, you demons. I am not sure I could live in the countryside, but I do enjoy having nothing to distract me. Then I found out that a Chinese idiom meaning “undistracted focus” is also a Year of the Horse blessing?! Image source: Ernest.)

✳️ This man has not touched a single line of code since November

I recently listened to Lenny’s Podcast interview with Boris Cherny, the person in charge of Anthropic’s Claude Code, formerly one of the most productive engineers at Instagram. Since last November, 100% of his code has been written by Claude Code. He has not manually edited a single line, shipping 10 to 30 PRs per day, and during the recording he had 5 agents running simultaneously. Anthropic's engineering team grew 4x while each engineer's output increased by 200%. In his previous role at Meta, he was responsible for code quality across the entire company. Back then, hundreds of engineers spending a full year would typically improve productivity by only a few percentage points. Now it is hundreds of percentage points. A completely different order of magnitude.

From Vibe Coding to Agentic Coding: Clear Communication Is the New Bottleneck

Post Title Image (Illustration: Taking the day off. Happy Lunar New Year! Image source: Business Next.)

✳️ Coding is easy, Context is hard

I recently recorded episode 228 (EP228) of the “Digital Keywords” podcast with James from Business Next, discussing a core theme I kept chewing on throughout my 2025 year-in-review.

AI has crossed the threshold in writing code. It is no longer a small assistant that auto-completes one line at a time. Take Claude Code for example: it reads through your entire project directory on its own, understands cross-file context and dependencies, plans how to coordinate changes, even dispatches sub-agents to handle different tasks in parallel, and then delivers an entire feature in one go. More importantly, it has memory. You write your project specs, goals, and coding style into a file, and every time it starts working, it reads that file first. No need to re-explain everything from scratch.

From Cline and RooCode, to Cursor, to Claude Code and Kiro, my team and I have walked the entire tool evolution path together, feeling the role shift at every step. Humans can no longer compete with AI on speed. The bottleneck has moved from “think fast, code fast, iterate fast” to “can you articulate your requirements, context, and intent clearly?” I joked with my team: we all need to start practicing how to communicate well. This is not just an observation. It is the firmest conclusion I reached after being pushed through all of 2025, and the central theme of this podcast episode.

Coding Is Just Typing, Then What? From Explicit Programming to Implicit Programming

Post Title Image (A dish that breaks the rules of the game can show up even during a casual meal. Turns out Greek cuisine has its own version of sashimi, paired with a light touch of avocado mousse that balances out the richness of the tuna, making it the star among the bite-sized appetizers. At Taverna in Palo Alto, a dancing silhouette tucked in the corner of the menu reminds us: “Every day is a gift.” We will live well, and challenge the rules together. Image source: Ernest.)

✳️ Typing has become a commodity

Jensen Huang, holding a glass of white wine at the late-night session of Cisco AI Summit 2026, said: “Coding as it turns out is just typing. And typing as it turns out is a commodity.”

This isn’t putting down engineers. For 60 years, we’ve been telling computers exactly what to do, line by line in Fortran, C, C++. That’s called explicit programming. Now we’re entering the era of implicit programming, where we tell computers our intent and they figure out how to solve it. Just like that, the scarcity of “being able to write code” has vanished.

Since last Lunar New Year, I’ve gone from Cline to Cursor to Claude Code, and my deepest takeaway is: Coding is Easy, Context is Hard. When AI can write the code we want in seconds, the bottleneck has long shifted from typing speed to whether we can articulate the context of the problem, the boundary conditions, and the edge cases clearly. That’s why I moved from Vibe Coding toward Spec-Driven Development combined with organizational workflow frameworks: define “what to do” and “why” in human language first, then let AI handle “how to do it.”

2025 Year in Review: Slow Down to Go Fast

(Caption: Deliberately captured Tokyo Ginkgo with the Ginkgo Camera GR IV. Image source: Ernest Chiang.)

Sometimes you got to slow down to go fast.

This phrase kept echoing in my mind throughout 2025. While every industry chases the efficiency gains promised by AI, and anxiety about the future and artificial intelligence pervades, I kept returning to this note in my journal: “True progress often comes from deliberate pauses and deep thinking.”

The hard part isn’t stopping or slowing down—it’s being deliberate.

Deliberate practice, deliberate thinking, deliberate connection, deliberate verification.

From Snoopy to Sony: How Japan's Entertainment Giant is Building Global Franchises Through IP Acquisition

Post Title Image (Illustration: At AWS re:Invent 2025 CEO Keynote in Las Vegas, listening to Sony CDO’s eight-minute speech on Technology to “Create” and “Deliver” KANDO. Image source: Sony.)

✳️ tl;dr

  • Ever since hearing the Sony CDO’s eight-minute speech on Technology to “Create” and “Deliver” KANDO at the AWS re:Invent 2025 CEO Keynote in Las Vegas two weeks ago, I have been impressed by AWS’s nuanced breakdown of AI, and I deeply admire how an established company like Sony integrates Amazon Bedrock into its own KANDO culture and execution. [1][2]
  • Sony acquired a controlling interest (80% stake) in Peanuts Holdings for $460 million. Combined with its initial $185 million investment in 2018, that totals $645 million over seven years. The transaction is expected to generate a revaluation gain on equity, recorded as operating income. [3]
  • This is a case of balancing financial discipline with strategic vision. Sony chose staged investment rather than outright acquisition, validating the business model before increasing investment, reducing risk while building partnerships. [3]
  • Peanuts generates $2.5 billion in annual retail sales, with holiday products contributing $500 million. Compared to Sony’s $185 million investment in 2018, the brand demonstrates strong cash flow generation capacity and stable licensing revenue. [4]

  • The entertainment share of Sony’s revenue grew from 26% in fiscal 2012 to 60% in fiscal 2023, with content IP investment accounting for 57% of strategic investments (1.5 trillion yen). This shows the company’s successful transformation into a content-driven entertainment group. [5]
  • Sony CEO Yoshida and CFO Totoki emphasize “Creation Shift” and “Creative Entertainment Vision.” Totoki stated: “I am obsessed with growth. When growth stagnates, you fall into a negative spiral.” [6][5]
  • Sony’s recent IP investments include a 10% stake in Kadokawa (50 billion yen), a 2.5% stake in Bandai Namco (68 billion yen), and the acquisition of Crunchyroll. These investments build an anime and gaming IP ecosystem. [7][8]
  • Compared to studio acquisitions (Bungie and Firewalk underperformed), direct IP investment offers more controllable risk. IP can be monetized across multiple platforms without being constrained by single-studio operational risks. [9]
  • An exclusive Apple TV+ streaming agreement through 2030 ensures long-term revenue visibility. Platforms are willing to pay a premium for classic IP because it attracts multi-generational audiences and builds cultural resonance, reducing churn. [10]
  • Sony plans to spin off part of its financial services business in 2025 to focus on entertainment and content creation. This demonstrates management’s determination to simplify the business portfolio and improve capital allocation efficiency. [5]
  • WildBrain received $460 million from the sale, to be used for debt repayment and investment in Strawberry Shortcake, Teletubbies, and digital content networks. For WildBrain, this represents portfolio optimization, focusing on core assets. [3]
  • (Speculation) Sony may increase Peanuts brand value by 50-100% within 5-10 years through cross-media integration (gaming, music, film) and expansion into new markets (Asia, Latin America).
  • The Schulz family retains a 20% stake to ensure brand heritage and quality control. This equity structure balances commercial interests with cultural legacy protection, crucial for long-term brand value. [3]

  • Peanuts’ cultural significance in Japan (inspiring Hello Kitty’s creation) provides Sony with unique advantages. Sony may strengthen Asian market expansion, as two-thirds of revenue already comes from outside the U.S. [11][4]
  • This case demonstrates how strategic IP acquisition builds lasting competitive advantage rather than chasing short-term trends. Peanuts’ 75-year history proves the enduring value and cross-generational appeal of classic IP. [4]
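
The staged-investment figures in the list above can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in the bullets; the variable names are illustrative, and retail sales are not the same thing as Sony's licensing revenue, so the holiday share says nothing direct about returns.

```python
# Figures quoted in the tl;dr bullets, in millions of US dollars.
INITIAL_INVESTMENT_2018 = 185      # Sony's first stake, 2018
CONTROLLING_STAKE_PURCHASE = 460   # purchase that brought Sony to 80%
ANNUAL_RETAIL_SALES = 2_500        # Peanuts annual retail sales
HOLIDAY_RETAIL_SALES = 500         # portion attributed to holiday products

# Total cash committed across both stages of the investment.
total_invested = INITIAL_INVESTMENT_2018 + CONTROLLING_STAKE_PURCHASE

# Share of annual retail sales that comes from holiday products.
holiday_share = HOLIDAY_RETAIL_SALES / ANNUAL_RETAIL_SALES

print(f"Total invested: ${total_invested}M")
print(f"Holiday share of retail sales: {holiday_share:.0%}")
```

The total matches the $645 million stated above, and holiday products work out to one fifth of annual retail sales.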

Is AI a Bubble? Howard Marks Dissects the $5 Trillion Infrastructure Bet

Post Title Image (Illustration: Frozen soap bubble. Image source: Photo by Jill Warvel on Unsplash.)

✳️ tl;dr

  • Howard Marks believes AI shows signs of “irrational exuberance,” but bubbles can usually only be identified in retrospect, and current valuations, while high, have not yet reached crazy levels. [1]
  • “One of the most interesting aspects of bubbles is their regularity, not in terms of timing, but rather the progression they follow. Something new and seemingly revolutionary appears and worms its way into people’s minds. It captures their imagination, and the excitement is overwhelming. The early participants enjoy huge gains. Those who merely look on feel incredible envy and regret and – motivated by the fear of continuing to miss out – pile in.”

  • AI exhibits bubble characteristics: revolutionary technology, FOMO-driven speculation, extremely high valuations, but “this time is different” may hold true with a 20% probability. (Everyone wants to predict the future but also hedge their bets?)
  • Circular deals raise concerns: Nvidia invests $100 billion in OpenAI, which uses that money to purchase Nvidia chips, with Goldman Sachs estimating 15% of Nvidia’s sales come from such transactions
  • Data center investment scale is staggering: JPMorgan estimates total AI infrastructure buildout costs at approximately $5 trillion, with spending approaching $500 billion next year
  • Debt financing risks escalate: Oracle, Meta, and Alphabet issue 30-year bonds to finance AI investments, with yields exceeding US Treasuries by only 100 basis points or less
  • Warren Buffett reminds us: automobiles were the most important invention of the first half of the 20th century, but only 3 out of 2,000 car companies survived, proving that technological importance doesn’t guarantee investor profits
  • AWS Hero Ernest recommends that technical decision-makers establish a multi-vendor strategy, avoid sole dependence on Nvidia, and evaluate alternatives such as AWS Trainium and Google TPU for cost control and supply chain resilience

  • AI chips have an actual useful life of only 1-3 years, yet companies use 5-6 year depreciation schedules, and Michael Burry accuses tech giants of inflating earnings as a result [2]
  • Nvidia shifted from a 2-year to an annual product cycle, with Jensen Huang joking: “Once Blackwell starts shipping, you couldn’t give Hoppers away”
  • Anthropic derives 80% of revenue from enterprise customers, with B2B models showing more promise due to higher transaction values, suggesting product strategy should prioritize enterprise markets [3]
  • OpenAI expects to continue massive losses until 2028, with HSBC estimating it won’t be profitable by 2030, requiring an additional $207 billion in funding [4]
  • Google TPU emerges as Nvidia’s strongest competitor, with the 7th-generation Ironwood offering a 2x power-efficiency improvement and 1.4x Nvidia’s cost-effectiveness, targeting 10% market share by 2027 [5]

  • SPVs (Special Purpose Vehicles) are used for data center financing, hiding off-balance-sheet debt, raising concerns similar to the Enron model
  • WEF predicts AI will displace 85 million jobs by 2030 but create 97 million new ones, though 77% of new jobs require master’s degrees, necessitating fundamental HR strategy adjustments [6]
  • 77,999 jobs have already been lost to AI in 2025, an average of 491 people losing their jobs each day, with Microsoft reporting 30% of code written by AI while 40% of layoffs target engineers [7]
  • Historical analogy: The AI bubble resembles the 1860s railroad boom and 1920s aviation bubble, both being “inflection bubbles” that accelerated technology adoption at investors’ expense [1]
  • Current AI giants average a P/E ratio of about 34, lower than the dot-com bubble’s 59, but the Shiller CAPE has reached 40.40, approaching dot-com bubble levels [8]
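
To see why the depreciation schedule mentioned in the list above matters for reported earnings, here is a minimal sketch. It assumes straight-line depreciation and a hypothetical $10 billion chip purchase; neither assumption comes from the source, and real accounting policies vary by company.

```python
def straight_line_annual_expense(cost: float, years: int) -> float:
    """Annual expense when a cost is spread evenly over `years` (straight-line)."""
    if years <= 0:
        raise ValueError("years must be positive")
    return cost / years


# Assumption for illustration only: $10B spent on AI chips.
HYPOTHETICAL_CAPEX = 10e9

# Upper end of the 1-3 year useful life quoted above.
economic = straight_line_annual_expense(HYPOTHETICAL_CAPEX, 3)
# Upper end of the 5-6 year depreciation schedule quoted above.
reported = straight_line_annual_expense(HYPOTHETICAL_CAPEX, 6)
# Expense pushed out of each early year by the longer schedule.
deferred = economic - reported

print(f"Expense over 3 years: ${economic / 1e9:.2f}B per year")
print(f"Expense over 6 years: ${reported / 1e9:.2f}B per year")
print(f"Expense deferred each year: ${deferred / 1e9:.2f}B")
```

Under these assumptions, the longer schedule halves the annual expense, which lifts reported earnings in the early years by the deferred amount even though the hardware may already be obsolete.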

Think in Context: AWS re:Invent 2025 Special Closing Keynote with Dr. Werner Vogels

Post Title Image (Illustration: AWS re:Invent 2025 Special Closing Keynote with Dr. Werner Vogels. Image source: AWS.)

Dr. Werner Vogels, AWS VP and CTO, delivered his final re:Invent keynote after 14 consecutive years since 2012. (Sigh, I was in the audience, feeling sad as I listened.) Instead of announcing new services, Werner presented “The Renaissance Developer” framework with five qualities that define how developers should evolve in the AI era. Guest speaker Clare Liguori demonstrated spec-driven development with the Kiro IDE, and Werner shared stories from his travels across Africa and Latin America to illustrate how developers are solving real-world problems.

✳️ tl;dr

One theme, “The Renaissance Developer,” runs throughout, with five qualities:

  • Be Curious: Experimentation and willingness to fail, the Yerkes-Dodson Law (stress-performance curve), social learning and touching the grass, global travel stories (AJE, Ocean Cleanup, Rwanda Health Intelligence, KOKO Networks), AWS Heroes (265 across 58 countries) (waving).
  • Think in Systems: Donella Meadows’ systems thinking, Yellowstone wolves and trophic cascades, reinforcing and balancing feedback loops, “Leverage Points: Places to Intervene in the System” paper.
  • Communicate: Spec-driven development reduces ambiguity, historical examples (Dijkstra’s structured programming, Apollo Guidance System), Clare Liguori and Kiro IDE (from vibe coding to spec-driven development, feature-driven specs, notification system shipped in half the time), rapid prototyping (Engelbart’s mouse analogy).
  • Be an Owner: Verification debt (AI generates code faster than you can understand it), hallucination challenges, mechanisms vs good intentions (Jeff Bezos / Amazon Andon Cord), S3 durability reviews, code reviews are more important than ever in the AI era.
  • Become a Polymath: I-shaped vs T-shaped developers, Jim Gray (Turing Award, transactions, Sloan Digital Sky Survey), deep domain expertise combined with broad knowledge.

✳️ Live Experience

(Caption: The keynote lineup at this year’s AWS re:Invent annual developer conference was significantly reshuffled from previous years. When I first saw the agenda, I noticed Monday Night Live had disappeared, replaced by a Special Closing Keynote as the grand finale of all five keynotes. My immediate thought was “Could it be…” — and it wasn’t until I arrived in Las Vegas that I confirmed “It’s really happening…” I made sure to clear my schedule to be there in person, to pay tribute to the CTO who essentially inspired me to start deconstructing and integrating the world. By the end of the talk, I was genuinely moved by this man on stage. I’ll never forget the energy when he looked at us and said “Now, Go Build!”)

(Caption: Before this year’s keynote, I was fortunate to meet CTO Werner Vogels again at a private dinner. Looking forward to his next chapter.)

Think in Context: AWS re:Invent 2025 Partner Keynote with Dr. Ruba Borno

Post Title Image (Illustration: AWS re:Invent 2025 Partner Keynote with Dr. Ruba Borno. Image source: AWS.)

Dr. Ruba Borno, VP of Global Specialists and Partners at AWS, delivered the Partner Advantage Keynote at re:Invent 2025 with a theme drawn from Arthur C. Clarke: "Magic is simply science we don't yet understand." Through live customer stories and product launches, she revealed the science behind extraordinary partner-driven outcomes. From Conde Nast flipping to 70% digital revenue and saving $10M, to Toyota realizing $1B in supply-chain AI value, to AWS Marketplace producing three billionaire partners (Datadog $2B, Snowflake $3B, Salesforce $3B), the keynote demonstrated how partners act as the catalyst that turns AWS technology into transformative customer results. The closing segment featuring Kiwa Digital’s indigenous AI platform challenged the audience to consider AI not just as a tool for disruption, but as a guardian of 40,000-year-old cultural heritage.

✳️ tl;dr

One theme, “Partners as Catalyst,” runs throughout, with six topic areas:

  • Agentic AI + Partner Foundation: AI Competency (300+ partners globally), customers working with competency partners are 30% more likely to deploy AI into production and move 25% faster; three new Agentic AI Competency categories (Applications, Tools, Consulting Services) launched with 50% more marketing development funds.
  • Customer Transformation Stories: Conde Nast flipped to 70% digital / 30% print with $10M cost reduction across 66 digital properties; A3Data + Mater Dei built 12 autonomous agents on AgentCore achieving 517% ROI, reducing procedure authorizations from 2 days to 40 minutes; Toyota Motor NA realized $1B business value with 60% inventory reduction and 85% customer preference prediction accuracy.
  • Data as Intelligence Layer: World Surf League + AllCloud turned ocean unpredictability into a data asset, capturing 100 data points per second per surfer, using Amazon Bedrock for real-time AI-powered commentary.
  • Migration and Modernization: AWS Transform saved 800K+ hours, analyzed 1B+ lines of mainframe code, and saved 380 developer years; Composability now GA, allowing partners to integrate their own tools and agents into Transform.
  • Marketplace Innovation: Express Private Offers GA with AI-powered custom pricing; Agent Mode for conversational solution discovery; Multi-Product Solutions combining software and professional services from multiple vendors; Marketplace billionaires: Datadog $2B, Snowflake $3B, Salesforce $3B; startup sales growing 130% YoY; 80% transaction volume now self-service.
  • Security and AgentCore: CrowdStrike Falcon next-gen SIEM automated discovery with pay-as-you-go pricing; IAM Temporary Delegation for safe partner access; AgentCore Runtime (isolated MicroVM per session), AgentCore Observability (OpenTelemetry compatible), and AgentCore Identity (seamless IAM across AWS and third-party apps).
  • Meta-observation: For every $1 of AWS services deployed, partners realize $7.13 in revenue. With AWS targeting $300-400B, the partner ecosystem opportunity is $2-3 trillion.
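
The meta-observation's range can be reproduced by applying the quoted multiplier to the quoted AWS revenue target. This is a rough sketch: the function name is illustrative, and it assumes the $7.13 ratio holds uniformly across the whole revenue base, which the keynote summary does not claim.

```python
# Partner revenue per $1 of AWS services deployed, as quoted above.
PARTNER_MULTIPLIER = 7.13


def partner_opportunity(aws_revenue_usd: float,
                        multiplier: float = PARTNER_MULTIPLIER) -> float:
    """Implied partner revenue for a given AWS revenue figure."""
    return aws_revenue_usd * multiplier


# Low and high ends of the $300-400B AWS target quoted above.
low = partner_opportunity(300e9)
high = partner_opportunity(400e9)

print(f"Implied partner opportunity: ${low / 1e12:.2f}T to ${high / 1e12:.2f}T")
```

The result lands at roughly $2.1 trillion to $2.9 trillion, consistent with the $2-3 trillion figure in the bullet.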

✳️ Live Experience

(Caption: Every year, the AWS re:Invent annual developer conference draws a massive gathering of partners from across the AWS ecosystem worldwide. The value generated by 60,000 people congregating and celebrating right before the year-end holidays is no joke. We have been practicing every year to integrate, build, and deepen our presence in this ecosystem, aiming to bring enterprise operations strategies that combine knowledge foundations and process orchestration to the clients who trust us.)

(Caption: Partners from all sectors host countless private events and meetings of all sizes at the AWS re:Invent annual developer conference. Some connections come through introductions from generous mentors and industry elders, while others are built through sustained outreach, resource exchange, and earned trust over time — making those rare biannual or annual face-to-face meetings all the more precious. A mentor once reminded me: you don’t always have to talk business. Being able to share your heart, your dreams, and your life is what builds the kind of circle that catches you when times get tough — that’s what true partnership looks like. Share the good times, shoulder the hard times, share information openly, and stay open-hearted. Words to live by.)

Think in Context: AWS re:Invent 2025 Keynote with Peter DeSantis and Dave Brown

Post Title Image (Illustration: AWS re:Invent 2025 Keynote with Peter DeSantis and Dave Brown. Image source: AWS.)

Peter DeSantis, SVP of Utility Computing at AWS, and Dave Brown took the stage at re:Invent 2025 to deliver a keynote that went deep into the infrastructure fundamentals powering the AI era. Rather than chasing the latest AI hype, they made a compelling case that the core cloud attributes we have relied on for two decades (security, availability, elasticity, agility, and cost) matter more than ever. The announcements ranged from Graviton5 with 192 cores in a single package and 5x more L3 cache, to Lambda Managed Instances that bridge the EC2-Lambda divide, to S3 Vectors hitting GA with sub-100ms queries over 2 billion vectors. On the AI acceleration front, Trainium3 UltraServer delivers 5x output tokens per megawatt, and PyTorch native support means porting GPU code to Trainium is literally a one-line change.

✳️ tl;dr

One theme, “Infrastructure Fundamentals for the AI Era,” runs throughout, with six sections:

  • Core Cloud Attributes: Security, availability, elasticity, agility, and cost remain the foundation. These attributes guided every AWS decision for 20 years and are even more critical in the AI era.
  • Nitro and Graviton Evolution: From custom silicon (Nitro) eliminating virtualization jitter to Graviton5 delivering 192 cores with 5x L3 cache. M9g instances offer up to 25% better performance than M8g. Guest: Payam Mirrashidi (Apple) on Swift + Graviton achieving 40% performance gains.
  • Serverless Expansion: Lambda Managed Instances bridges the gap between EC2 performance and Lambda simplicity. Your Lambda functions run on EC2 instances you choose, while Lambda manages provisioning, patching, and scaling.
  • Inference and Bedrock Architecture: Project Mantle inference engine powers Bedrock with service tiers (priority, standard, flexible), per-customer queue fairness, Journal for fault tolerance, and confidential computing.
  • Vector Search and S3 Vectors: Nova Multimodal Embeddings unifies text, image, video, audio into shared vector space. S3 Vectors GA achieves sub-100ms queries on 2 billion vectors. 250K+ vector indexes created in 4 months. Guest: Jae Lee (TwelveLabs) on video intelligence.
  • Trainium3 and AI Acceleration: Trainium3 UltraServer with 144 chips, 360 PetaFLOPS, 20TB HBM. 5x output tokens per megawatt. NKI GA and Neuron Explorer for performance profiling. PyTorch native support. Guest: Dean Leitersdorf (Decart) on real-time visual intelligence.

✳️ Live Experience

(Caption: This session used to be held on Monday evenings, originally known as Monday Night Live, and I’d often miss it due to dinner or meeting conflicts — only to catch up later. This year it clashed with other commitments again, so here are some stand-in photos of the Amazon EC2 UltraServer displayed on the CEO Keynote stage and the cute, mischievous Kiro who loves playing hide-and-seek — giving you a glimpse of the physical and protocol world that truly exists behind the cloud’s abstraction.)
