(Illustration: AWS re:Invent 2025 Keynote with Peter DeSantis and Dave Brown. Image source: AWS.)
Peter DeSantis, SVP of Utility Computing at AWS, and Dave Brown took the stage at re:Invent 2025 to deliver a keynote that went deep into the infrastructure fundamentals powering the AI era. Rather than chasing the latest AI hype, they made a compelling case that the core cloud attributes we have relied on for two decades (security, availability, elasticity, agility, and cost) matter more than ever. The announcements ranged from Graviton5, with 192 cores in a single package and 5x more L3 cache, to Lambda Managed Instances, which bridge the EC2-Lambda divide, to S3 Vectors hitting GA with sub-100ms queries over 2 billion vectors. On the AI acceleration front, the Trainium3 UltraServer delivers 5x output tokens per megawatt, and native PyTorch support means porting GPU code to Trainium can be as simple as a one-line change.
✳️ tl;dr
A single theme, “Infrastructure Fundamentals for the AI Era,” runs throughout the keynote, organized into six sections:
- Core Cloud Attributes: Security, availability, elasticity, agility, and cost remain the foundation. These attributes guided every AWS decision for 20 years and are even more critical in the AI era.
- Nitro and Graviton Evolution: From custom silicon (Nitro) eliminating virtualization jitter to Graviton5 delivering 192 cores with 5x L3 cache. M9g instances offer up to 25% better performance than M8g. Guest: Payam Mirrashidi (Apple) on Swift + Graviton achieving 40% performance gains.
- Serverless Expansion: Lambda Managed Instances bridges the gap between EC2 performance and Lambda simplicity. Your Lambda functions run on EC2 instances you choose, while Lambda manages provisioning, patching, and scaling.
- Inference and Bedrock Architecture: Project Mantle inference engine powers Bedrock with service tiers (priority, standard, flexible), per-customer queue fairness, Journal for fault tolerance, and confidential computing.
- Vector Search and S3 Vectors: Nova Multimodal Embeddings unifies text, image, video, audio into shared vector space. S3 Vectors GA achieves sub-100ms queries on 2 billion vectors. 250K+ vector indexes created in 4 months. Guest: Jae Lee (TwelveLabs) on video intelligence.
- Trainium3 and AI Acceleration: Trainium3 UltraServer with 144 chips, 360 PetaFLOPS, 20TB HBM. 5x output tokens per megawatt. NKI GA and Neuron Explorer for performance profiling. PyTorch native support. Guest: Dean Leitersdorf (Decart) on real-time visual intelligence.
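The last bullet's "PyTorch native support" refers to the keynote claim that porting existing GPU code to Trainium is essentially a one-line device change. A minimal sketch of what that swap might look like, assuming the PyTorch/XLA-style `"xla"` device string used by AWS Neuron (the exact string is an assumption, not confirmed by the keynote); the CPU fallback keeps the sketch runnable without any accelerator:

```python
import torch

def pick_device(prefer: str) -> torch.device:
    """Pick an accelerator if present, else fall back to CPU so this runs anywhere."""
    if prefer == "cuda" and torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device("cuda")       # before: existing GPU code path
# device = torch.device("xla")     # after: Trainium via PyTorch/XLA (the claimed one-line change)

# Everything downstream is unchanged: the model and tensors simply follow `device`.
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(3, 4, device=device)
print(model(x).shape)  # torch.Size([3, 2])
```

The point of the claim is that model, optimizer, and data-loading code stay untouched; only the device selection line differs between the GPU and Trainium paths.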
✳️ Live Experience
(Caption: This session used to be held on Monday evenings, originally known as Monday Night Live, and I often missed it due to dinner or meeting conflicts, only catching up later. This year it clashed with other commitments again, so here are some stand-in photos: the Amazon EC2 UltraServer displayed on the CEO Keynote stage, and the cute, mischievous Kiro who loves playing hide-and-seek. They offer a glimpse of the physical and protocol world that truly exists behind the cloud's abstraction.)