Fixing LM Studio gpt-oss Model Outputting Mixed Reasoning Content

Post Title Image (Photo by Felix Eka Putra Kuntjoro on Unsplash)

Background

I recently switched the post-processing model for MacWhisper Dictation from the original google/gemma-3-12b to openai/gpt-oss-20b in LM Studio, but I kept running into an issue where the gpt-oss model returned its reasoning process as part of the dictation output. Here’s the problematic output:

We need to correct punctuation: use full-width. 
Input: "嗨,我們明天去兒童樂園玩好嗎?" 
We replace comma with ,, question mark with ?. 
Also add period at end? 
The sentence ends with question mark already. 
So output: "嗨,我們明天去兒童樂園玩好嗎?"
嗨,我們明天去兒童樂園玩好嗎?
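Until the template-level issue is fixed, one workaround is to strip the leaked reasoning in post-processing. Below is a minimal sketch, assuming the raw gpt-oss output either contains a Harmony-style `<|channel|>final<|message|>` marker (the marker string is an assumption; inspect your server's raw output) or, as in the example above, puts the final answer on the last line:

```python
def extract_final(text: str) -> str:
    """Best-effort: drop leaked reasoning, keep only the final answer.

    If a Harmony-style "final" channel marker leaks through, take the text
    after it; otherwise fall back to the last non-empty line, which matches
    the failure mode shown above but is admittedly fragile.
    """
    marker = "<|channel|>final<|message|>"  # assumed marker; verify against raw output
    if marker in text:
        return text.rsplit(marker, 1)[1].split("<|")[0].strip()
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return lines[-1] if lines else ""
```

This is a stopgap filter, not a fix for the underlying chat-template behavior.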

Read More

AWS Reshaping AI Infrastructure: How S3 Vectors Changes the Vector Search Game

Post Title Image (Illustration: By decompositioning and reintegrating the world using vectors, we may be able to rediscover a new world. Image source: Photo by James Wainscoat on Unsplash)

✳️ tl;dr for Technical Managers

  • Amazon S3 Vectors integrates native vector search functionality directly into S3 object storage, aiming to simplify architecture and reduce costs.1
  • For RAG, semantic search and other AI applications, AWS claims up to 90% savings in vector storage and query costs.
  • This means we no longer need to maintain a separate, expensive vector database for certain AI scenarios, significantly reducing operational complexity (TCO).2
  • S3 Vectors provides sub-second query performance, suitable for large-scale applications with non-real-time latency requirements.3

  • Core Advantage: Achieves “Storage-Compute Separation” for vector data, maximizing the cost-effectiveness of long-term storage.4
  • Through integration with Amazon OpenSearch, enables “hot-cold” data tiering strategies that balance cost with high-performance query requirements.5
  • Seamless integration with Amazon Bedrock Knowledge Bases makes building and scaling RAG applications unprecedentedly simple.6
  • The emergence of S3 Vectors may force existing vector database vendors (like Pinecone) to rethink their market positioning and pricing strategies.

  • Technical teams now need to reassess existing AI technology stacks to determine which workloads can migrate to S3 Vectors for cost optimization.
  • The feature introduces new Vector Buckets, making vector management as simple as managing regular S3 objects.
  • Speculation: S3 Vectors may further enhance its query capabilities in the future, such as supporting hybrid search to adapt to more complex scenarios.

✳️ tl;dr for Engineers & Developers

  • Amazon S3 now natively supports vector storage and search with the new S3 Vectors feature.1
  • You can directly create Vector Buckets and Vector Indexes in S3, then use APIs to store and query embeddings.
  • For many RAG applications, you can skip the step of deploying and managing a separate Vector DB.7
  • The APIs support k-NN similarity search with sub-second response times, with distance metrics such as cosine and Euclidean.

  • Development Highlight: Queries can use metadata filters, such as (category = 'ernest-pkm' AND year > 2023), which is very practical.3
  • Seamless integration with Amazon Bedrock Knowledge Bases - once you set up S3 as a data source, Bedrock automatically handles embedding and synchronization.6
  • If your application needs lower latency or more complex searches (like hybrid search), you can export hot vectors to Amazon OpenSearch Service.5

  • For developers already using AWS, the learning curve is low - basically just learning a few more S3 API calls.
  • The entire service is serverless, meaning you don’t need to worry about scaling or provisioning - just focus on customer application scenarios.
  • You can more economically build a massive long-term memory repository for your AI agents, with all interaction records or knowledge vectorized and stored in S3.
  • For massive documents or images stored in S3 Data Lake,4 you can now directly build indexes in-place and perform semantic search without ETL to another system. (Well… if your massive files aren’t in S3 yet… just make a few API calls XD)
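To make the API shape above concrete, here is a small sketch of assembling a `query_vectors` request with a metadata filter. The bucket and index names are hypothetical, and the parameter names follow AWS's launch examples; treat the exact filter syntax as an assumption and verify against the current S3 Vectors API reference before use:

```python
def build_query_request(embedding, top_k=5):
    """Assemble a request dict for the boto3 "s3vectors" query_vectors call.

    Kept as a pure dict builder so the AWS call itself stays a one-liner;
    all names below are illustrative, not confirmed API output.
    """
    return {
        "vectorBucketName": "my-vector-bucket",  # hypothetical bucket name
        "indexName": "pkm-notes",                # hypothetical index name
        "queryVector": {"float32": list(embedding)},
        "topK": top_k,
        # Metadata filter like the (category = 'ernest-pkm' AND year > 2023)
        # example above; exact filter syntax is an assumption.
        "filter": {"category": "ernest-pkm", "year": {"$gt": 2023}},
        "returnMetadata": True,
        "returnDistance": True,
    }

# Usage (requires AWS credentials and a boto3 version with the s3vectors client):
# import boto3
# client = boto3.client("s3vectors")
# response = client.query_vectors(**build_query_request(embedding))
```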

✳️ tl;dr for Marketing & Product People

  • Imagine all your company’s documents, customer service conversations, product images, and even videos being searchable using natural language. Amazon S3 Vectors is making this cheaper and simpler.1
  • Previously, only resource-rich companies could afford to build large-scale vector search systems. Now, AWS has built this capability directly into S3, dramatically lowering the technical barriers and costs.
  • Use Cases: Media companies can quickly find relevant clips in PB-scale video libraries; healthcare institutions can identify similar cases among millions of medical images.
  • For e-commerce, this can build more precise semantic search engines that understand what users “want,” not just what they “type.”

  • Business Value: This is not just a technical upgrade, but a key to unlocking the value of enterprise “unstructured data”.4
  • S3 Vectors makes RAG (Retrieval-Augmented Generation) technology more accessible, meaning your chatbots or AI customer service can provide relatively accurate, well-grounded responses.6
  • Market Trend: Vector search is evolving from a niche technology to part of cloud storage infrastructure.

  • This innovation will accelerate AI adoption across industries by solving the fundamental “data preparation” and “knowledge storage” cost problems.
  • For product managers, this means you can be bolder when planning AI features that require massive knowledge bases. (Think of S3 as the master key, like the keymaker in The Matrix.)
  • S3 Vectors’ integration with Amazon Bedrock Knowledge Bases provides a one-stop knowledge base solution.6
  • Speculation: More third-party SaaS applications based on S3 Vectors will emerge, focusing on industry-specific knowledge management and semantic search. This speculation is based on the development patterns of ISVs in the AWS ecosystem.2
  • Enterprises should now consider: What dormant data can be “vectorized” to create new business value? Don’t worry about data formats initially - if you think it’s “data,” try it out, starting with small datasets.

Read More

How Anthropic Teams Use Claude Code: Comprehensive Agentic Coding from Infrastructure to Product to Security to Legal

Post Title Image (Illustration: Claude Code performing agentic coding. Image source: Anthropic.)

✳️ tl;dr

  • Anthropic recently shared real-world use cases of how their internal teams use Claude Code 1 2, giving me a glimpse of the evolution from simple code completion to “agentic Software Development Life Cycle (agentic SDLC)”.
  • Data Infrastructure teams let Claude Code use OCR to read error screenshots, diagnose Kubernetes IP exhaustion, and provide fix commands
  • Non-technical finance staff can simply describe requirements in natural language, and Claude Code automatically generates queries and outputs Excel reports
  • Product Development teams use auto-accept mode to let Claude Code autonomously write 70% of the Vim mode code
  • Security Engineering uses Claude Code to quickly parse Terraform plans, complete security reviews, and reduce development bottlenecks
  • Inference teams rely on Claude Code to generate unit tests covering edge cases, reducing research and development time by 80%
  • DS/ML teams use Claude Code to build 5,000-line TypeScript dashboards, transitioning from one-time analysis to long-term reusable tools

  • MCP (Model Context Protocol) 3 allows Claude to access precise configurations and data in secure environments
  • Claude Code leverages “self-verification loops”: write code → run tests/CI → automatically fix errors, advancing agentic SDLC
  • Third-generation AI coding tools are integrating into end-to-end development processes, from requirements to deployment with full automation
  • Anthropic uses RLAIF and Constitutional AI training methods to enable Claude to demonstrate industry-leading self-correction capabilities in code generation
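The “self-verification loop” above can be sketched as a generic control flow. The `generate` and `run_checks` callables stand in for the model call and the test/CI run; both names are mine, not Anthropic's:

```python
from typing import Callable, Optional, Tuple

def self_verify_loop(
    generate: Callable[[Optional[str]], str],
    run_checks: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Sketch of the write -> test -> fix loop described above.

    `generate` produces code (given the previous failure report, if any);
    `run_checks` runs tests/CI and returns (passed, failure_report).
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        code = generate(feedback)
        ok, feedback = run_checks(code)
        if ok:
            return code
    return None  # give up after max_rounds; a human takes over
```

The key design point is that failure output flows back into the next generation attempt, which is what distinguishes this from one-shot code completion.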


Read More

America's AI Action Plan (2025-07): Reshaping Global AI Dominance Through Three-Pillar Strategy

Post Title Image (Illustration: Hydrographic survey storage area. Each tube contains a registered hydrographic survey. Image source: Photo by NOAA on Unsplash.)

✳️ tl;dr

  • The US government released America's AI Action Plan in July 2025, aiming to reshape America’s global AI dominance
  • Three strategic pillars: Accelerate AI Innovation, Build AI Infrastructure, Lead International AI Diplomacy & Security
  • Revoked previous administration’s AI Executive Order 14110, removing “red tape and overregulation”

  • Emphasizes protecting free speech and American values, ensuring AI systems objectively pursue truth
  • Encourages open-source and open-weight AI models development to promote innovation and commercial adoption
  • Establishes streamlined permitting for data centers, semiconductor manufacturing, and energy infrastructure, realizing the “Build, Baby, Build” vision

  • Revitalizes US semiconductor manufacturing through CHIPS Program Office
  • Builds military-grade high-security data centers to resist nation-state attack threats
  • Exports complete AI technology stack to partners

  • Establishes regulatory sandboxes and AI excellence centers, enabling research institutions, startups, and enterprises to rapidly deploy and test AI tools
  • Creates AI Workforce Research Center to continuously assess AI’s impact on labor markets and provide policy recommendations
  • Invests in automated cloud laboratories covering engineering, materials science, chemistry, biology, and other scientific fields

  • Includes full text: America's AI Action Plan (PDF to Markdown) 1


Read More

Kiro: Agentic IDE by AWS - Beyond Vibe Coding Blind Box

Post Title Image (Caption: Installing Kiro. Image source: Ernest’s MBP.)

✳️ tl;dr

  • Does AI vibe coding feel like opening a blind box?
  • Kiro 1 uses Specs to help you read the manual before unboxing
  • One prompt → automatically expands into user stories, complete with EARS requirement standards

  • Ernest attempts to deconstruct Kiro’s four-layer architecture (Intent Layer, Knowledge Layer, Execution Layer, Oversight Layer) 2
  • Kiro AI = Kiro Agentic IDE

  • Tasks list directly connects to unit/integration tests, reducing the awkwardness of forgetting to write tests
  • Hooks let everyone unleash their imagination with event-driven automation - build your own automation
  • Steering project guidance principles ensure consistency, making Kiro follow organizational culture and connect knowledge management
  • Supports RWD and A11y - frontend is well taken care of too

  • Free during preview period, supports Mac/Win/Linux (grab it while you can!)
  • Kiro is based on Code OSS, compatible with VS Code
  • VS Code users should be able to migrate seamlessly, though some extensions aren’t available in Kiro yet
  • Kiro + WSL2 solution 2

  • Extended use cases: Kiro + dev container for isolation
  • Extended use cases: Kiro + Remote SSH + EC2 (CloudShell?) within VPC
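For reference, the EARS (Easy Approach to Requirements Syntax) patterns mentioned above follow fixed sentence templates like these (illustrative wording of the standard patterns, not actual Kiro output):

```
Ubiquitous:    The system shall log every spec change.
Event-driven:  WHEN the user saves a spec, the system shall regenerate the tasks list.
State-driven:  WHILE in preview mode, the system shall disable deployment hooks.
Unwanted:      IF the test suite fails, THEN the system shall block the commit.
```

Constraining requirements to these templates is what lets a single prompt expand into unambiguous, testable user stories.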


Read More

Firecracker-Powered Containers Arrive on Cloudflare

Post Title Image (Illustration: Brazil’s largest port, Port of Santos, provides container loading and unloading services. Image source: Photo by sergio souza on Unsplash.)

✳️ tl;dr

  • Cloudflare Containers 1 enters public beta, immediately available for paid users with full Workers integration.
  • “Region: Earth” global deployment - containers start in seconds, and developers don’t need to select regions.
  • Through Worker→Container binding, dynamically generates isolated instances by ID, suitable for multi-tenant platforms.
  • Three instance types: dev/basic/standard covering 256 MiB, 1 GiB, 4 GiB memory requirements.
  • 10ms billing granularity with separate CPU, memory, and disk metering, plus free tier included.
  • Built-in Metrics/Logs retained for 7 days, supports external LogSink, reducing observability integration costs.
  • Upcoming: autoscale = true enables global auto-scaling and latency-aware routing.

  • Cloudflare Containers runs on AWS-developed open-source Firecracker microVM 2 with KVM isolation, reducing multi-tenant side-channel risks while maintaining startup speed and resource efficiency.
  • Firecracker microVM: < 125ms cold start, < 5 MiB memory, balancing security and density.
  • Ernest Chiang demonstrated 3 running 4,000 microVMs in 90 seconds on i3.metal at COSCUP 2020 Firecracker workshop.


Read More

Interoperate, Integrate, Iterate: A 10-Year PM Survival Kit for Traditional Sectors

Post Title Image (Illustration: Ernest at Taiwan Product Conference 2025. Image source: Bob Chao.)

Witnessing Taiwan’s journey from nascent tech communities blossoming into full-fledged technical conferences, we’ve observed an organic proliferation of product-centric opportunities emerging alongside industrial metamorphosis and the ever-increasing complexity of cross-disciplinary integration. This ecosystem now encompasses the multifaceted spectrum of product operation, product marketing, product design, product management, and product development—a verdant and thriving landscape.

Yet beneath this seemingly lush canopy, one wonders: are we nurturing an organic greenhouse, or merely cultivating a wild tangle of weeds? As we navigate through this labyrinth of uncertainty, none among us possess the definitive answer. But perhaps not knowing is precisely what makes everything possible—it means we can still venture forth to explore, retreat home to experiment, and dare to iterate through our discoveries. On this sweltering weekend, we—a collective of souls orbiting the product ecosystem—gathered at the inaugural Taiwan Product Conference 2025, attempting to forge something meaningful together.

Conference reflections shall be compiled separately.

This piece unfolds in two movements: first, the release of the presentation slides, followed by supplementary Q&A notes. I warmly invite you to use the feedback form on the final slide to share perspectives from any angle, pose questions, or engage in dialogue. Looking forward to our next shared endeavor.

Read More

Latency Ping: A Cloud Global Data Center Speed Testing Tool

Post Title Image (Illustration: Walk nearby Le Bouchon Ogasawara in Shibuya, Tokyo. Image source: Ernest)

✳️ tl;dr

With the launch of AWS Taipei Region (ap-east-2), it’s time to update that long-neglected web latency testing tool.

  • HTTP overhead is quite heavy, but when you don’t have CLI tools available or when dealing with remote clients, it can provide a simple evaluation.

  • Arranged roughly according to city distances.
  • Based on AWS Region naming rules, supplemented by my modest geographical knowledge.
  • Following the principle of not over-categorizing to avoid too fine granularity. If your principles differ, feel welcome to fork from upstream and modify it to your liking.
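The HTTP-overhead caveat above can be made concrete with a minimal Python probe. The endpoint URL in the usage comment is a hypothetical example, and the measured time includes DNS, TLS, and HTTP overhead, so treat it as a rough upper bound rather than a ping:

```python
import time
import urllib.request

def measure_latency(url: str, timeout: float = 5.0) -> float:
    """Return wall-clock seconds for one HTTP GET to the given URL."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout):
        pass  # we only care about the elapsed time, not the body
    return time.perf_counter() - start

def rank_regions(latencies):
    """Sort {region: seconds} measurements from fastest to slowest."""
    return sorted(latencies.items(), key=lambda kv: kv[1])

# Usage (hypothetical regional endpoint):
# print(measure_latency("https://dynamodb.ap-east-2.amazonaws.com"))
```

In a browser-based tool the same idea applies, just with `fetch` timings instead of `urllib`, and with the first request discarded to avoid counting connection setup.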

Read More

AWS Summit Hong Kong 2025 Dev Lounge: Reinventing Programming - How AI Transforms Our Enterprise Coding Approach

Post Title Image (Illustration: AWS Summit Hong Kong 2025 Dev Lounge. Image source: Dorothy.)

✳️ Background

I’m honored to have been selected after submitting my talk proposal—thank you to everyone who has quietly encouraged me along the way :) It’s also a privilege to be part of the 10th anniversary of AWS Summit Hong Kong. Thinking back to 2014, when Lenie, Locarno, and I invited Jeff Barr to Taiwan to deliver a keynote at COSCUP, to supporting the AWS Hero program, and now standing here in Hong Kong sharing with developers and technical managers at the AWS Summit Dev Lounge—it truly feels like a fortunate convergence of serendipity and iterative progress. My heart is full of gratitude.

This talk draws from over a year of experience working with traditional industries, streamlining processes, and modeling object states. It also reflects how our own product and technology integration (PTI) teams have adapted our workflows to collaborate with AI tools (with a fair share of pitfalls along the way). By combining Amazon Q CLI with an AI-augmented perspective on existing processes, we explore the mindset of inviting AI to join us as a new team member. Like onboarding any new colleague, there will be a period of adjustment—but unless we invite this new teammate (or team?!) to join us, that adaptation can never begin.

Read More

My Workflow: Setting up nRF52 DK Development Environment on Apple Silicon (M4 Pro)

Post Title Image (Illustration: Unbox Apple MacBook Pro M4 Pro and reMarkable Paper Pro. Taken at AWS re:Invent 2024, Las Vegas. Image source: Ernest)

Today I spent some time setting up a development environment for the nRF52 DK (PCA10040) board on macOS Sequoia 15.1.1, running on an Apple MacBook Pro M4 Pro (Apple Silicon). This blog post documents the process and can serve as a reference for anyone working with Nordic Semiconductor’s nRF52 DK (development kit).

Read More