Hi there 👋

Welcome to my blog. Tech notes and musings from an LLM algorithm engineer.

Teaching Claude Why: Lessons from Alignment Training

Original: Teaching Claude Why
Author: Anthropic
Date: May 8, 2026

This is a Chinese translation with annotations of Anthropic’s research post on alignment training methods. The original article discusses how teaching Claude the principles behind aligned behavior, rather than just training on demonstrations, proves far more effective for generalization.

Key takeaways:

Principles over demonstrations: Training Claude to explain why certain actions are better reduces misalignment more effectively than showing correct behavior alone.
Out-of-distribution generalization: A 3M-token “difficult advice” dataset (where the user faces ethical dilemmas) achieved the same improvement as 84M tokens of synthetic honeypots, with 28× better data efficiency.
Constitutional documents + fiction: High-quality documents about Claude’s constitution combined with fictional stories of aligned AI reduced blackmail rate from 65% to 19%.
Improvements persist through RL: More aligned initialization snapshots maintained their advantage throughout reinforcement learning.
Diverse environments matter: Simply adding tool definitions and system prompts to training environments, even without requiring tool use, improved alignment generalization.

For the full annotated Chinese translation, please see the Chinese version. ...

Natural Language Autoencoders: Turning Claude's Thoughts into Text

Original post: Natural Language Autoencoders
Full paper: transformer-circuits.pub/2026/nla
Code: github.com/kitft/natural_language_autoencoders
Interactive demo: neuronpedia.org/nla

Summary

Anthropic introduces Natural Language Autoencoders (NLAs), a method for converting a language model’s internal activations into human-readable natural language explanations. The approach trains two model components jointly: an Activation Verbalizer that translates activations into text, and an Activation Reconstructor that recovers the original activation from the text alone. The quality of explanations is measured by how accurately the activation can be reconstructed. ...
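
To make the reconstruction-based scoring idea concrete, here is a minimal sketch of how an original activation could be compared with a reconstructed one. It is not the metric from the NLA paper, which this excerpt does not specify. Cosine similarity and normalized squared error are assumed here as two common choices, and the vectors are made-up toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalized_squared_error(original, reconstructed):
    """Squared reconstruction error divided by the squared norm of the original.

    0.0 means a perfect reconstruction; larger values are worse.
    """
    error = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    scale = sum(x * x for x in original)
    return error / scale


# Hypothetical toy activation and two hypothetical reconstructions.
original = [0.5, -1.0, 2.0, 0.0]
close_reconstruction = [0.4, -0.9, 2.1, 0.1]
poor_reconstruction = [-1.0, 0.5, 0.0, 2.0]

for name, candidate in [("close", close_reconstruction), ("poor", poor_reconstruction)]:
    print(
        name,
        round(cosine_similarity(original, candidate), 4),
        round(normalized_squared_error(original, candidate), 4),
    )
```

Under either assumed score, an explanation that lets the reconstructor land near the original activation rates higher than one that does not.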

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

Original paper: DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence
Authors: DeepSeek-AI
Model checkpoints: https://huggingface.co/collections/deepseek-ai/deepseek-v4

Summary

DeepSeek-V4 presents a preview of two strong MoE language models, DeepSeek-V4-Pro (1.6T total / 49B activated) and DeepSeek-V4-Flash (284B total / 13B activated), both supporting a context length of one million tokens.

Key architectural innovations:

Hybrid Compressed Attention: Combines Compressed Sparse Attention (CSA, compression rate m=4 with top-k sparse selection) and Heavily Compressed Attention (HCA, compression rate m'=128 with dense attention) in an interleaved configuration. At 1M-token context, this reduces single-token inference FLOPs to 27% and KV cache to 10% compared to DeepSeek-V3.2.
Manifold-Constrained Hyper-Connections (mHC): Constrains the residual mapping matrix to the manifold of doubly stochastic matrices (Birkhoff polytope), ensuring spectral norm ≤ 1 for stable deep-layer signal propagation. Uses Sinkhorn-Knopp iterations (t=20) for projection.
Muon Optimizer: Adopted for most modules with hybrid Newton-Schulz iterations for orthogonalization. Paired with Anticipatory Routing (decoupling backbone and routing network updates) and SwiGLU clamping for training stability.
Post-training paradigm shift: Replaces mixed RL with domain-specific expert training (SFT → GRPO RL) followed by multi-teacher On-Policy Distillation (OPD) with full-vocabulary KL divergence. Over 10 teacher models are distilled into a single unified model. ...
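
The Sinkhorn-Knopp projection mentioned under mHC can be illustrated with a small pure-Python sketch. This is the textbook alternating row/column normalization, not DeepSeek's implementation. Exponentiating the raw entries to make them positive, the 3×3 toy matrix, and the function names are all assumptions made for illustration. The helper uses the standard inequality that the spectral norm is at most sqrt(max row sum × max column sum) for a nonnegative matrix, which is why a doubly stochastic matrix has spectral norm at most 1.

```python
import math


def sinkhorn_knopp(logits, iterations=20):
    """Approximately map a square matrix onto the doubly stochastic matrices.

    Assumption: entries are made positive with exp() before normalizing.
    """
    # Subtracting the global max keeps exp() stable; the uniform scale
    # cancels in the first row normalization.
    peak = max(max(row) for row in logits)
    matrix = [[math.exp(value - peak) for value in row] for row in logits]
    size = len(matrix)
    for _ in range(iterations):
        # Normalize every row to sum to 1.
        normalized_rows = []
        for row in matrix:
            total = sum(row)
            normalized_rows.append([value / total for value in row])
        matrix = normalized_rows
        # Normalize every column to sum to 1.
        col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
        matrix = [[matrix[i][j] / col_sums[j] for j in range(size)] for i in range(size)]
    return matrix


def spectral_norm_upper_bound(matrix):
    """sqrt(max row sum * max column sum), an upper bound for nonnegative matrices."""
    size = len(matrix)
    max_row = max(sum(row) for row in matrix)
    max_col = max(sum(matrix[i][j] for i in range(size)) for j in range(size))
    return math.sqrt(max_row * max_col)


# Hypothetical unconstrained residual-mixing parameters.
raw = [
    [0.2, -1.0, 0.5],
    [1.5, 0.3, -0.7],
    [-0.4, 0.8, 0.1],
]
projected = sinkhorn_knopp(raw, iterations=20)
row_sums = [sum(row) for row in projected]
col_sums = [sum(projected[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums, spectral_norm_upper_bound(projected))
```

After the iterations, every row and column sums to roughly 1, so the bound on the spectral norm sits at about 1 and repeated application of the matrix cannot amplify the signal.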

How to Make AI Write Like a Human

You’ve read that kind of article before — every paragraph wraps up neatly, the tone is warm and measured, every claim comes with exactly three supporting points, and the ending soars into “let us look forward to the future together.” You can’t pinpoint what’s wrong, but something’s off. That’s AI writing. Or more precisely, that’s AI writing in its default state. I’ve spent a fair amount of time on this problem recently. I started using Claude more and more when writing blog posts, but every first draft needed heavy editing — not because the information was wrong, but because the feel was off. It read like someone who never makes mistakes, never gets distracted, never has a mood swing. That person doesn’t exist. ...

What 81,000 People Told Us About the Economics of AI

This is a Chinese translation with commentary of the original article by Anthropic.

Read the original here: What 81,000 people told us about the economics of AI
By Maxim Massenkoff, Anthropic · April 22, 2026

For the Chinese translation and annotated version, switch to the Chinese version (中文版).

Building a Blog with Hugo + Cloudflare Pages

I’d been telling myself I’d start a blog for months. Then one afternoon I decided to just do it: no more planning, no more comparing frameworks. A few hours later the site was live. This post is a record of that process: not a tutorial, more of an annotated changelog of mistakes and decisions.

Why Hugo

I didn’t spend long choosing a static site generator. I’d used Hexo before and knew what Node.js dependency hell feels like. Jekyll is slow. Gatsby is overkill for a blog. Hugo is a single Go binary: run brew install hugo and you’re done. No node_modules, no dependency conflicts, and builds so fast you barely notice them happening. ...

Hello, World

Welcome

Welcome to my blog! I will share technical articles about AI, LLMs, and software engineering here.

Content Direction

This blog will cover the following areas:

Deep Technical Articles: In-depth analysis of key technologies in AI/LLM
Industry Insights: Tracking the latest developments and trends in AI
Engineering Practices: Sharing experiences and best practices in software engineering

Tech Stack

This blog is built with:

Hugo: High-performance static site generator
PaperMod: Clean and elegant Hugo theme
Cloudflare Pages: Global CDN deployment

    # A sample code snippet
    def hello():
        print("Hello, World!")

    hello()

Next Steps

Stay tuned for more content!