<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Eugene&#39;s Page</title>
    <link>https://eugenepage.com/</link>
    
    <atom:link href="https://eugenepage.com/rss.xml" rel="self" type="application/rss+xml"/>
    
    <description></description>
    <pubDate>Sat, 27 Jun 2026 11:42:29 GMT</pubDate>
    <generator>http://hexo.io/</generator>
    
    <item>
      <title></title>
      <link>https://eugenepage.com/zh-CN/2026/06/27/20250801.UEConceptScrapbox/</link>
      <guid>https://eugenepage.com/zh-CN/2026/06/27/20250801.UEConceptScrapbox/</guid>
      <pubDate>Sat, 27 Jun 2026 11:42:29 GMT</pubDate>
      
        
        
      <description>&lt;h2 id=&quot;UE“反射”概念：&quot;&gt;&lt;a href=&quot;#UE“反射”概念：&quot; class=&quot;headerlink&quot; title=&quot;UE“反射”概念：&quot;&gt;&lt;/a&gt;UE“反射”概念：&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;反射&lt;/strong&gt;：UE 通过 UClass&amp;#x2</description>
        
      
      
      
      <content:encoded><![CDATA[<h2 id="UE“反射”概念："><a href="#UE“反射”概念：" class="headerlink" title="UE“反射”概念："></a>UE“反射”概念：</h2><ul><li><strong>反射</strong>：UE 通过 UClass&#x2F;UProperty 等系统在运行时提供类型信息和动态访问能力。<br>UE 的反射系统是通过 UHT 工具和特定宏实现的代码生成机制。你用 UCLASS 标记类、UPROPERTY 标记变量、UFUNCTION 标记函数，这些宏会被 UHT 识别。<br>UHT 在编译前扫描这些标记，生成.generated.h 和.cpp 文件，里面包含类的反射注册代码，比如 StaticClass () 函数和 UClass 对象的构造逻辑。<br>生成的代码会把类信息注册到引擎全局的 GObjectClasses 数组里，让引擎在运行时能动态获取类结构、调用函数或访问属性，这支撑了蓝图交互、垃圾回收等核心功能。</li></ul><p>因为 UE 需要在运行时动态处理代码信息。比如蓝图可视化编程，引擎得通过反射知道 C++ 类有哪些函数和变量，才能让蓝图调用它们。</p><p>比如你在 C++ 里写了一个角色类，里面有个 UFUNCTION 标记的跳跃函数 Jump ()。没有反射的话，蓝图编辑器根本不知道这个 Jump () 函数存在，因为编译后的机器码里，函数名和参数这些信息都被优化掉了。<br>有了反射，UHT 会在编译时为这个 Jump () 函数生成反射元数据，包括函数名、参数类型、返回值，以及它属于哪个类。引擎运行时能通过这些元数据，在蓝图编辑器里把 Jump () 函数显示出来，你才能拖拽节点调用它。<br>如果后续你在 C++ 里给 Jump () 加了一个高度参数，反射系统会自动更新元数据，蓝图里对应的函数节点也会同步显示出新参数，整个过程不需要手动写任何蓝图和 C++ 交互的绑定代码。</p><h2 id="回退操作-Command-模式（轻量级）："><a href="#回退操作-Command-模式（轻量级）：" class="headerlink" title="回退操作 Command 模式（轻量级）：**"></a>回退操作 Command 模式（轻量级）：**</h2><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">每次操作封装为 ICommand &#123; Do(); Undo(); &#125;</span><br><span class="line">维护 undoStack 和 redoStack</span><br><span class="line">执行操作 → 压入 undoStack，清空 redoStack</span><br><span class="line">Undo → 弹出 undoStack，执行 Undo()，压入 redoStack</span><br></pre></td></tr></table></figure><p><strong>2. 
Snapshot 模式（适用于复杂场景）：</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">操作前序列化整个对象状态的快照</span><br><span class="line">Undo 时直接恢复快照</span><br><span class="line">优点：实现简单，不容易出 bug</span><br><span class="line">缺点：内存开销大</span><br></pre></td></tr></table></figure><p><strong>实际项目中的混合方案：</strong></p><ul><li>简单属性修改 → Command 模式（记录 oldValue&#x2F;newValue）</li><li>复杂操作（节点图变更、场景编辑）→ Snapshot 或 Diff 模式</li><li>合并机制：连续同类操作合并（如拖拽 Slider 时合并为一条记录）</li></ul><h2 id="UE智能指针对比表"><a href="#UE智能指针对比表" class="headerlink" title="UE智能指针对比表"></a>UE智能指针对比表</h2><table><thead><tr><th>指针类型</th><th>管理对象</th><th>所有权</th><th>核心作用</th><th>适用场景</th></tr></thead><tbody><tr><td>TObjectPtr</td><td>UObject派生类</td><td>共享</td><td>安全访问UObject，自动参与垃圾回收</td><td>替代传统UPROPERTY指针，日常UObject引用</td></tr><tr><td>TWeakObjectPtr</td><td>UObject派生类</td><td>无</td><td>弱引用UObject，不阻止回收</td><td>避免循环引用，临时访问可能被销毁的UObject</td></tr><tr><td>TSoftObjectPtr</td><td>UObject派生类</td><td>无</td><td>软引用UObject，支持资源异步加载</td><td>引用可能未加载的资源，如关卡外的模型、纹理</td></tr><tr><td>TSharedPtr</td><td>非UObject类型</td><td>共享</td><td>通过引用计数管理生命周期</td><td>需要多持有者共享非UObject资源</td></tr><tr><td>TUniquePtr</td><td>非UObject类型</td><td>独占</td><td>唯一拥有对象，不可复制</td><td>管理无需共享的非UObject资源，如自定义数据结构</td></tr><tr><td>TWeakPtr</td><td>非UObject类型</td><td>无</td><td>弱引用TSharedPtr，不增加引用计数</td><td>配合TSharedPtr避免循环引用</td></tr></tbody></table><h2 id="关键区别说明"><a href="#关键区别说明" class="headerlink" title="关键区别说明"></a>关键区别说明</h2><ol><li><p><strong>管理对象边界</strong>：前三种严格用于UObject派生类，依赖UE垃圾回收系统；后三种用于非UObject类型，靠手动内存管理机制。</p></li><li><p><strong>UObject指针细分</strong>：</p></li></ol><ul><li>TObjectPtr是强引用，会让UObject保持存活，是日常开发的首选。</li><li>TWeakObjectPtr是弱引用，当UObject被标记为回收时，指针会自动置空，常用在UI控件引用角色对象这类场景。</li><li>TSoftObjectPtr存储的是资源路径而非直接内存地址，对象未加载时可异步加载，适合开放世界游戏引用远处的资源。</li></ul><ol 
start="3"><li><strong>非UObject指针细分</strong>：</li></ol><ul><li>TSharedPtr通过引用计数共享对象，当引用计数为0时自动释放内存，但需注意手动避免循环引用。</li><li>TUniquePtr是独占式指针，不允许复制，只能通过移动语义转移所有权，性能开销最小。</li><li>TWeakPtr需要绑定到TSharedPtr使用，当TSharedPtr释放对象后，TWeakPtr会自动失效，解决循环引用问题。</li></ul><h2 id="ECS-架构是什么？和传统-OOP-有什么区别？"><a href="#ECS-架构是什么？和传统-OOP-有什么区别？" class="headerlink" title="ECS 架构是什么？和传统 OOP 有什么区别？"></a>ECS 架构是什么？和传统 OOP 有什么区别？</h2><table><thead><tr><th></th><th>OOP</th><th>ECS</th></tr></thead><tbody><tr><td>数据布局</td><td>对象分散在堆上</td><td>Component 连续内存排列</td></tr><tr><td>缓存友好性</td><td>差（指针跳转）</td><td>好（数据局部性）</td></tr><tr><td>逻辑组织</td><td>方法绑定在类上</td><td>System 独立遍历 Component</td></tr><tr><td>组合性</td><td>需要多重继承&#x2F;组合模式</td><td>天然组合（挂 Component 即可）</td></tr><tr><td>其实ECS节省的是cpu去查找的时间。</td><td></td><td></td></tr></tbody></table><p><strong>核心概念：</strong></p><ul><li><strong>Entity</strong>：ID 标识，不存数据</li><li><strong>Component</strong>：纯数据（Position, Velocity, Health…）</li><li><strong>System</strong>：纯逻辑（MovementSystem 遍历所有 Position+Velocity 组件）</li></ul><p>核心区别：OOP 以对象为核心，数据与逻辑封装在类中，易形成复杂继承树；ECS 将数据与逻辑分离，实体为组件容器，系统批量处理同类组件，数据连续存储提升缓存效率，支持动态组合与并行计算。<br>UE5 Mass 系统案例：作为 ECS 实现，Mass 用 “片段” 存储实体数据，“处理器” 统一处理逻辑。如《黑客帝国》Demo 中的万人级 crowd 模拟，通过将角色位置、速度等数据打包连续存储，移动处理器可批量更新所有角色坐标，性能远超传统 Actor 方案。</p><h2 id="堆Stack-栈heap"><a href="#堆Stack-栈heap" class="headerlink" title="堆Stack 栈heap"></a>堆Stack 栈heap</h2><ul><li><strong>堆（Heap）</strong>：动态分配内存，大小不固定，生命周期由程序员控制，访问速度较慢，适合存储大对象或需要在运行时确定大小的数据。（没有固定的存取顺序）</li><li><strong>栈（Stack）</strong>：自动分配内存，大小固定，生命周期由函数调用控制，访问速度快，适合存储局部变量和函数参数。（有固定的存取顺序，后进先出）</li></ul><h2 id="Function-Calling-的原理是什么？你在项目中怎么用的？"><a href="#Function-Calling-的原理是什么？你在项目中怎么用的？" class="headerlink" title="Function Calling 的原理是什么？你在项目中怎么用的？"></a>Function Calling 的原理是什么？你在项目中怎么用的？</h2><p><strong>原理：</strong> LLM 不直接执行函数，而是 <strong>输出结构化的函数调用意图</strong>（函数名 + 参数），由宿主程序解析并执行。</p><h2 id="RAG-是什么？你是怎么实现的？"><a href="#RAG-是什么？你是怎么实现的？" class="headerlink" title="RAG 是什么？你是怎么实现的？"></a>RAG 
是什么？你是怎么实现的？</h2><p><strong>RAG（Retrieval-Augmented Generation）</strong> &#x3D; 先检索相关文档，再让 LLM 基于检索结果回答。</p><h2 id="ControlNet-是什么？它解决了什么问题？"><a href="#ControlNet-是什么？它解决了什么问题？" class="headerlink" title="ControlNet 是什么？它解决了什么问题？"></a>ControlNet 是什么？它解决了什么问题？</h2><p><strong>参考答案：</strong></p><p><strong>ControlNet</strong> 为预训练 Diffusion Model 添加 <strong>空间控制能力</strong>。</p><p><strong>解决的问题：</strong> 原始 Text-to-Image 无法精确控制生成图像的构图、姿态、边缘等空间结构。</p><p><strong>原理：</strong></p><ul><li>在 Stable Diffusion 的 U-Net 每个 Block 上添加一个并行的 “Zero Convolution” 分支</li><li>输入额外的条件图（边缘检测&#x2F;Canny、深度图、姿态&#x2F;OpenPose、法线贴图等）</li><li>训练时只训练 ControlNet 分支，冻结原始模型</li></ul><p><strong>常见 ControlNet 类型：</strong></p><ul><li>Canny Edge：控制轮廓</li><li>Depth：控制深度结构</li><li>OpenPose：控制人物姿态</li><li>Segment：控制区域分割</li><li>Scribble：控制草图</li></ul><h2 id="LoRA-是什么？为什么它很受欢迎？"><a href="#LoRA-是什么？为什么它很受欢迎？" class="headerlink" title="LoRA 是什么？为什么它很受欢迎？"></a>LoRA 是什么？为什么它很受欢迎？</h2><p><strong>LoRA（Low-Rank Adaptation）</strong> 是一种参数高效微调方法。<br>LoRA 是一种参数高效的大模型微调技术，核心是冻结原模型权重，仅训练少量低秩矩阵来模拟任务适配所需的参数更新。它参数量仅为全量微调的 0.1%-1%，大幅降低显存占用和训练成本，且推理时可合并权重无额外延迟。<br>在游戏领域，能快速微调图生图模型生成风格统一的角色装备、场景素材，或微调对话模型让 NPC 生成符合设定的自然台词，适配小团队高效开发需求。</p><h2 id="MVC、MVP、MVVM-的区别是什么？"><a href="#MVC、MVP、MVVM-的区别是什么？" class="headerlink" title="MVC、MVP、MVVM 的区别是什么？"></a>MVC、MVP、MVVM 的区别是什么？</h2><table><thead><tr><th>模式</th><th>组件职责</th><th>组件关系</th><th>优缺点</th></tr></thead><tbody><tr><td>MVC</td><td>Model（数据）<br>View（界面）<br>Controller（逻辑）</td><td>Controller 直接操作 Model 和 View</td><td>简单直观，适合小型项目；Controller 可能变得臃肿</td></tr><tr><td>MVP</td><td>Model（数据）<br>View（界面）<br>Presenter（逻辑）</td><td>Presenter 直接操作 Model，间接更新 View</td><td>Presenter 可测试性强；View 依赖 Presenter，增加耦合</td></tr><tr><td>MVVM</td><td>Model（数据）<br>View（界面）<br>ViewModel（逻辑）</td><td>ViewModel 直接操作 Model，通过数据绑定更新 View</td><td>双向绑定简化 UI 
更新；学习曲线较陡峭，可能引入性能问题</td></tr></tbody></table><ul><li>MVP的Preseter和MVVM的ViewModel在职责上非常相似，都是处理业务逻辑和数据交互的中介，但MVVM通过数据绑定机制让ViewModel直接更新View，减少了Presenter中大量的UI更新代码，使得代码更简洁、可测试性更强。MVVM适合复杂UI交互较多的项目，而MVP则更适合简单UI或需要严格分离测试的场景。</li></ul><h2 id="GPU-渲染流水线的完整阶段？"><a href="#GPU-渲染流水线的完整阶段？" class="headerlink" title="GPU 渲染流水线的完整阶段？"></a>GPU 渲染流水线的完整阶段？</h2><p><strong>参考答案：</strong></p><p>GPU 渲染管线（Rendering Pipeline）是 GPU 执行图形渲染的完整流程：</p><p><strong>应用阶段（CPU 侧）：</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">1. 应用阶段（Application Stage）</span><br><span class="line">   → 视锥体裁剪（Frustum Culling）</span><br><span class="line">   → 批次合批（Draw Call Batching）</span><br><span class="line">   → 输出 Draw Call + 顶点数据到 GPU</span><br></pre></td></tr></table></figure><p><strong>几何阶段（GPU 顶点着色器）：</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">2. 顶点着色器（Vertex Shader）</span><br><span class="line">   → 模型空间 → 世界空间 → 视图空间 → 齐次裁剪空间</span><br><span class="line">   → 顶点变换：LocalMatrix × WorldMatrix × ViewMatrix × ProjectionMatrix</span><br><span class="line"></span><br><span class="line">3. 曲面细分（Tessellation，可选）</span><br><span class="line">   → Hull Shader → Tessellator → Domain Shader</span><br><span class="line"></span><br><span class="line">4. 
几何着色器（Geometry Shader，可选）</span><br><span class="line">   → 以图元为单位处理，可生成/销毁图元</span><br></pre></td></tr></table></figure><p><strong>光栅化阶段（Rasterization）：</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">5. 图元装配 &amp; 裁剪</span><br><span class="line">   → Clipping（齐次空间裁剪）</span><br><span class="line">   → Perspective Divide → NDC → Viewport Transform</span><br><span class="line"></span><br><span class="line">6. 背面剔除（Back-face Culling）</span><br><span class="line">   → 根据顶 点环绕顺序（顺时针/逆时针）剔除背面</span><br><span class="line"></span><br><span class="line">7. 光栅化（Rasterization）</span><br><span class="line">   → 离散化：点/线/三角形 → 片段（Fragment）</span><br><span class="line">   → 视口变换：NDC → Screen Space</span><br></pre></td></tr></table></figure><p><strong>片段&#x2F;像素阶段：</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">8. 片段着色器（Fragment / Pixel Shader）</span><br><span class="line">   → 逐像素着色：光照计算、纹理采样、颜色输出</span><br><span class="line"></span><br><span class="line">9. 
逐片段操作（Per-Fragment Operations）</span><br><span class="line">   → 深度测试（Depth Test / Z-Test）</span><br><span class="line">   → 模板测试（Stencil Test）</span><br><span class="line">   → 混合（Alpha Blending）</span><br><span class="line">   → 输出到 Framebuffer</span><br></pre></td></tr></table></figure>]]></content:encoded>
      
      
      
      
      <comments>https://eugenepage.com/zh-CN/2026/06/27/20250801.UEConceptScrapbox/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title></title>
      <link>https://eugenepage.com/2026/06/27/20250801.UEConceptScrapbox/</link>
      <guid>https://eugenepage.com/2026/06/27/20250801.UEConceptScrapbox/</guid>
      <pubDate>Sat, 27 Jun 2026 11:42:29 GMT</pubDate>
      
        
        
      <description>
&lt;h2 id=&quot;UE-“Reflection”-Concept&quot;&gt;&lt;a href=&quot;#UE-“Reflection”-Concept&quot; class=&quot;headerlink&quot; title=&quot;UE “Reflection” Concept&quot;&gt;&lt;/a</description>
        
      
      
      
      <content:encoded><![CDATA[<h2 id="UE-“Reflection”-Concept"><a href="#UE-“Reflection”-Concept" class="headerlink" title="UE “Reflection” Concept"></a>UE “Reflection” Concept</h2><ul><li><strong>Reflection</strong>: UE provides runtime type information and dynamic access through its UClass and UProperty types (UProperty became FProperty in UE 4.25).<br>UE’s reflection system is a code-generation mechanism built on the Unreal Header Tool (UHT) and a set of marker macros. You mark classes with UCLASS, member variables with UPROPERTY, and functions with UFUNCTION; UHT recognizes these macros.<br>Before compilation, UHT scans these markers and generates <code>.generated.h</code> and <code>.gen.cpp</code> files containing the reflection registration code for each class, such as the <code>StaticClass()</code> function and the construction logic for UClass objects.<br>The generated code registers each class with the engine at startup (UClass objects live in the global <code>GUObjectArray</code> like any other UObject), so the engine can inspect class structure, invoke functions, and access properties at runtime. This is what powers Blueprint interaction, garbage collection, and other core features.</li></ul><p>Reflection exists because UE needs to work with information about your code at runtime. Blueprint visual scripting, for example, requires the engine to know via reflection which functions and variables a C++ class exposes so that Blueprints can call them.</p><p>Say you write a character class in C++ with a <code>Jump()</code> function marked with UFUNCTION. Without reflection, the Blueprint editor has no way to know that <code>Jump()</code> exists: standard C++ keeps no runtime record of function names or parameter lists in the compiled machine code.<br>With reflection, UHT generates metadata for <code>Jump()</code> at build time, including its name, parameter types, return value, and which class it belongs to. 
At runtime, the engine uses this metadata to surface <code>Jump()</code> in the blueprint editor as a callable node you can drag in.<br>If you later add a height parameter to <code>Jump()</code> in C++, the reflection system automatically updates the metadata, and the corresponding blueprint node syncs up to show the new parameter — no manual binding code between blueprint and C++ required.</p><h2 id="Undo-Redo-—-Command-Pattern-Lightweight"><a href="#Undo-Redo-—-Command-Pattern-Lightweight" class="headerlink" title="Undo&#x2F;Redo — Command Pattern (Lightweight)"></a>Undo&#x2F;Redo — Command Pattern (Lightweight)</h2><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Wrap each operation as ICommand &#123; Do(); Undo(); &#125;</span><br><span class="line">Maintain undoStack and redoStack</span><br><span class="line">Execute operation → push to undoStack, clear redoStack</span><br><span class="line">Undo → pop from undoStack, call Undo(), push to redoStack</span><br></pre></td></tr></table></figure><p><strong>2. 
Snapshot Pattern (for complex scenarios):</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Serialize a snapshot of the entire object state before each operation</span><br><span class="line">On Undo, restore the snapshot directly</span><br><span class="line">Pros: simple to implement, less prone to bugs</span><br><span class="line">Cons: high memory overhead</span><br></pre></td></tr></table></figure><p><strong>Hybrid approach for real projects:</strong></p><ul><li>Simple property edits → Command pattern (record oldValue&#x2F;newValue)</li><li>Complex operations (node graph changes, scene edits) → Snapshot or Diff pattern</li><li>Merge mechanism: collapse consecutive operations of the same type (e.g., dragging a slider merges into a single history entry)</li></ul><h2 id="UE-Smart-Pointer-Comparison"><a href="#UE-Smart-Pointer-Comparison" class="headerlink" title="UE Smart Pointer Comparison"></a>UE Smart Pointer Comparison</h2><table><thead><tr><th>Pointer Type</th><th>Managed Object</th><th>Ownership</th><th>Core Role</th><th>Use Case</th></tr></thead><tbody><tr><td>TObjectPtr</td><td>UObject-derived</td><td>Shared</td><td>Safe UObject access, participates in garbage collection automatically</td><td>Replaces traditional UPROPERTY pointers; everyday UObject references</td></tr><tr><td>TWeakObjectPtr</td><td>UObject-derived</td><td>None</td><td>Weak reference to UObject, does not prevent GC</td><td>Avoids circular references; temporary access to UObjects that may be destroyed</td></tr><tr><td>TSoftObjectPtr</td><td>UObject-derived</td><td>None</td><td>Soft reference to UObject, supports async asset loading</td><td>References assets that may not be loaded, e.g., meshes or textures outside the current level</td></tr><tr><td>TSharedPtr</td><td>Non-UObject 
types</td><td>Shared</td><td>Manages lifetime via reference counting</td><td>Multiple owners sharing a non-UObject resource</td></tr><tr><td>TUniquePtr</td><td>Non-UObject types</td><td>Exclusive</td><td>Sole ownership, non-copyable</td><td>Managing non-UObject resources that don’t need sharing, e.g., custom data structures</td></tr><tr><td>TWeakPtr</td><td>Non-UObject types</td><td>None</td><td>Weak reference to an object owned by TSharedPtr; does not increment the strong reference count</td><td>Avoids circular references when used alongside TSharedPtr</td></tr></tbody></table><h2 id="Key-Distinctions"><a href="#Key-Distinctions" class="headerlink" title="Key Distinctions"></a>Key Distinctions</h2><ol><li><p><strong>Object boundary</strong>: The first three are strictly for UObject-derived classes and rely on UE’s garbage collection system; the last three are for non-UObject types and are managed through reference counting or exclusive ownership rather than the garbage collector.</p></li><li><p><strong>UObject pointer breakdown</strong>:</p><ul><li>TObjectPtr is a strong reference that keeps the UObject alive when it is declared as a UPROPERTY, because the garbage collector only follows references it can see through reflection. It is the go-to choice for everyday development.</li><li>TWeakObjectPtr is a weak reference; once the UObject has been destroyed, the pointer reports itself as invalid and <code>Get()</code> returns <code>nullptr</code>. Common in scenarios like UI widgets holding references to character objects.</li><li>TSoftObjectPtr stores an asset path rather than a direct memory address. The asset can be loaded asynchronously when it isn’t in memory, making it a good fit for open-world games that reference distant assets.</li></ul></li><li><p><strong>Non-UObject pointer breakdown</strong>:</p><ul><li>TSharedPtr shares an object via reference counting and frees it automatically when the count reaches zero. Circular references are not detected for you; break them with TWeakPtr.</li><li>TUniquePtr is an exclusive pointer: it cannot be copied, and ownership transfers only through move semantics. It has the lowest overhead of the three.</li><li>TWeakPtr must be created from a TSharedPtr. 
Once TSharedPtr releases the object, TWeakPtr automatically becomes invalid, resolving circular reference issues.</li></ul></li></ol><h2 id="What-is-ECS-Architecture-How-Does-It-Differ-from-Traditional-OOP"><a href="#What-is-ECS-Architecture-How-Does-It-Differ-from-Traditional-OOP" class="headerlink" title="What is ECS Architecture? How Does It Differ from Traditional OOP?"></a>What is ECS Architecture? How Does It Differ from Traditional OOP?</h2><table><thead><tr><th></th><th>OOP</th><th>ECS</th></tr></thead><tbody><tr><td>Data layout</td><td>Objects scattered on the heap</td><td>Components laid out in contiguous memory</td></tr><tr><td>Cache friendliness</td><td>Poor (pointer chasing)</td><td>Good (data locality)</td></tr><tr><td>Logic organization</td><td>Methods bound to classes</td><td>Systems iterate over Components independently</td></tr><tr><td>Composability</td><td>Requires multiple inheritance &#x2F; composition patterns</td><td>Natural composition (just attach Components)</td></tr></tbody></table><p>ECS essentially saves CPU time on data lookups.</p><p><strong>Core concepts:</strong></p><ul><li><strong>Entity</strong>: an ID only, stores no data</li><li><strong>Component</strong>: pure data (Position, Velocity, Health…)</li><li><strong>System</strong>: pure logic (MovementSystem iterates all Position+Velocity components)</li></ul><p>Core difference: OOP centers on objects — data and logic are encapsulated in classes, which tends to grow complex inheritance trees. 
ECS separates data from logic: an entity is just an ID that components are attached to, systems process all components of the same type in batches, contiguous storage improves cache efficiency, and the architecture naturally supports dynamic composition and parallel computation.<br>UE5 Mass example: as an ECS-style framework, Mass stores entity data in “fragments” and puts logic in “processors.” In The Matrix Awakens demo, the crowd simulation of thousands of characters packs position, velocity, and other per-character data into contiguous storage, so a movement processor can batch-update every character’s coordinates. This performs far better than the traditional Actor approach.</p><h2 id="Stack-vs-Heap"><a href="#Stack-vs-Heap" class="headerlink" title="Stack vs. Heap"></a>Stack vs. Heap</h2><ul><li><strong>Heap</strong>: Dynamically allocated memory of variable size, with lifetime controlled by the programmer. Allocation is slower than on the stack, and access is typically slower as well. Suitable for large objects or data whose size is determined at runtime. (No fixed access order.)</li><li><strong>Stack</strong>: Automatically allocated memory with a fixed size; lifetime is tied to the enclosing function call, and access is fast. Suitable for local variables and function parameters. 
(Fixed access order: last in, first out.)</li></ul><h2 id="What-Is-Function-Calling-and-How-Have-You-Used-It-in-Projects"><a href="#What-Is-Function-Calling-and-How-Have-You-Used-It-in-Projects" class="headerlink" title="What Is Function Calling and How Have You Used It in Projects?"></a>What Is Function Calling and How Have You Used It in Projects?</h2><p><strong>How it works:</strong> The LLM doesn’t execute functions directly. Instead, it <strong>outputs a structured function-call intent</strong> (function name + arguments), which the host program parses and executes before passing the result back to the model.</p><h2 id="What-Is-RAG-and-How-Did-You-Implement-It"><a href="#What-Is-RAG-and-How-Did-You-Implement-It" class="headerlink" title="What Is RAG and How Did You Implement It?"></a>What Is RAG and How Did You Implement It?</h2><p><strong>RAG (Retrieval-Augmented Generation)</strong> &#x3D; retrieve relevant documents first, then have the LLM answer with those documents supplied as context.</p><h2 id="What-Is-ControlNet-and-What-Problem-Does-It-Solve"><a href="#What-Is-ControlNet-and-What-Problem-Does-It-Solve" class="headerlink" title="What Is ControlNet and What Problem Does It Solve?"></a>What Is ControlNet and What Problem Does It Solve?</h2><p><strong>Reference answer:</strong></p><p><strong>ControlNet</strong> adds <strong>spatial control capabilities</strong> to a pretrained Diffusion Model.</p><p><strong>The problem it solves:</strong> Plain Text-to-Image generation cannot precisely control the composition, pose, edges, or other spatial structure of generated images.</p><p><strong>How it works:</strong></p><ul><li>A trainable copy of the encoder and middle blocks of Stable Diffusion’s U-Net runs in parallel with the original and feeds back into it through zero-initialized convolution layers (“Zero Convolution”), so training starts without disturbing the base model</li><li>Additional conditioning images are fed as input (edge detection &#x2F; Canny, depth maps, pose &#x2F; OpenPose, normal maps, etc.)</li><li>During training, only the ControlNet branch is trained; the original model weights are frozen</li></ul><p><strong>Common ControlNet types:</strong></p><ul><li>Canny Edge: 
controls outlines</li><li>Depth: controls depth structure</li><li>OpenPose: controls human pose</li><li>Segment: controls region segmentation</li><li>Scribble: controls sketch-based guidance</li></ul><h2 id="What-Is-LoRA-and-Why-Is-It-So-Popular"><a href="#What-Is-LoRA-and-Why-Is-It-So-Popular" class="headerlink" title="What Is LoRA and Why Is It So Popular?"></a>What Is LoRA and Why Is It So Popular?</h2><p><strong>LoRA (Low-Rank Adaptation)</strong> is a parameter-efficient fine-tuning method.<br>LoRA freezes the original model weights and trains only a small number of low-rank matrices to approximate the parameter updates needed for task adaptation. The trainable parameter count is just 0.1%–1% of full fine-tuning, drastically reducing VRAM usage and training cost. During inference, the weights can be merged with the base model, adding no extra latency.<br>In game development, LoRA lets you quickly fine-tune an image-to-image model to generate stylistically consistent character equipment and environment assets, or fine-tune a dialogue model so NPCs produce setting-appropriate natural dialogue — a great fit for small teams that need to move fast.</p><h2 id="What-Is-the-Difference-Between-MVC-MVP-and-MVVM"><a href="#What-Is-the-Difference-Between-MVC-MVP-and-MVVM" class="headerlink" title="What Is the Difference Between MVC, MVP, and MVVM?"></a>What Is the Difference Between MVC, MVP, and MVVM?</h2><table><thead><tr><th>Pattern</th><th>Component Responsibilities</th><th>Component Relationships</th><th>Pros &#x2F; Cons</th></tr></thead><tbody><tr><td>MVC</td><td>Model (data) &#x2F; View (UI) &#x2F; Controller (logic)</td><td>Controller directly operates both Model and View</td><td>Simple and intuitive, good for small projects; Controller can become bloated</td></tr><tr><td>MVP</td><td>Model (data) &#x2F; View (UI) &#x2F; Presenter (logic)</td><td>Presenter directly operates Model, updates View indirectly</td><td>Presenter is highly testable; View depends on 
Presenter, increasing coupling</td></tr><tr><td>MVVM</td><td>Model (data) &#x2F; View (UI) &#x2F; ViewModel (logic)</td><td>ViewModel operates on the Model directly; the View updates through data binding</td><td>Two-way binding simplifies UI updates; steeper learning curve, potential performance overhead</td></tr></tbody></table><p>MVP’s Presenter and MVVM’s ViewModel are very similar in responsibility: both act as intermediaries handling business logic and data interaction. The key difference is that in MVVM the View observes the ViewModel through data binding, so the ViewModel never references the View, and the large amount of UI-update code you’d write in a Presenter disappears. This makes the code more concise and easier to test. MVVM suits projects with complex, frequent UI interactions; MVP fits simpler UIs or scenarios where strict test isolation is needed.</p><h2 id="What-Are-the-Complete-Stages-of-the-GPU-Rendering-Pipeline"><a href="#What-Are-the-Complete-Stages-of-the-GPU-Rendering-Pipeline" class="headerlink" title="What Are the Complete Stages of the GPU Rendering Pipeline?"></a>What Are the Complete Stages of the GPU Rendering Pipeline?</h2><p><strong>Reference answer:</strong></p><p>The GPU Rendering Pipeline is the sequence of stages that turns submitted geometry into pixels in the framebuffer:</p><p><strong>Application Stage (CPU side):</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">1. 
Application Stage</span><br><span class="line">   → Frustum Culling</span><br><span class="line">   → Draw Call Batching</span><br><span class="line">   → Outputs Draw Calls + vertex data to GPU</span><br></pre></td></tr></table></figure><p><strong>Geometry Stage (GPU vertex shaders):</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">2. Vertex Shader</span><br><span class="line">   → Model Space → World Space → View Space → Homogeneous Clip Space</span><br><span class="line">   → Vertex transform: LocalMatrix × WorldMatrix × ViewMatrix × ProjectionMatrix</span><br><span class="line"></span><br><span class="line">3. Tessellation (optional)</span><br><span class="line">   → Hull Shader → Tessellator → Domain Shader</span><br><span class="line"></span><br><span class="line">4. Geometry Shader (optional)</span><br><span class="line">   → Processes per-primitive, can emit or discard primitives</span><br></pre></td></tr></table></figure><p><strong>Rasterization Stage:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">5. 
Primitive Assembly &amp; Clipping</span><br><span class="line">   → Clipping (homogeneous space clipping)</span><br><span class="line">   → Perspective Divide → NDC → Viewport Transform</span><br><span class="line"></span><br><span class="line">6. Back-face Culling</span><br><span class="line">   → Discards back-facing primitives based on vertex winding order (CW/CCW)</span><br><span class="line"></span><br><span class="line">7. Rasterization</span><br><span class="line">   → Discretization: points / lines / triangles → Fragments</span><br><span class="line">   → Viewport Transform: NDC → Screen Space</span><br></pre></td></tr></table></figure><p><strong>Fragment &#x2F; Pixel Stage:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">8. Fragment Shader (Pixel Shader)</span><br><span class="line">   → Per-fragment shading: lighting calculations, texture sampling, color output</span><br><span class="line"></span><br><span class="line">9. Per-Fragment Operations</span><br><span class="line">   → Stencil Test</span><br><span class="line">   → Depth Test (Z-Test)</span><br><span class="line">   → Alpha Blending</span><br><span class="line">   → Output to Framebuffer</span><br></pre></td></tr></table></figure>]]></content:encoded>
      
      
      
      
      <comments>https://eugenepage.com/2026/06/27/20250801.UEConceptScrapbox/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Obsidian 学习路径与功能笔记</title>
      <link>https://eugenepage.com/zh-CN/2026/05/08/20260509.ObsidianFunctionLearning/</link>
      <guid>https://eugenepage.com/zh-CN/2026/05/08/20260509.ObsidianFunctionLearning/</guid>
      <pubDate>Fri, 08 May 2026 16:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Obsidian-学习路径与功能笔记&quot;&gt;&lt;a href=&quot;#Obsidian-学习路径与功能笔记&quot; class=&quot;headerlink&quot; title=&quot;Obsidian Learning Path and Feature Notes&quot;&gt;&lt;/a&gt;Obsidian Learning Path and Feature Notes&lt;/h1&gt;&lt;blockquo</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="Obsidian-学习路径与功能笔记"><a href="#Obsidian-学习路径与功能笔记" class="headerlink" title="Obsidian Learning Path and Feature Notes"></a>Obsidian Learning Path and Feature Notes</h1><blockquote><p>Goal: turn Obsidian into a knowledge base that compounds over the long term while spending as little time as possible on tinkering; get the fundamentals solid first, then add plugins and methodologies as needed.</p></blockquote><hr><h2 id="0-为什么是-Obsidian"><a href="#0-为什么是-Obsidian" class="headerlink" title="0. Why Obsidian"></a>0. Why Obsidian</h2><ul><li><strong>Local-first</strong>: every note is a plain-text <code>.md</code> file that syncs freely through Git or a cloud drive, and it is naturally compatible with this repository's Hexo blog (<code>notes/_posts/**/*.md</code> can be consumed directly by the blog engine).</li><li><strong>Link-driven</strong>: use <code>[[wikilink]]</code> to connect fragments into a network that grows more valuable the longer it accumulates.</li><li><strong>Plugin ecosystem</strong>: core plugins + community plugins ≈ a “programmable note system”.</li><li><strong>Zero lock-in</strong>: you can leave at any time; the files are the data.</li></ul><h2 id="1-我自己的文件目录路径"><a href="#1-我自己的文件目录路径" class="headerlink" title="1. My vault directory layout"></a>1. My vault directory layout</h2><p>The vault root <code>C:\Users\youdr\iCloudDrive\Doc\notes\</code> contains four hidden folders, each serving a different toolchain:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">notes/</span><br><span class="line">├── .claude/       # vault-level configuration for Claude Code</span><br><span class="line">├── .claudian/     # runtime data for the Claudian plugin</span><br><span class="line">├── .obsidian/     # all configuration for Obsidian itself</span><br><span class="line">└── .omc/          # state storage for oh-my-claudecode (OMC)</span><br></pre></td></tr></table></figure><div class="canvas-embed" data-canvas-slug="attachments/Canvas/20260509.ObsidianFunctionLearning-vault-directory-map"><svg xmlns="http://www.w3.org/2000/svg" class="canvas-svg" width="249" height="280" viewBox="-40 -980 1680 1890" preserveAspectRatio="xMidYMid meet" role="img" aria-label="20260509.ObsidianFunctionLearning-vault-directory-map"><title>20260509.ObsidianFunctionLearning-vault-directory-map</title><defs><marker id="canvas-arrow" viewBox="0 0 10 10" refX="9" 
refY="5" markerWidth="6" markerHeight="6" orient="auto-start-reverse"><path d="M 0 0 L 10 5 L 0 10 z" /></marker></defs><g class="canvas-groups"></g><g class="canvas-edges"><g class="canvas-edge-group" data-id="b1b2c3d4e5f60001"><path class="canvas-edge" d="M 340 0 C 610.7192067232927 0, 209.2807932767073 -800, 480 -800" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60002"><path class="canvas-edge" d="M 340 0 C 432.6162932629987 0, 387.3837067370013 -240, 480 -240" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60003"><path class="canvas-edge" d="M 340 0 C 432.6162932629987 0, 387.3837067370013 240, 480 240" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60004"><path class="canvas-edge" d="M 340 0 C 604.154836748786 0, 215.845163251214 780, 480 780" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60005"><path class="canvas-edge" d="M 760 -800 C 818.3333333333334 -800, 841.6666666666666 -905, 900 -905" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60006"><path class="canvas-edge" d="M 760 -800 C 806.9337594677625 -800, 853.0662405322375 -815, 900 -815" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60007"><path class="canvas-edge" d="M 760 -800 C 812.941267247562 -800, 847.058732752438 -725, 900 -725" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60008"><path class="canvas-edge" d="M 760 -240 C 817.3488351136175 -240, 842.6511648863825 -340, 900 -340" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60009"><path class="canvas-edge" d="M 760 -240 C 810.1386965216377 -240, 849.8613034783623 -185, 900 -185" fill="none" marker-end="url(#canvas-arrow)" /></g><g 
class="canvas-edge-group" data-id="b1b2c3d4e5f60010"><path class="canvas-edge" d="M 760 240 C 814.3366563401008 240, 845.6633436598992 156.5, 900 156.5" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60011"><path class="canvas-edge" d="M 760 240 C 852.6162932629987 240, 807.3837067370013 480, 900 480" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60012"><path class="canvas-edge" d="M 1140 480 C 1205.659052011974 480, 1254.340947988026 400, 1320 400" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60013"><path class="canvas-edge" d="M 1140 480 C 1200.2079728939614 480, 1259.7920271060386 495, 1320 495" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60014"><path class="canvas-edge" d="M 1140 480 C 1209.462219947249 480, 1250.537780052751 585, 1320 585" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60015"><path class="canvas-edge" d="M 760 780 C 809.0181372328425 780, 850.9818627671575 735, 900 735" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60016"><path class="canvas-edge" d="M 760 780 C 810.1386965216377 780, 849.8613034783623 835, 900 835" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60017"><path class="canvas-edge" d="M 1140 -725 C 1224.8528137423857 -725, 1235.1471862576143 -905, 1320 -905" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60018"><path class="canvas-edge" d="M 1140 -725 C 1207.0820393249937 -725, 1252.9179606750063 -815, 1320 -815" fill="none" marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60019"><path class="canvas-edge" d="M 1140 -725 C 1200 -725, 1260 -725, 1320 -725" fill="none" 
marker-end="url(#canvas-arrow)" /></g><g class="canvas-edge-group" data-id="b1b2c3d4e5f60020"><path class="canvas-edge" d="M 1140 -725 C 1207.0820393249937 -725, 1252.9179606750063 -635, 1320 -635" fill="none" marker-end="url(#canvas-arrow)" /></g></g><g class="canvas-nodes"><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60001" data-color="4"><rect class="canvas-node__bg" x="0" y="-40" width="340" height="80" rx="8" /><foreignObject x="0" y="-40" width="340" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong>notes&#x2F; root</strong><br><code>C:\Users\youdr\iCloudDrive\Doc\notes</code></p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60002" data-color="5"><rect class="canvas-node__bg" x="480" y="-840" width="280" height="80" rx="8" /><foreignObject x="480" y="-840" width="280" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>.claude/</code></strong><br>Claude Code vault-level config</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60003"><rect class="canvas-node__bg" x="900" y="-940" width="240" height="70" rx="8" /><foreignObject x="900" y="-940" width="240" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>agents/</code></strong><br>Custom agent roles (empty)</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60004"><rect class="canvas-node__bg" x="900" y="-850" width="240" height="70" rx="8" /><foreignObject x="900" y="-850" width="240" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>commands/</code></strong><br>Custom slash commands (empty)</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60005"><rect class="canvas-node__bg" x="900" y="-760" width="240" height="70" rx="8" /><foreignObject x="900" y="-760" width="240" height="70"><div 
xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>skills/</code></strong><br>Vault-specific skill packs (4)</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60006" data-color="6"><rect class="canvas-node__bg" x="480" y="-280" width="280" height="80" rx="8" /><foreignObject x="480" y="-280" width="280" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>.claudian/</code></strong><br>Claudian plugin runtime data</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60008"><rect class="canvas-node__bg" x="900" y="-220" width="240" height="70" rx="8" /><foreignObject x="900" y="-220" width="240" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>sessions/</code></strong><br>Conversation history metadata (4 entries)</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60009" data-color="2"><rect class="canvas-node__bg" x="480" y="200" width="280" height="80" rx="8" /><foreignObject x="480" y="200" width="280" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>.obsidian/</code></strong><br>Obsidian core configuration</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60010"><rect class="canvas-node__bg" x="900" y="60" width="280" height="193" rx="8" /><foreignObject x="900" y="60" width="280" height="193"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong>Config files</strong><br>• <code>app.json</code> app-level settings<br>• <code>appearance.json</code> appearance and theme<br>• <code>core-plugins.json</code> core plugins<br>• <code>community-plugins.json</code> community plugins<br>• <code>hotkeys.json</code> hotkey bindings<br>• <code>daily-notes.json</code> daily notes path<br>• <code>graph.json</code> graph view parameters<br>• <code>workspace.json</code> window layout ⚠️ not tracked</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60011"><rect 
class="canvas-node__bg" x="900" y="440" width="240" height="80" rx="8" /><foreignObject x="900" y="440" width="240" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>plugins/</code></strong><br>Installed plugins (3)</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60012"><rect class="canvas-node__bg" x="1320" y="360" width="280" height="80" rx="8" /><foreignObject x="1320" y="360" width="280" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong>obsidian-image-auto-upload</strong><br>Auto-uploads images to a PicList image host</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60013"><rect class="canvas-node__bg" x="1320" y="460" width="240" height="70" rx="8" /><foreignObject x="1320" y="460" width="240" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong>pdf-plus&#x2F;</strong><br>Enhanced PDF reading and annotation</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60014"><rect class="canvas-node__bg" x="1320" y="550" width="240" height="70" rx="8" /><foreignObject x="1320" y="550" width="240" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong>realclaudian&#x2F;</strong><br>The Claudian plugin itself (this AI)</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60015" data-color="1"><rect class="canvas-node__bg" x="480" y="740" width="280" height="80" rx="8" /><foreignObject x="480" y="740" width="280" height="80"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>.omc/</code></strong><br>oh-my-claudecode state storage</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60016"><rect class="canvas-node__bg" x="900" y="700" width="260" height="70" rx="8" /><foreignObject x="900" y="700" width="260" height="70"><div xmlns="http://www.w3.org/1999/xhtml" 
class="canvas-node__body"><p><strong><code>sessions/</code></strong><br>OMC session context snapshots</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60017"><rect class="canvas-node__bg" x="900" y="800" width="260" height="70" rx="8" /><foreignObject x="900" y="800" width="260" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>state/sessions/</code></strong><br>State shared between agents</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60018"><rect class="canvas-node__bg" x="1320" y="-940" width="260" height="70" rx="8" /><foreignObject x="1320" y="-940" width="260" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>defuddle</code></strong><br>Extracts web pages as clean Markdown</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60019"><rect class="canvas-node__bg" x="1320" y="-850" width="260" height="70" rx="8" /><foreignObject x="1320" y="-850" width="260" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>json-canvas</code></strong><br>Reads and writes .canvas whiteboard files</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60020"><rect class="canvas-node__bg" x="1320" y="-760" width="260" height="70" rx="8" /><foreignObject x="1320" y="-760" width="260" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>obsidian-cli</code></strong><br>Calls the obsidian:&#x2F;&#x2F; URI protocol</p></div></foreignObject></g><g class="canvas-node canvas-node--text" data-id="a1b2c3d4e5f60021"><rect class="canvas-node__bg" x="1320" y="-670" width="260" height="70" rx="8" /><foreignObject x="1320" y="-670" width="260" height="70"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>obsidian-markdown</code></strong><br>Obsidian-specific Markdown syntax</p></div></foreignObject></g><g class="canvas-node 
canvas-node--text" data-id="a1b2c3d4e5f60007"><rect class="canvas-node__bg" x="900" y="-400" width="280" height="120" rx="8" /><foreignObject x="900" y="-400" width="280" height="120"><div xmlns="http://www.w3.org/1999/xhtml" class="canvas-node__body"><p><strong><code>claudian-settings.json</code></strong><br>Model: sonnet · Permissions: yolo<br>Thinking budget: low · Effort: high<br>UI position: right sidebar</p></div></foreignObject></g></g></svg><span class="canvas-embed__expand" aria-hidden="true" title="Click to enlarge">⛶</span></div><hr><h2 id="1-学习路径总览（建议按周推进）"><a href="#1-学习路径总览（建议按周推进）" class="headerlink" title="1. Learning path overview (best taken week by week)"></a>1. Learning path overview (best taken week by week)</h2><table><thead><tr><th>Stage</th><th>Duration</th><th>Core goal</th><th>Key deliverable</th></tr></thead><tbody><tr><td><strong>W1 Basics</strong></td><td>3–5 days</td><td>Learn vaults, Markdown, wikilinks, and tags</td><td>A first note with links</td></tr><tr><td><strong>W2 Organization</strong></td><td>1 week</td><td>Folder strategy, templates, daily notes</td><td>A personal PKM structure takes shape</td></tr><tr><td><strong>W3 Advanced</strong></td><td>1 week</td><td>Dataview, Templater, Graph View</td><td>Automated index pages</td></tr><tr><td><strong>W4 Workflow</strong></td><td>1 week</td><td>Integration with Hexo &#x2F; Git &#x2F; VSCode</td><td>A one-step note → blog pipeline</td></tr><tr><td><strong>Ongoing</strong></td><td>-</td><td>Methodologies (Zettelkasten &#x2F; PARA &#x2F; Johnny Decimal)</td><td>Second-order notes that compound</td></tr></tbody></table><hr><h2 id="2-W1：基础——把”骨架”立起来"><a href="#2-W1：基础——把”骨架”立起来" class="headerlink" title="2. W1: Basics (build the skeleton)"></a>2. 
W1: Basics (build the skeleton)</h2><h3 id="2-1-核心概念"><a href="#2-1-核心概念" class="headerlink" title="2.1 Core concepts"></a>2.1 Core concepts</h3><ul><li><strong>Vault</strong>: one folder &#x3D; one vault. All <code>.md</code> files and the <code>.obsidian/</code> configuration live inside it.</li><li><strong>Note</strong>: one <code>.md</code> file &#x3D; one note. Suggested naming: <code>date prefix + topic</code>, e.g. <code>20260509.ObsidianFunctionLearning.md</code>.</li><li><strong>Frontmatter (YAML metadata)</strong>: the <code>---</code> block at the very top of the file, holding <code>title / tags / date / status</code>; it is consumed by Dataview, themes, and Hexo alike.</li><li><strong>Link (wikilink)</strong>: <code>[[filename]]</code> or <code>[[filename|display text]]</code>; <code>[[A#Heading]]</code> jumps to a specific section; <code>[[A#^block-id]]</code> references a block.</li><li><strong>Backlink</strong>: the right-hand panel automatically lists “who links to me”; this is Obsidian's defining feature.</li><li><strong>Tag</strong>: <code>#topic/subtopic</code> supports hierarchy; tags complement folders rather than replace them.</li></ul><h3 id="2-2-必须背下来的快捷键"><a href="#2-2-必须背下来的快捷键" class="headerlink" title="2.2 Shortcuts worth memorizing"></a>2.2 Shortcuts worth memorizing</h3><table><thead><tr><th>Action</th><th>Shortcut</th></tr></thead><tbody><tr><td>Command palette</td><td><code>Ctrl + P</code></td></tr><tr><td>Quick switcher</td><td><code>Ctrl + O</code></td></tr><tr><td>New note</td><td><code>Ctrl + N</code></td></tr><tr><td>Wikilink autocomplete</td><td>Type <code>[[</code> to trigger</td></tr><tr><td>Toggle source&#x2F;preview</td><td><code>Ctrl + E</code></td></tr><tr><td>Open today's Daily Note</td><td><code>Ctrl + Shift + D</code> (after enabling the Daily Notes plugin)</td></tr><tr><td>Toggle source mode (my own binding)</td><td><code>Ctrl + &#x2F;</code></td></tr></tbody></table><h3 id="2-3-W1-练习"><a href="#2-3-W1-练习" class="headerlink" title="2.3 W1 exercises"></a>2.3 W1 exercises</h3><ol><li>Open <code>notes/</code> as a vault and write 3 notes (on any topic).</li><li>Link two of them to each other with <code>[[]]</code>.</li><li>Add <code>tags: [...]</code> to each note and check the Backlinks panel on the right.</li></ol><hr><h2 id="3-W2：组织——确立结构与模板"><a href="#3-W2：组织——确立结构与模板" class="headerlink" title="3. W2: Organization (settle structure and templates)"></a>3. 
W2: Organization (settle structure and templates)</h2><h3 id="3-1-文件夹策略（与本仓库现状对齐）"><a href="#3-1-文件夹策略（与本仓库现状对齐）" class="headerlink" title="3.1 Folder strategy (aligned with this repository's current state)"></a>3.1 Folder strategy (aligned with this repository's current state)</h3><p>The directories that already exist in this repository can be used as they are:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">notes/</span><br><span class="line">├── _posts/        # articles published by Hexo (bilingual)</span><br><span class="line">│   ├── zh-CN/</span><br><span class="line">│   └── en/</span><br><span class="line">├── Learning/      # study notes / personal drafts (this file lives here)</span><br><span class="line">├── AIdocs/        # project-level architecture, decisions, roadmap</span><br><span class="line">└── about/         # about page</span><br></pre></td></tr></table></figure><p>I suggest subdividing <code>Learning/</code> further:</p><ul><li><code>daily/</code>: daily notes (created automatically)</li><li><code>topic/</code>: long-form topic notes (moved to <code>_posts/</code> once mature)</li><li><code>inbox/</code>: temporary drafts, not yet sorted</li></ul><h3 id="3-2-三种主流方法论（任选其一即可，别全上）"><a href="#3-2-三种主流方法论（任选其一即可，别全上）" class="headerlink" title="3.2 Three mainstream methodologies (pick one, not all)"></a>3.2 Three mainstream methodologies (pick one, not all)</h3><table><thead><tr><th>Method</th><th>In one sentence</th><th>Best for</th></tr></thead><tbody><tr><td><strong>Zettelkasten</strong></td><td>One idea per card; structure comes from links, not categories</td><td>Long-term writers and researchers</td></tr><tr><td><strong>PARA</strong></td><td>Project &#x2F; Area &#x2F; Resource &#x2F; Archive</td><td>Project-driven workers</td></tr><tr><td><strong>Johnny Decimal</strong></td><td><code>10-19 / 11.01</code> numbering scheme</td><td>People who like structure and indexes</td></tr></tbody></table><blockquote><p><strong>Suggestion</strong>: you already have a “projects + resources” layout in <code>_posts / Learning / AIdocs</code>, so <strong>start with PARA</strong> and consider Zettelkasten only once you have more than 500 notes.</p></blockquote><h3 id="3-3-必装核心插件（自带）"><a href="#3-3-必装核心插件（自带）" class="headerlink" title="3.3 Core plugins to enable (built in)"></a>3.3 Core plugins to enable (built in)</h3><p>Go to <code>Settings → Core plugins</code> and turn these on:</p><ul><li>✅ <strong>Daily 
Notes</strong>: one timeline note per day</li><li>✅ <strong>Templates</strong>: inserts template content</li><li>✅ <strong>Outline</strong>: outline in the right sidebar</li><li>✅ <strong>Backlinks &#x2F; Outgoing Links</strong>: incoming &#x2F; outgoing link panels</li><li>✅ <strong>Graph View</strong>: knowledge graph</li><li>✅ <strong>Tag Pane</strong>: tag panel</li><li>✅ <strong>File Recovery</strong>: automatic backups, <strong>strongly recommended</strong></li><li>⚠️ <strong>Workspaces</strong>: switch between layouts (enable it once you are more advanced)</li></ul><hr><h2 id="4-W3：进阶——让笔记自己动起来"><a href="#4-W3：进阶——让笔记自己动起来" class="headerlink" title="4. W3: Advanced (make the notes work for you)"></a>4. W3: Advanced (make the notes work for you)</h2><h3 id="4-1-必装社区插件（短列表，不要贪多）"><a href="#4-1-必装社区插件（短列表，不要贪多）" class="headerlink" title="4.1 Must-have community plugins (a short list; resist installing more)"></a>4.1 Must-have community plugins (a short list; resist installing more)</h3><table><thead><tr><th>Plugin</th><th>Purpose</th></tr></thead><tbody><tr><td><strong>Dataview</strong></td><td>Queries note metadata with a SQL-like language and generates index pages automatically</td></tr><tr><td><strong>Templater</strong></td><td>Far more capable than the built-in Templates plugin; supports JavaScript</td></tr><tr><td><strong>Excalidraw</strong></td><td>Hand-drawn sketches &#x2F; flowcharts, with wikilink support</td></tr><tr><td><strong>Advanced Tables</strong></td><td>Table editor (anyone who writes Markdown tables needs it)</td></tr><tr><td><strong>Obsidian Git</strong></td><td>Automatic vault commit &#x2F; push (a good fit for this repository)</td></tr><tr><td><strong>Iconize</strong> &#x2F; <strong>Iconic</strong></td><td>Adds icons to folders&#x2F;files so they are easier to scan</td></tr><tr><td><strong>Style Settings</strong></td><td>Fine-tunes theme details (fonts, spacing, colors)</td></tr><tr><td><strong>Linter</strong></td><td>Consistent Markdown style, YAML key sorting</td></tr></tbody></table><h3 id="4-2-Dataview-入门示例"><a href="#4-2-Dataview-入门示例" class="headerlink" title="4.2 A first Dataview example"></a>4.2 A first Dataview example</h3><p>In any note, write:</p><figure class="highlight markdown"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="code">```dataview</span></span><br><span class="line"><span class="code">TABLE date, status, file.tags AS tags</span></span><br><span 
class="line"><span class="code">FROM &quot;Learning&quot;</span></span><br><span class="line"><span class="code">WHERE status = &quot;in-progress&quot;</span></span><br><span class="line"><span class="code">SORT date DESC</span></span><br><span class="line"><span class="code">```</span></span><br></pre></td></tr></table></figure><p>→ Automatically lists every note under <code>Learning/</code> with <code>status: in-progress</code>.</p><h3 id="4-3-Templater-模板示例"><a href="#4-3-Templater-模板示例" class="headerlink" title="4.3 A Templater template example"></a>4.3 A Templater template example</h3><p>In <code>notes/Templates/learning.md</code>:</p><figure class="highlight markdown"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">---</span><br><span class="line">title: &lt;% tp.file.title %&gt;</span><br><span class="line">date: &lt;% tp.date.now(&quot;YYYY-MM-DD&quot;) %&gt;</span><br><span class="line">tags: []</span><br><span class="line"><span class="section">status: in-progress</span></span><br><span class="line"><span class="section">---</span></span><br><span class="line"></span><br><span class="line"><span class="section"># &lt;% tp.file.title %&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="section">## Background</span></span><br><span class="line"></span><br><span class="line"><span class="section">## Content</span></span><br><span class="line"></span><br><span class="line"><span class="section">## References</span></span><br></pre></td></tr></table></figure><p>→ Apply it in one step when creating a note; <code>title / date</code> are filled in automatically.</p><hr><h2 
id="5-W4：工作流——把-Obsidian-嵌进现有管线"><a href="#5-W4：工作流——把-Obsidian-嵌进现有管线" class="headerlink" title="5. W4: Workflow (embed Obsidian in the existing pipeline)"></a>5. W4: Workflow (embed Obsidian in the existing pipeline)</h2><p>This repository combines three pieces, <strong>a Hexo blog + Git version control + Obsidian notes</strong>, and the intended flow is:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Draft (Learning/inbox/) →</span><br><span class="line">Mature (Learning/topic/) →</span><br><span class="line">Publish (_posts/zh-CN/ and _posts/en/) →</span><br><span class="line">Blog goes live (Hexo build)</span><br></pre></td></tr></table></figure><h3 id="5-1-与-Hexo-兼容的-frontmatter"><a href="#5-1-与-Hexo-兼容的-frontmatter" class="headerlink" title="5.1 Hexo-compatible frontmatter"></a>5.1 Hexo-compatible frontmatter</h3><p>Fields a blog post needs (see the existing articles in <code>notes/_posts/</code>):</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">title:</span> <span class="string">Post title</span></span><br><span class="line"><span class="attr">date:</span> <span class="number">2026-05-09 12:00:00</span></span><br><span class="line"><span class="attr">categories:</span> [<span class="string">Category</span>]</span><br><span class="line"><span class="attr">tags:</span> [<span class="string">tag1</span>, <span class="string">tag2</span>]</span><br><span class="line"><span class="attr">lang:</span> <span class="string">zh-CN</span></span><br><span class="line"><span class="meta">---</span></span><br></pre></td></tr></table></figure><h3 id="5-2-与-Git-联动"><a href="#5-2-与-Git-联动" class="headerlink" title="5.2 Git integration"></a>5.2 Git 
integration</h3><ul><li>Use the <code>Obsidian Git</code> plugin for “automatic commits”.</li><li>This repository already has its own commit conventions (see the style in <code>git log</code>), so I suggest:<ul><li><strong>While writing</strong>: commit manually.</li><li><strong>Before bed each day</strong>: push in one step with <code>Obsidian Git</code>.</li></ul></li></ul><h3 id="5-3-与博客主题（hexo-theme-magnetic）的注意事项"><a href="#5-3-与博客主题（hexo-theme-magnetic）的注意事项" class="headerlink" title="5.3 Notes on the blog theme (hexo-theme-magnetic)"></a>5.3 Notes on the blog theme (hexo-theme-magnetic)</h3><ul><li>The <code>tag-graph.js</code> in your current theme and Obsidian's Graph View are <strong>two separate graphs</strong> that do not affect each other.</li><li>A <code>[[wikilink]]</code> in a note is <strong>not</strong> converted into a hyperlink when the blog renders (unless you install a Hexo filter plugin that handles wikilinks, in the style of extensions such as <code>hexo-filter-github-emojis</code>). If the note is going to the blog, convert the links to standard Markdown links.</li></ul><hr><h2 id="6-进阶专题（按需展开）"><a href="#6-进阶专题（按需展开）" class="headerlink" title="6. Advanced topics (expand as needed)"></a>6. Advanced topics (expand as needed)</h2><h3 id="6-1-Canvas（白板）"><a href="#6-1-Canvas（白板）" class="headerlink" title="6.1 Canvas (whiteboard)"></a>6.1 Canvas (whiteboard)</h3><p>A built-in feature: choose <code>New canvas</code>, drag notes in as cards, and draw connections between them. Well suited to <strong>knowledge maps and project boards</strong>.</p><h3 id="6-2-Sync-方案对比"><a href="#6-2-Sync-方案对比" class="headerlink" title="6.2 Sync options compared"></a>6.2 Sync options compared</h3><table><thead><tr><th>Method</th><th>Cost</th><th>Pros</th><th>Pitfalls</th></tr></thead><tbody><tr><td><strong>Obsidian Sync (official)</strong></td><td>$4&#x2F;month</td><td>End-to-end encrypted, most reliable</td><td>Paid</td></tr><tr><td><strong>iCloud &#x2F; OneDrive</strong></td><td>Free</td><td>Simple</td><td><code>.obsidian/</code> is prone to conflicts</td></tr><tr><td><strong>Git (recommended for a setup like yours)</strong></td><td>Free</td><td>Full version history</td><td>Large files need LFS</td></tr><tr><td><strong>Syncthing</strong></td><td>Free</td><td>Fast on a LAN</td><td>Setup takes some fiddling</td></tr></tbody></table><h3 id="6-3-移动端"><a href="#6-3-移动端" class="headerlink" title="6.3 Mobile"></a>6.3 Mobile</h3><ul><li>The iOS &#x2F; Android apps are free.</li><li>Mobile + iCloud &#x2F; Git across devices → capture quickly on the phone, organize in depth on the computer.</li></ul><hr><h2 id="7-路径布置建议（针对本仓库）"><a href="#7-路径布置建议（针对本仓库）" class="headerlink" title="7. Path layout recommendations (for this repository)"></a>7. 
Path layout recommendations (for this repository)</h2><blockquote><p><strong>Key issue</strong>: <code>D:\Project\UGit\EugenePage\.obsidian</code> already exists at the repository root, which means the vault currently opened is the <strong>entire repository</strong> rather than <code>notes/</code>.</p></blockquote><p>There are two options; <strong>pick one</strong>:</p><h3 id="方案-A：把-notes-单独作为-Vault（推荐）"><a href="#方案-A：把-notes-单独作为-Vault（推荐）" class="headerlink" title="Option A: make notes/ its own vault (recommended)"></a>Option A: make <code>notes/</code> its own vault (recommended)</h3><ul><li>On the Obsidian start screen → “Open folder as vault” → choose <code>D:\Project\UGit\EugenePage\notes</code>.</li><li>Pro: the vault scope stays clean; you see only notes, without noise from <code>themes/</code> and <code>scripts/</code>.</li><li>Steps: move the root <code>.obsidian/</code> to <code>notes/.obsidian/</code> (or delete it and recreate it), then <strong>ignore</strong> <code>notes/.obsidian/workspace.json</code> in <code>.gitignore</code> (it is a personal layout and need not be tracked) while <strong>keeping</strong> the core plugin configuration tracked.</li></ul><h3 id="方案-B：保持仓库根作为-Vault"><a href="#方案-B：保持仓库根作为-Vault" class="headerlink" title="Option B: keep the repository root as the vault"></a>Option B: keep the repository root as the vault</h3><ul><li>Pro: you can edit theme code and notes side by side.</li><li>Con: Graph View scans every <code>.md</code> file, which produces a lot of noise.</li><li>Required: under <code>Settings → Files and links → Excluded files</code> in Obsidian, exclude <code>themes/</code>, <code>node_modules/</code>, and <code>public/</code>.</li></ul><h3 id="gitignore-建议（任一方案都加）"><a href="#gitignore-建议（任一方案都加）" class="headerlink" title=".gitignore recommendations (add them under either option)"></a><code>.gitignore</code> recommendations (add them under either option)</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"># Obsidian personal config (not shared with the team)</span><br><span class="line">.obsidian/workspace.json</span><br><span class="line">.obsidian/workspace-mobile.json</span><br><span class="line">.obsidian/cache</span><br><span class="line"># Case by case: do not push plugin data that contains anything sensitive</span><br><span class="line">.obsidian/plugins/*/data.json</span><br></pre></td></tr></table></figure><p>However, 
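<code>.obsidian/core-plugins.json</code>, <code>community-plugins.json</code>, <code>appearance.json</code>, and <code>hotkeys.json</code> <strong>should be tracked</strong> to make multi-machine sync easier.</p><hr><h2 id="8-插件-流程"><a href="#8-插件-流程" class="headerlink" title="8. Plugins&#x2F;workflows"></a>8. Plugins&#x2F;workflows</h2><h3 id="Image-Auto-Upload"><a href="#Image-Auto-Upload" class="headerlink" title="Image Auto Upload"></a>Image Auto Upload</h3><p>Automatically uploads images to an image host when you paste or drag them in, and is compatible with the PicGo ecosystem. Under the hood it relies on the command-line interface of <strong>PicList</strong> (a community-enhanced version of PicGo), so the image host must be configured beforehand. It fills the same role as Typora's image upload and is a baseline capability for note-taking software.</p><h3 id="Obsidian-CLI-Claudian"><a href="#Obsidian-CLI-Claudian" class="headerlink" title="Obsidian CLI + Claudian"></a>Obsidian CLI + Claudian</h3><p>The bridge that lets an AI (Claude Code) read and write the vault directly. Enable it in two steps:</p><ol><li><strong>Enable the CLI</strong>: <code>Settings → About → Obsidian CLI</code> → click Register → restart Obsidian.</li><li><strong>Install the plugin</strong>: search for <strong>Claudian</strong> in the community plugin browser and install it.<blockquote><p>Prerequisite: Claude Code is already configured on this machine; Claudian detects it and connects automatically.<br><img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo@main/Images/20260515-040708.png" alt="image.png"></p></blockquote></li></ol><p>Overall, the experience is much like using GitHub Copilot in VSCode. The Yolo option in the input box is effectively automatic editing.<br>Claude also has a Plan mode: press Shift + Tab to switch to it right in the chat box,<br>then work out each step together with it before letting it execute.<br>Typing a slash still invokes Claude's commands as well.</p><p>Reference tutorial for this part:<br><a href="https://www.bilibili.com/video/BV1xFwxzKE5D">https://www.bilibili.com/video/BV1xFwxzKE5D</a></p><h3 id="配套-Claude-Code-Skills（kepano-obsidian-skills）"><a href="#配套-Claude-Code-Skills（kepano-obsidian-skills）" class="headerlink" title="Companion Claude Code Skills (kepano&#x2F;obsidian-skills)"></a>Companion Claude Code Skills (kepano&#x2F;obsidian-skills)</h3><p>Obsidian CEO Steph Ango published a set of official Agent Skills at <a href="https://github.com/kepano/obsidian-skills">kepano&#x2F;obsidian-skills</a> so that Claude Code genuinely “understands” Obsidian's file formats and protocols. To install, place each skill folder under the vault's <code>.claude/skills/&lt;name&gt;/</code>; the skills <strong>apply only to Claude Code 
launched in that vault</strong> and do not pollute the global setup or other projects.</p><table><thead><tr><th>Skill</th><th>Purpose</th><th>Installed by me?</th></tr></thead><tbody><tr><td><strong>obsidian-markdown</strong></td><td>Reads and writes Obsidian Flavored Markdown: <code>[[wikilink]]</code>, <code>![[embed]]</code>, callouts (<code>&gt; [!note]</code>), properties frontmatter, and other Obsidian-specific syntax. Without it, Claude treats <code>.md</code> files as generic Markdown and may break that syntax.</td><td>Not installed (planned)</td></tr><tr><td><strong>obsidian-bases</strong></td><td>Reads and writes <code>.base</code> files (the database views introduced in Obsidian 1.9+, supporting views &#x2F; filters &#x2F; formulas &#x2F; summaries)</td><td>Not installed (not needed yet; this vault has no <code>.base</code> files, so I will add it once I actually use Bases)</td></tr><tr><td><strong>json-canvas</strong></td><td>Reads and writes <code>.canvas</code> files (the JSON format for whiteboards, covering nodes, edges, and groups) so that Claude can generate or modify a Canvas directly</td><td>Not installed (planned)</td></tr><tr><td><strong>obsidian-cli</strong></td><td>Teaches Claude to call Obsidian's <strong>built-in</strong> <code>obsidian://</code> URI protocol (e.g. <code>obsidian://open?vault=...&amp;file=...</code>) and the optional HTTP API. <strong>No extra command-line binary needs to be installed</strong>: every call goes through capabilities Obsidian already ships with.</td><td>Not installed (planned)</td></tr><tr><td><strong>defuddle</strong></td><td>Uses the Defuddle library to extract clean Markdown from web pages, stripping navigation bars, ads, recommendations, and other noise to save tokens; well suited to the “web clipping → note” scenario</td><td>Not installed (planned)</td></tr></tbody></table><blockquote><p><strong>A common misconception about <code>obsidian-cli</code></strong>: this skill is not the same as “installing a standalone CLI tool”. The <code>obsidian://</code> URI protocol has been a <strong>built-in default feature</strong> since Obsidian 1.0; the skill merely teaches Claude to use it for actions such as “open a note, trigger a command, jump to a specific block”. To enable the HTTP API part, you additionally need the community plugin <strong>Local REST API</strong> (optional).</p></blockquote><blockquote><p><strong>Relationship to Claudian</strong>: <code>Claudian</code> is a plugin on the Obsidian side that provides the entry point for “talking to Claude Code inside the Obsidian UI”; the skills above are capability packs on the Claude Code side that make Claude more proficient when reading and writing vault files. The two complement each other and do not conflict.</p></blockquote><p>Below I go through the skills I recommend, one by one.</p><h3 id="Advanced-Canvas-插件-json-canvas"><a href="#Advanced-Canvas-插件-json-canvas" class="headerlink" title="Advanced Canvas plugin + json-canvas"></a>Advanced Canvas plugin + json-canvas</h3><p>For example, the top of this post introduces the vault's file paths with a mind map, and that mind map was drawn with JSON Canvas.
<br>When you run into a difficult article or a complex architecture, you can ask Claude to organize it into a mind map to make it easier to understand.</p><p><strong>Advanced Canvas</strong> provides 30+ enhancements: custom flowchart node styles, Graph View integration, and a slideshow presentation mode; it also supports Portals (canvases nested inside canvases) and embedding a single node into Markdown.</p><h2 id="9-参考资源"><a href="#9-参考资源" class="headerlink" title="9. References"></a>9. References</h2><ul><li>Official documentation: <a href="https://help.obsidian.md/">https://help.obsidian.md/</a></li><li>Official forum: <a href="https://forum.obsidian.md/">https://forum.obsidian.md/</a></li><li>Chinese community: the Obsidian series on sspai (sspai.com)</li><li>YouTube: Linking Your Thinking, Nicole van der Hoeven, Bryan Jenks</li><li>Methodology:<ul><li>Sönke Ahrens, “How to Take Smart Notes” (the standard introduction to Niklas Luhmann's Zettelkasten method)</li><li>Tiago Forte, “Building a Second Brain” (by the creator of PARA)</li></ul></li></ul>]]></content:encoded>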
      
      
      
      <category domain="https://eugenepage.com/tags/Obsidian/">Obsidian</category>
      
      <category domain="https://eugenepage.com/tags/PKM/">PKM</category>
      
      <category domain="https://eugenepage.com/tags/NoteTaking/">NoteTaking</category>
      
      <category domain="https://eugenepage.com/tags/LearningPath/">LearningPath</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/05/08/20260509.ObsidianFunctionLearning/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Houdini MCP Project Comparison</title>
      <link>https://eugenepage.com/2026/05/02/20260502.HoudiniMCPComparison/</link>
      <guid>https://eugenepage.com/2026/05/02/20260502.HoudiniMCPComparison/</guid>
      <pubDate>Sat, 02 May 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Houdini-MCP-Project-Comparison-capoomgit-houdini-mcp-vs-healkeiser-fxhoudinimcp&quot;&gt;&lt;a href=&quot;#Houdini-MCP-Project-Comparison-capoomgit-</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="Houdini-MCP-Project-Comparison-capoomgit-houdini-mcp-vs-healkeiser-fxhoudinimcp"><a href="#Houdini-MCP-Project-Comparison-capoomgit-houdini-mcp-vs-healkeiser-fxhoudinimcp" class="headerlink" title="Houdini MCP Project Comparison: capoomgit&#x2F;houdini-mcp vs healkeiser&#x2F;fxhoudinimcp"></a>Houdini MCP Project Comparison: capoomgit&#x2F;houdini-mcp vs healkeiser&#x2F;fxhoudinimcp</h1><h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>As the MCP (Model Context Protocol) standard gains traction, more and more DCC applications are adding AI assistant integrations. In the Houdini ecosystem, two major open-source MCP projects currently exist:</p><ol><li><strong><a href="https://github.com/capoomgit/houdini-mcp">capoomgit&#x2F;houdini-mcp</a></strong> — an early-stage project with a clean, minimal structure</li><li><strong><a href="https://github.com/healkeiser/fxhoudinimcp">healkeiser&#x2F;fxhoudinimcp</a></strong> — a newer, more feature-complete implementation</li></ol><p>This post compares the two across architecture design, feature coverage, installation experience, and extensibility to help you pick the right one for your workflow.</p><hr><h2 id="Overview-Comparison"><a href="#Overview-Comparison" class="headerlink" title="Overview Comparison"></a>Overview Comparison</h2><table><thead><tr><th>Dimension</th><th>houdini-mcp (capoomgit)</th><th>fxhoudinimcp (healkeiser)</th></tr></thead><tbody><tr><td>Focus</td><td>Lightweight MCP bridge</td><td>Full-featured Houdini MCP server</td></tr><tr><td>Tool count</td><td>Unspecified; covers basic operations</td><td><strong>168 tools</strong> + 8 resources + 6 workflow prompts</td></tr><tr><td>Architecture</td><td>Custom TCP socket (port 9876)</td><td>Houdini’s built-in <code>hwebserver</code> (port 8100)</td></tr><tr><td>Installation</td><td>Manual file copy to Houdini directory</td><td>PyPI package, <code>pip install 
fxhoudinimcp</code></td></tr><tr><td>Package manager dependency</td><td>Requires <code>uv</code></td><td>Standard <code>pip</code> works fine</td></tr><tr><td>Thread safety</td><td>Not explicitly addressed</td><td><code>hdefereval.executeInMainThreadWithResult()</code></td></tr><tr><td>License</td><td>Not specified</td><td>MIT</td></tr><tr><td>Maintenance status</td><td>Community-maintained</td><td>Actively developed</td></tr></tbody></table><hr><h2 id="Architecture-Comparison"><a href="#Architecture-Comparison" class="headerlink" title="Architecture Comparison"></a>Architecture Comparison</h2><h3 id="houdini-mcp-capoomgit"><a href="#houdini-mcp-capoomgit" class="headerlink" title="houdini-mcp (capoomgit)"></a>houdini-mcp (capoomgit)</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Claude Desktop  ──(stdio)──&gt;  MCP Bridge Script  ──(TCP :9876)──&gt;  Houdini Plugin</span><br></pre></td></tr></table></figure><ul><li><strong>Communication</strong>: The MCP Bridge Script talks to Claude via stdin&#x2F;stdout and to Houdini via a custom TCP socket.</li><li><strong>Server side</strong>: A hand-rolled <code>HoudiniMCPServer</code> listening on <code>localhost:9876</code>.</li><li><strong>Inspired by</strong>: Adapted from <a href="https://github.com/ahujasid/blender-mcp">blender-mcp</a>.</li></ul><h3 id="fxhoudinimcp-healkeiser"><a href="#fxhoudinimcp-healkeiser" class="headerlink" title="fxhoudinimcp (healkeiser)"></a>fxhoudinimcp (healkeiser)</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Claude Desktop / Cursor / Claude Code  ──(stdio/streamable-http)──&gt;  FXHoudini MCP Server  ──(HTTP :8100)──&gt;  Houdini hwebserver</span><br></pre></td></tr></table></figure><ul><li><strong>Communication</strong>: The MCP Server talks to AI clients via stdio or 
streamable-http, and talks to Houdini via HTTP&#x2F;JSON.</li><li><strong>Server side</strong>: Uses Houdini’s built-in <code>hwebserver</code> directly — no custom server process needed.</li><li><strong>Thread safety</strong>: Uses <code>hdefereval.executeInMainThreadWithResult()</code> to ensure all <code>hou.*</code> API calls run on the main thread.</li></ul><h3 id="Architecture-Analysis"><a href="#Architecture-Analysis" class="headerlink" title="Architecture Analysis"></a>Architecture Analysis</h3><table><thead><tr><th>Aspect</th><th>houdini-mcp</th><th>fxhoudinimcp</th></tr></thead><tbody><tr><td>Server implementation</td><td>Custom socket</td><td>Houdini native <code>hwebserver</code></td></tr><tr><td>Transport protocol</td><td>TCP</td><td>HTTP &#x2F; JSON</td></tr><tr><td>MCP transport</td><td>stdio</td><td>stdio + streamable-http</td></tr><tr><td>Thread safety</td><td>Unknown</td><td>Explicitly guaranteed</td></tr><tr><td>Dependency complexity</td><td>Requires a separate Bridge Script process</td><td>MCP Server communicates directly with hwebserver</td></tr></tbody></table><p><strong>Verdict</strong>: fxhoudinimcp’s architecture is more robust — it reuses Houdini’s native components, reducing the surface area for custom-code bugs.</p><hr><h2 id="Feature-Coverage-Comparison"><a href="#Feature-Coverage-Comparison" class="headerlink" title="Feature Coverage Comparison"></a>Feature Coverage Comparison</h2><h3 id="houdini-mcp-Feature-Set"><a href="#houdini-mcp-Feature-Set" class="headerlink" title="houdini-mcp Feature Set"></a>houdini-mcp Feature Set</h3><p>Provides basic Houdini control:</p><ul><li>Create and modify nodes</li><li>Execute Python &#x2F; HScript code</li><li>Basic scene operations</li><li><strong>OPUS integration</strong>: Connects to the OPUS procedural furniture and environment asset library via RapidAPI (exclusive feature)</li></ul><h3 id="fxhoudinimcp-Feature-Set-19-categories-168-tools"><a 
href="#fxhoudinimcp-Feature-Set-19-categories-168-tools" class="headerlink" title="fxhoudinimcp Feature Set (19 categories, 168 tools)"></a>fxhoudinimcp Feature Set (20 categories, 168 tools)</h3><table><thead><tr><th>Category</th><th>Tools</th><th>Description</th></tr></thead><tbody><tr><td>Scene Management</td><td>7</td><td>Open, save, import&#x2F;export, scene info</td></tr><tr><td>Node Operations</td><td>16</td><td>Create, delete, copy, connect, layout, flag</td></tr><tr><td>Parameters</td><td>10</td><td>Get&#x2F;set values, expressions, keyframes, spare parameters</td></tr><tr><td>Geometry (SOPs)</td><td>12</td><td>Points, primitives, attributes, groups, sampling, nearest-point lookup</td></tr><tr><td>LOPs&#x2F;USD</td><td>18</td><td>Stage inspection, Prim, layers, composition, variants, lights</td></tr><tr><td>DOPs</td><td>8</td><td>Simulation info, DOP objects, step&#x2F;reset, memory usage</td></tr><tr><td>PDG&#x2F;TOPs</td><td>10</td><td>Cook, Work Items, scheduler, dependency graph</td></tr><tr><td>COPs (Copernicus)</td><td>7</td><td>Image nodes, layers, VDB data</td></tr><tr><td>HDAs</td><td>10</td><td>Create, install, and manage digital assets</td></tr><tr><td>Animation</td><td>9</td><td>Keyframes, playbar control, frame range</td></tr><tr><td>Rendering</td><td>9</td><td>Viewport screenshots, render nodes, settings, render launch</td></tr><tr><td>VEX</td><td>5</td><td>Create&#x2F;edit Wrangle nodes, validate VEX code</td></tr><tr><td>Code Execution</td><td>4</td><td>Python, HScript, expressions, environment variables</td></tr><tr><td>Viewport&#x2F;UI</td><td>11</td><td>Pane management, screenshots, status messages, error detection</td></tr><tr><td>Scene Context</td><td>8</td><td>Network overview, Cook chain, selection, scene summary, error analysis</td></tr><tr><td>Workflows</td><td>8</td><td>One-click Pyro&#x2F;RBD&#x2F;FLIP&#x2F;Vellum setup, SOP chains, render configuration</td></tr><tr><td>Materials</td><td>4</td><td>List, inspect, create materials 
and shader networks</td></tr><tr><td>CHOPs</td><td>4</td><td>Channel data, CHOP nodes, export channels to parameters</td></tr><tr><td>Cache</td><td>4</td><td>List, inspect, clear, write file caches</td></tr><tr><td>Takes</td><td>4</td><td>List, create, switch Takes and parameter overrides</td></tr></tbody></table><p><strong>Highlights</strong>:</p><ul><li><strong>One-click workflows</strong>: Instant Pyro, RBD, FLIP, and Vellum simulation setup</li><li><strong>Full USD&#x2F;LOPs support</strong>: 18 tools covering the USD pipeline</li><li><strong>Copernicus (COPs) support</strong>: Image processing node operations</li><li><strong>Scene context analysis</strong>: Error detection and Cook chain tracing</li></ul><hr><h2 id="Installation-and-Configuration-Comparison"><a href="#Installation-and-Configuration-Comparison" class="headerlink" title="Installation and Configuration Comparison"></a>Installation and Configuration Comparison</h2><h3 id="houdini-mcp-Installation-Steps"><a href="#houdini-mcp-Installation-Steps" class="headerlink" title="houdini-mcp Installation Steps"></a>houdini-mcp Installation Steps</h3><ol><li>Install <code>uv</code> (Python package manager)</li><li>Manually create the Houdini scripts directory and copy files</li><li>Run <code>uv add &quot;mcp[cli]&quot;</code> in the directory</li><li>Manually create a Shelf Tool</li><li>(Optional) Create a Houdini Package JSON for auto-loading</li><li>Configure <code>claude_desktop_config.json</code></li></ol><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span 
class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;houdini&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;command&quot;</span><span class="punctuation">:</span> <span class="string">&quot;uv&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="string">&quot;run&quot;</span><span class="punctuation">,</span> <span class="string">&quot;python&quot;</span><span class="punctuation">,</span> <span class="string">&quot;C:/path/to/houdini_mcp_server.py&quot;</span><span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="fxhoudinimcp-Installation-Steps"><a href="#fxhoudinimcp-Installation-Steps" class="headerlink" title="fxhoudinimcp Installation Steps"></a>fxhoudinimcp Installation Steps</h3><ol><li><code>pip install fxhoudinimcp</code> (or <code>uv pip install fxhoudinimcp</code>)</li><li>Copy the Package JSON to the Houdini packages directory</li><li>Configure the MCP client</li></ol><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;fxhoudini&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;command&quot;</span><span class="punctuation">:</span> <span class="string">&quot;python&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="string">&quot;-m&quot;</span><span class="punctuation">,</span> <span class="string">&quot;fxhoudinimcp&quot;</span><span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="attr">&quot;HOUDINI_HOST&quot;</span><span class="punctuation">:</span> <span class="string">&quot;localhost&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="attr">&quot;HOUDINI_PORT&quot;</span><span class="punctuation">:</span> <span class="string">&quot;8100&quot;</span></span><br><span class="line">      <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p><strong>Claude Code support</strong> (one-liner):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">claude mcp add --scope user fxhoudini -- python -m 
fxhoudinimcp</span><br></pre></td></tr></table></figure><h3 id="Installation-Experience-Comparison"><a href="#Installation-Experience-Comparison" class="headerlink" title="Installation Experience Comparison"></a>Installation Experience Comparison</h3><table><thead><tr><th>Aspect</th><th>houdini-mcp</th><th>fxhoudinimcp</th></tr></thead><tbody><tr><td>Installation steps</td><td>5-6 steps, multiple manual operations</td><td>2-3 steps, standardized process</td></tr><tr><td>Package manager</td><td>Requires <code>uv</code></td><td>Standard <code>pip</code> or <code>uv</code> both work</td></tr><tr><td>PyPI package</td><td>No</td><td>Yes</td></tr><tr><td>Auto-start</td><td>Requires manual Package configuration</td><td><code>uiready.py</code> handles auto-start</td></tr><tr><td>Documentation quality</td><td>Basic README</td><td>Detailed categorized docs + environment variable reference</td></tr></tbody></table><hr><h2 id="Client-Support-Comparison"><a href="#Client-Support-Comparison" class="headerlink" title="Client Support Comparison"></a>Client Support Comparison</h2><table><thead><tr><th>AI Client</th><th>houdini-mcp</th><th>fxhoudinimcp</th></tr></thead><tbody><tr><td>Claude Desktop</td><td>Supported</td><td>Supported</td></tr><tr><td>Cursor</td><td>Supported</td><td>Supported</td></tr><tr><td>VS Code</td><td>Not mentioned</td><td>Supported</td></tr><tr><td>Claude Code CLI</td><td>Not mentioned</td><td>Supported (one-liner)</td></tr></tbody></table><hr><h2 id="Exclusive-Features"><a href="#Exclusive-Features" class="headerlink" title="Exclusive Features"></a>Exclusive Features</h2><h3 id="Exclusive-to-houdini-mcp"><a href="#Exclusive-to-houdini-mcp" class="headerlink" title="Exclusive to houdini-mcp"></a>Exclusive to houdini-mcp</h3><ul><li><strong>OPUS integration</strong>: Access to the OPUS procedural asset library (furniture and environment assets) via RapidAPI. 
Requires a RapidAPI account and an active API subscription.</li></ul><h3 id="Exclusive-to-fxhoudinimcp"><a href="#Exclusive-to-fxhoudinimcp" class="headerlink" title="Exclusive to fxhoudinimcp"></a>Exclusive to fxhoudinimcp</h3><ul><li><strong>One-click simulation workflows</strong>: Pyro &#x2F; RBD &#x2F; FLIP &#x2F; Vellum setup in a single call</li><li><strong>Deep USD&#x2F;LOPs support</strong>: 18 dedicated tools</li><li><strong>Copernicus image processing</strong>: COPs node operations</li><li><strong>Scene error analysis</strong>: Automatic Cook error detection and reporting</li><li><strong>Environment variable configuration</strong>: <code>HOUDINI_HOST</code>, <code>HOUDINI_PORT</code>, <code>FXHOUDINIMCP_AUTOSTART</code>, and more</li><li><strong>Dual transport mode</strong>: stdio + streamable-http</li></ul><hr><h2 id="Recommendations-by-Use-Case"><a href="#Recommendations-by-Use-Case" class="headerlink" title="Recommendations by Use Case"></a>Recommendations by Use Case</h2><h3 id="Choose-houdini-mcp-capoomgit-if-you"><a href="#Choose-houdini-mcp-capoomgit-if-you" class="headerlink" title="Choose houdini-mcp (capoomgit) if you:"></a>Choose houdini-mcp (capoomgit) if you:</h3><ul><li>Only need basic AI control of Houdini</li><li>Are already using a <code>uv</code>-based workflow</li><li>Specifically need OPUS procedural asset library integration</li><li>Have a simple project scope and want to get started quickly</li></ul><h3 id="Choose-fxhoudinimcp-healkeiser-if-you"><a href="#Choose-fxhoudinimcp-healkeiser-if-you" class="headerlink" title="Choose fxhoudinimcp (healkeiser) if you:"></a>Choose fxhoudinimcp (healkeiser) if you:</h3><ul><li>Need comprehensive Houdini coverage (SOPs, LOPs, DOPs, TOPs, COPs, etc.)</li><li>Work with USD&#x2F;LOPs pipelines</li><li>Want one-click simulation workflows (Pyro &#x2F; FLIP &#x2F; Vellum &#x2F; RBD)</li><li>Prefer a standardized installation via a PyPI package</li><li>Use Claude Code CLI as your primary AI 
tool</li><li>Need guaranteed thread safety</li><li>Value active maintenance and long-term project evolution</li></ul><hr><h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><table><thead><tr><th>Evaluation Dimension</th><th>houdini-mcp</th><th>fxhoudinimcp</th><th>Winner</th></tr></thead><tbody><tr><td>Feature richness</td><td>Basic</td><td>168 tools</td><td>fxhoudinimcp</td></tr><tr><td>Architecture robustness</td><td>Custom socket</td><td>Native hwebserver</td><td>fxhoudinimcp</td></tr><tr><td>Installation convenience</td><td>Multi-step manual</td><td>One-liner pip</td><td>fxhoudinimcp</td></tr><tr><td>Client compatibility</td><td>Desktop + Cursor</td><td>Desktop + Cursor + VSCode + Claude Code</td><td>fxhoudinimcp</td></tr><tr><td>Asset ecosystem</td><td>OPUS integration</td><td>None</td><td>houdini-mcp</td></tr><tr><td>Documentation quality</td><td>Basic</td><td>Comprehensive</td><td>fxhoudinimcp</td></tr><tr><td>Maintenance activity</td><td>Community-maintained</td><td>Actively developed</td><td>fxhoudinimcp</td></tr></tbody></table><p><strong>Overall recommendation</strong>: For most users, <strong>fxhoudinimcp</strong> is the better choice — broader feature coverage, a more robust architecture, and a smoother installation process. If you specifically need the OPUS procedural asset library integration, <strong>houdini-mcp</strong> is worth a look as a complementary tool.</p><hr><h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><ul><li><a href="https://github.com/capoomgit/houdini-mcp">capoomgit&#x2F;houdini-mcp</a></li><li><a href="https://github.com/healkeiser/fxhoudinimcp">healkeiser&#x2F;fxhoudinimcp</a></li><li><a href="https://github.com/ahujasid/blender-mcp">blender-mcp</a> — the project that inspired houdini-mcp</li></ul>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/Houdini/">Houdini</category>
      
      <category domain="https://eugenepage.com/tags/MCP/">MCP</category>
      
      
      <comments>https://eugenepage.com/2026/05/02/20260502.HoudiniMCPComparison/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Houdini MCP Project Comparison Review</title>
      <link>https://eugenepage.com/zh-CN/2026/05/02/20260502.HoudiniMCPComparison/</link>
      <guid>https://eugenepage.com/zh-CN/2026/05/02/20260502.HoudiniMCPComparison/</guid>
      <pubDate>Sat, 02 May 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Houdini-MCP-项目对比评测：capoomgit-houdini-mcp-vs-healkeiser-fxhoudinimcp&quot;&gt;&lt;a href=&quot;#Houdini-MCP-项目对比评测：capoomgit-houdini-mcp-vs-healkeise</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="Houdini-MCP-项目对比评测：capoomgit-houdini-mcp-vs-healkeiser-fxhoudinimcp"><a href="#Houdini-MCP-项目对比评测：capoomgit-houdini-mcp-vs-healkeiser-fxhoudinimcp" class="headerlink" title="Houdini MCP 项目对比评测：capoomgit&#x2F;houdini-mcp vs healkeiser&#x2F;fxhoudinimcp"></a>Houdini MCP Project Comparison Review: capoomgit&#x2F;houdini-mcp vs healkeiser&#x2F;fxhoudinimcp</h1><h2 id="引言"><a href="#引言" class="headerlink" title="引言"></a>Introduction</h2><p>As the MCP (Model Context Protocol) standard gains traction, more and more DCC applications are adding AI assistant integrations. In the Houdini ecosystem, there are currently two major open-source MCP projects:</p><ol><li><strong><a href="https://github.com/capoomgit/houdini-mcp">capoomgit&#x2F;houdini-mcp</a></strong>: an early project with a clean, minimal structure</li><li><strong><a href="https://github.com/healkeiser/fxhoudinimcp">healkeiser&#x2F;fxhoudinimcp</a></strong>: a newer, feature-complete implementation</li></ol><p>This post compares the two across architecture design, feature coverage, installation experience, and extensibility to help you pick the right one for your workflow.</p><hr><h2 id="总览对比"><a href="#总览对比" class="headerlink" title="总览对比"></a>Overview Comparison</h2><table><thead><tr><th>Dimension</th><th>houdini-mcp (capoomgit)</th><th>fxhoudinimcp (healkeiser)</th></tr></thead><tbody><tr><td>Focus</td><td>Lightweight MCP bridge</td><td>Full-featured Houdini MCP server</td></tr><tr><td>Tool count</td><td>Not explicitly categorized; mostly basic operations</td><td><strong>168 tools</strong> + 8 resources + 6 workflow prompts</td></tr><tr><td>Architecture</td><td>Custom TCP socket (port 9876)</td><td>Houdini’s built-in <code>hwebserver</code> (port 8100)</td></tr><tr><td>Installation</td><td>Manual file copy to the Houdini directory</td><td>Published on PyPI, <code>pip install fxhoudinimcp</code></td></tr><tr><td>Package manager dependency</td><td>Requires <code>uv</code></td><td>Standard <code>pip</code> is enough</td></tr><tr><td>Thread safety</td><td>Not explicitly addressed</td><td><code>hdefereval.executeInMainThreadWithResult()</code></td></tr><tr><td>License</td><td>Not specified</td><td>MIT</td></tr><tr><td>Maintenance status</td><td>Community-maintained</td><td>Actively developed</td></tr></tbody></table><hr><h2 id="架构设计对比"><a href="#架构设计对比" class="headerlink" title="架构设计对比"></a>Architecture Comparison</h2><h3 id="houdini-mcp（capoomgit）"><a href="#houdini-mcp（capoomgit）" class="headerlink" title="houdini-mcp（capoomgit）"></a>houdini-mcp (capoomgit)</h3><figure class="highlight plaintext"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Claude Desktop  ──(stdio)──&gt;  MCP Bridge Script  ──(TCP :9876)──&gt;  Houdini Plugin</span><br></pre></td></tr></table></figure><ul><li><strong>Communication</strong>: The MCP Bridge Script talks to Claude via stdin&#x2F;stdout and to Houdini via a custom TCP socket.</li><li><strong>Server side</strong>: A hand-rolled <code>HoudiniMCPServer</code> listening on <code>localhost:9876</code>.</li><li><strong>Inspired by</strong>: Adapted from <a href="https://github.com/ahujasid/blender-mcp">blender-mcp</a>.</li></ul><h3 id="fxhoudinimcp（healkeiser）"><a href="#fxhoudinimcp（healkeiser）" class="headerlink" title="fxhoudinimcp（healkeiser）"></a>fxhoudinimcp (healkeiser)</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Claude Desktop / Cursor / Claude Code  ──(stdio/streamable-http)──&gt;  FXHoudini MCP Server  ──(HTTP :8100)──&gt;  Houdini hwebserver</span><br></pre></td></tr></table></figure><ul><li><strong>Communication</strong>: The MCP Server talks to AI clients via stdio or streamable-http, and to Houdini via HTTP&#x2F;JSON.</li><li><strong>Server side</strong>: Uses Houdini’s built-in <code>hwebserver</code> directly, so no custom server process needs to be started.</li><li><strong>Thread safety</strong>: Uses <code>hdefereval.executeInMainThreadWithResult()</code> to ensure <code>hou.*</code> API calls run on the main thread.</li></ul><h3 id="架构差异分析"><a href="#架构差异分析" class="headerlink" title="架构差异分析"></a>Architecture Analysis</h3><table><thead><tr><th>Aspect</th><th>houdini-mcp</th><th>fxhoudinimcp</th></tr></thead><tbody><tr><td>Server implementation</td><td>Custom socket</td><td>Houdini native <code>hwebserver</code></td></tr><tr><td>Transport protocol</td><td>TCP</td><td>HTTP &#x2F; JSON</td></tr><tr><td>MCP transport</td><td>stdio</td><td>stdio + streamable-http</td></tr><tr><td>Thread safety</td><td>Unknown</td><td>Explicitly guaranteed</td></tr><tr><td>Dependency complexity</td><td>Requires running a separate Bridge Script</td><td>MCP Server communicates directly with hwebserver</td></tr></tbody></table><p><strong>Verdict</strong>: fxhoudinimcp’s architecture is more robust because it reuses Houdini’s native components, which reduces the potential for problems introduced by custom code.</p><hr><h2 
id="功能覆盖对比"><a href="#功能覆盖对比" class="headerlink" title="功能覆盖对比"></a>Feature Coverage Comparison</h2><h3 id="houdini-mcp-功能范围"><a href="#houdini-mcp-功能范围" class="headerlink" title="houdini-mcp 功能范围"></a>houdini-mcp Feature Set</h3><p>Provides basic Houdini control:</p><ul><li>Create and modify nodes</li><li>Execute Python &#x2F; HScript code</li><li>Basic scene operations</li><li><strong>OPUS integration</strong>: Connects to the OPUS procedural furniture and environment asset library via RapidAPI (exclusive feature)</li></ul><h3 id="fxhoudinimcp-功能范围（19-个分类，168-个工具）"><a href="#fxhoudinimcp-功能范围（19-个分类，168-个工具）" class="headerlink" title="fxhoudinimcp 功能范围（19 个分类，168 个工具）"></a>fxhoudinimcp Feature Set (20 categories, 168 tools)</h3><table><thead><tr><th>Category</th><th>Tools</th><th>Description</th></tr></thead><tbody><tr><td>Scene Management</td><td>7</td><td>Open, save, import&#x2F;export, scene info</td></tr><tr><td>Node Operations</td><td>16</td><td>Create, delete, copy, connect, layout, flag</td></tr><tr><td>Parameters</td><td>10</td><td>Get&#x2F;set values, expressions, keyframes, spare parameters</td></tr><tr><td>Geometry (SOPs)</td><td>12</td><td>Points, primitives, attributes, groups, sampling, nearest-point lookup</td></tr><tr><td>LOPs&#x2F;USD</td><td>18</td><td>Stage inspection, Prim, layers, composition, variants, lights</td></tr><tr><td>DOPs</td><td>8</td><td>Simulation info, DOP objects, step&#x2F;reset, memory usage</td></tr><tr><td>PDG&#x2F;TOPs</td><td>10</td><td>Cook, Work Items, scheduler, dependency graph</td></tr><tr><td>COPs (Copernicus)</td><td>7</td><td>Image nodes, layers, VDB data</td></tr><tr><td>HDAs</td><td>10</td><td>Create, install, and manage digital assets</td></tr><tr><td>Animation</td><td>9</td><td>Keyframes, playbar control, frame range</td></tr><tr><td>Rendering</td><td>9</td><td>Viewport screenshots, render nodes, settings, render launch</td></tr><tr><td>VEX</td><td>5</td><td>Create&#x2F;edit Wrangle nodes, validate VEX code</td></tr><tr><td>Code Execution</td><td>4</td><td>Python, HScript, expressions, environment variables</td></tr><tr><td>Viewport&#x2F;UI</td><td>11</td><td>Pane management, screenshots, status messages, error detection</td></tr><tr><td>Scene Context</td><td>8</td><td>Network overview, Cook chain, selection, scene summary, error analysis</td></tr><tr><td>Workflows</td><td>8</td><td>One-click Pyro&#x2F;RBD&#x2F;FLIP&#x2F;Vellum setup, SOP chains, render configuration</td></tr><tr><td>Materials</td><td>4</td><td>List, inspect, create materials and shader networks</td></tr><tr><td>CHOPs</td><td>4</td><td>Channel data, CHOP nodes, export channels to parameters</td></tr><tr><td>Cache</td><td>4</td><td>List, inspect, clear, write file caches</td></tr><tr><td>Takes</td><td>4</td><td>List, create, switch Takes
and parameter overrides</td></tr></tbody></table><p><strong>Highlights</strong>:</p><ul><li><strong>One-click workflows</strong>: Instant Pyro, RBD, FLIP, and Vellum simulation setup</li><li><strong>Full USD&#x2F;LOPs support</strong>: 18 tools covering the USD workflow</li><li><strong>Copernicus (COPs) support</strong>: Image processing node operations</li><li><strong>Scene context analysis</strong>: Error detection and Cook chain tracing</li></ul><hr><h2 id="安装与配置对比"><a href="#安装与配置对比" class="headerlink" title="安装与配置对比"></a>Installation and Configuration Comparison</h2><h3 id="houdini-mcp-安装步骤"><a href="#houdini-mcp-安装步骤" class="headerlink" title="houdini-mcp 安装步骤"></a>houdini-mcp Installation Steps</h3><ol><li>Install <code>uv</code> (Python package manager)</li><li>Manually create the Houdini scripts directory and copy files</li><li>Run <code>uv add &quot;mcp[cli]&quot;</code> in the directory</li><li>Manually create a Shelf Tool</li><li>(Optional) Create a Houdini Package JSON for auto-loading</li><li>Configure <code>claude_desktop_config.json</code></li></ol><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;houdini&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;command&quot;</span><span class="punctuation">:</span> <span class="string">&quot;uv&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="string">&quot;run&quot;</span><span class="punctuation">,</span> <span class="string">&quot;python&quot;</span><span class="punctuation">,</span> <span 
class="string">&quot;C:/path/to/houdini_mcp_server.py&quot;</span><span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="fxhoudinimcp-安装步骤"><a href="#fxhoudinimcp-安装步骤" class="headerlink" title="fxhoudinimcp 安装步骤"></a>fxhoudinimcp Installation Steps</h3><ol><li><code>pip install fxhoudinimcp</code> (or <code>uv pip install fxhoudinimcp</code>)</li><li>Copy the Package JSON to the Houdini packages directory</li><li>Configure the MCP client</li></ol><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;fxhoudini&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;command&quot;</span><span class="punctuation">:</span> <span class="string">&quot;python&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="string">&quot;-m&quot;</span><span class="punctuation">,</span> <span class="string">&quot;fxhoudinimcp&quot;</span><span 
class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="attr">&quot;HOUDINI_HOST&quot;</span><span class="punctuation">:</span> <span class="string">&quot;localhost&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="attr">&quot;HOUDINI_PORT&quot;</span><span class="punctuation">:</span> <span class="string">&quot;8100&quot;</span></span><br><span class="line">      <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p><strong>Claude Code support</strong> (one-line command):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">claude mcp add --scope user fxhoudini -- python -m fxhoudinimcp</span><br></pre></td></tr></table></figure><h3 id="安装体验对比"><a href="#安装体验对比" class="headerlink" title="安装体验对比"></a>Installation Experience Comparison</h3><table><thead><tr><th>Aspect</th><th>houdini-mcp</th><th>fxhoudinimcp</th></tr></thead><tbody><tr><td>Installation steps</td><td>5-6 steps, several of them manual</td><td>2-3 steps, standardized process</td></tr><tr><td>Package management</td><td>Hard dependency on <code>uv</code></td><td>Works with standard <code>pip</code> &#x2F; <code>uv</code></td></tr><tr><td>Published on PyPI</td><td>No</td><td>Yes</td></tr><tr><td>Auto-start</td><td>Requires manual Package configuration</td><td>Started automatically by <code>uiready.py</code></td></tr><tr><td>Documentation quality</td><td>Basic README</td><td>Detailed, categorized docs plus an environment variable reference</td></tr></tbody></table><hr><h2 id="客户端支持对比"><a href="#客户端支持对比" class="headerlink" title="客户端支持对比"></a>Client Support Comparison</h2><table><thead><tr><th>AI Client</th><th>houdini-mcp</th><th>fxhoudinimcp</th></tr></thead><tbody><tr><td>Claude 
Desktop</td><td>Supported</td><td>Supported</td></tr><tr><td>Cursor</td><td>Supported</td><td>Supported</td></tr><tr><td>VS Code</td><td>Not mentioned</td><td>Supported</td></tr><tr><td>Claude Code CLI</td><td>Not mentioned</td><td>Supported (one-line command)</td></tr></tbody></table><hr><h2 id="独有功能"><a href="#独有功能" class="headerlink" title="独有功能"></a>Unique Features</h2><h3 id="houdini-mcp-独有"><a href="#houdini-mcp-独有" class="headerlink" title="houdini-mcp 独有"></a>Unique to houdini-mcp</h3><ul><li><strong>OPUS integration</strong>: Connects to the OPUS procedural asset library through RapidAPI to fetch furniture and environment assets. Requires registering a RapidAPI account and subscribing to the API.</li></ul><h3 id="fxhoudinimcp-独有"><a href="#fxhoudinimcp-独有" class="headerlink" title="fxhoudinimcp 独有"></a>Unique to fxhoudinimcp</h3><ul><li><strong>One-click simulation workflows</strong>: One-step setup for Pyro &#x2F; RBD &#x2F; FLIP &#x2F; Vellum</li><li><strong>In-depth USD&#x2F;LOPs support</strong>: 18 tools</li><li><strong>Copernicus image processing</strong>: COPs node operations</li><li><strong>Scene error analysis</strong>: Automatically detects and reports cook errors</li><li><strong>Environment variable configuration</strong>: <code>HOUDINI_HOST</code>, <code>HOUDINI_PORT</code>, <code>FXHOUDINIMCP_AUTOSTART</code>, and others</li><li><strong>Dual transport modes</strong>: stdio + streamable-http</li></ul><hr><h2 id="适用场景建议"><a href="#适用场景建议" class="headerlink" title="适用场景建议"></a>Recommended Use Cases</h2><h3 id="选择-houdini-mcp（capoomgit）的情况"><a href="#选择-houdini-mcp（capoomgit）的情况" class="headerlink" title="选择 houdini-mcp（capoomgit）的情况"></a>When to Choose houdini-mcp (capoomgit)</h3><ul><li>You only need basic AI control of Houdini</li><li>You already use a <code>uv</code> workflow</li><li>You need integration with the OPUS procedural asset library</li><li>Your project structure is simple and you want to get started quickly</li></ul><h3 id="选择-fxhoudinimcp（healkeiser）的情况"><a href="#选择-fxhoudinimcp（healkeiser）的情况" class="headerlink" title="选择 fxhoudinimcp（healkeiser）的情况"></a>When to Choose fxhoudinimcp (healkeiser)</h3><ul><li>You need broad coverage of Houdini features (SOPs, LOPs, DOPs, TOPs, COPs, and more)</li><li>You need USD&#x2F;LOPs workflow support</li><li>You need one-click simulation workflows (Pyro &#x2F; FLIP &#x2F; Vellum &#x2F; RBD)</li><li>You prefer a standardized install (PyPI package)</li><li>You use the Claude Code CLI as your main AI tool</li><li>You need thread-safety guarantees</li><li>You value active maintenance and long-term evolution of the project</li></ul><hr><h2 id="结论"><a href="#结论" class="headerlink" 
title="结论"></a>Conclusion</h2><table><thead><tr><th>Criterion</th><th>houdini-mcp</th><th>fxhoudinimcp</th><th>Winner</th></tr></thead><tbody><tr><td>Feature richness</td><td>Basic</td><td>168 tools</td><td>fxhoudinimcp</td></tr><tr><td>Architectural robustness</td><td>Custom socket</td><td>Native hwebserver</td><td>fxhoudinimcp</td></tr><tr><td>Ease of installation</td><td>Manual, multi-step</td><td>One-step pip install</td><td>fxhoudinimcp</td></tr><tr><td>Client compatibility</td><td>Desktop + Cursor</td><td>Desktop + Cursor + VSCode + Claude Code</td><td>fxhoudinimcp</td></tr><tr><td>Asset ecosystem</td><td>OPUS integration</td><td>None</td><td>houdini-mcp</td></tr><tr><td>Documentation quality</td><td>Basic</td><td>Comprehensive</td><td>fxhoudinimcp</td></tr><tr><td>Maintenance activity</td><td>Community-maintained</td><td>Actively developed</td><td>fxhoudinimcp</td></tr></tbody></table><p><strong>Overall recommendation</strong>: For most users, <strong>fxhoudinimcp</strong> is the better choice: broader feature coverage, a more robust architecture, and a simpler installation process. If you specifically need integration with the OPUS procedural asset library, <strong>houdini-mcp</strong> is also worth a look.</p><hr><h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>References</h2><ul><li><a href="https://github.com/capoomgit/houdini-mcp">capoomgit&#x2F;houdini-mcp</a></li><li><a href="https://github.com/healkeiser/fxhoudinimcp">healkeiser&#x2F;fxhoudinimcp</a></li><li><a href="https://github.com/ahujasid/blender-mcp">blender-mcp</a>: the inspiration for houdini-mcp</li></ul>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/Houdini/">Houdini</category>
      
      <category domain="https://eugenepage.com/tags/MCP/">MCP</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/05/02/20260502.HoudiniMCPComparison/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>AI Agent Framework Research Notes</title>
      <link>https://eugenepage.com/2026/04/30/20260430.AIAgentFrameworkResearchNotes/</link>
      <guid>https://eugenepage.com/2026/04/30/20260430.AIAgentFrameworkResearchNotes/</guid>
      <pubDate>Thu, 30 Apr 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;AI-Agent-Framework-Research-Notes&quot;&gt;&lt;a href=&quot;#AI-Agent-Framework-Research-Notes&quot; class=&quot;headerlink&quot; title=&quot;AI Agent Framework Researc</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="AI-Agent-Framework-Research-Notes"><a href="#AI-Agent-Framework-Research-Notes" class="headerlink" title="AI Agent Framework Research Notes"></a>AI Agent Framework Research Notes</h1><blockquote><p>Last updated: 2026-04-30</p><p>As AI Agent technology evolves at a rapid pace, new agent development frameworks keep appearing. This document surveys and compares six of the most widely adopted Agent frameworks available today, helping developers choose the right tool for their use case.</p></blockquote><hr><h2 id="Table-of-Contents"><a href="#Table-of-Contents" class="headerlink" title="Table of Contents"></a>Table of Contents</h2><ul><li><a href="#i-framework-overview-comparison">I. Framework Overview Comparison</a></li><li><a href="#ii-langgraph">II. LangGraph</a></li><li><a href="#iii-crewai">III. CrewAI</a></li><li><a href="#iv-llamaindex">IV. LlamaIndex</a></li><li><a href="#v-dify">V. Dify</a></li><li><a href="#vi-openai-agents-sdk">VI. OpenAI Agents SDK</a></li><li><a href="#vii-google-adk">VII. Google ADK</a></li><li><a href="#viii-framework-selection-guide">VIII. Framework Selection Guide</a></li></ul><hr><h2 id="I-Framework-Overview-Comparison"><a href="#I-Framework-Overview-Comparison" class="headerlink" title="I. Framework Overview Comparison"></a>I. 
Framework Overview Comparison</h2><table><thead><tr><th>Dimension</th><th>LangGraph</th><th>CrewAI</th><th>LlamaIndex</th><th>Dify</th><th>OpenAI Agents SDK</th><th>Google ADK</th></tr></thead><tbody><tr><td><strong>Developer</strong></td><td>LangChain Inc.</td><td>CrewAI Inc.</td><td>LlamaIndex Inc.</td><td>LangGenius</td><td>OpenAI</td><td>Google</td></tr><tr><td><strong>Latest Version</strong></td><td>v1.1.10</td><td>v1.14.3</td><td>v0.14.6</td><td>v1.6.0+</td><td>v0.14.6</td><td>v1.31.1</td></tr><tr><td><strong>License</strong></td><td>MIT</td><td>MIT</td><td>MIT</td><td>Dify License (Apache 2.0+)</td><td>MIT</td><td>Apache 2.0</td></tr><tr><td><strong>Language</strong></td><td>Python &#x2F; JS</td><td>Python</td><td>Python &#x2F; TS</td><td>Visual (multi-language SDK)</td><td>Python &#x2F; JS</td><td>Python &#x2F; Java &#x2F; Go &#x2F; TS</td></tr><tr><td><strong>Core Philosophy</strong></td><td>Graph orchestration</td><td>Role-playing teams</td><td>RAG + Agent</td><td>Low-code platform</td><td>Minimal multi-agent</td><td>Code-first</td></tr><tr><td><strong>Model Support</strong></td><td>Model-agnostic</td><td>Model-agnostic</td><td>Model-agnostic</td><td>Model-agnostic</td><td>100+ LLMs</td><td>Model-agnostic</td></tr><tr><td><strong>Learning Curve</strong></td><td>Steep</td><td>Moderate</td><td>Moderate</td><td>Low</td><td>Low</td><td>Moderate</td></tr><tr><td><strong>Best For</strong></td><td>Complex stateful workflows</td><td>Multi-role collaboration</td><td>RAG + Agent</td><td>Rapid prototyping &#x2F; non-technical users</td><td>OpenAI ecosystem apps</td><td>Google ecosystem apps</td></tr></tbody></table><hr><h2 id="II-LangGraph"><a href="#II-LangGraph" class="headerlink" title="II. LangGraph"></a>II. 
LangGraph</h2><h3 id="2-1-Introduction"><a href="#2-1-Introduction" class="headerlink" title="2.1 Introduction"></a>2.1 Introduction</h3><p><strong>LangGraph</strong> is a <strong>low-level orchestration framework</strong> developed by the LangChain team, specifically designed for building long-running, stateful AI Agents. It draws design inspiration from Google’s Pregel and Apache Beam.</p><p>Core positioning: rather than abstracting away prompts or architecture, LangGraph provides low-level infrastructure that gives developers fine-grained control over agent workflows. It is already used in production by companies such as Klarna, Replit, and Elastic.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Latest Version</td><td>v1.1.10 (2026-04-27)</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install -U langgraph</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/langchain-ai/langgraph">langchain-ai&#x2F;langgraph</a></td></tr><tr><td>Docs</td><td><a href="https://docs.langchain.com/oss/python/langgraph">docs.langchain.com&#x2F;oss&#x2F;python&#x2F;langgraph</a></td></tr></tbody></table><h3 id="2-2-Core-Architecture"><a href="#2-2-Core-Architecture" class="headerlink" title="2.2 Core Architecture"></a>2.2 Core Architecture</h3><p>LangGraph models agent workflows as a <strong>Graph</strong>, built from three core components:</p><ul><li><strong>State</strong>: A shared data structure, typically defined using <code>TypedDict</code> or a <code>Pydantic Model</code></li><li><strong>Nodes</strong>: Functions that encode agent logic — they receive the current state and return an updated state</li><li><strong>Edges</strong>: Functions that determine the next node, supporting conditional branching or fixed transitions</li></ul><h3 id="2-3-Key-Features"><a href="#2-3-Key-Features" class="headerlink" title="2.3 Key Features"></a>2.3 Key Features</h3><ul><li><strong>Persistence</strong>: Saves 
the graph state as a checkpoint after each execution step; supports in-memory, Redis, and other backends</li><li><strong>Human-in-the-Loop</strong>: Uses <code>interrupt()</code> to pause execution and wait for human input before resuming</li><li><strong>Streaming</strong>: Supports multiple streaming modes including values, messages, and updates</li><li><strong>Subgraphs</strong>: Supports nested graphs where subgraphs have their own independent checkpoints and interrupt capabilities</li><li><strong>Time Travel</strong>: Can rewind to any historical checkpoint, with support for forking and replaying</li><li><strong>Visualization</strong>: After compilation, can generate Mermaid diagrams to visualize the workflow structure</li></ul><h3 id="2-4-Code-Example"><a href="#2-4-Code-Example" class="headerlink" title="2.4 Code Example"></a>2.4 Code Example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span 
class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> typing <span class="keyword">import</span> <span class="type">Literal</span></span><br><span class="line"><span class="keyword">from</span> langgraph.graph <span class="keyword">import</span> StateGraph, MessagesState, START, END</span><br><span class="line"><span class="keyword">from</span> langchain.messages <span class="keyword">import</span> SystemMessage, HumanMessage, ToolMessage</span><br><span class="line"><span class="keyword">from</span> langchain.tools <span class="keyword">import</span> tool</span><br><span class="line"><span class="keyword">from</span> langchain.chat_models <span class="keyword">import</span> init_chat_model</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define tools (the @tool decorator provides the .name and .invoke() used below)</span></span><br><span class="line"><span class="meta">@tool</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">multiply</span>(<span class="params">a: <span class="built_in">int</span>, b: <span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Multiply two numbers.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> a * b</span><br><span class="line"></span><br><span class="line"><span class="meta">@tool</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">add</span>(<span class="params">a: <span class="built_in">int</span>, b: <span class="built_in">int</span></span>) -&gt; 
<span class="built_in">int</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Add two numbers.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> a + b</span><br><span class="line"></span><br><span class="line">tools = [multiply, add]</span><br><span class="line">tools_by_name = &#123;tool.name: tool <span class="keyword">for</span> tool <span class="keyword">in</span> tools&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Bind the tools to a chat model (assumption: the model name is a placeholder; any tool-calling model works)</span></span><br><span class="line">llm = init_chat_model(<span class="string">&quot;openai:gpt-4o-mini&quot;</span>)</span><br><span class="line">llm_with_tools = llm.bind_tools(tools)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define nodes</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">llm_call</span>(<span class="params">state: MessagesState</span>):</span><br><span class="line">    <span class="string">&quot;&quot;&quot;LLM decides whether to call a tool&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> &#123;</span><br><span class="line">        <span class="string">&quot;messages&quot;</span>: [</span><br><span class="line">            llm_with_tools.invoke(</span><br><span class="line">                [SystemMessage(content=<span class="string">&quot;You are a helpful arithmetic assistant.&quot;</span>)]</span><br><span class="line">                + state[<span class="string">&quot;messages&quot;</span>]</span><br><span class="line">            )</span><br><span class="line">        ]</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">tool_node</span>(<span class="params">state: <span class="built_in">dict</span></span>):</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Execute tool calls&quot;&quot;&quot;</span></span><br><span class="line">    result = []</span><br><span class="line">    <span class="keyword">for</span> tool_call <span class="keyword">in</span> state[<span class="string">&quot;messages&quot;</span>][-<span 
class="number">1</span>].tool_calls:</span><br><span class="line">        tool = tools_by_name[tool_call[<span class="string">&quot;name&quot;</span>]]</span><br><span class="line">        observation = tool.invoke(tool_call[<span class="string">&quot;args&quot;</span>])</span><br><span class="line">        result.append(ToolMessage(content=<span class="built_in">str</span>(observation), tool_call_id=tool_call[<span class="string">&quot;id&quot;</span>]))</span><br><span class="line">    <span class="keyword">return</span> &#123;<span class="string">&quot;messages&quot;</span>: result&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define conditional edge routing</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">should_continue</span>(<span class="params">state: MessagesState</span>) -&gt; <span class="type">Literal</span>[<span class="string">&quot;tool_node&quot;</span>, END]:</span><br><span class="line">    last_message = state[<span class="string">&quot;messages&quot;</span>][-<span class="number">1</span>]</span><br><span class="line">    <span class="keyword">if</span> last_message.tool_calls:</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;tool_node&quot;</span></span><br><span class="line">    <span class="keyword">return</span> END</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build and compile the graph</span></span><br><span class="line">builder = StateGraph(MessagesState)</span><br><span class="line">builder.add_node(<span class="string">&quot;llm_call&quot;</span>, llm_call)</span><br><span class="line">builder.add_node(<span class="string">&quot;tool_node&quot;</span>, tool_node)</span><br><span class="line">builder.add_edge(START, <span class="string">&quot;llm_call&quot;</span>)</span><br><span class="line">builder.add_conditional_edges(<span 
class="string">&quot;llm_call&quot;</span>, should_continue, [<span class="string">&quot;tool_node&quot;</span>, END])</span><br><span class="line">builder.add_edge(<span class="string">&quot;tool_node&quot;</span>, <span class="string">&quot;llm_call&quot;</span>)</span><br><span class="line"></span><br><span class="line">agent = builder.<span class="built_in">compile</span>()</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line">result = agent.invoke(&#123;<span class="string">&quot;messages&quot;</span>: [HumanMessage(content=<span class="string">&quot;Add 3 and 4, then multiply by 5.&quot;</span>)]&#125;)</span><br></pre></td></tr></table></figure><h3 id="2-5-Strengths-and-Limitations"><a href="#2-5-Strengths-and-Limitations" class="headerlink" title="2.5 Strengths and Limitations"></a>2.5 Strengths and Limitations</h3><p><strong>Strengths:</strong> Fine-grained control, stateful execution, native human-in-the-loop, fault-tolerant recovery, time-travel debugging, framework-agnostic</p><p><strong>Limitations:</strong> Steep learning curve, lots of boilerplate code, best experience requires the LangSmith ecosystem, fast-moving release cycle</p><hr><h2 id="III-CrewAI"><a href="#III-CrewAI" class="headerlink" title="III. CrewAI"></a>III. CrewAI</h2><h3 id="3-1-Introduction"><a href="#3-1-Introduction" class="headerlink" title="3.1 Introduction"></a>3.1 Introduction</h3><p><strong>CrewAI</strong> is a Python framework for orchestrating multiple autonomous AI Agents, built entirely from scratch with <strong>no dependency on LangChain or any other framework</strong>. 
Its core idea is to simulate real team collaboration through role-playing.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Latest Version</td><td>v1.14.3 (2025-04-24)</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install &#39;crewai[tools]&#39;</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/crewAIInc/crewAI">crewAIInc&#x2F;crewAI</a></td></tr><tr><td>Docs</td><td><a href="https://docs.crewai.com/">docs.crewai.com</a></td></tr></tbody></table><h3 id="3-2-Core-Concepts"><a href="#3-2-Core-Concepts" class="headerlink" title="3.2 Core Concepts"></a>3.2 Core Concepts</h3><ul><li><strong>Agent</strong>: Identity and behavior defined through <code>role</code>, <code>goal</code>, and <code>backstory</code></li><li><strong>Task</strong>: A concrete unit of work; can specify the assigned agent, context dependencies, and output format</li><li><strong>Crew</strong>: A collection of collaborating agents, defining the execution process and memory configuration</li><li><strong>Tools</strong>: A rich set of built-in tools (search, file read&#x2F;write, code execution, etc.) 
with MCP integration support</li><li><strong>Process</strong>: Either Sequential or Hierarchical (automatically creates a Manager Agent)</li></ul><h3 id="3-3-Key-Features"><a href="#3-3-Key-Features" class="headerlink" title="3.3 Key Features"></a>3.3 Key Features</h3><ul><li><strong>Role-playing design</strong>: Intuitive role definitions that closely mirror real team collaboration</li><li><strong>Collaborative workflows</strong>: Agents can delegate tasks to one another and pass context between them</li><li><strong>Four memory systems</strong>: Short-term memory, long-term memory, entity memory, and contextual memory</li><li><strong>Flows</strong>: Enterprise-grade event-driven workflow orchestration, supporting <code>@start</code>, <code>@listen</code>, and <code>@router</code> decorators</li><li><strong>Checkpoint &amp; Fork</strong>: Supports saving, restoring, and branching execution state</li><li><strong>YAML-driven configuration</strong>: Agents and tasks can be defined via YAML files</li></ul><h3 id="3-4-Code-Example"><a href="#3-4-Code-Example" class="headerlink" title="3.4 Code Example"></a>3.4 Code Example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> crewai <span class="keyword">import</span> Agent, Task, Crew, Process</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define Agents</span></span><br><span class="line">researcher = Agent(</span><br><span class="line">    role=<span class="string">&#x27;Senior AI Researcher&#x27;</span>,</span><br><span class="line">    goal=<span class="string">&#x27;Discover the latest development trends in the AI Agent space&#x27;</span>,</span><br><span class="line">    backstory=<span class="string">&#x27;You are an experienced researcher with a knack for spotting cutting-edge technology developments.&#x27;</span>,</span><br><span class="line">    verbose=<span class="literal">True</span>,</span><br><span class="line">    memory=<span class="literal">True</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">writer = Agent(</span><br><span class="line">    role=<span class="string">&#x27;Technical Report Writing Specialist&#x27;</span>,</span><br><span class="line">    goal=<span class="string">&#x27;Transform research findings into clear, well-structured reports&#x27;</span>,</span><br><span class="line">    backstory=<span class="string">&#x27;You are a technical writing expert who excels at turning complex information into 
readable reports.&#x27;</span>,</span><br><span class="line">    verbose=<span class="literal">True</span>,</span><br><span class="line">    memory=<span class="literal">True</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define Tasks</span></span><br><span class="line">research_task = Task(</span><br><span class="line">    description=<span class="string">&#x27;Conduct comprehensive research on &#123;topic&#125; and gather the latest development trends.&#x27;</span>,</span><br><span class="line">    expected_output=<span class="string">&#x27;A detailed list of research findings with 10 key points&#x27;</span>,</span><br><span class="line">    agent=researcher,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">writing_task = Task(</span><br><span class="line">    description=<span class="string">&#x27;Write a complete technical report based on the research findings.&#x27;</span>,</span><br><span class="line">    expected_output=<span class="string">&#x27;A complete report in Markdown format&#x27;</span>,</span><br><span class="line">    agent=writer,</span><br><span class="line">    context=[research_task],</span><br><span class="line">    output_file=<span class="string">&#x27;report.md&#x27;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Assemble the Crew and run</span></span><br><span class="line">crew = Crew(</span><br><span class="line">    agents=[researcher, writer],</span><br><span class="line">    tasks=[research_task, writing_task],</span><br><span class="line">    process=Process.sequential,</span><br><span class="line">    verbose=<span class="literal">True</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">result = crew.kickoff(inputs=&#123;<span class="string">&#x27;topic&#x27;</span>: <span 
class="string">&#x27;multi-agent collaboration systems&#x27;</span>&#125;)</span><br></pre></td></tr></table></figure><h3 id="3-5-Strengths-and-Limitations"><a href="#3-5-Strengths-and-Limitations" class="headerlink" title="3.5 Strengths and Limitations"></a>3.5 Strengths and Limitations</h3><p><strong>Strengths:</strong> Fully standalone with no external dependencies, intuitive role-playing design, four memory systems, YAML-driven configuration, active community (100,000+ certified developers)</p><p><strong>Limitations:</strong> Python only, high API overhead for multi-agent collaboration, complex to debug, enterprise features require a paid plan</p><hr><h2 id="IV-LlamaIndex"><a href="#IV-LlamaIndex" class="headerlink" title="IV. LlamaIndex"></a>IV. LlamaIndex</h2><h3 id="4-1-Introduction"><a href="#4-1-Introduction" class="headerlink" title="4.1 Introduction"></a>4.1 Introduction</h3><p><strong>LlamaIndex</strong> (formerly GPT Index) is an open-source framework that started out focused on RAG (Retrieval-Augmented Generation) and has since expanded into a <strong>document intelligence and OCR platform</strong>. 
Founded by Jerry Liu in 2022.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Latest Version</td><td>v0.14.6</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install llama-index</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/run-llama/llama_index">run-llama&#x2F;llama_index</a></td></tr><tr><td>Docs</td><td><a href="https://developers.llamaindex.ai/python">developers.llamaindex.ai</a></td></tr></tbody></table><h3 id="4-2-Core-Concepts"><a href="#4-2-Core-Concepts" class="headerlink" title="4.2 Core Concepts"></a>4.2 Core Concepts</h3><ul><li><strong>Workflow</strong>: An event-driven orchestration mechanism where steps are defined using the <code>@step</code> decorator</li><li><strong>Context</strong>: A global runtime context that coordinates data passing between steps and supports persistence</li><li><strong>Event-driven architecture</strong>: <code>StartEvent</code> → custom events → <code>StopEvent</code>, forming a directed graph</li><li><strong>AgentWorkflow</strong>: A high-level abstraction that automatically selects the appropriate agent type based on LLM capabilities</li></ul><h3 id="4-3-Agent-Types"><a href="#4-3-Agent-Types" class="headerlink" title="4.3 Agent Types"></a>4.3 Agent Types</h3><table><thead><tr><th>Type</th><th>Use Case</th><th>Characteristics</th></tr></thead><tbody><tr><td><strong>FunctionAgent</strong></td><td>When the LLM supports function calling</td><td>Uses native function calling directly — most efficient</td></tr><tr><td><strong>ReActAgent</strong></td><td>When the LLM does not support function calling</td><td>Executes via the ReAct (Reasoning + Acting) loop</td></tr><tr><td><strong>CodeActAgent</strong></td><td>Scenarios that require code execution</td><td>Generates and executes code via <code>&lt;execute&gt;</code> tags</td></tr></tbody></table><h3 id="4-4-Key-Features"><a href="#4-4-Key-Features" class="headerlink" title="4.4 Key 
Features"></a>4.4 Key Features</h3><ul><li><strong>RAG + Agent integration</strong>: RAG is a first-class capability, not an add-on; supports 130+ data formats</li><li><strong>Multi-agent collaboration</strong>: Native support for multi-agent handoff mechanisms</li><li><strong>Context persistence</strong>: Workflow state can be serialized and restored, suitable for production environments</li><li><strong>LlamaParse</strong>: Enterprise-grade document parsing and OCR</li><li><strong>300+ integration packages</strong>: Covers mainstream LLMs, vector databases, and data sources</li></ul><h3 id="4-5-Code-Example"><a href="#4-5-Code-Example" class="headerlink" title="4.5 Code Example"></a>4.5 Code Example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> llama_index.core <span class="keyword">import</span> VectorStoreIndex, 
SimpleDirectoryReader</span><br><span class="line"><span class="keyword">from</span> llama_index.core.agent.workflow <span class="keyword">import</span> FunctionAgent</span><br><span class="line"><span class="keyword">from</span> llama_index.llms.openai <span class="keyword">import</span> OpenAI</span><br><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build a RAG index</span></span><br><span class="line">documents = SimpleDirectoryReader(<span class="string">&quot;data&quot;</span>).load_data()</span><br><span class="line">index = VectorStoreIndex.from_documents(documents)</span><br><span class="line">query_engine = index.as_query_engine()</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define tools</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">multiply</span>(<span class="params">a: <span class="built_in">float</span>, b: <span class="built_in">float</span></span>) -&gt; <span class="built_in">float</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Multiply two numbers.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> a * b</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">search_documents</span>(<span class="params">query: <span class="built_in">str</span></span>) -&gt; <span class="built_in">str</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Search documents for answers.&quot;&quot;&quot;</span></span><br><span class="line">    response = <span class="keyword">await</span> query_engine.aquery(query)</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">str</span>(response)</span><br><span class="line"></span><br><span 
class="line"><span class="comment"># Create the agent</span></span><br><span class="line">agent = FunctionAgent(</span><br><span class="line">    tools=[multiply, search_documents],</span><br><span class="line">    llm=OpenAI(model=<span class="string">&quot;gpt-4o-mini&quot;</span>),</span><br><span class="line">    system_prompt=<span class="string">&quot;You are a helpful assistant that can calculate and search documents.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    response = <span class="keyword">await</span> agent.run(<span class="string">&quot;What did the author do in college? Also, what&#x27;s 7 * 8?&quot;</span>)</span><br><span class="line">    <span class="built_in">print</span>(response)</span><br><span class="line"></span><br><span class="line">asyncio.run(main())</span><br></pre></td></tr></table></figure><h3 id="4-6-Strengths-and-Limitations"><a href="#4-6-Strengths-and-Limitations" class="headerlink" title="4.6 Strengths and Limitations"></a>4.6 Strengths and Limitations</h3><p><strong>Strengths:</strong> Deep RAG + Agent integration, flexible event-driven architecture, 300+ ecosystem integrations, multi-agent support, LlamaParse enterprise-grade parsing</p><p><strong>Limitations:</strong> Steep learning curve, relatively heavy framework, TypeScript version has incomplete feature coverage, fast release cycle with frequent breaking changes, enterprise features require a paid plan</p><hr><h2 id="V-Dify"><a href="#V-Dify" class="headerlink" title="V. Dify"></a>V. 
Dify</h2><h3 id="5-1-Introduction"><a href="#5-1-Introduction" class="headerlink" title="5.1 Introduction"></a>5.1 Introduction</h3><p><strong>Dify</strong> (Do It For You) is an open-source LLM application development platform positioned as an <strong>agentic workflow builder</strong>. It combines Backend-as-a-Service with LLMOps, enabling both non-technical users and developers to rapidly build AI applications.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Latest Version</td><td>v1.6.0+</td></tr><tr><td>License</td><td>Dify Open Source License (Apache 2.0+)</td></tr><tr><td>Deploy</td><td><code>docker compose up -d</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/langgenius/dify">langgenius&#x2F;dify</a></td></tr><tr><td>Docs</td><td><a href="https://docs.dify.ai/en/use-dify/getting-started/introduction">docs.dify.ai</a></td></tr></tbody></table><h3 id="5-2-Core-Features"><a href="#5-2-Core-Features" class="headerlink" title="5.2 Core Features"></a>5.2 Core Features</h3><ul><li><strong>Visual workflow builder</strong>: Drag-and-drop canvas supporting parallel processing, conditional branching, and loop nodes</li><li><strong>Agent strategies</strong>: Supports Function Calling, ReAct, and custom strategy plugins</li><li><strong>RAG pipeline</strong>: A complete data source → processing → knowledge base → retrieval flow</li><li><strong>Model management</strong>: Seamless integration with hundreds of LLMs, with model switching and performance comparison</li><li><strong>Prompt IDE</strong>: An intuitive prompt authoring interface</li><li><strong>LLMOps</strong>: Monitor and analyze application logs and performance</li></ul><h3 id="5-3-Agent-Strategies"><a href="#5-3-Agent-Strategies" class="headerlink" title="5.3 Agent Strategies"></a>5.3 Agent Strategies</h3><table><thead><tr><th>Strategy</th><th>Use Case</th></tr></thead><tbody><tr><td><strong>Function Calling</strong></td><td>Models with native tool calling 
support (e.g., GPT-4, Claude)</td></tr><tr><td><strong>ReAct</strong></td><td>Models without native function calling, or when explicit reasoning traces are needed</td></tr><tr><td><strong>Custom Strategy Plugin</strong></td><td>Complex behaviors requiring multi-turn tool calls, etc.</td></tr></tbody></table><h3 id="5-4-How-to-Create-an-Agent"><a href="#5-4-How-to-Create-an-Agent" class="headerlink" title="5.4 How to Create an Agent"></a>5.4 How to Create an Agent</h3><p>Dify uses a visual &#x2F; no-code approach:</p><ol><li>Create an “Agent” type application in Dify Studio</li><li>Select an LLM model</li><li>Set the Agent strategy (automatically detects Function Calling support)</li><li>Choose from 50+ built-in tools or add custom tools</li><li>Write a system prompt</li><li>Preview and debug, then publish with one click</li></ol><h3 id="5-5-Integration-Capabilities"><a href="#5-5-Integration-Capabilities" class="headerlink" title="5.5 Integration Capabilities"></a>5.5 Integration Capabilities</h3><ul><li><strong>API</strong>: Full RESTful API with SSE streaming support</li><li><strong>SDK</strong>: Node.js, PHP, and Java clients</li><li><strong>Plugin system</strong>: Six plugin categories — models, tools, agent strategies, extensions, data sources, and triggers</li><li><strong>MCP integration</strong>: Native support for the Model Context Protocol</li><li><strong>Deployment</strong>: Docker Compose, Kubernetes, Terraform, AWS CDK</li></ul><h3 id="5-6-Strengths-and-Limitations"><a href="#5-6-Strengths-and-Limitations" class="headerlink" title="5.6 Strengths and Limitations"></a>5.6 Strengths and Limitations</h3><p><strong>Strengths:</strong> Low-code &#x2F; no-code, ready out of the box (50+ built-in tools), rapid path from prototype to production, multi-model support, active community (800+ contributors)</p><p><strong>Limitations:</strong> Limited customization flexibility (less than code-first frameworks), execution subject to step&#x2F;time limits, license is 
not pure Apache 2.0, risk of platform lock-in, advanced reasoning modes are less mature than dedicated frameworks</p><hr><h2 id="VI-OpenAI-Agents-SDK"><a href="#VI-OpenAI-Agents-SDK" class="headerlink" title="VI. OpenAI Agents SDK"></a>VI. OpenAI Agents SDK</h2><h3 id="6-1-Introduction"><a href="#6-1-Introduction" class="headerlink" title="6.1 Introduction"></a>6.1 Introduction</h3><p><strong>OpenAI Agents SDK</strong> is a lightweight multi-agent framework officially released by OpenAI, evolved from the internal Swarm experimental project. Its core philosophy is <strong>minimalist design</strong> — building complex workflows from just a few concepts: Agent, Handoff, Guardrail, and Tool.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Latest Version</td><td>v0.14.6 (2026-04-25)</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install openai-agents</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/openai/openai-agents-python">openai&#x2F;openai-agents-python</a></td></tr><tr><td>Docs</td><td><a href="https://openai.github.io/openai-agents-python">openai.github.io&#x2F;openai-agents-python</a></td></tr></tbody></table><h3 id="6-2-Core-Concepts"><a href="#6-2-Core-Concepts" class="headerlink" title="6.2 Core Concepts"></a>6.2 Core Concepts</h3><ul><li><strong>Agent</strong>: An LLM configured with instructions, tools, guardrails, and handoff capabilities</li><li><strong>Runner</strong>: The agent executor, providing <code>run()</code> (async), <code>run_sync()</code> (synchronous), and <code>run_streamed()</code> (streaming)</li><li><strong>Handoff</strong>: Task delegation between agents; the receiving agent inherits the full conversation history</li><li><strong>Guardrails</strong>: Safety rails in three categories — input guardrails, output guardrails, and tool guardrails</li><li><strong>Tools</strong>: Supports function tools, MCP tools, OpenAI hosted tools, and Agent as 
Tool</li></ul><h3 id="6-3-Key-Features"><a href="#6-3-Key-Features" class="headerlink" title="6.3 Key Features"></a>6.3 Key Features</h3><ul><li><strong>Minimalist design</strong>: Few core primitives, gentle learning curve</li><li><strong>Provider-agnostic</strong>: Supports 100+ LLMs via any-llm &#x2F; LiteLLM</li><li><strong>Three-layer guardrails</strong>: Safety validation at the input → output → tool level</li><li><strong>Built-in Tracing</strong>: Visualize and debug agent execution flows</li><li><strong>Realtime Agents</strong>: Build voice agents (gpt-realtime-1.5)</li><li><strong>Sandbox Agents</strong>: Added in v0.14.0 — executes code in a containerized environment</li><li><strong>Structured output</strong>: Define output types via Pydantic Model using <code>output_type</code></li></ul><h3 id="6-4-Code-Example"><a href="#6-4-Code-Example" class="headerlink" title="6.4 Code Example"></a>6.4 Code Example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="keyword">from</span> agents <span class="keyword">import</span> Agent, Runner, function_tool</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define a tool</span></span><br><span class="line"><span class="meta">@function_tool</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">get_weather</span>(<span class="params">city: <span class="built_in">str</span></span>) -&gt; <span class="built_in">str</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Get the weather for a specified city.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> <span class="string">f&quot;The weather in <span class="subst">&#123;city&#125;</span> is sunny.&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Define specialist Agents</span></span><br><span class="line">billing_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;Billing Agent&quot;</span>,</span><br><span class="line">    instructions=<span class="string">&quot;You are a billing specialist.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">refund_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;Refund Agent&quot;</span>,</span><br><span class="line">    instructions=<span class="string">&quot;You are a refund specialist.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define a triage 
Agent</span></span><br><span class="line">triage_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;Triage Agent&quot;</span>,</span><br><span class="line">    instructions=<span class="string">&quot;Route the user&#x27;s question to the correct specialist Agent: billing issues -&gt; Billing Agent; refund issues -&gt; Refund Agent.&quot;</span>,</span><br><span class="line">    handoffs=[billing_agent, refund_agent],</span><br><span class="line">    tools=[get_weather],</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    result = <span class="keyword">await</span> Runner.run(</span><br><span class="line">        triage_agent,</span><br><span class="line">        <span class="string">&quot;I was charged twice for my subscription. 
Please help me resolve this.&quot;</span>,</span><br><span class="line">    )</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">f&quot;Final answer: <span class="subst">&#123;result.final_output&#125;</span>&quot;</span>)</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">f&quot;Handled by Agent: <span class="subst">&#123;result.last_agent.name&#125;</span>&quot;</span>)</span><br><span class="line"></span><br><span class="line">asyncio.run(main())</span><br></pre></td></tr></table></figure><h3 id="6-5-Strengths-and-Limitations"><a href="#6-5-Strengths-and-Limitations" class="headerlink" title="6.5 Strengths and Limitations"></a>6.5 Strengths and Limitations</h3><p><strong>Strengths:</strong> Officially maintained, minimalist design, provider-agnostic, three-layer guardrails, built-in tracing, voice agent support</p><p><strong>Limitations:</strong> Still at 0.x — API may change, deep dependency on the OpenAI ecosystem, no parallel agent execution support, no built-in persistent memory system</p><hr><h2 id="VII-Google-ADK"><a href="#VII-Google-ADK" class="headerlink" title="VII. Google ADK"></a>VII. Google ADK</h2><h3 id="7-1-Introduction"><a href="#7-1-Introduction" class="headerlink" title="7.1 Introduction"></a>7.1 Introduction</h3><p><strong>Google ADK (Agent Development Kit)</strong> is an open-source, code-first agent development framework released by Google. Its design philosophy is to make AI agent development feel like traditional software development. 
It is optimized for Gemini and Google Cloud, while remaining model-agnostic and deployment-agnostic.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Latest Version</td><td>v1.31.1 (2026-04-30)</td></tr><tr><td>License</td><td>Apache 2.0</td></tr><tr><td>Install</td><td><code>pip install google-adk</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/google/adk-python">google&#x2F;adk-python</a></td></tr><tr><td>Docs</td><td><a href="https://google.github.io/adk-docs/">google.github.io&#x2F;adk-docs</a></td></tr></tbody></table><h3 id="7-2-Core-Concepts"><a href="#7-2-Core-Concepts" class="headerlink" title="7.2 Core Concepts"></a>7.2 Core Concepts</h3><ul><li><strong>LlmAgent</strong> (alias <code>Agent</code>): The core building block — combines an LLM model, instructions, and tools</li><li><strong>SequentialAgent</strong>: Executes sub-agents one after another in order (pipeline style)</li><li><strong>ParallelAgent</strong>: Runs multiple sub-agents concurrently</li><li><strong>LoopAgent</strong>: Repeatedly executes sub-agents with support for exit conditions</li><li><strong>sub_agents</strong>: Nesting sub-agents to build hierarchical multi-agent architectures</li></ul><h3 id="7-3-Key-Features"><a href="#7-3-Key-Features" class="headerlink" title="7.3 Key Features"></a>7.3 Key Features</h3><ul><li><strong>Multi-agent orchestration</strong>: Sequential, parallel, loop-based, and LLM-driven dynamic routing</li><li><strong>Built-in tools</strong>: Google Search, Vertex AI Search, code executor, and more</li><li><strong>Google ecosystem integration</strong>: Native Gemini, Vertex AI Agent Engine, Cloud Run</li><li><strong>Flexible deployment</strong>: Local, Agent Engine (fully managed), Cloud Run, GKE, Docker</li><li><strong>Built-in evaluation</strong>: CLI tool <code>adk eval</code> for systematic agent performance assessment</li><li><strong>A2A protocol</strong>: Supports Agent-to-Agent remote 
communication</li><li><strong>Lifecycle callbacks</strong>: <code>before/after_agent</code>, <code>before/after_model</code>, and <code>before/after_tool</code> hooks</li></ul><h3 id="7-4-Code-Example"><a href="#7-4-Code-Example" class="headerlink" title="7.4 Code Example"></a>7.4 Code Example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="keyword">from</span> google.adk.agents <span class="keyword">import</span> Agent, 
SequentialAgent</span><br><span class="line"><span class="keyword">from</span> google.adk.runners <span class="keyword">import</span> Runner</span><br><span class="line"><span class="keyword">from</span> google.adk.sessions <span class="keyword">import</span> InMemorySessionService</span><br><span class="line"><span class="keyword">from</span> google.genai <span class="keyword">import</span> types</span><br><span class="line"><span class="keyword">from</span> google.adk.tools <span class="keyword">import</span> google_search</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define a weather Agent</span></span><br><span class="line">weather_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;weather_assistant&quot;</span>,</span><br><span class="line">    model=<span class="string">&quot;gemini-2.5-flash&quot;</span>,</span><br><span class="line">    instruction=<span class="string">&quot;You are a weather query assistant. Use Google Search to find the latest weather information.&quot;</span>,</span><br><span class="line">    tools=[google_search],</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define a translation Agent</span></span><br><span class="line">translate_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;translate_assistant&quot;</span>,</span><br><span class="line">    model=<span class="string">&quot;gemini-2.5-flash&quot;</span>,</span><br><span class="line">    instruction=<span class="string">&quot;You are a translation assistant. 
Translate content into Chinese.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Compose into a sequential workflow</span></span><br><span class="line">pipeline = SequentialAgent(</span><br><span class="line">    name=<span class="string">&quot;WeatherPipeline&quot;</span>,</span><br><span class="line">    sub_agents=[weather_agent, translate_agent],</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line">session_service = InMemorySessionService()</span><br><span class="line">runner = Runner(agent=pipeline, app_name=<span class="string">&quot;weather_app&quot;</span>, session_service=session_service)</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">run_agent</span>(<span class="params">query: <span class="built_in">str</span></span>):</span><br><span class="line">    session = <span class="keyword">await</span> session_service.create_session(</span><br><span class="line">        app_name=<span class="string">&quot;weather_app&quot;</span>, user_id=<span class="string">&quot;user_1&quot;</span>, session_id=<span class="string">&quot;session_1&quot;</span></span><br><span class="line">    )</span><br><span class="line">    content = types.Content(role=<span class="string">&#x27;user&#x27;</span>, parts=[types.Part(text=query)])</span><br><span class="line">    <span class="keyword">async</span> <span class="keyword">for</span> event <span class="keyword">in</span> runner.run_async(</span><br><span class="line">        user_id=<span class="string">&quot;user_1&quot;</span>, session_id=<span class="string">&quot;session_1&quot;</span>, new_message=content</span><br><span class="line">    ):</span><br><span class="line">        <span class="keyword">if</span> event.is_final_response() <span class="keyword">and</span> 
event.content <span class="keyword">and</span> event.content.parts:</span><br><span class="line">            <span class="built_in">print</span>(<span class="string">f&quot;Agent reply: <span class="subst">&#123;event.content.parts[<span class="number">0</span>].text.strip()&#125;</span>&quot;</span>)</span><br><span class="line"></span><br><span class="line">asyncio.run(run_agent(<span class="string">&quot;What&#x27;s the weather in Tokyo today?&quot;</span>))</span><br></pre></td></tr></table></figure><h3 id="7-5-Strengths-and-Limitations"><a href="#7-5-Strengths-and-Limitations" class="headerlink" title="7.5 Strengths and Limitations"></a>7.5 Strengths and Limitations</h3><p><strong>Strengths:</strong> Code-first, powerful orchestration (sequential &#x2F; parallel &#x2F; loop), deep Google ecosystem integration, built-in evaluation, multi-language support (Python &#x2F; Java &#x2F; Go &#x2F; TS), Apache 2.0 open source</p><p><strong>Limitations:</strong> Best experience requires Gemini and Google Cloud, relatively new framework with an early-stage community ecosystem, frequent releases mean the API may change, access to Google services is restricted from mainland China</p><hr><h2 id="VIII-Framework-Selection-Guide"><a href="#VIII-Framework-Selection-Guide" class="headerlink" title="VIII. Framework Selection Guide"></a>VIII. 
Framework Selection Guide</h2><h3 id="Choose-by-Use-Case"><a href="#Choose-by-Use-Case" class="headerlink" title="Choose by Use Case"></a>Choose by Use Case</h3><table><thead><tr><th>Use Case</th><th>Recommended Framework</th><th>Reason</th></tr></thead><tbody><tr><td><strong>Complex stateful workflows</strong></td><td>LangGraph</td><td>Low-level graph orchestration, persistence, time travel</td></tr><tr><td><strong>Multi-role team collaboration</strong></td><td>CrewAI</td><td>Role-playing design, delegation mechanism, memory systems</td></tr><tr><td><strong>RAG + Agent</strong></td><td>LlamaIndex</td><td>Deep RAG integration, 130+ data formats, document parsing</td></tr><tr><td><strong>Rapid prototyping &#x2F; non-technical teams</strong></td><td>Dify</td><td>Visual drag-and-drop, low-code, ready out of the box</td></tr><tr><td><strong>Primarily OpenAI models</strong></td><td>OpenAI Agents SDK</td><td>Officially maintained, minimal API, tracing and debugging</td></tr><tr><td><strong>Google Cloud deployment</strong></td><td>Google ADK</td><td>Gemini-optimized, Vertex AI integration, built-in evaluation</td></tr><tr><td><strong>Need fine-grained control</strong></td><td>LangGraph &#x2F; Google ADK</td><td>Low-level APIs, lifecycle callback hooks</td></tr><tr><td><strong>Need production-grade guardrails</strong></td><td>OpenAI Agents SDK</td><td>Three-layer Guardrails</td></tr></tbody></table><h3 id="Choose-by-Team-Profile"><a href="#Choose-by-Team-Profile" class="headerlink" title="Choose by Team Profile"></a>Choose by Team Profile</h3><table><thead><tr><th>Team Profile</th><th>Recommendation</th></tr></thead><tbody><tr><td>Full-stack development teams</td><td>LangGraph, Google ADK</td></tr><tr><td>Python data science teams</td><td>CrewAI, LlamaIndex</td></tr><tr><td>Product managers &#x2F; operations teams</td><td>Dify</td></tr><tr><td>Heavy OpenAI ecosystem users</td><td>OpenAI Agents SDK</td></tr><tr><td>Google Cloud users</td><td>Google ADK</td></tr><tr><td>Need 
to validate ideas quickly</td><td>Dify, OpenAI Agents SDK</td></tr></tbody></table><blockquote><p><strong>Note</strong>: The framework information above is based on research conducted in April 2026. All frameworks iterate quickly — check the official documentation for the latest information before getting started.</p></blockquote>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/OpenAI/">OpenAI</category>
      
      <category domain="https://eugenepage.com/tags/Framework/">Framework</category>
      
      <category domain="https://eugenepage.com/tags/Agent/">Agent</category>
      
      <category domain="https://eugenepage.com/tags/LangGraph/">LangGraph</category>
      
      <category domain="https://eugenepage.com/tags/CrewAI/">CrewAI</category>
      
      <category domain="https://eugenepage.com/tags/LlamaIndex/">LlamaIndex</category>
      
      <category domain="https://eugenepage.com/tags/Dify/">Dify</category>
      
      <category domain="https://eugenepage.com/tags/GoogleADK/">GoogleADK</category>
      
      
      <comments>https://eugenepage.com/2026/04/30/20260430.AIAgentFrameworkResearchNotes/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>AI Agent Framework Research Notes</title>
      <link>https://eugenepage.com/zh-CN/2026/04/30/20260430.AIAgentFrameworkResearchNotes/</link>
      <guid>https://eugenepage.com/zh-CN/2026/04/30/20260430.AIAgentFrameworkResearchNotes/</guid>
      <pubDate>Thu, 30 Apr 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;AI-Agent-框架调研笔记&quot;&gt;&lt;a href=&quot;#AI-Agent-框架调研笔记&quot; class=&quot;headerlink&quot; title=&quot;AI Agent Framework Research Notes&quot;&gt;&lt;/a&gt;AI Agent Framework Research Notes&lt;/h1&gt;&lt;blockquote&gt;
&lt;p&gt;Last updated:</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="AI-Agent-框架调研笔记"><a href="#AI-Agent-框架调研笔记" class="headerlink" title="AI Agent Framework Research Notes"></a>AI Agent Framework Research Notes</h1><blockquote><p>Last updated: 2026-04-30</p><p>AI Agent technology is evolving rapidly, and new agent development frameworks keep appearing. This document surveys and compares six mainstream agent frameworks to help developers choose the right tool.</p></blockquote><hr><h2 id="目录"><a href="#目录" class="headerlink" title="Table of Contents"></a>Table of Contents</h2><ul><li><a href="#%E4%B8%80%E6%A1%86%E6%9E%B6%E6%A6%82%E8%A7%88%E5%AF%B9%E6%AF%94">I. Framework Overview and Comparison</a></li><li><a href="#%E4%BA%8Clanggraph">II. LangGraph</a></li><li><a href="#%E4%B8%89crewai">III. CrewAI</a></li><li><a href="#%E5%9B%9Bllamaindex">IV. LlamaIndex</a></li><li><a href="#%E4%BA%94dify">V. Dify</a></li><li><a href="#%E5%85%ADopenai-agents-sdk">VI. OpenAI Agents SDK</a></li><li><a href="#%E4%B8%83google-adk">VII. Google ADK</a></li><li><a href="#%E5%85%AB%E6%A1%86%E6%9E%B6%E9%80%89%E5%9E%8B%E6%8C%87%E5%8D%97">VIII. Framework Selection Guide</a></li></ul><hr><h2 id="一、框架概览对比"><a href="#一、框架概览对比" class="headerlink" title="I. Framework Overview and Comparison"></a>I. Framework Overview and Comparison</h2><table><thead><tr><th>Dimension</th><th>LangGraph</th><th>CrewAI</th><th>LlamaIndex</th><th>Dify</th><th>OpenAI Agents SDK</th><th>Google ADK</th></tr></thead><tbody><tr><td><strong>Developer</strong></td><td>LangChain Inc.</td><td>CrewAI Inc.</td><td>LlamaIndex Inc.</td><td>LangGenius</td><td>OpenAI</td><td>Google</td></tr><tr><td><strong>Latest Version</strong></td><td>v1.1.10</td><td>v1.14.3</td><td>v0.14.6</td><td>v1.6.0+</td><td>v0.14.6</td><td>v1.31.1</td></tr><tr><td><strong>License</strong></td><td>MIT</td><td>MIT</td><td>MIT</td><td>Dify License (Apache 2.0+)</td><td>MIT</td><td>Apache 2.0</td></tr><tr><td><strong>Language</strong></td><td>Python &#x2F; JS</td><td>Python</td><td>Python &#x2F; TS</td><td>Visual (multi-language SDKs)</td><td>Python &#x2F; JS</td><td>Python &#x2F; Java &#x2F; Go &#x2F; TS</td></tr><tr><td><strong>Core Philosophy</strong></td><td>Graph orchestration</td><td>Role-playing teams</td><td>RAG + Agent</td><td>Low-code platform</td><td>Minimalist multi-agent</td><td>Code-first</td></tr><tr><td><strong>Model Support</strong></td><td>Model-agnostic</td><td>Model-agnostic</td><td>Model-agnostic</td><td>Model-agnostic</td><td>100+ 
LLM</td><td>模型无关</td></tr><tr><td><strong>学习曲线</strong></td><td>较陡</td><td>中等</td><td>中等</td><td>低</td><td>低</td><td>中等</td></tr><tr><td><strong>适合场景</strong></td><td>复杂有状态工作流</td><td>多角色协作</td><td>RAG + Agent</td><td>快速原型&#x2F;非技术用户</td><td>OpenAI 生态应用</td><td>Google 生态应用</td></tr></tbody></table><hr><h2 id="二、LangGraph"><a href="#二、LangGraph" class="headerlink" title="二、LangGraph"></a>二、LangGraph</h2><h3 id="2-1-简介"><a href="#2-1-简介" class="headerlink" title="2.1 简介"></a>2.1 简介</h3><p><strong>LangGraph</strong> 是由 LangChain 团队开发的<strong>底层编排框架</strong>，专门用于构建长时间运行的、有状态的 AI Agent。设计灵感来自 Google 的 Pregel 和 Apache Beam。</p><p>核心定位：不抽象化提示词或架构，提供底层基础设施，让开发者精细控制 Agent 工作流。已被 Klarna、Replit、Elastic 等公司用于生产环境。</p><table><thead><tr><th>项目信息</th><th>详情</th></tr></thead><tbody><tr><td>最新版本</td><td>v1.1.10（2026-04-27）</td></tr><tr><td>许可证</td><td>MIT</td></tr><tr><td>安装</td><td><code>pip install -U langgraph</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/langchain-ai/langgraph">langchain-ai&#x2F;langgraph</a></td></tr><tr><td>文档</td><td><a href="https://docs.langchain.com/oss/python/langgraph">docs.langchain.com&#x2F;oss&#x2F;python&#x2F;langgraph</a></td></tr></tbody></table><h3 id="2-2-核心架构"><a href="#2-2-核心架构" class="headerlink" title="2.2 核心架构"></a>2.2 核心架构</h3><p>LangGraph 将 Agent 工作流建模为<strong>图（Graph）</strong>，由三个核心组件构成：</p><ul><li><strong>State（状态）</strong>：共享数据结构，通常用 <code>TypedDict</code> 或 <code>Pydantic Model</code> 定义</li><li><strong>Nodes（节点）</strong>：编码 Agent 逻辑的函数，接收当前状态、返回更新后的状态</li><li><strong>Edges（边）</strong>：决定下一个节点的函数，支持条件分支或固定转换</li></ul><h3 id="2-3-关键特性"><a href="#2-3-关键特性" class="headerlink" title="2.3 关键特性"></a>2.3 关键特性</h3><ul><li><strong>持久化（Persistence）</strong>：每个执行步骤将图状态保存为 checkpoint，支持内存、Redis 等后端</li><li><strong>人机协作（Human-in-the-Loop）</strong>：通过 <code>interrupt()</code> 暂停执行，等待人工输入后恢复</li><li><strong>流式输出（Streaming）</strong>：支持 values、messages、updates 等多种流式模式</li><li><strong>子图（Subgraphs）</strong>：支持图嵌套，子图拥有独立的 checkpoint 
and interrupt support</li><li><strong>Time travel</strong>: roll back to any earlier checkpoint, then fork or replay from it</li><li><strong>Visualization</strong>: a compiled graph can be rendered as a Mermaid diagram of the workflow</li></ul><h3 id="2-4-代码示例"><a href="#2-4-代码示例" class="headerlink" title="2.4 代码示例"></a>2.4 Code example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span 
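class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> typing <span class="keyword">import</span> <span class="type">Literal</span></span><br><span class="line"><span class="keyword">from</span> langchain.chat_models <span class="keyword">import</span> init_chat_model</span><br><span class="line"><span class="keyword">from</span> langchain.messages <span class="keyword">import</span> SystemMessage, HumanMessage, ToolMessage</span><br><span class="line"><span class="keyword">from</span> langchain.tools <span class="keyword">import</span> tool</span><br><span class="line"><span class="keyword">from</span> langgraph.graph <span class="keyword">import</span> StateGraph, MessagesState, START, END</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define tools; @tool gives each function the .name and .invoke() used below</span></span><br><span class="line"><span class="meta">@tool</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">multiply</span>(<span class="params">a: <span class="built_in">int</span>, b: <span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Multiply two numbers.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> a * b</span><br><span class="line"></span><br><span class="line"><span class="meta">@tool</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">add</span>(<span class="params">a: <span class="built_in">int</span>, b: <span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Add two numbers.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> a + b</span><br><span class="line"></span><br><span class="line">tools = [multiply, add]</span><br><span class="line">tools_by_name = &#123;t.name: t <span class="keyword">for</span> t <span class="keyword">in</span> tools&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Assumption: the model name is only an example; any chat model with tool calling works</span></span><br><span class="line">llm = init_chat_model(<span class="string">&quot;openai:gpt-4o-mini&quot;</span>)</span><br><span class="line">llm_with_tools = llm.bind_tools(tools)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define nodes</span></span><br><span class="line"><span class="keyword">def</span> <span class="title 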
function_">llm_call</span>(<span class="params">state: MessagesState</span>):</span><br><span class="line">    <span class="string">&quot;&quot;&quot;The LLM decides whether to call a tool.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> &#123;</span><br><span class="line">        <span class="string">&quot;messages&quot;</span>: [</span><br><span class="line">            llm_with_tools.invoke(</span><br><span class="line">                [SystemMessage(content=<span class="string">&quot;You are a helpful arithmetic assistant.&quot;</span>)]</span><br><span class="line">                + state[<span class="string">&quot;messages&quot;</span>]</span><br><span class="line">            )</span><br><span class="line">        ]</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">tool_node</span>(<span class="params">state: <span class="built_in">dict</span></span>):</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Execute the tool calls requested by the LLM.&quot;&quot;&quot;</span></span><br><span class="line">    result = []</span><br><span class="line">    <span class="keyword">for</span> tool_call <span class="keyword">in</span> state[<span class="string">&quot;messages&quot;</span>][-<span class="number">1</span>].tool_calls:</span><br><span class="line">        tool = tools_by_name[tool_call[<span class="string">&quot;name&quot;</span>]]</span><br><span class="line">        observation = tool.invoke(tool_call[<span class="string">&quot;args&quot;</span>])</span><br><span class="line">        result.append(ToolMessage(content=<span class="built_in">str</span>(observation), tool_call_id=tool_call[<span class="string">&quot;id&quot;</span>]))</span><br><span class="line">    <span class="keyword">return</span> &#123;<span class="string">&quot;messages&quot;</span>: result&#125;</span><br><span class="line"></span><br><span 
class="comment"># Define the conditional-edge router</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">should_continue</span>(<span class="params">state: MessagesState</span>) -&gt; <span class="type">Literal</span>[<span class="string">&quot;tool_node&quot;</span>, END]:</span><br><span class="line">    last_message = state[<span class="string">&quot;messages&quot;</span>][-<span class="number">1</span>]</span><br><span class="line">    <span class="keyword">if</span> last_message.tool_calls:</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;tool_node&quot;</span></span><br><span class="line">    <span class="keyword">return</span> END</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build and compile the graph</span></span><br><span class="line">builder = StateGraph(MessagesState)</span><br><span class="line">builder.add_node(<span class="string">&quot;llm_call&quot;</span>, llm_call)</span><br><span class="line">builder.add_node(<span class="string">&quot;tool_node&quot;</span>, tool_node)</span><br><span class="line">builder.add_edge(START, <span class="string">&quot;llm_call&quot;</span>)</span><br><span class="line">builder.add_conditional_edges(<span class="string">&quot;llm_call&quot;</span>, should_continue, [<span class="string">&quot;tool_node&quot;</span>, END])</span><br><span class="line">builder.add_edge(<span class="string">&quot;tool_node&quot;</span>, <span class="string">&quot;llm_call&quot;</span>)</span><br><span class="line"></span><br><span class="line">agent = builder.<span class="built_in">compile</span>()</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line">result = agent.invoke(&#123;<span class="string">&quot;messages&quot;</span>: [HumanMessage(content=<span class="string">&quot;Add 3 and 4, then multiply by 5.&quot;</span>)]&#125;)</span><br></pre></td></tr></table></figure><h3 
id="2-5-优势与局限"><a href="#2-5-优势与局限" class="headerlink" title="2.5 优势与局限"></a>2.5 Strengths and limitations</h3><p><strong>Strengths:</strong> fine-grained control, stateful execution, native human-in-the-loop, fault-tolerant recovery, time-travel debugging, framework-agnostic</p><p><strong>Limitations:</strong> steep learning curve, a lot of boilerplate, the best experience depends on the LangSmith ecosystem, fast release cadence</p><hr><h2 id="三、CrewAI"><a href="#三、CrewAI" class="headerlink" title="三、CrewAI"></a>3. CrewAI</h2><h3 id="3-1-简介"><a href="#3-1-简介" class="headerlink" title="3.1 简介"></a>3.1 Overview</h3><p><strong>CrewAI</strong> is a Python framework for orchestrating multiple autonomous AI agents. It is built from scratch and <strong>does not depend on LangChain or other agent frameworks</strong>. Its core idea is to model real-world team collaboration through role-playing.</p><table><thead><tr><th>Project info</th><th>Details</th></tr></thead><tbody><tr><td>Latest version</td><td>v1.14.3 (2026-04-24)</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install &#39;crewai[tools]&#39;</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/crewAIInc/crewAI">crewAIInc&#x2F;crewAI</a></td></tr><tr><td>Docs</td><td><a href="https://docs.crewai.com/">docs.crewai.com</a></td></tr></tbody></table><h3 id="3-2-核心概念"><a href="#3-2-核心概念" class="headerlink" title="3.2 核心概念"></a>3.2 Core concepts</h3><ul><li><strong>Agent</strong>: identity and behavior are defined through <code>role</code>, <code>goal</code>, and <code>backstory</code></li><li><strong>Task</strong>: a concrete unit of work that can specify its assignee, context dependencies, and output format</li><li><strong>Crew</strong>: a group of collaborating agents, which also defines the execution process and memory configuration</li><li><strong>Tools</strong>: a rich built-in toolset (search, file read&#x2F;write, code execution, and more), with MCP integration</li><li><strong>Process</strong>: Sequential, or Hierarchical (a manager agent is created automatically)</li></ul><h3 id="3-3-关键特性"><a href="#3-3-关键特性" class="headerlink" title="3.3 关键特性"></a>3.3 Key features</h3><ul><li><strong>Role-playing design</strong>: an intuitive way to define roles that mirrors how real teams work</li><li><strong>Collaborative workflows</strong>: agents can delegate tasks to each other and pass context along</li><li><strong>Four memory systems</strong>: short-term, long-term, entity, and contextual memory</li><li><strong>Flows</strong>: enterprise-grade event-driven workflows built with the <code>@start</code>, <code>@listen</code>, and <code>@router</code> decorators</li><li><strong>Checkpoint &amp; Fork</strong>: execution state can be saved, restored, and branched</li><li><strong>YAML-driven configuration</strong>: agents and tasks can be defined in YAML files</li></ul><h3 id="3-4-代码示例"><a href="#3-4-代码示例" 
class="headerlink" title="3.4 代码示例"></a>3.4 Code example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> crewai <span class="keyword">import</span> Agent, Task, Crew, Process</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define Agents</span></span><br><span class="line">researcher = Agent(</span><br><span class="line">    role=<span class="string">&#x27;Senior AI Researcher&#x27;</span>,</span><br><span class="line">    goal=<span class="string">&#x27;Discover the latest trends in the AI agent field&#x27;</span>,</span><br><span 
class="line">    backstory=<span class="string">&#x27;You are an experienced researcher who excels at tracking the latest developments in frontier technology.&#x27;</span>,</span><br><span class="line">    verbose=<span class="literal">True</span>,</span><br><span class="line">    memory=<span class="literal">True</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">writer = Agent(</span><br><span class="line">    role=<span class="string">&#x27;Technical Report Writer&#x27;</span>,</span><br><span class="line">    goal=<span class="string">&#x27;Turn research findings into a clear, structured report&#x27;</span>,</span><br><span class="line">    backstory=<span class="string">&#x27;You are a technical writing expert who excels at turning complex information into readable reports.&#x27;</span>,</span><br><span class="line">    verbose=<span class="literal">True</span>,</span><br><span class="line">    memory=<span class="literal">True</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define Tasks</span></span><br><span class="line">research_task = Task(</span><br><span class="line">    description=<span class="string">&#x27;Research &#123;topic&#125; thoroughly and collect the latest trends.&#x27;</span>,</span><br><span class="line">    expected_output=<span class="string">&#x27;A detailed list of research findings with 10 key points&#x27;</span>,</span><br><span class="line">    agent=researcher,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">writing_task = Task(</span><br><span class="line">    description=<span class="string">&#x27;Write a complete technical report based on the research findings.&#x27;</span>,</span><br><span class="line">    expected_output=<span class="string">&#x27;A complete report in Markdown format&#x27;</span>,</span><br><span class="line">    agent=writer,</span><br><span class="line">    context=[research_task],</span><br><span class="line">    output_file=<span class="string">&#x27;report.md&#x27;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Assemble the Crew and run it</span></span><br><span class="line">crew = Crew(</span><br><span class="line">    
agents=[researcher, writer],</span><br><span class="line">    tasks=[research_task, writing_task],</span><br><span class="line">    process=Process.sequential,</span><br><span class="line">    verbose=<span class="literal">True</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">result = crew.kickoff(inputs=&#123;<span class="string">&#x27;topic&#x27;</span>: <span class="string">&#x27;multi-agent collaboration systems&#x27;</span>&#125;)</span><br></pre></td></tr></table></figure><h3 id="3-5-优势与局限"><a href="#3-5-优势与局限" class="headerlink" title="3.5 优势与局限"></a>3.5 Strengths and limitations</h3><p><strong>Strengths:</strong> fully standalone (no dependency on LangChain), intuitive role-playing model, four memory systems, YAML-driven configuration, active community (100,000+ certified developers)</p><p><strong>Limitations:</strong> Python only, multi-agent collaboration incurs high LLM API overhead, debugging is complex, enterprise features are paid</p><hr><h2 id="四、LlamaIndex"><a href="#四、LlamaIndex" class="headerlink" title="四、LlamaIndex"></a>4. LlamaIndex</h2><h3 id="4-1-简介"><a href="#4-1-简介" class="headerlink" title="4.1 简介"></a>4.1 Overview</h3><p><strong>LlamaIndex</strong> (formerly GPT Index) is an open-source framework that initially focused on RAG (retrieval-augmented generation) and has since expanded into a <strong>document agent and OCR platform</strong>. It was founded by Jerry Liu in 2022.</p><table><thead><tr><th>Project info</th><th>Details</th></tr></thead><tbody><tr><td>Latest version</td><td>v0.14.6</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install llama-index</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/run-llama/llama_index">run-llama&#x2F;llama_index</a></td></tr><tr><td>Docs</td><td><a href="https://developers.llamaindex.ai/python">developers.llamaindex.ai</a></td></tr></tbody></table><h3 id="4-2-核心概念"><a href="#4-2-核心概念" class="headerlink" title="4.2 核心概念"></a>4.2 Core concepts</h3><ul><li><strong>Workflow</strong>: an event-driven orchestration mechanism whose steps are defined with the <code>@step</code> decorator</li><li><strong>Context</strong>: a global runtime context that coordinates data passing between steps and supports persistence</li><li><strong>Event-driven architecture</strong>: <code>StartEvent</code> → custom events → <code>StopEvent</code>, which together form a directed graph</li><li><strong>AgentWorkflow</strong>: a high-level wrapper that automatically picks a suitable agent type based on the LLM&#39;s capabilities</li></ul><h3 id="4-3-Agent-类型"><a href="#4-3-Agent-类型" class="headerlink" title="4.3 Agent 
类型"></a>4.3 Agent types</h3><table><thead><tr><th>Type</th><th>When to use</th><th>Characteristics</th></tr></thead><tbody><tr><td><strong>FunctionAgent</strong></td><td>The LLM supports function calling</td><td>Uses native function calling directly; the most efficient option</td></tr><tr><td><strong>ReActAgent</strong></td><td>The LLM does not support function calling</td><td>Runs a ReAct (reasoning + acting) loop</td></tr><tr><td><strong>CodeActAgent</strong></td><td>Scenarios that require code execution</td><td>Generates and executes code through <code>&lt;execute&gt;</code> tags</td></tr></tbody></table><h3 id="4-4-关键特性"><a href="#4-4-关键特性" class="headerlink" title="4.4 关键特性"></a>4.4 Key features</h3><ul><li><strong>Unified RAG + Agent</strong>: RAG is a core capability rather than an add-on, with connectors for 130+ data formats</li><li><strong>Multi-agent collaboration</strong>: native support for handoff between agents</li><li><strong>Context persistence</strong>: workflow state can be serialized and restored, which suits production use</li><li><strong>LlamaParse</strong>: enterprise-grade document parsing and OCR</li><li><strong>300+ integration packages</strong>: covering mainstream LLMs, vector databases, and data sources</li></ul><h3 id="4-5-代码示例"><a href="#4-5-代码示例" class="headerlink" title="4.5 代码示例"></a>4.5 Code example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> llama_index.core <span class="keyword">import</span> VectorStoreIndex, SimpleDirectoryReader</span><br><span class="line"><span class="keyword">from</span> llama_index.core.agent.workflow <span class="keyword">import</span> FunctionAgent</span><br><span class="line"><span class="keyword">from</span> llama_index.llms.openai <span class="keyword">import</span> OpenAI</span><br><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build the RAG index</span></span><br><span class="line">documents = SimpleDirectoryReader(<span class="string">&quot;data&quot;</span>).load_data()</span><br><span class="line">index = VectorStoreIndex.from_documents(documents)</span><br><span class="line">query_engine = index.as_query_engine()</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define tools</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">multiply</span>(<span class="params">a: <span class="built_in">float</span>, b: <span class="built_in">float</span></span>) -&gt; <span class="built_in">float</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Multiply two numbers.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> a * b</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">search_documents</span>(<span class="params">query: <span class="built_in">str</span></span>) -&gt; <span class="built_in">str</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Search documents for answers.&quot;&quot;&quot;</span></span><br><span class="line">    response = <span class="keyword">await</span> 
query_engine.aquery(query)</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">str</span>(response)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Create the agent</span></span><br><span class="line">agent = FunctionAgent(</span><br><span class="line">    tools=[multiply, search_documents],</span><br><span class="line">    llm=OpenAI(model=<span class="string">&quot;gpt-4o-mini&quot;</span>),</span><br><span class="line">    system_prompt=<span class="string">&quot;You are a helpful assistant that can calculate and search documents.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    response = <span class="keyword">await</span> agent.run(<span class="string">&quot;What did the author do in college? 
Also, what&#x27;s 7 * 8?&quot;</span>)</span><br><span class="line">    <span class="built_in">print</span>(response)</span><br><span class="line"></span><br><span class="line">asyncio.run(main())</span><br></pre></td></tr></table></figure><h3 id="4-6-优势与局限"><a href="#4-6-优势与局限" class="headerlink" title="4.6 优势与局限"></a>4.6 Strengths and limitations</h3><p><strong>Strengths:</strong> deep RAG + Agent integration, flexible event-driven architecture, 300+ ecosystem integrations, multi-agent support, enterprise-grade parsing with LlamaParse</p><p><strong>Limitations:</strong> steep learning curve, a relatively heavy framework, incomplete feature coverage in the TypeScript version, fast releases with breaking changes, enterprise features are paid</p><hr><h2 id="五、Dify"><a href="#五、Dify" class="headerlink" title="五、Dify"></a>5. Dify</h2><h3 id="5-1-简介"><a href="#5-1-简介" class="headerlink" title="5.1 简介"></a>5.1 Overview</h3><p><strong>Dify</strong> (Do It For You) is an open-source LLM application development platform positioned as an <strong>agentic workflow builder</strong>. It combines Backend-as-a-Service with LLMOps so that both non-technical users and developers can build AI applications quickly.</p><table><thead><tr><th>Project info</th><th>Details</th></tr></thead><tbody><tr><td>Latest version</td><td>v1.6.0+</td></tr><tr><td>License</td><td>Dify Open Source License (Apache 2.0 with additional conditions)</td></tr><tr><td>Deploy</td><td><code>docker compose up -d</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/langgenius/dify">langgenius&#x2F;dify</a></td></tr><tr><td>Docs</td><td><a href="https://docs.dify.ai/en/use-dify/getting-started/introduction">docs.dify.ai</a></td></tr></tbody></table><h3 id="5-2-核心功能"><a href="#5-2-核心功能" class="headerlink" title="5.2 核心功能"></a>5.2 Core features</h3><ul><li><strong>Visual workflow builder</strong>: a drag-and-drop canvas with parallel processing, conditional branches, and loop nodes</li><li><strong>Agent strategies</strong>: Function Calling, ReAct, and custom strategy plugins</li><li><strong>RAG pipeline</strong>: a complete data source → processing → knowledge base → retrieval flow</li><li><strong>Model management</strong>: integrates hundreds of LLMs, with model switching and performance comparison</li><li><strong>Prompt IDE</strong>: an intuitive interface for writing prompts</li><li><strong>LLMOps</strong>: monitoring and analysis of application logs and performance</li></ul><h3 id="5-3-Agent-策略"><a href="#5-3-Agent-策略" class="headerlink" title="5.3 Agent 策略"></a>5.3 Agent strategies</h3><table><thead><tr><th>Strategy</th><th>When to use</th></tr></thead><tbody><tr><td><strong>Function Calling</strong></td><td>The model natively supports tool calling (e.g. 
GPT-4, Claude)</td></tr><tr><td><strong>ReAct</strong></td><td>The model lacks native function calling, or explicit reasoning traces are needed</td></tr><tr><td><strong>Custom strategy plugin</strong></td><td>Complex behavior such as multi-round tool calling is required</td></tr></tbody></table><h3 id="5-4-创建-Agent-的方式"><a href="#5-4-创建-Agent-的方式" class="headerlink" title="5.4 创建 Agent 的方式"></a>5.4 How to create an agent</h3><p>Dify takes a visual&#x2F;no-code approach:</p><ol><li>Create an app of type &quot;Agent&quot; in Dify Studio</li><li>Choose an LLM</li><li>Set the agent strategy (Function Calling support is detected automatically)</li><li>Pick from 50+ built-in tools or add custom tools</li><li>Write the system prompt</li><li>Debug and preview, then publish in one click</li></ol><h3 id="5-5-集成能力"><a href="#5-5-集成能力" class="headerlink" title="5.5 集成能力"></a>5.5 Integration</h3><ul><li><strong>API</strong>: a complete RESTful API with SSE streaming responses</li><li><strong>SDK</strong>: Node.js, PHP, and Java clients</li><li><strong>Plugin system</strong>: six plugin types: models, tools, agent strategies, extensions, data sources, and triggers</li><li><strong>MCP integration</strong>: native support for the Model Context Protocol</li><li><strong>Deployment</strong>: Docker Compose, Kubernetes, Terraform, AWS CDK</li></ul><h3 id="5-6-优势与局限"><a href="#5-6-优势与局限" class="headerlink" title="5.6 优势与局限"></a>5.6 Strengths and limitations</h3><p><strong>Strengths:</strong> low-code&#x2F;no-code, works out of the box (50+ built-in tools), fast path from prototype to production, multi-model support, active community (800+ contributors)</p><p><strong>Limitations:</strong> less room for customization than code frameworks, step and time limits on execution, license is not pure Apache 2.0, platform lock-in risk, advanced reasoning patterns are less mature than in dedicated frameworks</p><hr><h2 id="六、OpenAI-Agents-SDK"><a href="#六、OpenAI-Agents-SDK" class="headerlink" title="六、OpenAI Agents SDK"></a>6. OpenAI Agents SDK</h2><h3 id="6-1-简介"><a href="#6-1-简介" class="headerlink" title="6.1 简介"></a>6.1 Overview</h3><p><strong>OpenAI Agents SDK</strong> is OpenAI&#39;s official lightweight multi-agent framework, which evolved from the internal experimental project Swarm. Its core idea is <strong>minimal design</strong>: complex workflows are built from only a few concepts, Agent &#x2F; Handoff &#x2F; Guardrail &#x2F; Tool.</p><table><thead><tr><th>Project info</th><th>Details</th></tr></thead><tbody><tr><td>Latest version</td><td>v0.14.6 (2026-04-25)</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Install</td><td><code>pip install openai-agents</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/openai/openai-agents-python">openai&#x2F;openai-agents-python</a></td></tr><tr><td>Docs</td><td><a 
href="https://openai.github.io/openai-agents-python">openai.github.io&#x2F;openai-agents-python</a></td></tr></tbody></table><h3 id="6-2-核心概念"><a href="#6-2-核心概念" class="headerlink" title="6.2 核心概念"></a>6.2 Core concepts</h3><ul><li><strong>Agent</strong>: an LLM configured with instructions, tools, guardrails, and handoffs</li><li><strong>Runner</strong>: the agent executor, which provides <code>run()</code> (async), <code>run_sync()</code> (sync), and <code>run_streamed()</code> (streaming)</li><li><strong>Handoff</strong>: task delegation between agents; the receiving agent inherits the full conversation history</li><li><strong>Guardrails</strong>: safety checks in three kinds: input, output, and tool guardrails</li><li><strong>Tools</strong>: function tools, MCP tools, OpenAI hosted tools, and agents as tools</li></ul><h3 id="6-3-关键特性"><a href="#6-3-关键特性" class="headerlink" title="6.3 关键特性"></a>6.3 Key features</h3><ul><li><strong>Minimal design</strong>: few core primitives and a gentle learning curve</li><li><strong>Provider-agnostic</strong>: supports 100+ LLMs through any-llm &#x2F; LiteLLM</li><li><strong>Three-layer guardrails</strong>: safety validation at the input → output → tool levels</li><li><strong>Built-in tracing</strong>: visual debugging of agent runs</li><li><strong>Realtime Agents</strong>: support for building voice agents (gpt-realtime-1.5)</li><li><strong>Sandbox Agents</strong>: added in v0.14.0; runs code in a container environment</li><li><strong>Structured output</strong>: define <code>output_type</code> with a Pydantic model</li></ul><h3 id="6-4-代码示例"><a href="#6-4-代码示例" class="headerlink" title="6.4 代码示例"></a>6.4 Code example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="keyword">from</span> agents <span class="keyword">import</span> Agent, Runner, function_tool</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define a tool</span></span><br><span class="line"><span class="meta">@function_tool</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">get_weather</span>(<span class="params">city: <span class="built_in">str</span></span>) -&gt; <span class="built_in">str</span>:</span><br><span class="line">    <span class="string">&quot;&quot;&quot;Get the weather for the given city.&quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">return</span> <span class="string">f&quot;The weather in <span class="subst">&#123;city&#125;</span> is sunny.&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Define specialist agents</span></span><br><span class="line">billing_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;Billing Agent&quot;</span>,</span><br><span class="line">    instructions=<span class="string">&quot;You are a billing specialist.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">refund_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;Refund Agent&quot;</span>,</span><br><span class="line">    
instructions=<span class="string">&quot;You are a refund specialist.&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define the triage agent</span></span><br><span class="line">triage_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;Triage Agent&quot;</span>,</span><br><span class="line">    instructions=<span class="string">&quot;Route the user to the right specialist agent: billing -&gt; Billing Agent; refunds -&gt; Refund Agent.&quot;</span>,</span><br><span class="line">    handoffs=[billing_agent, refund_agent],</span><br><span class="line">    tools=[get_weather],</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run</span></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    result = <span class="keyword">await</span> Runner.run(</span><br><span class="line">        triage_agent,</span><br><span class="line">        <span class="string">&quot;I was charged twice for my subscription. Please help me sort it out.&quot;</span>,</span><br><span class="line">    )</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">f&quot;Final answer: <span class="subst">&#123;result.final_output&#125;</span>&quot;</span>)</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">f&quot;Handled by: <span class="subst">&#123;result.last_agent.name&#125;</span>&quot;</span>)</span><br><span class="line"></span><br><span class="line">asyncio.run(main())</span><br></pre></td></tr></table></figure><h3 id="6-5-优势与局限"><a href="#6-5-优势与局限" class="headerlink" title="6.5 优势与局限"></a>6.5 Strengths and limitations</h3><p><strong>Strengths:</strong> officially maintained, minimal design, provider-agnostic, three-layer guardrails, built-in tracing, voice agent support</p><p><strong>Limitations:</strong> still at 0.x so the API may change, hosted tools and tracing are tied to the OpenAI platform, no built-in primitive for running agents in parallel, memory is limited to session history with no built-in long-term memory system</p><hr><h2 id="七、Google-ADK"><a href="#七、Google-ADK" class="headerlink" title="七、Google ADK"></a>7. Google 
ADK</h2><h3 id="7-1-简介"><a href="#7-1-简介" class="headerlink" title="7.1 简介"></a>7.1 Overview</h3><p><strong>Google ADK (Agent Development Kit)</strong> is Google&#39;s open-source, code-first agent development framework. Its design goal is to make AI agent development feel more like traditional software development. It is optimized for Gemini and Google Cloud while staying model-agnostic and deployment-agnostic.</p><table><thead><tr><th>Project info</th><th>Details</th></tr></thead><tbody><tr><td>Latest version</td><td>v1.31.1 (2026-04-30)</td></tr><tr><td>License</td><td>Apache 2.0</td></tr><tr><td>Install</td><td><code>pip install google-adk</code></td></tr><tr><td>GitHub</td><td><a href="https://github.com/google/adk-python">google&#x2F;adk-python</a></td></tr><tr><td>Docs</td><td><a href="https://google.github.io/adk-docs/">google.github.io&#x2F;adk-docs</a></td></tr></tbody></table><h3 id="7-2-核心概念"><a href="#7-2-核心概念" class="headerlink" title="7.2 核心概念"></a>7.2 Core concepts</h3><ul><li><strong>LlmAgent</strong> (alias <code>Agent</code>): the core building block, combining an LLM with instructions and tools</li><li><strong>SequentialAgent</strong>: runs its sub-agents one after another (pipeline style)</li><li><strong>ParallelAgent</strong>: runs multiple sub-agents concurrently</li><li><strong>LoopAgent</strong>: runs its sub-agents repeatedly, with support for an exit condition</li><li><strong>sub_agents</strong>: nesting builds a hierarchical multi-agent architecture</li></ul><h3 id="7-3-关键特性"><a href="#7-3-关键特性" class="headerlink" title="7.3 关键特性"></a>7.3 Key features</h3><ul><li><strong>Multi-agent orchestration</strong>: sequential, parallel, loop, and LLM-driven dynamic routing</li><li><strong>Built-in tools</strong>: Google Search, Vertex AI Search, a code executor, and more</li><li><strong>Google ecosystem integration</strong>: native Gemini, Vertex AI Agent Engine, Cloud Run</li><li><strong>Flexible deployment</strong>: local, Agent Engine (fully managed), Cloud Run, GKE, Docker</li><li><strong>Built-in evaluation</strong>: the <code>adk eval</code> CLI tool evaluates agent performance systematically</li><li><strong>A2A protocol</strong>: supports remote Agent-to-Agent communication</li><li><strong>Lifecycle callbacks</strong>: <code>before/after_agent</code>, <code>before/after_model</code>, and <code>before/after_tool</code> hooks</li></ul><h3 id="7-4-代码示例"><a href="#7-4-代码示例" class="headerlink" title="7.4 代码示例"></a>7.4 Code example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="keyword">from</span> google.adk.agents <span class="keyword">import</span> Agent, SequentialAgent</span><br><span class="line"><span class="keyword">from</span> google.adk.runners <span class="keyword">import</span> Runner</span><br><span class="line"><span class="keyword">from</span> google.adk.sessions <span class="keyword">import</span> InMemorySessionService</span><br><span class="line"><span class="keyword">from</span> google.genai <span class="keyword">import</span> types</span><br><span class="line"><span class="keyword">from</span> 
google.adk.tools <span class="keyword">import</span> google_search</span><br><span class="line"></span><br><span class="line"><span class="comment"># 定义天气 Agent</span></span><br><span class="line">weather_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;weather_assistant&quot;</span>,</span><br><span class="line">    model=<span class="string">&quot;gemini-2.5-flash&quot;</span>,</span><br><span class="line">    instruction=<span class="string">&quot;你是一个天气查询助手。使用 Google 搜索查找最新天气信息。&quot;</span>,</span><br><span class="line">    tools=[google_search],</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 定义翻译 Agent</span></span><br><span class="line">translate_agent = Agent(</span><br><span class="line">    name=<span class="string">&quot;translate_assistant&quot;</span>,</span><br><span class="line">    model=<span class="string">&quot;gemini-2.5-flash&quot;</span>,</span><br><span class="line">    instruction=<span class="string">&quot;你是一个翻译助手，将内容翻译成中文。&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 组合成顺序工作流</span></span><br><span class="line">pipeline = SequentialAgent(</span><br><span class="line">    name=<span class="string">&quot;WeatherPipeline&quot;</span>,</span><br><span class="line">    sub_agents=[weather_agent, translate_agent],</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 运行</span></span><br><span class="line">session_service = InMemorySessionService()</span><br><span class="line">runner = Runner(agent=pipeline, app_name=<span class="string">&quot;weather_app&quot;</span>, session_service=session_service)</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">def</span> <span class="title function_">run_agent</span>(<span class="params">query: <span 
class="built_in">str</span></span>):</span><br><span class="line">    session = <span class="keyword">await</span> session_service.create_session(</span><br><span class="line">        app_name=<span class="string">&quot;weather_app&quot;</span>, user_id=<span class="string">&quot;user_1&quot;</span>, session_id=<span class="string">&quot;session_1&quot;</span></span><br><span class="line">    )</span><br><span class="line">    content = types.Content(role=<span class="string">&#x27;user&#x27;</span>, parts=[types.Part(text=query)])</span><br><span class="line">    <span class="keyword">async</span> <span class="keyword">for</span> event <span class="keyword">in</span> runner.run_async(</span><br><span class="line">        user_id=<span class="string">&quot;user_1&quot;</span>, session_id=<span class="string">&quot;session_1&quot;</span>, new_message=content</span><br><span class="line">    ):</span><br><span class="line">        <span class="keyword">if</span> event.is_final_response() <span class="keyword">and</span> event.content <span class="keyword">and</span> event.content.parts:</span><br><span class="line">            <span class="built_in">print</span>(<span class="string">f&quot;Agent 回复: <span class="subst">&#123;event.content.parts[<span class="number">0</span>].text.strip()&#125;</span>&quot;</span>)</span><br><span class="line"></span><br><span class="line">asyncio.run(run_agent(<span class="string">&quot;What&#x27;s the weather in Tokyo today?&quot;</span>))</span><br></pre></td></tr></table></figure><h3 id="7-5-优势与局限"><a href="#7-5-优势与局限" class="headerlink" title="7.5 优势与局限"></a>7.5 优势与局限</h3><p><strong>优势：</strong> 代码优先、强大编排能力（顺序&#x2F;并行&#x2F;循环）、Google 生态深度集成、内置评估、多语言支持（Python&#x2F;Java&#x2F;Go&#x2F;TS）、Apache 2.0 开源</p><p><strong>局限：</strong> 最佳体验需 Gemini 和 Google Cloud、框架较新社区生态初期、高频发布 API 可能变动、中国大陆访问 Google 服务受限</p><hr><h2 id="八、框架选型指南"><a href="#八、框架选型指南" class="headerlink" title="八、框架选型指南"></a>八、框架选型指南</h2><h3 id="按使用场景选择"><a href="#按使用场景选择" class="headerlink" 
title="按使用场景选择"></a>按使用场景选择</h3><table><thead><tr><th>场景</th><th>推荐框架</th><th>理由</th></tr></thead><tbody><tr><td><strong>复杂有状态工作流</strong></td><td>LangGraph</td><td>底层图编排、持久化、时间旅行</td></tr><tr><td><strong>多角色团队协作</strong></td><td>CrewAI</td><td>角色扮演设计、委派机制、记忆系统</td></tr><tr><td><strong>RAG + Agent</strong></td><td>LlamaIndex</td><td>RAG 深度集成、130+ 数据格式、文档解析</td></tr><tr><td><strong>快速原型 &#x2F; 非技术团队</strong></td><td>Dify</td><td>可视化拖拽、低代码、开箱即用</td></tr><tr><td><strong>OpenAI 模型为主</strong></td><td>OpenAI Agents SDK</td><td>官方维护、极简 API、追踪调试</td></tr><tr><td><strong>Google Cloud 部署</strong></td><td>Google ADK</td><td>Gemini 优化、Vertex AI 集成、内置评估</td></tr><tr><td><strong>需要精细控制</strong></td><td>LangGraph &#x2F; Google ADK</td><td>底层 API、回调钩子</td></tr><tr><td><strong>需要生产级护栏</strong></td><td>OpenAI Agents SDK</td><td>三层 Guardrails</td></tr></tbody></table><h3 id="按团队特点选择"><a href="#按团队特点选择" class="headerlink" title="按团队特点选择"></a>按团队特点选择</h3><table><thead><tr><th>团队特点</th><th>推荐</th></tr></thead><tbody><tr><td>全栈开发团队</td><td>LangGraph、Google ADK</td></tr><tr><td>Python 数据科学团队</td><td>CrewAI、LlamaIndex</td></tr><tr><td>产品经理 &#x2F; 运营团队</td><td>Dify</td></tr><tr><td>OpenAI 生态重度用户</td><td>OpenAI Agents SDK</td></tr><tr><td>Google Cloud 用户</td><td>Google ADK</td></tr><tr><td>需要快速验证想法</td><td>Dify、OpenAI Agents SDK</td></tr></tbody></table><blockquote><p><strong>注意</strong>：以上框架信息基于 2026 年 4 月的调研，各框架迭代较快，建议使用前查看官方文档获取最新信息。</p></blockquote>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/OpenAI/">OpenAI</category>
      
      <category domain="https://eugenepage.com/tags/Framework/">Framework</category>
      
      <category domain="https://eugenepage.com/tags/Agent/">Agent</category>
      
      <category domain="https://eugenepage.com/tags/LangGraph/">LangGraph</category>
      
      <category domain="https://eugenepage.com/tags/CrewAI/">CrewAI</category>
      
      <category domain="https://eugenepage.com/tags/LlamaIndex/">LlamaIndex</category>
      
      <category domain="https://eugenepage.com/tags/Dify/">Dify</category>
      
      <category domain="https://eugenepage.com/tags/GoogleADK/">GoogleADK</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/04/30/20260430.AIAgentFrameworkResearchNotes/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Tile Explorer Web — 24h AI GameDev Hackathon Project (Software)</title>
      <link>https://eugenepage.com/2026/04/28/20260428.TileExplorerWeb/</link>
      <guid>https://eugenepage.com/2026/04/28/20260428.TileExplorerWeb/</guid>
      <pubDate>Tue, 28 Apr 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;p&gt;Tile Explorer is a browser-based tile-matching puzzle game I built in 24 hours. The entire project runs on a purely native web stack — Pi</description>
        
      
      
      
      <content:encoded><![CDATA[<p>Tile Explorer is a browser-based tile-matching puzzle game I built in 24 hours. The entire project runs on a purely native web stack — PixiJS (loaded via CDN) for rendering, Web Audio API for procedurally synthesized sound effects, zero build tools, zero npm dependencies. Double-click <code>index.html</code> and it just runs. The game is deployed on GitHub Pages, with a live leaderboard powered by Supabase’s free tier. Total hosting cost: ¥0&#x2F;month.</p><div style="position: relative; width: 100%; padding-bottom: 75%; margin: 20px 0; border-radius: 12px; overflow: hidden; box-shadow: 0 4px 16px rgba(0,0,0,0.15);">  <iframe src="https://youdrew.github.io/24h-AI-GameDevTest/" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border: none;" loading="lazy" allow="autoplay"></iframe></div><p style="text-align: center; font-size: 13px; color: #888; margin-top: 4px;">↑ Playable right here (requires an internet connection to load)</p><p><strong>Core gameplay</strong>: Patterned tiles are stacked across the board. Tap an accessible tile to send it into a 7-slot collection tray at the bottom. Match 3 identical patterns and they clear automatically. Clear every tile from the board to complete the level.</p><p>Key highlights of the project:</p><ol><li><strong>Mathematically guaranteed solvability</strong>: Level layouts are generated from a difficulty formula where total tile count &#x3D; <code>patternTypes × setsPerType × 3</code>, which structurally ensures every pattern appears in multiples of three. A backtracking solver runs inside a Web Worker to forward-validate each layout — only layouts with a confirmed solution path are accepted. The solver also records the optimal move count, which serves as the star-rating baseline.</li><li><strong>Procedural audio synthesis</strong>: Every interactive sound effect — taps, clears, combos, power-ups, warnings — is synthesized in real time via the Web Audio API. 
Zero audio files, zero network requests. Combo sounds are built on a C-major chord progression system, progressively brightening from triangle waves to sawtooth waves to give players a satisfying sense of escalating momentum. When BGM is playing, sound effects auto-duck by 6dB and smoothly recover over 200ms.</li><li><strong>Data-driven architecture</strong>: Difficulty curves, power-up properties, and theme configurations are all declarative, editable config tables. A designer can tune difficulty curves and power-up parameters by editing JS config files directly — no touching game logic code. Six visual themes each have their own library of 32 emoji patterns, a background image, and a BGM track; themes rotate automatically every 3 levels.</li><li><strong>PWA + offline support</strong>: Full Progressive Web App support is implemented — installable to a phone’s home screen and fully playable offline. The Service Worker uses a three-tier caching strategy: precached static assets, cache-first for CDN resources, and Stale-While-Revalidate for theme media. Dual-CDN failover provides automatic fallback.</li><li><strong>Zero-cost online leaderboard</strong>: Built on Supabase’s free tier (PostgreSQL + REST API). A UUID is auto-generated on first visit and stored in localStorage — no account required. The database enforces row-level security (RLS); the client holds only the anon key. All input goes through dual regex validation plus XSS sanitization. 
Scores earned offline are queued locally and submitted automatically once connectivity is restored.</li></ol><p>On the engineering side: tile occlusion uses spatial hashing (O(n) instead of O(n²)); clear particle effects use a pre-allocated object pool to avoid GC jitter; opacity calculations follow an exponential decay model based on the Weber–Fechner law; and the collection slots use a smart clustering insertion algorithm to help players quickly spot matching opportunities.</p><p>The entire project was completed within 24 hours. My own code spans 14 JS modules + 3 CSS files + 1 HTML file, covering 10,000 levels, 6 power-up types, and 6 themes. AI assistance generated the vast majority of the code, along with all audio synthesis parameters and BGM assets. My role focused on architecture design, requirements refinement, data structure design, and overall code quality.</p>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/SoftwareProjects/">SoftwareProjects</category>
      
      <category domain="https://eugenepage.com/tags/WebDevelopment/">WebDevelopment</category>
      
      <category domain="https://eugenepage.com/tags/GameDev/">GameDev</category>
      
      
      <comments>https://eugenepage.com/2026/04/28/20260428.TileExplorerWeb/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Tile Explorer Web — 24h AI GameDev 马拉松作品 (软件作品)</title>
      <link>https://eugenepage.com/zh-CN/2026/04/28/20260428.TileExplorerWeb/</link>
      <guid>https://eugenepage.com/zh-CN/2026/04/28/20260428.TileExplorerWeb/</guid>
      <pubDate>Tue, 28 Apr 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;p&gt;Tile Explorer 是我在 24 小时内完成的一款浏览器三消瓦片解谜游戏。整个项目完全采用 Web 原生技术栈开发，渲染引擎使用 PixiJS（CDN 引入），音效通过 Web Audio API 程序化合成，零构建工具、零 npm 依赖——双击 &lt;code&gt;ind</description>
        
      
      
      
      <content:encoded><![CDATA[<p>Tile Explorer 是我在 24 小时内完成的一款浏览器三消瓦片解谜游戏。整个项目完全采用 Web 原生技术栈开发，渲染引擎使用 PixiJS（CDN 引入），音效通过 Web Audio API 程序化合成，零构建工具、零 npm 依赖——双击 <code>index.html</code> 即可运行。游戏已部署至 GitHub Pages，后端使用 Supabase 免费层实现在线排行榜，整体运维成本为 0 元&#x2F;月。</p><div style="position: relative; width: 100%; padding-bottom: 75%; margin: 20px 0; border-radius: 12px; overflow: hidden; box-shadow: 0 4px 16px rgba(0,0,0,0.15);">  <iframe src="https://youdrew.github.io/24h-AI-GameDevTest/" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border: none;" loading="lazy" allow="autoplay"></iframe></div><p style="text-align: center; font-size: 13px; color: #888; margin-top: 4px;">↑ 上方可直接游玩（需要联网加载）</p><p><strong>核心玩法</strong>：版面上堆叠着带有图案的瓦片，点击可用瓦片将其送入底部 7 格收集槽，凑齐 3 个相同图案自动消除，清空版面上所有瓦片即通关。</p><p>项目的主要亮点：</p><ol><li><strong>数学可解性保证</strong>：关卡布局由难度公式推导生成，瓦片总数 &#x3D; <code>patternTypes × setsPerType × 3</code>，从根本上保证每种图案数量均为 3 的倍数。同时，Web Worker 中运行回溯求解器对每个布局做正向验证，只有确认存在通关路径才会采用，并记录最优步数作为星级评分基准。</li><li><strong>程序化音效合成</strong>：所有交互音效（点击、消除、连击、道具、警告等）均通过 Web Audio API 实时合成，零音频文件、零网络请求。连击音效基于 C 大调和弦递进系统设计，从三角波到锯齿波逐渐变亮，给玩家”蓄力”的感知。BGM 播放时音效自动 Ducking（降 6dB），200ms 后平滑恢复。</li><li><strong>数据驱动架构</strong>：难度曲线、道具属性、主题配置均为可编辑的声明式配置表。策划可直接修改 JS 配置文件调整难度曲线和道具参数，无需触碰游戏逻辑代码。6 套视觉主题各有独立的 32 emoji 图案库、背景图和 BGM，每 3 关自动轮换。</li><li><strong>PWA + 离线支持</strong>：实现了完整的 Progressive Web App 支持——可安装到手机主屏幕、支持完全离线游玩。Service Worker 采用三级缓存策略（静态资源预缓存、CDN 资源缓存优先、主题媒体 Stale-While-Revalidate），双 CDN 容灾自动回退。</li><li><strong>零成本在线排行榜</strong>：使用 Supabase 免费层（PostgreSQL + REST API），首次访问自动生成 UUID 存入 localStorage，无需注册。数据库启用行级安全（RLS），客户端仅持有 anon key，输入经双重正则校验 + XSS 清洗。离线成绩存入本地队列，联网后自动提交。</li></ol><p>工程方面，瓦片覆盖关系使用空间哈希（O(n) 替代 O(n²)），消除特效使用预分配粒子对象池避免 GC 抖动，透明度计算遵循韦伯-费希纳定律的指数衰减模型，槽位采用智能聚类插入算法帮助玩家快速识别匹配机会。</p><p>整个项目在 24 小时内完成，自有代码 14 个 JS 模块 + 3 个 CSS + 1 个 HTML，覆盖 10,000 关、6 种道具、6 套主题。过程中 AI 辅助生成了绝大部分代码与全部音效参数、BGM 资产，我主要负责架构设计、需求梳理、数据结构设计及代码质量把控。</p>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/SoftwareProjects/">SoftwareProjects</category>
      
      <category domain="https://eugenepage.com/tags/WebDevelopment/">WebDevelopment</category>
      
      <category domain="https://eugenepage.com/tags/GameDev/">GameDev</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/04/28/20260428.TileExplorerWeb/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Veil Lingo — Online English Education Platform (Software Project)</title>
      <link>https://eugenepage.com/2026/04/18/20260428.VeilLingo/</link>
      <guid>https://eugenepage.com/2026/04/18/20260428.VeilLingo/</guid>
      <pubDate>Sat, 18 Apr 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;p&gt;Veil Lingo is a live one-on-one English speaking education platform targeting Chinese learners, connecting them with professional teacher</description>
        
      
      
      
      <content:encoded><![CDATA[<p>Veil Lingo is a live one-on-one English speaking education platform targeting Chinese learners, connecting them with professional teachers from English-speaking countries. The platform name draws on John Rawls’ philosophical concept of the “veil of ignorance”: the aim is a fair, transparent teaching marketplace in which the quality of instruction itself is the core basis for pricing. The project is deployed and live at <a href="https://talk-lingo.com/">talk-lingo.com</a>.</p><div style="position: relative; width: 100%; padding-bottom: 65%; margin: 20px 0; border-radius: 12px; overflow: hidden; box-shadow: 0 4px 16px rgba(0,0,0,0.15);">  <iframe src="https://talk-lingo.com" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border: none;" loading="lazy"></iframe></div><p style="text-align: center; font-size: 13px; color: #888; margin-top: 4px;">↑ Live site preview above (or visit <a href="https://talk-lingo.com" target="_blank">talk-lingo.com</a> directly)</p><p>The project covers three user-facing portals: a student portal (browse teachers, book lessons, credit wallet, review system), a teacher portal (personal profile, calendar scheduling, earnings dashboard, rating feedback), and an admin backend (data dashboard, teacher approval, review moderation, violation management, system parameter configuration). Together they total 28+ pages and 34+ components.</p><p>Key technical highlights:</p><ol><li><strong>Dynamic Pricing and Salary Algorithm</strong>: This is the platform’s core differentiator. Lesson prices float with each teacher’s booking rate: prices rise automatically for high-demand teachers and fall back when demand is low, creating a positive incentive loop. Teacher salaries are likewise adjusted automatically based on demand and ratings, so that top teachers earn higher returns.
All parameters are configurable in the admin backend, so strategy adjustments require no code changes.</li><li><strong>Pairwise Comparison Review System</strong>: Students can evaluate two teachers they’ve taken lessons with in a head-to-head comparison. This produces more reliable quality signals than traditional independent scoring, helping the platform more accurately identify differences in teaching ability.</li><li><strong>Multi-Dimensional Radar Chart Scoring</strong>: Teacher evaluations span multiple teaching dimensions, visualized as radar charts. This gives students an intuitive view of a teacher’s style and strengths, and provides teachers with clear direction for improvement.</li><li><strong>Mainland China Network Optimization</strong>: Geo-aware routing via Cloudflare Workers automatically selects the optimal access path for mainland users, reducing latency and improving availability.</li><li><strong>Full Internationalization Support</strong>: Complete bilingual coverage in Chinese and English, with 874 translation keys managing all user-facing copy through a translation system.</li></ol><p>On the tech stack side, the frontend uses Next.js (App Router + Server Components) + TypeScript + Tailwind CSS + shadcn&#x2F;ui. The backend runs on Supabase (PostgreSQL + Auth + Storage + Realtime), with Row-Level Security enforcing data access control. The app is deployed on Vercel, with Cloudflare handling CDN and DNS. The entire project was built from scratch to production launch, covering full-stack development end to end: database design (21 tables + 26 migration scripts), authentication and authorization, payment wallet, scheduled jobs, SEO optimization, and more.</p>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/SoftwareProjects/">SoftwareProjects</category>
      
      <category domain="https://eugenepage.com/tags/WebDevelopment/">WebDevelopment</category>
      
      <category domain="https://eugenepage.com/tags/FullStack/">FullStack</category>
      
      
      <comments>https://eugenepage.com/2026/04/18/20260428.VeilLingo/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Veil Lingo — 在线英语教育平台 (软件作品)</title>
      <link>https://eugenepage.com/zh-CN/2026/04/18/20260428.VeilLingo/</link>
      <guid>https://eugenepage.com/zh-CN/2026/04/18/20260428.VeilLingo/</guid>
      <pubDate>Sat, 18 Apr 2026 04:00:00 GMT</pubDate>
      
        
        
      <description>&lt;p&gt;Veil Lingo（无知之幕）是一个已上线的在线一对一口语教育平台，面向中国英语学习者，连接来自英语国家的专业教师。平台名取自约翰·罗尔斯的「无知之幕」哲学概念——意在创造一个公平、透明的教学市场，让教学质量本身成为定价的核心依据。项目已部署上线，域名为 &lt;a href=</description>
        
      
      
      
      <content:encoded><![CDATA[<p>Veil Lingo（无知之幕）是一个已上线的在线一对一口语教育平台，面向中国英语学习者，连接来自英语国家的专业教师。平台名取自约翰·罗尔斯的「无知之幕」哲学概念——意在创造一个公平、透明的教学市场，让教学质量本身成为定价的核心依据。项目已部署上线，域名为 <a href="https://talk-lingo.com/">talk-lingo.com</a>。</p><div style="position: relative; width: 100%; padding-bottom: 65%; margin: 20px 0; border-radius: 12px; overflow: hidden; box-shadow: 0 4px 16px rgba(0,0,0,0.15);">  <iframe src="https://talk-lingo.com" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border: none;" loading="lazy"></iframe></div><p style="text-align: center; font-size: 13px; color: #888; margin-top: 4px;">↑ 上方为线上实站点预览（也可直接访问 <a href="https://talk-lingo.com" target="_blank">talk-lingo.com</a>）</p><p>项目包含三个用户端：学生端（浏览教师、预约课程、信用钱包、评价系统）、教师端（个人档案、日历排班、收入看板、评分反馈）和管理后台（数据看板、教师审批、评价审核、违规管理、系统参数配置），合计 28+ 个页面、34+ 个组件。</p><p>技术上的主要亮点：</p><ol><li><strong>动态定价与薪资算法</strong>：平台核心差异化设计。课程价格根据教师预约率动态浮动——高需求教师价格自动上调，低需求时回调，形成正向激励循环。教师薪资同样根据需求与评价自动调节，确保优秀教师获得更高回报。所有参数可在管理后台配置，无需改代码即可调整策略。</li><li><strong>配对比较评价系统</strong>：学生可以对上过课的两位教师进行头对头对比评价，比传统独立评分能产生更可靠的质量信号，帮助平台更准确地识别教学水平差异。</li><li><strong>多维度雷达图评分</strong>：教师评价覆盖多个教学维度，通过雷达图可视化呈现，帮助学生直观了解教师的教学风格和强项，也为教师提供清晰的改进方向。</li><li><strong>中国大陆网络优化</strong>：通过 Cloudflare Workers 实现地理感知路由，针对大陆用户自动选择最优访问路径，降低延迟并提升可用性。</li><li><strong>完整的国际化支持</strong>：中英双语全覆盖，874 个翻译键，所有面向用户的文案均通过翻译系统管理。</li></ol><p>技术栈方面，前端采用 Next.js（App Router + Server Components）+ TypeScript + Tailwind CSS + shadcn&#x2F;ui，后端使用 Supabase（PostgreSQL + Auth + Storage + Realtime），通过 Row-Level Security 确保数据安全。部署在 Vercel 上，Cloudflare 提供 CDN 和 DNS 服务。整个项目从零到上线，涉及完整的全栈开发：数据库设计（21 张表 + 26 个迁移脚本）、认证授权、支付钱包、定时任务、SEO 优化等。</p>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/SoftwareProjects/">SoftwareProjects</category>
      
      <category domain="https://eugenepage.com/tags/WebDevelopment/">WebDevelopment</category>
      
      <category domain="https://eugenepage.com/tags/FullStack/">FullStack</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/04/18/20260428.VeilLingo/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Hermes Agent Research Notes</title>
      <link>https://eugenepage.com/2026/04/16/20260416.Hermes%20Agent/</link>
      <guid>https://eugenepage.com/2026/04/16/20260416.Hermes%20Agent/</guid>
      <pubDate>Thu, 16 Apr 2026 02:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Hermes-Agent-Research-Notes&quot;&gt;&lt;a href=&quot;#Hermes-Agent-Research-Notes&quot; class=&quot;headerlink&quot; title=&quot;Hermes Agent Research Notes&quot;&gt;&lt;/a&gt;Herme</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="Hermes-Agent-Research-Notes"><a href="#Hermes-Agent-Research-Notes" class="headerlink" title="Hermes Agent Research Notes"></a>Hermes Agent Research Notes</h1><h2 id="1-Project-Overview"><a href="#1-Project-Overview" class="headerlink" title="1. Project Overview"></a>1. Project Overview</h2><p><strong>Hermes Agent</strong> is an open-source, self-learning AI agent framework developed by <a href="https://github.com/NousResearch">Nous Research</a>.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>Initial Release</td><td>2026-02-25 (v0.1.0)</td></tr><tr><td>Current Version</td><td>v0.8.0 (2026-04-08)</td></tr><tr><td>GitHub Stars</td><td>22k+</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Language</td><td>Python</td></tr></tbody></table><p>Core philosophy: <strong>an agent should grow alongside its user</strong> — through a built-in learning loop, it creates skills from experience and continuously improves. The more you use it, the better it gets.</p><h2 id="2-Core-Features"><a href="#2-Core-Features" class="headerlink" title="2. Core Features"></a>2. Core Features</h2><h3 id="2-1-Self-Learning-Feedback-Loop"><a href="#2-1-Self-Learning-Feedback-Loop" class="headerlink" title="2.1 Self-Learning Feedback Loop"></a>2.1 Self-Learning Feedback Loop</h3><ul><li>Automatically creates reusable <strong>Skill documents</strong> after completing complex tasks</li><li>Skills self-iterate and improve through usage</li><li>Built-in FTS5 full-text search + LLM summarization for cross-session memory recall</li><li>Honcho-based user modeling to understand who you are</li></ul><h3 id="2-2-Multi-Platform-Integration"><a href="#2-2-Multi-Platform-Integration" class="headerlink" title="2.2 Multi-Platform Integration"></a>2.2 Multi-Platform Integration</h3><p>A single Gateway process covers: Telegram, Discord, Slack, WhatsApp, Signal, Email. 
Supports voice memo transcription with continuous cross-platform conversations.</p><h3 id="2-3-Terminal-Interface"><a href="#2-3-Terminal-Interface" class="headerlink" title="2.3 Terminal Interface"></a>2.3 Terminal Interface</h3><p>Full TUI: multi-line editing, slash command completion, conversation history, interrupt redirection, and streaming tool output.</p><h3 id="2-4-Model-Agnostic"><a href="#2-4-Model-Agnostic" class="headerlink" title="2.4 Model-Agnostic"></a>2.4 Model-Agnostic</h3><p>Supports Nous Portal, OpenRouter (200+ models), OpenAI, Anthropic, Hugging Face, Xiaomi MiMo, and more. Switch with <code>hermes model</code> — zero code changes required.</p><h3 id="2-5-Scheduled-Tasks"><a href="#2-5-Scheduled-Tasks" class="headerlink" title="2.5 Scheduled Tasks"></a>2.5 Scheduled Tasks</h3><p>Built-in Cron scheduler. Define scheduled tasks in natural language (daily digests, backups, audits) and results are automatically delivered to any platform.</p><h3 id="2-6-Parallel-Sub-Agents"><a href="#2-6-Parallel-Sub-Agents" class="headerlink" title="2.6 Parallel Sub-Agents"></a>2.6 Parallel Sub-Agents</h3><p>Spawn isolated sub-agents for parallel workflows. Supports Python scripts that call tools via RPC, compressing multi-step pipelines into single-turn operations with zero context overhead.</p><h3 id="2-7-Flexible-Deployment"><a href="#2-7-Flexible-Deployment" class="headerlink" title="2.7 Flexible Deployment"></a>2.7 Flexible Deployment</h3><p>6 terminal backends: Local, Docker, SSH, Daytona, Singularity, Modal. Serverless on-demand wake-up keeps idle costs near zero.</p><h2 id="3-Quick-Start"><a href="#3-Quick-Start" class="headerlink" title="3. Quick Start"></a>3. 
Quick Start</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install (supports Linux / macOS / WSL2 / Termux)</span></span><br><span class="line">curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash</span><br><span class="line"></span><br><span class="line"><span class="comment"># Start</span></span><br><span class="line"><span class="built_in">source</span> ~/.bashrc</span><br><span class="line">hermes              <span class="comment"># Start a conversation</span></span><br><span class="line">hermes model        <span class="comment"># Select a model</span></span><br><span class="line">hermes tools        <span class="comment"># Configure tools</span></span><br><span class="line">hermes gateway      <span class="comment"># Start the message gateway</span></span><br><span class="line">hermes setup        <span class="comment"># Full setup wizard</span></span><br></pre></td></tr></table></figure><h2 id="4-Comparison-with-OpenClaw"><a href="#4-Comparison-with-OpenClaw" class="headerlink" title="4. Comparison with OpenClaw"></a>4. Comparison with OpenClaw</h2><p><a href="https://github.com/openclaw/openclaw">OpenClaw</a> (formerly Clawdbot&#x2F;MoltBot) was released in January 2026 by Austrian engineer Peter Steinberger, and is the hottest open-source agent project of 2026 (200k+ Stars). 
Hermes has a clear lineage connection — it even ships a built-in OpenClaw migration tool (<code>hermes claw migrate</code>).</p><table><thead><tr><th>Dimension</th><th>Hermes Agent</th><th>OpenClaw</th></tr></thead><tbody><tr><td>Release Date</td><td>2026-02</td><td>2026-01</td></tr><tr><td>Developer</td><td>Nous Research (team)</td><td>Peter Steinberger (solo start)</td></tr><tr><td>GitHub Stars</td><td>22k+</td><td>200k+</td></tr><tr><td>Core Philosophy</td><td><strong>Self-learning loop</strong> — builds skills from experience, continuously iterates</td><td><strong>Autonomous execution</strong> — completes real tasks on behalf of the user</td></tr><tr><td>Skill System</td><td>Auto-created + self-improving, compatible with agentskills.io standard</td><td>Primarily manual configuration, no automatic learning loop</td></tr><tr><td>Model Support</td><td>Model-agnostic (OpenRouter &#x2F; Xiaomi MiMo &#x2F; HuggingFace, etc.)</td><td>Primarily tied to the Claude family</td></tr><tr><td>Messaging Platforms</td><td>Telegram &#x2F; Discord &#x2F; Slack &#x2F; WhatsApp &#x2F; Signal &#x2F; Email</td><td>Telegram &#x2F; Discord &#x2F; Slack &#x2F; Feishu</td></tr><tr><td>Deployment</td><td>VPS &#x2F; Docker &#x2F; SSH &#x2F; Serverless (6 backends)</td><td>Local-first, Docker &#x2F; self-hosted</td></tr><tr><td>Memory System</td><td>Honcho user modeling + FTS5 cross-session search</td><td>MEMORY.md static memory file</td></tr><tr><td>Community Size</td><td>Rapidly growing</td><td>Large ecosystem, rich plugins and templates</td></tr></tbody></table><p><strong>Summary</strong>: OpenClaw has a more mature ecosystem and a larger community — a better fit for users who need autonomous execution out of the box. Hermes is lighter and emphasizes a “the more you use it, the better it knows you” self-learning mechanism, making it ideal for users who want an agent that’s a long-term companion and continuously adapts to their habits. 
Migration paths exist between the two, so you can switch as needed.</p><h2 id="5-Comparison-with-Other-Tools"><a href="#5-Comparison-with-Other-Tools" class="headerlink" title="5. Comparison with Other Tools"></a>5. Comparison with Other Tools</h2><table><thead><tr><th>Feature</th><th>Hermes Agent</th><th>Claude Code</th><th>OpenAI Codex</th></tr></thead><tbody><tr><td>Self-Learning Skill System</td><td>Yes</td><td>Yes (OMC extension)</td><td>No</td></tr><tr><td>Multi-Platform Messaging</td><td>Telegram &#x2F; Discord &#x2F; Slack &#x2F; WhatsApp &#x2F; Signal</td><td>CLI + IDE</td><td>CLI + API</td></tr><tr><td>Model Choice</td><td>Any model</td><td>Claude family</td><td>GPT family</td></tr><tr><td>Scheduled Tasks</td><td>Built-in Cron</td><td>Requires external scheduler</td><td>No</td></tr><tr><td>Deployment</td><td>VPS &#x2F; Docker &#x2F; Serverless</td><td>Local &#x2F; IDE</td><td>Cloud</td></tr><tr><td>Open Source</td><td>MIT</td><td>Partial</td><td>No</td></tr></tbody></table><h2 id="6-Assessment"><a href="#6-Assessment" class="headerlink" title="6. Assessment"></a>6. Assessment</h2><p><strong>Strengths</strong>: Unique self-learning mechanism, model-agnostic, broad platform coverage, flexible deployment, active community.</p><p><strong>Limitations</strong>: The project is relatively new (only 2 months old), and API stability remains to be seen. 
Compared to mature tools like Claude Code, the ecosystem and plugin count still have room to grow.</p><p><strong>Best Use Case</strong>: When you want a long-running personal agent that continuously learns your preferences — especially for cross-platform scenarios (Telegram, WeChat, etc.).</p><hr><blockquote><p>Sources: <a href="https://github.com/nousresearch/hermes-agent">Hermes GitHub</a> | <a href="https://hermes-agent.nousresearch.com/">Hermes Official Docs</a> | <a href="https://github.com/openclaw/openclaw">OpenClaw GitHub</a> | <a href="https://www.mittrchina.com/news/detail/16243">MIT Technology Review China</a></p></blockquote>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/Agent/">Agent</category>
      
      <category domain="https://eugenepage.com/tags/OpenSource/">OpenSource</category>
      
      <category domain="https://eugenepage.com/tags/NousResearch/">NousResearch</category>
      
      
      <comments>https://eugenepage.com/2026/04/16/20260416.Hermes%20Agent/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Hermes Agent Research Notes</title>
      <link>https://eugenepage.com/zh-CN/2026/04/16/20260416.Hermes%20Agent/</link>
      <guid>https://eugenepage.com/zh-CN/2026/04/16/20260416.Hermes%20Agent/</guid>
      <pubDate>Thu, 16 Apr 2026 02:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Hermes-Agent-调研笔记&quot;&gt;&lt;a href=&quot;#Hermes-Agent-调研笔记&quot; class=&quot;headerlink&quot; title=&quot;Hermes Agent 调研笔记&quot;&gt;&lt;/a&gt;Hermes Agent Research Notes&lt;/h1&gt;&lt;h2 id=&quot;一、项目概</description>
        
      
      
      
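      <content:encoded><![CDATA[<h1 id="Hermes-Agent-调研笔记"><a href="#Hermes-Agent-调研笔记" class="headerlink" title="Hermes Agent 调研笔记"></a>Hermes Agent Research Notes</h1><h2 id="一、项目概览"><a href="#一、项目概览" class="headerlink" title="一、项目概览"></a>1. Project Overview</h2><p><strong>Hermes Agent</strong> is an open-source, self-learning AI agent framework developed by <a href="https://github.com/NousResearch">Nous Research</a>.</p><table><thead><tr><th>Project Info</th><th>Details</th></tr></thead><tbody><tr><td>First Release</td><td>2026-02-25 (v0.1.0)</td></tr><tr><td>Current Version</td><td>v0.8.0 (2026-04-08)</td></tr><tr><td>GitHub Stars</td><td>22k+</td></tr><tr><td>License</td><td>MIT</td></tr><tr><td>Language</td><td>Python</td></tr></tbody></table><p>Core philosophy: <strong>an agent should grow alongside its user</strong>. A built-in learning loop lets it create skills from experience and keep improving, so it gets stronger the more it is used.</p><h2 id="二、核心特性"><a href="#二、核心特性" class="headerlink" title="二、核心特性"></a>2. Core Features</h2><h3 id="2-1-自学习闭环"><a href="#2-1-自学习闭环" class="headerlink" title="2.1 自学习闭环"></a>2.1 Self-Learning Loop</h3><ul><li>Automatically creates reusable <strong>Skill documents</strong> after completing complex tasks</li><li>Skills iterate on and refine themselves as they are used</li><li>Built-in FTS5 full-text search plus LLM summarization for cross-session memory recall</li><li>Honcho-based user modeling, so the agent understands who you are</li></ul><h3 id="2-2-多平台接入"><a href="#2-2-多平台接入" class="headerlink" title="2.2 多平台接入"></a>2.2 Multi-Platform Access</h3><p>A single Gateway process covers Telegram, Discord, Slack, WhatsApp, Signal, and Email. It supports voice memo transcription and keeps conversations continuous across platforms.</p><h3 id="2-3-终端交互"><a href="#2-3-终端交互" class="headerlink" title="2.3 终端交互"></a>2.3 Terminal Interaction</h3><p>A full TUI: multi-line editing, slash-command completion, conversation history, interrupt-and-redirect, and streaming tool output.</p><h3 id="2-4-模型无关"><a href="#2-4-模型无关" class="headerlink" title="2.4 模型无关"></a>2.4 Model-Agnostic</h3><p>Supports Nous Portal, OpenRouter (200+ models), OpenAI, Anthropic, Hugging Face, Xiaomi MiMo, and more. <code>hermes model</code> switches models in one step, with no code changes.</p><h3 id="2-5-定时任务"><a href="#2-5-定时任务" class="headerlink" title="2.5 定时任务"></a>2.5 Scheduled Tasks</h3><p>A built-in Cron scheduler lets you define scheduled tasks in natural language (daily reports, backups, audits), with results delivered automatically to any platform.</p><h3 id="2-6-并行子代理"><a href="#2-6-并行子代理" class="headerlink" title="2.6 并行子代理"></a>2.6 Parallel Sub-Agents</h3><p>It can spawn isolated sub-agents for parallel workflows and supports Python scripts that call tools over RPC, collapsing multi-step pipelines into a single turn with zero context overhead.</p><h3 id="2-7-灵活部署"><a href="#2-7-灵活部署" class="headerlink" title="2.7 灵活部署"></a>2.7 Flexible Deployment</h3><p>6 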
terminal backends: Local, Docker, SSH, Daytona, Singularity, and Modal. Serverless on-demand wake-up is supported, so an idle agent costs almost nothing.</p><h2 id="三、快速上手"><a href="#三、快速上手" class="headerlink" title="三、快速上手"></a>3. Quick Start</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install (supports Linux / macOS / WSL2 / Termux)</span></span><br><span class="line">curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash</span><br><span class="line"></span><br><span class="line"><span class="comment"># Launch</span></span><br><span class="line"><span class="built_in">source</span> ~/.bashrc</span><br><span class="line">hermes              <span class="comment"># Start a conversation</span></span><br><span class="line">hermes model        <span class="comment"># Choose a model</span></span><br><span class="line">hermes tools        <span class="comment"># Configure tools</span></span><br><span class="line">hermes gateway      <span class="comment"># Start the messaging gateway</span></span><br><span class="line">hermes setup        <span class="comment"># Full setup wizard</span></span><br></pre></td></tr></table></figure><h2 id="四、与-OpenClaw-对比"><a href="#四、与-OpenClaw-对比" class="headerlink" title="四、与 OpenClaw 对比"></a>4. Comparison with OpenClaw</h2><p><a href="https://github.com/openclaw/openclaw">OpenClaw</a> (formerly Clawdbot&#x2F;MoltBot) was released in January 2026 by Austrian engineer Peter Steinberger and is the hottest open-source agent project of 2026 (200k+ stars). Hermes has a clear lineage connection: it even ships a built-in OpenClaw migration tool (<code>hermes claw migrate</code>).</p><table><thead><tr><th>Dimension</th><th>Hermes Agent</th><th>OpenClaw</th></tr></thead><tbody><tr><td>Release Date</td><td>2026-02</td><td>2026-01</td></tr><tr><td>Developer</td><td>Nous Research (team)</td><td>Peter 
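Steinberger (solo start)</td></tr><tr><td>GitHub Stars</td><td>22k+</td><td>200k+</td></tr><tr><td>Core Philosophy</td><td><strong>Self-learning loop</strong>: builds skills from experience, continuously iterates</td><td><strong>Autonomous execution</strong>: completes real tasks on behalf of the user</td></tr><tr><td>Skill System</td><td>Auto-created + self-improving, compatible with agentskills.io standard</td><td>Primarily manual configuration, no automatic learning loop</td></tr><tr><td>Model Support</td><td>Model-agnostic (OpenRouter&#x2F;Xiaomi MiMo&#x2F;HuggingFace, etc.)</td><td>Primarily tied to the Claude family</td></tr><tr><td>Messaging Platforms</td><td>Telegram&#x2F;Discord&#x2F;Slack&#x2F;WhatsApp&#x2F;Signal&#x2F;Email</td><td>Telegram&#x2F;Discord&#x2F;Slack&#x2F;Feishu</td></tr><tr><td>Deployment</td><td>VPS&#x2F;Docker&#x2F;SSH&#x2F;Serverless (6 backends)</td><td>Local-first, Docker&#x2F;self-hosted</td></tr><tr><td>Memory System</td><td>Honcho user modeling + FTS5 cross-session search</td><td>MEMORY.md static memory file</td></tr><tr><td>Community Size</td><td>Rapidly growing</td><td>Large ecosystem, rich plugins&#x2F;templates</td></tr></tbody></table><p><strong>Summary</strong>: OpenClaw has a more mature ecosystem and a larger community, which makes it a better fit for users who need autonomous execution "out of the box". Hermes is lighter and emphasizes a "the more you use it, the better it knows you" self-learning mechanism, which suits users who want an agent that stays with them long-term and keeps adapting to their habits. Migration paths exist between the two, so you can switch as needed.</p><h2 id="五、与其他工具对比"><a href="#五、与其他工具对比" class="headerlink" title="五、与其他工具对比"></a>5. Comparison with Other Tools</h2><table><thead><tr><th>Feature</th><th>Hermes Agent</th><th>Claude Code</th><th>OpenAI Codex</th></tr></thead><tbody><tr><td>Self-Learning Skill System</td><td>Yes</td><td>Yes (OMC extension)</td><td>No</td></tr><tr><td>Multi-Platform Messaging</td><td>Telegram&#x2F;Discord&#x2F;Slack&#x2F;WhatsApp&#x2F;Signal</td><td>CLI + IDE</td><td>CLI + API</td></tr><tr><td>Model Choice</td><td>Any model</td><td>Claude family</td><td>GPT family</td></tr><tr><td>Scheduled Tasks</td><td>Built-in Cron</td><td>Requires external scheduler</td><td>No</td></tr><tr><td>Deployment</td><td>VPS &#x2F; Docker &#x2F; Serverless</td><td>Local &#x2F; IDE</td><td>Cloud</td></tr><tr><td>Open Source</td><td>MIT</td><td>Partial</td><td>No</td></tr></tbody></table><h2 id="六、评价"><a href="#六、评价" class="headerlink" title="六、评价"></a>6. Assessment</h2><p><strong>Strengths</strong>: Unique self-learning mechanism, model-agnostic, broad platform coverage, flexible deployment, active community.</p><p><strong>Limitations</strong>: The project is relatively new (only 2 months old), and API stability remains to be seen. Compared to mature tools like Claude Code, the ecosystem and plugin count still have room to grow.</p><p><strong>Best Use Case</strong>: When you want a long-running personal agent that continuously learns your preferences, especially for cross-platform scenarios (Telegram&#x2F;WeChat).</p><hr><blockquote><p>Sources: <a 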
href="https://github.com/nousresearch/hermes-agent">Hermes GitHub</a> | <a href="https://hermes-agent.nousresearch.com/">Hermes Official Docs</a> | <a href="https://github.com/openclaw/openclaw">OpenClaw GitHub</a> | <a href="https://www.mittrchina.com/news/detail/16243">MIT Technology Review China</a></p></blockquote>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/Agent/">Agent</category>
      
      <category domain="https://eugenepage.com/tags/OpenSource/">OpenSource</category>
      
      <category domain="https://eugenepage.com/tags/NousResearch/">NousResearch</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/04/16/20260416.Hermes%20Agent/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>SQL Basics Notes</title>
      <link>https://eugenepage.com/2026/04/12/20260412.SQLBasicsNotes/</link>
      <guid>https://eugenepage.com/2026/04/12/20260412.SQLBasicsNotes/</guid>
      <pubDate>Sun, 12 Apr 2026 02:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;I-Introduction-to-SQL&quot;&gt;&lt;a href=&quot;#I-Introduction-to-SQL&quot; class=&quot;headerlink&quot; title=&quot;I. Introduction to SQL&quot;&gt;&lt;/a&gt;I. Introduction to SQL</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="I-Introduction-to-SQL"><a href="#I-Introduction-to-SQL" class="headerlink" title="I. Introduction to SQL"></a>I. Introduction to SQL</h1><h2 id="1-1-What-is-SQL"><a href="#1-1-What-is-SQL" class="headerlink" title="1.1 What is SQL"></a>1.1 What is SQL</h2><p><strong>SQL (Structured Query Language)</strong>: the standard programming language for managing relational databases.</p><ul><li><strong>RDBMS</strong>: Relational Database Management System</li><li>Common databases (by type): MySQL, PostgreSQL, SQLite, Oracle, SQL Server<ol><li>SQLite: lightweight, embedded — great for mobile apps</li><li>MySQL: open-source, widely used — great for web apps</li><li>PostgreSQL: open-source, feature-rich — great for complex apps</li><li>Oracle: enterprise-grade, fully featured — great for large-scale apps</li><li>SQL Server: developed by Microsoft — great for Windows environments</li></ol></li></ul><h2 id="1-2-Basic-SQL-Categories"><a href="#1-2-Basic-SQL-Categories" class="headerlink" title="1.2 Basic SQL Categories"></a>1.2 Basic SQL Categories</h2><p>Four schools of thought — these are the disciplines you use to communicate with a database. Master them and you’re a data wrangler; give up and you’re just a data janitor. 🐶</p><table><thead><tr><th>Category</th><th>Purpose</th><th>Keywords</th></tr></thead><tbody><tr><td>DDL</td><td>Define database structure</td><td>CREATE, ALTER, DROP</td></tr><tr><td>DML</td><td>Manipulate data</td><td>INSERT, UPDATE, DELETE</td></tr><tr><td>DQL</td><td>Query data</td><td>SELECT</td></tr><tr><td>DCL</td><td>Control permissions</td><td>GRANT, REVOKE</td></tr></tbody></table><hr><h1 id="II-Basic-Syntax"><a href="#II-Basic-Syntax" class="headerlink" title="II. Basic Syntax"></a>II. 
Basic Syntax</h1><h2 id="2-1-Basic-Rules"><a href="#2-1-Basic-Rules" class="headerlink" title="2.1 Basic Rules"></a>2.1 Basic Rules</h2><ul><li>SQL statements end with a semicolon <code>;</code> (some databases allow omitting it)</li><li>Keywords are case-insensitive, but the convention is to write keywords in uppercase and table&#x2F;column names in lowercase</li><li>Strings and dates are wrapped in single quotes <code>&#39; &#39;</code></li><li>Comments: <code>-- single-line comment</code>, <code>/* multi-line comment */</code></li></ul><h2 id="2-2-Writing-Style"><a href="#2-2-Writing-Style" class="headerlink" title="2.2 Writing Style"></a>2.2 Writing Style</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Recommended writing style</span></span><br><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">    id,</span><br><span class="line">    name,</span><br><span class="line">    email</span><br><span class="line"><span class="keyword">FROM</span></span><br><span class="line">    users<span class="comment">/* Whether to use double quotes depends on the database type */</span></span><br><span class="line"><span class="keyword">WHERE</span></span><br><span class="line">    status <span class="operator">=</span> <span class="string">&#x27;active&#x27;</span></span><br><span class="line"><span class="keyword">ORDER</span> <span class="keyword">BY</span></span><br><span class="line">    create_time <span class="keyword">DESC</span>;</span><br></pre></td></tr></table></figure><h2 id="2-3-Common-Operators"><a 
href="#2-3-Common-Operators" class="headerlink" title="2.3 Common Operators"></a>2.3 Common Operators</h2><h3 id="Arithmetic-Operators"><a href="#Arithmetic-Operators" class="headerlink" title="Arithmetic Operators"></a>Arithmetic Operators</h3><table><thead><tr><th>Operator</th><th>Description</th></tr></thead><tbody><tr><td><code>+</code></td><td>Addition</td></tr><tr><td><code>-</code></td><td>Subtraction</td></tr><tr><td><code>*</code></td><td>Multiplication</td></tr><tr><td><code>/</code></td><td>Division</td></tr><tr><td><code>%</code> or <code>MOD()</code></td><td>Modulo</td></tr></tbody></table><h3 id="Comparison-Operators"><a href="#Comparison-Operators" class="headerlink" title="Comparison Operators"></a>Comparison Operators</h3><table><thead><tr><th>Operator</th><th>Description</th></tr></thead><tbody><tr><td><code>=</code></td><td>Equal to</td></tr><tr><td><code>&lt;&gt;</code> or <code>!=</code></td><td>Not equal to</td></tr><tr><td><code>&gt;</code></td><td>Greater than</td></tr><tr><td><code>&lt;</code></td><td>Less than</td></tr><tr><td><code>&gt;=</code></td><td>Greater than or equal to</td></tr><tr><td><code>&lt;=</code></td><td>Less than or equal to</td></tr></tbody></table><h3 id="Logical-Operators"><a href="#Logical-Operators" class="headerlink" title="Logical Operators"></a>Logical Operators</h3><table><thead><tr><th>Operator</th><th>Description</th></tr></thead><tbody><tr><td><code>AND</code></td><td>Logical AND (higher precedence than OR — use parentheses like in C++)</td></tr><tr><td><code>OR</code></td><td>Logical OR</td></tr><tr><td><code>NOT</code></td><td>Logical NOT</td></tr></tbody></table><h2 id="2-4-Common-Commands-MySQL"><a href="#2-4-Common-Commands-MySQL" class="headerlink" title="2.4 Common Commands (MySQL)"></a>2.4 Common Commands (MySQL)</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Show all databases</span></span><br><span class="line"><span class="keyword">SHOW</span> DATABASES;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Show all tables in the current database</span></span><br><span class="line"><span class="keyword">SHOW</span> TABLES;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- View table structure</span></span><br><span class="line"><span class="keyword">DESC</span> table_name;</span><br><span class="line"><span class="comment">-- or</span></span><br><span class="line"><span class="keyword">DESCRIBE</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- View the CREATE TABLE statement</span></span><br><span class="line"><span class="keyword">SHOW</span> <span class="keyword">CREATE TABLE</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Show full column info for a table</span></span><br><span class="line"><span class="keyword">SHOW</span> <span class="keyword">FULL</span> COLUMNS <span class="keyword">FROM</span> table_name;</span><br></pre></td></tr></table></figure><h2 id="2-5-⚠️-Things-to-Watch-Out-For"><a href="#2-5-⚠️-Things-to-Watch-Out-For" class="headerlink" title="2.5 ⚠️ Things to Watch Out For"></a>2.5 ⚠️ Things to Watch Out For</h2><ol><li>The clauses of a query must be written in a fixed order: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT.</li><li><img 
src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-12%2018.51.16.png" alt="Screenshot 2026-04-12 18.51.16"></li></ol><hr><h1 id="III-DDL-—-Data-Definition"><a href="#III-DDL-—-Data-Definition" class="headerlink" title="III. DDL — Data Definition"></a>III. DDL — Data Definition</h1><h2 id="2-1-Creating-a-Database"><a href="#2-1-Creating-a-Database" class="headerlink" title="2.1 Creating a Database"></a>2.1 Creating a Database</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> DATABASE database_name;</span><br><span class="line">USE database_name;</span><br></pre></td></tr></table></figure><h2 id="2-2-Creating-a-Table"><a href="#2-2-Creating-a-Table" class="headerlink" title="2.2 Creating a Table"></a>2.2 Creating a Table</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE TABLE</span> table_name (</span><br><span class="line">    column1 data_type [<span class="keyword">constraint</span>],</span><br><span class="line">    column2 data_type [<span class="keyword">constraint</span>],</span><br><span class="line">    ...</span><br><span class="line">);</span><br></pre></td></tr></table></figure><p><strong>Common data types</strong>:</p><ul><li>Integer: <code>INT</code>, <code>BIGINT</code></li><li>Decimal: <code>DECIMAL(m,n)</code>, <code>FLOAT</code>, <code>DOUBLE</code></li><li>String: <code>VARCHAR(n)</code>, <code>CHAR(n)</code>, <code>TEXT</code></li><li>Date&#x2F;Time: <code>DATE</code>, <code>DATETIME</code>, <code>TIMESTAMP</code></li></ul><h2 id="2-3-Constraints"><a href="#2-3-Constraints" class="headerlink" 
title="2.3 Constraints"></a>2.3 Constraints</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE TABLE</span> users (</span><br><span class="line">    id <span class="type">INT</span> <span class="keyword">PRIMARY KEY</span> AUTO_INCREMENT,</span><br><span class="line">    name <span class="type">VARCHAR</span>(<span class="number">50</span>) <span class="keyword">NOT NULL</span>,</span><br><span class="line">    email <span class="type">VARCHAR</span>(<span class="number">100</span>) <span class="keyword">UNIQUE</span>,</span><br><span class="line">    age <span class="type">INT</span> <span class="keyword">DEFAULT</span> <span class="number">18</span>,</span><br><span class="line">    dept_id <span class="type">INT</span>,</span><br><span class="line">    <span class="keyword">FOREIGN KEY</span> (dept_id) <span class="keyword">REFERENCES</span> departments(id)</span><br><span class="line">);</span><br></pre></td></tr></table></figure><p><strong>Common constraints</strong>:</p><ul><li><code>PRIMARY KEY</code>: primary key, uniquely identifies a row</li><li><code>NOT NULL</code>: value cannot be null</li><li><code>UNIQUE</code>: value must be unique across rows</li><li><code>DEFAULT</code>: value used when none is supplied on insert</li><li><code>FOREIGN KEY</code>: foreign key, the column value must match a row in the referenced table</li><li><code>AUTO_INCREMENT</code>: auto-increment (MySQL)</li></ul><h2 id="2-4-Altering-Table-Structure"><a href="#2-4-Altering-Table-Structure" class="headerlink" title="2.4 Altering Table Structure"></a>2.4 Altering Table Structure</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Add a column</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> table_name <span class="keyword">ADD</span> column_name data_type;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Modify a column</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> table_name MODIFY column_name new_data_type;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Drop a column</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> table_name <span class="keyword">DROP</span> <span class="keyword">COLUMN</span> column_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Rename a table</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> table_name RENAME <span class="keyword">TO</span> new_table_name;</span><br></pre></td></tr></table></figure><h2 id="2-5-Dropping-a-Table"><a href="#2-5-Dropping-a-Table" class="headerlink" title="2.5 Dropping a Table"></a>2.5 Dropping a Table</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">DROP</span> <span class="keyword">TABLE</span> table_name;           <span class="comment">-- Drop the table entirely</span></span><br><span class="line"><span class="keyword">TRUNCATE</span> <span class="keyword">TABLE</span> table_name;       <span class="comment">-- Clear all data (keep the structure)</span></span><br></pre></td></tr></table></figure><hr><h1 id="III-DML-—-Data-Manipulation"><a href="#III-DML-—-Data-Manipulation" class="headerlink" title="III. 
DML — Data Manipulation"></a>III. DML — Data Manipulation</h1><h2 id="3-1-Inserting-Data"><a href="#3-1-Inserting-Data" class="headerlink" title="3.1 Inserting Data"></a>3.1 Inserting Data</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Insert a single row</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> table_name (col1, col2) <span class="keyword">VALUES</span> (val1, val2);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Insert multiple rows</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> table_name (col1, col2) <span class="keyword">VALUES</span></span><br><span class="line">(val1, val2),</span><br><span class="line">(val3, val4),</span><br><span class="line">(val5, val6);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Import from another table</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> table_name <span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> other_table <span class="keyword">WHERE</span> <span class="keyword">condition</span>;</span><br></pre></td></tr></table></figure><h2 id="3-2-Updating-Data"><a href="#3-2-Updating-Data" class="headerlink" title="3.2 Updating Data"></a>3.2 Updating Data</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="keyword">UPDATE</span> table_name</span><br><span class="line"><span class="keyword">SET</span> col1 <span class="operator">=</span> new_val1, col2 <span class="operator">=</span> new_val2</span><br><span class="line"><span class="keyword">WHERE</span> <span class="keyword">condition</span>;</span><br></pre></td></tr></table></figure><h2 id="3-3-Deleting-Data"><a href="#3-3-Deleting-Data" class="headerlink" title="3.3 Deleting Data"></a>3.3 Deleting Data</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">DELETE</span> <span class="keyword">FROM</span> table_name <span class="keyword">WHERE</span> <span class="keyword">condition</span>;</span><br></pre></td></tr></table></figure><hr><h1 id="IV-DQL-—-Data-Query-Core"><a href="#IV-DQL-—-Data-Query-Core" class="headerlink" title="IV. DQL — Data Query (Core)"></a>IV. DQL — Data Query (Core)</h1><h2 id="4-1-Basic-Queries"><a href="#4-1-Basic-Queries" class="headerlink" title="4.1 Basic Queries"></a>4.1 Basic Queries</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Query all columns</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Query specific columns</span></span><br><span class="line"><span class="keyword">SELECT</span> col1, col2 <span class="keyword">FROM</span> 
table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Deduplicate</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="keyword">DISTINCT</span> col <span class="keyword">FROM</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Alias</span></span><br><span class="line"><span class="keyword">SELECT</span> col <span class="keyword">AS</span> alias <span class="keyword">FROM</span> table_name;</span><br></pre></td></tr></table></figure><h2 id="4-2-Conditional-Queries-—-WHERE"><a href="#4-2-Conditional-Queries-—-WHERE" class="headerlink" title="4.2 Conditional Queries — WHERE"></a>4.2 Conditional Queries — WHERE</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name <span class="keyword">WHERE</span> <span class="keyword">condition</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Comparison operators</span></span><br><span class="line"><span class="keyword">WHERE</span> age <span class="operator">&gt;</span> <span class="number">18</span></span><br><span class="line"><span 
class="keyword">WHERE</span> name <span class="operator">=</span> <span class="string">&#x27;Alice&#x27;</span></span><br><span class="line"><span class="keyword">WHERE</span> age <span class="operator">&gt;=</span> <span class="number">18</span> <span class="keyword">AND</span> age <span class="operator">&lt;=</span> <span class="number">30</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- Range</span></span><br><span class="line"><span class="keyword">WHERE</span> age <span class="keyword">BETWEEN</span> <span class="number">18</span> <span class="keyword">AND</span> <span class="number">30</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- Enumeration</span></span><br><span class="line"><span class="keyword">WHERE</span> status <span class="keyword">IN</span> (<span class="string">&#x27;active&#x27;</span>, <span class="string">&#x27;pending&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Pattern matching</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="keyword">LIKE</span> <span class="string">&#x27;A%&#x27;</span>       <span class="comment">-- starts with A</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="keyword">LIKE</span> <span class="string">&#x27;%son%&#x27;</span>    <span class="comment">-- contains &quot;son&quot;</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="keyword">LIKE</span> <span class="string">&#x27;A_&#x27;</span>       <span class="comment">-- starts with A, exactly 2 characters</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- Null checks</span></span><br><span class="line"><span class="keyword">WHERE</span> email <span class="keyword">IS</span> <span class="keyword">NULL</span></span><br><span class="line"><span class="keyword">WHERE</span> email 
<span class="keyword">IS</span> <span class="keyword">NOT NULL</span></span><br></pre></td></tr></table></figure><h2 id="4-3-Sorting-—-ORDER-BY"><a href="#4-3-Sorting-—-ORDER-BY" class="headerlink" title="4.3 Sorting — ORDER BY"></a>4.3 Sorting — ORDER BY</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name <span class="keyword">ORDER</span> <span class="keyword">BY</span> col1 <span class="keyword">ASC</span>, col2 <span class="keyword">DESC</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- ASC: ascending (default)</span></span><br><span class="line"><span class="comment">-- DESC: descending</span></span><br></pre></td></tr></table></figure><h2 id="4-4-Limiting-Results-—-LIMIT"><a href="#4-4-Limiting-Results-—-LIMIT" class="headerlink" title="4.4 Limiting Results — LIMIT"></a>4.4 Limiting Results — LIMIT</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- MySQL</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name LIMIT <span class="number">10</span>;</span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name LIMIT <span class="number">5</span>, <span class="number">10</span>;  <span 
class="comment">-- Skip the first 5 rows, then fetch the next 10 (offset 5, count 10)</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- SQL Server</span></span><br><span class="line"><span class="keyword">SELECT</span> TOP <span class="number">10</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Oracle</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name <span class="keyword">WHERE</span> ROWNUM <span class="operator">&lt;=</span> <span class="number">10</span>;</span><br></pre></td></tr></table></figure><h2 id="4-5-Aggregate-Functions"><a href="#4-5-Aggregate-Functions" class="headerlink" title="4.5 Aggregate Functions"></a>4.5 Aggregate Functions</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">    <span class="built_in">COUNT</span>(<span class="operator">*</span>)          <span class="keyword">AS</span> total_rows,</span><br><span class="line">    <span class="built_in">COUNT</span>(col)        <span class="keyword">AS</span> non_null_count,</span><br><span class="line">    <span class="built_in">SUM</span>(col)          <span class="keyword">AS</span> total,</span><br><span class="line">    <span class="built_in">AVG</span>(col)          <span class="keyword">AS</span> average,</span><br><span class="line">    <span class="built_in">MAX</span>(col)          <span class="keyword">AS</span> maximum,</span><br><span class="line">    <span 
class="built_in">MIN</span>(col)          <span class="keyword">AS</span> minimum</span><br><span class="line"><span class="keyword">FROM</span> table_name;</span><br></pre></td></tr></table></figure><h2 id="4-6-Grouping-—-GROUP-BY"><a href="#4-6-Grouping-—-GROUP-BY" class="headerlink" title="4.6 Grouping — GROUP BY"></a>4.6 Grouping — GROUP BY</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> col, aggregate_function</span><br><span class="line"><span class="keyword">FROM</span> table_name</span><br><span class="line"><span class="keyword">GROUP</span> <span class="keyword">BY</span> col</span><br><span class="line"><span class="keyword">HAVING</span> aggregate_condition;</span><br></pre></td></tr></table></figure><p><strong>Note</strong>: <code>WHERE</code> filters before grouping; <code>HAVING</code> filters after grouping.</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Example: average salary per department</span></span><br><span class="line"><span class="keyword">SELECT</span> dept_id, <span class="built_in">AVG</span>(salary) <span class="keyword">AS</span> avg_salary</span><br><span class="line"><span class="keyword">FROM</span> employees</span><br><span class="line"><span class="keyword">GROUP</span> <span class="keyword">BY</span> dept_id</span><br><span class="line"><span class="keyword">HAVING</span> <span class="built_in">AVG</span>(salary) <span class="operator">&gt;</span> <span class="number">5000</span>;</span><br></pre></td></tr></table></figure><h2 
id="4-7-Multi-Table-Queries"><a href="#4-7-Multi-Table-Queries" class="headerlink" title="4.7 Multi-Table Queries"></a>4.7 Multi-Table Queries</h2><h3 id="Joins-JOIN"><a href="#Joins-JOIN" class="headerlink" title="Joins (JOIN)"></a>Joins (JOIN)</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- INNER JOIN: only keep matching rows</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> table1</span><br><span class="line"><span class="keyword">INNER</span> <span class="keyword">JOIN</span> table2 <span class="keyword">ON</span> table1.col <span class="operator">=</span> table2.col;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- LEFT JOIN: keep all rows from the left table; NULLs where there&#x27;s no match on the right</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> table1</span><br><span class="line"><span class="keyword">LEFT</span> <span class="keyword">JOIN</span> table2 <span class="keyword">ON</span> table1.col <span class="operator">=</span> table2.col;</span><br><span class="line"></span><br><span class="line"><span 
class="comment">-- RIGHT JOIN: keep all rows from the right table</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> table1</span><br><span class="line"><span class="keyword">RIGHT</span> <span class="keyword">JOIN</span> table2 <span class="keyword">ON</span> table1.col <span class="operator">=</span> table2.col;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- FULL JOIN (MySQL doesn&#x27;t support this natively — simulate with UNION)</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table1 <span class="keyword">LEFT</span> <span class="keyword">JOIN</span> table2 <span class="keyword">ON</span> ...</span><br><span class="line"><span class="keyword">UNION</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table1 <span class="keyword">RIGHT</span> <span class="keyword">JOIN</span> table2 <span class="keyword">ON</span> ...;</span><br></pre></td></tr></table></figure><h3 id="Subqueries"><a href="#Subqueries" class="headerlink" title="Subqueries"></a>Subqueries</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Subquery in WHERE</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name <span class="keyword">WHERE</span> col <span class="operator">=</span> (<span class="keyword">SELECT</span> col <span 
class="keyword">FROM</span> ...);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- IN subquery</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name <span class="keyword">WHERE</span> col <span class="keyword">IN</span> (<span class="keyword">SELECT</span> col <span class="keyword">FROM</span> ...);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- EXISTS subquery</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> table_name <span class="keyword">WHERE</span> <span class="keyword">EXISTS</span> (<span class="keyword">SELECT</span> <span class="number">1</span> <span class="keyword">FROM</span> ... <span class="keyword">WHERE</span> <span class="keyword">condition</span>);</span><br></pre></td></tr></table></figure><h2 id="4-8-UNION-—-Combined-Queries"><a href="#4-8-UNION-—-Combined-Queries" class="headerlink" title="4.8 UNION — Combined Queries"></a>4.8 UNION — Combined Queries</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> col <span class="keyword">FROM</span> table1</span><br><span class="line"><span class="keyword">UNION</span>                 <span class="comment">-- merge and deduplicate</span></span><br><span class="line"><span class="keyword">SELECT</span> col <span class="keyword">FROM</span> table2;</span><br><span class="line"></span><br><span class="line"><span class="keyword">SELECT</span> col <span class="keyword">FROM</span> table1</span><br><span class="line"><span 
class="keyword">UNION</span> <span class="keyword">ALL</span>            <span class="comment">-- merge and keep duplicates</span></span><br><span class="line"><span class="keyword">SELECT</span> col <span class="keyword">FROM</span> table2;</span><br></pre></td></tr></table></figure><hr><h1 id="V-Common-Functions"><a href="#V-Common-Functions" class="headerlink" title="V. Common Functions"></a>V. Common Functions</h1><h2 id="5-1-String-Functions"><a href="#5-1-String-Functions" class="headerlink" title="5.1 String Functions"></a>5.1 String Functions</h2><table><thead><tr><th>Function</th><th>Description</th></tr></thead><tbody><tr><td><code>CONCAT(s1, s2)</code></td><td>Concatenate strings</td></tr><tr><td><code>LENGTH(s)</code></td><td>String length in bytes (MySQL); use <code>CHAR_LENGTH(s)</code> to count characters</td></tr><tr><td><code>UPPER(s)</code> &#x2F; <code>LOWER(s)</code></td><td>Convert case</td></tr><tr><td><code>TRIM(s)</code></td><td>Strip leading&#x2F;trailing spaces</td></tr><tr><td><code>SUBSTRING(s, start, len)</code></td><td>Extract <code>len</code> characters starting at position <code>start</code> (1-based)</td></tr><tr><td><code>REPLACE(s, old, new)</code></td><td>Replace every occurrence of a substring</td></tr><tr><td><code>IFNULL(s, default)</code></td><td>Replace NULL with a default value</td></tr></tbody></table><h2 id="5-2-Numeric-Functions"><a href="#5-2-Numeric-Functions" class="headerlink" title="5.2 Numeric Functions"></a>5.2 Numeric Functions</h2><table><thead><tr><th>Function</th><th>Description</th></tr></thead><tbody><tr><td><code>ROUND(n, d)</code></td><td>Round to d decimal places</td></tr><tr><td><code>CEIL(n)</code> &#x2F; <code>FLOOR(n)</code></td><td>Round up &#x2F; round down to the nearest integer</td></tr><tr><td><code>ABS(n)</code></td><td>Absolute value</td></tr><tr><td><code>MOD(n, m)</code></td><td>Remainder of n divided by m</td></tr><tr><td><code>RAND()</code></td><td>Random floating-point number in [0, 1)</td></tr></tbody></table><h2 id="5-3-Date-Functions"><a href="#5-3-Date-Functions" class="headerlink" title="5.3 Date Functions"></a>5.3 Date 
Functions</h2><table><thead><tr><th>Function</th><th>Description</th></tr></thead><tbody><tr><td><code>NOW()</code> &#x2F; <code>SYSDATE()</code></td><td>Current date and time</td></tr><tr><td><code>CURDATE()</code></td><td>Current date</td></tr><tr><td><code>YEAR(d)</code> &#x2F; <code>MONTH(d)</code> &#x2F; <code>DAY(d)</code></td><td>Extract year &#x2F; month &#x2F; day</td></tr><tr><td><code>DATE_FORMAT(d, format)</code></td><td>Format a date</td></tr><tr><td><code>DATE_ADD(d, INTERVAL n unit)</code></td><td>Add&#x2F;subtract from a date</td></tr><tr><td><code>DATEDIFF(d1, d2)</code></td><td>Difference between two dates</td></tr></tbody></table><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> DATE_FORMAT(create_time, <span class="string">&#x27;%Y-%m-%d %H:%i:%s&#x27;</span>) <span class="keyword">FROM</span> table_name;</span><br></pre></td></tr></table></figure><h2 id="5-4-Conditional-Logic"><a href="#5-4-Conditional-Logic" class="headerlink" title="5.4 Conditional Logic"></a>5.4 Conditional Logic</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- IF</span></span><br><span class="line"><span class="keyword">SELECT</span> IF(age <span class="operator">&gt;=</span> <span class="number">18</span>, <span class="string">&#x27;adult&#x27;</span>, <span class="string">&#x27;minor&#x27;</span>) <span class="keyword">FROM</span> table_name;</span><br><span 
class="line"></span><br><span class="line"><span class="comment">-- CASE WHEN</span></span><br><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">    <span class="keyword">CASE</span></span><br><span class="line">        <span class="keyword">WHEN</span> score <span class="operator">&gt;=</span> <span class="number">90</span> <span class="keyword">THEN</span> <span class="string">&#x27;A&#x27;</span></span><br><span class="line">        <span class="keyword">WHEN</span> score <span class="operator">&gt;=</span> <span class="number">80</span> <span class="keyword">THEN</span> <span class="string">&#x27;B&#x27;</span></span><br><span class="line">        <span class="keyword">WHEN</span> score <span class="operator">&gt;=</span> <span class="number">60</span> <span class="keyword">THEN</span> <span class="string">&#x27;C&#x27;</span></span><br><span class="line">        <span class="keyword">ELSE</span> <span class="string">&#x27;D&#x27;</span></span><br><span class="line">    <span class="keyword">END</span> <span class="keyword">AS</span> grade</span><br><span class="line"><span class="keyword">FROM</span> table_name;</span><br></pre></td></tr></table></figure><hr><h1 id="VI-Indexes"><a href="#VI-Indexes" class="headerlink" title="VI. Indexes"></a>VI. 
Indexes</h1><h2 id="6-1-Index-Types"><a href="#6-1-Index-Types" class="headerlink" title="6.1 Index Types"></a>6.1 Index Types</h2><table><thead><tr><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>Regular index</td><td>Allows duplicate values</td></tr><tr><td>Unique index</td><td>Values must be unique</td></tr><tr><td>Primary key index</td><td>Auto-created with the primary key; unique and not null</td></tr><tr><td>Full-text index</td><td>Full-text search (MyISAM; also InnoDB since MySQL 5.6)</td></tr><tr><td>Composite index</td><td>Spans multiple columns</td></tr></tbody></table><h2 id="6-2-Creating-Indexes"><a href="#6-2-Creating-Indexes" class="headerlink" title="6.2 Creating Indexes"></a>6.2 Creating Indexes</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Create an index</span></span><br><span class="line"><span class="keyword">CREATE</span> INDEX index_name <span class="keyword">ON</span> table_name(col);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Create a unique index</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">UNIQUE</span> INDEX index_name <span class="keyword">ON</span> table_name(col);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Create a composite index</span></span><br><span class="line"><span class="keyword">CREATE</span> INDEX index_name <span class="keyword">ON</span> table_name(col1, col2);</span><br><span 
class="line"></span><br><span class="line"><span class="comment">-- View indexes</span></span><br><span class="line"><span class="keyword">SHOW</span> INDEX <span class="keyword">FROM</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Drop an index</span></span><br><span class="line"><span class="keyword">DROP</span> INDEX index_name <span class="keyword">ON</span> table_name;</span><br></pre></td></tr></table></figure><h2 id="6-3-Indexing-Principles"><a href="#6-3-Indexing-Principles" class="headerlink" title="6.3 Indexing Principles"></a>6.3 Indexing Principles</h2><ul><li><strong>Good candidates</strong>: large tables, frequently queried columns, columns often used in WHERE</li><li><strong>Avoid</strong>: small tables, frequently updated columns, low-cardinality columns</li><li><strong>Leftmost prefix rule</strong>: a composite index on <code>(col1, col2)</code> can serve filters on <code>col1</code> alone or on <code>col1</code> and <code>col2</code> together, but generally not on <code>col2</code> alone</li></ul><hr><h1 id="VII-Transactions"><a href="#VII-Transactions" class="headerlink" title="VII. Transactions"></a>VII. 
Transactions</h1><h2 id="7-1-Transaction-Properties-ACID"><a href="#7-1-Transaction-Properties-ACID" class="headerlink" title="7.1 Transaction Properties (ACID)"></a>7.1 Transaction Properties (ACID)</h2><ul><li><strong>Atomicity</strong>: either everything succeeds or everything fails</li><li><strong>Consistency</strong>: data is in a valid state before and after the transaction</li><li><strong>Isolation</strong>: concurrent transactions don’t interfere with each other</li><li><strong>Durability</strong>: once committed, data is permanently saved</li></ul><h2 id="7-2-Transaction-Control"><a href="#7-2-Transaction-Control" class="headerlink" title="7.2 Transaction Control"></a>7.2 Transaction Control</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Start a transaction</span></span><br><span class="line"><span class="keyword">START</span> TRANSACTION;</span><br><span class="line"><span class="comment">-- or</span></span><br><span class="line"><span class="keyword">BEGIN</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Commit</span></span><br><span class="line"><span class="keyword">COMMIT</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Rollback</span></span><br><span class="line"><span class="keyword">ROLLBACK</span>;</span><br><span class="line"></span><br><span class="line"><span 
class="comment">-- Set a savepoint</span></span><br><span class="line"><span class="keyword">SAVEPOINT</span> savepoint_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Roll back to a savepoint</span></span><br><span class="line"><span class="keyword">ROLLBACK</span> <span class="keyword">TO</span> savepoint_name;</span><br></pre></td></tr></table></figure><h2 id="7-3-Isolation-Levels"><a href="#7-3-Isolation-Levels" class="headerlink" title="7.3 Isolation Levels"></a>7.3 Isolation Levels</h2><table><thead><tr><th>Isolation Level</th><th>Dirty Read</th><th>Non-Repeatable Read</th><th>Phantom Read</th></tr></thead><tbody><tr><td>READ UNCOMMITTED</td><td>Possible</td><td>Possible</td><td>Possible</td></tr><tr><td>READ COMMITTED</td><td>Not possible</td><td>Possible</td><td>Possible</td></tr><tr><td>REPEATABLE READ (MySQL InnoDB default)</td><td>Not possible</td><td>Not possible</td><td>Possible</td></tr><tr><td>SERIALIZABLE</td><td>Not possible</td><td>Not possible</td><td>Not possible</td></tr></tbody></table><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SET</span> SESSION TRANSACTION ISOLATION LEVEL level;</span><br></pre></td></tr></table></figure><hr><h1 id="VIII-Views"><a href="#VIII-Views" class="headerlink" title="VIII. Views"></a>VIII. 
Views</h1><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Create a view</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">VIEW</span> view_name <span class="keyword">AS</span></span><br><span class="line"><span class="keyword">SELECT</span> col1, col2</span><br><span class="line"><span class="keyword">FROM</span> table_name</span><br><span class="line"><span class="keyword">WHERE</span> <span class="keyword">condition</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Use a view</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> view_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Drop a view</span></span><br><span class="line"><span class="keyword">DROP</span> <span class="keyword">VIEW</span> view_name;</span><br></pre></td></tr></table></figure><hr><h1 id="IX-References"><a href="#IX-References" class="headerlink" title="IX. References"></a>IX. References</h1><ol><li><p><a href="https://www.bilibili.com/video/BV1bQxMehETa/">Bilibili Quick-Start Course</a></p></li><li><p><a href="https://sqlzoo.net/wiki/SQL_Tutorial">SQL Practice Website</a></p></li></ol>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/SQL/">SQL</category>
      
      <category domain="https://eugenepage.com/tags/Database/">Database</category>
      
      
      <comments>https://eugenepage.com/2026/04/12/20260412.SQLBasicsNotes/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>SQL Basics Notes</title>
      <link>https://eugenepage.com/zh-CN/2026/04/12/20260412.SQLBasicsNotes/</link>
      <guid>https://eugenepage.com/zh-CN/2026/04/12/20260412.SQLBasicsNotes/</guid>
      <pubDate>Sun, 12 Apr 2026 02:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;一、SQL-简介&quot;&gt;&lt;a href=&quot;#一、SQL-简介&quot; class=&quot;headerlink&quot; title=&quot;I. Introduction to SQL&quot;&gt;&lt;/a&gt;I. Introduction to SQL&lt;/h1&gt;&lt;h2 id=&quot;1-1-什么是-SQL&quot;&gt;&lt;a href=&quot;#1-1-什么是-SQL&quot; class</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="一、SQL-简介"><a href="#一、SQL-简介" class="headerlink" title="I. Introduction to SQL"></a>I. Introduction to SQL</h1><h2 id="1-1-什么是-SQL"><a href="#1-1-什么是-SQL" class="headerlink" title="1.1 What Is SQL"></a>1.1 What Is SQL</h2><p><strong>SQL (Structured Query Language)</strong>: the standard programming language for managing relational databases.</p><ul><li><strong>RDBMS</strong>: Relational Database Management System</li><li>Common databases: MySQL, PostgreSQL, SQLite, Oracle, SQL Server<br>1. SQLite: lightweight and embedded; suited to mobile apps<br>2. MySQL: open source and popular; suited to web applications<br>3. PostgreSQL: open source and feature-rich; suited to complex applications<br>4. Oracle: enterprise-grade and full-featured; suited to large applications<br>5. SQL Server: developed by Microsoft; suited to Windows environments</li></ul><h2 id="1-2-SQL-基本分类"><a href="#1-2-SQL-基本分类" class="headerlink" title="1.2 SQL Categories"></a>1.2 SQL Categories</h2><p>The four great schools: these are the signature techniques for talking to a database. Master them all and you still just end up hauling data around. 🐶</p><table><thead><tr><th>Category</th><th>Purpose</th><th>Keywords</th></tr></thead><tbody><tr><td>DDL</td><td>Define database structure</td><td>CREATE, ALTER, DROP</td></tr><tr><td>DML</td><td>Manipulate data</td><td>INSERT, UPDATE, DELETE</td></tr><tr><td>DQL</td><td>Query data</td><td>SELECT</td></tr><tr><td>DCL</td><td>Control permissions</td><td>GRANT, REVOKE</td></tr></tbody></table><hr><h1 id="二、基础语法"><a href="#二、基础语法" class="headerlink" title="II. Basic Syntax"></a>II. Basic Syntax</h1><h2 id="2-1-基本规则"><a href="#2-1-基本规则" class="headerlink" title="2.1 Basic Rules"></a>2.1 Basic Rules</h2><ul><li>SQL statements end with a semicolon <code>;</code> (optional in some databases)</li><li>Keywords are case-insensitive; by convention, keywords are uppercase and table&#x2F;column names are lowercase</li><li>Strings and dates are enclosed in single quotes <code>&#39; &#39;</code></li><li>Comments: <code>-- single-line comment</code>, <code>/* multi-line comment */</code></li></ul><h2 id="2-2-书写规范"><a href="#2-2-书写规范" class="headerlink" title="2.2 Writing Conventions"></a>2.2 Writing Conventions</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment">-- Recommended style</span></span><br><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">    id,</span><br><span class="line">    name,</span><br><span class="line">    email</span><br><span class="line"><span class="keyword">FROM</span></span><br><span class="line">    users <span class="comment">/* whether identifiers take double quotes depends on the database */</span></span><br><span class="line"><span class="keyword">WHERE</span></span><br><span class="line">    status <span class="operator">=</span> <span class="string">&#x27;active&#x27;</span></span><br><span class="line"><span class="keyword">ORDER</span> <span class="keyword">BY</span></span><br><span class="line">    create_time <span class="keyword">DESC</span>;</span><br></pre></td></tr></table></figure><h2 id="2-3-常用运算符"><a href="#2-3-常用运算符" class="headerlink" title="2.3 Common Operators"></a>2.3 Common Operators</h2><h3 id="算术运算符"><a href="#算术运算符" class="headerlink" title="Arithmetic Operators"></a>Arithmetic Operators</h3><table><thead><tr><th>Operator</th><th>Description</th></tr></thead><tbody><tr><td><code>+</code></td><td>Addition</td></tr><tr><td><code>-</code></td><td>Subtraction</td></tr><tr><td><code>*</code></td><td>Multiplication</td></tr><tr><td><code>/</code></td><td>Division</td></tr><tr><td><code>%</code> or <code>MOD()</code></td><td>Remainder</td></tr></tbody></table><h3 id="比较运算符"><a href="#比较运算符" class="headerlink" title="Comparison Operators"></a>Comparison Operators</h3><table><thead><tr><th>Operator</th><th>Description</th></tr></thead><tbody><tr><td><code>=</code></td><td>Equal to</td></tr><tr><td><code>&lt;&gt;</code> or <code>!=</code></td><td>Not equal to</td></tr><tr><td><code>&gt;</code></td><td>Greater than</td></tr><tr><td><code>&lt;</code></td><td>Less than</td></tr><tr><td><code>&gt;=</code></td><td>Greater than or equal to</td></tr><tr><td><code>&lt;=</code></td><td>Less than or equal to</td></tr></tbody></table><h3 id="逻辑运算符"><a href="#逻辑运算符" class="headerlink" title="Logical Operators"></a>Logical Operators</h3><table><thead><tr><th>Operator</th><th>Description</th></tr></thead><tbody><tr><td><code>AND</code></td><td>Logical AND (higher precedence than OR; parentheses can be used for grouping, as in C++)</td></tr>
<tr><td><code>OR</code></td><td>Logical OR</td></tr><tr><td><code>NOT</code></td><td>Logical NOT</td></tr></tbody></table><h2 id="2-4-常用命令（MySQL）"><a href="#2-4-常用命令（MySQL）" class="headerlink" title="2.4 Common Commands (MySQL)"></a>2.4 Common Commands (MySQL)</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- List all databases</span></span><br><span class="line"><span class="keyword">SHOW</span> DATABASES;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- List all tables in the current database</span></span><br><span class="line"><span class="keyword">SHOW</span> TABLES;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Show the table structure</span></span><br><span class="line"><span class="keyword">DESC</span> table_name;</span><br><span class="line"><span class="comment">-- or</span></span><br><span class="line"><span class="keyword">DESCRIBE</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Show the CREATE TABLE statement</span></span><br><span class="line"><span class="keyword">SHOW</span> <span class="keyword">CREATE TABLE</span> table_name;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Show full column information for a table</span></span><br><span class="line"><span class="keyword">SHOW</span> <span class="keyword">FULL</span> COLUMNS <span class="keyword">FROM</span> table_name;</span><br></pre></td></tr></table></figure><h2 id="2-5-⚠️注意事项"><a href="#2-5-⚠️注意事项" class="headerlink" 
title="2.5 ⚠️ Notes"></a>2.5 ⚠️ Notes</h2><ol><li>The clauses of a query must appear in a fixed order.</li><li><img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-12%2018.51.16.png" alt="Screenshot 2026-04-12 18.51.16"></li></ol><hr><h1 id="三、DDL-数据定义"><a href="#三、DDL-数据定义" class="headerlink" title="III. DDL Data Definition"></a>III. DDL Data Definition</h1><h2 id="2-1-创建数据库"><a href="#2-1-创建数据库" class="headerlink" title="2.1 Creating a Database"></a>2.1 Creating a Database</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> DATABASE database_name;</span><br><span class="line">USE database_name;</span><br></pre></td></tr></table></figure><h2 id="2-2-创建表"><a href="#2-2-创建表" class="headerlink" title="2.2 Creating a Table"></a>2.2 Creating a Table</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE TABLE</span> table_name (</span><br><span class="line">    column<span class="number">1</span> data_type [constraints],</span><br><span class="line">    column<span class="number">2</span> data_type [constraints],</span><br><span class="line">    ...</span><br><span class="line">);</span><br></pre></td></tr></table></figure><p><strong>Common data types</strong>:</p><ul><li>Integers: <code>INT</code>, <code>BIGINT</code></li><li>Decimals: <code>DECIMAL(m,n)</code>, <code>FLOAT</code>, <code>DOUBLE</code></li><li>Strings: <code>VARCHAR(n)</code>, <code>CHAR(n)</code>, <code>TEXT</code></li><li>Dates and times: <code>DATE</code>, <code>DATETIME</code>, <code>TIMESTAMP</code></li></ul><h2 id="2-3-约束"><a href="#2-3-约束" class="headerlink" title="2.3 Constraints"></a>2.3 Constraints</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE TABLE</span> users (</span><br><span class="line">    id <span class="type">INT</span> <span class="keyword">PRIMARY KEY</span> AUTO_INCREMENT,</span><br><span class="line">    name <span class="type">VARCHAR</span>(<span class="number">50</span>) <span class="keyword">NOT NULL</span>,</span><br><span class="line">    email <span class="type">VARCHAR</span>(<span class="number">100</span>) <span class="keyword">UNIQUE</span>,</span><br><span class="line">    age <span class="type">INT</span> <span class="keyword">DEFAULT</span> <span class="number">18</span>,</span><br><span class="line">    dept_id <span class="type">INT</span>,</span><br><span class="line">    <span class="keyword">FOREIGN KEY</span> (dept_id) <span class="keyword">REFERENCES</span> departments(id)</span><br><span class="line">);</span><br></pre></td></tr></table></figure><p><strong>常用约束</strong>：</p><ul><li><code>PRIMARY KEY</code>：主键，唯一标识</li><li><code>NOT NULL</code>：非空</li><li><code>UNIQUE</code>：唯一</li><li><code>DEFAULT</code>：默认值</li><li><code>FOREIGN KEY</code>：外键约束</li><li><code>AUTO_INCREMENT</code>：自增（MySQL）</li></ul><h2 id="2-4-修改表结构"><a href="#2-4-修改表结构" class="headerlink" title="2.4 修改表结构"></a>2.4 修改表结构</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 添加字段</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> 表名 <span class="keyword">ADD</span> 字段名 数据类型;</span><br><span class="line"></span><br><span 
class="comment">-- 修改字段</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> 表名 MODIFY 字段名 新数据类型;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 删除字段</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> 表名 <span class="keyword">DROP</span> <span class="keyword">COLUMN</span> 字段名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 重命名表</span></span><br><span class="line"><span class="keyword">ALTER TABLE</span> 表名 RENAME <span class="keyword">TO</span> 新表名;</span><br></pre></td></tr></table></figure><h2 id="2-5-删除表"><a href="#2-5-删除表" class="headerlink" title="2.5 删除表"></a>2.5 删除表</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">DROP</span> <span class="keyword">TABLE</span> 表名;           <span class="comment">-- 删除整张表（结构和数据）</span></span><br><span class="line"><span class="keyword">TRUNCATE</span> <span class="keyword">TABLE</span> 表名;        <span class="comment">-- 清空表数据（保留结构）</span></span><br></pre></td></tr></table></figure><hr><h1 id="三、DML-数据操作"><a href="#三、DML-数据操作" class="headerlink" title="三、DML 数据操作"></a>三、DML 数据操作</h1><h2 id="3-1-插入数据"><a href="#3-1-插入数据" class="headerlink" title="3.1 插入数据"></a>3.1 插入数据</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 插入单条</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> 表名 (字段<span 
class="number">1</span>, 字段<span class="number">2</span>) <span class="keyword">VALUES</span> (值<span class="number">1</span>, 值<span class="number">2</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 插入多条</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> 表名 (字段<span class="number">1</span>, 字段<span class="number">2</span>) <span class="keyword">VALUES</span></span><br><span class="line">(值<span class="number">1</span>, 值<span class="number">2</span>),</span><br><span class="line">(值<span class="number">3</span>, 值<span class="number">4</span>),</span><br><span class="line">(值<span class="number">5</span>, 值<span class="number">6</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 从其他表导入</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> 表名 <span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 其他表 <span class="keyword">WHERE</span> 条件;</span><br></pre></td></tr></table></figure><h2 id="3-2-更新数据"><a href="#3-2-更新数据" class="headerlink" title="3.2 更新数据"></a>3.2 更新数据</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">UPDATE</span> 表名</span><br><span class="line"><span class="keyword">SET</span> 字段<span class="number">1</span> <span class="operator">=</span> 新值<span class="number">1</span>, 字段<span class="number">2</span> <span class="operator">=</span> 新值<span class="number">2</span></span><br><span class="line"><span class="keyword">WHERE</span> 条件;</span><br></pre></td></tr></table></figure><h2 id="3-3-删除数据"><a href="#3-3-删除数据" class="headerlink" title="3.3 删除数据"></a>3.3 删除数据</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="keyword">DELETE</span> <span class="keyword">FROM</span> 表名 <span class="keyword">WHERE</span> 条件;</span><br></pre></td></tr></table></figure><hr><h1 id="四、DQL-数据查询（核心）"><a href="#四、DQL-数据查询（核心）" class="headerlink" title="四、DQL 数据查询（核心）"></a>四、DQL 数据查询（核心）</h1><h2 id="4-1-基本查询"><a href="#4-1-基本查询" class="headerlink" title="4.1 基本查询"></a>4.1 基本查询</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 查询所有字段</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 查询指定字段</span></span><br><span class="line"><span class="keyword">SELECT</span> 字段<span class="number">1</span>, 字段<span class="number">2</span> <span class="keyword">FROM</span> 表名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 去重</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="keyword">DISTINCT</span> 字段 <span class="keyword">FROM</span> 表名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 别名</span></span><br><span class="line"><span class="keyword">SELECT</span> 字段 <span class="keyword">AS</span> 别名 <span class="keyword">FROM</span> 表名;</span><br></pre></td></tr></table></figure><h2 id="4-2-条件查询-WHERE"><a href="#4-2-条件查询-WHERE" class="headerlink" title="4.2 条件查询 WHERE"></a>4.2 条件查询 WHERE</h2><figure class="highlight sql"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 <span class="keyword">WHERE</span> 条件;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 比较运算符</span></span><br><span class="line"><span class="keyword">WHERE</span> age <span class="operator">&gt;</span> <span class="number">18</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="operator">=</span> <span class="string">&#x27;张三&#x27;</span></span><br><span class="line"><span class="keyword">WHERE</span> age <span class="operator">&gt;=</span> <span class="number">18</span> <span class="keyword">AND</span> age <span class="operator">&lt;=</span> <span class="number">30</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- 范围</span></span><br><span class="line"><span class="keyword">WHERE</span> age <span class="keyword">BETWEEN</span> <span class="number">18</span> <span class="keyword">AND</span> <span class="number">30</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- 枚举</span></span><br><span class="line"><span class="keyword">WHERE</span> status <span class="keyword">IN</span> (<span 
class="string">&#x27;active&#x27;</span>, <span class="string">&#x27;pending&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 模糊匹配</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="keyword">LIKE</span> <span class="string">&#x27;张%&#x27;</span>      <span class="comment">-- 张开头</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="keyword">LIKE</span> <span class="string">&#x27;%三%&#x27;</span>      <span class="comment">-- 包含三</span></span><br><span class="line"><span class="keyword">WHERE</span> name <span class="keyword">LIKE</span> <span class="string">&#x27;张_&#x27;</span>       <span class="comment">-- 张开头，2个字</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- 空值</span></span><br><span class="line"><span class="keyword">WHERE</span> email <span class="keyword">IS</span> <span class="keyword">NULL</span></span><br><span class="line"><span class="keyword">WHERE</span> email <span class="keyword">IS</span> <span class="keyword">NOT NULL</span></span><br></pre></td></tr></table></figure><h2 id="4-3-排序-ORDER-BY"><a href="#4-3-排序-ORDER-BY" class="headerlink" title="4.3 排序 ORDER BY"></a>4.3 排序 ORDER BY</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 <span class="keyword">ORDER</span> <span class="keyword">BY</span> 字段<span class="number">1</span> <span class="keyword">ASC</span>, 字段<span class="number">2</span> <span class="keyword">DESC</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- ASC：升序（默认）</span></span><br><span class="line"><span class="comment">-- 
DESC：降序</span></span><br></pre></td></tr></table></figure><h2 id="4-4-限制-LIMIT"><a href="#4-4-限制-LIMIT" class="headerlink" title="4.4 限制 LIMIT"></a>4.4 限制 LIMIT</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- MySQL</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 LIMIT <span class="number">10</span>;</span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 LIMIT <span class="number">5</span>, <span class="number">10</span>;  <span class="comment">-- 跳过前5条，从第6条开始取10条</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- SQL Server</span></span><br><span class="line"><span class="keyword">SELECT</span> TOP <span class="number">10</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Oracle</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 <span class="keyword">WHERE</span> ROWNUM <span class="operator">&lt;=</span> <span class="number">10</span>;</span><br></pre></td></tr></table></figure><h2 id="4-5-聚合函数"><a href="#4-5-聚合函数" class="headerlink" title="4.5 聚合函数"></a>4.5 聚合函数</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">    <span class="built_in">COUNT</span>(<span class="operator">*</span>)          <span class="keyword">AS</span> 总记录数,</span><br><span class="line">    <span class="built_in">COUNT</span>(字段)       <span class="keyword">AS</span> 非空数量,</span><br><span class="line">    <span class="built_in">SUM</span>(字段)         <span class="keyword">AS</span> 求和,</span><br><span class="line">    <span class="built_in">AVG</span>(字段)         <span class="keyword">AS</span> 平均值,</span><br><span class="line">    <span class="built_in">MAX</span>(字段)         <span class="keyword">AS</span> 最大值,</span><br><span class="line">    <span class="built_in">MIN</span>(字段)         <span class="keyword">AS</span> 最小值</span><br><span class="line"><span class="keyword">FROM</span> 表名;</span><br></pre></td></tr></table></figure><h2 id="4-6-分组-GROUP-BY"><a href="#4-6-分组-GROUP-BY" class="headerlink" title="4.6 分组 GROUP BY"></a>4.6 分组 GROUP BY</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> 字段, 聚合函数</span><br><span class="line"><span class="keyword">FROM</span> 表名</span><br><span class="line"><span class="keyword">GROUP</span> <span class="keyword">BY</span> 字段</span><br><span class="line"><span class="keyword">HAVING</span> 聚合条件;</span><br></pre></td></tr></table></figure><p><strong>注意</strong>：<code>WHERE</code> 在分组前过滤，<code>HAVING</code> 在分组后过滤。</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 示例：统计每个部门的平均工资</span></span><br><span class="line"><span class="keyword">SELECT</span> dept_id, <span class="built_in">AVG</span>(salary) <span class="keyword">AS</span> avg_salary</span><br><span class="line"><span class="keyword">FROM</span> employees</span><br><span class="line"><span class="keyword">GROUP</span> <span class="keyword">BY</span> dept_id</span><br><span class="line"><span class="keyword">HAVING</span> <span class="built_in">AVG</span>(salary) <span class="operator">&gt;</span> <span class="number">5000</span>;</span><br></pre></td></tr></table></figure><h2 id="4-7-多表查询"><a href="#4-7-多表查询" class="headerlink" title="4.7 多表查询"></a>4.7 多表查询</h2><h3 id="连接（JOIN）"><a href="#连接（JOIN）" class="headerlink" title="连接（JOIN）"></a>连接（JOIN）</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 内连接：只保留匹配的行</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> 表<span class="number">1</span></span><br><span class="line"><span class="keyword">INNER</span> <span class="keyword">JOIN</span> 表<span class="number">2</span> <span class="keyword">ON</span> 表<span 
class="number">1.</span>字段 <span class="operator">=</span> 表<span class="number">2.</span>字段;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 左连接：保留左表全部，右表无匹配为 NULL</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> 表<span class="number">1</span></span><br><span class="line"><span class="keyword">LEFT</span> <span class="keyword">JOIN</span> 表<span class="number">2</span> <span class="keyword">ON</span> 表<span class="number">1.</span>字段 <span class="operator">=</span> 表<span class="number">2.</span>字段;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 右连接：保留右表全部</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> 表<span class="number">1</span></span><br><span class="line"><span class="keyword">RIGHT</span> <span class="keyword">JOIN</span> 表<span class="number">2</span> <span class="keyword">ON</span> 表<span class="number">1.</span>字段 <span class="operator">=</span> 表<span class="number">2.</span>字段;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 全连接（MySQL 不支持，用 UNION 模拟）</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表<span class="number">1</span> <span class="keyword">LEFT</span> <span class="keyword">JOIN</span> 表<span class="number">2</span> <span class="keyword">ON</span> ...</span><br><span class="line"><span class="keyword">UNION</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表<span class="number">1</span> <span class="keyword">RIGHT</span> <span class="keyword">JOIN</span> 表<span class="number">2</span> <span class="keyword">ON</span> 
...;</span><br></pre></td></tr></table></figure><h3 id="子查询"><a href="#子查询" class="headerlink" title="子查询"></a>子查询</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- WHERE 中的子查询</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 <span class="keyword">WHERE</span> 字段 <span class="operator">=</span> (<span class="keyword">SELECT</span> 字段 <span class="keyword">FROM</span> ...);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- IN 子查询</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 <span class="keyword">WHERE</span> 字段 <span class="keyword">IN</span> (<span class="keyword">SELECT</span> 字段 <span class="keyword">FROM</span> ...);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- EXISTS 子查询</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 表名 <span class="keyword">WHERE</span> <span class="keyword">EXISTS</span> (<span class="keyword">SELECT</span> <span class="number">1</span> <span class="keyword">FROM</span> ... 
<span class="keyword">WHERE</span> 条件);</span><br></pre></td></tr></table></figure><h2 id="4-8-UNION-联合查询"><a href="#4-8-UNION-联合查询" class="headerlink" title="4.8 UNION 联合查询"></a>4.8 UNION 联合查询</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> 字段 <span class="keyword">FROM</span> 表<span class="number">1</span></span><br><span class="line"><span class="keyword">UNION</span>                 <span class="comment">-- 去重合并</span></span><br><span class="line"><span class="keyword">SELECT</span> 字段 <span class="keyword">FROM</span> 表<span class="number">2</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">SELECT</span> 字段 <span class="keyword">FROM</span> 表<span class="number">1</span></span><br><span class="line"><span class="keyword">UNION</span> <span class="keyword">ALL</span>            <span class="comment">-- 保留重复</span></span><br><span class="line"><span class="keyword">SELECT</span> 字段 <span class="keyword">FROM</span> 表<span class="number">2</span>;</span><br></pre></td></tr></table></figure><hr><h1 id="五、常用函数"><a href="#五、常用函数" class="headerlink" title="五、常用函数"></a>五、常用函数</h1><h2 id="5-1-字符串函数"><a href="#5-1-字符串函数" class="headerlink" title="5.1 字符串函数"></a>5.1 字符串函数</h2><table><thead><tr><th>函数</th><th>作用</th></tr></thead><tbody><tr><td><code>CONCAT(s1, s2)</code></td><td>拼接字符串</td></tr><tr><td><code>LENGTH(s)</code></td><td>获取长度（MySQL 中按字节计；按字符计用 <code>CHAR_LENGTH(s)</code>）</td></tr><tr><td><code>UPPER(s)</code> &#x2F; <code>LOWER(s)</code></td><td>大小写转换</td></tr><tr><td><code>TRIM(s)</code></td><td>去除首尾空格</td></tr><tr><td><code>SUBSTRING(s, start, len)</code></td><td>截取子串</td></tr><tr><td><code>REPLACE(s, old, 
new)</code></td><td>替换</td></tr><tr><td><code>IFNULL(s, default)</code></td><td>NULL 替换</td></tr></tbody></table><h2 id="5-2-数值函数"><a href="#5-2-数值函数" class="headerlink" title="5.2 数值函数"></a>5.2 数值函数</h2><table><thead><tr><th>函数</th><th>作用</th></tr></thead><tbody><tr><td><code>ROUND(n, d)</code></td><td>四舍五入</td></tr><tr><td><code>CEIL(n)</code> &#x2F; <code>FLOOR(n)</code></td><td>向上&#x2F;下取整</td></tr><tr><td><code>ABS(n)</code></td><td>绝对值</td></tr><tr><td><code>MOD(n, m)</code></td><td>取余</td></tr><tr><td><code>RAND()</code></td><td>随机数</td></tr></tbody></table><h2 id="5-3-日期函数"><a href="#5-3-日期函数" class="headerlink" title="5.3 日期函数"></a>5.3 日期函数</h2><table><thead><tr><th>函数</th><th>作用</th></tr></thead><tbody><tr><td><code>NOW()</code> &#x2F; <code>SYSDATE()</code></td><td>当前日期时间</td></tr><tr><td><code>CURDATE()</code></td><td>当前日期</td></tr><tr><td><code>YEAR(d)</code> &#x2F; <code>MONTH(d)</code> &#x2F; <code>DAY(d)</code></td><td>提取年月日</td></tr><tr><td><code>DATE_FORMAT(d, format)</code></td><td>格式化日期</td></tr><tr><td><code>DATE_ADD(d, INTERVAL n unit)</code></td><td>日期加减</td></tr><tr><td><code>DATEDIFF(d1, d2)</code></td><td>日期差值</td></tr></tbody></table><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> DATE_FORMAT(create_time, <span class="string">&#x27;%Y-%m-%d %H:%i:%s&#x27;</span>) <span class="keyword">FROM</span> 表名;</span><br></pre></td></tr></table></figure><h2 id="5-4-条件判断"><a href="#5-4-条件判断" class="headerlink" title="5.4 条件判断"></a>5.4 条件判断</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- IF</span></span><br><span class="line"><span class="keyword">SELECT</span> IF(age <span class="operator">&gt;=</span> <span class="number">18</span>, <span class="string">&#x27;成年&#x27;</span>, <span class="string">&#x27;未成年&#x27;</span>) <span class="keyword">FROM</span> 表名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- CASE WHEN</span></span><br><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">    <span class="keyword">CASE</span></span><br><span class="line">        <span class="keyword">WHEN</span> score <span class="operator">&gt;=</span> <span class="number">90</span> <span class="keyword">THEN</span> <span class="string">&#x27;A&#x27;</span></span><br><span class="line">        <span class="keyword">WHEN</span> score <span class="operator">&gt;=</span> <span class="number">80</span> <span class="keyword">THEN</span> <span class="string">&#x27;B&#x27;</span></span><br><span class="line">        <span class="keyword">WHEN</span> score <span class="operator">&gt;=</span> <span class="number">60</span> <span class="keyword">THEN</span> <span class="string">&#x27;C&#x27;</span></span><br><span class="line">        <span class="keyword">ELSE</span> <span class="string">&#x27;D&#x27;</span></span><br><span class="line">    <span class="keyword">END</span> <span class="keyword">AS</span> grade</span><br><span class="line"><span class="keyword">FROM</span> 表名;</span><br></pre></td></tr></table></figure><hr><h1 id="六、索引"><a href="#六、索引" class="headerlink" title="六、索引"></a>六、索引</h1><h2 id="6-1-索引类型"><a href="#6-1-索引类型" class="headerlink" title="6.1 索引类型"></a>6.1 
索引类型</h2><table><thead><tr><th>类型</th><th>说明</th></tr></thead><tbody><tr><td>普通索引</td><td>允许重复值</td></tr><tr><td>唯一索引</td><td>值唯一</td></tr><tr><td>主键索引</td><td>主键自动创建，唯一且非空</td></tr><tr><td>全文索引</td><td>全文搜索（MyISAM；InnoDB 自 MySQL 5.6 起也支持）</td></tr><tr><td>组合索引</td><td>多列组合</td></tr></tbody></table><h2 id="6-2-创建索引"><a href="#6-2-创建索引" class="headerlink" title="6.2 创建索引"></a>6.2 创建索引</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 创建索引</span></span><br><span class="line"><span class="keyword">CREATE</span> INDEX 索引名 <span class="keyword">ON</span> 表名(字段);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 创建唯一索引</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">UNIQUE</span> INDEX 索引名 <span class="keyword">ON</span> 表名(字段);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 创建组合索引</span></span><br><span class="line"><span class="keyword">CREATE</span> INDEX 索引名 <span class="keyword">ON</span> 表名(字段<span class="number">1</span>, 字段<span class="number">2</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 查看索引</span></span><br><span class="line"><span class="keyword">SHOW</span> INDEX <span class="keyword">FROM</span> 表名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 删除索引</span></span><br><span class="line"><span class="keyword">DROP</span> INDEX 索引名 <span 
class="keyword">ON</span> 表名;</span><br></pre></td></tr></table></figure><h2 id="6-3-索引原则"><a href="#6-3-索引原则" class="headerlink" title="6.3 索引原则"></a>6.3 索引原则</h2><ul><li><strong>适合</strong>：数据量大、查询频繁、WHERE 条件常用</li><li><strong>避免</strong>：数据量小、更新频繁、区分度低的字段</li><li><strong>最左前缀</strong>：组合索引从左开始使用</li></ul><hr><h1 id="七、事务"><a href="#七、事务" class="headerlink" title="七、事务"></a>七、事务</h1><h2 id="7-1-事务特性（ACID）"><a href="#7-1-事务特性（ACID）" class="headerlink" title="7.1 事务特性（ACID）"></a>7.1 事务特性（ACID）</h2><ul><li><strong>Atomicity（原子性）</strong>：要么全部成功，要么全部失败</li><li><strong>Consistency（一致性）</strong>：事务前后数据状态一致</li><li><strong>Isolation（隔离性）</strong>：并发事务互不干扰</li><li><strong>Durability（持久性）</strong>：提交后数据永久保存</li></ul><h2 id="7-2-事务控制"><a href="#7-2-事务控制" class="headerlink" title="7.2 事务控制"></a>7.2 事务控制</h2><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 开启事务</span></span><br><span class="line"><span class="keyword">START</span> TRANSACTION;</span><br><span class="line"><span class="comment">-- 或</span></span><br><span class="line"><span class="keyword">BEGIN</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 提交</span></span><br><span class="line"><span class="keyword">COMMIT</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 回滚</span></span><br><span class="line"><span 
class="keyword">ROLLBACK</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 设置保存点</span></span><br><span class="line"><span class="keyword">SAVEPOINT</span> 保存点名称;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 回滚到保存点</span></span><br><span class="line"><span class="keyword">ROLLBACK</span> <span class="keyword">TO</span> 保存点名称;</span><br></pre></td></tr></table></figure><h2 id="7-3-隔离级别"><a href="#7-3-隔离级别" class="headerlink" title="7.3 隔离级别"></a>7.3 隔离级别</h2><table><thead><tr><th>隔离级别</th><th>脏读</th><th>不可重复读</th><th>幻读</th></tr></thead><tbody><tr><td>READ UNCOMMITTED</td><td>可能</td><td>可能</td><td>可能</td></tr><tr><td>READ COMMITTED</td><td>不可能</td><td>可能</td><td>可能</td></tr><tr><td>REPEATABLE READ（MySQL InnoDB 默认）</td><td>不可能</td><td>不可能</td><td>可能</td></tr><tr><td>SERIALIZABLE</td><td>不可能</td><td>不可能</td><td>不可能</td></tr></tbody></table><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SET</span> SESSION TRANSACTION ISOLATION LEVEL 级别;</span><br></pre></td></tr></table></figure><hr><h1 id="八、视图"><a href="#八、视图" class="headerlink" title="八、视图"></a>八、视图</h1><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 创建视图</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">VIEW</span> 视图名 <span class="keyword">AS</span></span><br><span class="line"><span class="keyword">SELECT</span> 字段<span class="number">1</span>, 字段<span 
class="number">2</span></span><br><span class="line"><span class="keyword">FROM</span> 表名</span><br><span class="line"><span class="keyword">WHERE</span> 条件;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 使用视图</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> 视图名;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 删除视图</span></span><br><span class="line"><span class="keyword">DROP</span> <span class="keyword">VIEW</span> 视图名;</span><br></pre></td></tr></table></figure><hr><h1 id="九、资料"><a href="#九、资料" class="headerlink" title="九、资料"></a>九、资料</h1><ol><li><p><a href="https://www.bilibili.com/video/BV1bQxMehETa/">B站快速入门课程</a></p></li><li><p><a href="https://sqlzoo.net/wiki/SQL_Tutorial">SQL学习编写网站</a></p></li></ol>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/SQL/">SQL</category>
      
      <category domain="https://eugenepage.com/tags/Database/">Database</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/04/12/20260412.SQLBasicsNotes/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Notes on Building Agentic Workflows (Andrew Ng)</title>
      <link>https://eugenepage.com/2026/04/08/20260408.AgenticAIDevByAndrew/</link>
      <guid>https://eugenepage.com/2026/04/08/20260408.AgenticAIDevByAndrew/</guid>
      <pubDate>Wed, 08 Apr 2026 09:45:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;I-Introduction-to-Agentic-Workflows&quot;&gt;&lt;a href=&quot;#I-Introduction-to-Agentic-Workflows&quot; class=&quot;headerlink&quot; title=&quot;I. Introduction to Age</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="I-Introduction-to-Agentic-Workflows"><a href="#I-Introduction-to-Agentic-Workflows" class="headerlink" title="I. Introduction to Agentic Workflows"></a>I. Introduction to Agentic Workflows</h1><h2 id="1-1-What-Is-Agentic-AI"><a href="#1-1-What-Is-Agentic-AI" class="headerlink" title="1.1 What Is Agentic AI"></a>1.1 What Is Agentic AI</h2><p><strong>Core definition</strong>: An LLM-based application that completes tasks through multi-step execution flows.</p><ul><li><strong>Non-agentic</strong>: Single prompt, one-shot completion (like writing an essay without a backspace key)</li><li><strong>Agentic</strong>: Multi-step flow — outline → decide if research is needed → execute searches → write draft → reflect and revise → final output</li></ul><blockquote><p>Analogy: Making a stir-fry dish with multiple AIs each handling a role (prep &#x2F; cooking &#x2F; plating &#x2F; review)</p></blockquote><img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-19%2016.14.14.png" alt="截屏2026-04-19 16.14.14" style="zoom:50%;" /><p>The blue labels mark different stages of AI evolution: from prompt engineer to content engineer to Hermes engineer (Agent).</p><h2 id="1-2-Levels-of-Autonomy"><a href="#1-2-Levels-of-Autonomy" class="headerlink" title="1.2 Levels of Autonomy"></a>1.2 Levels of Autonomy</h2><p>Agentic is an <strong>adjective</strong>, not a noun — this sidesteps the debate over “what really counts as an Agent.”</p><table><thead><tr><th>Low Autonomy</th><th>High Autonomy</th></tr></thead><tbody><tr><td>All steps predefined, tool calls hardcoded</td><td>Agent dynamically decides the step sequence</td></tr><tr><td>AI only generates text</td><td>Agent can create new tools on its own</td></tr><tr><td>“Obedient but brainless assistant”</td><td>“Smart, accountable intern”</td></tr></tbody></table><blockquote><p>The essence: not just “can do work,” but “knows how to think about the work, what tools to use, and 
can self-correct.”<br>A proper Agentic AI should be capable of <strong>autonomous planning</strong> (Planning — selecting tools on its own) and <strong>autonomous reflection</strong> (Reflection — with memory and self-review).</p></blockquote><h2 id="1-3-Three-Major-Benefits"><a href="#1-3-Three-Major-Benefits" class="headerlink" title="1.3 Three Major Benefits"></a>1.3 Three Major Benefits</h2><ol><li><strong>Significant performance gains</strong>: On HumanEval, an agentic GPT-3.5 can outperform a non-agentic GPT-4 (though both sound like ancient history now)</li><li><strong>Parallel speedup</strong>: Multiple LLM instances search and read simultaneously, then aggregate results</li><li><strong>Modular design</strong>: Freely swap components (search engines, different LLMs, different tools)</li></ol><h2 id="1-4-Task-Decomposition-Method"><a href="#1-4-Task-Decomposition-Method" class="headerlink" title="1.4 Task Decomposition Method"></a>1.4 Task Decomposition Method</h2><p>Core methodology:</p><ol><li>Observe human behavior → 2. Break into sub-steps → 3. Assess LLM&#x2F;tool feasibility → 4. 
Iterate and improve</li></ol><p><strong>Case study — progressive decomposition of article writing</strong>:</p><ul><li>1 step: Generate directly (shallow)</li><li>3 steps: Outline → Search → Write (better, but may feel disconnected)</li><li>5 steps: Outline → Search → Draft → Self-critique → Revise (best — simulates the human write-reflect-revise loop)</li><li>Core principle: <strong>“If a step produces poor results, break it down into smaller sub-steps.”</strong></li></ul><p>Building blocks: <strong>Model</strong> (LLM) + <strong>Tools</strong> (APIs, information retrieval, code execution)</p><h2 id="1-5-Agentic-AI-Evaluation-Evals-—-think-before-you-build"><a href="#1-5-Agentic-AI-Evaluation-Evals-—-think-before-you-build" class="headerlink" title="1.5 Agentic AI Evaluation (Evals — think before you build)"></a>1.5 Agentic AI Evaluation (Evals — think before you build)</h2><blockquote><p>Andrew Ng emphasizes: <strong>The biggest predictor distinguishing effective from ineffective practitioners is a rigorous development process built around evaluation and error analysis.</strong></p></blockquote><p>Methodology:</p><ol><li>Build first, observe outputs, then evaluate (don’t design all metrics upfront): just see how things work.</li><li>Identify low-quality outputs and define error types.</li><li>Build evaluation metrics to track errors: write scripts to automatically scan all agent outputs and count how often error outputs appear.</li><li>For subjective criteria, use <strong>LLM as Judge</strong> (1–5 scoring, but don’t let the model score directly without guidance).</li></ol><p>Two types of evaluation: <strong>End-to-end evaluation</strong> (overall output quality) &#x2F; <strong>Component-level evaluation</strong> (per-step quality)</p><h2 id="1-6-Overview-of-the-Four-Design-Patterns"><a href="#1-6-Overview-of-the-Four-Design-Patterns" class="headerlink" title="1.6 Overview of the Four Design Patterns"></a>1.6 Overview of the Four Design 
Patterns</h2><table><thead><tr><th align="center">Pattern</th><th align="center">Core Idea</th></tr></thead><tbody><tr><td align="center"><strong>Reflection</strong></td><td align="center">The LLM checks, evaluates, and improves its own output</td></tr><tr><td align="center"><strong>Tool Use</strong></td><td align="center">Gives LLMs the ability to call external tools&#x2F;functions</td></tr><tr><td align="center"><strong>Planning</strong></td><td align="center">The model autonomously decides the sequence of steps for complex tasks</td></tr><tr><td align="center"><strong>Multi-Agent</strong></td><td align="center">Multiple agents with different specializations collaborate</td></tr></tbody></table><h3 id="Real-World-Example-oh-my-claudecode-OMC"><a href="#Real-World-Example-oh-my-claudecode-OMC" class="headerlink" title="Real-World Example: oh-my-claudecode (OMC)"></a>Real-World Example: oh-my-claudecode (OMC)</h3><p>OMC (oh-my-claudecode) is a real Agent system that maps closely onto these four design patterns:</p><ul><li><strong>Reflection</strong>: <code>code-reviewer</code> &#x2F; <code>verifier</code> agent — after the executor writes code, an independent reviewer examines it. 
This is exactly what section 2.1 describes: “use different models, one to generate and one to review (1+1&gt;2).” OMC’s rule is <em>Never self-approve in the same active context</em> — writing code and reviewing code must always be two separate agents.</li><li><strong>Tool Use</strong>: MCP servers (Context7 for docs, filesystem for file ops, LSP for code analysis), the Skill system (<code>/commit</code>, <code>/plan</code>, and other callable skills), Bash tools — matching section 3.1’s “tools are functions, the model autonomously decides when to use them.” Claude independently judges which tool to use for each task.</li><li><strong>Planning</strong>: <code>planner</code> agent, <code>/plan</code> skill, plan mode — matching section 5.2’s “LLM outputs a structured plan before executing.” In plan mode, Claude first explores the codebase and designs an approach, only writing code after you approve.</li><li><strong>Multi-Agent</strong>: Team mode (<code>/team</code>) can launch multiple specialized agents simultaneously (explorer for search, executor for writing code, reviewer for review, designer for UI…), sharing a TaskList and collaborating via SendMessage.</li></ul><hr><h1 id="II-The-Reflection-Design-Pattern"><a href="#II-The-Reflection-Design-Pattern" class="headerlink" title="II. The Reflection Design Pattern"></a>II. The Reflection Design Pattern</h1><figure style="text-align: center; margin: 24px 0;">  <img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-09%2003.19.57.png" alt="Using the reflection pattern really does improve results" style="max-width: 66%; height: auto; display: inline-block; border-radius: 8px;" />  <figcaption style="margin-top: 8px; font-size: 14px; color: #666;">Reflection really does improve output quality. So maybe taking more time to reflect on yourself actually leads to growth. 
🐶</figcaption></figure><h2 id="2-1-Reflection-Improves-Task-Output"><a href="#2-1-Reflection-Improves-Task-Output" class="headerlink" title="2.1 Reflection Improves Task Output"></a>2.1 Reflection Improves Task Output</h2><p>Core analogy: Humans review and revise drafts — AI can do the same.</p><p><strong>Email writing example</strong>:</p><ul><li>AI generates V1 → Feed V1 back to the LLM with a reflection prompt → Generate improved V2</li></ul><p><strong>Progressive path for code writing</strong>:</p><ul><li>Basic: LLM writes code V1 → LLM reviews and generates V2</li><li>Advanced: Use different models — one to generate, one reasoning model to review (1+1&gt;2)</li><li>Ultimate: Combine external feedback — execute V1 in a sandbox, capture errors, feed back to LLM to generate V2</li></ul><blockquote><p>Key insight: Reflection is an engineering practice, not magic; <strong>external feedback is the critical differentiator</strong></p></blockquote><h2 id="2-2-Internal-—-Two-Golden-Rules-of-Self-Reflection"><a href="#2-2-Internal-—-Two-Golden-Rules-of-Self-Reflection" class="headerlink" title="2.2 Internal — Two Golden Rules of Self-Reflection"></a>2.2 Internal — Two Golden Rules of Self-Reflection</h2><ol><li><strong>Explicitly instruct the reflection action</strong>: Say “review,” “check,” “verify” (specific actions), not just “improve.” For <strong>objective tasks</strong>: build ground-truth datasets + automated code evaluation (e.g., checking SQL query correctness).</li><li><strong>Specify concrete criteria</strong>: List explicit evaluation dimensions (e.g., “professional tone,” “factually accurate”). 
For <strong>subjective tasks</strong>: use a <strong>Rubric</strong> to guide LLM scoring — avoid direct comparison (which introduces positional bias).</li></ol><p>The paper “Self-Refine” shows that reflection consistently improved performance across all 7 tasks and 4 models tested.</p><h2 id="2-3-External-—-Breaking-Through-the-Performance-Ceiling"><a href="#2-3-External-—-Breaking-Through-the-Performance-Ceiling" class="headerlink" title="2.3 External — Breaking Through the Performance Ceiling"></a>2.3 External — Breaking Through the Performance Ceiling</h2><p>Three performance curves:</p><ul><li>🔴 No reflection: rapid gains from prompt engineering, then <strong>plateaus</strong></li><li>🔵 With reflection: breaks through the plateau to a higher level</li><li>🟡 Reflection + external feedback: breaks through again to the highest level</li></ul><p>Three types of external feedback: regex matching (avoid mentioning competitors), search validation (fact-checking), and word count checks (format constraints).</p><blockquote><p>External feedback breaks the model out of its information silo, addressing inherent weaknesses (precise counting, fact verification) and enabling closed-loop optimization.</p></blockquote><figure style="text-align: center; margin: 24px 0;">  <img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-09%2003.33.25.png" alt="Screenshot 2026-04-09 03.33.25" style="max-width: 100%; height: auto; display: inline-block; border-radius: 8px;" />  <figcaption style="margin-top: 8px; font-size: 14px; color: #666;">Performance curve comparison: reflection vs. external feedback</figcaption></figure><hr><h1 id="III-Tool-Use"><a href="#III-Tool-Use" class="headerlink" title="III. Tool Use"></a>III. 
Tool Use</h1><h2 id="3-1-What-Tools-Actually-Are"><a href="#3-1-What-Tools-Actually-Are" class="headerlink" title="3.1 What Tools Actually Are"></a>3.1 What Tools Actually Are</h2><p><strong>Tools are functions</strong> — the model autonomously decides when to use them.</p><p>Key capability — <strong>conditional invocation</strong>: the model intelligently judges when a tool is needed.</p><ul><li>“How much caffeine is in green tea?” → Answer from internal knowledge</li><li>“What time is it now?” → Call the get_current_time tool</li></ul><p>Multi-tool chaining: a calendar assistant can chain check_calendar → make_appointment</p><h2 id="3-2-How-Does-an-LLM-Actually-“Call”-a-Function"><a href="#3-2-How-Does-an-LLM-Actually-“Call”-a-Function" class="headerlink" title="3.2 How Does an LLM Actually “Call” a Function?"></a>3.2 How Does an LLM Actually “Call” a Function?</h2><blockquote><p>In theory, an LLM never touches the execution layer of any function — from start to finish, it does exactly one thing: <strong>generate text</strong>.<br>What we call “calling a function” is fundamentally a <strong>text relay protocol</strong>.</p></blockquote><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">sequenceDiagram</span><br><span class="line">    participant U as User</span><br><span 
class="line">    participant S as System / Engineering Code</span><br><span class="line">    participant L as LLM</span><br><span class="line"></span><br><span class="line">    U-&gt;&gt;L: &quot;What time is it?&quot;</span><br><span class="line"></span><br><span class="line">    Note over S,L: Step 1: System prompt tells LLM which tools are available</span><br><span class="line">    Note right of L: System prompt includes:&lt;br/&gt;function name: get_current_time&lt;br/&gt;description: get the current time&lt;br/&gt;parameters: none</span><br><span class="line"></span><br><span class="line">    Note over S,L: Step 2: LLM outputs structured text (nothing is executed!)</span><br><span class="line">    L--&gt;&gt;S: tool_calls: [&#123;&lt;br/&gt;  name: &quot;get_current_time&quot;,&lt;br/&gt;  arguments: &#123;&#125;&lt;br/&gt;&#125;]</span><br><span class="line"></span><br><span class="line">    Note over S: Step 3: Outer code intercepts&lt;br/&gt;actually executes the function&lt;br/&gt;gets result &quot;15:20:45&quot;</span><br><span class="line"></span><br><span class="line">    Note over S,L: Step 4: Result is fed back into the LLM context</span><br><span class="line">    S-&gt;&gt;L: &#123; role: &quot;tool&quot;,&lt;br/&gt;  content: &quot;15:20:45&quot; &#125;</span><br><span class="line"></span><br><span class="line">    L--&gt;&gt;S: &quot;It is currently 3:20 PM.&quot;</span><br><span class="line">    S-&gt;&gt;U: &quot;It is currently 3:20 PM.&quot;</span><br></pre></td></tr></table></figure><p><strong>Three key insights</strong>:</p><ol><li><p><strong>The LLM is not “calling” a function</strong> — it is predicting “the next output should be a JSON snippet expressing that I want to use this tool.” This is a pattern learned from training on large amounts of code and API documentation.</p></li><li><p><strong>The outer engineering code is what actually executes the function</strong> — frameworks like Claude Code, the OpenAI SDK, and LangChain parse the 
LLM’s structured text output, execute the function, and feed the result back.</p></li><li><p><strong>The LLM’s core ability is “judgment”</strong> — it decides “should I answer this user’s question from internal knowledge, or should I call a tool?” The “green tea caffeine” (internal knowledge) vs. “what time is it” (call a tool) example in section 3.1 illustrates exactly this judgment.</p></li></ol><ul><li>Early on, hand-written prompt templates were needed to trigger tool calls (e.g., “FUNCTION: get_current_time()”). Modern LLMs natively understand tool calling without hardcoded trigger syntax.</li></ul><h2 id="3-3-Tool-Syntax-and-AI-SDK-Writing-Functions-for-LLMs-to-Call"><a href="#3-3-Tool-Syntax-and-AI-SDK-Writing-Functions-for-LLMs-to-Call" class="headerlink" title="3.3 Tool Syntax and AI SDK (Writing Functions for LLMs to Call)"></a>3.3 Tool Syntax and AI SDK (Writing Functions for LLMs to Call)</h2><p>The <a href="https://github.com/andrewyng/aisuite">aisuite</a> library (an AI SDK from Andrew Ng’s team) unifies access to multiple LLM providers and derives a tool definition from a plain Python function:</p><ul><li>Function name → Python function name</li><li>Description → docstring</li><li>Parameter types → automatically extracted</li></ul><h2 id="3-4-Models-That-Write-Their-Own-Code-LLM-Writes-and-Calls-Its-Own-Code"><a href="#3-4-Models-That-Write-Their-Own-Code-LLM-Writes-and-Calls-Its-Own-Code" class="headerlink" title="3.4 Models That Write Their Own Code (LLM Writes and Calls Its Own Code)"></a>3.4 Models That Write Their Own Code (LLM Writes and Calls Its Own Code)</h2><p>Traditional approach (predefined add&#x2F;subtract, etc.) vs. code execution (let the model write code itself):</p><ul><li>Model outputs code inside <code>&lt;execute_python&gt;</code> tags</li><li>Code is extracted and executed in a sandbox</li><li>Error messages are fed back to the model for reflection and revision</li></ul><blockquote><p>⚠️ Security warning: A real-world case — an agentic code executor ran <code>rm *.py</code> and deleted all project files. 
Sandbox environments (Docker, E2B) are mandatory.</p></blockquote><h2 id="3-5-MCP-Model-Context-Protocol"><a href="#3-5-MCP-Model-Context-Protocol" class="headerlink" title="3.5 MCP (Model Context Protocol)"></a>3.5 MCP (Model Context Protocol)</h2><p>MCP standardizes how LLMs access external tools and data sources, expanding the “tool surface” available to the LLM.</p><ul><li><strong>Problem</strong>: m applications × n tools &#x3D; m×n amount of work</li><li><strong>MCP solution</strong>: Build n shared MCP Servers, m applications connect → work reduced to m+n</li><li><strong>Client</strong>: The application that needs tools (Cursor, Claude Desktop, etc.)</li><li><strong>Server</strong>: The tool&#x2F;data provider (Slack, GitHub, PostgreSQL, etc.)</li></ul><hr><h1 id="IV-Practical-Tips-for-Building-Agentic-AI"><a href="#IV-Practical-Tips-for-Building-Agentic-AI" class="headerlink" title="IV. Practical Tips for Building Agentic AI"></a>IV. Practical Tips for Building Agentic AI</h1><h2 id="4-1-Evals-in-Practice"><a href="#4-1-Evals-in-Practice" class="headerlink" title="4.1 Evals in Practice"></a>4.1 Evals in Practice</h2><p>Evaluation approaches can be divided along two dimensions, forming a 2×2 matrix to guide evaluation design:</p><table><thead><tr><th>Evaluation Dimension</th><th>Objective Evals (check with code)</th><th>Subjective Evals (use LLM as judge)</th></tr></thead><tbody><tr><td>Each question has a unique correct answer (Per-Example Ground Truth)</td><td>Case 1: Invoice date extraction (each invoice has a different correct date — use code to check for a match)</td><td>Case 3: Counting gold-standard points (each topic has different key ideas — use LLM to check coverage)</td></tr><tr><td>Only unified rules &#x2F; format &#x2F; standards, no fixed answer (No Per-Example Ground Truth)</td><td>Case 2: Marketing copy length (all headlines must be 10 words — use code to check compliance)</td><td>Rubric Grading (e.g., evaluate charts against a unified 
clarity rubric)</td></tr></tbody></table><ol><li><strong>Start fast and rough</strong>: Don’t be intimidated into treating evals as a massive project or spend endless time on theory first. Start with 10–20 examples and get some quick metrics to complement manual observation.</li><li><strong>Iterate on your evals</strong>:<ol><li>As the system and evals mature, scale up the evaluation set.</li><li>If the system improves but eval scores don’t go up, it’s time to improve the evals themselves.</li></ol></li><li><strong>Take inspiration from expert behavior</strong>: For systems automating human tasks, observe where the system underperforms human experts and use that as the focus for the next phase of work.</li></ol><h2 id="4-2-Error-Analysis-and-Prioritization"><a href="#4-2-Error-Analysis-and-Prioritization" class="headerlink" title="4.2 Error Analysis and Prioritization"></a>4.2 Error Analysis and Prioritization</h2><p>As system complexity grows, <strong>intuition-driven debugging becomes unreliable</strong> — systematic analysis is required.</p><p>Core method:</p><ol><li><strong>Inspect traces and intermediate outputs</strong>: Each step’s output is called a “span”; combined, they form a “trace.”</li><li><strong>Focus on error cases and quantify them</strong>: Build a table tracking failure rates per component.<ul><li>Example: 45% unsatisfactory search results vs. 5% poor search keyword generation → prioritize improving the search component.</li><li>Make a habit of regularly reading the conversation log between the LLM and its tools.</li></ul></li></ol><h2 id="4-3-Component-Level-Evaluation"><a href="#4-3-Component-Level-Evaluation" class="headerlink" title="4.3 Component-Level Evaluation"></a>4.3 Component-Level Evaluation</h2><p>Analogous to unit tests vs. integration tests. 
Advantages: faster iteration, cleaner signal, teams can work in parallel.</p><p>Workflow: Error analysis pinpoints the problem component → Component-level eval for tuning → End-to-end eval to validate overall improvement</p><h2 id="4-4-Strategies-for-Problem-Solving"><a href="#4-4-Strategies-for-Problem-Solving" class="headerlink" title="4.4 Strategies for Problem-Solving"></a>4.4 Strategies for Problem-Solving</h2><p><strong>Non-LLM components</strong>: Tune parameters&#x2F;hyperparameters (number of search results, RAG similarity threshold), switch vendors.</p><p><strong>LLM components</strong> (by priority):</p><ol><li>Improve the prompt (clearer instructions, few-shot examples)</li><li>Try different LLMs (use evals to test multiple models)</li><li>Task decomposition (break complex steps into generate + reflect)</li><li>Fine-tuning (last resort — highest cost)</li></ol><h2 id="4-5-Latency-and-Cost-Optimization"><a href="#4-5-Latency-and-Cost-Optimization" class="headerlink" title="4.5 Latency and Cost Optimization"></a>4.5 Latency and Cost Optimization</h2><blockquote><p>For early-stage teams, <strong>output quality matters far more than latency and cost</strong>. Optimize quality first, then latency, then cost.<br>Apply the same modular thinking: first identify which component is slowest&#x2F;most expensive, then optimize it specifically (e.g., refine the prompt, switch models, reduce call frequency).</p></blockquote><h2 id="4-6-Four-Phases-of-the-Development-Process"><a href="#4-6-Four-Phases-of-the-Development-Process" class="headerlink" title="4.6 Four Phases of the Development Process"></a>4.6 Four Phases of the Development Process</h2><table><thead><tr><th>Phase</th><th>Focus</th><th>Analysis Activity</th></tr></thead><tbody><tr><td>1. Rapid Prototype</td><td>Get the end-to-end flow working (“build the garbage first”)</td><td>Manually inspect outputs, read traces</td></tr><tr><td>2. 
Initial Evaluation</td><td>Go beyond manual observation</td><td>Build 10–20 example end-to-end evals</td></tr><tr><td>3. Rigorous Analysis</td><td>Need precise improvement direction</td><td>Error analysis, quantify component failure rates</td></tr><tr><td>4. Efficient Tuning</td><td>System is mature, improve at component level</td><td>Component-level evals</td></tr></tbody></table><blockquote><p>Two main developer activities: <strong>building</strong> (writing code) and <strong>analyzing</strong> (deciding where to focus). Teams typically spend too much time building and too little time analyzing.</p></blockquote><hr><h1 id="V-Patterns-for-Highly-Autonomous-Agents"><a href="#V-Patterns-for-Highly-Autonomous-Agents" class="headerlink" title="V. Patterns for Highly Autonomous Agents"></a>V. Patterns for Highly Autonomous Agents</h1><h2 id="5-1-Planning-Workflows"><a href="#5-1-Planning-Workflows" class="headerlink" title="5.1 Planning Workflows"></a>5.1 Planning Workflows</h2><p>Planning pattern: The agent <strong>autonomously decides</strong> the tool-call sequence — no hardcoding.</p><p>Case study — customer service assistant (tools: get description, get price, check inventory, check orders, process purchase, process returns):</p><ul><li>User asks: “Do you have round sunglasses under $100?”</li><li>LLM plans: get description → check inventory → get price → output answer</li></ul><p><strong>Advantages</strong>: Rich capabilities, no need to pre-orchestrate. 
<strong>Risks</strong>: The LLM’s plan is unpredictable and may be unstable.</p><h2 id="5-2-Structured-Plans"><a href="#5-2-Structured-Plans" class="headerlink" title="5.2 Structured Plans"></a>5.2 Structured Plans</h2><p>Natural language plans are ambiguous → require the LLM to output a <strong>structured plan</strong> (JSON&#x2F;XML):</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">[</span></span><br><span class="line">  <span class="punctuation">&#123;</span><span class="attr">&quot;description&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Find round sunglasses&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;tool&quot;</span><span class="punctuation">:</span> <span class="string">&quot;get_item_descriptions&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;arguments&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="attr">&quot;query&quot;</span><span class="punctuation">:</span> <span class="string">&quot;round sunglasses&quot;</span><span class="punctuation">&#125;</span><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="punctuation">&#123;</span><span class="attr">&quot;description&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Check inventory&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;tool&quot;</span><span class="punctuation">:</span> <span class="string">&quot;check_inventory&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;arguments&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="attr">&quot;items&quot;</span><span 
class="punctuation">:</span> <span class="string">&quot;$step1_result&quot;</span><span class="punctuation">&#125;</span><span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">]</span></span><br></pre></td></tr></table></figure><h2 id="5-3-Code-As-Action"><a href="#5-3-Code-As-Action" class="headerlink" title="5.3 Code As Action"></a>5.3 Code As Action</h2><figure style="text-align: center; margin: 24px 0;">  <img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-12%2003.48.14.png" alt="截屏2026-04-12 03.48.14" style="max-width: 100%; height: auto; display: inline-block; border-radius: 8px;" />  <figcaption style="margin-top: 8px; font-size: 14px; color: #666;">Code As Action — HuggingFace smolagents</figcaption></figure><p>Drawing from the CodeAgent concept in HuggingFace smolagents — letting the LLM write code directly to express multi-step plans.</p><p>Advantages: Can call large libraries (hundreds of Pandas functions), highly expressive, research shows better performance than JSON&#x2F;text plans.<br>Risks: The code the LLM writes must be executed in a sandbox environment.</p><h2 id="5-4-Multi-Agent-Workflows"><a href="#5-4-Multi-Agent-Workflows" class="headerlink" title="5.4 Multi-Agent Workflows"></a>5.4 Multi-Agent Workflows</h2><p>Even when all agents use the same LLM, splitting complex tasks into independent roles is more effective.</p><blockquote><p>My personal intuition is that it works because different prompts&#x2F;contexts cause the model to focus on different things.</p></blockquote><p><strong>Advantages</strong>:</p><ol><li><strong>Task decomposition</strong>: Natural division of work by role&#x2F;skill</li><li><strong>Focus</strong>: Developers build one role at a time; simpler tasks &#x3D; better output</li><li><strong>Modular reuse</strong>: General-purpose agents (e.g., “chart designer”) can be reused across applications</li><li><strong>Bypasses context limits</strong>: Each 
agent handles its own context (critical when the model is limited to a 128k-token context window)</li><li><strong>Cost savings</strong>: Shorter contexts &#x3D; fewer tokens &#x3D; lower cost and faster response</li></ol><h2 id="5-5-Four-Communication-Patterns"><a href="#5-5-Four-Communication-Patterns" class="headerlink" title="5.5 Four Communication Patterns"></a>5.5 Four Communication Patterns</h2><table><thead><tr><th>Pattern</th><th>Structure</th><th>Pros</th><th>Cons</th><th>Best For</th></tr></thead><tbody><tr><td><strong>Linear</strong></td><td>Sequential, one-directional</td><td>Simple</td><td>Inflexible</td><td>Fixed-flow tasks</td></tr><tr><td><strong>Hierarchy (two-tier)</strong></td><td>Manager coordinates all subordinates</td><td>Easy to control</td><td>Manager bottleneck</td><td>Multi-task coordination</td></tr><tr><td><strong>Deep Hierarchy</strong></td><td>Sub-agents have their own sub-agents</td><td>Scalable, modular</td><td>Complex, hard to debug</td><td>Large systems</td></tr><tr><td><strong>All-to-All (Decentralized)</strong></td><td>All agents communicate freely</td><td>Creative</td><td>Unpredictable results</td><td>Exploratory &#x2F; generative tasks</td></tr></tbody></table><blockquote><p>Given current LLM capabilities, linear and hierarchical patterns are more practical (the deeper the hierarchy, the more information is lost in transmission).</p><p>Beyond these four patterns, there is also a <strong>conversation pattern</strong>, a simplified version of the decentralized model. 
In conversation mode, only two agents talk to each other at a time: one executes the task, the other reviews it, and together they hand off a result both are satisfied with.</p></blockquote><h2 id="5-6-Framework-Recommendations"><a href="#5-6-Framework-Recommendations" class="headerlink" title="5.6 Framework Recommendations"></a>5.6 Framework Recommendations</h2><ul><li><strong>LangChain</strong>: Linear workflows</li><li><strong>smolagents</strong>: Hierarchical workflows (author’s recommendation — simple, low abstraction, <code>@tool</code> decorator makes development easy)</li><li><strong>MetaGPT &#x2F; CamelAI</strong>: Decentralized workflows</li></ul><hr><h1 id="Summary-and-Personal-Reflections"><a href="#Summary-and-Personal-Reflections" class="headerlink" title="Summary and Personal Reflections"></a>Summary and Personal Reflections</h1><blockquote><p>I previously built a Skill at work that used Claude Code to call MCP tools to inspect UE assets and parse build error logs (though maybe that Skill doesn’t quite qualify as Agentic AI). Thinking about it through the lens of Agentic AI, it probably could have been built much more robustly. I was also genuinely surprised by the stability of the test projects in Andrew Ng’s course.</p></blockquote><ul><li><ol><li><strong>Planning</strong>: After a timer fires, Claude Code first analyzes the error log, decides which tool to call (check docs, check code, check historical error records, etc.), then executes the tools — or even writes its own database query code on the fly.</li></ol></li><li><ol start="2"><li><strong>Reflection</strong>: After receiving the tool result, Claude Code performs a self-review to determine if the result is useful. 
If it isn’t satisfied, it adjusts the query parameters and calls the tool again, repeating until it gets a result it’s happy with.</li></ol></li><li><ol start="3"><li><strong>Multi-Agent</strong>: You could design multiple specialized agents — one dedicated to log analysis, one to querying docs, one to querying code — collaborating through a shared context.</li></ol></li><li><ol start="4"><li><strong>Evals</strong>: You could design automated evaluation scripts to quantify Claude Code’s performance on resolving errors — metrics like success rate, average time to resolution, etc. (Each completed result could auto-upload a JSON record to a server for the admin to review weekly. Users could also be asked whether the AI’s suggested solution actually solved their problem, building up a solution database so the AI can reference past resolutions for similar future issues.)</li></ol></li></ul><p>One more thing: different models may suit different harnesses, since their capabilities vary (as mentioned in Hung-yi Lee’s course — for example, Sonnet has a kind of “context anxiety,” meaning its capabilities noticeably degrade when the context gets very long).</p><p>Finally, returning to the Tool Use design pattern: one key practical insight is that <strong>the design quality of MCP tools directly determines the capability ceiling of the agent</strong>. Drawing from my experience with the UE MCP project, here are six tool design principles I’ve distilled:</p><ol><li><strong>Description is the most important design decision — the description is the interface</strong>: The caller of an MCP tool is the LLM, not a human. The LLM relies on the <code>description</code> field to decide “when to call this and how to call it.” A good description includes: what it does, when to use it, boundary constraints, parameter semantics, and what the return value means. 
A poor description makes the tool dead weight.</li><li><strong>Granularity control — use subsystems as boundaries</strong>: Tools that are too fine-grained (e.g., splitting node creation by coordinate axis) lead to long call chains and compounding errors; tools that are too coarse (e.g., generating an entire character blueprint in one call) become black boxes where the LLM can’t localize failures. Use engine subsystems as boundaries — each tool does one complete thing.</li><li><strong>Return values must be “LLM-friendly”</strong>: Return values must include enough decision context — on success, indicate what operations are available next; on failure, provide <code>error_type</code>, <code>error_message</code>, and <code>suggestion</code> so the agent can self-correct rather than blindly retry.</li><li><strong>Separate reads from writes, make side effects transparent</strong>: When uncertain, LLMs tend to call tools that “look safe.” Read-only tools and write operations should be clearly categorized, with write operations explicitly annotating side effects in the description (e.g., “creates a new file on disk,” “irreversible operation”).</li><li><strong>Idempotent design — make the LLM willing to retry</strong>: The LLM may call the same tool repeatedly due to timeouts or misjudgments. Design tools to be safe to call multiple times (e.g., if the asset already exists, return the existing asset instead of throwing an error).</li><li><strong>Layered tool structure</strong>: High-level tools (task-oriented complete workflows, e.g., <code>setup_character_blueprint()</code>) reduce the number of calls needed; mid-level tools (single-step operations) preserve flexibility; low-level APIs should not be directly exposed to the LLM. 
Guide the LLM in the description to prefer high-level paths.</li></ol><p>The one-sentence summary: <strong>Good MCP tool design, at its core, means “designing tools so the LLM can use them correctly, as if it had read the documentation.”</strong> This perfectly aligns with the core idea of the Tool Use pattern in Andrew Ng’s course — the quality of your tools sets the upper bound on your agent’s autonomous decision-making.</p><h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ol><li><a href="https://www.bilibili.com/video/BV1DfrdByE2H">https://www.bilibili.com/video/BV1DfrdByE2H</a> — Course video (Bilibili)</li><li><a href="https://github.com/datawhalechina/agentic-ai/tree/main">Original GitHub notes</a>: Contains runnable code you can study as Jupyter Notebooks in VSCode — very convenient.</li><li>Another video mentioned later in the course: <a href="https://www.deeplearning.ai/short-courses/agentic-knowledge-graph-construction/">Agentic Knowledge Graph Construction</a></li><li>Original course <a href="https://www.deeplearning.ai/courses/agentic-ai/">link</a></li><li><a href="https://www.bilibili.com/video/BV1vrQbBBE6Z/">Hung-yi Lee’s course</a>: A good companion — feels a bit like an Agent course in its own right</li></ol>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/AgenticAI/">AgenticAI</category>
      
      <category domain="https://eugenepage.com/tags/Framework/">Framework</category>
      
      
      <comments>https://eugenepage.com/2026/04/08/20260408.AgenticAIDevByAndrew/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Notes on Building Agentic Workflows (Andrew Ng)</title>
      <link>https://eugenepage.com/zh-CN/2026/04/08/20260408.AgenticAIDevByAndrew/</link>
      <guid>https://eugenepage.com/zh-CN/2026/04/08/20260408.AgenticAIDevByAndrew/</guid>
      <pubDate>Wed, 08 Apr 2026 09:45:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;一、Agentic工作流简介&quot;&gt;&lt;a href=&quot;#一、Agentic工作流简介&quot; class=&quot;headerlink&quot; title=&quot;1. Introduction to Agentic Workflows&quot;&gt;&lt;/a&gt;1. Introduction to Agentic Workflows&lt;/h1&gt;&lt;h2 id=&quot;1-1-什么是Agentic-AI</description>
        
      
      
      
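      <content:encoded><![CDATA[<h1 id="一、Agentic工作流简介"><a href="#一、Agentic工作流简介" class="headerlink" title="1. Introduction to Agentic Workflows"></a>1. Introduction to Agentic Workflows</h1><h2 id="1-1-什么是Agentic-AI"><a href="#1-1-什么是Agentic-AI" class="headerlink" title="1.1 What Is Agentic AI"></a>1.1 What Is Agentic AI</h2><p><strong>Core definition</strong>: An agentic workflow is a process in which an LLM-based application completes a task by executing multiple steps.</p><ul><li><strong>Non-agentic</strong>: A single prompt, completed in one shot (like writing an essay without being allowed to use the backspace key)</li><li><strong>Agentic</strong>: A multi-step process: draft an outline → decide whether research is needed → run searches → write a first draft → reflect and revise → finalize</li></ul><blockquote><p>Analogy: cooking tomato scrambled eggs, with several AIs each handling one job (prep&#x2F;cooking&#x2F;plating&#x2F;review)</p></blockquote><img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-19%2016.14.14.png" alt="Screenshot 2026-04-19 16.14.14" style="zoom:50%;" /><p>The parts marked in blue are the successive stages of AI practice: from prompt engineering to context engineering to harness engineering (Agent).</p><h2 id="1-2-自主性等级"><a href="#1-2-自主性等级" class="headerlink" title="1.2 Levels of Autonomy"></a>1.2 Levels of Autonomy</h2><p>“Agentic” is an <strong>adjective</strong>, not a noun, which sidesteps the debate over “what counts as a real Agent.”</p><table><thead><tr><th>Low autonomy</th><th>High autonomy</th></tr></thead><tbody><tr><td>All steps predefined, tool calls hard-coded</td><td>Agent decides the order of steps dynamically</td></tr><tr><td>AI only generates text</td><td>Agent can create new tools on its own</td></tr><tr><td>“An obedient but brainless assistant”</td><td>“A smart, conscientious intern”</td></tr></tbody></table><blockquote><p>In essence: not just “able to do the work,” but “knows how to think about doing the work, which tools to use, and how to check and correct itself.”<br>A competent Agentic AI should be capable of <strong>autonomous planning</strong> (Planning: choosing its own tools) and <strong>autonomous reflection</strong> (Reflection: with memory and self-review).</p></blockquote><h2 id="1-3-三大益处"><a href="#1-3-三大益处" class="headerlink" title="1.3 Three Key Benefits"></a>1.3 Three Key Benefits</h2><ol><li><strong>Large performance gains</strong>: On HumanEval, GPT-3.5 in an agentic workflow can outperform non-agentic GPT-4 (BTW, these all sound like ancient models by now)</li><li><strong>Parallel speedup</strong>: Multiple LLM instances search&#x2F;read simultaneously, and their results are then aggregated into the output</li><li><strong>Modular design</strong>: Components can be swapped freely (search engines, different LLMs, different tools)</li></ol><h2 id="1-4-任务分解方法"><a href="#1-4-任务分解方法" class="headerlink" title="1.4 Task Decomposition"></a>1.4 Task Decomposition</h2><p>Core methodology:</p><ol><li>Observe how humans do the task → 2. Break it into sub-steps → 3. Assess whether an LLM&#x2F;tool can handle each one → 4. 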
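Iterate and refine</li></ol><p><strong>Example: progressively decomposing essay writing</strong>:</p><ul><li>1 step: generate directly (shallow)</li><li>3 steps: outline → search → write (better, but the parts may not connect)</li><li>5 steps: outline → search → first draft → self-critique → revise (best; mimics the human write-reflect-revise loop)</li><li>Core principle: <strong>“If a step isn’t working well, break it into smaller sub-steps.”</strong></li></ul><p>Building blocks: <strong>models</strong> (LLMs) + <strong>tools</strong> (APIs, information retrieval, code execution)</p><h2 id="1-5-Agentic-AI评估（Evals，在做之前对项目进行思考）"><a href="#1-5-Agentic-AI评估（Evals，在做之前对项目进行思考）" class="headerlink" title="1.5 Evaluating Agentic AI (Evals: Thinking the Project Through Before Building)"></a>1.5 Evaluating Agentic AI (Evals: Thinking the Project Through Before Building)</h2><blockquote><p>Andrew Ng emphasizes: <strong>the single biggest predictor of whether a practitioner is effective is a disciplined development process built around evals and error analysis.</strong></p></blockquote><p>Methodology:</p><ol><li>Build first, look at the outputs, then design evals (don’t try to design every evaluation criterion up front): get a feel for how the task actually behaves.</li><li>Identify low-quality outputs and define error types.</li><li>Set up metrics to track those errors: write a script that scans all of the agent’s outputs and counts how many times, and how often, the erroneous output appears.</li><li>Subjective criteria can use <strong>LLM as Judge</strong> (a 1-5 score, but don’t just ask the model for a score directly).</li></ol><p>Two kinds of evals: <strong>end-to-end evals</strong> (overall output quality) &#x2F; <strong>component-level evals</strong> (quality of a single step). In other words, one evaluates the whole and the other evaluates a part.</p><h2 id="1-6-四大设计模式总览"><a href="#1-6-四大设计模式总览" class="headerlink" title="1.6 Overview of the Four Design Patterns"></a>1.6 Overview of the Four Design Patterns</h2><table><thead><tr><th align="center">Pattern</th><th align="center">Core idea</th></tr></thead><tbody><tr><td align="center"><strong>Reflection</strong></td><td align="center">Agents check, evaluate, and improve their own output.</td></tr><tr><td align="center"><strong>Tool Use</strong></td><td align="center">Give the LLM the ability to call external tools&#x2F;functions</td></tr><tr><td align="center"><strong>Planning</strong></td><td align="center">The model decides the sequence of steps for a complex task on its own</td></tr><tr><td align="center"><strong>Multi-Agent</strong></td><td align="center">Multiple agents with different specialties collaborate</td></tr></tbody></table><h3 id="现实案例：oh-my-claudecode（OMC）"><a href="#现实案例：oh-my-claudecode（OMC）" class="headerlink" title="Real-World Example: oh-my-claudecode (OMC)"></a>Real-World Example: oh-my-claudecode (OMC)</h3><p>OMC (oh-my-claudecode) is a real agent system that maps neatly onto these four design patterns:</p><ul><li><strong>Reflection</strong>: the <code>code-reviewer</code> &#x2F; <code>verifier</code> agents. After the executor writes code, an independent reviewer inspects it. This is what section 2.1 of these notes calls “use different models, one to generate and one to review (1+1&gt;2).” OMC’s rule is <em>Never self-approve in the same active context</em>: writing code and reviewing code must be done by two separate 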
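agents.</li><li><strong>Tool Use</strong>: MCP servers (Context7 for looking up docs, filesystem for file operations, LSP for code analysis), the Skill system (invocable skills such as <code>/commit</code> and <code>/plan</code>), and the Bash tool. This corresponds to section 3.1: “tools are just functions, and the model decides when to use them.” Claude judges for itself which tool a task calls for.</li><li><strong>Planning</strong>: the <code>planner</code> agent, the <code>/plan</code> skill, and plan mode. This corresponds to section 5.2: “the LLM outputs a structured plan before executing.” In plan mode, Claude first explores the codebase and designs an approach, and only starts writing code after you approve.</li><li><strong>Multi-Agent</strong>: Team mode (<code>/team</code>) can launch several specialized agents at once (explorer for search, executor for writing code, reviewer for review, designer for UI…). They share a TaskList and coordinate by communicating through SendMessage.</li></ul><hr><h1 id="二、反思设计模式（Reflection）"><a href="#二、反思设计模式（Reflection）" class="headerlink" title="2. The Reflection Design Pattern"></a>2. The Reflection Design Pattern</h1><figure style="text-align: center; margin: 24px 0;">  <img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-09%2003.19.57.png" alt="The reflection pattern really does improve results" style="max-width: 66%; height: auto; display: inline-block; border-radius: 8px;" />  <figcaption style="margin-top: 8px; font-size: 14px; color: #666;">The reflection pattern really does improve results. So if people reflected on themselves a bit more often, they might genuinely improve too. 🐶</figcaption></figure><h2 id="2-1-反思提升任务输出"><a href="#2-1-反思提升任务输出" class="headerlink" title="2.1 Reflection Improves Task Output"></a>2.1 Reflection Improves Task Output</h2><p>Core analogy: humans review and revise their drafts, and AI can too.</p><p><strong>Email-writing example</strong>:</p><ul><li>AI generates V1 → feed V1 back to the LLM along with a reflection prompt → generate an improved V2</li></ul><p><strong>Code-writing progression</strong>:</p><ul><li>Basic: LLM writes code V1 → LLM reviews it and produces V2</li><li>Intermediate: use different models, one to generate and a reasoning model to review (1+1&gt;2)</li><li>Advanced: incorporate external feedback: run V1 in a sandbox, capture the errors, and feed them back to the LLM to produce V2</li></ul><blockquote><p>Key point: reflection is engineering practice, not magic; <strong>external feedback is the key differentiator</strong></p></blockquote><h2 id="2-2-Internal——自我-反思的两条黄金法则"><a href="#2-2-Internal——自我-反思的两条黄金法则" class="headerlink" title="2.2 Internal (Self-Reflection): Two Golden Rules"></a>2.2 Internal (Self-Reflection): Two Golden Rules</h2><ol><li><strong>Name the reflection action explicitly</strong>: say “review,” “check,” or “verify” (something specific), not just “improve.” <strong>For objective tasks</strong>: build a ground-truth dataset + 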
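automated code-based evaluation (e.g., SQL query correctness)</li><li><strong>Specify concrete criteria</strong>: list explicit evaluation dimensions (e.g., “professional tone,” “factual accuracy”). <strong>For subjective tasks</strong>: use a <strong>rubric</strong> to guide the LLM’s scoring, and avoid direct pairwise comparison (it suffers from position bias)</li></ol><p>The “Self-Refine” paper shows that reflection consistently improved performance across all 7 tasks and 4 models.</p><h2 id="2-3-external——外部-突破性能天花板"><a href="#2-3-external——外部-突破性能天花板" class="headerlink" title="2.3 External Feedback: Breaking Through the Performance Ceiling"></a>2.3 External Feedback: Breaking Through the Performance Ceiling</h2><p>Three performance curves:</p><ul><li>🔴 No reflection: prompt engineering gives quick gains, then <strong>plateaus</strong></li><li>🔵 With reflection: breaks past the plateau to a higher level</li><li>🟡 Reflection + external feedback: breaks through again to the highest level</li></ul><p>Three kinds of external feedback: regex matching (to avoid mentioning competitors) → search verification (fact-checking) → word-count checks (format constraints)</p><blockquote><p>External feedback breaks the model out of its information silo, compensates for inherent weaknesses (exact counting, fact verification), and enables closed-loop optimization.</p></blockquote><figure style="text-align: center; margin: 24px 0;">  <img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-09%2003.33.25.png" alt="Screenshot 2026-04-09 03.33.25" style="max-width: 100%; height: auto; display: inline-block; border-radius: 8px;" />  <figcaption style="margin-top: 8px; font-size: 14px; color: #666;">Performance curves for reflection and external feedback</figcaption></figure><hr><h1 id="三、工具使用（Tool-Use）"><a href="#三、工具使用（Tool-Use）" class="headerlink" title="3. Tool Use"></a>3. Tool Use</h1><h2 id="3-1-工具的本质"><a href="#3-1-工具的本质" class="headerlink" title="3.1 What a Tool Really Is"></a>3.1 What a Tool Really Is</h2><p><strong>Tools are just functions</strong>, and the model decides when to use them.</p><p>The key capability is <strong>conditional invocation</strong>: the model judges when a tool is actually needed.</p><ul><li>“How much caffeine is in green tea?” → answered from internal knowledge</li><li>“What time is it?” → calls the get_current_time tool</li></ul><p>Multi-tool cooperation: a calendar assistant can chain calls, check_calendar → make_appointment</p><h2 id="3-2-LLM-是如何”调用”函数的？"><a href="#3-2-LLM-是如何”调用”函数的？" class="headerlink" title="3.2 How Does an LLM “Call” a Function?"></a>3.2 How Does an LLM “Call” a Function?</h2><blockquote><p>In principle an LLM never touches the layer where functions execute. From start to finish it does exactly one thing: <strong>generate text</strong>.<br>So-called “function calling” is, at its core, a <strong>text relay protocol</strong>.</p></blockquote><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 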
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">sequenceDiagram</span><br><span class="line">    participant U as User</span><br><span class="line">    participant S as System/Application code</span><br><span class="line">    participant L as LLM</span><br><span class="line"></span><br><span class="line">    U-&gt;&gt;L: &quot;What time is it now?&quot;</span><br><span class="line"></span><br><span class="line">    Note over S,L: Step 1: The system prompt tells the LLM which tools are available</span><br><span class="line">    Note right of L: The system prompt contains:&lt;br/&gt;Function name: get_current_time&lt;br/&gt;Description: get the current time&lt;br/&gt;Parameters: none</span><br><span class="line"></span><br><span class="line">    Note over S,L: Step 2: The LLM outputs structured text (nothing has been executed!)</span><br><span class="line">    L--&gt;&gt;S: tool_calls: [&#123;&lt;br/&gt;  name: &quot;get_current_time&quot;,&lt;br/&gt;  arguments: &#123;&#125;&lt;br/&gt;&#125;]</span><br><span class="line"></span><br><span class="line">    Note over S: Step 3: Outer code intercepts&lt;br/&gt;actually executes the function&lt;br/&gt;gets the result &quot;15:20:45&quot;</span><br><span class="line"></span><br><span class="line">    Note over S,L: Step 4: Put the result back into the LLM context</span><br><span class="line">    S-&gt;&gt;L: &#123; role: &quot;tool&quot;,&lt;br/&gt;  content: &quot;15:20:45&quot; &#125;</span><br><span class="line"></span><br><span class="line">    L--&gt;&gt;S: &quot;It is 3:20 PM.&quot;</span><br><span class="line">    S-&gt;&gt;U: &quot;It is 3:20 PM.&quot;</span><br></pre></td></tr></table></figure><p><strong>Three key takeaways</strong>:</p><ol><li><p><strong>The LLM is not “calling” a function</strong>. It is only predicting that “the next thing to output is a piece of JSON saying I want to use this tool.” This pattern comes from large amounts of code and API 
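documentation seen in training.</p></li><li><p><strong>The outer application code is what actually executes the function</strong>. Frameworks such as Claude Code, the OpenAI SDK, and LangChain parse the structured text the LLM emits, execute the function, and feed the result back.</p></li><li><p><strong>The LLM’s core capability is “judgment”</strong>. It decides “should I answer this user question from internal knowledge, or should I call a tool?” The “green tea caffeine” (internal knowledge) vs. “what time is it” (tool call) example in section 3.1 is exactly this judgment.</p></li></ol><ul><li>Early on, tool use had to be triggered with hand-written prompt templates (e.g., “FUNCTION: get_current_time()”); modern LLMs understand tool calling natively, so no hard-coded trigger syntax is needed.</li></ul><h2 id="3-3-工具语法与AI-SDK（写好函数让LLM调用）"><a href="#3-3-工具语法与AI-SDK（写好函数让LLM调用）" class="headerlink" title="3.3 Tool Syntax and aisuite (Write the Function, Let the LLM Call It)"></a>3.3 Tool Syntax and aisuite (Write the Function, Let the LLM Call It)</h2><p><a href="https://github.com/andrewyng/aisuite">aisuite</a> (from Andrew Ng’s team) provides unified access to multiple LLM providers and builds each tool definition from a plain Python function:</p><ul><li>Function name → the Python function name</li><li>Description → the docstring</li><li>Parameter types → extracted automatically</li></ul><h2 id="3-4-模型自己写代码（LLM自己写自己调）"><a href="#3-4-模型自己写代码（LLM自己写自己调）" class="headerlink" title="3.4 The Model Writes Its Own Code (the LLM Writes It and Runs It)"></a>3.4 The Model Writes Its Own Code (the LLM Writes It and Runs It)</h2><p>Traditional approach (predefined add&#x2F;subtract and so on) vs. code execution (let the model write the code itself)</p><ul><li>The model outputs code inside <code>&lt;execute_python&gt;</code> tags</li><li>The code is extracted and executed in a sandbox</li><li>Error messages are fed back to the model so it can reflect and revise</li></ul><blockquote><p>⚠️ Security warning: in a real incident, an agentic code executor ran <code>rm *.py</code> and deleted all of the project’s .py files. Always use a sandboxed environment (Docker, E2B).</p></blockquote><h2 id="3-5-MCP（Model-Context-Protocol）"><a href="#3-5-MCP（Model-Context-Protocol）" class="headerlink" title="3.5 MCP (Model Context Protocol)"></a>3.5 MCP (Model Context Protocol)</h2><p>MCP standardizes how LLMs access external tools and data sources, which widens the range of tools an LLM can call.</p><ul><li><strong>Problem</strong>: m applications × n tools &#x3D; m×n integrations to build</li><li><strong>MCP’s answer</strong>: build n shared MCP Servers and have the m applications connect to them → the work drops to m+n</li><li><strong>Client</strong>: an application that needs tools (Cursor, Claude Desktop, etc.)</li><li><strong>Server</strong>: a tool&#x2F;data provider (Slack, GitHub, PostgreSQL, etc.)</li></ul><hr><h1 id="四、构建Agentic-AI的实用技巧"><a href="#四、构建Agentic-AI的实用技巧" class="headerlink" title="4. Practical Tips for Building Agentic AI"></a>4. Practical Tips for Building Agentic AI</h1><h2 id="4-1-评估（Evals）实战"><a href="#4-1-评估（Evals）实战" class="headerlink" title="4.1 Evals in Practice"></a>4.1 Evals in Practice</h2><p>Evals can be classified along two dimensions, forming a 2x2 matrix that guides eval design:</p><table><thead><tr><th>Dimension</th><th>Objective evals (checked with code)</th><th>Subjective evals 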
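(judged by an LLM)</th></tr></thead><tbody><tr><td>Per-example ground truth (each example has its own correct answer)</td><td>Case 1: invoice date extraction (each invoice has a different correct date; code checks for a match)</td><td>Case 3: counting gold-standard talking points (each topic has its own set of key points; an LLM checks whether they are adequately covered)</td></tr><tr><td>No per-example ground truth (only a uniform rule &#x2F; format &#x2F; standard, no fixed answer)</td><td>Case 2: marketing copy length (every headline must be 10 words; code checks against the uniform standard)</td><td>Rubric grading (e.g., scoring charts against a single clarity rubric)</td></tr></tbody></table><ol><li>Start with quick-and-dirty evals: don’t treat evals as a big project that you hesitate to start, and don’t spend ages on theoretical research first. Begin with 10-20 examples and quickly get some metrics to complement manual inspection.</li><li>Iterate on the evals:<ol><li>As the system and the evals mature, grow the eval set.</li><li>If the system has improved but the eval score hasn’t, it’s time to improve the evals themselves.</li></ol></li><li>Take inspiration from how professionals work: for systems that automate human tasks, look at where the system underperforms human experts and make that the focus of the next round of work.</li></ol><h2 id="4-2-错误分析与优先级"><a href="#4-2-错误分析与优先级" class="headerlink" title="4.2 Error Analysis and Prioritization"></a>4.2 Error Analysis and Prioritization</h2><p>As system complexity grows, <strong>debugging by intuition becomes unreliable</strong>; you need systematic analysis.</p><p>Core methods:</p><ol><li><strong>Inspect traces and intermediate outputs</strong>: the output of each step is called a “span,” and all of them together form a “trace”</li><li><strong>Focus on failing cases and quantify them</strong>: build a table that tracks each component’s failure rate<ul><li>Example: unsatisfactory search results 45% vs. search keyword generation 5% → fix the search component first</li><li>Make a habit of reading the exchanges between the LLM and its tools.</li></ul></li></ol><h2 id="4-3-组件级评估"><a href="#4-3-组件级评估" class="headerlink" title="4.3 Component-Level Evals"></a>4.3 Component-Level Evals</h2><p>Think unit tests vs. integration tests. Advantages: faster iteration, a clearer signal, and teams can work in parallel.</p><p>Workflow: error analysis pinpoints the problem component → tune it with component-level evals → confirm the overall improvement with end-to-end evals</p><h2 id="4-4-解决问题的策略"><a href="#4-4-解决问题的策略" class="headerlink" title="4.4 Problem-Solving Strategies"></a>4.4 Problem-Solving Strategies</h2><p><strong>Non-LLM components</strong>: tune parameters&#x2F;hyperparameters (number of search results, RAG similarity threshold), or switch providers</p><p><strong>LLM components</strong> (in priority order):</p><ol><li>Improve the prompt (explicit instructions, few-shot examples)</li><li>Try a different LLM (use evals to test several models)</li><li>Decompose the task (split a complex step into generation + reflection)</li><li>Fine-tune (last resort, highest cost)</li></ol><h2 id="4-5-延迟与成本优化"><a href="#4-5-延迟与成本优化" class="headerlink" title="4.5 Latency and Cost Optimization"></a>4.5 Latency and Cost Optimization</h2><blockquote><p>For early-stage teams, <strong>output quality matters far more than latency or cost</strong>. Optimize quality first, then latency, then cost.<br>The same component-based thinking applies: first find which component is slowest&#x2F;most expensive, then optimize that one specifically (e.g., change the prompt, switch models, reduce call frequency).</p></blockquote><h2 id="4-6-开发过程四阶段"><a href="#4-6-开发过程四阶段" class="headerlink" title="4.6 Four Stages of Development"></a>4.6 Four Stages of Development</h2><table><thead><tr><th>Stage</th><th>Focus</th><th>Analysis activity</th></tr></thead><tbody><tr><td>1. 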
Rapid prototype</td><td>Get it running end to end first (“build something crappy first”)</td><td>Manually inspect outputs, read traces</td></tr><tr><td>2. Initial evals</td><td>Move beyond manual inspection</td><td>Build an end-to-end eval with 10-20 examples</td></tr><tr><td>3. Rigorous analysis</td><td>Need a precise direction for improvement</td><td>Error analysis, quantify component failure rates</td></tr><tr><td>4. Efficient tuning</td><td>System is mature; improve component by component</td><td>Component-level evals</td></tr></tbody></table><blockquote><p>Developers have two main activities: <strong>building</strong> (writing code) and <strong>analyzing</strong> (deciding where to focus). Teams often spend too much time building and too little analyzing.</p></blockquote><hr><h1 id="五、高度自治智能体的模式"><a href="#五、高度自治智能体的模式" class="headerlink" title="5. Patterns for Highly Autonomous Agents"></a>5. Patterns for Highly Autonomous Agents</h1><h2 id="5-1-规划工作流（Planning）"><a href="#5-1-规划工作流（Planning）" class="headerlink" title="5.1 Planning Workflows"></a>5.1 Planning Workflows</h2><p>The planning pattern: the agent <strong>decides for itself</strong> the sequence of tool calls; nothing is hard-coded.</p><p>Example: a customer-service assistant (tools: look up descriptions, look up prices, check inventory, look up orders, process purchases, process returns):</p><ul><li>The user asks, “Do you have any round sunglasses under $100?”</li><li>The LLM plans: look up descriptions → check inventory → look up prices → output the answer</li></ul><p><strong>Advantages</strong>: rich capabilities with no pre-orchestration needed. <strong>Risks</strong>: the LLM’s plan can’t be predicted ahead of time, so behavior may be unstable.</p><h2 id="5-2-结构化计划"><a href="#5-2-结构化计划" class="headerlink" title="5.2 Structured Plans"></a>5.2 Structured Plans</h2><p>Natural-language plans are ambiguous → require the LLM to output a <strong>structured plan</strong> (JSON&#x2F;XML):</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">[</span></span><br><span class="line">  <span class="punctuation">&#123;</span><span class="attr">&quot;description&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Find round sunglasses&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;tool&quot;</span><span class="punctuation">:</span> <span class="string">&quot;get_item_descriptions&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;arguments&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="attr">&quot;query&quot;</span><span class="punctuation">:</span> <span class="string">&quot;round sunglasses&quot;</span><span 
class="punctuation">&#125;</span><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="punctuation">&#123;</span><span class="attr">&quot;description&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Check inventory&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;tool&quot;</span><span class="punctuation">:</span> <span class="string">&quot;check_inventory&quot;</span><span class="punctuation">,</span> <span class="attr">&quot;arguments&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="attr">&quot;items&quot;</span><span class="punctuation">:</span> <span class="string">&quot;$step1_result&quot;</span><span class="punctuation">&#125;</span><span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">]</span></span><br></pre></td></tr></table></figure><h2 id="5-3-代码即计划-Code-As-Action"><a href="#5-3-代码即计划-Code-As-Action" class="headerlink" title="5.3 Code As Action"></a>5.3 Code As Action</h2><figure style="text-align: center; margin: 24px 0;">  <img src="https://cdn.jsdelivr.net/gh/youdrew/MyPicGo/Images/%E6%88%AA%E5%B1%8F2026-04-12%2003.48.14.png" alt="Screenshot 2026-04-12 03.48.14" style="max-width: 100%; height: auto; display: inline-block; border-radius: 8px;" />  <figcaption style="margin-top: 8px; font-size: 14px; color: #666;">Code As Action (HuggingFace smolagents)</figcaption></figure><p>This draws on the CodeAgent concept in HuggingFace smolagents: let the LLM write code directly to express a multi-step plan.</p><p>Advantages: the code can call into large libraries (the hundreds of functions in Pandas), it is highly expressive, and research shows it outperforms JSON&#x2F;text plans.<br>Risks: LLM-written code must be executed in a sandbox.</p><h2 id="5-4-多智能体工作流"><a href="#5-4-多智能体工作流" class="headerlink" title="5.4 Multi-Agent Workflows"></a>5.4 Multi-Agent Workflows</h2><p>Even when all agents use the same LLM, splitting a complex task into independent roles tends to be more effective.</p><blockquote><p>My personal guess is that different prompts&#x2F;contexts cause the model to focus on different things.</p></blockquote><p><strong>Advantages</strong>:</p><ol><li><strong>Task decomposition</strong>: Natural division of work by role&#x2F;skill</li><li><strong>Focus</strong>: Developers build one role at a time; simpler tasks &#x3D; 
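better output</li><li><strong>Modular reuse</strong>: General-purpose agents (e.g., “chart designer”) can be reused across applications</li><li><strong>Bypasses context limits</strong>: Each agent handles only its own context (critical when working within a 128k context limit)</li><li><strong>Cost savings</strong>: Shorter contexts &#x3D; fewer tokens &#x3D; lower cost and faster responses</li></ol><h2 id="5-5-四种通信模式"><a href="#5-5-四种通信模式" class="headerlink" title="5.5 Four Communication Patterns"></a>5.5 Four Communication Patterns</h2><table><thead><tr><th>Pattern</th><th>Structure</th><th>Pros</th><th>Cons</th><th>Best For</th></tr></thead><tbody><tr><td><strong>Linear</strong></td><td>Sequential, one-directional</td><td>Simple</td><td>Inflexible</td><td>Fixed-flow tasks</td></tr><tr><td><strong>Hierarchy (two-tier)</strong></td><td>Manager coordinates all subordinates</td><td>Easy to control</td><td>Manager bottleneck</td><td>Multi-task coordination</td></tr><tr><td><strong>Deep Hierarchy</strong></td><td>Sub-agents have their own sub-agents</td><td>Scalable, modular</td><td>Complex, hard to debug</td><td>Large systems</td></tr><tr><td><strong>All-to-All (Decentralized)</strong></td><td>All agents communicate freely</td><td>Creative</td><td>Unpredictable results</td><td>Exploratory &#x2F; generative tasks</td></tr></tbody></table><blockquote><p>Given current LLM capabilities, linear and hierarchical patterns are more practical (the deeper the hierarchy, the more information is lost in transmission).</p><p>Beyond these four patterns, there is also a conversation pattern, roughly a downgraded version of the decentralized model. In conversation mode, only two agents talk to each other at a time: one executes the task, the other reviews it, and together they hand off a result both are satisfied with.</p></blockquote><h2 id="5-6-框架推荐"><a href="#5-6-框架推荐" class="headerlink" title="5.6 Framework Recommendations"></a>5.6 Framework Recommendations</h2><ul><li><strong>LangChain</strong>: Linear workflows</li><li><strong>smolagents</strong>: Hierarchical workflows (author’s recommendation: simple, low abstraction, and the @tool decorator makes development easy)</li><li><strong>MetaGPT &#x2F; CamelAI</strong>: Decentralized workflows</li></ul><hr><h1 id="总结与个人思考"><a href="#总结与个人思考" class="headerlink" title="Summary and Personal Reflections"></a>Summary and Personal Reflections</h1><blockquote><p>I previously built a Skill at work that used Claude Code to call MCP tools to inspect UE assets and parse build error logs (though maybe that Skill doesn’t quite qualify as Agentic AI). Approached with an Agentic AI mindset, it could probably have been built more robustly. I was also genuinely surprised by how stable the test projects in Andrew Ng’s course are.</p></blockquote><ul><li><ol><li>Planning: After a timer fires, Claude Code first analyzes the error log, decides which tool to call (check docs, check code, check historical error records, etc.), then executes the tools, or even writes its own database query code on the fly.</li></ol></li><li><ol start="2"><li>Reflection: After receiving a tool result, Claude Code reviews it to decide whether it is useful. If not, it adjusts the query parameters and calls the tool again, repeating until the result is satisfactory.</li></ol></li><li><ol start="3"><li>Multi-Agent: You could design several specialized agents, for example one dedicated to log analysis, one to querying docs, and one to querying code, collaborating through a shared context.</li></ol></li><li><ol 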
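start="4"><li>Evals: You could design automated evaluation scripts to quantify Claude Code’s performance on resolving errors, with metrics like success rate and average time to resolution. (Each completed run could auto-upload a JSON record to a server for the admin to review weekly. Users could also be asked whether the AI’s suggested solution actually solved their problem, building up a solution database so the AI can reference past resolutions when it meets a similar issue.)</li></ol></li></ul><p>One more thing: different models may suit different harnesses, since their capabilities vary (as mentioned in Hung-yi Lee’s course: Sonnet, for example, has a kind of “context anxiety,” so its capabilities noticeably degrade when there is a lot of content in context).</p><p>Finally, returning to the Tool Use design pattern, one key practical insight is that <strong>the design quality of MCP tools directly determines the capability ceiling of the agent</strong>. Drawing on my experience with the UE MCP project, here are six tool design principles I’ve distilled:</p><ol><li><strong>Description is the most important design decision: the description is the interface</strong>: The caller of an MCP tool is the LLM, not a human. The LLM relies on the <code>description</code> field to decide “when to call this and how to call it.” A good description covers: what the tool does, when to use it, boundary constraints, parameter semantics, and what the return value means. A poor description makes the tool dead weight.</li><li><strong>Granularity control: use subsystems as boundaries</strong>: Tools that are too fine-grained (e.g., splitting node creation by coordinate axis) lead to long call chains and compounding errors; tools that are too coarse (e.g., generating an entire character blueprint in one call) become black boxes where the LLM can’t localize failures. Use engine subsystems as boundaries, so that each tool does one complete thing.</li><li><strong>Return values must be “LLM-friendly”</strong>: Return values must include enough decision context: on success, indicate which operations are available next; on failure, provide <code>error_type</code>, <code>error_message</code>, and <code>suggestion</code> so the agent can self-correct rather than blindly retry.</li><li><strong>Separate reads from writes, make side effects transparent</strong>: When uncertain, LLMs tend to call tools that “look safe.” Read-only tools and write operations should be clearly categorized, with write operations explicitly annotating side effects in the description (e.g., “creates a new file on disk,” “irreversible operation”).</li><li><strong>Idempotent design, so the LLM can retry safely</strong>: The LLM may call the same tool repeatedly due to timeouts or misjudgments. Design tools to be safe to call multiple times (e.g., if the asset already exists, return the existing asset instead of raising an error).</li><li><strong>Layered tool structure</strong>: High-level tools (task-oriented complete workflows, e.g., <code>setup_character_blueprint()</code>) reduce the number of calls needed; mid-level tools (single-step operations) preserve flexibility; low-level APIs should not be exposed directly to the LLM. Use the description to steer the LLM toward the high-level path first.</li></ol><p>The one-sentence summary: <strong>Good MCP tool design, at its core, means “letting the LLM use the tool correctly, the way a developer who has read the documentation would.”</strong> This lines up with the core idea of the Tool Use pattern in Andrew Ng’s course: the quality of your tools sets the upper bound on your agent’s autonomous decision-making.</p><h1 id="参考资料"><a href="#参考资料" class="headerlink" title="References"></a>References</h1><ol><li><a href="https://www.bilibili.com/video/BV1DfrdByE2H">https://www.bilibili.com/video/BV1DfrdByE2H</a> Course video (Bilibili)</li><li><a href="https://github.com/datawhalechina/agentic-ai/tree/main">Original GitHub notes</a>: Contains runnable code you can study as Jupyter Notebooks in VSCode, which is very convenient.</li><li>👆Another video mentioned later in the course: <a href="https://www.deeplearning.ai/short-courses/agentic-knowledge-graph-construction/">Agentic Knowledge Graph Construction</a></li><li>Original course <a href="https://www.deeplearning.ai/courses/agentic-ai/">link</a></li><li><a 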
href="https://www.bilibili.com/video/BV1vrQbBBE6Z/">Hung-yi Lee’s course</a>: this one is a bit like an Agent course in its own right</li></ol>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/AgenticAI/">AgenticAI</category>
      
      <category domain="https://eugenepage.com/tags/Framework/">Framework</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/04/08/20260408.AgenticAIDevByAndrew/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Claude Code Tips &amp; Workflows</title>
      <link>https://eugenepage.com/2026/03/22/20260322.Claude%20Code%20Tips/</link>
      <guid>https://eugenepage.com/2026/03/22/20260322.Claude%20Code%20Tips/</guid>
      <pubDate>Sun, 22 Mar 2026 02:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Claude-Code-Tips-Workflows&quot;&gt;&lt;a href=&quot;#Claude-Code-Tips-Workflows&quot; class=&quot;headerlink&quot; title=&quot;Claude Code Tips &amp;amp; Workflows&quot;&gt;&lt;/a&gt;Cl</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="Claude-Code-Tips-Workflows"><a href="#Claude-Code-Tips-Workflows" class="headerlink" title="Claude Code Tips &amp; Workflows"></a>Claude Code Tips &amp; Workflows</h1><p>This post documents the stable workflows, prompting strategies, and plugin usage patterns I’ve settled on while working with Claude Code. The goal isn’t to list every feature — it’s to capture the practices that actually improve delivery speed.</p><hr><h2 id="What-Claude-Code-Is-Good-At"><a href="#What-Claude-Code-Is-Good-At" class="headerlink" title="What Claude Code Is Good At"></a>What Claude Code Is Good At</h2><p>From the perspective of game development, toolchain work, and content engineering, Claude Code is best suited for:</p><ul><li>Understanding existing project structure</li><li>Batch refactoring scripts or tooling code</li><li>Adding tests, documentation, or scaffolding</li><li>Integrating third-party SDKs, service APIs, or CLI tools</li><li>Mid-complexity multi-file changes</li></ul><p>Its strengths aren’t about “just throwing everything at it and hoping for the best.” It’s more about:</p><ul><li>Reliable long-context comprehension</li><li>Strong performance on code explanation, refactoring, and summarization</li><li>Workflows that follow an analyze-first, then execute pattern</li></ul><p>If your tasks are very granular and real-time — like line-level autocomplete while you type — an in-IDE completion tool is still more direct.</p><hr><h2 id="Claude-Configuration-File-Locations"><a href="#Claude-Configuration-File-Locations" class="headerlink" title="Claude Configuration File Locations"></a>Claude Configuration File Locations</h2><table><thead><tr><th></th><th><code>project-root/CLAUDE.md</code></th><th><code>~/.claude/settings.json</code></th><th><code>~/.claude.json</code></th></tr></thead><tbody><tr><td><strong>Purpose</strong></td><td>Instructions for Claude</td><td>User configuration</td><td>Internal system 
state</td></tr><tr><td><strong>Written by</strong></td><td>User (manually)</td><td>User (manually edited)</td><td>Claude Code (auto-maintained)</td></tr><tr><td><strong>Contents</strong></td><td>Project conventions, coding style, workflow agreements</td><td>API keys, model config, permission rules, MCP servers</td><td>Startup count, tool usage stats, per-project session records</td></tr><tr><td><strong>Analogy</strong></td><td><code>.editorconfig</code> &#x2F; <code>.eslintrc</code></td><td>VS Code <code>settings.json</code></td><td>VS Code <code>state.vscdb</code></td></tr><tr><td><strong>Version control</strong></td><td>Should be committed to git</td><td>Do not commit (contains keys)</td><td>Do not commit (contains user ID)</td></tr><tr><td><strong><code>env</code></strong></td><td>—</td><td>API endpoint, model name</td><td>—</td></tr><tr><td><strong><code>permissions</code></strong></td><td>—</td><td>Tool allowlist</td><td>—</td></tr><tr><td><strong><code>mcpServers</code></strong></td><td>—</td><td>Global MCP servers</td><td>—</td></tr><tr><td><strong><code>enabledPlugins</code></strong></td><td>—</td><td>Plugin toggles</td><td>—</td></tr><tr><td><strong><code>numStartups</code></strong></td><td>—</td><td>—</td><td>Total startup count</td></tr><tr><td><strong><code>projects</code></strong></td><td>—</td><td>—</td><td>Per-project MCP, trust state, session stats</td></tr><tr><td><strong><code>toolUsage</code></strong></td><td>—</td><td>—</td><td>Call count and last-used time per tool</td></tr><tr><td><strong><code>tipsHistory</code></strong></td><td>—</td><td>—</td><td>Tips that have already been shown</td></tr><tr><td>Notes</td><td>This file is always present in the context window.</td><td></td><td></td></tr></tbody></table><hr><h2 id="My-Stable-Workflow"><a href="#My-Stable-Workflow" class="headerlink" title="My Stable Workflow"></a>My Stable Workflow</h2><h3 id="Ask-It-to-Map-the-Impact-First"><a href="#Ask-It-to-Map-the-Impact-First" class="headerlink" 
title="Ask It to Map the Impact First"></a>Ask It to Map the Impact First</h3><p>Before touching shared modules, base libraries, or build scripts, I’ll ask:</p><figure class="highlight markdown"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">If I change this function signature, which callers will be affected?</span><br><span class="line">Please list them by file and flag the high-risk points.</span><br></pre></td></tr></table></figure><p>This step is genuinely valuable for avoiding unintended breakage.</p><h3 id="Ask-for-Verification-Steps-Along-the-Way"><a href="#Ask-for-Verification-Steps-Along-the-Way" class="headerlink" title="Ask for Verification Steps Along the Way"></a>Ask for Verification Steps Along the Way</h3><p>Beyond just the code changes, I’ll also request:</p><figure class="highlight markdown"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">After completing the changes, please add:</span><br><span class="line"><span class="bullet">1.</span> How to verify the result</span><br><span class="line"><span class="bullet">2.</span> Recommended commands to run</span><br><span class="line"><span class="bullet">3.</span> Edge cases that might fail</span><br></pre></td></tr></table></figure><p>This brings the final output much closer to a deliverable state.</p><h3 id="Be-Explicit-About-What-Not-to-Do"><a href="#Be-Explicit-About-What-Not-to-Do" class="headerlink" title="Be Explicit About What Not to Do"></a>Be Explicit About What Not to Do</h3><p>For example:</p><ul><li>Don’t change public APIs</li><li>Don’t introduce new dependencies</li><li>Don’t touch UI styles</li><li>Don’t modify the database schema</li></ul><p>Negative constraints like these are critical, especially in existing 
projects.</p><ul><li>Side note: Claude Sonnet tends to suffer from fairly serious context anxiety. It’s best used for short, quick tasks that can be completed in a single shot.</li></ul><hr><h2 id="Plugins"><a href="#Plugins" class="headerlink" title="Plugins"></a>Plugins</h2><p>The real value of plugins and external tool integrations isn’t “more features” — it’s turning Claude Code from something that only talks into something that can look things up, run things, and verify things.</p><p>My usual criteria come down to three questions:</p><ol><li>Does it reduce manual context switching?</li><li>Does it ground the analysis in the actual codebase?</li><li>Does it form a stable workflow, not just a one-off demo?</li></ol><h3 id="ralph-loop-Loop-Plugin"><a href="#ralph-loop-Loop-Plugin" class="headerlink" title="ralph-loop (Loop Plugin)"></a><a href="https://awesomeclaude.ai/ralph-wiggum">ralph-loop</a> (Loop Plugin)</h3><p>I think of ralph-loop as a “loop execution framework” or a “task closure enhancer” rather than a simple plugin.</p><p>It works well for:</p><ul><li>Tasks that require multiple rounds of analysis, execution, and checking</li><li>Situations where you want AI to iterate at a fixed cadence</li><li>Breaking large tasks into observable, reviewable rounds</li></ul><h4 id="Auto-Approving-Confirmations"><a href="#Auto-Approving-Confirmations" class="headerlink" title="Auto-Approving Confirmations"></a>Auto-Approving Confirmations</h4><p>If your goal is simply “let it edit files in the current workspace without asking every time,” reach for Claude Code’s built-in permission modes first, rather than expecting the plugin to bypass confirmations.</p><p>I think about this in two tiers:</p><h4 id="Option-1-Auto-approve-file-edits-only"><a href="#Option-1-Auto-approve-file-edits-only" class="headerlink" title="Option 1: Auto-approve file edits only"></a>Option 1: Auto-approve file edits only</h4><p>This is the safer approach.</p><p>Claude Code has an 
<code>acceptEdits</code> mode whose core effect is:</p><ul><li>File edits within the workspace can be batch-accepted</li><li>Commands, network requests, and other side-effectful operations will still prompt you</li></ul><p>If your main frustration is the “can I edit this file?” prompt, this is the tier to use.</p><p>You can check the current mode via <code>/config</code> or <code>/permissions</code>, or explicitly set it in a settings file:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;$schema&quot;</span><span class="punctuation">:</span> <span class="string">&quot;https://json.schemastore.org/claude-code-settings.json&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;permissions&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;defaultMode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;acceptEdits&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Common placement options:</p><ul><li>Global effect: <code>~/.claude/settings.json</code></li><li>Current repo only (local): <code>.claude/settings.local.json</code></li><li>Team-shared config: <code>.claude/settings.json</code></li></ul><p>If you only want to open this up on your own machine for the current project, <code>.claude/settings.local.json</code> is the best fit — it won’t be committed to the repo and won’t affect other projects.</p><h4 
id="Option-2-Skip-command-confirmations-too"><a href="#Option-2-Skip-command-confirmations-too" class="headerlink" title="Option 2: Skip command confirmations too"></a>Option 2: Skip command confirmations too</h4><p>If you want Claude Code to skip confirmations for command execution and tool calls as well, you can pass a startup flag:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">claude --dangerously-skip-permissions</span><br></pre></td></tr></table></figure><p>This mode skips all permission prompts outright. It’s appropriate when you fully trust the current repo and explicitly accept that it may automatically edit files, run commands, and call tools.</p><p>The risk level here is noticeably higher — I wouldn’t leave it on by default. Better suited for:</p><ul><li>Your own personal repos</li><li>Well-isolated test environments</li><li>Short-term high-frequency iteration tasks</li></ul><p>I’d avoid using it for:</p><ul><li>Repos of unknown origin</li><li>Projects with sensitive configs mixed in</li><li>Projects that can trigger deployments, publishing, or database operations</li></ul><h4 id="Practical-Recommendation"><a href="#Practical-Recommendation" class="headerlink" title="Practical Recommendation"></a>Practical Recommendation</h4><p>If your concern is “I don’t want ralph-loop to keep asking me during multi-round execution,” the priority order should be:</p><ol><li>Switch the default permission mode to <code>acceptEdits</code> first</li><li>Apply it only to the current repo via <code>.claude/settings.local.json</code></li><li>Only switch to <code>--dangerously-skip-permissions</code> if even command confirmations are breaking your flow</li></ol><p>In short:</p><ul><li>Just want to skip file edit confirmations → use <code>acceptEdits</code></li><li>Want to skip basically all confirmations → use <code>--dangerously-skip-permissions</code></li></ul><p>The 
first is for daily use; the second is for when you’ve consciously accepted the risks.</p><p>The bottom line: ralph-loop doesn’t make Claude Code “smarter” — it makes your task orchestration more stable.</p><h3 id="Oh-My-Claude-Code-OMC-—-Claude-Agent-Suite"><a href="#Oh-My-Claude-Code-OMC-—-Claude-Agent-Suite" class="headerlink" title="Oh My Claude Code (OMC — Claude Agent Suite)"></a>Oh My Claude Code (OMC — Claude Agent Suite)</h3><p>A zero-learning-curve tool for getting into Claude workflows. Provides a pre-configured collection of agents.</p><h4 id="Installation"><a href="#Installation" class="headerlink" title="Installation"></a>Installation</h4><p>I use npm. Run in your terminal:<br><code>npm i -g oh-my-claude-sisyphus@latest</code><br>Then <code>omc setup</code> to complete the setup.<br><a href="https://github.com/Yeachan-Heo/oh-my-claudecode/blob/main/README.md">Project site</a></p><h4 id="Five-Available-Modes"><a href="#Five-Available-Modes" class="headerlink" title="Five Available Modes"></a>Five Available Modes</h4><p>🔸 Autopilot (fully autonomous) — end-to-end automation from planning through implementation to testing<br>🔸 Ultra Pilot (parallel acceleration) — up to 5 parallel workers running simultaneously, ~5x throughput<br>🔸 Swarm (collaborative team) — multiple agents collaborate like a dev team, pulling from a shared task pool<br>🔸 Pipeline (sequential) — agents chained in a fixed order, ideal for workflows that must proceed step by step<br>🔸 EcoCode (economy mode) — maximizes token savings while maintaining efficiency</p><h4 id="Status-Bar-in-the-New-Claude-CLI"><a href="#Status-Bar-in-the-New-Claude-CLI" class="headerlink" title="Status Bar in the New Claude CLI"></a>Status Bar in the New Claude CLI</h4><p>[OMC#4.11.2] | session:33m | ctx:32% | T:17 A:1<br>● This is the OMC status bar. 
Here’s what each field means:</p><ul><li>[OMC#4.11.2] — OMC plugin version</li><li>session:33m — current session has been running for 33 minutes</li><li>ctx:32% — 32% of the context window has been used</li><li>T:17 — Tool calls: 17 tool invocations in this session</li><li>A:1 — Agents: 1 active sub-agent currently running</li></ul><h2 id="Common-Commands"><a href="#Common-Commands" class="headerlink" title="Common Commands"></a>Common Commands</h2><p>There’s a subtle distinction worth noting: <code>claude --resume</code> is a startup flag you run in the terminal, while <code>/xxx</code> commands are typed directly inside a Claude Code session.</p><ul><li><code>claude --resume</code>: Resume the most recent or a specific session. Great when the terminal closes unexpectedly or you want to pick up where you left off.</li><li><code>/rewind</code>: Roll the current session back to an earlier point. Very handy when a few rounds of thinking went sideways and you want to undo a stretch of work.</li><li><code>/resume</code> &#x2F; <code>/continue</code>: Resume an existing session from within a session, or open the session picker to continue a previous task.</li><li><code>/remote-control</code> &#x2F; <code>/rc</code>: Expose the current local session for remote control, so you can continue the task from <code>claude.ai/code</code> or a mobile device. Perfect for stepping away from your desk without losing context.</li><li><code>claude --dangerously-skip-permissions</code>: Skip all permission confirmations at startup — no more per-action approval prompts. Use this when you fully trust the repo and want Claude Code to autonomously edit files, run commands, and call tools. Highest risk mode; best kept for personal projects or isolated environments.</li><li><code>/compact</code>: Compress the current context into a summary and continue the same session. A lifesaver when the context is nearly full but you don’t want to re-explain the whole project. 
(Not supported by OMC.)</li><li><code>/doctor</code>: Check Claude Code’s installation, permissions, and configuration. When a command stops working, a plugin misbehaves, or the environment looks off, running this first usually saves a lot of debugging time.</li></ul><h2 id="Errors-Debugging"><a href="#Errors-Debugging" class="headerlink" title="Errors &amp; Debugging"></a>Errors &amp; Debugging</h2><h3 id="PowerShell-Garbled-Output-on-Chinese-Windows"><a href="#PowerShell-Garbled-Output-on-Chinese-Windows" class="headerlink" title="PowerShell Garbled Output on Chinese Windows"></a>PowerShell Garbled Output on Chinese Windows</h3><p><strong>Example error:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">● Bash(powershell.exe -NoProfile -Command &quot;...&quot;)</span><br><span class="line">  ⎿  Error: Exit code 1</span><br><span class="line">     ����λ�� ��:1 �ַ��: 169</span><br><span class="line">     + ... ref])|Out-Null;if(.Count -eq 0)&#123;&#x27;OK: no syntax errors&#x27;&#125;else&#123;|%&#123;extglo ...</span><br><span class="line">     +                                                                 ~</span><br><span class="line">     ������ʹ�ÿչ��Ԫ��</span><br><span class="line">         + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException</span><br><span class="line">         + FullyQualifiedErrorId : EmptyPipeElement</span><br></pre></td></tr></table></figure><p><strong>Root cause:</strong></p><p>Simplified Chinese Windows defaults to code page <strong>936 (GBK&#x2F;GB2312)</strong>, but Claude Code processes strings as <strong>UTF-8</strong> internally. 
When PowerShell outputs Chinese error messages encoded in GBK, Claude Code decodes them as UTF-8, which produces mojibake.</p><p>This is a <strong>very common issue</strong> affecting virtually all CJK (Chinese&#x2F;Japanese&#x2F;Korean) Windows users. English Windows is largely unaffected because its messages are plain ASCII, and ASCII bytes decode identically under code page 437 and UTF-8.</p><p><strong>Fix: Set the environment variable in Claude Code settings</strong></p><p>Add to your project-level <code>.claude/settings.local.json</code>:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;CHCP&quot;</span><span class="punctuation">:</span> <span class="string">&quot;65001&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Or for global effect, add it to <code>~/.claude/settings.json</code>.</p><p><strong>Why not change the system locale:</strong></p><p>Windows offers a “Beta: Use Unicode UTF-8 for worldwide language support” toggle that changes the system code page from 936 to 65001. It’s the most thorough fix, but it’s a system-wide change that can break older Chinese software with hardcoded GBK encoding (legacy installers, older archive tools, etc.). 
The <code>env CHCP=65001</code> approach only affects Claude Code’s shell process — the rest of the system remains untouched.</p><p>You’ll need to restart your Claude Code session for the change to take effect.</p><hr><h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><p><a href="https://www.bilibili.com/video/BV12hf2B9E29/">Learning video by a Bilibili creator</a></p>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/Assistant/">Assistant</category>
      
      <category domain="https://eugenepage.com/tags/VibeCoding/">VibeCoding</category>
      
      <category domain="https://eugenepage.com/tags/MCP/">MCP</category>
      
      <category domain="https://eugenepage.com/tags/Claude/">Claude</category>
      
      
      <comments>https://eugenepage.com/2026/03/22/20260322.Claude%20Code%20Tips/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Claude Code Usage Tips</title>
      <link>https://eugenepage.com/zh-CN/2026/03/22/20260322.Claude%20Code%20Tips/</link>
      <guid>https://eugenepage.com/zh-CN/2026/03/22/20260322.Claude%20Code%20Tips/</guid>
      <pubDate>Sun, 22 Mar 2026 02:00:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Claude-Code-使用技巧整理&quot;&gt;&lt;a href=&quot;#Claude-Code-使用技巧整理&quot; class=&quot;headerlink&quot; title=&quot;Claude Code Usage Tips&quot;&gt;&lt;/a&gt;Claude Code Usage Tips&lt;/h1&gt;&lt;p&gt;This post mainly</description>
        
      
      
      
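      <content:encoded><![CDATA[<h1 id="Claude-Code-使用技巧整理"><a href="#Claude-Code-使用技巧整理" class="headerlink" title="Claude Code Usage Tips"></a>Claude Code Usage Tips</h1><p>This post records the stable workflows, prompting habits, and plugin experience I have built up while using Claude Code. The goal is not to list every feature, but to capture the practices that actually improve delivery speed.</p><hr><h2 id="Claude-Code-适合做什么"><a href="#Claude-Code-适合做什么" class="headerlink" title="What Claude Code Is Good At"></a>What Claude Code Is Good At</h2><p>From the perspective of game development, toolchain work, and content engineering, Claude Code is best suited to the following kinds of tasks:</p><ul><li>Understanding an existing project structure</li><li>Batch refactoring of scripts or tooling code</li><li>Adding tests, documentation, or scaffolding</li><li>Integrating third-party SDKs, service APIs, or command-line tools</li><li>Medium-complexity multi-file changes</li></ul><p>Its strength is not “throw everything at it and hope for the best”, but rather:</p><ul><li>Fairly reliable long-context comprehension</li><li>Good performance on code explanation, refactoring, and summarization</li><li>A good fit for analyze-first, then-execute workflows</li></ul><p>If the task is very fragmented and real-time, such as needing line-level completion while you type, an in-IDE completion tool is still more direct.</p><hr><h2 id="Claude配置文件位置"><a href="#Claude配置文件位置" class="headerlink" title="Claude Configuration File Locations"></a>Claude Configuration File Locations</h2><table><thead><tr><th></th><th><code>project-root/CLAUDE.md</code></th><th><code>~/.claude/settings.json</code></th><th><code>~/.claude.json</code></th></tr></thead><tbody><tr><td><strong>Purpose</strong></td><td>Instructions for Claude</td><td>User configuration</td><td>Internal system state</td></tr><tr><td><strong>Written by</strong></td><td>User (written by hand)</td><td>User (edited by hand)</td><td>Claude Code (maintained automatically)</td></tr><tr><td><strong>Contents</strong></td><td>Project conventions, coding style, workflow agreements</td><td>API keys, model config, permission rules, MCP servers</td><td>Startup count, tool usage stats, per-project session records</td></tr><tr><td><strong>Analogy</strong></td><td><code>.editorconfig</code> &#x2F; <code>.eslintrc</code></td><td>VS Code <code>settings.json</code></td><td>VS Code <code>state.vscdb</code></td></tr><tr><td><strong>Version control</strong></td><td>Should be committed to git</td><td>Do not commit (contains keys)</td><td>Do not commit (contains user ID)</td></tr><tr><td><strong><code>env</code></strong></td><td>—</td><td>API endpoint, model name</td><td>—</td></tr><tr><td><strong><code>permissions</code></strong></td><td>—</td><td>Tool allowlist</td><td>—</td></tr><tr><td><strong><code>mcpServers</code></strong></td><td>—</td><td>Global MCP 
servers</td><td>—</td></tr><tr><td><strong><code>enabledPlugins</code></strong></td><td>—</td><td>Plugin toggles</td><td>—</td></tr><tr><td><strong><code>numStartups</code></strong></td><td>—</td><td>—</td><td>Total startup count</td></tr><tr><td><strong><code>projects</code></strong></td><td>—</td><td>—</td><td>Per-project MCP, trust state, session stats</td></tr><tr><td><strong><code>toolUsage</code></strong></td><td>—</td><td>—</td><td>Call count and last-used time per tool</td></tr><tr><td><strong><code>tipsHistory</code></strong></td><td>—</td><td>—</td><td>Tips that have already been shown</td></tr><tr><td>Notes</td><td>This file is always present in the context.</td><td></td><td></td></tr></tbody></table><hr><h2 id="我比较稳定的使用流程"><a href="#我比较稳定的使用流程" class="headerlink" title="My Stable Workflow"></a>My Stable Workflow</h2><h3 id="先要求它说明影响面"><a href="#先要求它说明影响面" class="headerlink" title="Ask It to Map the Impact First"></a>Ask It to Map the Impact First</h3><p>Before changing shared modules, base libraries, or build scripts, you can ask first:</p><figure class="highlight markdown"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">If I change this function signature, which callers will be affected?</span><br><span class="line">Please list them by file and point out the high-risk spots.</span><br></pre></td></tr></table></figure><p>This step is very valuable for avoiding accidental changes.</p><h3 id="让它顺手补验证步骤"><a href="#让它顺手补验证步骤" class="headerlink" title="Have It Add Verification Steps Along the Way"></a>Have It Add Verification Steps Along the Way</h3><p>Besides the code changes, I also ask for:</p><figure class="highlight markdown"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">After completing the changes, please add:</span><br><span class="line"><span class="bullet">1.</span> How to verify the result</span><br><span class="line"><span class="bullet">2.</span> Recommended commands to run</span><br><span class="line"><span class="bullet">3.</span> Edge cases that might fail</span><br></pre></td></tr></table></figure><p>This brings the final output closer to a deliverable state.</p><h3 id="把“不要做什么”写明"><a href="#把“不要做什么”写明" class="headerlink" title="Spell Out What Not to Do"></a>Spell Out What Not to Do</h3><p>For example:</p><ul><li>Do not change public APIs</li><li>Do not introduce new dependencies</li><li>Do not touch UI 
styles</li><li>Do not modify the database schema</li></ul><p>Negative constraints like these are very important, especially in existing projects.</p><ul><li>Also, Claude Sonnet tends to suffer from fairly serious context anxiety, so it is only suited to small features that can be finished quickly in a short session.</li></ul><hr><h2 id="ClaudeCode底层实现讨论"><a href="#ClaudeCode底层实现讨论" class="headerlink" title="Discussion of Claude Code Internals"></a>Discussion of Claude Code Internals</h2><h3 id="模型自带的搜索能力WebSearch如何实现的"><a href="#模型自带的搜索能力WebSearch如何实现的" class="headerlink" title="How the Built-in WebSearch Capability Is Implemented"></a>How the Built-in WebSearch Capability Is Implemented</h3><p>Claude Code ships with two separate web tools that are often conflated:</p><table><thead><tr><th>Tool</th><th>Role</th><th>Input</th><th>Output</th></tr></thead><tbody><tr><td><strong>WebSearch</strong></td><td>Search engine entry point</td><td>Keyword query</td><td>List of titles + links</td></tr><tr><td><strong>WebFetch</strong></td><td>Page content retrieval</td><td>A specific URL</td><td>Page body text</td></tr></tbody></table><p>The two work together: use WebSearch to find relevant links, then use WebFetch to read the actual content.</p><h4 id="底层调用链路"><a href="#底层调用链路" class="headerlink" title="Underlying Call Chain"></a>Underlying Call Chain</h4><p>There is an interesting implementation detail here, which the community found by reverse-engineering Claude Code traffic:</p><ol><li><strong>Triggered from the main conversation</strong>: when Claude decides a search is needed, the main session calls <code>WebSearch</code> with a query parameter</li><li><strong>A child conversation is spawned</strong>: the Anthropic backend starts a separate Claude Opus sub-session for this search, which calls Anthropic’s internal <code>web_search</code> server-side tool</li><li><strong>Results are passed back</strong>: once the sub-session finishes, the results return to the main conversation as the tool’s return value</li><li><strong>Possibly multiple rounds</strong>: the whole process may repeat several times within a single request (for example, search once, then decide from the results to search again)</li></ol><p>The intent of this design is to keep the main agent lightweight and to limit the injection surface, with the search logic running in an isolated sub-session.</p><h4 id="版本与收费说明"><a href="#版本与收费说明" class="headerlink" title="Versions and Billing"></a>Versions and Billing</h4><ul><li>The latest tool version at the time of writing is <code>web_search_20260209</code>, which supports Dynamic Filtering (before results actually enter Claude’s context, code filters them first and keeps only the useful parts)</li><li><strong>API users</strong>: WebSearch is a separately billed add-on, with an extra charge per search</li><li><strong>Max plan users</strong>: it is already included in the plan, is not charged separately, and can be used directly</li><li>This is also why some people report that “WebSearch shows a rate limit when running Claude Code with a Claude subscription account, but works fine with an API key”: the two go through different quota channels</li></ul><h4 id="和-MCP-搜索插件的区别"><a href="#和-MCP-搜索插件的区别" class="headerlink" title="Differences from MCP Search Plugins"></a>Differences from MCP Search Plugins</h4><p>If you are already on the Max plan, the built-in WebSearch + WebFetch is enough for everyday searching, and there is no need to install MCP search plugins such as Tavily or Brave. MCP search plugins are a better fit for API users who need higher search volume or custom search behavior.</p><h2 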
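id="Plugin-其他工具-工作流"><a href="#Plugin-其他工具-工作流" class="headerlink" title="Plugins &#x2F; Other Tools &#x2F; Workflows"></a>Plugins &#x2F; Other Tools &#x2F; Workflows</h2><p>The real value of plugins and external tool integrations is not “more features”, but turning Claude Code from something that can only talk into something that can look things up, run things, and verify things.</p><p>My criteria usually come down to three questions:</p><ol><li>Does it reduce manual context switching?</li><li>Does it ground the analysis in the real codebase?</li><li>Does it form a stable workflow, rather than an occasional demo?</li></ol><h3 id="ralph-loop-（循环插件）"><a href="#ralph-loop-（循环插件）" class="headerlink" title="ralph-loop (Loop Plugin)"></a><a href="https://awesomeclaude.ai/ralph-wiggum">ralph-loop</a> (Loop Plugin)</h3><p>I prefer to think of plugins like ralph-loop as a “loop execution framework” or a “task closure enhancer”.</p><p>It works well in the following situations:</p><ul><li>A task needs multiple rounds of analysis, execution, and checking</li><li>You want the AI to iterate repeatedly at a fixed cadence</li><li>You want to break a large task into small, observable rounds</li></ul><h4 id="自动化同意问题"><a href="#自动化同意问题" class="headerlink" title="Auto-Approving Confirmations"></a>Auto-Approving Confirmations</h4><p>If your goal is simply “let it edit files in the current workspace without prompting every time”, use Claude Code’s built-in permission modes first, rather than expecting the plugin itself to bypass confirmations.</p><p>I suggest thinking about this in two tiers:</p><h4 id="方案-1：只自动同意编辑文件"><a href="#方案-1：只自动同意编辑文件" class="headerlink" title="Option 1: Auto-approve file edits only"></a>Option 1: Auto-approve file edits only</h4><p>This is the safer approach.</p><p>Claude Code has an <code>acceptEdits</code> mode whose core effect is:</p><ul><li>File edits within the workspace can be batch-accepted</li><li>Running commands, network requests, and other operations with side effects will still prompt you</li></ul><p>If your main annoyance is the “may I edit this file?” prompt, this is the tier to use first.</p><p>You can check the current mode via <code>/config</code> or <code>/permissions</code>, or set it explicitly in a settings file:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;$schema&quot;</span><span class="punctuation">:</span> <span class="string">&quot;https://json.schemastore.org/claude-code-settings.json&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;permissions&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span 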
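class="attr">&quot;defaultMode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;acceptEdits&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Common placement options:</p><ul><li>Global effect: <code>~/.claude/settings.json</code></li><li>Current repo only (local): <code>.claude/settings.local.json</code></li><li>Team-shared config: <code>.claude/settings.json</code></li></ul><p>If you only want to open this up on your own machine for the current project, <code>.claude/settings.local.json</code> is the best fit. It will not be committed to the repo and will not affect other projects.</p><h4 id="方案-2：连命令确认也一起跳过"><a href="#方案-2：连命令确认也一起跳过" class="headerlink" title="Option 2: Skip command confirmations too"></a>Option 2: Skip command confirmations too</h4><p>If you want Claude Code to avoid asking for confirmation on command execution and tool calls as well, you can pass a startup flag:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">claude --dangerously-skip-permissions</span><br></pre></td></tr></table></figure><p>This mode effectively skips permission confirmations outright. It is appropriate when you fully trust the current repo and clearly understand that it may automatically edit files, run commands, and call tools.</p><p>The risk at this tier is noticeably higher, so I would not leave it on by default. It is better suited to:</p><ul><li>Your own personal repos</li><li>Well-isolated test environments</li><li>Temporary high-frequency iteration tasks</li></ul><p>I would not recommend using it directly for:</p><ul><li>Repos of unknown origin</li><li>Projects with sensitive configs mixed in</li><li>Projects that perform deployments, releases, or database operations</li></ul><h4 id="实战建议"><a href="#实战建议" class="headerlink" title="Practical Recommendation"></a>Practical Recommendation</h4><p>If all you want is “ralph-loop should stop asking whether it may edit files during multi-round execution”, the priority order should be:</p><ol><li>Switch the default permission mode to <code>acceptEdits</code> first</li><li>Apply it only to the current repo via <code>.claude/settings.local.json</code></li><li>Only switch to <code>--dangerously-skip-permissions</code> if even command confirmations are interrupting your flow</li></ol><p>In short:</p><ul><li>To skip only file edit confirmations, use <code>acceptEdits</code></li><li>To skip basically all confirmations, use <code>--dangerously-skip-permissions</code></li></ul><p>The first is for everyday use; the second is for when you have explicitly accepted the risks.</p><p>In short, ralph-loop does not make Claude Code “smarter”; it makes your task orchestration more stable.</p><h3 id="Oh-My-Claude-Code（OMC，Claude智能体集合）"><a href="#Oh-My-Claude-Code（OMC，Claude智能体集合）" class="headerlink" title="Oh My Claude Code (OMC, Claude Agent Suite)"></a>Oh My Claude 
Code (OMC, Claude Agent Suite)</h3><p>A tool whose main selling point is learning Claude workflows at zero cost. It provides a pre-configured collection of agents.</p><h4 id="安装方式"><a href="#安装方式" class="headerlink" title="Installation"></a>Installation</h4><p>I use the npm package manager. Run in your terminal:<br><code>npm i -g oh-my-claude-sisyphus@latest</code><br>Then run <code>omc setup</code> to complete the setup.<br><a href="https://github.com/Yeachan-Heo/oh-my-claudecode/blob/main/README.md">Project site</a></p><h4 id="提供了5种模式"><a href="#提供了5种模式" class="headerlink" title="Five Available Modes"></a>Five Available Modes</h4><p>🔸 Autopilot (fully autonomous): end-to-end automation from planning through implementation to testing<br>🔸 Ultra Pilot (parallel acceleration): up to 5 parallel workers running at once, for roughly 5x the speed<br>🔸 Swarm (collaborative team): multiple agents collaborate like a dev team, pulling work from a shared task pool<br>🔸 Pipeline (sequential): agents chained in a fixed order, suited to workflows that must proceed step by step<br>🔸 EcoCode (economy mode): maximizes token savings while maintaining efficiency</p><h4 id="新版本claude命令行下的标识"><a href="#新版本claude命令行下的标识" class="headerlink" title="Status Bar in the New Claude CLI"></a>Status Bar in the New Claude CLI</h4><p>[OMC#4.11.2] | session:33m | ctx:32% | T:17 A:1<br>● This is the OMC status bar. Here is what each field means:</p><ul><li>[OMC#4.11.2]: OMC plugin version</li><li>session:33m: the current session has been running for 33 minutes</li><li>ctx:32%: 32% of the context window has been used</li><li>T:17: Tool calls, the number of tool invocations made in this session (17)</li><li>A:1: Agents, the number of currently active sub-agents (1)</li></ul><p>While running commands, I ran into this error:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">● Ran 4 stop hooks (ctrl+o to expand)</span><br><span class="line">  ⎿  Stop hook error: Failed with non-blocking status code: /usr/bin/bash: line 1: node: command not found</span><br><span class="line">  ⎿  Stop hook error: Failed with non-blocking status code: /usr/bin/bash: line 1: node: command not found</span><br><span class="line">  ⎿  Stop hook error: Failed with non-blocking status code: /usr/bin/bash: line 1: node: command not found</span><br></pre></td></tr></table></figure><p>First, I recommend installing OMC globally so that it can be invoked from any command line, which works better than managing it as a plugin:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 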
class="line">2</span><br></pre></td><td class="code"><pre><span class="line">npm install -g oh-my-claude-sisyphus</span><br><span class="line">omc setup</span><br></pre></td></tr></table></figure><p>The error above may be caused by tmux not being installed:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">winget install psmux</span><br></pre></td></tr></table></figure><h3 id="claude-tap（API-流量追踪器）"><a href="#claude-tap（API-流量追踪器）" class="headerlink" title="claude-tap (API Traffic Tracer)"></a>claude-tap (API Traffic Tracer)</h3><p>A local proxy tool for <strong>intercepting and visualizing</strong> the real API traffic of coding agents such as Claude Code, Codex CLI, and Gemini CLI.</p><p>Core use: when debugging AI behavior, you can see directly what is happening underneath:</p><ul><li>View the full system prompt, conversation history, tool definitions, and tool call results</li><li>Diff two adjacent requests to pinpoint exactly which prompt or parameter changed</li><li>Each run produces a JSONL log plus a self-contained HTML viewer, convenient for archiving and sharing</li><li>All data stays local, no hosted dashboard is needed, and common auth headers are redacted automatically</li></ul><p><strong>Installation</strong> (requires Python 3.11+):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">uv tool install claude-tap</span><br><span class="line"><span class="comment"># or</span></span><br><span class="line">pip install claude-tap</span><br></pre></td></tr></table></figure><p><a href="https://github.com/liaohch3/claude-tap">GitHub project page</a><br>First run:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">claude-tap --tap-live</span><br></pre></td></tr></table></figure><p>At the time of writing the tool does not support Python 3.14, so I downgraded to 3.13 specifically to use it.</p><hr><h2 id="常用命令"><a href="#常用命令" class="headerlink" title="Common Commands"></a>Common Commands</h2><p>There is one small distinction here: <code>claude --resume</code> is a startup flag you run in the terminal, while <code>/xxx</code> commands are typed directly inside a Claude Code session.</p><ul><li><code>claude 
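--resume</code>: resumes the most recent session or a specified one. Useful when the terminal closed unexpectedly or you want to pick up from the previous context.</li><li><code>/rewind</code>: rolls the current session back to an earlier point. Very handy when the last few turns went off track and you want to undo a stretch of work.</li><li><code>/resume</code> &#x2F; <code>/continue</code>: resumes an existing session from inside a session, or opens the session picker so you can continue an earlier task.</li><li><code>/remote-control</code> &#x2F; <code>/rc</code>: opens the current local session to remote control, so you can take over the task from <code>claude.ai/code</code> or the mobile app. Well suited to stepping away from the computer without losing context.</li><li><code>claude --dangerously-skip-permissions</code>: skips permission prompts at launch, so it no longer asks for approval each time. Suited to cases where you fully trust the current repository and want Claude Code to edit files, run commands, and call tools autonomously without interruption, but it also carries the highest risk; best used only temporarily, in personal projects or isolated environments.</li><li><code>/compact</code>: compresses the current context into a summary and continues the same session. Saves effort when the context is nearly full but you do not want to re-explain the project background. (OMC does not support this.)</li><li><code>/doctor</code>: checks for Claude Code installation, permission, and configuration problems. When commands fail, plugins do not work, or the environment misbehaves, running it first usually saves a lot of troubleshooting time.</li></ul><h2 id="报错及Debug"><a href="#报错及Debug" class="headerlink" title="报错及Debug"></a>Errors and debugging</h2><h3 id="PreToolUse-PostToolUse-Hooks-报错"><a href="#PreToolUse-PostToolUse-Hooks-报错" class="headerlink" title="PreToolUse &#x2F; PostToolUse Hooks 报错"></a>PreToolUse &#x2F; PostToolUse hook errors</h3><p><strong>Example error:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Read 1 file (ctrl+o to expand)</span><br><span class="line">  ⎿  PreToolUse:Read hook error    ⎿  ECONNREFUSED</span><br><span class="line">  ⎿  PreToolUse:Read hook error    ⎿  ECONNREFUSED</span><br><span class="line">  ⎿  PostToolUse:Read hook error   ⎿  ECONNREFUSED</span><br><span class="line">  ⎿  PostToolUse:Read hook error   ⎿  ECONNREFUSED</span><br></pre></td></tr></table></figure><p><strong>Cause:</strong><br>Some companies monitor Claude usage through hooks. The service that receives the hooks also acts as the LLM gateway for Claude Code: <code>ANTHROPIC_BASE_URL</code> points to the same port. <code>ECONNREFUSED</code> means the connection to that port was refused, which usually indicates that nothing is listening there.</p>
<p><strong>Endpoints the “internal AI gateway tools” listen on:</strong></p><table><thead><tr><th>Endpoint</th><th>Triggered by</th><th>Purpose (inferred)</th></tr></thead><tbody><tr><td><code>/hook/claude</code></td><td><code>UserPromptSubmit</code> &#x2F; <code>Stop</code> &#x2F; <code>StopFailure</code> &#x2F; <code>Subagent*</code> &#x2F; <code>PostToolUseFailure</code></td><td>General event stream: feeds lifecycle events such as “user submitted a prompt”, “session ended”, and “sub-agent started or stopped” to the “internal AI gateway tools”</td></tr><tr><td><code>/hook/claude/pre-tool</code></td><td><code>PreToolUse</code></td><td>Interception before a tool call: here the “internal AI gateway tools” can see which tool you are about to call and with what arguments</td></tr><tr><td><code>/hook/claude/post-tool</code></td><td><code>PostToolUse</code></td><td>Receipt after a tool call: the “internal AI gateway tools” can see what the tool returned</td></tr></tbody></table><p><strong>What the “internal AI gateway tools” do with this data:</strong></p><ol><li><strong>Session observation &#x2F; recording</strong>: the “internal AI gateway tools” are a GUI and need to know in real time what Claude Code is doing in order to show the current session, which tools were called, and how many tokens were used.</li><li><strong>Multi-agent orchestration</strong>: judging from its extension code (<code>gateway-dispatch.ts</code>), the “internal AI gateway tools” run an internal daemon that can dispatch tasks across Codex &#x2F; Claude &#x2F; Gemini &#x2F; CodeMaker and manage background task cards. Hooks are its single entry point for observing these agents.</li><li><strong>Possible policy intervention</strong>: a <code>PreToolUse</code> hook in Claude Code is able to block &#x2F; rewrite a tool call (the hooks specification allows this), but I have seen no evidence of whether the “internal AI gateway tools” actually do so. Sending an empty <code>&#123;&#125;</code> to <code>pre-tool</code> returns <code>422</code>, which shows that it does parse the payload.</li></ol><p><strong>Summary:</strong></p><p>These hooks are the observation and control channel through which the “internal AI gateway tools” attach their UI to Claude Code; they are not required. If you do not use the GUI to view session state or to orchestrate multiple agents, you can delete them. However, because <code>ANTHROPIC_BASE_URL</code> also goes through the “internal AI gateway tools”, the process itself must still be running.</p><h3 id="Windows-中文环境-PowerShell-乱码"><a href="#Windows-中文环境-PowerShell-乱码" class="headerlink" title="Windows 中文环境 PowerShell 乱码"></a>Garbled PowerShell output on Chinese-locale Windows</h3><p><strong>Example error:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 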
class="line">8</span><br></pre></td><td class="code"><pre><span class="line">● Bash(powershell.exe -NoProfile -Command &quot;...&quot;)</span><br><span class="line">  ⎿  Error: Exit code 1</span><br><span class="line">     ����λ�� ��:1 �ַ��: 169</span><br><span class="line">     + ... ref])|Out-Null;if(.Count -eq 0)&#123;&#x27;OK: no syntax errors&#x27;&#125;else&#123;|%&#123;extglo ...</span><br><span class="line">     +                                                                 ~</span><br><span class="line">     ������ʹ�ÿչ��Ԫ��</span><br><span class="line">         + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException</span><br><span class="line">         + FullyQualifiedErrorId : EmptyPipeElement</span><br></pre></td></tr></table></figure><p><strong>Cause:</strong></p><p>Simplified Chinese Windows defaults to code page <strong>936 (GBK&#x2F;GB2312)</strong>, but Claude Code handles strings internally as <strong>UTF-8</strong>. PowerShell encodes its Chinese error messages in GBK, Claude Code decodes them as UTF-8, and the result is garbled text.</p><p>This is a <strong>very common problem</strong> that almost every CJK (Chinese, Japanese, Korean) Windows user runs into. English Windows does not trigger it, because code page 437 is compatible with UTF-8 within the ASCII range.<br><strong>Fix: set an environment variable in the Claude Code settings</strong></p><p>Add the following to the project-level <code>.claude/settings.local.json</code>:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;CHCP&quot;</span><span class="punctuation">:</span> <span class="string">&quot;65001&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span 
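class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Or, to apply it globally, put it in <code>~/.claude/settings.json</code>.</p><p><strong>Why changing the system locale is not recommended:</strong></p><p>Windows offers the option “Beta: Use Unicode UTF-8 for worldwide language support”, which switches the system code page from 936 to 65001. It is the most thorough fix, but it changes the whole system and can garble a few legacy Chinese programs (installers with hard-coded GBK, old archive tools, and so on). <code>env CHCP=65001</code> takes effect only in the shell processes of Claude Code and leaves the rest of the system untouched.</p><p>Restart the Claude Code session after the change for it to take effect.</p><hr><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h2><p><a href="https://www.bilibili.com/video/BV12hf2B9E29/">A tutorial video by a Bilibili creator</a></p>]]></content:encoded>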
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/Assistant/">Assistant</category>
      
      <category domain="https://eugenepage.com/tags/VibeCoding/">VibeCoding</category>
      
      <category domain="https://eugenepage.com/tags/MCP/">MCP</category>
      
      <category domain="https://eugenepage.com/tags/Claude/">Claude</category>
      
      
      <comments>https://eugenepage.com/zh-CN/2026/03/22/20260322.Claude%20Code%20Tips/#disqus_thread</comments>
      
    </item>
    
    <item>
      <title>Research Notes on Mainstream AI Subscription Plans</title>
      <link>https://eugenepage.com/2026/03/13/20260314.AI%20Services/</link>
      <guid>https://eugenepage.com/2026/03/13/20260314.AI%20Services/</guid>
      <pubDate>Fri, 13 Mar 2026 19:10:00 GMT</pubDate>
      
        
        
      <description>&lt;h1 id=&quot;Research-Notes-on-Mainstream-AI-Subscription-Plans&quot;&gt;&lt;a href=&quot;#Research-Notes-on-Mainstream-AI-Subscription-Plans&quot; class=&quot;headerlink&quot;</description>
        
      
      
      
      <content:encoded><![CDATA[<h1 id="Research-Notes-on-Mainstream-AI-Subscription-Plans"><a href="#Research-Notes-on-Mainstream-AI-Subscription-Plans" class="headerlink" title="Research Notes on Mainstream AI Subscription Plans"></a>Research Notes on Mainstream AI Subscription Plans</h1><blockquote><p>Last updated: 2026-03-14</p><p>As AI technology advances rapidly, AI subscription services are proliferating. This document surveys and compares the current mainstream AI subscription options to help individuals and teams find the right fit.</p><p><strong>Provider categories:</strong></p><ul><li><strong>Native AI providers</strong>: Companies that build their own models (OpenAI, Google, Anthropic, etc.)</li><li><strong>Third-party AI providers</strong>: Platforms that aggregate multiple model sources (OpenRouter, Together AI, Replicate, etc.)</li></ul></blockquote><hr><h2 id="Table-of-Contents"><a href="#Table-of-Contents" class="headerlink" title="Table of Contents"></a>Table of Contents</h2><ul><li><a href="#provider-category-overview">Provider Category Overview</a></li><li><a href="#native-ai-providers">Native AI Providers</a><ul><li><a href="#openai">OpenAI</a></li><li><a href="#google-gemini">Google Gemini</a></li><li><a href="#anthropic-claude">Anthropic Claude</a></li><li><a href="#zhipu-ai">Zhipu AI</a></li><li><a href="#baidu-ernie-bot">Baidu ERNIE Bot</a></li><li><a href="#alibaba-cloud-qwen">Alibaba Cloud Qwen</a></li><li><a href="#bytedance-doubao">ByteDance Doubao</a></li><li><a href="#moonshot-kimi">Moonshot Kimi</a></li><li><a href="#github-copilot-microsoft">GitHub Copilot (Microsoft)</a></li></ul></li><li><a href="#third-party-ai-providers">Third-Party AI Providers</a><ul><li><a href="#openrouter">OpenRouter</a></li><li><a href="#together-ai">Together AI</a></li><li><a href="#replicate">Replicate</a></li><li><a href="#fireworksai">Fireworks.ai</a></li><li><a href="#hugging-face-inference">Hugging Face Inference</a></li><li><a 
href="#siliconflow">SiliconFlow</a></li></ul></li><li><a href="#comparison-summary">Comparison Summary</a></li></ul><hr><h2 id="Provider-Category-Overview"><a href="#Provider-Category-Overview" class="headerlink" title="Provider Category Overview"></a>Provider Category Overview</h2><h3 id="Native-AI-Providers"><a href="#Native-AI-Providers" class="headerlink" title="Native AI Providers"></a>Native AI Providers</h3><p><strong>Definition</strong>: Companies that develop their own AI models and offer direct API access.</p><p><strong>Characteristics:</strong></p><ul><li>✅ Strongest model capabilities (cutting-edge technology)</li><li>✅ Mature ecosystem, rich documentation</li><li>✅ Official support, high stability</li><li>❌ Single vendor, risk of vendor lock-in</li><li>❌ Relatively higher prices (though OpenAI allows region-switching tricks for lower rates)</li><li>❌ Complex integration when using multiple vendors</li></ul><p><strong>Best for:</strong></p><ul><li>Projects requiring peak model performance</li><li>Enterprise applications with high stability requirements</li><li>Global products needing multilingual support</li><li>Teams that don’t want to rely on third-party proxies</li></ul><hr><h3 id="Third-Party-AI-Providers-Aggregators"><a href="#Third-Party-AI-Providers-Aggregators" class="headerlink" title="Third-Party AI Providers (Aggregators)"></a>Third-Party AI Providers (Aggregators)</h3><p><strong>Definition</strong>: Platforms that aggregate multiple AI model sources behind a single unified API.</p><p><strong>Characteristics:</strong></p><ul><li>✅ Unified interface, lower integration complexity</li><li>✅ Rich model selection, flexible switching</li><li>✅ Smart routing and automatic failover</li><li>✅ Cost optimization, transparent pricing</li><li>❌ Extra middleware layer, may introduce additional latency</li><li>❌ Dependent on third-party platform stability</li><li>❌ Feature set may not be as complete as native providers</li></ul><p><strong>Best 
for:</strong></p><ul><li>Projects that need to connect to multiple models simultaneously</li><li>Teams wanting to reduce vendor lock-in risk</li><li>Cost-sensitive scenarios requiring flexible model switching</li><li>Rapid prototyping and testing</li></ul><hr><h1 id="Native-AI-Providers-1"><a href="#Native-AI-Providers-1" class="headerlink" title="Native AI Providers"></a>Native AI Providers</h1><h2 id="OpenAI"><a href="#OpenAI" class="headerlink" title="OpenAI"></a>OpenAI</h2><h3 id="Official-Websites"><a href="#Official-Websites" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://openai.com/">https://openai.com/</a></li><li><a href="https://platform.openai.com/">https://platform.openai.com/</a></li></ul><h3 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h3><p>OpenAI is a pioneer in large language models, offering the GPT series (GPT-4, GPT-3.5, etc.) and image generation models (DALL-E). It is one of the most mature AI API providers in the industry.</p><h3 id="Core-Models"><a href="#Core-Models" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Language-Models"><a href="#Language-Models" class="headerlink" title="Language Models"></a>Language Models</h4><ul><li><strong>GPT-4 Turbo</strong>: Faster, cheaper GPT-4 variant, supports 128K context</li><li><strong>GPT-4</strong>: Top-tier language model, supports 8K&#x2F;32K context</li><li><strong>GPT-3.5 Turbo</strong>: Great value, fast responses, supports 16K context</li><li><strong>GPT-4o</strong>: Multimodal model supporting text, images, and audio</li></ul><h4 id="Image-Models"><a href="#Image-Models" class="headerlink" title="Image Models"></a>Image Models</h4><ul><li><strong>DALL-E 3</strong>: High-quality image generation</li><li><strong>DALL-E 2</strong>: Previous-generation image generation</li></ul><h4 id="Other-Models"><a href="#Other-Models" class="headerlink" title="Other 
Models"></a>Other Models</h4><ul><li><strong>Whisper</strong>: Speech recognition (multilingual)</li><li><strong>Embeddings</strong>: Text embedding vectors</li><li><strong>Text-to-Speech</strong>: Voice synthesis</li><li><strong>Moderation</strong>: Content moderation</li></ul><h3 id="Subscription-Plans"><a href="#Subscription-Plans" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier"><a href="#Free-Tier" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: $0&#x2F;month</li><li><strong>Credits</strong>: $5 free credit (new users)</li><li><strong>Limits</strong>: Lower rate limits</li></ul><h4 id="API-Pay-as-you-go"><a href="#API-Pay-as-you-go" class="headerlink" title="API Pay-as-you-go"></a>API Pay-as-you-go</h4><ul><li><strong>GPT-4 Turbo</strong>: $0.01 &#x2F; 1K input tokens, $0.03 &#x2F; 1K output tokens</li><li><strong>GPT-4</strong>: $0.03 &#x2F; 1K input tokens, $0.06 &#x2F; 1K output tokens</li><li><strong>GPT-3.5 Turbo</strong>: $0.0015 &#x2F; 1K input tokens, $0.002 &#x2F; 1K output tokens</li><li><strong>DALL-E 3</strong>: $0.04 &#x2F; image</li><li><strong>Whisper</strong>: $0.006 &#x2F; minute</li></ul><h4 id="ChatGPT-Plus-Personal"><a href="#ChatGPT-Plus-Personal" class="headerlink" title="ChatGPT Plus (Personal)"></a>ChatGPT Plus (Personal)</h4><ul><li><strong>Price</strong>: $20&#x2F;month</li><li><strong>Includes</strong>:<ul><li>GPT-4 access</li><li>DALL-E 3 image generation</li><li>Advanced data analysis</li><li>Browsing capability</li><li>Priority access to new features</li></ul></li></ul><h4 id="Team"><a href="#Team" class="headerlink" title="Team"></a>Team</h4><ul><li><strong>Price</strong>: $25&#x2F;user&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Everything in ChatGPT Plus</li><li>Admin console</li><li>Team collaboration workspace</li><li>Data isolation</li><li>Higher rate limits</li></ul></li></ul><h4 id="Enterprise"><a href="#Enterprise" 
class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Unlimited speed</li><li>Priority support</li><li>API access</li><li>Data encryption</li><li>Custom model fine-tuning</li><li>Compliance certifications (SOC2, HIPAA)</li></ul></li></ul><h3 id="Core-Strengths"><a href="#Core-Strengths" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Model-Capabilities"><a href="#1-Model-Capabilities" class="headerlink" title="1. Model Capabilities"></a>1. Model Capabilities</h4><ul><li>Industry-leading language models</li><li>Excellent multilingual support</li><li>Powerful code generation</li><li>Outstanding reasoning and comprehension</li></ul><h4 id="2-Ecosystem"><a href="#2-Ecosystem" class="headerlink" title="2. Ecosystem"></a>2. Ecosystem</h4><ul><li><strong>API</strong>: REST API, Python&#x2F;JS SDK</li><li><strong>LangChain</strong>: Native support</li><li><strong>Vercel AI SDK</strong>: Native support</li><li><strong>VS Code plugins</strong>: Copilot and more</li><li><strong>Rich documentation</strong>: Detailed API docs and examples</li></ul><h4 id="3-Advanced-Features"><a href="#3-Advanced-Features" class="headerlink" title="3. Advanced Features"></a>3. Advanced Features</h4><ul><li><strong>Function Calling</strong>: Call external functions</li><li><strong>Streaming</strong>: Stream responses</li><li><strong>JSON Mode</strong>: Guaranteed JSON output</li><li><strong>Vision</strong>: Image understanding</li><li><strong>Fine-tuning</strong>: Custom model tuning</li><li><strong>Assistants API</strong>: Build AI assistants</li></ul><h4 id="4-Enterprise-Features"><a href="#4-Enterprise-Features" class="headerlink" title="4. Enterprise Features"></a>4. 
Enterprise Features</h4><ul><li><strong>Azure OpenAI</strong>: Enterprise-grade deployment</li><li><strong>Data privacy</strong>: Data not used for training (Enterprise tier)</li><li><strong>Compliance</strong>: SOC2, HIPAA, GDPR</li><li><strong>SLA</strong>: Enterprise service level agreements</li><li><strong>Technical support</strong>: Dedicated support team</li></ul><h3 id="Best-For"><a href="#Best-For" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Applications that demand peak model performance</li><li>Enterprise apps with high stability requirements</li><li>Global products needing multilingual support</li><li>Teams wanting a complete ecosystem and toolchain</li><li>Cost-insensitive scenarios</li></ul><h3 id="Pros-Cons"><a href="#Pros-Cons" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Strongest models, industry benchmark</li><li>✅ Mature ecosystem, rich tooling</li><li>✅ Best documentation and community support</li><li>✅ Comprehensive enterprise features</li><li>✅ Continuous updates and improvements</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Relatively higher prices</li><li>❌ Single vendor, lock-in risk</li><li>❌ Some features require Enterprise tier</li><li>❌ Data compliance concerns for users outside the US</li></ul><h3 id="Docs-Resources"><a href="#Docs-Resources" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://platform.openai.com/docs">https://platform.openai.com/docs</a></li><li><strong>API Reference</strong>: <a href="https://platform.openai.com/docs/api-reference">https://platform.openai.com/docs/api-reference</a></li><li><strong>GitHub</strong>: <a href="https://github.com/openai">https://github.com/openai</a></li><li><strong>Community</strong>: OpenAI Developer Forum</li></ul><hr><h2 id="Google-Gemini"><a href="#Google-Gemini" class="headerlink" title="Google Gemini"></a>Google 
Gemini</h2><h3 id="Official-Websites-1"><a href="#Official-Websites-1" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://gemini.google.com/">https://gemini.google.com/</a></li><li><a href="https://ai.google.dev/">https://ai.google.dev/</a></li></ul><h3 id="Overview-1"><a href="#Overview-1" class="headerlink" title="Overview"></a>Overview</h3><p>Google Gemini (formerly Bard) is Google’s multimodal large language model offering strong text, image, and audio understanding, with deep integration across the Google ecosystem.</p><h3 id="Core-Models-1"><a href="#Core-Models-1" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Gemini-Series"><a href="#Gemini-Series" class="headerlink" title="Gemini Series"></a>Gemini Series</h4><ul><li><strong>Gemini Ultra</strong>: Most powerful model, multimodal</li><li><strong>Gemini Pro</strong>: Mainstream model, balanced performance and cost</li><li><strong>Gemini Pro Vision</strong>: Vision model</li><li><strong>Gemini Flash</strong>: High-speed response model</li></ul><h4 id="Other-Models-1"><a href="#Other-Models-1" class="headerlink" title="Other Models"></a>Other Models</h4><ul><li><strong>PaLM 2</strong>: Previous-generation language model</li><li><strong>Imagen</strong>: Image generation</li><li><strong>Codey</strong>: Code model</li></ul><h3 id="Subscription-Plans-1"><a href="#Subscription-Plans-1" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-1"><a href="#Free-Tier-1" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: $0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Gemini Pro access</li><li>Daily usage limit</li><li>Web interface access</li></ul></li></ul><h4 id="AI-Studio-API-Pay-as-you-go"><a href="#AI-Studio-API-Pay-as-you-go" class="headerlink" title="AI Studio (API Pay-as-you-go)"></a>AI Studio (API Pay-as-you-go)</h4><ul><li><strong>Gemini Pro</strong>: $0.0005 &#x2F; 
1K input tokens, $0.0015 &#x2F; 1K output tokens</li><li><strong>Gemini Pro Vision</strong>: $0.0025 &#x2F; 1K input tokens, $0.0075 &#x2F; 1K output tokens</li><li><strong>Imagen</strong>: $0.002 &#x2F; image</li></ul><h4 id="Google-One-AI-Premium-Personal"><a href="#Google-One-AI-Premium-Personal" class="headerlink" title="Google One AI Premium (Personal)"></a>Google One AI Premium (Personal)</h4><ul><li><strong>Price</strong>: $19.99&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Gemini Ultra access</li><li>2TB Google One storage</li><li>Google Workspace premium features</li></ul></li></ul><h4 id="Enterprise-1"><a href="#Enterprise-1" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Vertex AI platform access</li><li>Custom model fine-tuning</li><li>Data privacy protection</li><li>Compliance certifications</li><li>Technical support</li></ul></li></ul><h3 id="Core-Strengths-1"><a href="#Core-Strengths-1" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Multimodal-Capabilities"><a href="#1-Multimodal-Capabilities" class="headerlink" title="1. Multimodal Capabilities"></a>1. Multimodal Capabilities</h4><ul><li>Native multimodal (text, images, audio, video)</li><li>Cross-modal understanding and generation</li><li>Real-time video analysis</li></ul><h4 id="2-Google-Ecosystem-Integration"><a href="#2-Google-Ecosystem-Integration" class="headerlink" title="2. Google Ecosystem Integration"></a>2. Google Ecosystem Integration</h4><ul><li><strong>Google Workspace</strong>: Docs, Gmail, Sheets integration</li><li><strong>Google Search</strong>: Real-time search capability</li><li><strong>Google Maps</strong>: Geospatial information</li><li><strong>YouTube</strong>: Video content understanding</li><li><strong>Android</strong>: Mobile integration</li></ul><h4 id="3-Developer-Experience"><a href="#3-Developer-Experience" class="headerlink" title="3. 
Developer Experience"></a>3. Developer Experience</h4><ul><li><strong>Vertex AI</strong>: Enterprise-grade AI platform</li><li><strong>AI Studio</strong>: Free development environment</li><li><strong>Google Cloud</strong>: Cloud-native deployment</li><li><strong>Kaggle</strong>: Data science community</li></ul><h4 id="4-Performance-Advantages"><a href="#4-Performance-Advantages" class="headerlink" title="4. Performance Advantages"></a>4. Performance Advantages</h4><ul><li><strong>MLOps</strong>: Model deployment and monitoring</li><li><strong>A&#x2F;B Testing</strong>: Model comparison</li><li><strong>AutoML</strong>: Automated machine learning</li><li><strong>TPU optimization</strong>: Hardware acceleration</li></ul><h3 id="Best-For-1"><a href="#Best-For-1" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Projects needing Google ecosystem integration</li><li>Multimodal application development</li><li>Enterprise AI platforms</li><li>Teams that need MLOps capabilities</li><li>Existing Google Cloud users</li></ul><h3 id="Pros-Cons-1"><a href="#Pros-Cons-1" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Strong multimodal capabilities</li><li>✅ Deep Google ecosystem integration</li><li>✅ Relatively lower prices</li><li>✅ Mature enterprise AI platform</li><li>✅ Rich developer tools</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Language model capabilities slightly behind GPT-4</li><li>❌ Documentation and community not as strong as OpenAI</li><li>❌ Some features still in Beta</li><li>❌ Access restricted in some regions</li></ul><h3 id="Docs-Resources-1"><a href="#Docs-Resources-1" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://ai.google.dev/docs">https://ai.google.dev/docs</a></li><li><strong>Vertex AI</strong>: <a href="https://cloud.google.com/vertex-ai">https://cloud.google.com/vertex-ai</a></li><li><strong>AI 
Studio</strong>: <a href="https://aistudio.google.com/">https://aistudio.google.com/</a></li></ul><hr><h2 id="Anthropic-Claude"><a href="#Anthropic-Claude" class="headerlink" title="Anthropic Claude"></a>Anthropic Claude</h2><h3 id="Official-Websites-2"><a href="#Official-Websites-2" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://www.anthropic.com/">https://www.anthropic.com/</a></li><li><a href="https://docs.anthropic.com/">https://docs.anthropic.com/</a></li></ul><h3 id="Overview-2"><a href="#Overview-2" class="headerlink" title="Overview"></a>Overview</h3><p>Anthropic was founded by former OpenAI employees and focuses on AI safety and alignment. The Claude series is known for its safety, long context window, and natural conversational quality.</p><h3 id="Core-Models-2"><a href="#Core-Models-2" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Claude-3-Series"><a href="#Claude-3-Series" class="headerlink" title="Claude 3 Series"></a>Claude 3 Series</h4><ul><li><strong>Claude 3 Opus</strong>: Most powerful, highest intelligence</li><li><strong>Claude 3 Sonnet</strong>: Balanced model, good performance-to-cost ratio</li><li><strong>Claude 3 Haiku</strong>: Fast model, low cost</li></ul><h4 id="Claude-2-Series"><a href="#Claude-2-Series" class="headerlink" title="Claude 2 Series"></a>Claude 2 Series</h4><ul><li><strong>Claude 2.1</strong>: Long context (200K tokens)</li><li><strong>Claude 2</strong>: Previous-generation model</li></ul><h3 id="Subscription-Plans-2"><a href="#Subscription-Plans-2" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-2"><a href="#Free-Tier-2" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: $0&#x2F;month</li><li><strong>Credits</strong>: Limited usage quota</li></ul><h4 id="API-Pay-as-you-go-1"><a href="#API-Pay-as-you-go-1" class="headerlink" title="API Pay-as-you-go"></a>API 
Pay-as-you-go</h4><ul><li><strong>Claude 3 Opus</strong>: $15 &#x2F; 1M input tokens, $75 &#x2F; 1M output tokens</li><li><strong>Claude 3 Sonnet</strong>: $3 &#x2F; 1M input tokens, $15 &#x2F; 1M output tokens</li><li><strong>Claude 3 Haiku</strong>: $0.25 &#x2F; 1M input tokens, $1.25 &#x2F; 1M output tokens</li><li><strong>Claude 2.1</strong>: $8 &#x2F; 1M input tokens, $24 &#x2F; 1M output tokens</li></ul><h4 id="Claude-Pro-Personal"><a href="#Claude-Pro-Personal" class="headerlink" title="Claude Pro (Personal)"></a>Claude Pro (Personal)</h4><ul><li><strong>Price</strong>: $20&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Claude 3 Opus access</li><li>Higher usage limits</li><li>Priority access to new features</li></ul></li></ul><h4 id="Team-1"><a href="#Team-1" class="headerlink" title="Team"></a>Team</h4><ul><li><strong>Price</strong>: $30&#x2F;user&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Everything in Claude Pro</li><li>Team management features</li><li>Higher usage limits</li></ul></li></ul><h4 id="Enterprise-2"><a href="#Enterprise-2" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Custom model fine-tuning</li><li>Data privacy protection</li><li>Compliance certifications</li><li>Dedicated support</li></ul></li></ul><h3 id="Core-Strengths-2"><a href="#Core-Strengths-2" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Safety-and-Alignment"><a href="#1-Safety-and-Alignment" class="headerlink" title="1. Safety and Alignment"></a>1. Safety and Alignment</h4><ul><li>Leading AI safety research</li><li>Constitutional AI methodology</li><li>Refusal of harmful content</li><li>Strong interpretability</li></ul><h4 id="2-Long-Context"><a href="#2-Long-Context" class="headerlink" title="2. Long Context"></a>2. 
Long Context</h4><ul><li>Claude 2.1 supports 200K tokens</li><li>Long document comprehension and summarization</li><li>Large codebase analysis</li></ul><h4 id="3-Natural-Conversation"><a href="#3-Natural-Conversation" class="headerlink" title="3. Natural Conversation"></a>3. Natural Conversation</h4><ul><li>Highly fluent dialogue</li><li>Natural tone and voice</li><li>Ideal for chatbots</li><li>Creative writing</li></ul><h4 id="4-Coding-Capabilities"><a href="#4-Coding-Capabilities" class="headerlink" title="4. Coding Capabilities"></a>4. Coding Capabilities</h4><ul><li>Excellent performance on programming tasks</li><li>Code generation and debugging</li><li>Technical documentation understanding</li><li>Code refactoring suggestions</li></ul><h4 id="5-Developer-Experience"><a href="#5-Developer-Experience" class="headerlink" title="5. Developer Experience"></a>5. Developer Experience</h4><ul><li><strong>API</strong>: REST API, Python&#x2F;JS SDK</li><li><strong>LangChain</strong>: Native support</li><li><strong>Function Calling</strong>: Call external functions</li><li><strong>Streaming</strong>: Stream responses</li><li><strong>Tool Use</strong>: External tool integration</li></ul><h3 id="Best-For-2"><a href="#Best-For-2" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Scenarios with high safety requirements</li><li>Applications requiring long context</li><li>Coding assistant tools</li><li>Chatbots</li><li>Content moderation and compliance</li></ul><h3 id="Pros-Cons-2"><a href="#Pros-Cons-2" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Best safety and alignment properties</li><li>✅ Long context support (200K)</li><li>✅ Natural and fluent conversation</li><li>✅ Strong coding capabilities</li><li>✅ Reasonably priced</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Model capabilities slightly behind GPT-4</li><li>❌ Ecosystem not as mature as OpenAI</li><li>❌ Smaller documentation and 
community</li><li>❌ Fewer tools and plugins</li></ul><h3 id="Docs-Resources-2"><a href="#Docs-Resources-2" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://docs.anthropic.com/">https://docs.anthropic.com/</a></li><li><strong>API Reference</strong>: <a href="https://docs.anthropic.com/claude/reference">https://docs.anthropic.com/claude/reference</a></li><li><strong>GitHub</strong>: <a href="https://github.com/anthropics">https://github.com/anthropics</a></li><li><strong>Research papers</strong>: <a href="https://www.anthropic.com/research">https://www.anthropic.com/research</a></li></ul><hr><h2 id="Zhipu-AI"><a href="#Zhipu-AI" class="headerlink" title="Zhipu AI"></a>Zhipu AI</h2><h3 id="Official-Websites-3"><a href="#Official-Websites-3" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://open.bigmodel.cn/">https://open.bigmodel.cn/</a></li><li><a href="https://www.zhipuai.cn/">https://www.zhipuai.cn/</a></li></ul><h3 id="Overview-3"><a href="#Overview-3" class="headerlink" title="Overview"></a>Overview</h3><p>Zhipu AI is a leading Chinese large model company that developed the GLM series, offering Chinese-optimized language models and multimodal capabilities.</p><h3 id="Core-Models-3"><a href="#Core-Models-3" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Language-Models-1"><a href="#Language-Models-1" class="headerlink" title="Language Models"></a>Language Models</h4><ul><li><strong>GLM-4</strong>: New-generation LLM, capabilities benchmarked against GPT-4</li><li><strong>GLM-3-Turbo</strong>: Fast response model</li><li><strong>GLM-3-6B</strong>: Lightweight model</li></ul><h4 id="Multimodal-Models"><a href="#Multimodal-Models" class="headerlink" title="Multimodal Models"></a>Multimodal Models</h4><ul><li><strong>CogView</strong>: Image generation</li><li><strong>CogVideo</strong>: Video 
generation</li><li><strong>CogView3</strong>: Third-generation image model</li></ul><h4 id="Specialized-Models"><a href="#Specialized-Models" class="headerlink" title="Specialized Models"></a>Specialized Models</h4><ul><li><strong>CodeGeeX</strong>: Code model</li><li><strong>CharacterGLM</strong>: Role-playing</li><li><strong>MedicalGLM</strong>: Healthcare</li></ul><h3 id="Subscription-Plans-3"><a href="#Subscription-Plans-3" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-3"><a href="#Free-Tier-3" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: ¥0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Basic model access</li><li>Daily usage limit</li><li>Online chat</li></ul></li></ul><h4 id="API-Pay-as-you-go-2"><a href="#API-Pay-as-you-go-2" class="headerlink" title="API Pay-as-you-go"></a>API Pay-as-you-go</h4><ul><li><strong>GLM-4</strong>: ¥0.1 &#x2F; 1K input tokens, ¥0.1 &#x2F; 1K output tokens</li><li><strong>GLM-3-Turbo</strong>: ¥0.005 &#x2F; 1K input tokens, ¥0.005 &#x2F; 1K output tokens</li><li><strong>CogView</strong>: ¥0.05 &#x2F; image</li></ul><h4 id="Personal-Plan"><a href="#Personal-Plan" class="headerlink" title="Personal Plan"></a>Personal Plan</h4><ul><li><strong>Price</strong>: ¥49&#x2F;month</li><li><strong>Includes</strong>:<ul><li>GLM-4 premium access</li><li>Higher usage limits</li><li>Priority responses</li></ul></li></ul><h4 id="Enterprise-3"><a href="#Enterprise-3" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Private deployment</li><li>Model fine-tuning</li><li>Data privacy protection</li><li>Dedicated technical support</li><li>Compliance certifications</li></ul></li></ul><h3 id="Core-Strengths-3"><a href="#Core-Strengths-3" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Chinese-Language-Optimization"><a 
href="#1-Chinese-Language-Optimization" class="headerlink" title="1. Chinese Language Optimization"></a>1. Chinese Language Optimization</h4><ul><li>Strong Chinese comprehension and generation</li><li>Deep understanding of Chinese culture</li><li>Excellent Chinese instruction-following</li></ul><h4 id="2-Domestic-Compliance-Support"><a href="#2-Domestic-Compliance-Support" class="headerlink" title="2. Domestic Compliance Support"></a>2. Domestic Compliance Support</h4><ul><li>Compliant with Chinese data regulations</li><li>Data stays onshore</li><li>Compatible with domestic hardware</li></ul><h4 id="3-Multimodal-Capabilities"><a href="#3-Multimodal-Capabilities" class="headerlink" title="3. Multimodal Capabilities"></a>3. Multimodal Capabilities</h4><ul><li>Text, images, and video</li><li>Cross-modal understanding</li><li>Wide application scenarios</li></ul><h4 id="4-Cost-Advantage"><a href="#4-Cost-Advantage" class="headerlink" title="4. Cost Advantage"></a>4. Cost Advantage</h4><ul><li>Relatively lower prices</li><li>Optimized for the domestic market</li><li>Suitable for large-scale applications</li></ul><h4 id="5-Developer-Experience-1"><a href="#5-Developer-Experience-1" class="headerlink" title="5. Developer Experience"></a>5. 
Developer Experience</h4><ul><li><strong>API</strong>: REST API, Python&#x2F;Java SDK</li><li><strong>LangChain</strong>: Native support</li><li><strong>Web interface</strong>: Online debugging</li><li><strong>Detailed docs</strong>: Chinese-language documentation</li></ul><h3 id="Best-For-3"><a href="#Best-For-3" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Domestic application development</li><li>Chinese-primary applications</li><li>Scenarios with strict data compliance requirements</li><li>Cost-sensitive projects</li><li>Teams requiring domestic-only deployment</li></ul><h3 id="Pros-Cons-3"><a href="#Pros-Cons-3" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Strong Chinese language capabilities</li><li>✅ Compliant with Chinese regulations</li><li>✅ Relatively lower cost</li><li>✅ Domestic deployment support</li><li>✅ Solid multimodal capabilities</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Model capabilities slightly behind GPT-4</li><li>❌ Weaker English capabilities</li><li>❌ Smaller ecosystem and community</li><li>❌ Fewer tools and plugins</li></ul><h3 id="Docs-Resources-3"><a href="#Docs-Resources-3" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://open.bigmodel.cn/dev/api">https://open.bigmodel.cn/dev/api</a></li><li><strong>GitHub</strong>: <a href="https://github.com/THUDM">https://github.com/THUDM</a></li><li><strong>Open-source projects</strong>: GLM-4, CodeGeeX, etc.</li></ul><hr><h2 id="Baidu-ERNIE-Bot"><a href="#Baidu-ERNIE-Bot" class="headerlink" title="Baidu ERNIE Bot"></a>Baidu ERNIE Bot</h2><h3 id="Official-Websites-4"><a href="#Official-Websites-4" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://yiyan.baidu.com/">https://yiyan.baidu.com/</a></li><li><a 
href="https://cloud.baidu.com/product/wenxinworkshop">https://cloud.baidu.com/product/wenxinworkshop</a></li></ul><h3 id="Overview-4"><a href="#Overview-4" class="headerlink" title="Overview"></a>Overview</h3><p>Baidu’s ERNIE Bot (Wenxin Yiyan) is a large language model based on Baidu’s ERNIE series, with deep integration across the Baidu ecosystem.</p><h3 id="Core-Models-4"><a href="#Core-Models-4" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="ERNIE-Series"><a href="#ERNIE-Series" class="headerlink" title="ERNIE Series"></a>ERNIE Series</h4><ul><li><strong>ERNIE 4.0</strong>: Latest version, multimodal</li><li><strong>ERNIE 3.5</strong>: Mainstream version</li><li><strong>ERNIE 4.0 Turbo</strong>: Fast version</li></ul><h4 id="Specialized-Models-1"><a href="#Specialized-Models-1" class="headerlink" title="Specialized Models"></a>Specialized Models</h4><ul><li><strong>ERNIE Bot</strong>: Dialogue model</li><li><strong>ERNIE Speed</strong>: High-speed response</li><li><strong>ERNIE Lite</strong>: Lightweight</li></ul><h3 id="Subscription-Plans-4"><a href="#Subscription-Plans-4" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-4"><a href="#Free-Tier-4" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: ¥0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Basic dialogue</li><li>Daily usage limit</li></ul></li></ul><h4 id="API-Pay-as-you-go-3"><a href="#API-Pay-as-you-go-3" class="headerlink" title="API Pay-as-you-go"></a>API Pay-as-you-go</h4><ul><li><strong>ERNIE 4.0</strong>: ¥0.12 &#x2F; 1K tokens</li><li><strong>ERNIE 3.5</strong>: ¥0.008 &#x2F; 1K tokens</li><li><strong>ERNIE Speed</strong>: ¥0.004 &#x2F; 1K tokens</li></ul><h4 id="VIP-Membership"><a href="#VIP-Membership" class="headerlink" title="VIP Membership"></a>VIP Membership</h4><ul><li><strong>Price</strong>: ¥49&#x2F;month</li><li><strong>Includes</strong>:<ul><li>ERNIE 4.0 access</li><li>Higher usage 
limits</li><li>Exclusive features</li></ul></li></ul><h4 id="Enterprise-4"><a href="#Enterprise-4" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Baidu Cloud integration</li><li>Private deployment</li><li>Model fine-tuning</li><li>Dedicated support</li></ul></li></ul><h3 id="Core-Strengths-4"><a href="#Core-Strengths-4" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Baidu-Ecosystem-Integration"><a href="#1-Baidu-Ecosystem-Integration" class="headerlink" title="1. Baidu Ecosystem Integration"></a>1. Baidu Ecosystem Integration</h4><ul><li><strong>Baidu Search</strong>: Real-time search</li><li><strong>Baidu Maps</strong>: Geospatial information</li><li><strong>Baidu Baike</strong>: Knowledge base</li><li><strong>Baidu Netdisk</strong>: Cloud storage</li></ul><h4 id="2-Chinese-Language-Optimization"><a href="#2-Chinese-Language-Optimization" class="headerlink" title="2. Chinese Language Optimization"></a>2. Chinese Language Optimization</h4><ul><li>Strong Chinese capabilities</li><li>Chinese cultural understanding</li><li>Rich Chinese knowledge base</li></ul><h4 id="3-Enterprise-Features"><a href="#3-Enterprise-Features" class="headerlink" title="3. Enterprise Features"></a>3. Enterprise Features</h4><ul><li><strong>Baidu Cloud</strong>: Cloud-native deployment</li><li><strong>Intelligent Cloud</strong>: AI development platform</li><li><strong>Compliance</strong>: Meets domestic requirements</li><li><strong>Technical support</strong>: 24&#x2F;7 support</li></ul><h4 id="4-Developer-Tools"><a href="#4-Developer-Tools" class="headerlink" title="4. Developer Tools"></a>4. 
Developer Tools</h4><ul><li><strong>Qianfan Platform</strong>: AI development platform</li><li><strong>ModelArts</strong>: Model training</li><li><strong>AppBuilder</strong>: Application building</li></ul><h3 id="Best-For-4"><a href="#Best-For-4" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Baidu ecosystem integration</li><li>Domestic enterprise applications</li><li>Teams needing Baidu Cloud services</li><li>High compliance requirements</li><li>SMBs looking for rapid deployment</li></ul><h3 id="Pros-Cons-4"><a href="#Pros-Cons-4" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Deep Baidu ecosystem integration</li><li>✅ Strong Chinese capabilities</li><li>✅ Comprehensive enterprise features</li><li>✅ Good Baidu Cloud support</li><li>✅ Relatively lower prices</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Average model capabilities</li><li>❌ Weaker English capabilities</li><li>❌ Relatively closed ecosystem</li><li>❌ Fewer developer tools</li></ul><h3 id="Docs-Resources-4"><a href="#Docs-Resources-4" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://cloud.baidu.com/doc/WENXINWORKSHOP/">https://cloud.baidu.com/doc/WENXINWORKSHOP/</a></li><li><strong>Qianfan Platform</strong>: <a href="https://console.bce.baidu.com/qianfan/">https://console.bce.baidu.com/qianfan/</a></li></ul><hr><h2 id="Alibaba-Cloud-Qwen"><a href="#Alibaba-Cloud-Qwen" class="headerlink" title="Alibaba Cloud Qwen"></a>Alibaba Cloud Qwen</h2><h3 id="Official-Websites-5"><a href="#Official-Websites-5" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://tongyi.aliyun.com/">https://tongyi.aliyun.com/</a></li><li><a href="https://help.aliyun.com/zh/dashscope/">https://help.aliyun.com/zh/dashscope/</a></li></ul><h3 id="Overview-5"><a href="#Overview-5" class="headerlink" 
title="Overview"></a>Overview</h3><p>Alibaba Cloud’s Qwen (Tongyi Qianwen) is a series of large language models ranging from lightweight to ultra-large scale, offering diverse model choices.</p><h3 id="Core-Models-5"><a href="#Core-Models-5" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Qwen-Series"><a href="#Qwen-Series" class="headerlink" title="Qwen Series"></a>Qwen Series</h4><ul><li><strong>Qwen-Max</strong>: Most powerful model</li><li><strong>Qwen-Plus</strong>: Mainstream model</li><li><strong>Qwen-Turbo</strong>: Fast response</li><li><strong>Qwen-Long</strong>: Long context</li></ul><h4 id="Open-Source-Models"><a href="#Open-Source-Models" class="headerlink" title="Open-Source Models"></a>Open-Source Models</h4><ul><li><strong>Qwen-72B</strong>: Open-source large-scale</li><li><strong>Qwen-14B</strong>: Open-source mid-scale</li><li><strong>Qwen-7B</strong>: Open-source small-scale</li></ul><h4 id="Specialized-Models-2"><a href="#Specialized-Models-2" class="headerlink" title="Specialized Models"></a>Specialized Models</h4><ul><li><strong>Qwen-VL</strong>: Vision-language model</li><li><strong>Qwen-Audio</strong>: Audio model</li><li><strong>CodeQwen</strong>: Code model</li></ul><h3 id="Subscription-Plans-5"><a href="#Subscription-Plans-5" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-5"><a href="#Free-Tier-5" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: ¥0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Basic model access</li><li>Daily usage limit</li></ul></li></ul><h4 id="API-Pay-as-you-go-4"><a href="#API-Pay-as-you-go-4" class="headerlink" title="API Pay-as-you-go"></a>API Pay-as-you-go</h4><ul><li><strong>Qwen-Max</strong>: ¥0.04 &#x2F; 1K tokens</li><li><strong>Qwen-Plus</strong>: ¥0.008 &#x2F; 1K tokens</li><li><strong>Qwen-Turbo</strong>: ¥0.003 &#x2F; 1K tokens</li></ul><h4 id="Enterprise-5"><a href="#Enterprise-5" class="headerlink" 
title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Deep Alibaba Cloud integration</li><li>Private deployment</li><li>Model fine-tuning</li><li>SLA guarantees</li></ul></li></ul><h3 id="Core-Strengths-5"><a href="#Core-Strengths-5" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Alibaba-Cloud-Ecosystem"><a href="#1-Alibaba-Cloud-Ecosystem" class="headerlink" title="1. Alibaba Cloud Ecosystem"></a>1. Alibaba Cloud Ecosystem</h4><ul><li><strong>Alibaba Cloud ECS</strong>: Cloud servers</li><li><strong>OSS</strong>: Object storage</li><li><strong>RDS</strong>: Database</li><li><strong>Function Compute</strong>: Serverless</li></ul><h4 id="2-Rich-Model-Selection"><a href="#2-Rich-Model-Selection" class="headerlink" title="2. Rich Model Selection"></a>2. Rich Model Selection</h4><ul><li>Multiple scales of models</li><li>Downloadable open-source models</li><li>Long context support</li></ul><h4 id="3-Chinese-Language-Optimization"><a href="#3-Chinese-Language-Optimization" class="headerlink" title="3. Chinese Language Optimization"></a>3. Chinese Language Optimization</h4><ul><li>Strong Chinese capabilities</li><li>Integration with Alibaba products</li><li>Optimized for e-commerce scenarios</li></ul><h4 id="4-Developer-Tools-1"><a href="#4-Developer-Tools-1" class="headerlink" title="4. Developer Tools"></a>4. 
Developer Tools</h4><ul><li><strong>DashScope</strong>: AI development platform</li><li><strong>ModelScope</strong>: Model community</li><li><strong>PAI</strong>: Machine learning platform</li></ul><h3 id="Best-For-5"><a href="#Best-For-5" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Existing Alibaba Cloud users</li><li>E-commerce applications</li><li>Enterprise-grade applications</li><li>Teams that prefer open-source models</li><li>Cost-sensitive projects</li></ul><h3 id="Pros-Cons-5"><a href="#Pros-Cons-5" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Deep Alibaba Cloud integration</li><li>✅ Rich model selection</li><li>✅ Downloadable open-source models</li><li>✅ Strong Chinese capabilities</li><li>✅ Lower prices</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Average model capabilities</li><li>❌ Weaker English capabilities</li><li>❌ Relatively closed ecosystem</li><li>❌ Toolchain not fully mature</li></ul><h3 id="Docs-Resources-5"><a href="#Docs-Resources-5" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://help.aliyun.com/zh/dashscope/">https://help.aliyun.com/zh/dashscope/</a></li><li><strong>ModelScope</strong>: <a href="https://modelscope.cn/">https://modelscope.cn/</a></li></ul><hr><h2 id="ByteDance-Doubao"><a href="#ByteDance-Doubao" class="headerlink" title="ByteDance Doubao"></a>ByteDance Doubao</h2><h3 id="Official-Websites-6"><a href="#Official-Websites-6" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://www.doubao.com/">https://www.doubao.com/</a></li><li><a href="https://platform.volcengine.com/">https://platform.volcengine.com/</a></li></ul><h3 id="Overview-6"><a href="#Overview-6" class="headerlink" title="Overview"></a>Overview</h3><p>ByteDance’s Doubao is an AI assistant and multimodal model platform offering dialogue, image generation, voice, 
and other capabilities.</p><h3 id="Core-Models-6"><a href="#Core-Models-6" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Doubao-Series"><a href="#Doubao-Series" class="headerlink" title="Doubao Series"></a>Doubao Series</h4><ul><li><strong>Doubao-Pro</strong>: Professional version</li><li><strong>Doubao-Lite</strong>: Lightweight version</li><li><strong>Doubao-Character</strong>: Role-playing</li></ul><h4 id="Vision-Models"><a href="#Vision-Models" class="headerlink" title="Vision Models"></a>Vision Models</h4><ul><li><strong>Skylark</strong>: Image generation</li><li><strong>Skylark-2</strong>: Second-generation images</li></ul><h4 id="Other-Models-2"><a href="#Other-Models-2" class="headerlink" title="Other Models"></a>Other Models</h4><ul><li><strong>Voice models</strong>: Speech synthesis</li><li><strong>Video models</strong>: Video generation</li></ul><h3 id="Subscription-Plans-6"><a href="#Subscription-Plans-6" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-6"><a href="#Free-Tier-6" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: ¥0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Basic dialogue</li><li>Daily usage limit</li></ul></li></ul><h4 id="API-Pay-as-you-go-5"><a href="#API-Pay-as-you-go-5" class="headerlink" title="API Pay-as-you-go"></a>API Pay-as-you-go</h4><ul><li><strong>Doubao-Pro</strong>: ¥0.008 &#x2F; 1K tokens</li><li><strong>Doubao-Lite</strong>: ¥0.001 &#x2F; 1K tokens</li><li><strong>Skylark</strong>: ¥0.05 &#x2F; image</li></ul><h4 id="Enterprise-6"><a href="#Enterprise-6" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>Volcano Engine integration</li><li>Private deployment</li><li>Dedicated support</li></ul></li></ul><h3 id="Core-Strengths-6"><a href="#Core-Strengths-6" class="headerlink" title="Core Strengths"></a>Core 
Strengths</h3><h4 id="1-ByteDance-Ecosystem"><a href="#1-ByteDance-Ecosystem" class="headerlink" title="1. ByteDance Ecosystem"></a>1. ByteDance Ecosystem</h4><ul><li><strong>Douyin (the Chinese version of TikTok)</strong>: Short video integration</li><li><strong>Toutiao</strong>: News and content</li><li><strong>Feishu (Lark)</strong>: Workplace collaboration</li><li><strong>Volcano Engine</strong>: Cloud services</li></ul><h4 id="2-Multimodal"><a href="#2-Multimodal" class="headerlink" title="2. Multimodal"></a>2. Multimodal</h4><ul><li>Text, images, audio, and video</li><li>Cross-modal understanding</li><li>Creative content generation</li></ul><h4 id="3-Cost-Advantage"><a href="#3-Cost-Advantage" class="headerlink" title="3. Cost Advantage"></a>3. Cost Advantage</h4><ul><li>Lower prices</li><li>Suitable for large-scale applications</li><li>Generous free quota</li></ul><h4 id="4-Developer-Tools-2"><a href="#4-Developer-Tools-2" class="headerlink" title="4. Developer Tools"></a>4. Developer Tools</h4><ul><li><strong>Volcano Engine</strong>: Development platform</li><li><strong>Open Platform</strong>: API services</li><li><strong>Detailed docs</strong>: Chinese-language documentation</li></ul><h3 id="Best-For-6"><a href="#Best-For-6" class="headerlink" title="Best For"></a>Best For</h3><ul><li>ByteDance ecosystem integration</li><li>Multimodal applications</li><li>Content creation</li><li>Cost-sensitive projects</li><li>Consumer-facing applications</li></ul><h3 id="Pros-Cons-6"><a href="#Pros-Cons-6" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ ByteDance ecosystem integration</li><li>✅ Strong multimodal capabilities</li><li>✅ Lower prices</li><li>✅ Generous free quota</li><li>✅ Suitable for consumer apps</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Average model capabilities</li><li>❌ Fewer enterprise features</li><li>❌ Toolchain not fully mature</li><li>❌ Relatively closed ecosystem</li></ul><h3 
id="Docs-Resources-6"><a href="#Docs-Resources-6" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Volcano Engine</strong>: <a href="https://platform.volcengine.com/">https://platform.volcengine.com/</a></li><li><strong>Open Platform</strong>: <a href="https://open.volcengine.com/">https://open.volcengine.com/</a></li></ul><hr><h2 id="Moonshot-Kimi"><a href="#Moonshot-Kimi" class="headerlink" title="Moonshot Kimi"></a>Moonshot Kimi</h2><h3 id="Official-Website"><a href="#Official-Website" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://www.moonshot.cn/">https://www.moonshot.cn/</a></li></ul><h3 id="Overview-7"><a href="#Overview-7" class="headerlink" title="Overview"></a>Overview</h3><p>Moonshot AI’s Kimi is known for its ultra-long context window — supporting up to 2 million tokens — making it ideal for long document analysis and summarization.</p><h3 id="Core-Models-7"><a href="#Core-Models-7" class="headerlink" title="Core Models"></a>Core Models</h3><h4 id="Kimi-Series"><a href="#Kimi-Series" class="headerlink" title="Kimi Series"></a>Kimi Series</h4><ul><li><strong>moonshot-v1-128k</strong>: 128K context</li><li><strong>moonshot-v1-32k</strong>: 32K context</li><li><strong>moonshot-v1-8k</strong>: 8K context</li></ul><h3 id="Subscription-Plans-7"><a href="#Subscription-Plans-7" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free-Tier-7"><a href="#Free-Tier-7" class="headerlink" title="Free Tier"></a>Free Tier</h4><ul><li><strong>Price</strong>: ¥0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Basic dialogue</li><li>20 files&#x2F;day</li></ul></li></ul><h4 id="Pro"><a href="#Pro" class="headerlink" title="Pro"></a>Pro</h4><ul><li><strong>Price</strong>: ¥68&#x2F;month</li><li><strong>Includes</strong>:<ul><li>128K context</li><li>Unlimited file uploads</li><li>Higher usage limits</li></ul></li></ul><h4 id="Enterprise-7"><a 
href="#Enterprise-7" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: Contact sales</li><li><strong>Includes</strong>:<ul><li>API access</li><li>Private deployment</li><li>Dedicated support</li></ul></li></ul><h3 id="Core-Strengths-7"><a href="#Core-Strengths-7" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Ultra-Long-Context"><a href="#1-Ultra-Long-Context" class="headerlink" title="1. Ultra-Long Context"></a>1. Ultra-Long Context</h4><ul><li>2 million token context window</li><li>Long document analysis</li><li>Large codebase comprehension</li></ul><h4 id="2-File-Processing"><a href="#2-File-Processing" class="headerlink" title="2. File Processing"></a>2. File Processing</h4><ul><li>Supports multiple formats</li><li>PDF, Word, Excel, and more</li><li>Document summarization and analysis</li></ul><h4 id="3-Chinese-Language-Optimization-1"><a href="#3-Chinese-Language-Optimization-1" class="headerlink" title="3. Chinese Language Optimization"></a>3. Chinese Language Optimization</h4><ul><li>Strong Chinese capabilities</li><li>Chinese document processing</li><li>Chinese cultural understanding</li></ul><h4 id="4-User-Experience"><a href="#4-User-Experience" class="headerlink" title="4. User Experience"></a>4. 
User Experience</h4><ul><li>Clean interface</li><li>Easy to use</li><li>Suited for individual users</li></ul><h3 id="Best-For-7"><a href="#Best-For-7" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Long document analysis</li><li>Codebase comprehension</li><li>Research and academic work</li><li>Personal knowledge management</li><li>Document summarization</li></ul><h3 id="Pros-Cons-7"><a href="#Pros-Cons-7" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Ultra-long context (2 million tokens)</li><li>✅ Strong file processing</li><li>✅ Good Chinese language optimization</li><li>✅ Pleasant user experience</li><li>✅ Great for personal use</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Average model capabilities</li><li>❌ Relatively narrow feature set</li><li>❌ Fewer enterprise features</li><li>❌ Toolchain not fully mature</li></ul><h3 id="Docs-Resources-7"><a href="#Docs-Resources-7" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official website</strong>: <a href="https://www.moonshot.cn/">https://www.moonshot.cn/</a></li><li><strong>Usage docs</strong>: Available on the official website</li></ul><hr><h2 id="GitHub-Copilot-Microsoft"><a href="#GitHub-Copilot-Microsoft" class="headerlink" title="GitHub Copilot (Microsoft)"></a>GitHub Copilot (Microsoft)</h2><h3 id="Official-Websites-7"><a href="#Official-Websites-7" class="headerlink" title="Official Websites"></a>Official Websites</h3><ul><li><a href="https://github.com/features/copilot">https://github.com/features/copilot</a></li><li><a href="https://docs.github.com/en/copilot">https://docs.github.com/en/copilot</a></li></ul><h3 id="Overview-8"><a href="#Overview-8" class="headerlink" title="Overview"></a>Overview</h3><p>GitHub Copilot is Microsoft’s AI coding assistant that aggregates multiple leading large language models — OpenAI, Anthropic Claude, Google Gemini, and others — with deep integration 
into VS Code, Visual Studio, JetBrains, and other IDEs. It provides code completion, chat, Agent mode, code review, and comprehensive coding assistance. For developers, <strong>Copilot Pro is one of the best-value AI coding subscriptions available today</strong>.</p><p>One billing quirk is worth knowing: GitHub Copilot resets usage quotas on the 1st of each calendar month, and <strong>code completions and Premium request allowances both reset on the 1st</strong>. A one-month subscription that starts around the 15th therefore spans two resets, so you get roughly two months of quota for a single month’s payment (if you don’t subscribe continuously, you effectively get about 15 days of use in each calendar month). In my own test, I cancelled my subscription on the 7th and re-subscribed on the 15th with the same account. Interestingly, I was charged only $7 that time, and I haven’t fully figured out the billing logic. My Premium request quota for that month stayed at whatever was left over from before the 7th instead of refilling. So I’d suggest alternating between two accounts, which essentially gets you $20 worth of value out of $10 and puts Copilot Pro roughly on par with the $20 Claude Pro tier.</p><h3 id="Core-Models-8"><a href="#Core-Models-8" class="headerlink" title="Core Models"></a>Core Models</h3><p>Copilot supports multiple AI models and you can switch freely within the IDE:</p><ul><li><strong>Claude Sonnet 4.6 &#x2F; Claude 3.7 Sonnet</strong>: Anthropic’s strong coding models</li><li><strong>Claude Opus 4.5</strong>: Anthropic’s most powerful reasoning model</li><li><strong>GPT-4.1 &#x2F; GPT-5 mini</strong>: OpenAI’s latest models</li><li><strong>Gemini 2.5 Pro &#x2F; Gemini 3.1 Pro</strong>: Google’s high-performance models</li><li><strong>o3-mini &#x2F; o1-mini</strong>: Reasoning-enhanced models</li></ul><h3 id="Subscription-Plans-8"><a href="#Subscription-Plans-8" class="headerlink" title="Subscription Plans"></a>Subscription Plans</h3><h4 id="Free"><a href="#Free" class="headerlink" title="Free"></a>Free</h4><ul><li><strong>Price</strong>: 
$0&#x2F;month</li><li><strong>Includes</strong>:<ul><li>2,000 code completions&#x2F;month</li><li>50 Premium requests&#x2F;month</li><li>Basic chat functionality</li><li>VS Code and other IDE support</li></ul></li></ul><h4 id="Pro-Personal-Professional"><a href="#Pro-Personal-Professional" class="headerlink" title="Pro (Personal Professional)"></a>Pro (Personal Professional)</h4><ul><li><strong>Price</strong>: $10&#x2F;month or $100&#x2F;year</li><li><strong>Includes</strong>:<ul><li>Unlimited code completions</li><li>300 Premium requests&#x2F;month</li><li>Multi-model switching (Claude, Gemini, GPT, etc.)</li><li>Agent mode (autonomous multi-step coding tasks)</li><li>Code Review</li><li>Copilot CLI (command-line assistant)</li><li>Copilot Chat (conversational coding assistance)</li></ul></li></ul><h4 id="Business-Team"><a href="#Business-Team" class="headerlink" title="Business (Team)"></a>Business (Team)</h4><ul><li><strong>Price</strong>: $19&#x2F;user&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Everything in Pro</li><li>Organization management dashboard</li><li>Knowledge base integration (index org codebase)</li><li>Custom model selection policies</li><li>IP indemnification</li><li>SAML SSO</li></ul></li></ul><h4 id="Enterprise-8"><a href="#Enterprise-8" class="headerlink" title="Enterprise"></a>Enterprise</h4><ul><li><strong>Price</strong>: $39&#x2F;user&#x2F;month</li><li><strong>Includes</strong>:<ul><li>Everything in Business</li><li>Requires GitHub Enterprise Cloud ($21&#x2F;user&#x2F;month)</li><li>Custom model fine-tuning</li><li>Pull Request summaries</li><li>Security vulnerability fix suggestions</li><li>Advanced security compliance features</li></ul></li></ul><h3 id="Core-Strengths-8"><a href="#Core-Strengths-8" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Deep-IDE-Integration"><a href="#1-Deep-IDE-Integration" class="headerlink" title="1. Deep IDE Integration"></a>1. 
Deep IDE Integration</h4><ul><li><strong>VS Code</strong>: Best experience, native integration</li><li><strong>Visual Studio</strong>: Seamless within the Microsoft ecosystem</li><li><strong>JetBrains suite</strong>: IntelliJ, PyCharm, and more</li><li><strong>Neovim &#x2F; Vim</strong>: Terminal-friendly</li><li><strong>Xcode</strong>: Apple ecosystem support</li></ul><h4 id="2-Free-Multi-Model-Switching"><a href="#2-Free-Multi-Model-Switching" class="headerlink" title="2. Free Multi-Model Switching"></a>2. Free Multi-Model Switching</h4><ul><li>Supports OpenAI, Anthropic, Google, and more</li><li>One-click model switching within the IDE</li><li>Choose the best model for each task</li><li>Premium requests consumed based on model complexity</li></ul><h4 id="3-Agent-Mode"><a href="#3-Agent-Mode" class="headerlink" title="3. Agent Mode"></a>3. Agent Mode</h4><ul><li>Autonomously understands requirements and executes multi-step coding tasks</li><li>Auto-reads files, edits code, runs terminal commands</li><li>End-to-end automated coding experience</li></ul><h4 id="4-Coding-Scenario-Optimization"><a href="#4-Coding-Scenario-Optimization" class="headerlink" title="4. Coding-Scenario Optimization"></a>4. 
Coding-Scenario Optimization</h4><ul><li><strong>Code completion</strong>: Real-time context-aware completion</li><li><strong>Code generation</strong>: Generate code from natural language</li><li><strong>Code explanation</strong>: Understand complex code logic</li><li><strong>Bug fixing</strong>: Intelligently locate and fix issues</li><li><strong>Code refactoring</strong>: Optimize code structure and quality</li><li><strong>Unit testing</strong>: Auto-generate test cases</li></ul><h3 id="Best-For-8"><a href="#Best-For-8" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Everyday coding development (top recommendation)</li><li>Code review and refactoring</li><li>Learning new languages and frameworks</li><li>Rapid prototyping</li><li>Team collaborative coding</li></ul><h3 id="Pros-Cons-8"><a href="#Pros-Cons-8" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Extremely competitive price ($10&#x2F;month, outstanding value)</li><li>✅ Multi-model support with free switching</li><li>✅ Best-in-class IDE integration experience</li><li>✅ Agent mode for strong automation</li><li>✅ Deep Microsoft ecosystem integration</li><li>✅ Free for students</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Primarily focused on coding; general conversation is limited</li><li>❌ Monthly cap on Premium requests</li><li>❌ Tied to the GitHub ecosystem</li><li>❌ Enterprise tier is pricey</li></ul><h3 id="Personal-Take"><a href="#Personal-Take" class="headerlink" title="Personal Take"></a>Personal Take</h3><p>For developers, GitHub Copilot Pro is one of the most worthwhile AI subscriptions out there. <strong>$10&#x2F;month</strong> gets you unlimited code completions plus 300 multi-model Premium requests — just the right amount, not too little and nothing wasted. 
Compared to subscribing separately to ChatGPT Plus ($20) or Claude Pro ($20), Copilot Pro delivers clearly better value, and since you use it directly inside the IDE, your workflow stays seamless.</p><h3 id="Docs-Resources-8"><a href="#Docs-Resources-8" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a href="https://docs.github.com/en/copilot">https://docs.github.com/en/copilot</a></li><li><strong>Model list</strong>: <a href="https://docs.github.com/copilot/reference/ai-models/supported-models">https://docs.github.com/copilot/reference/ai-models/supported-models</a></li><li><strong>GitHub Blog</strong>: <a href="https://github.blog/">https://github.blog</a></li></ul><hr><h1 id="Third-Party-AI-Providers"><a href="#Third-Party-AI-Providers" class="headerlink" title="Third-Party AI Providers"></a>Third-Party AI Providers</h1><h2 id="OpenRouter"><a href="#OpenRouter" class="headerlink" title="OpenRouter"></a>OpenRouter</h2><h3 id="Official-Website-1"><a href="#Official-Website-1" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://openrouter.ai/">https://openrouter.ai/</a></li></ul><h3 id="Overview-9"><a href="#Overview-9" class="headerlink" title="Overview"></a>Overview</h3><p>OpenRouter is a <strong>unified AI model API gateway</strong> that provides a single interface for accessing hundreds of AI models. Its core idea is to let developers connect to multiple AI providers through one API, simplifying integration work.</p><h3 id="Core-Strengths-9"><a href="#Core-Strengths-9" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Model-Ecosystem-300-Models"><a href="#1-Model-Ecosystem-300-Models" class="headerlink" title="1. Model Ecosystem (300+ Models)"></a>1. 
Model Ecosystem (300+ Models)</h4><ul><li><strong>Large language models</strong>: GPT-4, Claude, DeepSeek, GLM, Llama, Mistral, and more</li><li><strong>Image models</strong>: DALL-E, Stable Diffusion, Midjourney, and more</li><li><strong>Multimodal</strong>: Vision models, audio, video</li></ul><h4 id="2-Smart-Routing-System"><a href="#2-Smart-Routing-System" class="headerlink" title="2. Smart Routing System"></a>2. Smart Routing System</h4><ul><li><strong>Model Fallbacks</strong>: Automatic failover</li><li><strong>Provider Routing</strong>: Intelligent routing selection</li><li><strong>Auto Router</strong>: Automatically select the best model (powered by NotDiamond)</li><li>Optimization by cost, performance, or reliability</li></ul><h4 id="3-Advanced-Features-1"><a href="#3-Advanced-Features-1" class="headerlink" title="3. Advanced Features"></a>3. Advanced Features</h4><h5 id="Multimodal-Support"><a href="#Multimodal-Support" class="headerlink" title="Multimodal Support"></a>Multimodal Support</h5><ul><li><strong>Image Inputs</strong>: Send images to vision models</li><li><strong>Image Generation</strong>: Generate images</li><li><strong>PDF Inputs</strong>: Process PDF documents</li><li><strong>Audio</strong>: Voice input&#x2F;output</li><li><strong>Video Inputs</strong>: Video processing</li></ul><h5 id="Enhancement-Features"><a href="#Enhancement-Features" class="headerlink" title="Enhancement Features"></a>Enhancement Features</h5><ul><li><strong>Zero Data Retention (ZDR)</strong>: No data retained</li><li><strong>Structured Outputs</strong>: JSON Schema validation</li><li><strong>Web Search</strong>: Real-time web search</li><li><strong>Prompt Caching</strong>: Cache prompts to reduce costs</li><li><strong>Response Healing</strong>: Auto-fix malformed responses</li><li><strong>Zero Completion Insurance</strong>: No charge for failed responses</li></ul><h4 id="4-Model-Variants"><a href="#4-Model-Variants" class="headerlink" title="4. Model Variants"></a>4. 
Model Variants</h4><ul><li><code>:free</code> - Free model variant</li><li><code>:extended</code> - Extended context window</li><li><code>:exacto</code> - Prioritizes tool call quality</li><li><code>:thinking</code> - Extended reasoning</li><li><code>:online</code> - Real-time web search</li><li><code>:nitro</code> - High-speed inference</li></ul><h4 id="5-Developer-Experience-2"><a href="#5-Developer-Experience-2" class="headerlink" title="5. Developer Experience"></a>5. Developer Experience</h4><h5 id="SDKs-Frameworks"><a href="#SDKs-Frameworks" class="headerlink" title="SDKs &amp; Frameworks"></a>SDKs &amp; Frameworks</h5><ul><li><strong>Official SDKs</strong>: TypeScript, Python</li><li><strong>Compatible with</strong>: OpenAI SDK, Anthropic Agent SDK</li><li><strong>Frameworks</strong>: LangChain, Vercel AI SDK, PydanticAI, TanStack AI</li><li><strong>Tools</strong>: Zapier, Infisical, LiveKit</li></ul><h5 id="Integration-Tools"><a href="#Integration-Tools" class="headerlink" title="Integration Tools"></a>Integration Tools</h5><ul><li><strong>BYOK (Bring Your Own Key)</strong>: Use your own API keys</li><li><strong>Guardrails</strong>: Data policies and model access restrictions</li><li><strong>Broadcast</strong>: Integrates with Langfuse, Datadog, Braintrust, and more</li></ul><h4 id="6-Management-Features"><a href="#6-Management-Features" class="headerlink" title="6. Management Features"></a>6. 
Management Features</h4><ul><li><strong>Organization Management</strong>: Team collaboration and API key management</li><li><strong>App Attribution</strong>: Application attribution and ranking</li><li><strong>Activity Export</strong>: Usage data export</li><li><strong>Crypto API</strong>: Cryptocurrency payment support</li></ul><h3 id="Pricing"><a href="#Pricing" class="headerlink" title="Pricing"></a>Pricing</h3><ul><li><strong>Billing</strong>: Per token</li><li><strong>Transparent pricing</strong>: Clear pricing per model</li><li><strong>Cost optimization</strong>: Smart routing reduces costs</li><li><strong>Free models</strong>: Some models available in <code>:free</code> variant</li></ul><h3 id="Best-For-9"><a href="#Best-For-9" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Projects connecting to multiple AI models simultaneously</li><li>Enterprise apps requiring high availability and failover</li><li>Developers looking to reduce migration costs</li><li>Scenarios requiring flexible model switching</li><li>A&#x2F;B testing across different models</li></ul><h3 id="Pros-Cons-9"><a href="#Pros-Cons-9" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Unified interface, lower integration complexity</li><li>✅ Hundreds of models, rich selection</li><li>✅ Smart routing and failover</li><li>✅ Advanced features (caching, structured outputs, etc.)</li><li>✅ Compatible with major SDKs, low learning curve</li><li>✅ Active community and ecosystem</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Extra middleware layer, possible additional latency</li><li>❌ Dependent on OpenRouter’s service stability</li><li>❌ Some advanced features may cost extra</li></ul><h3 id="Docs-Resources-9"><a href="#Docs-Resources-9" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Official docs</strong>: <a 
href="https://openrouter.ai/docs">https://openrouter.ai/docs</a></li><li><strong>GitHub</strong>: <a href="https://github.com/openrouter">https://github.com/openrouter</a></li><li><strong>Community projects</strong>: Awesome OpenRouter</li></ul><hr><h2 id="Together-AI"><a href="#Together-AI" class="headerlink" title="Together AI"></a>Together AI</h2><h3 id="Official-Website-2"><a href="#Official-Website-2" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://www.together.ai/">https://www.together.ai/</a></li></ul><h3 id="Overview-10"><a href="#Overview-10" class="headerlink" title="Overview"></a>Overview</h3><p>Together AI is an <strong>AI infrastructure provider</strong> offering hosted inference for open-source models, along with custom model training and deployment services.</p><h3 id="Core-Strengths-10"><a href="#Core-Strengths-10" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Open-Source-Model-Hosting"><a href="#1-Open-Source-Model-Hosting" class="headerlink" title="1. Open-Source Model Hosting"></a>1. Open-Source Model Hosting</h4><ul><li><strong>Llama series</strong>: Llama 3, Llama 2, and more</li><li><strong>Mistral series</strong>: Mistral, Mixtral, and more</li><li><strong>Other open-source models</strong>: Falcon, Vicuna, and more</li><li>Regular updates with the latest open-source models</li></ul><h4 id="2-High-Performance-Inference"><a href="#2-High-Performance-Inference" class="headerlink" title="2. High-Performance Inference"></a>2. High-Performance Inference</h4><ul><li><strong>GPU optimization</strong>: Optimized for specific GPUs</li><li><strong>Flash Attention</strong>: Accelerated inference</li><li><strong>Low latency</strong>: Optimized inference engine</li><li><strong>High throughput</strong>: Supports large-scale concurrency</li></ul><h4 id="3-Custom-Models"><a href="#3-Custom-Models" class="headerlink" title="3. Custom Models"></a>3. 
Custom Models</h4><ul><li><strong>Model fine-tuning</strong>: Fine-tuning service</li><li><strong>Custom training</strong>: Train on your own data</li><li><strong>Model evaluation</strong>: Model performance benchmarking tools</li><li><strong>Model deployment</strong>: One-click deployment</li></ul><h4 id="4-Developer-Tools-3"><a href="#4-Developer-Tools-3" class="headerlink" title="4. Developer Tools"></a>4. Developer Tools</h4><ul><li><strong>Python SDK</strong>: Full Python client</li><li><strong>OpenAI-compatible</strong>: Works with the OpenAI SDK</li><li><strong>Monitoring and analytics</strong>: Usage tracking</li><li><strong>Cost management</strong>: Detailed cost analysis</li></ul><h4 id="5-Enterprise-Features"><a href="#5-Enterprise-Features" class="headerlink" title="5. Enterprise Features"></a>5. Enterprise Features</h4><ul><li><strong>Private deployment</strong>: Support for private cloud</li><li><strong>Data privacy</strong>: GDPR compliant</li><li><strong>SLA guarantees</strong>: Enterprise service levels</li><li><strong>Technical support</strong>: Professional team support</li></ul><h3 id="Pricing-1"><a href="#Pricing-1" class="headerlink" title="Pricing"></a>Pricing</h3><ul><li><strong>Billing</strong>: Per token</li><li><strong>Transparent pricing</strong>: Open-source model prices generally lower than proprietary</li><li><strong>Volume discounts</strong>: Discounts for high usage</li><li><strong>Reserved instances</strong>: Reserve capacity for long-term use</li></ul><h3 id="Best-For-10"><a href="#Best-For-10" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Developers who prefer open-source models</li><li>Teams needing custom model training</li><li>Cost-sensitive large-scale applications</li><li>Enterprises requiring private deployment</li></ul><h3 id="Pros-Cons-10"><a href="#Pros-Cons-10" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Rich open-source model 
ecosystem</li><li>✅ Well-optimized performance, fast</li><li>✅ Supports custom model training</li><li>✅ Strong openness and control</li><li>✅ Relatively lower costs</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Does not include proprietary models like GPT or Claude</li><li>❌ Model capabilities may not match proprietary models</li><li>❌ Limited multimodal support</li></ul><h3 id="Docs-Resources-10"><a href="#Docs-Resources-10" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Website</strong>: <a href="https://www.together.ai/">https://www.together.ai/</a></li><li><strong>Docs</strong>: <a href="https://docs.together.ai/">https://docs.together.ai/</a></li><li><strong>GitHub</strong>: <a href="https://github.com/togethercomputer">https://github.com/togethercomputer</a></li></ul><hr><h2 id="Replicate"><a href="#Replicate" class="headerlink" title="Replicate"></a>Replicate</h2><h3 id="Official-Website-3"><a href="#Official-Website-3" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://replicate.com/">https://replicate.com/</a></li></ul><h3 id="Overview-11"><a href="#Overview-11" class="headerlink" title="Overview"></a>Overview</h3><p>Replicate is an <strong>AI model hosting platform</strong> that makes it easy for developers to run open-source AI models — including large language models, image generation, audio processing, and more.</p><h3 id="Core-Strengths-11"><a href="#Core-Strengths-11" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Rich-Model-Library"><a href="#1-Rich-Model-Library" class="headerlink" title="1. Rich Model Library"></a>1. 
Rich Model Library</h4><ul><li><strong>Language models</strong>: Llama, Mistral, Falcon, and more</li><li><strong>Image generation</strong>: Stable Diffusion series</li><li><strong>Image processing</strong>: Super-resolution, inpainting, style transfer, and more</li><li><strong>Audio processing</strong>: Speech synthesis, recognition, and more</li><li><strong>Video generation</strong>: Video synthesis and editing</li><li><strong>Other models</strong>: OCR, NLP, and more</li></ul><h4 id="2-Ease-of-Use"><a href="#2-Ease-of-Use" class="headerlink" title="2. Ease of Use"></a>2. Ease of Use</h4><ul><li><strong>Simple API</strong>: Clean REST API</li><li><strong>Python SDK</strong>: Python client</li><li><strong>Web Playground</strong>: Test models online</li><li><strong>Rich examples</strong>: Extensive usage examples</li></ul><h4 id="3-Custom-Models-1"><a href="#3-Custom-Models-1" class="headerlink" title="3. Custom Models"></a>3. Custom Models</h4><ul><li><strong>Upload models</strong>: Upload your own models</li><li><strong>Docker support</strong>: Docker-based model deployment</li><li><strong>Cog</strong>: Package models with Cog, Replicate’s open-source container tooling</li><li><strong>Version control</strong>: Model versioning</li></ul><h4 id="4-Community-Ecosystem"><a href="#4-Community-Ecosystem" class="headerlink" title="4. Community Ecosystem"></a>4. Community Ecosystem</h4><ul><li><strong>Model sharing</strong>: Community model library</li><li><strong>Fork models</strong>: Build on others’ models</li><li><strong>Open-source friendly</strong>: Large open-source model collection</li></ul><h4 id="5-Developer-Experience-3"><a href="#5-Developer-Experience-3" class="headerlink" title="5. Developer Experience"></a>5. 
Developer Experience</h4><ul><li><strong>Live preview</strong>: Preview model output online</li><li><strong>Debugging tools</strong>: Convenient debugging and optimization</li><li><strong>Monitoring dashboard</strong>: Usage and cost monitoring</li><li><strong>Webhooks</strong>: Async task callbacks</li></ul><h3 id="Pricing-2"><a href="#Pricing-2" class="headerlink" title="Pricing"></a>Pricing</h3><ul><li><strong>Billing</strong>: By compute time</li><li><strong>Transparent pricing</strong>: Clear hourly cost</li><li><strong>Free credits</strong>: Free credits for new users</li><li><strong>Pay-as-you-go</strong>: Flexible billing</li></ul><h3 id="Best-For-11"><a href="#Best-For-11" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Rapid prototyping</li><li>Testing different models</li><li>Small-scale applications</li><li>Teams needing diverse model types</li><li>Open-source model enthusiasts</li></ul><h3 id="Pros-Cons-11"><a href="#Pros-Cons-11" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Very rich model library</li><li>✅ Easy to use, quick to get started</li><li>✅ Supports custom models</li><li>✅ Active community</li><li>✅ Relatively affordable</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Does not include proprietary models (GPT, Claude)</li><li>❌ Performance may not match dedicated services</li><li>❌ Limited enterprise features</li><li>❌ Multimodal integration requires manual handling</li></ul><h3 id="Docs-Resources-11"><a href="#Docs-Resources-11" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Website</strong>: <a href="https://replicate.com/">https://replicate.com/</a></li><li><strong>Docs</strong>: <a href="https://replicate.com/docs">https://replicate.com/docs</a></li><li><strong>GitHub</strong>: <a href="https://github.com/replicate">https://github.com/replicate</a></li></ul><hr><h2 id="Fireworks-ai"><a href="#Fireworks-ai" 
class="headerlink" title="Fireworks.ai"></a>Fireworks.ai</h2><h3 id="Official-Website-4"><a href="#Official-Website-4" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://fireworks.ai/">https://fireworks.ai/</a></li></ul><h3 id="Overview-12"><a href="#Overview-12" class="headerlink" title="Overview"></a>Overview</h3><p>Fireworks.ai is a <strong>high-performance AI inference platform</strong> focused on delivering fast, low-cost AI model inference.</p><h3 id="Core-Strengths-12"><a href="#Core-Strengths-12" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-High-Performance-Inference"><a href="#1-High-Performance-Inference" class="headerlink" title="1. High-Performance Inference"></a>1. High-Performance Inference</h4><ul><li><strong>Ultra-fast inference</strong>: Industry-leading inference speed</li><li><strong>Low latency</strong>: Optimized inference engine</li><li><strong>High throughput</strong>: Supports large-scale concurrency</li><li><strong>GPU optimization</strong>: Deep hardware-level optimization</li></ul><h4 id="2-Model-Ecosystem"><a href="#2-Model-Ecosystem" class="headerlink" title="2. Model Ecosystem"></a>2. Model Ecosystem</h4><ul><li><strong>Open-source models</strong>: Llama, Mistral, and more</li><li><strong>Optimized models</strong>: Fireworks-optimized model variants</li><li><strong>Custom models</strong>: Support for custom model deployment</li><li><strong>Multimodal</strong>: Text, images, and more</li></ul><h4 id="3-Cost-Advantage-1"><a href="#3-Cost-Advantage-1" class="headerlink" title="3. Cost Advantage"></a>3. 
Cost Advantage</h4><ul><li><strong>Transparent pricing</strong>: Clear billing</li><li><strong>Pay-as-you-go</strong>: Flexible billing model</li><li><strong>Volume discounts</strong>: Discounts for high usage</li><li><strong>Reserved instances</strong>: Lower costs for long-term use</li></ul><h4 id="4-Developer-Experience"><a href="#4-Developer-Experience" class="headerlink" title="4. Developer Experience"></a>4. Developer Experience</h4><ul><li><strong>OpenAI-compatible</strong>: Works with the OpenAI SDK</li><li><strong>Python SDK</strong>: Full Python client</li><li><strong>REST API</strong>: Standard REST interface</li><li><strong>Monitoring tools</strong>: Usage tracking</li></ul><h4 id="5-Enterprise-Features-1"><a href="#5-Enterprise-Features-1" class="headerlink" title="5. Enterprise Features"></a>5. Enterprise Features</h4><ul><li><strong>Private deployment</strong>: Private cloud support</li><li><strong>Data security</strong>: Enterprise-grade security</li><li><strong>SLA guarantees</strong>: Service level agreements</li><li><strong>Technical support</strong>: Professional support team</li></ul><h3 id="Technical-Highlights"><a href="#Technical-Highlights" class="headerlink" title="Technical Highlights"></a>Technical Highlights</h3><ul><li><strong>Flash Attention</strong>: Accelerated attention computation</li><li><strong>KV Cache</strong>: Optimized caching mechanism</li><li><strong>Quantization</strong>: Model quantization to reduce costs</li><li><strong>Distributed inference</strong>: Distributed deployment support</li></ul><h3 id="Pricing-3"><a href="#Pricing-3" class="headerlink" title="Pricing"></a>Pricing</h3><ul><li><strong>Billing</strong>: Per token</li><li><strong>Cost advantage</strong>: Competitive pricing compared to other providers</li><li><strong>Flexible billing</strong>: Multiple billing modes supported</li></ul><h3 id="Best-For-12"><a href="#Best-For-12" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Performance-demanding 
applications</li><li>Cost-sensitive large-scale applications</li><li>Scenarios requiring low latency</li><li>Projects preferring open-source models</li></ul><h3 id="Pros-Cons-12"><a href="#Pros-Cons-12" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Extremely fast inference speed</li><li>✅ Clear cost advantage</li><li>✅ OpenAI-compatible, low migration cost</li><li>✅ Well-optimized performance</li><li>✅ Enterprise-grade features</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Relatively fewer models</li><li>❌ Does not include proprietary models</li><li>❌ Limited multimodal support</li><li>❌ Smaller community ecosystem</li></ul><h3 id="Docs-Resources-12"><a href="#Docs-Resources-12" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Website</strong>: <a href="https://fireworks.ai/">https://fireworks.ai/</a></li><li><strong>Docs</strong>: <a href="https://fireworks.ai/docs">https://fireworks.ai/docs</a></li></ul><hr><h2 id="Hugging-Face-Inference"><a href="#Hugging-Face-Inference" class="headerlink" title="Hugging Face Inference"></a>Hugging Face Inference</h2><h3 id="Official-Website-5"><a href="#Official-Website-5" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://huggingface.co/">https://huggingface.co/</a></li></ul><h3 id="Overview-13"><a href="#Overview-13" class="headerlink" title="Overview"></a>Overview</h3><p>Hugging Face is the <strong>largest open-source model community</strong>, offering model hosting, inference services, datasets, and more. Hugging Face Inference is its inference API service.</p><h3 id="Core-Strengths-13"><a href="#Core-Strengths-13" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Model-Ecosystem-Largest"><a href="#1-Model-Ecosystem-Largest" class="headerlink" title="1. Model Ecosystem (Largest)"></a>1. 
Model Ecosystem (Largest)</h4><ul><li><strong>Massive model library</strong>: Tens of thousands of models</li><li><strong>Language models</strong>: Llama, Mistral, BERT, T5, and more</li><li><strong>Image models</strong>: Stable Diffusion, ViT, and more</li><li><strong>Audio models</strong>: Whisper, AudioLDM, and more</li><li><strong>Multimodal</strong>: All kinds of multimodal models</li></ul><h4 id="2-Community-Driven"><a href="#2-Community-Driven" class="headerlink" title="2. Community-Driven"></a>2. Community-Driven</h4><ul><li><strong>Open-source ecosystem</strong>: Largest open-source model community</li><li><strong>Model sharing</strong>: Users can share their models</li><li><strong>Collaborative development</strong>: Community-driven model improvements</li><li><strong>Rich resources</strong>: Tutorials, docs, and examples galore</li></ul><h4 id="3-Inference-Services"><a href="#3-Inference-Services" class="headerlink" title="3. Inference Services"></a>3. Inference Services</h4><ul><li><strong>Serverless API</strong>: Serverless inference</li><li><strong>Inference Endpoints</strong>: Dedicated inference endpoints</li><li><strong>Private deployment</strong>: Private cloud support</li><li><strong>GPU acceleration</strong>: GPU-accelerated inference</li></ul><h4 id="4-Developer-Tools-4"><a href="#4-Developer-Tools-4" class="headerlink" title="4. Developer Tools"></a>4. Developer Tools</h4><ul><li><strong>Python SDK</strong>: <code>huggingface_hub</code> for hosted inference, <code>transformers</code> for running models locally</li><li><strong>JavaScript SDK</strong>: Browser support</li><li><strong>API clients</strong>: Clients for multiple languages</li><li><strong>Web UI</strong>: Online testing and demos</li></ul><h4 id="5-Enterprise-Features-2"><a href="#5-Enterprise-Features-2" class="headerlink" title="5. Enterprise Features"></a>5. 
Enterprise Features</h4><ul><li><strong>Inference Endpoints</strong>: Enterprise-grade inference endpoints</li><li><strong>Data security</strong>: GDPR compliant</li><li><strong>SLA guarantees</strong>: Service level agreements</li><li><strong>Private repositories</strong>: Private model repositories</li></ul><h3 id="Pricing-4"><a href="#Pricing-4" class="headerlink" title="Pricing"></a>Pricing</h3><ul><li><strong>Serverless</strong>: Pay per usage</li><li><strong>Inference Endpoints</strong>: Hourly billing (monthly&#x2F;annual)</li><li><strong>Free tier</strong>: Free usage available</li><li><strong>Enterprise pricing</strong>: Customized enterprise plans</li></ul><h3 id="Best-For-13"><a href="#Best-For-13" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Teams needing specific open-source models</li><li>Open-source model enthusiasts</li><li>Research and experimentation</li><li>Projects requiring diverse model choices</li><li>Open-source initiatives</li></ul><h3 id="Pros-Cons-13"><a href="#Pros-Cons-13" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Most models of any platform</li><li>✅ Richest community ecosystem</li><li>✅ Open-source friendly</li><li>✅ Rich documentation and tutorials</li><li>✅ Supports virtually all open-source models</li></ul><p><strong>Cons:</strong></p><ul><li>❌ Does not include proprietary models (GPT, Claude)</li><li>❌ Performance may not match dedicated providers</li><li>❌ Enterprise-grade features require extra payment</li><li>❌ Inference speed may be slower</li></ul><h3 id="Docs-Resources-13"><a href="#Docs-Resources-13" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; Resources</h3><ul><li><strong>Website</strong>: <a href="https://huggingface.co/">https://huggingface.co/</a></li><li><strong>Docs</strong>: <a href="https://huggingface.co/docs/api-inference">https://huggingface.co/docs/api-inference</a></li><li><strong>GitHub</strong>: <a 
href="https://github.com/huggingface">https://github.com/huggingface</a></li></ul><hr><h2 id="SiliconFlow"><a href="#SiliconFlow" class="headerlink" title="SiliconFlow"></a>SiliconFlow</h2><h3 id="Official-Website-6"><a href="#Official-Website-6" class="headerlink" title="Official Website"></a>Official Website</h3><ul><li><a href="https://siliconflow.cn/">https://siliconflow.cn/</a></li></ul><h3 id="Overview-14"><a href="#Overview-14" class="headerlink" title="Overview"></a>Overview</h3><p>SiliconFlow is a Chinese company aiming to become a leading global <strong>AI capability provider</strong>. It offers multimodal model capabilities spanning language, speech, images, and video, aggregating both domestic and international model sources.</p><h3 id="Core-Strengths-14"><a href="#Core-Strengths-14" class="headerlink" title="Core Strengths"></a>Core Strengths</h3><h4 id="1-Full-Scenario-Product-Matrix-Multimodal-Aggregation"><a href="#1-Full-Scenario-Product-Matrix-Multimodal-Aggregation" class="headerlink" title="1. Full-Scenario Product Matrix (Multimodal Aggregation)"></a>1. Full-Scenario Product Matrix (Multimodal Aggregation)</h4><ul><li><strong>Language models</strong>: DeepSeek-R1, DeepSeek-V3, QwQ-32B, GLM-4-9B-Chat, and more</li><li><strong>Voice models</strong>: CosyVoice2-0.5B</li><li><strong>Image models</strong>: Kolors</li><li><strong>Video models</strong>: HunyuanVideo-HD, Wan2.1-I2V-14B-720P, Wan2.1-T2V-14B, and more</li></ul><h4 id="2-Performance-Optimization"><a href="#2-Performance-Optimization" class="headerlink" title="2. Performance Optimization"></a>2. Performance Optimization</h4><ul><li><strong>High-speed inference</strong>: Language model speed improved by <strong>10x+</strong></li><li><strong>Low latency</strong>: Voice generation latency as low as <strong>100ms</strong></li><li>Deep optimization for domestic Chinese models</li></ul><h4 id="3-Cost-Advantage-2"><a href="#3-Cost-Advantage-2" class="headerlink" title="3. Cost Advantage"></a>3. 
Cost Advantage</h4><ul><li>Image generation costs reduced by <strong>66%</strong></li><li>Language model costs reduced by <strong>46%</strong></li><li>Customer hosting costs reduced by <strong>52%</strong></li></ul><h4 id="4-Enterprise-Grade-Features"><a href="#4-Enterprise-Grade-Features" class="headerlink" title="4. Enterprise-Grade Features"></a>4. Enterprise-Grade Features</h4><h5 id="High-Stability"><a href="#High-Stability" class="headerlink" title="High Stability"></a>High Stability</h5><ul><li>High reliability, validated by developers</li><li>Comprehensive monitoring and fault-tolerance</li><li>Enterprise-grade professional technical support</li></ul><h5 id="High-Security"><a href="#High-Security" class="headerlink" title="High Security"></a>High Security</h5><ul><li><strong>BYOC deployment</strong>: Protect data privacy</li><li><strong>Compute&#x2F;network&#x2F;storage isolation</strong>: Comprehensive security</li><li>Meets industry standards and compliance requirements</li><li>Supports deployment entirely within China</li></ul><h5 id="High-Scalability"><a href="#High-Scalability" class="headerlink" title="High Scalability"></a>High Scalability</h5><ul><li>Dynamic scaling to support elastic workloads</li><li>One-click custom model deployment</li><li>Hybrid cloud deployment support</li></ul><h4 id="5-Intelligent-Capabilities"><a href="#5-Intelligent-Capabilities" class="headerlink" title="5. Intelligent Capabilities"></a>5. 
Intelligent Capabilities</h4><ul><li>Smart scaling for flexible business growth</li><li>Intelligent cost analysis for budget control</li><li>Access to multiple advanced model services</li></ul><h3 id="Technical-Advantages"><a href="#Technical-Advantages" class="headerlink" title="Technical Advantages"></a>Technical Advantages</h3><ul><li>Deep optimization for Chinese LLMs (DeepSeek, GLM, etc.)</li><li>Comprehensive multimodal capabilities</li><li>Enterprise deployment solutions</li><li>Compliant with Chinese data regulations</li><li>Localized service support</li></ul><h3 id="Pricing-5"><a href="#Pricing-5" class="headerlink" title="Pricing"></a>Pricing</h3><ul><li><strong>Billing</strong>: Per token or per call</li><li><strong>Cost advantage</strong>: Significant savings compared to overseas providers</li><li><strong>Flexible plans</strong>: Multiple pricing options available</li></ul><h3 id="Best-For-14"><a href="#Best-For-14" class="headerlink" title="Best For"></a>Best For</h3><ul><li>Enterprises in China using Chinese LLMs</li><li>Multimodal AI application development</li><li>Scenarios with strict data security and compliance requirements</li><li>Cost-sensitive projects</li><li>Enterprise-grade deployment scenarios</li></ul><h3 id="Pros-Cons-14"><a href="#Pros-Cons-14" class="headerlink" title="Pros &amp; Cons"></a>Pros &amp; Cons</h3><p><strong>Pros:</strong></p><ul><li>✅ Clear cost advantage</li><li>✅ Comprehensive multimodal capabilities</li><li>✅ Well-optimized for Chinese models</li><li>✅ Compliant with Chinese regulations</li><li>✅ Localized service support</li><li>✅ Comprehensive enterprise features</li></ul><p><strong>Cons:</strong></p><ul><li>❌ International model coverage not as broad as OpenRouter</li><li>❌ Documentation and community relatively new</li><li>❌ Lower degree of internationalization</li></ul><h3 id="Docs-Resources-14"><a href="#Docs-Resources-14" class="headerlink" title="Docs &amp; Resources"></a>Docs &amp; 
Resources</h3><ul><li><strong>Website</strong>: <a href="https://siliconflow.cn/">https://siliconflow.cn/</a></li><li><strong>API docs</strong>: <a href="https://docs.siliconflow.cn/">https://docs.siliconflow.cn/</a></li></ul><hr><h2 id="Comparison-Summary"><a href="#Comparison-Summary" class="headerlink" title="Comparison Summary"></a>Comparison Summary</h2><h3 id="Native-Providers-vs-Third-Party-Providers"><a href="#Native-Providers-vs-Third-Party-Providers" class="headerlink" title="Native Providers vs Third-Party Providers"></a>Native Providers vs Third-Party Providers</h3><table><thead><tr><th>Feature</th><th>Native Providers</th><th>Third-Party Providers</th></tr></thead><tbody><tr><td><strong>Model capability</strong></td><td>Strongest</td><td>Depends on upstream</td></tr><tr><td><strong>Model variety</strong></td><td>Single vendor</td><td>Rich selection</td></tr><tr><td><strong>Unified interface</strong></td><td>Per vendor</td><td>✅ Unified interface</td></tr><tr><td><strong>Smart routing</strong></td><td>❌</td><td>✅</td></tr><tr><td><strong>Failover</strong></td><td>❌</td><td>✅</td></tr><tr><td><strong>Integration complexity</strong></td><td>High (multi-vendor)</td><td>Low</td></tr><tr><td><strong>Vendor lock-in</strong></td><td>High</td><td>Low</td></tr><tr><td><strong>Latency</strong></td><td>Low</td><td>Slightly higher</td></tr><tr><td><strong>Stability</strong></td><td>High</td><td>Platform-dependent</td></tr><tr><td><strong>Cost</strong></td><td>Higher</td><td>More optimization room</td></tr><tr><td><strong>Ecosystem</strong></td><td>Mature but closed</td><td>Open</td></tr><tr><td><strong>Enterprise features</strong></td><td>Comprehensive</td><td>Partial support</td></tr><tr><td><strong>Compliance</strong></td><td>Needs verification</td><td>Mixed</td></tr></tbody></table><h3 id="Quick-Comparison-Table-All-Providers"><a href="#Quick-Comparison-Table-All-Providers" class="headerlink" title="Quick Comparison Table (All Providers)"></a>Quick Comparison 
Table (All Providers)</h3><table><thead><tr><th>Feature</th><th>OpenAI</th><th>Google</th><th>Claude</th><th>Zhipu</th><th>Baidu</th><th>Alibaba</th><th>Doubao</th><th>Kimi</th><th>GitHub Copilot</th><th>OpenRouter</th><th>Together</th><th>Replicate</th><th>Fireworks</th><th>HF</th><th>SiliconFlow</th></tr></thead><tbody><tr><td><strong>Type</strong></td><td>Native</td><td>Native</td><td>Native</td><td>Native</td><td>Native</td><td>Native</td><td>Native</td><td>Native</td><td>Coding tool</td><td>Third-party</td><td>Third-party</td><td>Third-party</td><td>Third-party</td><td>Third-party</td><td>Third-party</td></tr><tr><td><strong>Model capability</strong></td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐</td></tr><tr><td><strong>Model variety</strong></td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>Multi-provider</td><td>300+</td><td>50+</td><td>Thousands</td><td>20+</td><td>Tens of thousands</td><td>Multiple</td></tr><tr><td><strong>Chinese</strong></td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td></tr><tr><td><strong>Multimodal</strong></td><td>✅</td><td>✅</td><td>Partial</td><td>✅</td><td>Partial</td><td>Partial</td><td>✅</td><td>❌</td><td>❌</td><td>✅</td><td>Partial</td><td>✅</td><td>Partial</td><td>✅</td><td>✅</td></tr><tr><td><strong>Smart routing</strong></td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>✅</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>Partial</td></tr><tr><td><strong>Cost</strong></td><td>High</td><td>Medium</td><td>Medium-high</td><td>Medium</td><td>Medium</td><td>Medium</td><td>Low</td><td>Medium</td><td><strong>Extremely 
low</strong></td><td>Medium</td><td>Low</td><td>Low</td><td>Low</td><td>Medium</td><td>Low</td></tr><tr><td><strong>Enterprise features</strong></td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐</td></tr><tr><td><strong>Documentation</strong></td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐</td></tr><tr><td><strong>Community</strong></td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐</td><td>⭐⭐⭐</td><td>⭐⭐⭐⭐⭐</td><td>⭐⭐⭐</td></tr><tr><td><strong>Compliance</strong></td><td>Low</td><td>Low</td><td>Low</td><td>High</td><td>High</td><td>High</td><td>High</td><td>High</td><td>Medium</td><td>Medium</td><td>Medium</td><td>Medium</td><td>Medium</td><td>Medium</td><td>High</td></tr></tbody></table><h3 id="Recommendations"><a href="#Recommendations" class="headerlink" title="Recommendations"></a>Recommendations</h3><h4 id="Choose-a-native-provider-if"><a href="#Choose-a-native-provider-if" class="headerlink" title="Choose a native provider if:"></a>Choose a native provider if:</h4><p><strong>OpenAI</strong></p><ul><li>You need peak model performance</li><li>Enterprise-grade apps with high stability requirements</li><li>Global products needing multilingual support</li><li>You don’t want to rely on third parties</li><li>Cost is not a primary concern</li></ul><p><strong>Google Gemini</strong></p><ul><li>You need Google ecosystem integration</li><li>Multimodal application development</li><li>You’re on Google Cloud</li><li>You need MLOps capabilities</li></ul><p><strong>Anthropic Claude</strong></p><ul><li>High safety requirements</li><li>You need long context (200K)</li><li>Coding assistant 
tools</li><li>Chatbots</li></ul><p><strong>Zhipu AI</strong></p><ul><li>Domestic Chinese application development</li><li>Applications primarily in Chinese</li><li>Strict compliance requirements</li><li>Cost-sensitive</li></ul><p><strong>Baidu ERNIE Bot</strong></p><ul><li>Baidu ecosystem integration</li><li>You need Baidu Cloud services</li><li>Rapid deployment for small and medium-sized businesses</li></ul><p><strong>Alibaba Cloud Qwen</strong></p><ul><li>Existing Alibaba Cloud users</li><li>E-commerce applications</li><li>Open-source model preference</li></ul><p><strong>ByteDance Doubao</strong></p><ul><li>ByteDance ecosystem integration</li><li>Multimodal applications</li><li>Consumer-facing apps</li><li>Cost-sensitive</li></ul><p><strong>Moonshot Kimi</strong></p><ul><li>Long document analysis</li><li>Research and academic work</li><li>Personal knowledge management</li></ul><p><strong>GitHub Copilot</strong></p><ul><li>Everyday coding (<strong>strongly recommended</strong>)</li><li>Coding scenarios needing multi-model switching</li><li>You have a limited budget but need high-quality AI assistance</li><li>Seamless in-IDE use without switching between browser and editor</li></ul><h4 id="Choose-a-third-party-provider-if"><a href="#Choose-a-third-party-provider-if" class="headerlink" title="Choose a third-party provider if:"></a>Choose a third-party provider if:</h4><p><strong>OpenRouter</strong></p><ul><li>You need to connect to multiple models at once</li><li>You want smart routing and failover</li><li>You want to reduce vendor lock-in risk</li><li>You need A&#x2F;B testing</li></ul><p><strong>Together AI</strong></p><ul><li>You prefer open-source models</li><li>You need custom model training</li><li>Cost-sensitive large-scale applications</li></ul><p><strong>Replicate</strong></p><ul><li>Rapid prototyping</li><li>Testing different models</li><li>Small-scale applications</li><li>Open-source model enthusiasts</li></ul><p><strong>Fireworks.ai</strong></p><ul><li>Extremely high performance 
requirements</li><li>Cost-sensitive large-scale applications</li><li>Low latency requirements</li></ul><p><strong>Hugging Face</strong></p><ul><li>Specific open-source models</li><li>Research and experimentation</li><li>Community-driven development</li></ul><p><strong>SiliconFlow</strong></p><ul><li>Domestic enterprises</li><li>Multimodal applications</li><li>Strict compliance requirements</li><li>Cost-sensitive</li></ul><h3 id="Best-Practices"><a href="#Best-Practices" class="headerlink" title="Best Practices"></a>Best Practices</h3><h4 id="1-Hybrid-Strategy"><a href="#1-Hybrid-Strategy" class="headerlink" title="1. Hybrid Strategy"></a>1. Hybrid Strategy</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Core features → Native provider (stability, capability)</span><br><span class="line">Cost optimization → Third-party open-source models</span><br><span class="line">Compliance requirements → Locally compliant provider</span><br><span class="line">A/B testing → Third-party aggregation platform</span><br></pre></td></tr></table></figure><h4 id="2-Avoiding-Vendor-Lock-In"><a href="#2-Avoiding-Vendor-Lock-In" class="headerlink" title="2. Avoiding Vendor Lock-In"></a>2. Avoiding Vendor Lock-In</h4><ul><li>Use an abstraction layer to wrap the API</li><li>Design swappable model selection strategies</li><li>Maintain multi-provider backup plans</li></ul><h4 id="3-Cost-Optimization"><a href="#3-Cost-Optimization" class="headerlink" title="3. Cost Optimization"></a>3. Cost Optimization</h4><ul><li>Use caching to reduce repeated requests</li><li>Choose models based on task complexity</li><li>Monitor usage and costs</li><li>Take advantage of free quotas</li></ul><h4 id="4-Monitoring-and-Observability"><a href="#4-Monitoring-and-Observability" class="headerlink" title="4. 
Monitoring and Observability"></a>4. Monitoring and Observability</h4><ul><li>Track model performance metrics</li><li>Monitor usage and costs</li><li>Set up alerting mechanisms</li><li>Use platform analytics tools</li></ul><hr><h2 id="Learning-Resources"><a href="#Learning-Resources" class="headerlink" title="Learning Resources"></a>Learning Resources</h2><h3 id="Native-Providers"><a href="#Native-Providers" class="headerlink" title="Native Providers"></a>Native Providers</h3><ul><li><strong>OpenAI</strong>: <a href="https://platform.openai.com/docs">https://platform.openai.com/docs</a></li><li><strong>Google AI</strong>: <a href="https://ai.google.dev/docs">https://ai.google.dev/docs</a></li><li><strong>Anthropic</strong>: <a href="https://docs.anthropic.com/">https://docs.anthropic.com/</a></li><li><strong>Zhipu AI</strong>: <a href="https://open.bigmodel.cn/dev/api">https://open.bigmodel.cn/dev/api</a></li><li><strong>Baidu</strong>: <a href="https://cloud.baidu.com/doc/WENXINWORKSHOP/">https://cloud.baidu.com/doc/WENXINWORKSHOP/</a></li><li><strong>Alibaba Cloud</strong>: <a href="https://help.aliyun.com/zh/dashscope/">https://help.aliyun.com/zh/dashscope/</a></li><li><strong>ByteDance</strong>: <a href="https://platform.volcengine.com/">https://platform.volcengine.com/</a></li><li><strong>Kimi</strong>: <a href="https://www.moonshot.cn/">https://www.moonshot.cn/</a></li><li><strong>GitHub Copilot</strong>: <a href="https://docs.github.com/en/copilot">https://docs.github.com/en/copilot</a></li></ul><h3 id="Third-Party-Providers"><a href="#Third-Party-Providers" class="headerlink" title="Third-Party Providers"></a>Third-Party Providers</h3><ul><li><strong>OpenRouter</strong>: <a href="https://openrouter.ai/docs">https://openrouter.ai/docs</a></li><li><strong>Together AI</strong>: <a href="https://docs.together.ai/">https://docs.together.ai/</a></li><li><strong>Replicate</strong>: <a 
href="https://replicate.com/docs">https://replicate.com/docs</a></li><li><strong>Fireworks.ai</strong>: <a href="https://fireworks.ai/docs">https://fireworks.ai/docs</a></li><li><strong>Hugging Face</strong>: <a href="https://huggingface.co/docs/api-inference">https://huggingface.co/docs/api-inference</a></li><li><strong>SiliconFlow</strong>: <a href="https://docs.siliconflow.cn/">https://docs.siliconflow.cn/</a></li></ul><hr><h2 id="Search-Keywords"><a href="#Search-Keywords" class="headerlink" title="Search Keywords"></a>Search Keywords</h2><ul><li><code>AI subscription plan comparison</code></li><li><code>LLM API pricing</code></li><li><code>OpenAI vs Claude vs Google</code></li><li><code>third-party AI provider</code></li><li><code>Chinese AI model comparison</code></li><li><code>AI API aggregation platform</code></li><li><code>OpenRouter tutorial</code></li><li><code>AI inference platform</code></li></ul><hr><h2 id="Future-Updates"><a href="#Future-Updates" class="headerlink" title="Future Updates"></a>Future Updates</h2><p>This document will be updated continuously to track the latest developments and pricing changes from AI providers. I recommend checking each provider’s official announcements and changelogs regularly.</p><p><strong>Update plan:</strong></p><ul><li><input disabled="" type="checkbox"> Update pricing information</li><li><input disabled="" type="checkbox"> Add new models and services</li><li><input disabled="" type="checkbox"> Supplement with real-world use cases</li><li><input disabled="" type="checkbox"> Add performance benchmark data</li><li><input disabled="" type="checkbox"> Update compliance and privacy policies</li></ul><hr><p><em>This document is based on information as of March 2026. AI providers change rapidly — always refer to official sources for the latest information.</em></p>]]></content:encoded>
      
      
      
      <category domain="https://eugenepage.com/tags/AI/">AI</category>
      
      <category domain="https://eugenepage.com/tags/LLM/">LLM</category>
      
      <category domain="https://eugenepage.com/tags/Claude/">Claude</category>
      
      <category domain="https://eugenepage.com/tags/API/">API</category>
      
      <category domain="https://eugenepage.com/tags/Subscription/">Subscription</category>
      
      <category domain="https://eugenepage.com/tags/OpenAI/">OpenAI</category>
      
      <category domain="https://eugenepage.com/tags/Google/">Google</category>
      
      
      <comments>https://eugenepage.com/2026/03/13/20260314.AI%20Services/#disqus_thread</comments>
      
    </item>
    
  </channel>
</rss>
