Skip to content

AI编译器融合技术系统化分类总结

从依赖拓扑到全局优化的深度解析

🚧 持续更新中 (Work in Progress)

本文档处于活跃编写阶段,内容将随着 AI 编译器技术的发展而持续迭代。 最近更新:2026-01-26 | 欢迎在 GitHub 提交 Issue 或 PR 参与贡献。

📚 全景思维导图


📖 章节导航

章节标题核心关键词
00全篇概述与阅读建议Fusion 本质, 章节概览
01依赖拓扑 (Topology)Vertical/Horizontal Fusion, FlashAttention, SwiGLU
02循环与迭代空间 (Loops)Loop Fusion, Tiling, Software Pipelining
03数据布局与表示 (Layout)Packing, Padding, Bufferization, Swizzling
04内存层次与分块 (Memory)Multi-level Tiling, Memory Promotion, Double Buffering
05并行性与分布式 (Parallelism)SIMD, Tensor Core, SPMD, Comm Overlap
06硬件适配与权衡 (Hardware)Register Pressure, Rematerialization, Quantization
07控制流与动态性 (Dynamism)Predication, Symbolic Shape, JIT Specialization
08跨层次全局优化 (Global)Layout Propagation, Cost Model, Auto-tuning
09实战场景映射 (Scenarios)LLM, MoE, Sparse, Edge, DLRM
附录主流编译器与参考XLA, TVM, Triton, MLIR, CANN

Released under the CC BY-NC-ND 4.0 License.