# IR Interface Overview ## Multi-level IR Abstraction - High-level abstract interfaces that hide low-level details and map hardware-agnostic expressions to low-level instructions, improving operator development usability - Fine-grained performance control interfaces for precise control of on-chip memory addresses, pipeline sync insertion, and ping-pong pipeline optimization - Multi-level interfaces support flexible integration of custom DSLs and frameworks for high-performance custom operators on Ascend NPU ```text Torch-MLIR / Triton (Framework/DSL layer) | v Linalg / Tensor (General tensor algebra layer) | v HFusion (Hardware-aware fusion & scheduling layer) | v HIVM (NPU instruction layer) | v LIR -> Binary (Machine code generation) ``` - **Linalg / Tensor layer**: Standard MLIR dialects for operator semantics (Elemwise, Broadcast, Reduce, Transpose, Concat, etc.); HFusion performs fusion, tiling, and scheduling automatically - **HFusion layer**: Ascend-NPU-aware named ops (e.g. `hfusion.elemwise_unary`, `hfusion.cast`, `hfusion.select`, `hfusion.reduce_with_index`), tensor semantics, automatic bufferization, tiling, and scheduling - **HIVM layer**: Direct mapping to NPU instructions, explicit control of memory hierarchy (GM/UB/L1/L0), compute pipelines (Vector/Cube/MTE), and sync primitives for fine-grained tuning These layers allow custom DSLs and frameworks to integrate. Triton and PyTorch connect via IR conversion for high-performance custom operators on Ascend NPU. ## Compile Options and Function Attributes ### Compile options `bishengir-compile` provides these common options: | Option | Default | Description | |--------|---------|-------------| | `-target` | `Ascend` | Target device (core count, on-chip memory size and other hardware specifications), queried via `npu-smi info`. | | `-block-dim` | `1` | Number of blocks; compiled kernel carries `hacc.block_dim` | | `-enable-hfusion-compile` | `false` | Enable HFusion pipeline (fusion, scheduling, tiling) | | `-enable-hivm-compile` | `true` | Enable HIVM pipeline (lower to HIVM and optimize) | | `-enable-torch-compile` | `false` | Enable Torch-MLIR pipeline | | `-enable-triton-kernel-compile` | `false` | Enable Triton kernel pipeline | Supported target devices include the Atlas A2/A3, Ascend 950PR/Ascend 950DT series. ### Function attributes The following attributes mark kernel entry functions and are shared across all integration paths: | Attribute | Description | |-----------|-------------| | `hacc.entry` | Marks the current function as the kernel entry | | `hacc.function_kind = #hacc.function_kind` | Function runs on the DEVICE side | | `hacc.function_kind = #hacc.function_kind` | Function runs on the HOST side; HFusion will outline device kernels | Example: ```mlir func.func @kernel(...) attributes {hacc.entry, hacc.function_kind = #hacc.function_kind} { ... } ``` ## Triton integration Triton is a widely used language for high-performance kernel development. [Triton Ascend](https://gitcode.com/Ascend/triton-ascend/) converts Triton kernels to MLIR for the AscendNPU IR stack and enables running Triton kernels on Ascend NPU. See [Triton interface](triton_interface.md) for installation, environment, op mapping, and Ascend extensions. ## TileLang integration TileLang (tilelang-ascend) is a domain-specific language for Ascend NPU kernel development, built on tile-lang's Pythonic syntax and [TVM](https://tvm.apache.org/). It supports GEMM, vector operations, and attention mechanisms, compiling kernels to AscendNPU IR (HIVM) for execution on Ascend NPU. See [TileLang interface](tile_lang_interface.md) for installation and usage details. ## Framework integration AscendNPU IR supports framework integration (PyTorch/TensorFlow/MindSpore) in two ways: - **DSL integration**: e.g., Triton and TileLang - **IR integration**: e.g., Torch IR, Linalg/HFusion IR, and HIVM IR See [Framework interface](framework_interface.md) for details.