fusions package
This package provides modules that implement commonly fused operations. Fusing operations improves compute efficiency by increasing the amount of work done each time a tensor is read from memory. To perform the fusion, modules in this package either rely on PyTorch functionality for just-in-time compilation (i.e., torch.jit.script in older PyTorch versions or torch.compile in recent versions), or call into custom kernels in external libraries such as Apex or TransformerEngine.
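To illustrate the idea, the following is a minimal sketch (not code from this package) showing how JIT compilation lets PyTorch fuse a chain of elementwise operations into a single kernel:

    import torch

    # Unfused: each elementwise op reads and writes the tensor separately.
    def bias_add_relu(x: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + bias)

    # JIT compilation lets PyTorch fuse the elementwise ops into one kernel,
    # so the tensor is read from memory only once.
    fused_script = torch.jit.script(bias_add_relu)   # older PyTorch versions
    fused_compile = torch.compile(bias_add_relu)     # PyTorch >= 2.0

    x = torch.randn(8, 1024)
    bias = torch.randn(1024)
    assert torch.allclose(fused_script(x, bias), bias_add_relu(x, bias))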
Submodules
fusions.fused_bias_dropout module
This module uses PyTorch JIT to fuse the bias add and dropout operations. Since dropout is not applied during inference, separate functions are used in train mode and in inference mode.
core.fusions.fused_bias_dropout
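A minimal sketch of this pattern (simplified and with illustrative names, not the module's exact code):

    import torch

    def bias_dropout_add(x: torch.Tensor, bias: torch.Tensor,
                         residual: torch.Tensor, prob: float,
                         training: bool) -> torch.Tensor:
        # One elementwise chain, which the JIT can fuse into a single kernel.
        out = torch.nn.functional.dropout(x + bias, p=prob, training=training)
        return residual + out

    # Separate scripted entry points for train and inference, because
    # dropout is a no-op at inference time.
    @torch.jit.script
    def bias_dropout_add_fused_train(x: torch.Tensor, bias: torch.Tensor,
                                     residual: torch.Tensor,
                                     prob: float) -> torch.Tensor:
        return bias_dropout_add(x, bias, residual, prob, True)

    @torch.jit.script
    def bias_dropout_add_fused_inference(x: torch.Tensor, bias: torch.Tensor,
                                         residual: torch.Tensor,
                                         prob: float) -> torch.Tensor:
        return bias_dropout_add(x, bias, residual, prob, False)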
fusions.fused_bias_gelu module
This module uses PyTorch JIT to fuse the bias add and GeLU nonlinearity operations.
core.fusions.fused_bias_gelu
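As a sketch, a JIT-scripted bias add followed by the tanh approximation of GeLU might look like the following (illustrative, not the module's exact code):

    import torch

    @torch.jit.script
    def bias_gelu(bias: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        x = bias + y
        # tanh approximation of GeLU; being purely elementwise, it can be
        # fused with the preceding bias add into a single kernel.
        return x * 0.5 * (1.0 + torch.tanh(0.79788456 * x * (1.0 + 0.044715 * x * x)))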
fusions.fused_layer_norm module
This module provides a wrapper around the various fused LayerNorm implementations in Apex.
core.fusions.fused_layer_norm
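Such a wrapper typically falls back to the native implementation when Apex is not installed. A hedged sketch (make_layer_norm is a hypothetical helper, not this module's API):

    import torch

    try:
        from apex.normalization import FusedLayerNorm  # fused CUDA kernel
        HAVE_APEX = True
    except ImportError:
        HAVE_APEX = False

    def make_layer_norm(hidden_size: int, eps: float = 1e-5) -> torch.nn.Module:
        # Hypothetical helper: prefer Apex's fused LayerNorm when available,
        # otherwise fall back to PyTorch's native implementation.
        if HAVE_APEX:
            return FusedLayerNorm(hidden_size, eps=eps)
        return torch.nn.LayerNorm(hidden_size, eps=eps)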
fusions.fused_softmax module
This module provides wrappers around the fused softmax variants implemented in Apex.
core.fusions.fused_softmax
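For reference, the semantics such a fused kernel implements can be written in plain PyTorch as scaling, masking, and softmax applied in sequence (the fused kernel computes the same result in a single pass over memory; this is an unfused reference, not the Apex API):

    import torch

    def scaled_masked_softmax_reference(x: torch.Tensor, mask: torch.Tensor,
                                        scale: float) -> torch.Tensor:
        # Unfused reference: three separate passes over the tensor.
        x = x * scale
        x = x.masked_fill(mask, float("-inf"))
        return torch.softmax(x, dim=-1)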
fusions.fused_cross_entropy_loss module
This module uses PyTorch JIT to fuse the cross-entropy loss calculation, and batches the associated communication calls.
core.fusions.fused_cross_entropy_loss
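The communication-batching idea can be illustrated by concatenating tensors so that one collective call covers several values. A hypothetical sketch (batched_all_reduce is not this module's API):

    import torch
    import torch.distributed as dist
    from typing import List

    def batched_all_reduce(tensors: List[torch.Tensor],
                           group=None) -> List[torch.Tensor]:
        # Flatten and concatenate the inputs so a single all-reduce
        # replaces one communication call per tensor.
        flat = torch.cat([t.reshape(-1) for t in tensors])
        dist.all_reduce(flat, op=dist.ReduceOp.SUM, group=group)
        out: List[torch.Tensor] = []
        offset = 0
        for t in tensors:
            n = t.numel()
            out.append(flat[offset:offset + n].view_as(t))
            offset += n
        return out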