AttentionRollout#
- class txv.exp.AttentionRollout(model: Module)#
Link to Paper: Quantifying Attention Flow in Transformers

This is a class-agnostic explanation method; therefore, a target class index cannot be passed as an argument.
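For intuition, here is a sketch of the rollout computation described in the linked paper (the txv implementation may differ in detail, e.g. via the abm flag of explain()): the head-averaged attention matrix \(A^{(l)}\) of each layer is mixed with the identity to account for the residual connection, and the resulting matrices are multiplied across layers:

\[
\tilde{A}^{(l)} = \tfrac{1}{2}\left(A^{(l)} + I\right), \qquad
R^{(l)} = \tilde{A}^{(l)}\, R^{(l-1)}, \qquad R^{(0)} = \tilde{A}^{(0)}
\]

The final \(R\) is read out as the relevance of each input token to the output.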
- __init__(model: Module) → None #
- Parameters:
model (torch.nn.Module) – A model from txv.vit
Tip
Use the model with lrp=False, as LRP models have a higher memory footprint.
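A minimal construction sketch, assuming a hypothetical model constructor (vit_base_patch16_224 is illustrative only; any model from txv.vit works):

```python
import txv.exp
import txv.vit

# Hypothetical constructor name; substitute any model from txv.vit.
# lrp=False avoids the higher memory footprint of LRP models (see Tip).
model = txv.vit.vit_base_patch16_224(lrp=False)
explainer = txv.exp.AttentionRollout(model)
```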
- explain(input: Tensor, layer: int = 0, abm: bool = True) → Tensor #
- Parameters:
input (torch.Tensor) – Input tensor
layer (int, optional) – Layer number at which to start the rollout computation, by default 0. Must satisfy 0 \(\leq\) layer \(\leq\) model.depth - 1
abm (bool, optional) – Architecture-based modification, by default True
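A hedged usage sketch of explain(), continuing from the construction example above (the input shape is an assumption; match whatever preprocessing your model expects):

```python
import torch

# Dummy batch of one 224x224 RGB image; the shape is an assumption,
# match the input size your txv.vit model was built for.
x = torch.randn(1, 3, 224, 224)

# Documented defaults: start the rollout at layer 0 with
# architecture-based modification (abm) enabled.
rollout = explainer.explain(x, layer=0, abm=True)
```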