Attention mechanism

In artificial neural networks, attention is a technique meant to mimic cognitive attention. It enhances some parts of the input data while diminishing others, the motivation being that the network should devote more focus to the important parts of the data, even though they may be a small portion of an image or sentence. Which parts of the data are more important than others depends on the context, and this is learned through training by gradient descent.
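
The enhancing and diminishing described above can be illustrated with a minimal sketch. It assumes one common formulation, dot-product scoring followed by a softmax, which is only one of several ways to compute attention. The names `softmax` and `attend`, and the example vectors, are hypothetical and chosen for illustration. In a real network the query, key, and value vectors would be produced by parameters learned through gradient descent; here they are fixed by hand.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (weights, output) for a single query over a set of inputs.

    Each input has a key (used for scoring) and a value (used for the output).
    """
    # Score each input by the dot product of the query with its key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Higher-scoring inputs get larger weights; lower-scoring ones shrink.
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hand-picked example: the first key aligns with the query, the last opposes it.
query = [1.0, 0.0]
keys = [[5.0, 0.0], [0.0, 5.0], [-5.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights, output = attend(query, keys, values)
print(weights)
print(output)
```

In this example the first input receives almost all of the weight, so the output is dominated by its value vector, while the other two inputs contribute very little.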