teras.layers.TransformerFeedForward


class teras.layers.TransformerFeedForward(embedding_dim, hidden_dim=None, activation='relu', dropout=0.0, **kwargs)

Transformer feed-forward layer as proposed in the original Transformer architecture in the paper “Attention Is All You Need”, with the addition of an optional Dropout layer.

Reference(s):

https://arxiv.org/abs/1706.03762

Parameters:
  • embedding_dim (int) – int, dimensionality of embeddings being used in the model

  • hidden_dim (int) – int, hidden dimensionality to use. Defaults to four times the embedding_dim.

  • activation (Union[str, Callable, Layer]) – str, callable, or Layer, activation function to use for the inner linear layer. Defaults to “relu”.

  • dropout (float) – float, dropout rate to use for the dropout layer that is applied between the two linear layers. Defaults to 0.0, because the original Transformer architecture doesn’t employ a Dropout layer at this position.
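
The computation described by these parameters can be illustrated with a minimal NumPy sketch. This is not the teras implementation: the function name `feed_forward`, the weight names, and the input shape are hypothetical, and it assumes that the second linear layer projects back to embedding_dim, as in the original paper.

```python
import numpy as np


def feed_forward(x, w1, b1, w2, b2, dropout=0.0, training=False, rng=None):
    """Hypothetical sketch: linear -> ReLU -> dropout -> linear."""
    # Inner linear layer followed by the default "relu" activation.
    hidden = np.maximum(x @ w1 + b1, 0.0)
    # Dropout sits between the two linear layers and is only active in training.
    if training and dropout > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(hidden.shape) >= dropout
        hidden = hidden * keep / (1.0 - dropout)  # inverted dropout scaling
    # Outer linear layer (assumed to project back to embedding_dim).
    return hidden @ w2 + b2


embedding_dim = 8
hidden_dim = 4 * embedding_dim  # mirrors the documented default for hidden_dim

rng = np.random.default_rng(0)
w1 = rng.normal(size=(embedding_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, embedding_dim))
b2 = np.zeros(embedding_dim)

# Assumed input layout: (batch, sequence, embedding_dim).
x = rng.normal(size=(2, 5, embedding_dim))
y = feed_forward(x, w1, b1, w2, b2)
print(y.shape)
```

With dropout left at 0.0 the training flag has no effect, which matches the documented default behaviour of omitting dropout.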

__init__(embedding_dim, hidden_dim=None, activation='relu', dropout=0.0, **kwargs)

Methods

__init__(embedding_dim[, hidden_dim, ...])

add_loss(loss)

Can be called inside of the call() method to add a scalar loss.

add_metric()

add_variable(shape, initializer[, dtype, ...])

Add a weight variable to the layer.

add_weight([shape, initializer, dtype, ...])

Add a weight variable to the layer.

build(input_shape)

build_from_config(config)

Builds the layer's states with the supplied config dict.

call(inputs)

compute_mask(inputs, previous_mask)

compute_output_shape(*args, **kwargs)

compute_output_spec(*args, **kwargs)

count_params()

Count the total number of scalars composing the weights.

from_config(config)

Creates an operation from its config.

get_build_config()

Returns a dictionary with the layer's input shape.

get_config()

Returns the config of the object.

get_weights()

Return the values of layer.weights as a list of NumPy arrays.

load_own_variables(store)

Loads the state of the layer.

quantize(mode)

quantized_call(*args, **kwargs)

save_own_variables(store)

Saves the state of the layer.

set_weights(weights)

Sets the values of layer.weights from a list of NumPy arrays.

stateless_call(trainable_variables, ...[, ...])

Call the layer without any side effects.

symbolic_call(*args, **kwargs)

Attributes

compute_dtype

The dtype of the computations performed by the layer.

dtype

Alias of layer.variable_dtype.

dtype_policy

input

Retrieves the input tensor(s) of a symbolic operation.

input_dtype

The dtype layer inputs should be converted to.

input_spec

losses

List of scalar losses from add_loss, regularizers and sublayers.

metrics

List of all metrics.

metrics_variables

List of all metric variables.

non_trainable_variables

List of all non-trainable layer state.

non_trainable_weights

List of all non-trainable weight variables of the layer.

output

Retrieves the output tensor(s) of a layer.

path

The path of the layer.

quantization_mode

The quantization mode of this layer, None if not quantized.

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

Settable boolean, whether this layer should be trainable or not.

trainable_variables

List of all trainable layer state.

trainable_weights

List of all trainable weight variables of the layer.

variable_dtype

The dtype of the state (weights) of the layer.

variables

List of all layer state, including random seeds.

weights

List of all weight variables of the layer.