teras.layers.TransformerFeedForward
- class teras.layers.TransformerFeedForward(embedding_dim, hidden_dim=None, activation='relu', dropout=0.0, **kwargs)
Transformer feed-forward layer as proposed in the original Transformer architecture in the paper "Attention Is All You Need", with the addition of an optional Dropout layer.
- Reference(s):
- Parameters:
  - embedding_dim (int) – Dimensionality of the embeddings used in the model.
  - hidden_dim (int) – Hidden dimensionality to use. Defaults to four times embedding_dim.
  - activation (Union[str, Callable, Layer]) – Activation function to use for the inner linear layer, given as a str or a callable. Defaults to "relu".
  - dropout (float) – Dropout rate for the dropout layer that is applied between the two linear layers. Defaults to 0., because the original Transformer architecture doesn't employ a Dropout layer.
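The computation these parameters describe can be sketched in plain NumPy. This is a minimal illustration, not the teras implementation: it assumes two linear layers with biases, a ReLU inner activation, and inverted dropout applied between them during training. The function name `transformer_feed_forward`, its arguments, and the weight initialization are hypothetical.

```python
import numpy as np


def transformer_feed_forward(x, w1, b1, w2, b2, dropout=0.0, training=False, rng=None):
    """Position-wise feed-forward sketch: linear -> ReLU -> dropout -> linear.

    Assumes 0 <= dropout < 1. Not the teras source, only an illustration.
    """
    hidden = np.maximum(x @ w1 + b1, 0.0)  # inner linear layer followed by ReLU
    if training and dropout > 0.0:
        rng = rng if rng is not None else np.random.default_rng()
        keep = rng.random(hidden.shape) >= dropout  # keep each unit with probability 1 - dropout
        hidden = hidden * keep / (1.0 - dropout)  # inverted dropout rescaling
    return hidden @ w2 + b2  # project back to embedding_dim


embedding_dim = 8
hidden_dim = 4 * embedding_dim  # mirrors the documented default when hidden_dim is None

rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.1, size=(embedding_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(scale=0.1, size=(hidden_dim, embedding_dim))
b2 = np.zeros(embedding_dim)

x = rng.normal(size=(2, 5, embedding_dim))  # (batch, sequence, embedding_dim)
y = transformer_feed_forward(x, w1, b1, w2, b2)
print(y.shape)  # the last dimension matches embedding_dim
```

With `dropout=0.0` (the documented default) or `training=False`, the dropout branch is skipped and the output is deterministic for fixed weights.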
Methods
- __init__(embedding_dim[, hidden_dim, ...])
- add_loss(loss) – Can be called inside of the call() method to add a scalar loss.
- add_metric()
- add_variable(shape, initializer[, dtype, ...]) – Add a weight variable to the layer.
- add_weight([shape, initializer, dtype, ...]) – Add a weight variable to the layer.
- build(input_shape)
- build_from_config(config) – Builds the layer's states with the supplied config dict.
- call(inputs)
- compute_mask(inputs, previous_mask)
- compute_output_shape(*args, **kwargs)
- compute_output_spec(*args, **kwargs)
- count_params() – Count the total number of scalars composing the weights.
- from_config(config) – Creates an operation from its config.
- get_build_config() – Returns a dictionary with the layer's input shape.
- get_config() – Returns the config of the object.
- get_weights() – Return the values of layer.weights as a list of NumPy arrays.
- load_own_variables(store) – Loads the state of the layer.
- quantize(mode)
- quantized_call(*args, **kwargs)
- save_own_variables(store) – Saves the state of the layer.
- set_weights(weights) – Sets the values of layer.weights from a list of NumPy arrays.
- stateless_call(trainable_variables, ...[, ...]) – Call the layer without any side effects.
- symbolic_call(*args, **kwargs)
Attributes
- compute_dtype – The dtype of the computations performed by the layer.
- dtype – Alias of layer.variable_dtype.
- dtype_policy
- input – Retrieves the input tensor(s) of a symbolic operation.
- input_dtype – The dtype layer inputs should be converted to.
- input_spec
- losses – List of scalar losses from add_loss, regularizers and sublayers.
- metrics – List of all metrics.
- metrics_variables – List of all metric variables.
- non_trainable_variables – List of all non-trainable layer state.
- non_trainable_weights – List of all non-trainable weight variables of the layer.
- output – Retrieves the output tensor(s) of a layer.
- path – The path of the layer.
- quantization_mode – The quantization mode of this layer, None if not quantized.
- supports_masking – Whether this layer supports computing a mask using compute_mask.
- trainable – Settable boolean, whether this layer should be trainable or not.
- trainable_variables – List of all trainable layer state.
- trainable_weights – List of all trainable weight variables of the layer.
- variable_dtype – The dtype of the state (weights) of the layer.
- variables – List of all layer state, including random seeds.
- weights – List of all weight variables of the layer.
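As a rough sanity check for what count_params() might report, the weight count of a feed-forward block can be worked out by hand. The sketch below assumes the layer consists of exactly two dense layers, each with a bias, and no other weights; that is an assumption about the implementation, not something this page states. The helper `feed_forward_param_count` is hypothetical.

```python
def feed_forward_param_count(embedding_dim, hidden_dim=None):
    """Expected number of scalars in two biased linear layers (assumed structure)."""
    if hidden_dim is None:
        hidden_dim = 4 * embedding_dim  # documented default
    inner = embedding_dim * hidden_dim + hidden_dim  # kernel + bias of the inner layer
    outer = hidden_dim * embedding_dim + embedding_dim  # kernel + bias of the output layer
    return inner + outer


print(feed_forward_param_count(32))  # 32*128 + 128 + 128*32 + 32 = 8352
```

If the value returned by count_params() on a built layer differs from this figure, the layer likely holds weights beyond the two assumed dense layers, or omits biases.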