teras.utils.get_metadata_for_embedding
- teras.utils.get_metadata_for_embedding(dataframe, categorical_features=None, numerical_features=None)
Utility function that creates the metadata for the features of a given dataframe that the Categorical and Numerical embedding layers in Teras require. For numerical features, it maps each feature name to its feature index. For categorical features, it maps each feature name to a tuple of the feature index and the vocabulary (the unique values) of that feature. This metadata is usually required by architectures that embed numerical or categorical features, such as TabTransformer, TabNet, and FT-Transformer.
- Parameters:
dataframe (DataFrame) – Input dataframe
categorical_features – List of names of categorical features in the input dataset
numerical_features – List of names of numerical features in the input dataset
- Returns:
A dictionary that contains two sub-dictionaries, one for categorical features and one for numerical features. The categorical dictionary maps each categorical feature name to a tuple of the feature's index and the list of unique values (vocabulary) in that feature: {feature_name: (feature_idx, vocabulary)}. The numerical dictionary maps each numerical feature name to its index: {feature_name: feature_idx}.
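The shape of the returned metadata can be illustrated with a small pure-Python sketch. This is not the Teras implementation: `sketch_metadata_for_embedding` is a hypothetical stand-in, a plain dict of column lists replaces the dataframe, and the top-level keys `"categorical"` and `"numerical"` as well as the sorted vocabulary order are assumptions made for illustration only.

```python
def sketch_metadata_for_embedding(columns, categorical_features=None,
                                  numerical_features=None):
    """Hypothetical re-creation of the documented return structure.

    `columns` is a dict mapping column name -> list of values; its key order
    stands in for the column order of a dataframe.
    """
    categorical_features = categorical_features or []
    numerical_features = numerical_features or []

    # Position of each column, mirroring a dataframe's column order.
    index_of = {name: idx for idx, name in enumerate(columns)}

    # Assumed key names; the source only says there are two sub-dictionaries.
    metadata = {"categorical": {}, "numerical": {}}

    for name in categorical_features:
        # Vocabulary = unique values of the feature (sorted here by assumption).
        vocabulary = sorted(set(columns[name]))
        metadata["categorical"][name] = (index_of[name], vocabulary)

    for name in numerical_features:
        metadata["numerical"][name] = index_of[name]

    return metadata


columns = {
    "age": [25, 32, 47],
    "city": ["Paris", "Oslo", "Paris"],
    "income": [40.0, 52.5, 61.0],
    "plan": ["free", "pro", "free"],
}
metadata = sketch_metadata_for_embedding(
    columns,
    categorical_features=["city", "plan"],
    numerical_features=["age", "income"],
)
print(metadata["categorical"]["city"])  # (1, ['Oslo', 'Paris'])
print(metadata["numerical"])            # {'age': 0, 'income': 2}
```

Each categorical entry carries both the column position and the vocabulary because an embedding layer needs to know where to read the feature and how many distinct values to embed. A numerical entry needs only the position.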