langvae.arch package

Submodules

langvae.arch.vae module

class langvae.arch.vae.LangVAE(model_config: VAEConfig, encoder: SentenceEncoder | None, decoder: SentenceDecoder | None)[source]

Bases: VAE

A language-oriented Variational Autoencoder (VAE) that can be used for text generation.

Parameters:
  • model_config (VAEConfig) – The configuration of the VAE model.

  • encoder (Optional[SentenceEncoder]) – Language encoder model that processes input data and returns sentence embeddings.

  • decoder (Optional[SentenceDecoder]) – Language decoder model that generates text from latent representations.

decode_sentences(z: Tensor, cvars_emb: List[Tensor] = None) List[str][source]

Decodes the latent variable tensor into a list of sentences.

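Parameters:
  • z (Tensor) – The latent variable tensor to be decoded.

  • cvars_emb (List[Tensor], optional) – Conditional variable embeddings, such as those returned by encode_z. Defaults to None.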

Returns:

A list of strings representing the decoded sentences.

Return type:

List[str]

encode_z(x: Tensor, c: Dict[str, Tensor] = None) Tuple[Tensor, List[Tensor]][source]

Encodes the input tensor into a latent variable tensor.

Parameters:
  • x (Tensor) – The input tensor to be encoded.

  • c (Dict[str, Tensor], optional) – Conditional variables, if any. Defaults to None.

Returns:

A tuple containing the sampled latent variables and, if available, the conditional variable embeddings.

Return type:

Tuple[Tensor, List[Tensor]]
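As an illustration of what "sampled latent variables" usually means in a VAE, the following is a minimal sketch of reparameterized sampling. It assumes the encoder yields a mean and a log-variance per latent dimension, which this page does not confirm for encode_z. Plain Python lists stand in for tensors, and sample_latent is a hypothetical helper, not part of langvae.

```python
import math
import random
from typing import List, Optional, Sequence


def sample_latent(
    mu: Sequence[float],
    log_var: Sequence[float],
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick).

    ASSUMPTION: hypothetical helper for illustration only; lists of floats
    stand in for the tensors that a real encoder would return.
    """
    rng = rng or random.Random()
    # sigma = exp(0.5 * log_var) converts a log-variance to a standard deviation.
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


# A fixed seed makes the draw reproducible.
z = sample_latent([0.0, 1.0], [0.0, -2.0], rng=random.Random(0))
```

Because the noise is drawn separately from mu and log_var, gradients can flow through both during training, which is the usual motivation for sampling this way.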

forward(inputs: BaseDataset, **kwargs)[source]

Runs the forward pass of the VAE model.

Parameters:

inputs (BaseDataset) – The training dataset with labels

Returns:

An instance of ModelOutput containing all the relevant parameters

Return type:

ModelOutput

classmethod load_from_folder(dir_path)[source]

Class method to be used to load the model from a specific folder

Parameters:

dir_path (str) – The path where the model was saved.

Note

This function requires the folder to contain:

  • a model_config.json and a model.pt if no custom architectures were provided

or

  • a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided

classmethod load_from_hf_hub(hf_hub_path: str)[source]

Class method to be used to load a pretrained model from the Hugging Face hub

Parameters:

hf_hub_path (str) – The Hugging Face Hub path where the model was saved.

Note

This function requires the Hub repository to contain:

  • a model_config.json and a model.pt if no custom architectures were provided

or

  • a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided

loss_function(recon_x, x, mu, log_var, z) Tuple[Tensor, Tensor, Tensor][source]

Computes the loss function for the VAE model.

Parameters:
  • recon_x (Tensor) – The reconstructed input tensor.

  • x (Tensor) – The original input tensor.

  • mu (Tensor) – The mean of the latent variable distribution.

  • log_var (Tensor) – The logarithm of the variance of the latent variable distribution.

  • z (Tensor) – The sampled latent variable tensor.

Returns:

A tuple containing the reconstruction loss, the KL divergence loss, and the total loss.

Return type:

Tuple[Tensor, Tensor, Tensor]

push_to_hf_hub(hf_hub_path: str)[source]

Uploads the VAE model to the Hugging Face Hub.

Parameters:

hf_hub_path (str) – The HF hub path where the model should be uploaded to.

save(dir_path: str)[source]

Saves the model to a specific location. The model weights are saved as a model.pt file and the model config as a model_config.json file. If the model uses a custom encoder (resp. decoder) provided by the user, it is also saved as encoder.pkl (resp. decoder.pkl).

Parameters:

dir_path (str) – The path where the model should be saved. If the path does not exist, a folder will be created at the provided location.
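The folder layout that load_from_folder expects can be checked before loading. The sketch below uses only the standard library and the file names listed in the load_from_folder note; missing_model_files is a hypothetical helper, not part of langvae.

```python
from pathlib import Path
from typing import List

# File names taken from the load_from_folder note.
REQUIRED_FILES = ("model_config.json", "model.pt")


def missing_model_files(
    dir_path: str,
    custom_encoder: bool = False,
    custom_decoder: bool = False,
) -> List[str]:
    """Return the expected files that are absent from dir_path.

    ASSUMPTION: hypothetical helper for illustration; it only checks that
    the files exist, not that their contents are valid.
    """
    expected = list(REQUIRED_FILES)
    if custom_encoder:
        expected.append("encoder.pkl")
    if custom_decoder:
        expected.append("decoder.pkl")
    folder = Path(dir_path)
    return [name for name in expected if not (folder / name).is_file()]
```

An empty list means every expected file is present; a non-empty list names what is missing, which tends to be easier to act on than a failure raised partway through loading.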

langvae.arch.vae.vae_nll_loss(recon_x: Tensor, x: Tensor, mu: Tensor, log_var: Tensor, z: Tensor, pad_token_id: int, beta: float, target_kl: float) Tuple[Tensor, Tensor, Tensor][source]

Calculates the negative log-likelihood (NLL) loss for a Variational Autoencoder (VAE).

Parameters:
  • recon_x (Tensor) – The reconstructed input tensor.

  • x (Tensor) – The original input tensor.

  • mu (Tensor) – The mean of the latent variable distribution.

  • log_var (Tensor) – The logarithm of the variance of the latent variable distribution.

  • z (Tensor) – The latent variable tensor.

  • pad_token_id (int) – The padding token ID for the input sequence.

  • beta (float) – A hyperparameter that controls the trade-off between reconstruction loss and KL divergence.

  • target_kl (float) – A target value for the KL divergence (cut-off).

Returns:

  • Total NLL loss (reconstruction loss + KL divergence).

  • Average reconstruction loss.

  • Average KL divergence.

Return type:

Tuple[Tensor, Tensor, Tensor]
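To make the roles of pad_token_id, beta and target_kl concrete, here is a minimal sketch of a loss of this shape for a single sentence, written in plain Python rather than tensors. It is not the library's implementation. Two assumptions are made: the posterior is a diagonal Gaussian compared against a standard normal prior, and target_kl acts as a lower cut-off on the KL term. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def gaussian_kl(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)) for one latent vector."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def masked_nll(
    log_probs: Sequence[Sequence[float]],
    target_ids: Sequence[int],
    pad_token_id: int,
) -> float:
    """Mean negative log-likelihood over non-padding positions.

    log_probs[t][v] is the log-probability of token v at position t.
    """
    losses = [
        -log_probs[t][tok]
        for t, tok in enumerate(target_ids)
        if tok != pad_token_id  # padding positions do not contribute
    ]
    return sum(losses) / len(losses) if losses else 0.0


def sketch_vae_nll_loss(
    log_probs: Sequence[Sequence[float]],
    target_ids: Sequence[int],
    mu: Sequence[float],
    log_var: Sequence[float],
    pad_token_id: int,
    beta: float,
    target_kl: float,
) -> Tuple[float, float, float]:
    """Return (total loss, reconstruction loss, KL divergence)."""
    recon = masked_nll(log_probs, target_ids, pad_token_id)
    kl = gaussian_kl(mu, log_var)
    # ASSUMPTION: target_kl is treated as a floor, so a KL value below it
    # is not pushed any lower by the optimizer.
    kl_term = max(kl, target_kl)
    return recon + beta * kl_term, recon, kl
```

With beta set to 1.0 and target_kl set to 0.0 the sketch reduces to the standard negative ELBO; a larger beta weights the KL term more heavily relative to reconstruction.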

Module contents