langvae.arch package

Submodules

langvae.arch.vae module

class langvae.arch.vae.LangVAE(model_config: VAEConfig, encoder: SentenceEncoder | None, decoder: SentenceDecoder | None)[source]

Bases: VAE

A language-oriented Variational Autoencoder (VAE) that can be used for text generation.

Parameters:

model_config (VAEConfig) – The configuration of the VAE model.
encoder (Optional[SentenceEncoder]) – Language encoder model that processes input data and returns sentence embeddings.
decoder (Optional[SentenceDecoder]) – Language decoder model that generates text from latent representations.

decode_sentences(z: Tensor, cvars_emb: List[Tensor] = None) → List[str][source]

Decodes the latent variable tensor into a list of sentences.

Parameters: z (Tensor) – The latent variable tensor to be decoded.
Returns: A list of strings representing the decoded sentences.
Return type: List[str]

encode_z(x: Tensor, c: Dict[str, Tensor] = None) → Tuple[Tensor, List[Tensor]][source]

Encodes the input tensor into a latent variable tensor.

Parameters: x (Tensor) – The input tensor to be encoded.
Returns: A tuple containing the sampled latent variables and, if available, the conditional variable embeddings, in that order.
Return type: Tuple[Tensor, List[Tensor]]
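
The reference above only says that encode_z returns sampled latent variables; it does not say how the sample is drawn. A common choice for Gaussian VAEs is the reparameterization trick, z = mu + sigma * eps with eps drawn from a standard normal. The following is a minimal sketch of that idea in plain Python lists rather than tensors. The helper name sample_latent is hypothetical and not part of langvae, and whether LangVAE samples exactly this way is an assumption.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Draw z = mu + sigma * eps element-wise, with eps ~ N(0, 1).

    Hypothetical helper for illustration only; not part of langvae.
    """
    z = []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)  # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)   # standard normal noise
        z.append(m + sigma * eps)
    return z


rng = random.Random(0)
mu = [0.5, -1.0, 2.0]
log_var = [-40.0, -40.0, -40.0]  # near-zero variance, so z stays close to mu
z = sample_latent(mu, log_var, rng)
```

With a very negative log-variance the noise term vanishes and the sample collapses onto the mean, which is a quick sanity check for any implementation of this step.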

forward(inputs: BaseDataset, **kwargs)[source]

Runs a forward pass of the VAE model.

Parameters: inputs (BaseDataset) – The training dataset with labels.
Returns: An instance of ModelOutput containing all the relevant parameters.
Return type: ModelOutput

classmethod load_from_folder(dir_path)[source]

Class method used to load the model from a specific folder.

Parameters: dir_path (str) – The path to the folder where the model was saved.

Note

This function requires the folder to contain:

a model_config.json and a model.pt if no custom architectures were provided

or

a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided
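
The file requirements in the note above can be checked before calling load_from_folder. The sketch below is one way to do that with pathlib. The function missing_model_files and its custom_encoder / custom_decoder flags are hypothetical names introduced for illustration; only the file names come from the note.

```python
import tempfile
from pathlib import Path

# File names taken from the note above.
REQUIRED = ("model_config.json", "model.pt")


def missing_model_files(dir_path, custom_encoder=False, custom_decoder=False):
    """Return the expected file names that are absent from dir_path.

    Hypothetical helper for illustration only; not part of langvae.
    """
    expected = list(REQUIRED)
    if custom_encoder:
        expected.append("encoder.pkl")
    if custom_decoder:
        expected.append("decoder.pkl")
    folder = Path(dir_path)
    return [name for name in expected if not (folder / name).is_file()]


# Usage: a folder that holds only the config file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model_config.json").write_text("{}")
    missing_default = missing_model_files(tmp)
    missing_custom = missing_model_files(tmp, custom_encoder=True)
```

An empty result means the folder has every file the note lists for the chosen configuration.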

classmethod load_from_hf_hub(hf_hub_path: str)[source]

Class method used to load a pretrained model from the Hugging Face Hub.

Parameters: hf_hub_path (str) – The path on the Hugging Face Hub where the model was saved.

Note

This function requires the Hub repository to contain:

a model_config.json and a model.pt if no custom architectures were provided

or

a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided

loss_function(recon_x, x, mu, log_var, z) → Tuple[Tensor, Tensor, Tensor][source]

Computes the loss function for the VAE model.

Parameters:

recon_x (Tensor) – The reconstructed input tensor.
x (Tensor) – The original input tensor.
mu (Tensor) – The mean of the latent variable distribution.
log_var (Tensor) – The logarithm of the variance of the latent variable distribution.
z (Tensor) – The sampled latent variable tensor.

Returns:

A tuple containing the reconstruction loss, the KL divergence loss, and the total loss.

Return type:

Tuple[Tensor, Tensor, Tensor]

push_to_hf_hub(hf_hub_path: str)[source]

Uploads the VAE model to the Hugging Face Hub.

Parameters: hf_hub_path (str) – The HF Hub path where the model should be uploaded.

save(dir_path: str)[source]

Saves the model at a specific location. The model weights are saved as a model.pt file and the model config as a model_config.json file. If the model uses a custom encoder (resp. decoder) provided by the user, it is also saved as encoder.pkl (resp. decoder.pkl).

Parameters: dir_path (str) – The path where the model should be saved. If the path does not exist, a folder will be created at the provided location.

langvae.arch.vae.vae_nll_loss(recon_x: Tensor, x: Tensor, mu: Tensor, log_var: Tensor, z: Tensor, pad_token_id: int, beta: float, target_kl: float) → Tuple[Tensor, Tensor, Tensor][source]

Calculates the negative log-likelihood (NLL) loss for a Variational Autoencoder (VAE).

Parameters:

recon_x (Tensor) – The reconstructed input tensor.
x (Tensor) – The original input tensor.
mu (Tensor) – The mean of the latent variable distribution.
log_var (Tensor) – The logarithm of the variance of the latent variable distribution.
z (Tensor) – The latent variable tensor.
pad_token_id (int) – The padding token ID for the input sequence.
beta (float) – A hyperparameter that controls the trade-off between reconstruction loss and KL divergence.
target_kl (float) – A target value for the KL divergence (cut-off).

Returns:

Total NLL loss (reconstruction loss + KL divergence).
Average reconstruction loss.
Average KL divergence.

Return type:

Tuple[Tensor, Tensor, Tensor]
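
To make the three returned quantities concrete, here is a minimal sketch in plain Python (lists instead of tensors) of a padding-masked token NLL, the closed-form KL divergence between a diagonal Gaussian and a standard normal, and their combination. It rests on assumptions: the function names are hypothetical, the total is taken to be the reconstruction loss plus beta times the KL term, and the target_kl cut-off is left out because the reference does not say how it is applied. It is not the library's implementation.

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def masked_token_nll(token_probs, targets, pad_token_id):
    """Mean negative log-probability of target tokens, skipping padding."""
    losses = [
        -math.log(probs[t])
        for probs, t in zip(token_probs, targets)
        if t != pad_token_id
    ]
    return sum(losses) / len(losses)


def toy_vae_loss(token_probs, targets, mu, log_var, pad_token_id, beta):
    """Return (total, reconstruction, KL). Assumed form: recon + beta * KL."""
    recon = masked_token_nll(token_probs, targets, pad_token_id)
    kl = gaussian_kl(mu, log_var)
    return recon + beta * kl, recon, kl


# Three positions over a 3-token vocabulary; token id 2 is padding here.
token_probs = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
targets = [0, 1, 2]
total, recon, kl = toy_vae_loss(
    token_probs, targets, mu=[0.0, 0.0], log_var=[0.0, 0.0],
    pad_token_id=2, beta=1.0,
)
```

In this toy case the posterior equals the prior, so the KL term is zero and the total reduces to the reconstruction loss over the two non-padding tokens.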
