langvae.arch package
Submodules
langvae.arch.vae module
- class langvae.arch.vae.LangVAE(model_config: VAEConfig, encoder: SentenceEncoder | None, decoder: SentenceDecoder | None)[source]
Bases: VAE
A language-oriented Variational Autoencoder (VAE) that can be used for text generation.
- Parameters:
model_config (VAEConfig) – The configuration of the VAE model.
encoder (Optional[SentenceEncoder]) – Language encoder model that processes input data and returns sentence embeddings.
decoder (Optional[SentenceDecoder]) – Language decoder model that generates text from latent representations.
- decode_sentences(z: Tensor, cvars_emb: List[Tensor] = None) → List[str][source]
Decodes the latent variable tensor into a list of sentences.
- Parameters:
z (Tensor) – The latent variable tensor to be decoded.
- Returns:
A list of strings representing the decoded sentences.
- Return type:
List[str]
- encode_z(x: Tensor, c: Dict[str, Tensor] = None) → Tuple[Tensor, List[Tensor]][source]
Encodes the input tensor into a latent variable tensor.
- Parameters:
x (Tensor) – The input tensor to be encoded.
- Returns:
A tuple of tensors containing the sampled latent variables and conditional variable embeddings if available, respectively.
- Return type:
Tuple[Tensor, List[Tensor]]
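The "sampled latent variables" returned above are typically drawn with the reparameterization trick. The following is a minimal, dependency-free sketch of that step, assuming a diagonal Gaussian posterior parameterized by `mu` and `log_var` (the names used by `loss_function`). The helper `sample_latent` is hypothetical and uses plain Python lists instead of Tensors; whether `encode_z` samples in exactly this way is an assumption, not something this page confirms.

```python
import math
import random
from typing import List, Optional


def sample_latent(mu: List[float], log_var: List[float],
                  rng: Optional[random.Random] = None) -> List[float]:
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (hypothetical helper)."""
    rng = rng or random.Random()
    # sigma = exp(0.5 * log_var), because log_var = log(sigma ** 2)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]
```

As the variance shrinks (very negative `log_var`), the sample collapses onto `mu`, which is a quick sanity check for this kind of sampler.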
- forward(inputs: BaseDataset, **kwargs)[source]
Runs a forward pass of the VAE model.
- Parameters:
inputs (BaseDataset) – The training dataset with labels
- Returns:
An instance of ModelOutput containing all the relevant parameters
- Return type:
ModelOutput
- classmethod load_from_folder(dir_path)[source]
Class method that loads the model from a specific folder.
- Parameters:
dir_path (str) – The path where the model was saved.
Note
This function requires the folder to contain:
- a model_config.json and a model.pt if no custom architectures were provided
or
- a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided
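The file requirements in the note above can be checked before loading. This is a minimal sketch using only the standard library; the helper name `missing_model_files` and its `custom_encoder` / `custom_decoder` flags are hypothetical and not part of langvae. Only the file names come from the note.

```python
from pathlib import Path
from typing import List


def missing_model_files(dir_path: str, custom_encoder: bool = False,
                        custom_decoder: bool = False) -> List[str]:
    """Return the required file names that are absent from dir_path (hypothetical helper)."""
    required = ["model_config.json", "model.pt"]
    if custom_encoder:
        required.append("encoder.pkl")  # only needed for a custom encoder
    if custom_decoder:
        required.append("decoder.pkl")  # only needed for a custom decoder
    folder = Path(dir_path)
    return [name for name in required if not (folder / name).is_file()]
```

An empty result means the folder has the layout the note describes; it does not validate the contents of the files.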
- classmethod load_from_hf_hub(hf_hub_path: str)[source]
Class method that loads a pretrained model from the Hugging Face Hub.
- Parameters:
hf_hub_path (str) – The path where the model was saved on the Hugging Face Hub.
Note
This function requires the folder to contain:
- a model_config.json and a model.pt if no custom architectures were provided
or
- a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided
- loss_function(recon_x, x, mu, log_var, z) → Tuple[Tensor, Tensor, Tensor][source]
Computes the loss function for the VAE model.
- Parameters:
recon_x (Tensor) – The reconstructed input tensor.
x (Tensor) – The original input tensor.
mu (Tensor) – The mean of the latent variable distribution.
log_var (Tensor) – The logarithm of the variance of the latent variable distribution.
z (Tensor) – The sampled latent variable tensor.
- Returns:
A tuple containing the reconstruction loss, the KL divergence loss, and the total loss.
- Return type:
Tuple[Tensor, Tensor, Tensor]
- push_to_hf_hub(hf_hub_path: str)[source]
Uploads the VAE model to the Hugging Face Hub.
- Parameters:
hf_hub_path (str) – The HF hub path where the model should be uploaded to.
- save(dir_path: str)[source]
Saves the model at a specific location. The model weights are saved as a model.pt file and the model config as a model_config.json file. If the model uses a custom encoder (resp. decoder) provided by the user, it is also saved as encoder.pkl (resp. decoder.pkl).
- Parameters:
dir_path (str) – The path where the model should be saved. If the path does not exist, a folder will be created at the provided location.
- langvae.arch.vae.vae_nll_loss(recon_x: Tensor, x: Tensor, mu: Tensor, log_var: Tensor, z: Tensor, pad_token_id: int, beta: float, target_kl: float) → Tuple[Tensor, Tensor, Tensor][source]
Calculates the negative log-likelihood (NLL) loss for a Variational Autoencoder (VAE).
- Parameters:
recon_x (Tensor) – The reconstructed input tensor.
x (Tensor) – The original input tensor.
mu (Tensor) – The mean of the latent variable distribution.
log_var (Tensor) – The logarithm of the variance of the latent variable distribution.
z (Tensor) – The latent variable tensor.
pad_token_id (int) – The padding token ID for the input sequence.
beta (float) – A hyperparameter that controls the trade-off between reconstruction loss and KL divergence.
target_kl (float) – A target value for the KL divergence (cut-off).
- Returns:
Total NLL loss (reconstruction loss + KL divergence).
Average reconstruction loss.
Average KL divergence.
- Return type:
Tuple[Tensor, Tensor, Tensor]
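To make the roles of `pad_token_id`, `beta` and `target_kl` concrete, here is a minimal sketch of one common way to combine them. It is not the library's implementation. It assumes a diagonal Gaussian posterior with a standard normal prior, a reconstruction term that skips padding positions, and a cut-off that drops latent dimensions whose KL falls below `target_kl`. All function names are hypothetical, and plain Python lists stand in for Tensors.

```python
import math
from typing import List, Tuple


def gaussian_kl_per_dim(mu: List[float], log_var: List[float]) -> List[float]:
    """KL(N(mu, sigma^2) || N(0, 1)) for each latent dimension."""
    return [0.5 * (m * m + math.exp(lv) - lv - 1.0)
            for m, lv in zip(mu, log_var)]


def thresholded_kl(mu: List[float], log_var: List[float],
                   target_kl: float) -> float:
    """Sum only the dimensions whose KL exceeds target_kl (assumed cut-off rule)."""
    return sum(kl for kl in gaussian_kl_per_dim(mu, log_var) if kl > target_kl)


def masked_nll(token_log_probs: List[float], token_ids: List[int],
               pad_token_id: int) -> float:
    """Negative log-likelihood summed over non-padding positions."""
    return -sum(lp for lp, tok in zip(token_log_probs, token_ids)
                if tok != pad_token_id)


def sketch_vae_loss(token_log_probs: List[float], token_ids: List[int],
                    mu: List[float], log_var: List[float],
                    pad_token_id: int, beta: float,
                    target_kl: float) -> Tuple[float, float, float]:
    """Return (total, reconstruction, KL) for a single sequence."""
    recon = masked_nll(token_log_probs, token_ids, pad_token_id)
    kl = thresholded_kl(mu, log_var, target_kl)
    return recon + beta * kl, recon, kl
```

In this sketch a larger `beta` weights the KL term more heavily relative to reconstruction, and a larger `target_kl` lets more latent dimensions go unpenalized.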