AutoencoderOobleck

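The Oobleck variational autoencoder (VAE) model with KL loss was introduced by Stability AI in Stability-AI/stable-audio-tools and Stable Audio Open. In 🤗 Diffusers, the model is used to encode audio waveforms into latent representations and to decode latent representations back into audio waveforms.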

The abstract from the paper is:

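Open generative models are vitally important for the community, allowing for fine-tunes and serving as baselines when presenting new models. However, most current text-to-audio models are private and not accessible for artists and researchers to build upon. Here we describe the architecture and training process of a new open-weights text-to-audio model trained with Creative Commons data. Our evaluation shows that the model's performance is competitive with the state-of-the-art across various metrics. Notably, the reported FDopenl3 results (measuring the realism of the generations) showcase its potential for high-quality stereo sound synthesis at 44.1kHz.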

AutoencoderOobleck

[[autodoc]] AutoencoderOobleck
    - decode
    - encode
    - all

OobleckDecoderOutput

[[autodoc]] models.autoencoders.autoencoder_oobleck.OobleckDecoderOutput
