Due to popular demand, we've added text-to-speech support to Transformers.js!
https://github.com/xenova/transformers.js/assets/26504141/9fa5131d-0e07-47fa-9a13-122c1b69d233
You can get started in just a few lines of code!
import { pipeline } from '@xenova/transformers';
let speaker_embeddings = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/speaker_embeddings.bin';
let synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts', { quantized: false });
let out = await synthesizer('Hello, my dog is cute', { speaker_embeddings });
// {
// audio: Float32Array(26112) [-0.00005657337896991521, 0.00020583874720614403, ...],
// sampling_rate: 16000
// }
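The result is plain mono audio: a `Float32Array` of samples plus the rate they were generated at. As a rough sketch (the helper names `durationSeconds` and `peakAmplitude` are made up for illustration and are not part of Transformers.js), you can sanity-check an output like the one above before saving or playing it:

```javascript
// Stand-in for the pipeline result shown above: same field names,
// but silent samples instead of real synthesized speech.
const exampleOutput = { audio: new Float32Array(26112), sampling_rate: 16000 };

// Duration in seconds = number of samples / samples per second.
function durationSeconds({ audio, sampling_rate }) {
  return audio.length / sampling_rate;
}

// Largest absolute sample value; float audio normally stays within [-1, 1].
function peakAmplitude(audio) {
  let peak = 0;
  for (const sample of audio) peak = Math.max(peak, Math.abs(sample));
  return peak;
}

console.log(durationSeconds(exampleOutput)); // 1.632
```

So the 26112 samples in the example correspond to about 1.6 seconds of speech at 16 kHz.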
You can then save the audio to a .wav file with the wavefile package:
import wavefile from 'wavefile';
import fs from 'fs';
let wav = new wavefile.WaveFile();
wav.fromScratch(1, out.sampling_rate, '32f', out.audio);
fs.writeFileSync('out.wav', wav.toBuffer());
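The snippet above writes 32-bit float samples. If a tool you use only accepts 16-bit PCM WAV files, one possible approach is to convert the samples first. This is a minimal sketch, not something Transformers.js provides: `floatTo16BitPCM` is a hypothetical helper, and it assumes the samples are meant to lie in [-1, 1].

```javascript
// Hypothetical helper: convert float samples in [-1, 1] to signed 16-bit PCM.
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range values saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // The negative range goes down to -32768, the positive range up to 32767.
    pcm[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  }
  return pcm;
}

console.log(floatTo16BitPCM(new Float32Array([0, 1, -1, 2]))); // Int16Array [0, 32767, -32768, 32767]
```

With that in place, `wav.fromScratch(1, out.sampling_rate, '16', floatTo16BitPCM(out.audio))` should produce a 16-bit file in place of the `'32f'` one.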
Alternatively, you can play the file in your browser (see below).
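For in-browser playback you don't need a file at all. The sketch below uses the standard Web Audio API to play the samples directly. `playSamples` is a hypothetical helper name, and this is not necessarily how the example app does it:

```javascript
// Hypothetical helper: play mono Float32Array samples through a Web Audio context.
function playSamples(audioContext, samples, samplingRate) {
  // One channel, one frame per sample, at the rate the model produced.
  const buffer = audioContext.createBuffer(1, samples.length, samplingRate);
  buffer.copyToChannel(samples, 0);

  // A buffer source is single-use: create a new one for every playback.
  const source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.connect(audioContext.destination);
  source.start();
  return source;
}

// In a browser, after synthesizing `out` as shown earlier:
// const source = playSamples(new AudioContext(), out.audio, out.sampling_rate);
```

Browsers usually keep an `AudioContext` suspended until the user interacts with the page, so it's safest to call this from a click handler.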
Don't like the speaker's voice? You can choose another from the more than 7,000 speaker embeddings in the CMU Arctic dataset (see here)!
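If you host your own embedding file, it can help to validate it before handing its URL to the pipeline. The sketch below assumes the file holds raw float32 values and that a SpeechT5 speaker embedding has 512 of them (2048 bytes). Both are assumptions to check against your own file. `parseSpeakerEmbedding` and `loadSpeakerEmbedding` are hypothetical helpers, not part of Transformers.js.

```javascript
// Assumption: a SpeechT5 speaker embedding is a 512-value float32 x-vector.
const EMBEDDING_SIZE = 512;

// Hypothetical helper: interpret raw bytes as a speaker embedding, or throw.
function parseSpeakerEmbedding(arrayBuffer) {
  const expectedBytes = EMBEDDING_SIZE * Float32Array.BYTES_PER_ELEMENT;
  if (arrayBuffer.byteLength !== expectedBytes) {
    throw new Error(`Expected ${expectedBytes} bytes, got ${arrayBuffer.byteLength}`);
  }
  return new Float32Array(arrayBuffer);
}

// Hypothetical helper: download an embedding file and validate it.
async function loadSpeakerEmbedding(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Failed to fetch ${url}: ${response.status}`);
  return parseSpeakerEmbedding(await response.arrayBuffer());
}
```

Once the file checks out, pass its URL as `speaker_embeddings`, exactly as in the example above.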
Note: we currently support TTS only with SpeechT5, but we plan to add other models such as Bark and MMS in the future!
To showcase the power of in-browser TTS, we're also releasing a simple example app (demo, code). Feel free to make improvements to it... and if you do (or end up building your own), please tag me on Twitter!
https://github.com/xenova/transformers.js/assets/26504141/98adea31-b002-403b-ba9d-1edcc7e7bf11
< and > symbols generated from docs in https://github.com/xenova/transformers.js/pull/335
Full Changelog: https://github.com/xenova/transformers.js/compare/2.6.2...2.7.0