
What Is ebook2audiobook?
If you’ve ever wanted to listen to your favourite e‑books instead of reading them, ebook2audiobook may be the tool you’ve been waiting for. Created by developer Drew Thomasson, this open‑source project converts e‑books into audiobooks. It splits the book into logical chapters, adds metadata, and supports voice cloning so you can personalise the narration. Speech is generated by one of several text‑to‑speech engines: Coqui XTTSv2, Bark, Vits, Fairseq, YourTTS or Tacotron2.
Key Features and Benefits
High‑quality text‑to‑speech – ebook2audiobook taps into advanced models such as Coqui XTTSv2 and Fairseq for natural‑sounding narration.
Optional voice cloning – you can provide your own voice sample and have the audiobooks narrated in your own voice.
Chapter splitting – the tool automatically splits your e‑book into chapters, making it easier to navigate.
Multilingual support – more than 1,100 languages and dialects are supported. Built‑in languages include English, Spanish, Chinese, French, German and many more.
Lightweight – designed to run on systems with as little as 4 GB of RAM.
Cross‑platform – works on Windows, macOS and Linux. The project also provides Docker images and Gradio interfaces for quick deployment.
Supported Languages
ebook2audiobook covers a wide array of languages, from European languages like English, Spanish, French, German and Italian to Asian languages such as Chinese, Japanese, Hindi and Korean, as well as languages that speech tools support less often, like Yoruba, Swahili and Persian. A full list of supported languages and dialects is available in the project’s documentation.
System Requirements
The developers recommend at least 4 GB of RAM, though 8 GB is ideal for smoother generation. You can run the tool on Intel, AMD or ARM CPUs, and it supports Nvidia, AMD and Intel GPUs for hardware acceleration. Docker deployment requires virtualisation to be enabled on Windows systems.
Getting Started: Installation and Usage
Clone the Repository
To start using ebook2audiobook locally, clone the GitHub repository and navigate into it:
git clone https://github.com/DrewThomasson/ebook2audiobook.git
cd ebook2audiobook
Launch the Gradio Web Interface
The easiest way to run the tool is through its web UI. After cloning the repository, run the appropriate launch script for your operating system:
Linux/macOS – ./ebook2audiobook.sh
Windows – ebook2audiobook.cmd
Mac Launcher – double‑click Mac Ebook2Audiobook Launcher.command
This opens a local Gradio application where you can upload your e‑book, select a TTS engine, optionally provide a voice‑cloning file, and choose the language. A public shareable link can be created with the --share flag.
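If you start the web UI from your own scripts, the choice of launch script can be automated. The sketch below is only an illustration: the helper name `launch_command` is hypothetical, and the mapping simply restates the scripts listed above, keyed by the values that Python's `platform.system()` returns.

```python
import platform

# Launch scripts named in this article, keyed by platform.system() values.
LAUNCH_SCRIPTS = {
    "Linux": "./ebook2audiobook.sh",
    "Darwin": "./ebook2audiobook.sh",  # macOS reports itself as "Darwin"
    "Windows": "ebook2audiobook.cmd",
}


def launch_command(system=None, share=False):
    """Return the argument list that starts the web UI on the given OS.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    system = system or platform.system()
    if system not in LAUNCH_SCRIPTS:
        raise ValueError(f"no launch script known for {system!r}")
    command = [LAUNCH_SCRIPTS[system]]
    if share:
        command.append("--share")  # ask for a public shareable link
    return command


if __name__ == "__main__":
    print(" ".join(launch_command(share=True)))
```

Run from the cloned repository directory, the returned list can be handed to `subprocess.run(command, check=True)`.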
Running Headless for Automation
If you prefer to convert books without a GUI, you can run ebook2audiobook in headless mode using command‑line parameters. Below are some examples for Linux/Mac and Windows:
# Linux or Mac
./ebook2audiobook.sh --headless --ebook <path_to_ebook_file> --voice <path_to_voice_file> --language <language_code>
# Windows
ebook2audiobook.cmd --headless --ebook <path_to_ebook_file> --voice <path_to_voice_file> --language <language_code>
The --ebook argument specifies the path to your e‑book; --voice is optional and points to a WAV file for voice cloning; and --language sets the ISO 639‑1 or ISO 639‑3 language code. Each flag begins with two plain hyphens.
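Headless mode lends itself to batch jobs. The following is a minimal sketch, not part of the project: `headless_command` and `batch_commands` are hypothetical helpers that assemble the flags described above. The `.epub` default is an assumption made for illustration, and the language check only verifies the two- or three-letter shape of an ISO 639 code, not whether the tool supports that language.

```python
import re
from pathlib import Path

# ISO 639-1 codes have two letters, ISO 639-3 codes have three.
LANGUAGE_CODE = re.compile(r"[a-z]{2,3}")


def headless_command(ebook, language, voice=None, script="./ebook2audiobook.sh"):
    """Build the headless command line for one book as an argument list."""
    if not LANGUAGE_CODE.fullmatch(language):
        raise ValueError(f"expected a 2- or 3-letter language code, got {language!r}")
    command = [script, "--headless", "--ebook", str(ebook), "--language", language]
    if voice is not None:
        command += ["--voice", str(voice)]  # optional voice-cloning sample
    return command


def batch_commands(folder, language, voice=None, suffixes=(".epub",)):
    """Return one command per matching e-book in `folder`, sorted by name."""
    books = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in suffixes)
    return [headless_command(book, language, voice) for book in books]
```

Each returned list can be passed to `subprocess.run(command, check=True)` from the repository directory. Building argument lists rather than one shell string avoids quoting problems with spaces in file names.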
Using Custom Models
ebook2audiobook is compatible with custom TTS models packaged as zip files. A custom model must contain mandatory files such as config.json, model.pth, vocab.json and ref.wav. Use the --custom_model option to load a model:
./ebook2audiobook.sh --headless --ebook <ebook_file_path> --voice <voice_file_path> --language <language_code> --custom_model <custom_model_path>
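Before starting a long conversion, it can help to check that a model archive contains the files named above. This is a rough sketch using Python's standard `zipfile` module; the helper `missing_model_files` is hypothetical, and matching by base name (so that files inside a top-level folder still count) is an assumption, since the tool may expect them at the archive root.

```python
import zipfile
from pathlib import PurePosixPath

# Mandatory files listed in this article for a custom model archive.
REQUIRED_FILES = {"config.json", "model.pth", "vocab.json", "ref.wav"}


def missing_model_files(zip_path):
    """Return the sorted names of required files absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Zip entries always use "/" separators; keep only the file names.
        present = {PurePosixPath(name).name for name in archive.namelist()}
    return sorted(REQUIRED_FILES - present)
```

An empty list means every listed file was found; anything else names what still needs to be added to the zip.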
Advanced Options and Parameters
For power users, the tool exposes dozens of optional flags to fine‑tune performance. You can specify the device (--device cpu/gpu/mps), choose a TTS engine (--tts_engine with XTTSv2, BARK, VITS, FAIRSEQ, TACOTRON2 or YOURTTS), adjust generation temperature, sampling and beam search settings, set the output format (MP3/OGG/WAV), or specify the output directory.
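A script that exposes these options may want to reject typos before handing them to the tool. The sketch below is one way to do that, with a hypothetical helper `tuning_flags`. The allowed values are copied from this article; whether the command line accepts other spellings or casings is not something the article states, so treat the strict matching as an assumption. Only the two flags named above are covered.

```python
# Values as listed in this article; the real CLI may accept others.
DEVICES = ("cpu", "gpu", "mps")
TTS_ENGINES = ("XTTSv2", "BARK", "VITS", "FAIRSEQ", "TACOTRON2", "YOURTTS")


def tuning_flags(device=None, tts_engine=None):
    """Return extra command-line arguments for the chosen device and engine."""
    flags = []
    if device is not None:
        if device not in DEVICES:
            raise ValueError(f"device must be one of {DEVICES}, got {device!r}")
        flags += ["--device", device]
    if tts_engine is not None:
        if tts_engine not in TTS_ENGINES:
            raise ValueError(f"tts_engine must be one of {TTS_ENGINES}, got {tts_engine!r}")
        flags += ["--tts_engine", tts_engine]
    return flags
```

The returned list is meant to be appended to a headless command before it is run.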
Running via Docker
Users who prefer containerised deployments can run ebook2audiobook through Docker. Run the prebuilt image with CPU or GPU support:
docker run --pull always --rm -p 7860:7860 athomasson2/ebook2audiobook # CPU
docker run --pull always --rm --gpus all -p 7860:7860 athomasson2/ebook2audiobook # GPU
These commands start the Gradio interface on port 7860. A separate Dockerfile is also provided for custom builds.
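The container can take a while to become reachable, so a script that starts it may want to wait before opening the interface. A minimal sketch, assuming the port mapping shown above (7860 on the local machine); `wait_for_port` is a hypothetical helper, and a successful TCP connection only shows that something is listening, not that the interface has finished loading.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=7860, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough here.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet, try again shortly
    return False
```

For example, `wait_for_port(timeout=120)` could be called after `docker run`, and the browser pointed at http://localhost:7860 once it returns True.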
Why Use ebook2audiobook?
Accessibility – People with visual impairments or reading difficulties can easily convert e‑books to audio without paying for commercial narrations.
Language learning – Listen to books in various languages to improve pronunciation and comprehension.
Personalisation – Voice cloning lets you or a loved one become the narrator of any book.
Open source & free – The project is licensed under Apache 2.0, is built by a vibrant community, and has received over 13,000 stars on GitHub.
Where to Try It
You can explore ebook2audiobook directly on its GitHub repository. A live web demo is available via Hugging Face Spaces, and there are Colab notebooks for GPU acceleration and a ready‑to‑run Docker image. Detailed installation instructions, a list of supported languages, and fine‑tuned models are provided in the README and wiki.