Convert e-books to txt
To convert e-books to plain text (txt) on Linux or other Unix-like systems, you can use the command-line utilities below, shown with example terminal commands:
File type | Text conversion utilities | Examples |
---|---|---|
.djvu | 1. djvutxt 2. ebook-convert | 1. djvutxt input.djvu output.txt 2. ebook-convert input.djvu output.txt |
.epub | 1. epub2txt 2. ebook-convert 3. unzip | 1. epub2txt input.epub > output.txt 2. ebook-convert input.epub output.txt 3. unzip -c input.epub > output.txt |
.doc | 1. catdoc 2. textutil (macOS) 3. ebook-convert | 1. catdoc input.doc > output.txt 2. textutil -convert txt input.doc -output output.txt 3. ebook-convert input.doc output.txt |
.pdf | 1. pdftotext 2. ebook-convert | 1. pdftotext input.pdf output.txt 2. ebook-convert input.pdf output.txt |
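The table can be folded into a small dispatcher that picks a utility from the file extension. This is a minimal sketch rather than a finished tool: the function names `pick_converter` and `to_txt` are made up for illustration, it assumes the first-listed utility for each format is installed, and it only recognizes the four lowercase extensions from the table.

```shell
# pick_converter: print the first-choice utility from the table for a
# given file name (hypothetical helper, not part of any package).
pick_converter() {
    case "$1" in
        *.djvu) echo djvutxt ;;
        *.epub) echo epub2txt ;;
        *.doc)  echo catdoc ;;
        *.pdf)  echo pdftotext ;;
        *)      echo "unsupported file type: $1" >&2; return 1 ;;
    esac
}

# to_txt: convert INPUT ($1) to OUTPUT ($2) with the utility chosen above.
to_txt() {
    tool=$(pick_converter "$1") || return 1
    case "$tool" in
        # These two take the output file as a second argument.
        djvutxt|pdftotext) "$tool" "$1" "$2" ;;
        # These two write to standard output, so redirect it.
        epub2txt|catdoc)   "$tool" "$1" > "$2" ;;
    esac
}
```

With these functions defined in the current shell, `to_txt input.pdf output.txt` would run `pdftotext input.pdf output.txt`, and `to_txt input.epub output.txt` would run `epub2txt input.epub > output.txt`.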
NOTES:
- There is a caveat for unzip: the generated output file will also include HTML tags, since an epub is a zip archive of HTML files, and unzip -c writes out every file in the archive. That's why I put it in 3rd position, for when you just want a quick and dirty solution.
- ebook-convert is a command-line utility that ships with the e-book library manager calibre; it can convert many other e-book formats to text.
- textutil accepts input files in formats including txt, html, rtf, rtfd, doc, wordml, and webarchive.
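For the unzip route, a rough way to get rid of the markup is to pipe the output through sed. This is only a sketch: `strip_tags` and `epub_to_txt_quick` are hypothetical helper names, the pattern is naive (it removes anything between `<` and `>` on a single line, so tags that span several lines and HTML entities such as `&amp;` are left behind), and the `'*.html' '*.xhtml'` patterns assume the book's chapters use those extensions.

```shell
# strip_tags: remove anything that looks like <...> from standard input,
# then drop lines that end up empty or whitespace-only.
strip_tags() {
    sed -e 's/<[^>]*>//g' -e '/^[[:space:]]*$/d'
}

# epub_to_txt_quick: print only the (X)HTML members of the epub ($1) to
# standard output (-p prints file contents without unzip's own messages),
# strip the tags, and save the result to $2.
epub_to_txt_quick() {
    unzip -p "$1" '*.html' '*.xhtml' | strip_tags > "$2"
}
```

Usage would be `epub_to_txt_quick input.epub output.txt`. If one of the two patterns matches nothing, unzip prints a warning to standard error, but the matching files are still processed.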