I tried Kotobee Author and found that importing a docx file produces one large chapter.
How can I split my book into chapters by style or pattern? (OK, I can copy and paste the text chapter by chapter, but I'd like to automate this process.)
Ideally this would happen automatically during import.
And how do I merge two chapters? (Again: copy-paste doesn't count!)
In brief, the way chapters work is rather silly. You ought to work on it!
Thanks Stan for the feedback. I understand what you're saying.
If it's easy to convert Word to HTML, then it's easy to just paste the HTML into the chapter's source mode. The source mode displays the HTML behind the chapter, which you can edit or replace entirely. Actually, through the file manager you have access to every single chapter (html) file. The great thing about using HTML rather than PDF is that you will maintain the reflowable layout of the chapter.
But you're right, we should include an easier option to have all this done automatically, by importing a Word file.
No, you are missing the point.
Many (if not most) editors and publishers who work on layout will work on the entire book. Word has a ONE CLICK export to HTML, filtered. It creates ONE LARGE HTML with heading tags. Pasting this large, SINGLE FILE HTML into your source mode does nothing for the user. Having to cut and paste 36 times for a novel is tedious and error prone. Programs are made to automate tedious, error-prone things.
Of course HTML maintains the reflow...that is the point of HTML which is ALL THAT AN EPUB IS (fundamentally).
Your program makes adding some epub3 elements a bit easier, which I like, but to use this I would do the following: Save Word to Filtered HTML. Use Calibre to convert to an EPUB, AUTOMATICALLY creating chapters and 95% of what you want. Open in Sigil and check some things like indenting and metadata. If you want to add epub3, either use Sigil and edit elements by hand, or use your program to add some epub3 elements. So, for my process this might help, but only occasionally, as I still need to use the other process for a FAST, EFFICIENT epub (or mobi, including, I might add, the new Amazon format, which is really a modified epub3).
This was really helpful. That's going on our to-do list as high priority.
But may I ask, what distinguishes the different chapters in the 'one-large' html file, once it's exported from Word? How does Calibre (for example) know the start and end of each chapter? I mean, the entire Word file can be intended to be a single chapter, or multiple chapters. Or is the user expected to use page breaks, or to insert some special placeholder text that Calibre detects? Would love to hear from you about this.
Can you remind me of the extension of the new Amazon format?
Thanks again for sharing this. Love to hear from actual users!
Just letting you know that this feature now exists.
Calibre (and the online programs such as Amazon and Barnes&Noble) take the H1-H3 HTML tags to indicate chapters (there are some options in Calibre, as I recall, to say how many levels deep you want to go). Word automatically takes the internal "Heading 1" through "Heading 3" styles and creates H1-H3 tags when creating a filtered html file.
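To make the heading idea concrete, here is a minimal Python sketch of cutting one large HTML body into chunks at heading tags. It is only an illustration of the principle, not how Calibre or Kotobee Author actually do it. The function name `split_at_headings` and the `max_level` parameter are made up for this example, and it uses a simple regular expression rather than a real HTML parser.

```python
import re

def split_at_headings(body_html, max_level=1):
    """Split an HTML body string into chunks that each start at a heading.

    max_level=1 splits at <h1> only; max_level=3 splits at <h1> to <h3>.
    (Hypothetical helper for illustration, not any tool's real API.)
    """
    heading_open = re.compile(rf"<h[1-{max_level}]\b[^>]*>", re.IGNORECASE)
    starts = [m.start() for m in heading_open.finditer(body_html)]
    if not starts:
        return [body_html]  # no headings found: keep everything as one chapter
    chunks = []
    # Anything before the first heading (title page etc.) becomes its own chunk.
    if body_html[:starts[0]].strip():
        chunks.append(body_html[:starts[0]])
    for begin, end in zip(starts, starts[1:] + [len(body_html)]):
        chunks.append(body_html[begin:end])
    return chunks

sample = ("<p>Title page</p>"
          "<h1>Chapter 1</h1><p>One.</p>"
          "<h1>Chapter 2</h1><p>Two.</p>")
chapters = split_at_headings(sample)
for chapter in chapters:
    print(chapter)
```

Each chunk could then be written out as its own chapter file. A real importer would also need to carry over the `<head>` and stylesheet, which this sketch ignores.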
The same thing occurs with RTF, which closely resembles HTML.
The programs then create an HTML file for each section and wrap it all up into the zip file that is the ePub (as I am sure you know, just rename an epub to zip and you can see all the files).
The only issue with Word's export is that it does put in a few bits of extraneous data, but it's not bad. If one is very good at using the same tags for formatting, then it also makes for a nice format (e.g. FirstParagraph, where you may want to NOT indent the first line).
Amazon. When I say "new" I mean a couple of years old at this point. Mobi (which is what AZW uses) is really an old database format (again, as you must know).
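The "an epub is just a zip" point can be checked without renaming anything, using Python's standard `zipfile` module. This is a small sketch: `list_epub_files` is a name invented here, and the in-memory archive is only a stand-in (a real EPUB also contains a container file and a package file, which are left out).

```python
import io
import zipfile

def list_epub_files(epub_file):
    """Return the names of the files stored inside an EPUB (a zip archive)."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()

# Build a tiny stand-in archive in memory so the sketch runs without a real book.
# Assumption: the file names below are illustrative, not a complete valid EPUB.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("OEBPS/chapter1.xhtml", "<h1>Chapter 1</h1>")
    archive.writestr("OEBPS/chapter2.xhtml", "<h1>Chapter 2</h1>")

names = list_epub_files(buffer)
print(names)
```

With a real book you would pass the path to the `.epub` file instead of the in-memory buffer.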
What I am talking about there is the AZW3 or KF8. Those are not Mobi. They are closer to epub3.
Then there is the KFX, which has some eInk settings, but is really just a downloader, not really a format.
I am NOT an expert on this. Just random bits.
Got it. Thanks. Hopefully in the next update (or the next) we'll have an HTML import function.
This is an article we've written for our blog (the very first article actually): https://www.kotobee.com/blog/understanding-the-epub-format/
I'd appreciate it if you could share it. We tend to explain EPUB in absolute layman's terms, actually using a fun analogy so that newbies in the field can easily associate and remember the different components of epub. It came to mind when you said to rename an epub to zip to see all the files.
Thanks once again for your help.
Why don't you convert your Word document into a PDF document and import it? This way it will split each page into a page/chapter by itself.
Let me know if that works for you.
Usually the h1 tag marks the chapter heading, but that is configurable in Calibre and in Jutoh too.
Moreover, in Jutoh you can define a text pattern (such as 'chapter') or any CSS style to use as the split marker.
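As a rough idea of what splitting by a text pattern means, here is a short Python sketch that cuts plain text wherever a line starts with the word "chapter". It does not reflect how Jutoh implements it. The function `split_by_pattern` and its default pattern are assumptions made for this example.

```python
import re

def split_by_pattern(text, pattern=r"^chapter\b.*$"):
    """Split text into parts, each starting at a line matching `pattern`.

    Hypothetical helper: the default pattern treats any line beginning
    with the word 'chapter' (any case) as the start of a new chapter.
    """
    marker = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    starts = [m.start() for m in marker.finditer(text)]
    if not starts:
        return [text]  # no marker found: keep the text as one chapter
    parts = []
    # Text before the first marker (front matter) is kept as its own part.
    if text[:starts[0]].strip():
        parts.append(text[:starts[0]])
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        parts.append(text[begin:end])
    return parts

text = "Chapter 1\nIt began.\nChapter 2\nIt ended.\n"
parts = split_by_pattern(text)
print(len(parts))
```

A pattern this loose will also fire on an ordinary sentence that happens to begin a line with "Chapter", so in practice a stricter pattern (for example one that requires a number) is probably safer.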
Well, I tried that too. But it's not very helpful to import a novel page by page.
And the main problem is managing (merging, splitting) chapters AFTER import!
If you want the best practice in creating a great ebook experience, then create the content as reflowable chapters manually. Sorry, this means that you will need to copy bits and pieces from your Word document, and create separate chapters in Kotobee Author for this. It may sound like a headache, but trust me, it will pay off in the end.
I trust you, but I've been making ebooks with Jutoh for years. It splits Word (and odt, html, txt) documents by pattern or by style during import. After import you can reorganize chapters, and merge or split them. You don't have to struggle with separate pages.
It's a lot easier, trust me! :)
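For what it's worth, merging and splitting after import are simple operations once the chapters are held as a list of HTML strings. The Python sketch below assumes exactly that representation. `merge_chapters` and `split_chapter` are hypothetical names, not functions of Jutoh or Kotobee Author.

```python
def merge_chapters(chapters, index):
    """Merge chapter `index` with the chapter after it; returns a new list."""
    if not 0 <= index < len(chapters) - 1:
        raise IndexError("need a chapter and a following chapter to merge")
    merged = chapters[index] + chapters[index + 1]
    return chapters[:index] + [merged] + chapters[index + 2:]

def split_chapter(chapters, index, offset):
    """Split chapter `index` in two at character `offset`; returns a new list."""
    chapter = chapters[index]
    return chapters[:index] + [chapter[:offset], chapter[offset:]] + chapters[index + 1:]

# Assumption: a chapter is just its HTML body as a string.
book = ["<h1>One</h1><p>a</p>", "<p>b</p>", "<h1>Two</h1><p>c</p>"]
merged_book = merge_chapters(book, 0)
# Splitting at the end of the first original chapter undoes the merge.
restored_book = split_chapter(merged_book, 0, len(book[0]))
print(len(merged_book), len(restored_book))
```

A real editor would split at an element boundary rather than a raw character offset, so that no tag gets cut in half.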
In Kotobee Author, a chapter can be as long as you want, since it's vertically scrollable. We haven't done much work on Word import, only on PDF import.
I apologize if how it is currently isn't very convenient for you.
Well, in Jutoh (and in Word too) a chapter can be arbitrarily long, as long as it fits in the 300k epub standard.
The problem is not solely Word import. If I could handle the chapters manually more easily afterwards, I wouldn't care about import any more.
But the page-by-page pdf import causes breaks in the middle of sentences, which is unacceptable.
It doesn't make ebook creation easier; it makes it more troublesome instead.