Category Archives: Creating book html files

Refining with a text editor


At the end of the last blog, we had finished with our word processor document and created a new text editor document by copying and pasting the novel into it.  This example uses TextMate, an editor popular with programmers.

In case you don’t know, an .html file requires each paragraph to be wrapped with a starting <p> and an ending </p>.  A variation is to add a class, such as <p class="first_paragraph">, which you may notice I have done. The classes will be presented in the later blog on the style sheet.

Again, this is our starting document after some preliminary changes in Word, then pasted into our text editor:


<p class = "chapter_title">Prologue
<p class = "location"></p>Western Iranian Desert
The rising sun sent a shaft of piercing yellow light across the arid world of the desert, impacting the pale sand, instantly raising its temperature, unofficially marking the start of another hot day in western Iran.


As it stands, the document needs changes to function properly as .html.  One glaring issue is that the </p> paragraph markers are not in the correct place.  Since our document is many thousands of words in length, we need some help to correct it. That is the job of the editor: to save you a lot of hand work.  The first step is to use the advanced search and replace feature. This requires the regular expression mode of your search and replace window, as follows:

regular expression search                [\r\n]+</(.*?)>
replace        </$1>\n
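
If you would like to see what this search and replace does before running it on your whole book, here is a minimal sketch of the same pattern in Python. It assumes plain \n line endings in the result; the function name and the sample text are made up for illustration. Note that Python writes the captured group as \1 where the editor uses $1.

```python
import re

def pull_up_closing_tags(text: str) -> str:
    """Move a closing tag that starts a line up to the end of the previous line."""
    # [\r\n]+ matches the line break(s) before the tag; (.*?) captures the tag name.
    return re.sub(r'[\r\n]+</(.*?)>', r'</\1>\n', text)

# Hypothetical sample, not taken from the book.
sample = '<p class = "chapter_number">Chapter 1\n</p>The sun was low.'
print(pull_up_closing_tags(sample))
```

The closing tag ends up on the same line as the text it belongs to, and whatever followed it starts a new line.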

The above instructions are strangely wonderful and will result in the following changes throughout your document.



<p class = "chapter_number">Chapter 1
<p class = "chapter_title"></p>Target
<p class = "location"></p>Sunrise Beach County Park…Tacoma
1900 Hours</p>

The last remaining rays of the sun were being consumed by the trees on Vashon Island, across the narrow channel of fast moving water, and evening was extending its big hand across the little park, the light fading by the minute. The deep sound of a motorcycle moving slowly wafted in little bites, riding on humid air


That helped, but we now have paragraphs with  no endings and inappropriate paragraph endings within a sentence.  The next step is to find and replace unwanted paragraph endings as follows:

find      <p class = "location"></p>
replace        <p class = "location">

You will need to do the same anywhere else you see a </p> other than at the end of a paragraph.
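
If the same stray </p> shows up after many different opening tags, one way to catch them all at once is a small regular expression. This is only a sketch in Python under an assumption: that the unwanted </p> always sits immediately after an opening <p ...> tag, as it does in the location line above. The function name is hypothetical.

```python
import re

def drop_empty_paragraph_close(text: str) -> str:
    """Remove a </p> that immediately follows an opening <p ...> tag."""
    # (<p\b[^>]*>) captures the opening tag, with or without a class attribute.
    return re.sub(r'(<p\b[^>]*>)</p>', r'\1', text)

print(drop_empty_paragraph_close('<p class = "location"></p>Sunrise Beach County Park'))
```

A </p> that sits anywhere else in the middle of a sentence is not touched by this pattern and still needs the plain find and replace.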


The next step is to add paragraph endings where needed with the following.  Once again you need the regular expression search and replace (a check box in the search and replace window):


regular expression search      ^(.+)$
replace      <p>$1</p>
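
Here is a hedged sketch of this wrapping step in Python, in case you want to try it on a small sample first. One caution: taken literally, ^(.+)$ matches every non-empty line, including lines that already begin with a <p ...> tag. The sketch therefore adds a guard, (?!<p\b), that skips those lines. The guard is an added assumption about what you want and is not part of the search shown above; the function name is hypothetical.

```python
import re

def wrap_bare_lines(text: str) -> str:
    """Wrap each non-empty line in <p>...</p>, skipping lines that already open with <p."""
    # re.MULTILINE makes ^ and $ match at every line, as the editor does.
    # (?!<p\b) is the added guard; without it this is the plain ^(.+)$ search.
    return re.sub(r'^(?!<p\b)(.+)$', r'<p>\1</p>', text, flags=re.MULTILINE)

# Hypothetical sample text.
sample = '<p class = "chapter_title">Reunion</p>\n\nShe looked back.\nNobody was there.'
print(wrap_bare_lines(sample))
```

Blank lines are left alone because .+ needs at least one character to match.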


Those two actions resulted in the following:


<p class = "chapter_number">Chapter 2</p>
<p class = "chapter_title">Reunion</p>
<p class = "location">Interstate 5, South of Seattle</p>

<p>Special Agent April Chauncy couldn’t shake the feeling that she was being watched. She had used her FBI training, her best spycraft methods, to detect her pursuers but had no luck spotting them. It was probably a team, a very good one,


Looks a lot better, doesn’t it? But wait! We are adding another step to add flourish to your otherwise boring ebook.  We are going to change the first line of the first paragraph in each chapter. This will allow you to insert a fancy, large and distinctive letter to start the chapter and, as you will see, makes a great difference. Sorry. This time you have to do each chapter by hand by selecting the <p> which starts the paragraph and inserting the following:


<p class="first_paragraph"><span class="scrollfont cap">T</span>


By the way, the letter T between the > < is the first letter you want revised.  You have to delete the first letter in your text and place it between the > < for it to work. In this case I was using a T.
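
To make the change concrete, here is a minimal sketch in Python of what happens to one paragraph. It assumes the paragraph is a single line wrapped in a plain <p>...</p> and starts with a letter; the function name is hypothetical, and you would still choose by hand which paragraph opens each chapter.

```python
import re

def add_drop_cap(paragraph: str) -> str:
    """Move the first letter of a <p>...</p> paragraph into the drop-cap span."""
    match = re.match(r'<p>(\w)(.*)</p>\s*$', paragraph, flags=re.DOTALL)
    if match is None:
        # Not a plain <p> paragraph starting with a letter: leave it untouched.
        return paragraph
    first, rest = match.groups()
    return (f'<p class="first_paragraph">'
            f'<span class="scrollfont cap">{first}</span>{rest}</p>')

print(add_drop_cap('<p>The plane was nearly finished.</p>'))
```

A paragraph that opens with a quotation mark is returned unchanged, so those chapters need a manual edit.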

This accomplishes two tasks. It allows you to have special formatting for the first line. For example, you may want to follow a long tradition using no indentation for the first line in a chapter. And it allows you to define in your style sheet the font you wish to display for the first letter.  You have to trust me for now and will understand completely when you see the style sheet and, later, the complete document.  Your book should be looking like the following:


<p class = "chapter_number">Chapter 37</p>
<p class = "chapter_title">From Out Of Nowhere</p>
<p class = "location">Montgomery County Airpark, Gaithersburg, MD</p>

<p class="first_paragraph"><span class="scrollfont cap">T</span>he plane was nearly finished and the tools and men gone. It sat on the smooth concrete gleaming like a new toy, an 18,000 pound one. All unnecessary hardware had been stripped, and the markings removed and


Don’t be confused by the display in your browser. The lines are wrapped to fit your monitor but are actually full length in your text editor.

One last but important issue is the use of parentheses. The ones you used in your document are fine, but you should be aware that html statements use a special character parenthesis which is hard to detect visually.  You should do another search and replace for both right and left parentheses, especially if you find unexplained errors.


Hope this gives you a little help or at least more insight into creating your own ebook.  Really, it isn’t as hard as it looks, and once you get the hang of it, it goes quickly and gives you all the control to present a very lovely ebook.


Alexander Francis



From word processor to HTML: Creating your own ebook file


In a previous blog, I briefly discussed taking your book from a word processor file to an html file and then to a format suitable for upload,  publication, and distribution as an e-book.

Now that I have ten books in print and have also released them in ebook formats, including the Kindle and .epub versions, I have some comments about the process which may give my readers more insight into doing the work themselves. You should know that I am not a professional programmer or web developer and am likely, just like you, a struggling writer trying to get my books out to be read. In my case, I also created the covers and this web site you are now visiting.

I currently use Apple products after many years of using MS Windows.  Although I am a fan of Apple, I have issues with their primary word processor, Pages, which is easy to use but has created endless problems for me when I try to use my finished copy and produce a .pdf file for submission to Ingram Spark for printing and distribution.  Pages simply isn’t up to the task.  I have resorted to using a Mac version of MS Word for the final product, and yes, there are problems with that also,  but Word is capable of producing a fine finished product, given some caveats.

First of all, NEVER use tabs when creating your book. Always use Styles for formatting; the reason will be clear shortly. The same goes for spaces.  Both extra spaces and all tabs will have to be eliminated before you can create proper html code.  Speaking of HTML, you should know that a typical book uses very little sophisticated html programming code, and once you get the hang of it, you might find that you like it better than using word processors with all their complex settings.

About an ellipsis in your work: that’s the little row of three dots (…). It is not simply three periods, though that’s how you create it when typing.  An ellipsis is a single, unique character in the world of computer character sets.  Your word processor (hopefully) will take your three periods and convert them to a single ellipsis character. I bring this up because of a problem with Word and other word processors such as Pages.  Copying your text from Pages and pasting it into Word does not convert your three periods to a proper ellipsis; only typing the three periods in Word does.  Searching in Word for the ellipsis character then won’t find what you thought was a proper ellipsis.  Be warned.  This matters because of how your sentences are handled during formatting and how you wish to display your ellipses.  The correct form is controversial: you may choose a leading space, a following space, both, or none. All of these are more difficult to do if your word processor isn’t seeing the ellipsis as a single character.
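
If pasted text has left you with a mix of three periods and true ellipsis characters, the cleanup can be sketched in a few lines of Python. This is only an illustration, assuming the text is already available as a plain string; the function name is hypothetical. The single ellipsis character is Unicode U+2026.

```python
import re

ELLIPSIS = '\u2026'  # the single ellipsis character

def normalize_ellipses(text: str) -> str:
    """Replace each run of three periods with the single ellipsis character."""
    return re.sub(r'\.{3}', ELLIPSIS, text)

sample = 'Wait... what was that?'
fixed = normalize_ellipses(sample)
print(fixed)
print(fixed.count(ELLIPSIS))  # how many true ellipsis characters the text now holds
```

Once every ellipsis is the same single character, adding or removing the spaces around it becomes one more simple search and replace.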

Assuming you took my advice and used Word to produce the final version of your book, you now want to convert your document to an .html file. I will give you some examples of how I do it. There are more steps after conversion, but more on that later. A word of advice regarding Word: there is a feature that ‘automatically’ converts your document to an .html file. Save yourself the agony, because the result won’t work properly and will be large and overly complex.  There are also sites on the web which offer to convert your finished pdf file to an .html or even an .epub or Kindle format.  Perhaps. But they won’t do as good a job as you can, and their output will never be as beautiful as your hand-finished work, where you have control over the smallest detail.


Below is a small sample from my book Elapid.  I can’t display the fonts and style I actually used in the book on this web page, so you can either go look at the original or try to have some imagination.  My book was created using styles for the chapter numbers, the chapter titles and the location, as well as formatting for the first and following paragraphs. While still in Word, you should prepare your work as follows:

Eliminate your fancy dropped cap or graphic for the first letter in the chapter.

Remove any graphics, and as mentioned, all tabs and extra spaces.

Change your style settings to remove any use of italics, but don’t remove the italics used in the body of the text of your book.

Don’t forget to save your file using another name unless you want to lose your valuable original!

Original Example


Western Iranian Desert

The rising sun sent a shaft of piercing yellow light across the arid world of the desert, impacting the pale sand, instantly raising its temperature, unofficially marking the start of another hot day in western Iran. The snake understood, by experience and by genetics, that a sheltered place, hidden in shadow, would be necessary to survive another day, and

Using the Advanced Search tool (Find and Replace) in Word, select the Format button and search for Font…Italics.

Then use the following expression in the Replace With area:

<i>^&</i>

This will outline your italicized text for proper display using HTML.


Now we are going to do the same with all our style settings so that our style sheets in HTML can do the same thing your word processor was doing.

My Word style for my chapter titles was named  chapter_title and I search for all occurrences by selecting that style in the Find and Replace tool.  Then replace with the following:

<p class = "chapter_title">^&</p>
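
In Word's Replace With box, ^& stands for whatever text the search found, so the expression wraps each match in the tag. The idea can be sketched in Python as follows. The list of (style name, text) pairs is a hypothetical stand-in for a styled document; getting those pairs out of a real Word file is not shown here, and the function name is made up.

```python
def wrap_by_style(paragraphs, styles_to_wrap):
    """Wrap the text of each chosen style in <p class = "style">...</p>.

    paragraphs is a list of (style_name, text) pairs.
    """
    lines = []
    for style, text in paragraphs:
        if style in styles_to_wrap:
            # Same effect as replacing with <p class = "style">^&</p> in Word.
            lines.append(f'<p class = "{style}">{text}</p>')
        else:
            # Body text is left alone at this stage.
            lines.append(text)
    return '\n'.join(lines)

book = [
    ('chapter_number', 'Chapter 1'),
    ('chapter_title', 'Target'),
    ('body', 'The last remaining rays of the sun were being consumed.'),
]
print(wrap_by_style(book, {'chapter_number', 'chapter_title'}))
```

Using the Word style name as the class name is what lets the style sheet take over later.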


Do the same for your chapter number style and your location style (if any).  Your document will start to look like this:


Second Example:

<p class = "chapter_number">Chapter 1
</p><p class = "chapter_title">
<p class = "location"></p>Sunrise Beach County Park…Tacoma
1900 Hours
The last remaining rays of the sun were being consumed by the trees on Vashon Island, across the narrow channel of fast moving water, and evening was extending its big hand across the little park, the light fading by the minute. The deep sound of a motorcycle moving slowly wafted in little bites,

Not very pretty, is it?  And you may have noticed that the  formatting extends across lines.  We’ll fix that using the text editor, not to worry.

You are done with the word processor.  Select the entire document and copy it into memory.  Using a text editor (I use TextMate), paste your book into a new document.  Now the fun starts.


Look for another blog on what to do next!


Alexander Francis


About Fonts


When creating an e-book, you should not attempt to make the body of the text conform to your latest fancy font.  The platform and the reader’s preferences should, and mostly will, prevail anyway.  Save the fancy font for the front matter of your book and the chapter and title headings.  Other than making sure that it is legible, you can use a font which will add distinction to your e-book. The problem, though, is that many older and more primitive readers, such as the original Kindles, will not use your fonts and will respond to a change in font size or italics only.  Too bad for us, because the page then becomes bland and ordinary.

The other issue concerns that old bugaboo, copyright.  Yes, many fonts, even ones resident on your computer, may not be used commercially without authorization.  That means a fee, in case you are wondering. A further issue is that PostScript fonts will not be displayed by all e-readers.  You should use, therefore, a TrueType font.  TrueType was developed by Apple in the late 1980s and later licensed to Microsoft.

Don’t be discouraged, however, because there are many places on the web to obtain free, copyright-free fonts in the TrueType format. At least one such site has both free fonts and a converter for changing nearly any font to TrueType.  You have to be sure that the font you use from your own system is in fact free to use before you proceed.  Fonts that you select should be installed in the folder that you are using and will be called by your style sheet (more on this later).

By the way, sizing your font in HTML is different than in a word processor and once you get used to the idea and technique, far easier and more reliable.  As I mentioned previously, I recommend that you purchase Guido Henkel’s book, Zen of eBook Formatting.  He lays it out for you concerning font use.  We both recommend use of the em sizing method, as you shall see.

So far, in this and the previous blog, I keep postponing detail with the “more on this later.”  The reason is that I will present all the necessary information in one spot to keep you from digging through the text to find the next step.

There is one more thing I would like to mention about fonts.  The first letter of a new chapter is, by custom, larger and more florid, or at least descending into the following line. It does look better than simply starting the chapter with an ordinary letter.  I have two methods that have worked for me, and both have good and bad points.  The first method is to use a fancy font, larger in size, with no indent.  This method works well in most cases, but there are problems, particularly with the Apple readers. The other alternative also works well with your printed book. With this method you will need the imaging software we discussed earlier.  I use Photoshop, but other products will work.  The letter is converted to an image (jpg) and sized slightly larger than you will require.  After the image is inserted into your html file, it is sized using the em method, with your text starting on the same line.  I will include an example of this in the final blog on this topic.

Alexander Francis

Some Pointers Regarding Tools


As you know, the first thing you need to write a book is a good word processor, one that is simple enough to let you concentrate on your ideas rather than struggle with the software.  Therefore, it doesn’t matter at an early stage which word processor you choose. At a later date, you will be able to export the text to a more comprehensive and complex environment.

Bad habits are hard to break, but I’ll name a few that you would be wise to overcome while writing.  Never use tabs to format your text. They will give you problems later. Same with spaces.  I always struggle with the temptation to add two, sometimes three spaces after a period.  To me, it just looks better.  But they are hard to deal with later, so take my advice and avoid overuse of spaces.  You should learn to use the style sheet concept to set your text in the position you want.  You can rename the styles or create your own, but, whatever you do, be consistent from book to book.  If you use style names that are easily remembered, you will find the conversion to HTML a breeze. Don’t get carried away with chapter numbers or names during the early phases of writing your book, and postpone formatting for style and font until the last. It’s much easier later, and since you will be more consistent after the book is complete, it will lead to fewer errors.

After many years of using Windows, I switched to the Mac.  I love Apple products and wouldn’t go back for gold, but Pages has limitations.  You will find that you require a fully featured word processor as you near completion of your manuscript. An example is the use of headers and footers which differ on odd and even pages. That, of course, doesn’t apply to an e-book format which cannot use headers or footers.  Converting your document to an HTML file requires functions that Pages doesn’t supply.  On the advice of some very enthusiastic supporters of free software, I tried two alternatives: LibreOffice and OpenOffice.  Both are fully featured and free, but I just couldn’t get comfortable with either one of them. Against my inner instinct, I paid for a version of MS Word for my Mac. It crashes frequently for no reason, has a long list of quirks, is overly complex and exasperatingly poorly written for a product so long on the market.  Nevertheless, it works and, with patience and some insight, will produce a flawless copy of your book for print and make the transition to HTML possible.

My advice is not to use the feature Word offers for conversion to HTML.  It includes more text than you want to deal with and produces a file of ponderous size.  You simply want to go from a word processor file to a text file in HTML format.  Bringing in your work with styles already embedded saves a tremendous effort later. My workflow has been reduced to the following simplified overview:

1. Replace all double quotation marks with “.  Sounds ridiculous but the step will find errors that you didn’t know you have.  It turns out that there are three characters which look like quotation marks and are hard to detect when in a text file.  Same thing for single quotation  marks.

2. Remove any tab characters.

3. Remove spaces, except for single spaces.

4. Using the search and replace functions in Word, bracket the text marked with a certain style with the name of the style (more later on this).

5. Using the advanced search and replace, bracket each paragraph with the HTML code <p>your text here</p>.  The same goes for italics. More on this later also.

6. Select the entire document and copy and paste it into a text editor. I suggest TextMate.
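
The tab, space, and quotation mark steps in the list above can also be sketched in Python, for anyone who prefers to check a pasted text file rather than work in Word. This is an illustration under stated assumptions: tabs are turned into a single space rather than deleted outright, so words separated only by a tab do not run together, and the three look-alike quotation marks are taken to be the straight, left curly, and right curly characters. The function names are hypothetical.

```python
import re

QUOTE_LOOKALIKES = ('"', '\u201c', '\u201d')  # straight, left curly, right curly

def clean_whitespace(text: str) -> str:
    """Turn tabs into spaces, collapse runs of spaces, and trim each line."""
    text = text.replace('\t', ' ')
    text = re.sub(r' {2,}', ' ', text)
    return '\n'.join(line.strip(' ') for line in text.split('\n'))

def count_quote_marks(text: str) -> dict:
    """Count each double-quote look-alike so mixed usage stands out."""
    return {mark: text.count(mark) for mark in QUOTE_LOOKALIKES}

print(clean_whitespace('\tIt was late.  Very  late. '))
print(count_quote_marks('"Hi," she said. \u201cHello.\u201d'))
```

If the counts show more than one kind of quotation mark, that is the cue to run the replace in step 1.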

Now you are done with your word processor, but be sure to save any changes into a separate file for possible later corrections.

As you can see, so far the process is not challenging.  The next steps are more complex, but you are well on your way to the creation of an e-book.  Congratulations!  To be continued….

Alexander Francis

Making An E-book


When I watch old movies about an author and his/her interaction with a publisher or an agent, I get jealous.  That’s what I thought it would be like, and of course, it does still happen that way for some writers.  You remember the scene I’m talking about.  The writer comes in with either a large folder or perhaps a cardboard box full of handwritten or poorly typed pages of his recent masterpiece, and they take it from there. Simple.

Now I (we) know that it doesn’t happen that way any more, at least for the likes of me and others like me.  Publishers and agents want a sure thing, a name, notorious or otherwise famous.  You and I are on our own in a strange world of competition and marketing.

After my first two books were printed (an entirely different but interesting story), I was proud as a new father to hold them, the physical manifestation of all that labor. To me, there is nothing which will ever replace the printed book. The feel of it, the smell of the pages, and the fact that it will last nearly forever makes a book a very desirable thing.  Nonetheless, there is a trend toward the electronic transmission of music, video, and books which is not going to go away.  Likely, both forms will co-exist for some time yet, but as an author, you will sell more books in electronic form than printed and make more money doing it.

Creating a perfect printed book is possible, but a perfect digital version is elusive. There are several versions of e-readers, some incompatible with each other.  Creating a book which can be read by all of them requires making your book more simple than you might want. The fancy font, in particular, is nearly the first thing lost.  To be sure, the requirements vary not only as to the individual device being used but also its size, its color, resolution and also its users’ whims. There is no universal solution, only  solutions of compromise.

Some organizations will claim that a writer need only supply a .pdf or even a .doc version of their book, and their proprietary conversion software will turn it into an e-book.  Easy? It sure is tempting, but don’t fully believe it. To get the best array of compromises and to be sure what your work is going to look like, it is best to get yourself involved in the conversion process, and better still, to do it yourself.

You will need some help to start, most of which is available on the Web. By that, I mean the explanation of how to do a conversion, not the actual doing of it.  Be careful of where you send your valuable manuscript, and be sure and have it copyrighted before you release it to anyone. No exceptions.

If you are still interested, then I have some suggestions for reading. Obtain the Kindle book by Guido Henkel called “Zen of eBook Formatting.” Henkel gives an interesting and enthusiastic view of ebook creation.  I used this book to get started and to convince myself that I could do it. Be warned that his book is not a complete guide and has some omissions that will prevent you from actually producing a finished e-book on your own. It will, however,  give you a leg up and does provide a tremendous insight into the process.

In the  next installment of this series, I will give you a more detailed summary of problems I encountered, and my solutions.  Using this method, I have completed seven books so far, and they range in size from 75,000 to 135,000 words. Two are published and widely distributed for a variety of devices, and yes, both are also in printed format.


Alexander Francis