word count discrepancy

marquesa Creator
Hi all, I'm formatting a Word doc into ebook format, ready for uploading to Lulu (I've successfully done this twice before). I've applied the appropriate heading styles and checked the document map, and everything looks fine. However, I've noticed that the newly formatted doc shows a lower word count than the source doc: 6 fewer words, to be exact. I can't for the life of me see any missing words, and I definitely haven't accidentally deleted any, so I'm wondering if it's a formatting error I've overlooked. Has anyone had this before? Thanks.


  • Keeves Author
    Here's what I do in this kind of situation. (Warning: This is slow and tedious, but I don't know a better way.)

    Make copies of the old and new versions. Split each in half. Let's call them Old-A, Old-B, New-A, and New-B. The total words in Old-A plus Old-B should match the original, and the total of New-A plus New-B should be 6 fewer than that. If so, then we have verified that you didn't mess anything up.

    Now, the big questions are: Do Old-A and New-A match? Do Old-B and New-B match? If you are lucky, one or the other will be off by exactly one word. Repeat the process until the sections are small enough that the missing word will become obvious.

    Hopefully, you'll get to a point where you understand how this happened. For example, you might find a word that somehow got split into two. Or you may see a stray character that ended up somewhere. Once you find it, it may give you enough clues so that you can find the other five missing words more easily.

    Good luck!
  • A_A_Cain Oz Creator
    Just "compare" your two files using Word - any differences will be shown either as deleted text in the second version, or added text.

    Note: there might not be any difference at all. I've found, as have other writers I chat to, that many word processing packages can't actually count accurately. A file saved in docx, say, might have a different word count when compared to the same file saved in rtf. Besides, you can 99.98% guarantee you won't have caught every error in the copy. I don't think you can get 100% perfect copy, no matter how many times you proofread - something always jumps out at you the first time you read your print copy. It's Murphy's Law, I reckon.
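Outside Word, a rough equivalent of a word-level compare can be sketched with Python's standard `difflib` module. This assumes both versions have been saved as plain text, and `word_diff` is a made-up helper name. It splits on whitespace, which is not necessarily how Word decides what counts as a word, so an empty result does not prove the two word counts will agree.

```python
import difflib


def word_diff(old_text, new_text):
    """Return (operation, old words, new words) for every span that differs."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only deletions, insertions and replacements
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


print(word_diff("the quick brown fox", "the quick fox"))
```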
  • marquesa Creator
    Here's what I found by accident. Believing I'd unwittingly deleted text, I started over from scratch. As I've done successfully in the past to create a suitable doc for Lulu's ebook converter, I copied and pasted the text from the source doc into Notepad. From there I copied and pasted it into a new Word doc, clicked on show formatting symbols and started formatting all over again, which includes deleting all the extra space symbols between breaks in the text etc. (you know the ones, they look like music note symbols). I noticed that another symbol amongst these, barely noticeable, resembles a tiny zero or letter 'o'. I'd never really paid them any attention before, but every time I deleted one the word count dropped by one. I still don't know what this symbol is or does, but apparently Word counts each one as a word, even though they are invisible when show formatting is turned off. So what are they? Has anyone else noticed them before when formatting?
  • That's not a symbol I'm familiar with. But if you had some unique formatting in the source doc (like an uncommon font or WordArt), Notepad may have stripped it. Pasting into the fresh doc could then leave Word not knowing what to do with content that contains unformatted text or other objects.

  • Maggie Creator
    It replaces the spacing.

    What you can do is copy one of them in Word, with the formatting symbols turned on so you can see them. Then, in Find and Replace, paste it into the Find box, press the space bar once in the Replace box, and click on Replace All.

    If you leave them as is you will have formatting problems.

    * Print Book and Ebook Formatting Services http://www.custom-book-tique.com/

  • marquesa Creator
    What I also noticed is that one appears at the end of every chapter in my book, just in front of the first soft return I've used. Now I've deleted every one of them, one at the end of each of the 29 chapters. So now I have a word count of exactly 29 fewer than the source doc, but no actual missing text as far as I can see. Strange, but sorted I think!
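The thread never pins down what the tiny circle is. One possibility (an assumption, not something confirmed above) is a non-breaking space, which Word shows as a small circle when formatting marks are on. If the text has been saved as a plain-text file, a short script can list any unusual invisible characters and where they sit. The sketch below uses only Python's standard `unicodedata` module; `find_invisible` and `replace_invisible` are made-up names, and the second function mirrors the Find and Replace suggestion by swapping each hit for an ordinary space.

```python
import unicodedata

# Whitespace that is expected in plain text and therefore ignored.
ORDINARY = {" ", "\n", "\r", "\t"}


def find_invisible(text):
    """Return (position, code point, Unicode name) for each unusual invisible character."""
    hits = []
    for pos, ch in enumerate(text):
        if ch in ORDINARY:
            continue
        # Z* = separators, Cc = control characters, Cf = format characters
        category = unicodedata.category(ch)
        if category.startswith("Z") or category in ("Cc", "Cf"):
            hits.append((pos, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")))
    return hits


def replace_invisible(text):
    """Replace every character reported by find_invisible with an ordinary space."""
    found = {pos for pos, _, _ in find_invisible(text)}
    return "".join(" " if i in found else ch for i, ch in enumerate(text))


sample = "The end.\u00a0\nChapter 2"  # hypothetical chapter ending with a stray character
print(find_invisible(sample))
```

Knowing the code point makes it easier to look up what the character is before deciding whether to delete it or replace it.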