This morning I was working on one of my many writing projects and went spelunking on Amazon. I came across some very interesting metrics they keep on their book offerings, all under the seemingly innocent link texts “concordance” and “text stats.” “Concordance”: yup, I can see that being a great traffic generator… not. Unless you’re curious by nature; are a geek or an English or linguistics student; or clicked the link by accident. (Yup, I know you were going for the link to the enlarged version of the front cover.)
But as all of the above reasons except the last apply to me, let me share that those links are well worth a look. Take concordance. You’re probably more familiar with it in the form of the word clouds that services like Flickr and Technorati produce from the tags and text in the content they analyse. The Technorati concordance for this blog can be seen on the right. It’s an interesting snapshot, even for me, the author of the words that resulted in this cloud. (I didn’t think my fandom for Keanu Reeves was that dominant! Now that I’ve mentioned his name again, it will stay elevated in my verbiage profile over, say, Johnny Depp, who has also been mentioned in posts past.)
The word cloud above is generated from my Flickr content. You can see traces of my globetrotting, something about a wedding, honeymoon and a huber, and “chris”? Even I am mystified by that one.
But turn the same technology onto the text of entire books and you get an equally interesting take. For example, I’m currently reading Stanley Bing’s “Sun Tzu Was a Sissy.” According to this book’s concordance, it’s really about people and war, and it doesn’t mention the word “sissy” often enough to be listed. All true. (You can see the list here.)
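For the curious, a concordance of this kind boils down to counting words. Here is a minimal sketch in Python, assuming a simple letters-only definition of a word and a tiny, made-up stop-word list; the function name `concordance` and the sample sentence are mine for illustration, and this is not how Amazon or Technorati actually build theirs.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real concordance would use a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "was"}

def concordance(text, top_n=100):
    """Return the top_n most frequent words in text, ignoring stop words."""
    # Lowercase the text and pull out runs of letters (allowing one inner apostrophe).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

sample = "War is about people. People make war, and people end war."
for word, count in concordance(sample, top_n=3):
    print(word, count)
```

A word cloud is then just this list drawn with the font size scaled to each count.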
Move on to the next text analytics feature that Amazon offers through its Search Inside! program and things get not only interesting, but useful. Modestly titled “text stats,” it ranks the book on readability, complexity and other quantitative measures, including words per dollar and words per pound (a new take on getting value for your reading money!).
It is the first two that are utterly fascinating and have real business applications. Personally, I’d love to be able to feed the text of one of my projects through the same engine and see what it spat out. Then I could do a comparison of how my text performed versus others in the same category appealing to the same audience.
And think about feeding in a document from the organization you work for! At last, a benchmarking tool that lets you take a dense corporate treatise or a PR vanity project alike and let the measures speak for themselves. “As you can see, the Fog Index (which indicates the number of years of formal education required to read and understand a passage of text) is 34, showing that the reader of our proposed text would have to have the equivalent of a triple doctorate.” Alternatively, the good news might be that your fluff piece is accessible to anyone who finished elementary school.
Unless the objective is to deliberately limit readership, this is one bell curve you want to land well in the middle of. (And if obfuscation is the aim, the tool is equally useful: “I need this revised to include 12% more complex words and an average of at least 5 syllables per word by the end of the day!”)
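If you want to run your own text through something similar, the measures mentioned above can be roughed out in a few lines of Python. This is only a sketch: it uses the standard Gunning Fog formula, 0.4 × (average words per sentence + percentage of complex words), treats any word of three or more syllables as complex, and counts syllables with a crude vowel-group heuristic. The names `count_syllables` and `text_stats` are my own, the usual Fog exclusions for proper nouns and common suffixes are ignored, and I have no idea how closely this matches whatever Amazon runs.

```python
import re

def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "make"), except in "-le" endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def text_stats(text):
    """Return approximate readability and complexity measures for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = [count_syllables(w) for w in words]
    # Assumption: "complex" means three or more syllables.
    complex_pct = 100 * sum(1 for n in syllables if n >= 3) / len(words)
    words_per_sentence = len(words) / len(sentences)
    return {
        "fog_index": 0.4 * (words_per_sentence + complex_pct),
        "complex_word_pct": complex_pct,
        "syllables_per_word": sum(syllables) / len(words),
        "words_per_sentence": words_per_sentence,
    }

stats = text_stats("The cat sat. The dog ran.")
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Divide the word count by the cover price and you have words per dollar, too.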