[quote="lpetrich"]I have discovered several of them:
Flesch–Kincaid readability tests
Gunning fog index
Coleman–Liau index
Automated readability index
SMOG
Dale–Chall readability formula
Spache readability formula
Fry readability formula
They all use words per sentence, combined with a per-word measure: syllables per word, letters per word, or the fraction of words that count as long, unfamiliar, complex, or difficult.
That seems to me overly simplistic, because it does not involve some estimate of syntactical complexity. To see why that is a problem, let us consider these three sets of sentences.
1. I watched the cat. She was eating her dinner.
2. I watched the cat, and she was eating her dinner.
3. I watched the cat, as she was eating her dinner.
Most readability indexes would score (2) and (3) as equal or very nearly equal, even though the two are syntactically rather different. (2) is essentially (1) with its two sentences turned into coordinate clauses. (3) differs from both: its second clause ("she was eating her dinner") is subordinated, so it becomes a modifier of the first clause ("I watched the cat").
An index based on syntactic complexity would rate (2) as not much more complex than (1), and (3) as significantly more complex than either (1) or (2).
I went over to scholar.google.com to look for attempts to use syntactic complexity in readability testing, but I found only a few papers, and they did not state their results very clearly.[/quote]
I disagree. #1 is simpler than #2 because the period completes the first thought and then the second sentence builds upon that. You have less pending material to think about.
In the programming world this is very clear--our complexity measures definitely favor #1 over #2 or #3.
Also, look at books for young kids--#1 is definitely favored.
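For what it's worth, here is a minimal sketch of one formula from the quoted list, the automated readability index, run on the three examples. The tokenizing rules (a word is a run of letters or digits, a sentence ends at `.`, `!` or `?`) and the names `ari` and `EXAMPLES` are my own assumptions for illustration, not a reference implementation.

```python
import re

def ari(text: str) -> float:
    """Automated readability index:
    4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    """
    # Assumption: a word is any run of letters or digits.
    words = re.findall(r"[A-Za-z0-9]+", text)
    # Assumption: sentences are delimited by '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43

# The three example texts from the quoted post.
EXAMPLES = {
    1: "I watched the cat. She was eating her dinner.",
    2: "I watched the cat, and she was eating her dinner.",
    3: "I watched the cat, as she was eating her dinner.",
}

for number, text in EXAMPLES.items():
    print(f"({number}) ARI = {ari(text):.2f}")
```

Under these assumptions (1) gets the lowest score, while (2) and (3) land within half a grade level of each other. The only thing separating them is that "and" is one letter longer than "as"; the formula sees nothing of the coordinate versus subordinate structure.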