Quantitative Characteristics of Quality Social Texts
Several readability formulas were compared by applying them to the same text clusters. Three types of corpora (and, after processing, clusters) were used: fiction, newspapers, and scientific articles. Some of the clusters contained the same texts translated into different languages (English, Polish, Russian). The purpose of the research is to determine the field of application not only of each formula but also of the different types of methods the formulas rely on. The weak point was a lack of precision in the assessment of scientific text clusters, owing to their particularly complicated syntactic structure.
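As an illustration of what applying several formulas to the same clusters can look like, the following is a minimal sketch rather than the procedure used in the study. The text above does not name the formulas compared, so two well-known character-based ones are assumed here: the Automated Readability Index and the Coleman-Liau index. Their coefficients were calibrated for English, so scores for Polish or Russian texts are only comparable within a language. The function names, the regular-expression tokenization, and the cluster layout (a dictionary mapping a cluster name to a list of texts) are all hypothetical.

```python
import re

def text_stats(text):
    """Return (sentence count, word count, character count) for a text."""
    # Assumption: sentences end with '.', '!' or '?'; words are runs of \w characters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    chars = sum(len(w) for w in words)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    return len(sentences), len(words), chars

def ari(text):
    """Automated Readability Index (English-calibrated coefficients)."""
    s, w, c = text_stats(text)
    return 4.71 * c / w + 0.5 * w / s - 21.43

def coleman_liau(text):
    """Coleman-Liau index (English-calibrated coefficients)."""
    s, w, c = text_stats(text)
    letters_per_100 = c / w * 100
    sentences_per_100 = s / w * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

def compare_clusters(clusters):
    """Average each formula over every cluster (hypothetical layout: name -> list of texts)."""
    result = {}
    for name, texts in clusters.items():
        result[name] = {
            "ari": sum(ari(t) for t in texts) / len(texts),
            "coleman_liau": sum(coleman_liau(t) for t in texts) / len(texts),
        }
    return result
```

Calling `compare_clusters` with one entry per genre or language gives a small table of mean scores per formula. Both formulas avoid syllable counting, which is language-specific, but that choice is a convenience of this sketch and not a claim about the methods studied.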
While analyzing the readability formulas, we obtained additional information about the difficulty of various texts in different languages, which makes it possible to draw conclusions about the complexity of the different languages (and genres) and to evaluate different translations of the same texts. Cloze tests are one of the basic variants of evaluation with informants. Thus, our paper concerns basic areas of Natural Language Processing and Cognitive Science.
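The cloze procedure mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol used with the informants: it assumes the classic variant in which every fifth word is deleted and only exact replacements count as correct. The function names and the blank marker are hypothetical.

```python
def make_cloze(text, n=5, blank="_____"):
    """Replace every n-th whitespace-separated token with a blank.

    Returns the gapped text and a dict mapping token index -> expected word.
    """
    tokens = text.split()
    answers = {}
    gapped = []
    for i, token in enumerate(tokens):
        if (i + 1) % n == 0:
            # Store the expected word without surrounding punctuation.
            answers[i] = token.strip(".,;:!?\"'()")
            gapped.append(blank)
        else:
            gapped.append(token)
    return " ".join(gapped), answers

def score_cloze(answers, responses):
    """Share of blanks an informant filled with exactly the deleted word."""
    if not answers:
        raise ValueError("no blanks to score")
    correct = sum(
        1
        for i, word in answers.items()
        if responses.get(i, "").strip().lower() == word.lower()
    )
    return correct / len(answers)
```

For example, `make_cloze("one two three four five six seven eight nine ten")` blanks out "five" and "ten". A higher mean score across informants suggests an easier text, which gives a human baseline to set beside the formula scores.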