Wednesday, September 12, 2018

Does POS tagging improve text clustering?

Answer: No, at least for the Swedish language.

Rosell, M., 2009. Part of speech tagging for text clustering in Swedish. In Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009) (pp. 150-157).

How to evaluate text clustering? 
Answer:
Normalized Mutual Information (NMI, similar to information gain), computed between the cluster assignments and the reference class labels.
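
As a rough illustration of how NMI can be computed, here is a minimal Python sketch using only the standard library. The function name `normalized_mutual_information` is hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption: other variants divide by the geometric mean, the minimum, or the maximum instead, so scores may differ slightly between tools.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_true, labels_pred):
    """NMI between two labelings of the same items.

    Assumption: normalization by the arithmetic mean of the two entropies.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_true)
    if n == 0:
        raise ValueError("labelings must not be empty")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        # Shannon entropy (natural log) of the label distribution.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_true = entropy(count_true)
    h_pred = entropy(count_pred)

    # Mutual information: sum over co-occurring (class, cluster) pairs.
    mi = 0.0
    for (t, p), c in count_joint.items():
        mi += (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))

    mean_entropy = (h_true + h_pred) / 2
    if mean_entropy == 0:
        # Both labelings put everything in one group; treat as identical.
        return 1.0
    return mi / mean_entropy


if __name__ == "__main__":
    gold = ["sports", "sports", "politics", "politics"]
    clusters = [1, 1, 0, 0]  # same grouping, different cluster ids
    print(normalized_mutual_information(gold, clusters))
```

The score does not depend on how the clusters are numbered, only on how the items are grouped, which is why it suits clustering where cluster ids are arbitrary.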

Any available datasets to test on?
Answer:
http://ana.cachopo.org/datasets-for-single-label-text-categorization