
Mahout text mining

Mahout has an example that uses naive Bayes to classify the 20 Newsgroups
dataset. But how do I classify individual paragraphs (e.g. Twitter messages,
movie reviews) stored in text files?

The text files have content like:

text paragraph 1 class a
text paragraph 2 class b
text paragraph 3 class a
text paragraph 4 class b
............. ...
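
To make the format concrete, here is a minimal sketch of how I imagine such a file could be split into one directory per class with one file per paragraph, which is how the 20 Newsgroups data is laid out on disk. It assumes each line is `text<TAB>label`; the tab separator, the function names and the file naming are my own assumptions, and I have not checked whether Mahout's tools accept this output as-is.

```python
import os
from collections import Counter


def parse_labeled_lines(lines, sep="\t"):
    """Split each non-empty line into (text, label) at the LAST separator.

    The separator is an assumption about the file format; change `sep`
    if the label is delimited differently.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        text, found, label = line.rpartition(sep)
        if not found:
            raise ValueError(f"no separator in line: {line!r}")
        records.append((text.strip(), label.strip()))
    return records


def write_class_dirs(records, out_dir):
    """Write one file per paragraph under out_dir/<label>/.

    Spaces in labels become underscores in directory names.
    Returns the number of paragraphs written per label.
    """
    counts = Counter()
    for text, label in records:
        class_dir = os.path.join(out_dir, label.replace(" ", "_"))
        os.makedirs(class_dir, exist_ok=True)
        counts[label] += 1
        path = os.path.join(class_dir, f"{counts[label]:06d}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(text + "\n")
    return dict(counts)
```

For example, `write_class_dirs(parse_labeled_lines(open("data.txt", encoding="utf-8")), "out")` would create `out/class_a/` and `out/class_b/` for the sample above (`data.txt` and `out` are placeholder names).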

Does it support n-grams, stemming, stop-word removal, etc.?
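
To be clear about what I mean by these preprocessing steps, here is a small plain-Python illustration of stop-word removal and n-gram generation. This is not Mahout code; the tiny stop-word list and the function names are made up for the example, and stemming is left out because a real stemmer (e.g. Porter) is too long to sketch here.

```python
import re

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

With these, `ngrams(remove_stop_words(tokenize("The movie is a great piece of art")), 2)` gives the bigrams of the remaining words, and those are the kind of features I would like the classifier to use.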

Thanks for any suggestions.
