Channel: Apache Timeline

mahout text mining

Mahout has an example that uses naive Bayes to classify the 20 newsgroups
data set. But how do I classify short paragraphs (e.g. Twitter messages,
movie reviews) stored in text files, where each line holds one paragraph
followed by its class? The files look like this:

text paragraph 1 class a
text paragraph 2 class b
text paragraph 3 class a
text paragraph 4 class b
............. ...
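
For concreteness, here is a minimal sketch of how a file like the one above could be split into one directory per class with one file per paragraph, which, as far as I can tell, is the layout the 20 newsgroups example reads its input from. The tab separator between paragraph and class, the function name `split_labeled_lines`, and the sample data are all assumptions made for illustration; none of this is part of Mahout.

```python
import os
import tempfile


def split_labeled_lines(lines, out_dir):
    """Write each 'text<TAB>label' line to out_dir/<label>/<line index>.txt.

    Assumes the class label is the last tab-separated field on the line
    and is a simple name that is safe to use as a directory name.
    Returns a dict mapping each label to the number of paragraphs written.
    """
    counts = {}
    for n, line in enumerate(lines):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        text, sep, label = line.rpartition("\t")
        if not sep:
            continue  # no tab on this line, so no label to use
        label = label.strip()
        class_dir = os.path.join(out_dir, label)
        os.makedirs(class_dir, exist_ok=True)
        path = os.path.join(class_dir, f"{n}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(text.strip() + "\n")
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical sample data in the assumed tab-separated format.
sample = [
    "text paragraph 1\ta",
    "text paragraph 2\tb",
    "text paragraph 3\ta",
    "text paragraph 4\tb",
]
out_dir = tempfile.mkdtemp()
counts = split_labeled_lines(sample, out_dir)
print(counts)
```

The resulting directory tree would then be a candidate input for the same preprocessing steps the 20 newsgroups example applies, though whether that is the recommended route for short texts is exactly what I am asking.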

Does it support n-grams, stemming, stop words, etc.?

Thanks for any suggestions.
