Hi,
Should I avoid using 0 as a preference value in Mahout’s input file for recommendation?
I am running Mahout 0.9’s recommenditembased job on a two-node Hadoop 2.0 cluster, with Pearson correlation as the similarity class. If I use 1 and 2 as preference values, the generated similarity is correct; but if I use 0 and 1 as preference values, no similarity is generated at all.
1. Input File (each row is userID,itemID,preference):
0,0,1
1,0,2
0,1,1
1,1,2
The generated similarity between items 1 and 0 is 0.9999, which is correct.
2. Input File:
0,0,0
1,0,1
0,1,0
1,1,1
No similarity is generated between items 1 and 0, which is not what I expected.
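As a sanity check on the expected values, here is a minimal sketch of textbook Pearson correlation in plain Python, applied to the two input files above. It is not Mahout's implementation and says nothing about how Mahout stores or filters preferences internally; the names item_vectors and pearson are my own, chosen for illustration only.

```python
import csv
import io
import math
from collections import defaultdict

# The two input files from above, one "userID,itemID,preference" row per line.
FILE_1 = "0,0,1\n1,0,2\n0,1,1\n1,1,2\n"
FILE_2 = "0,0,0\n1,0,1\n0,1,0\n1,1,1\n"


def item_vectors(text):
    """Parse the rows into {itemID: {userID: preference}}."""
    items = defaultdict(dict)
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        user, item, pref = row
        items[int(item)][int(user)] = float(pref)
    return items


def pearson(a, b):
    """Textbook Pearson correlation over the users who rated both items.

    Returns None when it is undefined (fewer than two common users,
    or zero variance in either item's preferences).
    """
    users = sorted(set(a) & set(b))
    n = len(users)
    if n < 2:
        return None
    xs = [a[u] for u in users]
    ys = [b[u] for u in users]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


vectors_1 = item_vectors(FILE_1)
vectors_2 = item_vectors(FILE_2)
sim_1 = pearson(vectors_1[0], vectors_1[1])  # preferences 1 and 2
sim_2 = pearson(vectors_2[0], vectors_2[1])  # preferences 0 and 1
print(sim_1, sim_2)
```

Under this definition both files give a similarity of 1.0 between items 0 and 1, which is why I expected a value for the second input as well.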
3. Detailed Commands:
1. Run Recommendation
mahout recommenditembased -s SIMILARITY_PEARSON_CORRELATION -i 0_1_tuples.csv -o output --numRecommendations 5 --outputPathForSimilarityMatrix similarityMatrix --randomSeed 2014
2. View similarity between item 1 and 0:
hdfs dfs -cat similarityMatrix/part-r-00000
Thank you,
Peng