Hi folks,
I am trying to get the ALS recommender working for our data.
My user_ids and item_ids are longs. To handle this,
I first set usesLongIDs to ‘true’ in the parallelALS job.
mahout parallelALS --input /mahout/input_data/user_item_interest/ --output /mahout/output_data/alsrecommender_longs/ --lambda 0.1 --implicitFeedback true --alpha 0.1 --numFeatures 20 --numIterations 10 --numThreadsPerSolver 4 --usesLongIDs true --tempDir /mahout/tmp -Dmapred.reduce.tasks=256
This job created the output folders "/userIDIndex" and "/itemIDIndex" along
with the "/U" and "/M" latent factor folders. I then passed the
paths of the generated userIDIndex and itemIDIndex to
the recommendfactorized job.
mahout recommendfactorized --input /mahout/output_data/alsrecommender_longs/userRatings --userFeatures /mahout/output_data/alsrecommender_longs/U/ --itemFeatures /mahout/output_data/alsrecommender_longs/M/ --itemIDIndex /mahout/output_data/alsrecommender_longs/itemIDIndex/ --userIDIndex /mahout/output_data/alsrecommender_longs/userIDIndex/ --numRecommendations 100 --output /mahout/output_data/alsrecommender_longs/user_recommendations_longs --maxRating 1 --numThreads 4 --tempDir /mahout/tmp -Dmapred.reduce.tasks=256
Both jobs completed successfully. Since my input user and item ids were longs, I assumed that the computed recommendations would contain the original long user_ids and recommended item_ids.
However, the output still seems to hold the recommendations as the ints that were created internally by the first job. Is there another parameter I need to set to get the recommendations back in their original long ids? It seems
absurd that the "recommendfactorized" job does not have a post-processing step that maps the ints back to the longs.
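For illustration, the kind of post-processing step meant here could look like the minimal Python sketch below. It is not Mahout code. It assumes the two indexes have already been loaded into plain dicts (int id -> original long id) and that each output line looks like "<userInt><TAB>[<itemInt>:<score>,...]"; both the line format and the name remap_line are assumptions made for this example.

```python
def remap_line(line, user_index, item_index):
    """Rewrite one recommendation line from internal int ids to original long ids.

    Assumed (hypothetical) line format: "<userInt>\t[<itemInt>:<score>,...]".
    user_index and item_index are dicts mapping internal int id -> original long id.
    """
    user_part, items_part = line.rstrip("\n").split("\t", 1)
    user_long = user_index[int(user_part)]

    # Drop the surrounding brackets, then remap each "item:score" pair.
    pairs = items_part.strip("[]")
    remapped = []
    if pairs:
        for pair in pairs.split(","):
            item_part, score = pair.split(":", 1)
            remapped.append(f"{item_index[int(item_part)]}:{score}")

    return f"{user_long}\t[{','.join(remapped)}]"


if __name__ == "__main__":
    # Toy index data, made up for the example.
    users = {7: 9000000000007}
    items = {1: 5000000000001, 2: 5000000000002}
    print(remap_line("7\t[1:0.9,2:0.5]", users, items))
```

The score strings are passed through untouched, so only the ids change. A lookup for an int id missing from either index raises a KeyError rather than silently emitting a wrong id.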
Thanks,
Nilesh