I am building a Slope One-based recommender. The system has 40 million
users and 1 million items, and each user has rated around 10 items. The
deviation matrix is computed as part of a batch job and loaded into
memory (a distributed cache).
The issue I am facing is runtime performance. At request time, the logic
for generating recommendations for a user is: for each item not rated by
the user (which is a huge number, roughly 1 million - 10), for each item
rated by the user (about 10), compute the average deviation, as in the
sketch below.
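Here is a minimal sketch of that runtime loop (weighted Slope One). It
assumes the batch job produces `dev[(j, i)]` (average deviation of item j
over item i) and `count[(j, i)]` (number of users who co-rated j and i);
all names here are illustrative, not my actual code.

```python
def predict(user_ratings, dev, count, all_items):
    """user_ratings: {item_id: rating} for the ~10 items the user rated."""
    predictions = {}
    for j in all_items:                       # outer loop: ~1M - 10 items
        if j in user_ratings:
            continue
        num, den = 0.0, 0
        for i, r_ui in user_ratings.items():  # inner loop: ~10 items
            c = count.get((j, i), 0)
            if c == 0:
                continue                      # no co-ratings for (j, i)
            num += (dev[(j, i)] + r_ui) * c   # weighted Slope One term
            den += c
        if den > 0:
            predictions[j] = num / den
    return predictions
```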
Now the fundamental issue is that the outer loop is too large (1 million
- 10 candidate items), i.e. on the order of 10 million deviation lookups
per request. Doing this many calculations at runtime will never perform
acceptably.
What should be done to improve the runtime performance?