Channel: Apache Timeline

ALS, weighted vs. non-weighted regularization paper

Probably a question for Sebastian.

As we know, the two papers (Hu-Koren-Volinsky and Zhou et al.) use slightly
different loss functions.

Zhou et al. are fairly unique in that they additionally multiply the norms
of the U, V vectors by the number of observed interactions.
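
For reference, the two objectives can be written roughly as below. This is a
sketch in each paper's own notation as far as I recall it: c_{ui} and p_{ui}
are the confidence and binary preference of Hu-Koren-Volinsky, I is the set
of observed ratings, and n_{u_i}, n_{m_j} are the numbers of ratings observed
for user i and item j.

```latex
% Hu-Koren-Volinsky (implicit feedback), plain L2 penalty:
\min_{x,y} \sum_{u,i} c_{ui}\,(p_{ui} - x_u^{\top} y_i)^2
  + \lambda \Bigl( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \Bigr)

% Zhou et al. (ALS-WR), penalty weighted by observation counts:
\min_{U,M} \sum_{(i,j) \in I} (r_{ij} - u_i^{\top} m_j)^2
  + \lambda \Bigl( \sum_i n_{u_i} \lVert u_i \rVert^2
                 + \sum_j n_{m_j} \lVert m_j \rVert^2 \Bigr)
```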

The paper doesn't explain why it works, beyond saying something along the
lines of "we tried several regularization matrices, and this one worked
better in our case".

I tried to figure out why that is, and I am still not sure why it would be
better. Basically, by giving smaller observation sets smaller regularization
values, we are saying it is OK for smaller observation sets to overfit
slightly more than larger observation sets.
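
A toy rank-1 sketch may help make the two schemes concrete. It assumes a
single latent factor, so the per-user ALS update reduces to a scalar ridge
solution. The function name `rank1_user_factor` and the data are made up for
illustration and come from neither paper nor any library.

```python
def rank1_user_factor(item_factors, ratings, lam, weighted):
    """Closed-form ALS update for one user with a single latent factor.

    Minimizes sum_j (r_j - u * m_j)^2 + reg * u^2, where reg is lam * n
    (Zhou et al. style, n = number of observed ratings) or plain lam.
    """
    n = len(ratings)
    reg = lam * n if weighted else lam
    numerator = sum(m * r for m, r in zip(item_factors, ratings))
    denominator = sum(m * m for m in item_factors) + reg
    return numerator / denominator


lam = 0.5  # hypothetical regularization rate
for n in (2, 200):
    items = [1.0] * n    # all item factors fixed at 1
    ratings = [4.0] * n  # all ratings equal, so the unregularized fit is 4.0
    plain = rank1_user_factor(items, ratings, lam, weighted=False)
    wr = rank1_user_factor(items, ratings, lam, weighted=True)
    print(n, round(plain, 3), round(wr, 3))
```

In this toy setup the fixed-lambda estimate is 3.2 for n = 2 and about 3.99
for n = 200, while the lambda * n estimate is about 2.667 in both cases. So
here the weighted penalty shrinks small and large observation sets by the
same relative amount, whereas the fixed penalty shrinks the small set more.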

This seems counterintuitive. Intuition says smaller sets would actually tend
to overfit more, not less, and therefore might need a larger regularization
rate, not a smaller one. Sebastian, what's your take on weighting the
regularization in ALS-WR?

thanks.
-d
