
Poor performance of the MatrixMultiplicationJob

Hi-

I'm working on implementing a custom algorithm using the Mahout library. The algorithm requires matrix multiplication, which I saw is available at the object level (.times) as well as in the MatrixMultiplicationJob. I am currently testing a step in the algorithm that multiplies a 10 x 2.4M matrix by a 2.4M x 2.4M matrix. Performance has been awful: the step takes 11-12 hours to complete. That might be acceptable if it were the whole algorithm, but I will have multiple similarly sized steps, all of which will be repeated in a loop.
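
For reference, here is a minimal plain-Java sketch of why a multiplication of this shape can in principle be spread over many tasks: the product can be written as a sum of partial products, each covering a slice of the shared dimension, and the slices can be computed independently and added afterwards. The class and method names (OuterProductSketch, partialProduct, add) are hypothetical and used only for illustration; this is not Mahout code, and it is not meant to describe how MatrixMultiplicationJob is implemented.

```java
// Hypothetical illustration class; not part of Mahout or Hadoop.
class OuterProductSketch {

    // Conventional product: C[i][j] = sum over k of A[i][k] * B[k][j].
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, m = b.length, p = b[0].length;
        double[][] c = new double[n][p];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < p; j++) {
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }

    // Partial product restricted to shared indices k in [from, to).
    // Each k contributes the outer product of column k of A and row k of B,
    // so disjoint ranges of k can be handled by independent workers.
    static double[][] partialProduct(double[][] a, double[][] b, int from, int to) {
        int n = a.length, p = b[0].length;
        double[][] c = new double[n][p];
        for (int k = from; k < to; k++) {
            for (int i = 0; i < n; i++) {
                double aik = a[i][k];
                if (aik == 0.0) {
                    continue; // zero entries contribute nothing
                }
                for (int j = 0; j < p; j++) {
                    c[i][j] += aik * b[k][j];
                }
            }
        }
        return c;
    }

    // Element-wise sum of two equally sized partial results.
    static double[][] add(double[][] x, double[][] y) {
        double[][] c = new double[x.length][x[0].length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[0].length; j++) {
                c[i][j] = x[i][j] + y[i][j];
            }
        }
        return c;
    }
}
```

Adding partialProduct(a, b, 0, s) and partialProduct(a, b, s, m) for any split point s gives the same matrix as multiply(a, b). For the 10 x 2.4M by 2.4M x 2.4M case, the shared dimension is 2.4M, so there are many possible slices.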

I dug into this further, looking at the job running on my Hadoop cluster (Google Cloud Compute, 3 nodes @ 16 GB each). I noticed that the job appeared to run only a single map task, and thus on a single node, whereas previous steps such as TransposeJob ran multiple mappers and finished in a fraction of the time. Researching it further, I found a handful of concerning posts, such as the two below:

Hadoop File Splits : CompositeInputFormat : Inner Join
I am using CompositeInputFormat to provide input to a hadoop job. The number of splits generated is the total number of files given as input to CompositeInputFormat...
View on stackoverflow.com

MatrixMultiplicationJob runs with 1 mapper only ?
Hi, I am trying to multiple dense matrix of size [100 x 100k]. The size of the file is 104MB and with default block size of 64MB only 2 blocks are getting created.
View on mail-archives.apache.org
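
To make the arithmetic in those two posts concrete, here is a rough Java sketch of how the number of map tasks might be estimated under two simplified models: one split per block when files are splittable, and one split per file when they are not (the behaviour the Stack Overflow post describes for CompositeInputFormat). SplitEstimate and its methods are hypothetical helpers, not Hadoop APIs, and the model ignores details such as configured minimum and maximum split sizes.

```java
// Hypothetical back-of-the-envelope helper; not a Hadoop API.
class SplitEstimate {

    // Number of blocks needed to hold a file: ceil(fileBytes / blockBytes).
    static long blocksFor(long fileBytes, long blockBytes) {
        if (fileBytes <= 0) {
            return 0;
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Estimated map tasks for a set of input files.
    // splittable = true:  assume one split per block (simplified model).
    // splittable = false: assume one split per file, as described in the
    //                     quoted post about CompositeInputFormat.
    static long mapTasks(long[] fileSizes, long blockBytes, boolean splittable) {
        if (!splittable) {
            return fileSizes.length;
        }
        long total = 0;
        for (long size : fileSizes) {
            total += blocksFor(size, blockBytes);
        }
        return total;
    }
}
```

With the numbers from the second post, a single 104 MB file and a 64 MB block size give 2 blocks, so at most 2 map tasks under the per-block model and 1 under the per-file model. Under the per-file model, the size of the matrix does not matter; only the number of input files does.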

So, my questions are as follows. Is the MatrixMultiplicationJob truly limited to running on a single node? If so, it seems fairly useless for matrices of this size. And in that case, what is the recommended way to do multiplications as large as the ones I need?
