Channel: Apache Timeline

Store in MongoDB with a Pig UDF

Hi there,

We have a text file in which each line corresponds to a MongoDB
document, and we load it as follows:

DEFINE PigToMongo com.beeva.PigToMongo.PigToMongo();

A = LOAD '/home/hduser/pigfiles/input.txt' USING TextLoader() AS
(line:chararray);

B = FOREACH A GENERATE PigToMongo(line);

DUMP B;

For each line, PigToMongo(line) connects to MongoDB, maps the line to a
document, writes it, and closes the connection.

PigToMongo creates a new connection for every line, as follows (which ends
up bringing our MongoDB down):

MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
DB db = mongoClient.getDB( "hadoopDB" );
DBCollection coll = db.getCollection("output0");

I wonder whether it is possible to open and close the connection only once,
outside the UDF.
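
A minimal sketch of one way this could look, assuming the connection is
opened lazily on the first call and closed in an end-of-task hook. As far as
I know, Pig's EvalFunc provides a finish() method that could serve as that
hook, and MongoClient keeps its own connection pool, so a single client per
task should be enough. To keep the sketch runnable without a MongoDB server,
StubConnection, ReusingWriter and ConnectionReuseDemo are hypothetical
stand-ins and not part of Pig or the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for com.mongodb.MongoClient, so the sketch runs
// without a MongoDB server. It only counts how often it is created.
class StubConnection {
    static int opened = 0;                       // connections created so far
    final List<String> written = new ArrayList<>();
    boolean closed = false;

    StubConnection() {
        opened++;
    }

    void insert(String doc) {
        if (closed) {
            throw new IllegalStateException("connection is closed");
        }
        written.add(doc);
    }

    void close() {
        closed = true;
    }
}

// Mirrors the shape of a Pig UDF: exec() is called once per input line,
// finish() once when the task is done. In the real UDF this class would
// extend EvalFunc<String> and exec() would take a Tuple (assumption).
class ReusingWriter {
    private StubConnection conn;                 // created once, then reused

    String exec(String line) {
        if (conn == null) {
            // Real UDF: new MongoClient("localhost", 27017), then getDB(...)
            // and getCollection(...) as in the original snippet.
            conn = new StubConnection();
        }
        conn.insert(line);
        return line;
    }

    int writtenCount() {
        return conn == null ? 0 : conn.written.size();
    }

    void finish() {
        if (conn != null) {
            conn.close();                        // close once, at the very end
            conn = null;
        }
    }
}

class ConnectionReuseDemo {
    public static void main(String[] args) {
        int before = StubConnection.opened;
        ReusingWriter udf = new ReusingWriter();
        for (String line : new String[] {"{\"a\": 1}", "{\"a\": 2}", "{\"a\": 3}"}) {
            udf.exec(line);
        }
        System.out.println("lines written: " + udf.writtenCount());
        udf.finish();
        System.out.println("connections opened: " + (StubConnection.opened - before));
    }
}
```

With this shape the connection is still owned by the UDF rather than opened
outside it, but it is created once per UDF instance instead of once per line.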

- By the way, does MongoDB support multiple connections at the same
time (for example, from several reducers storing data during a
MapReduce job)?

Thank you,

*CÉSAR PUMAR GARCÍA*

*BEEVA FOR GRADUATES*

*cesar.pumar [ at ] beeva.com <http://www.beeva.com>*
