Hi,
I have Twitter data in a single .txt file, one tweet per line, like this:
@VancityBeerGuy - RT @BCBerrie: well @VancityBeerGuy you know what they say about guys with #smallenfreuden right? Hahaha Created At:Mon Jun 03 07:18:46 IST 2013
@IanSylves - RT @PTorgo91: @otterN9NE you're the best thing to happen to the #sabres since Drury #lordstanley#nextyear #smallenfreuden Created At:Mon Jun 03 07:18:37 IST 2013
@LiLItalyPasta - RT @LamyaAsiff: #smallenfreuden is #stupidfreuden. Created At:Mon Jun 03 07:17:36 IST 2013
@MMBris - RT @jaimestein: Whenever you find yourself on the side of the majority, it is time to pause and #smallenfreuden. -Mark Twain Created At:Mon Jun 03 07:16:43 IST 2013
@SeanBickerton - RT @kbieksa3: Big save by Bernier to keep it somewhat close. Leave it to a french guy to get the boys going... @aburr14 #Smallenfreuden Created At:Mon Jun 03 07:16:41 IST 2013
I need to generate vectors for k-means clustering from this text file using Java.
I need help selecting the features.
For reference, these lines are from Mahout in Action:
The process of selecting the features of an object and mapping them to numbers is
known as feature selection. The process of encoding features as a vector is vectorization.
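
To make those two steps concrete, here is a minimal sketch in plain Java of one possible approach, not a definitive one. It assumes each tweet sits on one line, that everything after "Created At:" is a timestamp to be ignored, and that the features are simply the lowercased words, @mentions and #hashtags. The class name TweetVectorizer and its methods are hypothetical, and it deliberately uses no Mahout classes; the resulting double[] arrays would still have to be wrapped in whatever vector type the clustering library expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class TweetVectorizer {
    // Assumption from the sample data: the timestamp follows this marker.
    private static final String DATE_MARKER = "Created At:";
    // A token is a run of word characters, optionally prefixed by '@' or '#'.
    private static final Pattern TOKEN = Pattern.compile("[@#]?\\w+");

    /** Feature selection: lowercased tokens of the tweet, timestamp dropped. */
    static List<String> tokenize(String line) {
        int cut = line.indexOf(DATE_MARKER);
        String text = (cut >= 0 ? line.substring(0, cut) : line).toLowerCase();
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    /** Maps each distinct token to a vector index, in order of first appearance. */
    static Map<String, Integer> buildDictionary(List<String> lines) {
        Map<String, Integer> dict = new LinkedHashMap<>();
        for (String line : lines) {
            for (String token : tokenize(line)) {
                dict.putIfAbsent(token, dict.size());
            }
        }
        return dict;
    }

    /** Vectorization: term-frequency vector of one tweet over the dictionary. */
    static double[] vectorize(String line, Map<String, Integer> dict) {
        double[] vector = new double[dict.size()];
        for (String token : tokenize(line)) {
            Integer index = dict.get(token);
            if (index != null) {
                vector[index] += 1.0; // tokens missing from the dictionary are skipped
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "@LiLItalyPasta - RT @LamyaAsiff: #smallenfreuden is #stupidfreuden. Created At:Mon Jun 03 07:17:36 IST 2013",
            "@MMBris - RT @jaimestein: Whenever you find yourself on the side of the majority, it is time to pause and #smallenfreuden. -Mark Twain Created At:Mon Jun 03 07:16:43 IST 2013");
        Map<String, Integer> dict = buildDictionary(lines);
        for (String line : lines) {
            System.out.println(Arrays.toString(vectorize(line, dict)));
        }
    }
}
```

Raw term counts are only one choice of feature weight. Since every tweet here contains "rt" and #smallenfreuden, a weighting such as TF-IDF, or dropping those tokens, would probably separate the clusters better.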
Thanks
-N