clusterdump - structure of JSON output

Hi all, I'm working on some automated analysis of the clusterdump output
using '-of = JSON'. While digging into the structure of the
representation of the data I've noticed something that seems a little
odd to me.

The data for a particular cluster is not split into separate fields: the
'cluster', 'n', 'c' and 'r' values are all in one continuous string. For example:

{"cluster":"VL-10515{n=5924 c=[action:0.023, adherence:0.223,
administration:0.011 r=[action:0.446, adherence:1.501,
administration:0.306]}"}

This is also the case for the "point":

{"point":"013FFD34580BA31AECE5D75DE65478B3D691D138 = [body:6.904,
harm:10.101]","vector_name":"013FFD34580BA31AECE5D75DE65478B3D691D138","weight":"1.0"}

This leads me to believe that the only way I can get to the individual
data in these items is by string parsing. For JSON deserialization I
would have expected to see something along the lines of:

"cluster":"VL-10515",
"n":5924,
"c":
[
{"action":0.023},
{"adherence":0.223},
{"administration":0.011}
],
"r":
[
{"action":0.446},
{"adherence":1.501},
{"administration":0.306}
]

and:

"point": {
"body": 6.904,
"harm": 10.101
},
"vector_name": "013FFD34580BA31AECE5D75DE65478B3D691D138",
"weight": 1.0
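
As a stopgap, the string parsing described above can be sketched in Python roughly as follows. This is only a minimal sketch, not part of clusterdump: the names parse_pairs, parse_cluster and parse_point are hypothetical, and it assumes that every record looks like the two samples quoted earlier and that terms never contain whitespace, commas, colons or square brackets.

```python
import json
import re

# One "term:weight" pair, e.g. "action:0.023".
# Assumption: terms never contain whitespace, ',', ':', '[' or ']'.
PAIR = re.compile(r"([^\s,:\[\]]+):(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_pairs(text):
    """Convert 'action:0.023, adherence:0.223' into {'action': 0.023, ...}."""
    return {term: float(weight) for term, weight in PAIR.findall(text)}


def parse_cluster(record):
    """Split the 'cluster' string into its id, n, c and r parts."""
    cluster_id, _, body = record["cluster"].partition("{")
    count = re.search(r"\bn=(\d+)", body)
    # Split on the start of the r list, so the result does not depend on
    # whether the c list is closed with ']' in the dump.
    c_text, _, r_text = body.partition(" r=[")
    return {
        "cluster": cluster_id,
        "n": int(count.group(1)) if count else None,
        "c": parse_pairs(c_text.partition("c=[")[2]),
        "r": parse_pairs(r_text),
    }


def parse_point(record):
    """Split the 'point' string into a term -> weight mapping."""
    name, _, vector_text = record["point"].partition(" = ")
    return {
        "point": parse_pairs(vector_text),
        "vector_name": record.get("vector_name", name),
        "weight": float(record["weight"]),
    }


# The two sample records from this message, each on a single line.
cluster_line = (
    '{"cluster":"VL-10515{n=5924 c=[action:0.023, adherence:0.223, '
    'administration:0.011 r=[action:0.446, adherence:1.501, '
    'administration:0.306]}"}'
)
point_line = (
    '{"point":"013FFD34580BA31AECE5D75DE65478B3D691D138 = [body:6.904, '
    'harm:10.101]","vector_name":"013FFD34580BA31AECE5D75DE65478B3D691D138",'
    '"weight":"1.0"}'
)

cluster = parse_cluster(json.loads(cluster_line))
point = parse_point(json.loads(point_line))
print(json.dumps(cluster, indent=2))
print(json.dumps(point, indent=2))
```

The sketch returns 'c' and 'r' as plain term-to-weight mappings rather than lists of single-key objects, since that is easier to query; either shape would serve the same purpose.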

Please forgive the naive question if I'm missing something obvious, but
can anybody explain the rationale for the current structure of the JSON?
Is there an efficient way to access the items in question from the JSON
without writing custom string parsing logic? Or would it make sense
to modify the JSON output from clusterdump?

Thanks,

Terry
