text - How to cluster different strings using machine learning in python -

- March 15, 2013

i have dataset consists of building names.e.g {hill view,hills view,hill apartment...}.i want cluster these strings using machine learning.for eg after clustering 1 cluster should contain strings similar or similar {hills,hill...}.i have tried various scikit algorithms k-means,affinity propagation etc did not succedd.kindly help.

machine learning isn't magic! uses mathematical objects , functions.

you need first steps - known data mining - kind of consists in:

transforming input (string, pictures, videos, anything...) numbers (vectors, matrices or relevent structure).
defining distance , similarity between vectors (= distance between numerical representation of input ~= distance between string, pictures, videos, anything).

this not trivial , can done different ways depending on data/objectives.

since don't know background in cs/ml/maths, give general approach is, in general case, quite good/easy.

that general speach, in pratice problematic complex , there's lot learn on that. need edit distance intuitive distance between words, should consider stemming which.

can't give better anwser without more information on data/context.

regards

Search This Blog

Ant COmde

text - How to cluster different strings using machine learning in python -

Comments

Post a Comment

Popular posts from this blog

sql - invalid in the select list because it is not contained in either an aggregate function -

Angularjs unit testing - ng-disabled not working when adding text to textarea -

How to start daemon on android by adb -