How to cluster different strings using machine learning in Python
I have a dataset that consists of building names, e.g. {hill view, hills view, hill apartment, ...}. I want to cluster these strings using machine learning. For example, after clustering, one cluster should contain the same or similar strings, such as {hills, hill, ...}. I have tried various scikit-learn algorithms (k-means, affinity propagation, etc.) but did not succeed. Kindly help.
Machine learning isn't magic! It uses mathematical objects and functions.
You first need some preliminary steps, known as data mining, which roughly consist of:
Transforming the input (strings, pictures, videos, anything...) into numbers (vectors, matrices or another relevant structure).
Defining a distance and a similarity between vectors (= distance between the numerical representations of the inputs ~= distance between the strings, pictures, videos, anything).
This is not trivial, and it can be done in different ways depending on your data and objectives.
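To make those two steps concrete, here is a minimal sketch using only the standard library. It assumes character bigrams as the numerical representation and cosine distance as the measure; both are just one possible choice, and the helper names (char_ngrams, cosine_distance) and the extra name "lake tower" are made up for the illustration.

```python
from collections import Counter
from math import sqrt

def char_ngrams(text, n=2):
    """Step 1: turn a string into a bag (Counter) of character n-grams."""
    # Pad with spaces so that word starts and ends also produce n-grams.
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine_distance(a, b):
    """Step 2: distance between two sparse count vectors (1 - cosine similarity)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty string as maximally distant
    return 1.0 - dot / (norm_a * norm_b)

# Illustrative names; "lake tower" is an invented unrelated example.
names = ["hill view", "hills view", "hill apartment", "lake tower"]
vectors = [char_ngrams(name) for name in names]

# Compare the first name against all the others.
for other, vec in zip(names[1:], vectors[1:]):
    print(names[0], "vs", other, round(cosine_distance(vectors[0], vec), 3))
```

With vectors and a distance like these, "hill view" ends up much closer to "hills view" than to "lake tower", which is the property a clustering algorithm needs to work with.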
Since I don't know your background in CS/ML/maths, I'll give a general approach that is, in the general case, quite good and easy.
That was the general speech; in practice the problem is complex, and there's a lot to learn about it. You may need edit distance, which is an intuitive distance between words, and you should also consider stemming, which reduces inflected words to a common root (e.g. "hills" to "hill").
I can't give a better answer without more information on your data/context.
Regards
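As a rough illustration of the edit-distance idea, here is a sketch in plain Python. The Levenshtein function is the standard dynamic-programming formulation; the grouping function (cluster_by_threshold), its max_distance=2 threshold and the sample names are assumptions made up for this example, not a recommended clustering method.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def cluster_by_threshold(items, max_distance=2):
    """Greedy grouping (hypothetical helper): an item joins the first
    cluster that has a member within max_distance, else starts a new one."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if any(edit_distance(item, member) <= max_distance for member in cluster):
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters

# Invented sample data for the illustration.
names = ["hill view", "hills view", "hill veiw", "lake tower", "lake towers"]
print(cluster_by_threshold(names))
# → [['hill view', 'hills view', 'hill veiw'], ['lake tower', 'lake towers']]
```

Note that edit distance on whole strings will not put "hill apartment" next to "hill view", since most of the characters differ; for that you would probably compare word by word (after stemming) instead.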