Using bigrams with Stanford NLP tokenization in Java
I am using the Stanford NLP API on a document collection, and this is the code I use for tokenization:
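import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;

// reader is a java.io.Reader over the document text
PTBTokenizer<CoreLabel> ptbt = new PTBTokenizer<>(reader, new CoreLabelTokenFactory(), "");
while (ptbt.hasNext()) {
    CoreLabel token = ptbt.next();
    String word = token.get(TextAnnotation.class);
}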
This code splits the text on whitespace, which means it turns the phrase "alarm activated" into two words, "alarm" and "activated". I guess bigrams would solve the problem, but I am not sure how to use them here. Can anybody suggest how to use bigrams with PTBTokenizer, or how to use bigrams in tokenization with Stanford NLP?
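One possible approach, sketched here under assumptions rather than taken from the Stanford NLP API, is to leave the tokenizer as it is, collect each word into a list, and pair adjacent tokens afterwards. The class name `BigramSketch` and the method `bigrams` are hypothetical helpers written for illustration, and the token list in `main` stands in for the words that the PTBTokenizer loop would produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Stanford NLP.
final class BigramSketch {

    // Joins each pair of adjacent tokens with a single space.
    // Returns an empty list when there are fewer than two tokens.
    static List<String> bigrams(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            result.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: in the real program these words are collected
        // inside the PTBTokenizer loop instead of being hard-coded.
        List<String> tokens = Arrays.asList("alarm", "activated", "at", "noon");
        System.out.println(bigrams(tokens));
    }
}
```

With this input, "alarm activated" comes out as a single bigram alongside "activated at" and "at noon". Deciding which bigrams are meaningful phrases worth keeping is a separate step that this sketch does not attempt.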