stringr - R: Select strings that have the same general pattern


I have a list of strings as follows:

> with(providers, head(provider.name, 30))
 [1] 1st care (uk) limited
 [2] 1st care limited
 [3] 229 mitcham lane limited
 [4] 24-7 care ltd
 [5] 3 dimensions care limited
 [6] 3 trees community support limited
 [7] 365 care homes limited
 [8] 3a care (solihull) limited
 [9] 3l care limited
[10] 5 star tlc limited
[11] 92 higher drive limited
[12] & care home ltd
[13] & l care homes limited
[14] & n kachra
[15] & r care limited
[16] better carehome ltd
[17] a.g.e. nursing homes limited
[18] a.r.m. healthcare limited
[19] aaa elderly care limited
[20] aaa medics ltd
[21] aadams residential care home limited
[22] abacus quality care ltd
[23] abberdale limited
[24] abbeville rch limited
[25] abbey care centre limited
[26] abbey care direct ltd
[27] abbey care home limited
[28] abbey healthcare (aaron court) limited
[29] abbey healthcare (kendal) limited
[30] abbey healthcare (knebworth) ltd

My aim is to identify observations that follow a similar pattern and rename them according to that pattern. Ideally, the output should be similar to the following (please note in particular the changes to observations 1, 2, and 25 to 30):

> with(providers, head(provider.name, 30))
 [1] 1st care limited
 [2] 1st care limited
 [3] 229 mitcham lane limited
 [4] 24-7 care ltd
 [5] 3 dimensions care limited
 [6] 3 trees community support limited
 [7] 365 care homes limited
 [8] 3a care (solihull) limited
 [9] 3l care limited
[10] 5 star tlc limited
[11] 92 higher drive limited
[12] & care home ltd
[13] & l care homes limited
[14] & n kachra
[15] & r care limited
[16] better carehome ltd
[17] a.g.e. nursing homes limited
[18] a.r.m. healthcare limited
[19] aaa elderly care limited
[20] aaa medics ltd
[21] aadams residential care home limited
[22] abacus quality care ltd
[23] abberdale limited
[24] abbeville rch limited
[25] abbey care
[26] abbey care
[27] abbey care
[28] abbey healthcare
[29] abbey healthcare
[30] abbey healthcare

My question is how to write a "general pattern" that lets me extract the observations that share the same pattern. I have tried str_extract, but I think I am missing something when writing the general pattern.

library(stringr)
# general pattern: meant to select names whose first 2 words are similar
home <- "[a-zA-Z]{2,}"
test <- with(providers, str_extract(provider.name, home))
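To make the goal concrete, here is a minimal sketch. It assumes that "same pattern" means "same first two words", which is only my working assumption. The vector `provider.name` below is a small hand-copied sample from the listing, and the names `first_two` and `key` are made up for illustration. It also shows why my attempt falls short: `[a-zA-Z]{2,}` matches only the first run of two or more letters, not two words.

```r
library(stringr)

# Hypothetical sample: four names copied from the listing above
provider.name <- c("1st care (uk) limited",
                   "1st care limited",
                   "abbey care centre limited",
                   "abbey healthcare (kendal) limited")

# My attempt: the first run of two or more letters, so at most part of one word
home <- "[a-zA-Z]{2,}"
str_extract(provider.name, home)
# "st" "st" "abbey" "abbey"

# Assumed alternative: the first two whitespace-separated words as a grouping key
first_two <- "^\\S+\\s+\\S+"
key <- str_extract(provider.name, first_two)
key
# "1st care" "1st care" "abbey care" "abbey healthcare"
```

A key like this could then be used to group or rename the observations, although it would also merge names that merely happen to share their first two words.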

Does anyone know whether there is a function in R that can identify such patterns in general? Many thanks in advance.

