dataframe - Count specific patterns in a data frame, doing interpolation in R

- August 15, 2012

I have a dataset in data frame format, like this:

    wpt    id   fuel  express  local
    1      s36  12    0        1
    2      s36  14    1        0
    inter  s36  NA    1        0
    inter  s36  NA    1        0
    3      s36  16    1        0
    inter  s36  NA    0        1
    4      s36  18    1        0
    5      s36  22    1        0
    6      w09  45    0        1
    inter  w09  NA    1        0
    7      w09  48    0        1

I'd like to treat a sub-block such as dat[c(2,inter,inter,3),] (any run of "inter" rows combined with the regular numbered wpt rows on either side of it) as a unit.

(1) Count how many such sub-units there are in the data frame. In this case, s36 has 2 (the unit from wpt 2 to 3, and the unit from wpt 3 to 4).

(2) Count how many such units have an express or local value that is consistent with the starting and ending value of the sub-unit. In this case, s36 has 1 unit that is consistent (wpt 2 to 3, it's all express) and 1 unit whose inter rows differ from the starting or ending value (wpt 3 to 4: the start and end are express, but the inter row is local).

(3) Do these calculations by id.

The expected output is this:

    id   consistent  total
    s36  1           2
    w09  0           1

(4) What if I want to interpolate the missing values in the fuel column, doing simple linear interpolation? The first 2 NAs would be replaced by 14.66667 and 15.33333, which come from:

    seq(14, 16, length.out=4)

The expected output is this:

    wpt    id   fuel      express  local
    1      s36  12        0        1
    2      s36  14        1        0
    inter  s36  14.66667  1        0
    inter  s36  15.33333  1        0
    3      s36  16        1        0
    inter  s36  17        0        1
    4      s36  18        1        0
    5      s36  22        1        0
    6      w09  45        0        1
    inter  w09  45.75     1        0
    inter  w09  46.50     1        0
    inter  w09  47.25     1        0
    7      w09  48        0        1

Thanks in advance!

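    # Collapse consecutive rows with the same wpt into runs, then keep every
    # "inter" run that sits between two numbered waypoints.
    subs <- with(rle(df$wpt), {
        ends <- cumsum(lengths)                  # last row of each run
        n <- grepl('^[0-9]+$', values)           # TRUE for numbered waypoints
        w <- which(head(n, -2L) & values[-c(1L, length(n))] == 'inter' & tail(n, -2L))
        data.frame(start = c(0L, ends)[w] + 1L, end = ends[w + 2L])
    })
    subs$id <- df$id[subs$start]
    # A unit is consistent when express never changes between adjacent rows in it.
    subs$consistent <- mapply(function(s, e, eq) all(eq[s:e]),
                              subs$start, subs$end - 1L,
                              MoreArgs = list(diff(df$express) == 0L))
    subs
    ##   start end  id consistent
    ## 1     2   5 s36       TRUE
    ## 2     5   7 s36      FALSE
    ## 3     9  11 w09      FALSE

    # Sum per id: number of consistent units and total number of units.
    res <- aggregate(cbind(consistent, total = rep(1L, length(id))) ~ id, subs, sum)
    res
    ##    id consistent total
    ## 1 s36          1     2
    ## 2 w09          0     1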
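The unit detection leans on run-length encoding: rle() collapses consecutive identical wpt values, so a stretch of several "inter" rows becomes a single run that can be tested against its two neighbours. A small illustration on the same wpt column (the names wpt, r, and runs are only for this sketch):

```r
# The wpt column from the example data, as a character vector.
wpt <- c('1', '2', 'inter', 'inter', '3', 'inter', '4', '5', '6', 'inter', '7')

# rle() returns the distinct consecutive values and how long each run is.
r <- rle(wpt)

# cumsum() of the run lengths gives the last row number of each run.
runs <- data.frame(value = r$values,
                   length = r$lengths,
                   end_row = cumsum(r$lengths),
                   stringsAsFactors = FALSE)
print(runs)
```

The two adjacent "inter" rows show up as one run of length 2 ending at row 4, which is why the unit from wpt 2 to wpt 3 spans rows 2 to 5.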

Data

    df <- data.frame(
        wpt = c('1', '2', 'inter', 'inter', '3', 'inter', '4', '5', '6', 'inter', '7'),
        id = c('s36', 's36', 's36', 's36', 's36', 's36', 's36', 's36', 'w09', 'w09', 'w09'),
        fuel = c(12L, 14L, NA, NA, 16L, NA, 18L, 22L, 45L, NA, 48L),
        express = c(0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L),
        local = c(1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L),
        stringsAsFactors = FALSE
    )
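Part (4) of the question, the linear interpolation of fuel, is not covered by the code above. One possible approach, offered only as a minimal sketch: use approx() from the stats package on row position, applied per id with ave(). The helper name interp_na is hypothetical, and the sketch assumes rows are evenly spaced and that every id starts and ends with a known fuel value.

```r
# Hypothetical helper (not part of the original answer): fill NA values in a
# numeric vector by straight-line interpolation over row position.
interp_na <- function(x) {
    idx <- seq_along(x)
    known <- !is.na(x)
    # approx() interpolates linearly between the known points; it needs at
    # least two non-NA values and returns NA outside their range.
    approx(idx[known], x[known], xout = idx)$y
}

# Same rows as the example data, with only the columns needed here.
df <- data.frame(
    wpt = c('1', '2', 'inter', 'inter', '3', 'inter', '4', '5', '6', 'inter', '7'),
    id = c('s36', 's36', 's36', 's36', 's36', 's36', 's36', 's36', 'w09', 'w09', 'w09'),
    fuel = c(12, 14, NA, NA, 16, NA, 18, 22, 45, NA, 48),
    stringsAsFactors = FALSE
)

# ave() applies the helper within each id, so values never bleed across ids.
df$fuel <- ave(df$fuel, df$id, FUN = interp_na)
print(df)
```

With this 11-row data the s36 gaps are filled with 14.66667, 15.33333, and 17, and the single w09 "inter" row becomes 46.5.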
