Getting counts of data in R
I am using R for the first time. I have the following data set (a mockup of the large data set I am working with):
type     date        size  color
l shape  2008-04-14  161   blue
l shape  2010-10-16  654   yellow
l shape  2005-07-03  149   blue
l shape  2006-08-16  657   yellow
l shape  2007-04-08  229   yellow
l shape  2004-03-17  784   green
y shape  2014-02-22  917   pink
y shape  2012-05-04  186   green
y shape  2006-11-25  641   yellow
y shape  2015-09-07  493   blue
y shape  2011-07-06  953   green
I would like to get the number of occurrences of each color for each type, the dates for each type, and the min, max, and mean size for each type. The output should look like this:
type     colors  dates       mean size  min size  max size
l shape  3       2008-04-14  439        149       784
                 2010-10-16
                 2005-07-03
                 2006-08-16
                 2007-04-08
                 2004-03-17
y shape  4       2014-02-22  638        186       953
                 2012-05-04
                 2006-11-25
                 2015-09-07
                 2011-07-06
This is what I've done so far:
cat <- big_catalog
na.rm = TRUE
library(plyr)
mydata <- ddply(cat, c("type", "date", "size", "color"), summarize,
                colors = length(color),
                dates = (date),
                mean_size = mean(size),
                minimum_size = min(size),
                maximum_size = max(size))
But I end up with this:
type     date        size  color   colors  dates       mean size  min size  max size
l shape  2008-04-14  161   blue    2       2008-04-14  161        161       161
l shape  2010-10-16  654   yellow  3       2010-10-16  654        654       654
l shape  2005-07-03  149   blue    2       2005-07-03  149        149       149
l shape  2006-08-16  657   yellow  3       2006-08-16  657        657       657
l shape  2007-04-08  229   yellow  2       2007-04-08  229        229       229
l shape  2004-03-17  784   green   1       2004-03-17  784        784       784
y shape  2014-02-22  917   pink    1       2014-02-22  917        917       917
y shape  2012-05-04  186   green   2       2012-05-04  186        186       186
y shape  2006-11-25  641   yellow  1       2006-11-25  641        641       641
y shape  2015-09-07  493   blue    1       2015-09-07  493        493       493
y shape  2011-07-06  953   green   2       2011-07-06  953        953       953
I apparently need to loop over this, but I'm new to R and don't see how to do it.
Something like this:
# Type names are written without spaces so read.table() splits the columns correctly
df <- read.table(text = "type date size color
lshape 2008-04-14 161 blue
lshape 2010-10-16 654 yellow
lshape 2005-07-03 149 blue
lshape 2006-08-16 657 yellow
lshape 2007-04-08 229 yellow
lshape 2004-03-17 784 green
yshape 2014-02-22 917 pink
yshape 2012-05-04 186 green
yshape 2006-11-25 641 yellow
yshape 2015-09-07 493 blue
yshape 2011-07-06 953 green", header = TRUE)

# Summarise each type separately: distinct colors, all dates, and size statistics
by(df, df$type, function(x) {
  data.frame(colors = length(unique(x$color)),
             dates = paste(x$date, collapse = ";"),
             mean.size = mean(x$size),
             min.size = min(x$size),
             max.size = max(x$size))
})
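Note that by() prints one block per type rather than a single table. If you would rather have one data frame with a row per type, a minimal sketch in base R might look like the following. The names `pieces` and `result` are only illustrative, and the type values are again written without spaces, which is an assumption about how the real data can be read in.

```r
# Same mock data as in the question, with type names written without spaces
df <- read.table(text = "type date size color
lshape 2008-04-14 161 blue
lshape 2010-10-16 654 yellow
lshape 2005-07-03 149 blue
lshape 2006-08-16 657 yellow
lshape 2007-04-08 229 yellow
lshape 2004-03-17 784 green
yshape 2014-02-22 917 pink
yshape 2012-05-04 186 green
yshape 2006-11-25 641 yellow
yshape 2015-09-07 493 blue
yshape 2011-07-06 953 green", header = TRUE)

# Split the rows by type and build a one-row summary for each piece
pieces <- lapply(split(df, df$type), function(x) {
  data.frame(type = x$type[1],
             colors = length(unique(x$color)),        # number of distinct colors
             dates = paste(x$date, collapse = ";"),   # all dates in one string
             mean.size = mean(x$size),
             min.size = min(x$size),
             max.size = max(x$size))
})

# Stack the one-row summaries into a single data frame
result <- do.call(rbind, pieces)
rownames(result) <- NULL
print(result)
```

The key difference from the ddply attempt in the question is that the data is grouped by type only. Grouping by type, date, size, and color together makes nearly every row its own group, which is why the mean, min, and max all came back equal to the row's own size.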