Getting counts of data in R


I am using R for the first time. I have the following data set (a mockup of the large data set I am working with):

type      date         size   color
l shape   2008-04-14   161    blue
l shape   2010-10-16   654    yellow
l shape   2005-07-03   149    blue
l shape   2006-08-16   657    yellow
l shape   2007-04-08   229    yellow
l shape   2004-03-17   784    green
y shape   2014-02-22   917    pink
y shape   2012-05-04   186    green
y shape   2006-11-25   641    yellow
y shape   2015-09-07   493    blue
y shape   2011-07-06   953    green

I want the number of occurrences of each color for each type, the dates for each type, and the min, max, and mean size for each type. The output should look like this:

type      colors   dates        mean size   min size   max size
l shape   3        2008-04-14   439         149        784
                   2010-10-16
                   2005-07-03
                   2006-08-16
                   2007-04-08
                   2004-03-17
y shape   4        2014-02-22   638         186        953
                   2012-05-04
                   2006-11-25
                   2015-09-07
                   2011-07-06

This is what I’ve done so far:

    cat <- big_catalog
    na.rm = TRUE
    library(plyr)
    mydata <- ddply(cat, c("type", "date", "size", "color"), summarize,
                    colors = length(color),
                    dates = (date),
                    mean_size = mean(size),
                    minimum_size = min(size),
                    maximum_size = max(size))

But I end up with this:

type      date         size   color    colors   dates        mean size   min size   max size
l shape   2008-04-14   161    blue     2        2008-04-14   161         161        161
l shape   2010-10-16   654    yellow   3        2010-10-16   654         654        654
l shape   2005-07-03   149    blue     2        2005-07-03   149         149        149
l shape   2006-08-16   657    yellow   3        2006-08-16   657         657        657
l shape   2007-04-08   229    yellow   2        2007-04-08   229         229        229
l shape   2004-03-17   784    green    1        2004-03-17   784         784        784
y shape   2014-02-22   917    pink     1        2014-02-22   917         917        917
y shape   2012-05-04   186    green    2        2012-05-04   186         186        186
y shape   2006-11-25   641    yellow   1        2006-11-25   641         641        641
y shape   2015-09-07   493    blue     1        2015-09-07   493         493        493
y shape   2011-07-06   953    green    2        2011-07-06   953         953        953

I apparently need to loop over this, but I am new to R and don’t see how to do it.

Something like this:

    # Mock data; the types are written without a space so that
    # read.table splits the columns on whitespace correctly.
    df <- read.table(text = "type       date         size   color
    lshape    2008-04-14   161    blue
    lshape    2010-10-16   654    yellow
    lshape    2005-07-03   149    blue
    lshape    2006-08-16   657    yellow
    lshape    2007-04-08   229    yellow
    lshape    2004-03-17   784    green
    yshape    2014-02-22   917    pink
    yshape    2012-05-04   186    green
    yshape    2006-11-25   641    yellow
    yshape    2015-09-07   493    blue
    yshape    2011-07-06   953    green", header = TRUE)

    # Apply the summary function once per type; x is the subset of rows for that type.
    by(df, df$type, function(x) {
      data.frame(colors = length(unique(x$color)),        # number of distinct colors
                 dates = paste(x$date, collapse = ";"),   # all dates joined into one string
                 mean.size = mean(x$size),
                 min.size = min(x$size),
                 max.size = max(x$size))
    })
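The by() call above prints one small data frame per type. If a single table with one row per type is wanted, a minimal base R sketch could look like the following. It swaps by() for split() plus lapply() and stacks the pieces with do.call(rbind, ...). The helper name summarise_type and the variable result are made up for illustration, and the types are again written without a space. This also shows why the plyr attempt returned one row per input row: it grouped on all four columns, so each group held a single row, whereas here the grouping is on type alone.

```r
# Same mock data as in the answer (types without a space so the
# whitespace-separated columns parse correctly).
df <- read.table(text = "type date size color
lshape 2008-04-14 161 blue
lshape 2010-10-16 654 yellow
lshape 2005-07-03 149 blue
lshape 2006-08-16 657 yellow
lshape 2007-04-08 229 yellow
lshape 2004-03-17 784 green
yshape 2014-02-22 917 pink
yshape 2012-05-04 186 green
yshape 2006-11-25 641 yellow
yshape 2015-09-07 493 blue
yshape 2011-07-06 953 green", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: summarise the rows belonging to one type.
summarise_type <- function(x) {
  data.frame(
    type      = x$type[1],
    colors    = length(unique(x$color)),       # distinct colors in this type
    dates     = paste(x$date, collapse = ";"), # all dates in one string
    mean.size = mean(x$size),
    min.size  = min(x$size),
    max.size  = max(x$size),
    stringsAsFactors = FALSE
  )
}

# split() yields one data frame per type, lapply() summarises each,
# and do.call(rbind, ...) stacks the one-row results into one table.
result <- do.call(rbind, lapply(split(df, df$type), summarise_type))
rownames(result) <- NULL
print(result)
```

Grouping only on type is the key design choice; the dates are collapsed into one string per type because a data frame cell cannot hold several rows of values the way the desired output layout suggests.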
