r - ggplot: confusion about user-defined bin number in density plot
My basic question is how to set the bin number (the default is 30) in geom_density.
I found that the density on the y-axis did not change even though the bin number had been modified.
Here is an example:
    library(ggplot2)

    values <- runif(1000, 1, 100)
    ind <- as.factor(rep(c(1:2), each = 500))
    inout <- as.factor(rep(c(1:2), each = 500))
    df <- data.frame(values, ind, inout)

    ggplot(df, aes(x = values, ..density..)) +
      geom_freqpoly(aes(group = interaction(ind, inout), colour = factor(inout)),
                    alpha = 1, bins = 1)
The density should be 1, because the bin number is defined as 1. However, the result did not show what I expected.
Do you know what I am missing here? Any tips on how to define the bin number or bin threshold in ggplot geom_density?
Thanks a lot.
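In ggplot you don't set the number of bins per se; instead you set the width of the bins using binwidth (the default is range/30). bins isn't a term that geom_freqpoly understands, so it is ignored in your example code. Note that newer ggplot2 releases do accept a bins argument, so this behaviour depends on the installed version.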
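If what you really want is a given number of bins, one option is to turn that count into a width yourself. A minimal sketch, assuming a hypothetical target of 10 bins (n_bins and bin_width are illustrative names, not from the original post); the plot is guarded so the arithmetic still runs if ggplot2 is not installed:

```r
set.seed(1)
values <- runif(1000, 1, 100)

n_bins <- 10                                # hypothetical target number of bins
bin_width <- diff(range(values)) / n_bins   # width that yields about n_bins bins
print(bin_width)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  df <- data.frame(values)
  p <- ggplot(df, aes(x = values, ..density..)) +
    geom_freqpoly(binwidth = bin_width)
  print(p)
}
```

With the data spanning a range of roughly 99, each of the 10 bins comes out about 9.9 wide.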
I think an example using the range 0-1 (instead of 1-100) will better illustrate what you are expecting to see:
    values <- runif(1000, 0, 1)  # generate values between 0 and 1
    ind <- as.factor(rep(c(1:2), each = 500))
    inout <- as.factor(rep(c(1:2), each = 500))
    df <- data.frame(values, ind, inout)

    # use the default binwidth, i.e. 1/30
    ggplot(df, aes(x = values, ..density..)) +
      geom_freqpoly(aes(group = interaction(ind, inout), colour = factor(inout)),
                    alpha = 1)
This gives a graph similar to the one your code generated.
With a range of 1, setting binwidth = 1 means there is 1 bin, which gives a density of 1 at a value of 0.5. Notice that the plotted range of values is -0.5 to 1.5, as the area under the density curve must sum to 1.
    ggplot(df, aes(x = values, ..density..)) +
      geom_freqpoly(aes(group = interaction(ind, inout), colour = factor(inout)),
                    alpha = 1, binwidth = 1)  # binwidth = 1
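The arithmetic behind the single-bin case can be checked without any plotting. The sketch below uses base R's hist() rather than ggplot2 (a swapped-in tool, chosen only because it applies the same density definition, count / (n * binwidth)); the variable names are illustrative. It also suggests why the original 1-100 data could not reach a density of 1: a single bin of width 99 has a density of 1/99.

```r
set.seed(42)
n <- 1000

# Values on 0-1: one bin of width 1.
unit_values <- runif(n, 0, 1)
h_unit <- hist(unit_values, breaks = c(0, 1), plot = FALSE)

# Values on 1-100: one bin of width 99.
wide_values <- runif(n, 1, 100)
h_wide <- hist(wide_values, breaks = c(1, 100), plot = FALSE)

# density = count / (n * binwidth)
unit_density <- h_unit$counts / (n * 1)
wide_density <- h_wide$counts / (n * 99)

print(unit_density)       # 1
print(wide_density)       # 1/99, about 0.0101
print(wide_density * 99)  # area of the single bin: 1
```

In both cases the area (density times binwidth) is 1; only the height changes with the width of the bin.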
If you increase the number of points you randomly generate and decrease the binwidth (e.g. try 0.1, 0.01, 0.001, etc.), you'll get closer to the "square-looking" probability density function you'd expect for a uniform distribution (e.g. as shown on Wikipedia).
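This convergence can also be checked numerically. A rough sketch in base R (again using hist() as a stand-in for the binning that ggplot2 does; max_deviation is a hypothetical helper) that measures how far the binned density strays from the flat value of 1:

```r
set.seed(123)

# Largest gap between the binned density and the true uniform density of 1.
max_deviation <- function(n_points, binwidth) {
  x <- runif(n_points, 0, 1)
  n_bins <- round(1 / binwidth)
  breaks <- seq(0, 1, length.out = n_bins + 1)  # equal-width bins covering 0-1
  h <- hist(x, breaks = breaks, plot = FALSE)
  max(abs(h$density - 1))
}

dev_small <- max_deviation(1e3, 0.1)   # few points, wide bins
dev_large <- max_deviation(1e6, 0.1)   # many points, same bins
dev_fine  <- max_deviation(1e6, 0.01)  # many points, narrower bins

print(c(dev_small, dev_large, dev_fine))
```

Narrower bins resolve the edges of the distribution more sharply but are noisier for a fixed number of points, which is why the number of points has to grow as the binwidth shrinks.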