python - Recognize separate normal distributions in one data set
A model I have constructed produces output that takes the shape of three normal distributions.
    import numpy as np

    d1 = [np.random.normal(2, .1) for _ in range(100)]
    d2 = [np.random.normal(2.5, .1) for _ in range(100)]
    d3 = [np.random.normal(3, .1) for _ in range(100)]
    sudo_model_output = d1 + d2 + d3
    np.random.shuffle(sudo_model_output)
What is a Pythonic way to find the mean and standard deviation associated with each normal distribution? I cannot hardcode an estimate of where the distributions start and end (~2.25 and 2.75 here) because the values will change with each iteration of my simulation.
I adapted this fit from: Fitting a histogram with Python
    from scipy.optimize import leastsq
    import numpy as np
    import matplotlib.pyplot as plt
    # In a Jupyter notebook, run `%matplotlib inline` first to show plots inline.

    d1 = [np.random.normal(2, .1) for _ in range(1000)]
    d2 = [np.random.normal(2.5, .1) for _ in range(1000)]
    d3 = [np.random.normal(3, .1) for _ in range(1000)]
    sum1 = d1 + d2 + d3
    bins = np.arange(0, 4, 0.01)
    a = np.histogram(sum1, bins=bins)

    # Sum of three Gaussians; p holds [amplitude, center, width] for each peak.
    fitfunc = lambda p, x: (p[0] * np.exp(-0.5 * ((x - p[1]) / p[2])**2) +
                            p[3] * np.exp(-0.5 * ((x - p[4]) / p[5])**2) +
                            p[6] * np.exp(-0.5 * ((x - p[7]) / p[8])**2))
    errfunc = lambda p, x, y: y - fitfunc(p, x)

    xdata, ydata = bins[:-1], a[0]
    plt.plot(xdata, ydata)

    # Initial guesses: amplitude, center, width for each of the three peaks.
    init = [40, 2.1, 0.1, 40, 2.4, 0.1, 40, 3.1, 0.1]
    out = leastsq(errfunc, init, args=(xdata, ydata))
    c = out[0]
    print(c)
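The fit looks pretty good, but note that I supplied close initial guesses (see init) for the amplitude, center, and width of each peak, which make up these 9 variables. If you knew the peaks all had the same height or width, you could lower the number of variables, which would help the fit.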
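As a rough sketch of both ideas (not part of the original answer), the version below derives the initial guesses from the histogram itself instead of hardcoding them, and assumes all peaks share one width, which cuts the fit from 9 to 7 variables. The 15-bin smoothing window, the `distance` and `prominence` thresholds, and the helper names `model` and `residuals` are illustrative assumptions that would likely need tuning for other data.

```python
import numpy as np
from scipy.optimize import leastsq
from scipy.signal import find_peaks, peak_widths

# Synthetic stand-in for the model output; fixed seed for reproducibility.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(mu, 0.1, 1000) for mu in (2.0, 2.5, 3.0)])

bins = np.arange(0, 4, 0.01)
counts, edges = np.histogram(samples, bins=bins)
centers = 0.5 * (edges[:-1] + edges[1:])  # bin midpoints
bin_width = edges[1] - edges[0]

# Smooth with a 15-bin moving average (assumed window), then locate peaks.
smooth = np.convolve(counts, np.ones(15) / 15, mode="same")
peaks, _ = find_peaks(smooth, distance=25, prominence=0.3 * smooth.max())

# Rough shared-width guess: median peak width at half prominence,
# converted from bins to data units and from FWHM to a standard deviation.
fwhm_bins = peak_widths(smooth, peaks, rel_height=0.5)[0]
width_guess = np.median(fwhm_bins) * bin_width / 2.355


def model(params, x):
    """Sum of Gaussians with one shared width.

    params = [width, amp1, mu1, amp2, mu2, ...]
    """
    width = params[0]
    total = np.zeros_like(x, dtype=float)
    for amp, mu in zip(params[1::2], params[2::2]):
        total += amp * np.exp(-0.5 * ((x - mu) / width) ** 2)
    return total


def residuals(params, x, y):
    return y - model(params, x)


# Initial guess: shared width, then height and position of each detected peak.
init = [width_guess]
for i in peaks:
    init += [smooth[i], centers[i]]

fit, _ = leastsq(residuals, init, args=(centers, counts))
width = abs(fit[0])        # the width only enters squared, so its sign is arbitrary
means = np.sort(fit[2::2])
print("shared standard deviation:", width)
print("means:", means)
```

If the widths are not actually equal, the same peak-detection step can seed the original 9-variable fit instead; only `model` and the layout of `init` would change.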