Predicting future values using OLS regression (Python, StatsModels, Pandas)
I'm trying to implement MLR (multiple linear regression) in Python and am not sure how to go about applying the coefficients I've found to future values.
    import pandas as pd
    import statsmodels.formula.api as sm
    import statsmodels.api as sm2

    tv = [230.1, 44.5, 17.2, 151.5, 180.8]
    radio = [37.8, 39.3, 45.9, 41.3, 10.8]
    newspaper = [69.2, 45.1, 69.3, 58.5, 58.4]
    sales = [22.1, 10.4, 9.3, 18.5, 12.9]
    df = pd.DataFrame({'tv': tv,
                       'radio': radio,
                       'newspaper': newspaper,
                       'sales': sales})

    y = df.sales
    x = df[['tv', 'radio', 'newspaper']]
    x = sm2.add_constant(x)       # add the intercept column
    model = sm2.OLS(y, x).fit()   # array-based OLS is exposed by statsmodels.api

    >>> model.params
    const       -0.141990
    tv           0.070544
    radio        0.239617
    newspaper   -0.040178
    dtype: float64
So let's say I want to predict "sales" for the following DataFrame:
Edit:

    tv      radio   newspaper   sales
    230.1   37.8    69.2        22.4
    44.5    39.3    45.1        10.1
    ...     ...     ...         ...
    25      15      15
    30      20      22
    35      22      36
I've been trying the method I found here, but can't seem to get it working: Forecasting using Pandas OLS

Thank you!
Assuming df2 is your new out-of-sample DataFrame:
    model = sm2.OLS(y, x).fit()
    # df2 holds only the new rows, so select its predictor columns directly
    new_x = df2[['tv', 'radio', 'newspaper']].values
    new_x = sm2.add_constant(new_x)  # sm2 = statsmodels.api
    y_predict = model.predict(new_x)

    >>> y_predict
    array([ 4.61319034,  5.88274588,  6.15220225])
You can assign the results directly to df2 as follows:

    df2.loc[:, 'sales'] = model.predict(new_x)
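As an alternative to the approach above, the formula interface in statsmodels.formula.api (which the question already imports) adds the intercept for you, so add_constant is not needed and predict can take the new DataFrame as-is. This is a sketch of a different route, not the method used above; the smf alias and the contents of df2 are assumptions for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Training data from the question
df = pd.DataFrame({
    'tv': [230.1, 44.5, 17.2, 151.5, 180.8],
    'radio': [37.8, 39.3, 45.9, 41.3, 10.8],
    'newspaper': [69.2, 45.1, 69.3, 58.5, 58.4],
    'sales': [22.1, 10.4, 9.3, 18.5, 12.9],
})

# Assumed out-of-sample rows (no sales column yet)
df2 = pd.DataFrame({
    'tv': [25.0, 30.0, 35.0],
    'radio': [15.0, 20.0, 22.0],
    'newspaper': [15.0, 22.0, 36.0],
})

# The formula names the response and predictors; an intercept is included by default
model = smf.ols('sales ~ tv + radio + newspaper', data=df).fit()

# predict looks up the predictor columns in df2 by name
df2['sales'] = model.predict(df2)
print(df2)
```

Because the columns are matched by name, this version does not depend on the column order of df2.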
To fill the missing sales values in the original DataFrame with predictions from the regression, try:
    # fit on the rows where sales is known
    x = df.loc[df.sales.notnull(), ['tv', 'radio', 'newspaper']]
    x = sm2.add_constant(x)
    y = df[df.sales.notnull()].sales

    model = sm2.OLS(y, x).fit()

    # predict for the rows where sales is missing
    new_x = df.loc[df.sales.isnull(), ['tv', 'radio', 'newspaper']]
    new_x = sm2.add_constant(new_x)  # sm2 = statsmodels.api

    df.loc[df.sales.isnull(), 'sales'] = model.predict(new_x)