Python 2.7 - Strange behaviour when adding columns
I'm using Python 2.7.8 | Anaconda 2.1.0. I'm wondering why the strange behavior below occurs.
I create a pandas DataFrame with 2 columns, then add a third column by summing the first 2 columns:
import numpy as np
import pandas as pd

x = pd.DataFrame(np.random.randn(5, 2), columns=['a', 'b'])
x['c'] = x[['a', 'b']].sum(axis=1)
# or: x['c'] = x['a'] + x['b']

Out[7]:
          a         b         c
0 -1.644246  0.851602 -0.792644
1 -0.129092  0.237140  0.108049
2  0.623160  0.105494  0.728654
3  0.737803 -1.612189 -0.874386
4  0.340671 -0.113334  0.227337
All good so far. Now I want to set the values of column 'c' to 0 if they are negative:
x[x['c']<0] = 0

Out[9]:
          a         b         c
0  0.000000  0.000000  0.000000
1 -0.129092  0.237140  0.108049
2  0.623160  0.105494  0.728654
3  0.000000  0.000000  0.000000
4  0.340671 -0.113334  0.227337
This gives the desired result in column 'c', but for some reason columns 'a' and 'b' have been modified as well, which I don't want to happen. Why is this happening, and how can I fix this behavior?
You have to specify that you only want the 'c' column:
x.loc[x['c']<0, 'c'] = 0
When you index with a boolean array/Series, you select full rows, as you can see in this example:
In [46]: x['c']<0
Out[46]:
0     True
1    False
2    False
3     True
4    False
Name: c, dtype: bool

In [47]: x[x['c']<0]
Out[47]:
          a         b         c
0 -0.444493 -0.592318 -1.036811
3 -1.363727 -1.572558 -2.936285
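To see the two assignments side by side, here is a minimal sketch. It assumes fixed values copied from the question's first output instead of random data, so the result is reproducible; the names `mask`, `whole_rows` and `only_c` are illustrative and not part of the original post.

```python
import pandas as pd

# Fixed values (copied from the question's output) instead of np.random.randn,
# so that the same rows are negative on every run.
x = pd.DataFrame({
    'a': [-1.644246, -0.129092, 0.623160, 0.737803, 0.340671],
    'b': [0.851602, 0.237140, 0.105494, -1.612189, -0.113334],
})
x['c'] = x['a'] + x['b']

# Boolean Series with one entry per row: True where 'c' is negative.
mask = x['c'] < 0

# No column given: every column of the matching rows is overwritten.
whole_rows = x.copy()
whole_rows[mask] = 0

# Row mask plus column label: only 'c' is overwritten in the matching rows.
only_c = x.copy()
only_c.loc[mask, 'c'] = 0

print(whole_rows)
print(only_c)
```

In `whole_rows`, rows 0 and 3 are zeroed across 'a', 'b' and 'c', which reproduces the behavior from the question. In `only_c`, columns 'a' and 'b' keep their original values and only the negative entries of 'c' become 0.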