python - group-by + case when Equivalent -


want select:

select     user_id,     max(case when value > 0 timestamp else 0 end) max_timestamp_when_value_is_positive df group user_id 

what right way aggregate?

groupped = raw_data.groupby('user_id') res = groupped.agg({<how-to-do-described-aggregation?>}) 

update explanation , example.

in [2]: df = pd.dataframe({'user_id': [1, 1, 1, 2, 2, 3, 3, 3, 3],                            'timestamp': [100, 200, 300, 10, 110, 10, 110, 210, 250],                            'value': [0, 1, 0, 0, 0, 0, 10, 0, 1]})  in [3]: groupped = df.groupby('user_id')  in [4]: res = groupped.agg({'timestamp': [min, max],                             'value': lambda x: sum(x > 0),                             <described-magic>})  in [5]: res out[5]:          timestamp         value   <...magic...>               min  max <lambda> user_id                         1             100  300        1    200 2              10  110        0    0 3              10  250        2    210 

magic want.

create new column positive_value_timestamp as

df['positive_value_timestamp'] = df.timestamp * df.value.apply(lambda x: 1 if x > 0 else 0) 

when grouping, take max of column

res = df.groupby('user_id').agg(     {         'timestamp': [min, max],         'value': sum,         'positive_value_timestamp': max     }) 

Comments

Popular posts from this blog

IF statement in MySQL trigger -

c++ - What does MSC in "// appease MSC" comments mean? -

javascript - Blogger related post gadget image Resize s72-c [ Need Expert Help ] -