python - Drop pandas dataframe rows based on groupby() condition
I have the following pandas DataFrame as input:

    store_id  item_id  items_sold        date
           1        1           0  2015-12-28
           1        2           1  2015-12-28
           1        1           0  2015-12-28
           2        2           0  2015-12-28
           2        1           1  2015-12-29
           2        2           1  2015-12-29
           2        1           0  2015-12-29
           3        1           0  2015-12-30
           3        1           0  2015-12-30

I need to drop the rows for items that have never been sold in a particular store. In this DataFrame those are the (store_id, item_id) pairs (1, 1) and (3, 1).

The output I expect is the following:

    store_id  item_id  items_sold        date
           1        2           1  2015-12-28
           2        2           0  2015-12-28
           2        1           1  2015-12-29
           2        2           1  2015-12-29
           2        1           0  2015-12-29
I've figured out how to find the required (store_id, item_id) pairs using pd.groupby()[].sum(), but I'm stuck on dropping them from the initial DataFrame.
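For reference, the pair-finding step described above might look like the following sketch. The variable names `totals` and `never_sold` are illustrative assumptions, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "store_id":   [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "item_id":    [1, 2, 1, 2, 1, 2, 1, 1, 1],
    "items_sold": [0, 1, 0, 0, 1, 1, 0, 0, 0],
    "date": ["2015-12-28"] * 4 + ["2015-12-29"] * 3 + ["2015-12-30"] * 2,
})

# Total items sold per (store_id, item_id) pair: one value per group.
totals = df.groupby(["store_id", "item_id"])["items_sold"].sum()

# Pairs whose total is zero, i.e. items never sold in that store.
never_sold = totals[totals == 0].index.tolist()
print(never_sold)
```

The result has one row per pair rather than one per original row, which is why it cannot be used directly as a row mask on the initial DataFrame.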
Is this what you want?

    In [30]: df[df.groupby(['store_id', 'item_id'])['items_sold'].transform('sum') > 0]
    Out[30]:
       store_id  item_id  items_sold        date
    1         1        2           1  2015-12-28
    3         2        2           0  2015-12-28
    4         2        1           1  2015-12-29
    5         2        2           1  2015-12-29
    6         2        1           0  2015-12-29
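A self-contained version of the approach above, as a sketch: `transform('sum')` broadcasts each group's total back onto every row of that group, so the result aligns with the original index and can serve as a boolean mask. The names `pair_total`, `result`, and `alt` are illustrative, and the `groupby(...).filter(...)` variant is an alternative that is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "store_id":   [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "item_id":    [1, 2, 1, 2, 1, 2, 1, 1, 1],
    "items_sold": [0, 1, 0, 0, 1, 1, 0, 0, 0],
    "date": ["2015-12-28"] * 4 + ["2015-12-29"] * 3 + ["2015-12-30"] * 2,
})

# Per-row total of items_sold for the row's (store_id, item_id) group.
pair_total = df.groupby(["store_id", "item_id"])["items_sold"].transform("sum")

# Keep only rows whose pair has sold at least one item.
result = df[pair_total > 0]

# Alternative: keep whole groups that satisfy the condition.
alt = df.groupby(["store_id", "item_id"]).filter(
    lambda g: g["items_sold"].sum() > 0
)

print(result)
```

On larger frames the `transform` mask is usually the faster of the two, since `filter` calls the Python lambda once per group.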