Pandas: How to fill in missing dates in a MultiIndex

 2 years ago
source link: https://www.codesd.com/item/pandas-pandas-how-the-filling-date-varies-in-a-multiindex.html
Suppose I am trying to organize sales data for a membership business.

I only have the start and end dates. Ideally, sales between the start and end dates would appear as 1 instead of missing.

I can't get the 'date' index level to be filled with the in-between dates. That is, I want a continuous run of months instead of gaps. I also need to fill the missing data in the columns with ffill.

I have tried different ways such as stack/unstack and reindex but different errors occur. I'm guessing there's a clean way to do this. What's the best practice to do this?

Suppose the data has this MultiIndex structure:

                     variable  sales
vendor date
a      2014-01-01  start date      1
       2014-03-01    end date      1
b      2014-03-01  start date      1
       2014-07-01    end date      1
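
For anyone who wants to reproduce this, the frame above could be rebuilt roughly as follows. This is only a sketch: the post shows the printed frame, not how it was created, so the construction and the names `index` and `df` are assumptions.

```python
import pandas as pd

# Rebuild the question's frame: a (vendor, date) MultiIndex with two columns.
index = pd.MultiIndex.from_tuples(
    [
        ("a", pd.Timestamp("2014-01-01")),
        ("a", pd.Timestamp("2014-03-01")),
        ("b", pd.Timestamp("2014-03-01")),
        ("b", pd.Timestamp("2014-07-01")),
    ],
    names=["vendor", "date"],
)
df = pd.DataFrame(
    {
        "variable": ["start date", "end date", "start date", "end date"],
        "sales": [1, 1, 1, 1],
    },
    index=index,
)
print(df)
```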

And the desired result:

                     variable  sales
vendor date
a      2014-01-01  start date      1
       2014-02-01         NaN      1
       2014-03-01    end date      1
b      2014-03-01  start date      1
       2014-04-01         NaN      1
       2014-05-01         NaN      1
       2014-06-01         NaN      1
       2014-07-01    end date      1


You can do:

>>> # Take the first row of each monthly bin; empty months become NaN rows.
>>> f = lambda df: df.resample('M').first()
>>> df.reset_index(level=0).groupby('vendor').apply(f).drop('vendor', axis=1)
                     variable  sales
vendor date
a      2014-01-31  start date      1
       2014-02-28         NaN    NaN
       2014-03-31    end date      1
b      2014-03-31  start date      1
       2014-04-30         NaN    NaN
       2014-05-31         NaN    NaN
       2014-06-30         NaN    NaN
       2014-07-31    end date      1

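Then call .fillna(1) on the sales column if needed. Note that rule='M' labels each row with the month end; rule='MS' keeps month-start dates, as in the desired result.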
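
Putting the resampling and the fill together, a minimal sketch might look like the function below. It is not the answerer's exact code: the name `fill_months` is hypothetical, it loops over vendors instead of using `groupby(...).apply`, it uses the `'MS'` (month start) rule so the dates match the desired result, and it forward-fills `sales` as the question asks. It assumes a frame indexed by `vendor` and `date` with `variable` and `sales` columns, as in the question.

```python
import pandas as pd


def fill_months(df: pd.DataFrame) -> pd.DataFrame:
    """Insert the missing month-start rows for each vendor and fill sales."""
    pieces = {}
    for vendor, group in df.groupby(level="vendor"):
        # Drop the vendor level so the group is indexed by date only,
        # then create one row per month between its first and last date.
        monthly = group.droplevel("vendor").resample("MS").first()
        # Carry the last known sales value into the newly created months.
        monthly["sales"] = monthly["sales"].ffill().astype(int)
        pieces[vendor] = monthly
    # Reassemble the per-vendor frames under a (vendor, date) MultiIndex.
    return pd.concat(pieces, names=["vendor", "date"])
```

Applied to the frame in the question, this should return eight rows: three for vendor a and five for vendor b. The `variable` column is left missing for the in-between months, and `sales` is 1 throughout.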

