How to compare dates from Excel sheets for value filling purposes on python -


i have excel file 2 sheets.

one sheet containing data:

date       tmax tmin 20110706    317 211 20110707    322 211 20110708    317 211 20110709    322 211 20110710    328 222 20110711    333 244 20110712    356 250 20110713    356 222 

and other sheet includes:

start date  end date    rep month    cost    kwh     kw  7/6/2011    8/3/2011    july     5,065.17    76,640      205  8/3/2011    9/7/2011    august   5,572.38    86,640      195  

my goal write column on sheet 1 (kwh) sheet 2 depending on if date on sheet 1 falls within range of kwh.

for example:

date        tmax    tmin    kwh 20110706    317   211   76640 20110707    322   211   76640 20110708    317   211   76640 20110709    322   211   76640 20110710    328   222   76640 20110711    333   244   76640 20110712    356   250   76640 20110713    356   222   76640 20110801    344   228   76640 20110802    356   200   76640 20110803    367   200   86640 20110804    361   228   86640 

i having trouble figuring out how kind of algorithmic parsing able implement trying do.

i familiar how write file read file/cells pandas.

here code:

import pandas pd pandas import excelwriter  df = pd.read_excel("thecddhddtest.xlsx",'sheet1') df2 = pd.read_excel("thecddhddtest.xlsx",'sheet2') df.head()   df["date"] = pd.to_datetime(df["date"], format="%y%m%d") pd.to_datetime(df2["start date"], format="%m/%d/%y")  df3 = df2.set_index("start date")   df3["kwh"].reindex(df["date"], method="ffill") df["kwh"] = df3["kwh"].reindex(df["date"], method="ffill") print(df["kwh"])   writer = excelwriter('thecddhddtestkwh.xlsx') df.to_excel(writer,'sheet1',index=false) df2.to_excel(writer,'sheet2',index=false) writer.save() 

which results in:

date       tmax tmin kwh 20110706    317 211 20110707    322 211 20110708    317 211 20110709    322 211 20110710    328 222 20110711    333 244 20110712    356 250 20110713    356 222 

kwh cell empty reason

it's critical parse date columns pandas timestamps/ numpy datetime64. best way use to_datetime format.

    in [11]: df     out[11]:         date  tmax  tmin 0   20110706   317   211 1   20110707   322   211 2   20110708   317   211 3   20110709   322   211 4   20110710   328   222 5   20110711   333   244 6   20110712   356   250 7   20110713   356   222 8   20110801   344   228 9   20110802   356   200 10  20110803   367   200 11  20110804   361   228  in [12]: df["date"] = pd.to_datetime(df["date"], format="%y%m%d")  in [13]: df out[13]:          date  tmax  tmin 0  2011-07-06   317   211 1  2011-07-07   322   211 2  2011-07-08   317   211 3  2011-07-09   322   211 4  2011-07-10   328   222 5  2011-07-11   333   244 6  2011-07-12   356   250 7  2011-07-13   356   222 8  2011-08-01   344   228 9  2011-08-02   356   200 10 2011-08-03   367   200 11 2011-08-04   361   228 

similarly (with different format):

in [14]: pd.to_datetime(df2["start date"], format="%m/%d/%y") out[14]: 0   2011-07-06 1   2011-08-03 name: start date, dtype: datetime64[ns] 

now, first observation wouldn't make sense if periods not mutually exclusive. means need consider start date*.

this means can reindex seconds sheet, forward fill, , you're done:

in [21]: df3 = df2.set_index("start date")  in [22]: df3 out[22]:             end date rep month      cost     kwh   kw start date 2011-07-06  8/3/2011      july  5,065.17  76,640  205 2011-08-03  9/7/2011    august  5,572.38  86,640  195 

this allows reindex dates dataframe:

in [23]: df3["kwh"].reindex(df["date"], method="ffill") out[23]: date 2011-07-06    76,640 2011-07-07    76,640 2011-07-08    76,640 2011-07-09    76,640 2011-07-10    76,640 2011-07-11    76,640 2011-07-12    76,640 2011-07-13    76,640 2011-08-01    76,640 2011-08-02    76,640 2011-08-03    86,640 2011-08-04    86,640 name: kwh, dtype: object 

and set column in df.

in [24]: df["kwh"] = df3["kwh"].reindex(df["date"], method="ffill") 

*if there "empty" periods add in nan rows, corresponding "empty" start-date.


Comments

Popular posts from this blog

1111. appearing after print sequence - php -

java - WARN : org.springframework.web.servlet.PageNotFound - No mapping found for HTTP request with URI [/board/] in DispatcherServlet with name 'appServlet' -

Ruby on Rails, ActiveRecord, Postgres, UTF-8 and ASCII-8BIT encodings -