python - Operating on tuples held within a Pandas DataFrame column


I have the following DataFrame:

         start         end                          days
0   2015-07-01  2015-07-07         (1, 2, 3, 4, 5, 6, 7)
1   2015-07-08  2015-07-14    (8, 9, 10, 11, 12, 13, 14)
2   2015-07-15  2015-07-21  (15, 16, 17, 18, 19, 20, 21)
3   2015-07-22  2015-07-28  (22, 23, 24, 25, 26, 27, 28)
4   2015-07-29  2015-08-04      (29, 30, 31, 1, 2, 3, 4)
5   2015-08-05  2015-08-11       (5, 6, 7, 8, 9, 10, 11)
6   2015-08-12  2015-08-18  (12, 13, 14, 15, 16, 17, 18)
7   2015-08-19  2015-08-25  (19, 20, 21, 22, 23, 24, 25)
8   2015-08-26  2015-09-01   (26, 27, 28, 29, 30, 31, 1)
9   2015-09-02  2015-09-08         (2, 3, 4, 5, 6, 7, 8)
10  2015-09-09  2015-09-15   (9, 10, 11, 12, 13, 14, 15)
11  2015-09-16  2015-09-22  (16, 17, 18, 19, 20, 21, 22)
12  2015-09-23  2015-09-29  (23, 24, 25, 26, 27, 28, 29)

I am interested in working with the days column, which contains tuples. Using the pandas syntax for basic filtering does not appear to work:

df[4 in df['days'] == True]

I was hoping the above would filter the DataFrame and return the following rows, i.e. those whose tuples contain 4:

        start         end                      days
0  2015-07-01  2015-07-07     (1, 2, 3, 4, 5, 6, 7)
4  2015-07-29  2015-08-04  (29, 30, 31, 1, 2, 3, 4)
9  2015-09-02  2015-09-08     (2, 3, 4, 5, 6, 7, 8)

Instead, an empty DataFrame is returned.

I have also tried creating a new column to hold True/False values based on checking against an expression, like so:

df['daystf'] = 4 in df['days'] 

This returns the DataFrame with the 'daystf' column set to True for all rows, instead of True only where 4 is contained within the tuple.
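A likely explanation for both results, sketched on a small stand-in frame (the five-row `df` below is an assumption made for illustration, not the full data above): `in` applied to a Series tests the index labels rather than the stored values, and it yields a single bool rather than one value per row. In addition, `4 in s == True` is a chained comparison, which Python reads as `(4 in s) and (s == True)`.

```python
import pandas as pd

# Small stand-in for the question's frame; only the 'days' column matters here.
df = pd.DataFrame({
    "days": [
        (1, 2, 3, 4, 5, 6, 7),
        (8, 9, 10, 11, 12, 13, 14),
        (15, 16, 17, 18, 19, 20, 21),
        (22, 23, 24, 25, 26, 27, 28),
        (29, 30, 31, 1, 2, 3, 4),
    ]
})

# `in` on a Series looks at the index labels, not at the tuples.
label_hit = 4 in df["days"]     # 4 is a row label here
label_miss = 31 in df["days"]   # 31 appears inside a tuple, but is not a row label

# Assigning that single bool broadcasts it to every row.
df["daystf"] = 4 in df["days"]

# The chained comparison from the first attempt, written out explicitly.
# No tuple compares equal to True, so the mask is False everywhere.
chained = (4 in df["days"]) and (df["days"] == True)
filtered = df[chained]
```

Here `label_hit` is truthy only because the frame happens to have a row labelled 4, which is why the new column comes out the same for every row, and `filtered` has no rows because the mask never matches.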

One way is to use the Series.apply method, though it may not be fast. Example -

df[df['days'].apply(lambda x: 4 in x)] 

Demo -

In [139]: df
Out[139]:
         start         end                          days
0   2015-07-01  2015-07-07         (1, 2, 3, 4, 5, 6, 7)
1   2015-07-08  2015-07-14    (8, 9, 10, 11, 12, 13, 14)
2   2015-07-15  2015-07-21  (15, 16, 17, 18, 19, 20, 21)
3   2015-07-22  2015-07-28  (22, 23, 24, 25, 26, 27, 28)
4   2015-07-29  2015-08-04      (29, 30, 31, 1, 2, 3, 4)
5   2015-08-05  2015-08-11       (5, 6, 7, 8, 9, 10, 11)
6   2015-08-12  2015-08-18  (12, 13, 14, 15, 16, 17, 18)
7   2015-08-19  2015-08-25  (19, 20, 21, 22, 23, 24, 25)
8   2015-08-26  2015-09-01   (26, 27, 28, 29, 30, 31, 1)
9   2015-09-02  2015-09-08         (2, 3, 4, 5, 6, 7, 8)
10  2015-09-09  2015-09-15   (9, 10, 11, 12, 13, 14, 15)
11  2015-09-16  2015-09-22  (16, 17, 18, 19, 20, 21, 22)
12  2015-09-23  2015-09-29  (23, 24, 25, 26, 27, 28, 29)

In [141]: df['days'][0]
Out[141]: (1, 2, 3, 4, 5, 6, 7)

In [142]: type(df['days'][0])
Out[142]: tuple

In [143]: df[df['days'].apply(lambda x: 4 in x)]
Out[143]:
        start         end                      days
0  2015-07-01  2015-07-07     (1, 2, 3, 4, 5, 6, 7)
4  2015-07-29  2015-08-04  (29, 30, 31, 1, 2, 3, 4)
9  2015-09-02  2015-09-08     (2, 3, 4, 5, 6, 7, 8)
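For anyone who wants to reproduce this, here is a self-contained sketch. How the frame is built is an assumption (the question does not show it); it just generates thirteen weekly windows starting 2015-07-01 so that the values match the output above. It also shows two alternatives to `apply` that build the same mask: a plain list comprehension, and `Series.explode`, which assumes a pandas version that has that method (0.25 or later). Neither is claimed to be faster without measuring on your own data.

```python
import pandas as pd

# Rebuild a frame shaped like the one in the question (construction is assumed).
starts = pd.date_range("2015-07-01", periods=13, freq="7D")
df = pd.DataFrame({
    "start": starts,
    "end": starts + pd.Timedelta(days=6),
})
# Day-of-month of each date in the window, stored as one tuple per row.
df["days"] = [
    tuple(d.day for d in pd.date_range(s, e))
    for s, e in zip(df["start"], df["end"])
]

# 1. The approach from the answer: test membership inside each tuple.
mask_apply = df["days"].apply(lambda x: 4 in x)

# 2. Same test as a list comprehension, wrapped back into an aligned Series.
mask_comp = pd.Series([4 in t for t in df["days"]], index=df.index)

# 3. One row per tuple element, compare to 4, then collapse back per original row.
mask_explode = df["days"].explode().eq(4).groupby(level=0).any()

result = df[mask_apply]
print(result)
```

All three masks select rows 0, 4 and 9 here, matching `Out[143]` above.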
