python 3.x - Remove duplicates in a CSV file based on two columns?

September 15, 2015

I have a CSV file that I must read, and duplicate values have to be removed before it gets written out again.

A duplicate is determined by two columns (date, price) and a conditional statement. In the example below, row 1, row 2, and row 4 would therefore be written to the CSV. Row 3 qualifies as a duplicate (its date and price match row 1) and would be excluded (not written to the CSV).

address        floor    date         price
40 b street    18       3/29/2015    2200000
40 b street    23       1/7/2015     999000
40 b street    18       3/29/2015    2200000
40 b street    18       4/29/2015    2200000

You can use csv.DictReader and csv.DictWriter to accomplish this task.

    import csv

    def main():
        """Read the CSV file, delete duplicates, and write the result."""
        with open('test.csv', 'r', newline='') as inputfile:
            with open('testout.csv', 'w', newline='') as outputfile:
                duplicatereader = csv.DictReader(inputfile, delimiter=',')
                uniquewrite = csv.DictWriter(
                    outputfile,
                    fieldnames=['address', 'floor', 'date', 'price'],
                    delimiter=',')
                uniquewrite.writeheader()
                # (date, price) pairs that have already been written
                keysread = []
                for row in duplicatereader:
                    key = (row['date'], row['price'])
                    if key not in keysread:
                        print(row)
                        keysread.append(key)
                        uniquewrite.writerow(row)

    if __name__ == '__main__':
        main()
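As a variation on the same idea, here is a minimal sketch that tracks the (date, price) keys in a set rather than a list, so each membership check does not scan every key seen so far. It runs against the sample rows from the question held in memory, assuming the real file is comma-separated with the header names shown in the table. The names dedupe_rows, dedupe_csv_text, and SAMPLE are hypothetical and used only for illustration.

```python
import csv
import io


def dedupe_rows(rows, key_fields=("date", "price")):
    """Yield only the first row seen for each combination of key_fields."""
    seen = set()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            yield row


# Sample rows from the question, written as comma-separated text.
# The comma-separated layout is an assumption about the real file.
SAMPLE = """address,floor,date,price
40 b street,18,3/29/2015,2200000
40 b street,23,1/7/2015,999000
40 b street,18,3/29/2015,2200000
40 b street,18,4/29/2015,2200000
"""


def dedupe_csv_text(text):
    """Return CSV text with rows that repeat a (date, price) pair removed."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in dedupe_rows(reader):
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    print(dedupe_csv_text(SAMPLE))
```

With the sample data this should keep rows 1, 2, and 4 and drop row 3, matching the behaviour described in the question. For a real file, the same dedupe_rows generator could be fed from a csv.DictReader opened on the input path.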
