python 3.x - Remove duplicates in a CSV file based on two columns?


I have a CSV file that must be read and have its duplicate rows removed before it gets written back out.

A row counts as a duplicate based on two columns (date, price), checked with a conditional statement. In the example below, rows 1, 2 and 4 are therefore written to the CSV. Row 3 qualifies as a duplicate (its date and price match row 1) and is excluded (not written to the CSV).

address        floor    date         price
40 b street    18       3/29/2015    2200000
40 b street    23       1/7/2015     999000
40 b street    18       3/29/2015    2200000
40 b street    18       4/29/2015    2200000

You can use csv.DictReader and csv.DictWriter to accomplish this task.

    import csv

    def main():
        """Read the CSV file, delete duplicates and write the result."""
        with open('test.csv', 'r', newline='') as inputfile:
            with open('testout.csv', 'w', newline='') as outputfile:
                duplicatereader = csv.DictReader(inputfile, delimiter=',')
                uniquewrite = csv.DictWriter(outputfile, fieldnames=['address', 'floor', 'date', 'price'], delimiter=',')
                uniquewrite.writeheader()
                # (date, price) pairs that have already been written
                keysread = []
                for row in duplicatereader:
                    key = (row['date'], row['price'])
                    if key not in keysread:
                        print(row)
                        keysread.append(key)
                        uniquewrite.writerow(row)

    if __name__ == '__main__':
        main()
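For larger files, the same idea could be sketched with a set instead of a list, since membership checks on a set do not slow down as more keys are collected. This is only a variant under stated assumptions, not part of the original answer: the function name `dedupe_rows`, its `key_fields` parameter and the `SAMPLE` string are hypothetical, and the sample rows from the question are assumed to be comma-separated.

```python
import csv
import io


def dedupe_rows(infile, outfile, key_fields=("date", "price")):
    """Copy CSV rows from infile to outfile, keeping the first row per key.

    Hypothetical helper: infile and outfile are open text file objects.
    Returns the number of data rows written.
    """
    reader = csv.DictReader(infile)
    # Reuse the header found in the input rather than hard-coding it.
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    seen = set()  # keys already written; set lookups are O(1) on average
    kept = 0
    for row in reader:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            writer.writerow(row)
            kept += 1
    return kept


# Sample data from the question, assumed here to be comma-separated.
SAMPLE = (
    "address,floor,date,price\n"
    "40 b street,18,3/29/2015,2200000\n"
    "40 b street,23,1/7/2015,999000\n"
    "40 b street,18,3/29/2015,2200000\n"
    "40 b street,18,4/29/2015,2200000\n"
)

if __name__ == "__main__":
    result = io.StringIO()
    dedupe_rows(io.StringIO(SAMPLE), result)
    print(result.getvalue())
```

With real files, the two arguments would be the objects returned by open('test.csv', 'r', newline='') and open('testout.csv', 'w', newline=''). On the sample data this keeps rows 1, 2 and 4, as the question asks.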
