Tutorial :How to replace a column using Python's built-in .csv writer module?



Question:

I need to do a find and replace (specific to one column of URLs) in a huge Excel .csv file. Since I'm in the beginning stages of trying to teach myself a scripting language, I figured I'd try to implement the solution in python.

I'm having trouble with the "replace" part of the solution. I've read the official csv module documentation about how to use the writer, but there isn't really a clear enough example for me (yes, I'm slow). So, now for the question: how does one iterate through the rows of a csv file with a writer object?

p.s. apologies in advance for the clumsy code, I'm still learning :)

import csv    csvfile = open("PALTemplateData.csv")  csvout = open("PALTemplateDataOUT.csv")  dialect = csv.Sniffer().sniff(csvfile.read(1024))  csvfile.seek(0)  reader = csv.reader(csvfile, dialect)  writer = csv.writer(csvout, dialect)    total=0;  needchange=0;  changed = 0;  temp = ''  changeList = []    for row in reader:      total=total+1      temp = row[len(row)-1]      if '/?' in temp:          needchange=needchange+1;          changeList.append(row.index)    for row in writer:           #this doesn't compile, hence the question      if row.index in changeList:          changed=changed+1          temp = row[len(row)-1]          temp.replace('/?', '?')          row[len(row)-1] = temp          writer.writerow(row)    print('Total URLs:', total)  print('Total URLs to change:', needchange)  print('Total URLs changed:', changed)  


Solution:1

The reason you're getting an error is that the writer doesn't have data to iterate over. You're supposed to give it the data - presumably, you'd have some sort of list or generator that produces the rows to write out.

I'd suggest just combining the two loops, like so:

for row in reader:      row[-1] = row[-1].replace('/?', '?')      writer.writerow(row)  

And with that, you don't even need total, needchange, and changeList. (There are a bunch of optimizations in there that I unfortunately don't have time to explain, but I'll see if I can edit that info in later)


Solution:2

You should only have one loop and read and write at the same time - if your replacements only affect one line at a time, you don't need to loop over the data twice.

for row in reader:    total=total+1    temp = row[len(row)-1]    if '/?' in temp:      temp = row[len(row)-1]      temp.replace('/?', '?')      row[len(row)-1] = temp    writer.writerow(row)  

This is just to illustrate the loop, not sure if the replacement code will work like this.


Solution:3

Once you have your csv in a big list, one easy way to replace a column in a list would be to transpose your matrix, replace the row, and then transpose it back:

mydata = [[1, 'a', 10], [2, 'b', 20], [3, 'c', 30]]    def transpose(matrix):      return [[matrix[x][y] for x in range(len(matrix))] for y in range(len(matrix[0]))]    transposedData = transpose(mydata)  print transposedData  >>> [[1, 2, 3], ['a', 'b', 'c'], [10, 20, 30]]    editedData = transposedData[:2] + [50,70,90]  print editedData  >>> [[1, 2, 3], ['a', 'b', 'c'], [50, 70, 90]]    mydata = transpose(editedData)  print mydata  >>> [[1, 'a', 50], [2, 'b', 70], [3, 'c', 90]]  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »