Let’s say you’re having problems parsing a CSV file, represented as an InMemoryUploadedFile, that you’ve just uploaded through a Django form. There are a bunch of answers on Stack Overflow! They all totally work with Python 2! …and lead to hours of frustration if, say, hypothetically, like me, you’re using Python 3.
If you are getting errors like _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?), and then getting different errors about DictReader not getting an expected iterator after you use .decode('utf-8') to coerce your file to str, this is the post for you.
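If you want to see the first error without spinning up Django, here is a small sketch that reproduces it. It assumes io.BytesIO is a fair stand-in for the uploaded file (both hand you bytes), and try_to_parse is just a name made up for this illustration. The exact wording of the message varies a little between Python versions.

```python
import csv
import io

# Stand-in for the uploaded file: a binary stream that yields bytes.
uploaded = io.BytesIO(b"name,age\nAda,36\n")


def try_to_parse(binary_file):
    """Hypothetical helper: return the csv error message, or None if parsing worked."""
    try:
        for _row in csv.DictReader(binary_file):
            pass
    except csv.Error as exc:
        return str(exc)
    return None


message = try_to_parse(uploaded)
print(message)  # complains that the iterator should return strings, not bytes
```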
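It turns out all you need to do (e.g. in whatever code handles the upload) is rewind the file with seek(0), read() it, decode('utf-8') the bytes, and wrap the result in io.StringIO() before handing it to csv.DictReader().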
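A minimal sketch of that fix, pieced together from the explanation in this post rather than copied from any official recipe. The helper name rows_from_upload and the sample data are made up for illustration, and io.BytesIO stands in for the InMemoryUploadedFile, on the assumption that the real object supports seek() and read() the same way.

```python
import csv
import io


def rows_from_upload(uploaded_file, encoding="utf-8"):
    """Hypothetical helper: parse an uploaded CSV file into a list of dicts."""
    uploaded_file.seek(0)  # rewind, in case something already read the file
    text = uploaded_file.read().decode(encoding)  # bytes -> str
    return list(csv.DictReader(io.StringIO(text)))  # str -> iterable of lines


# io.BytesIO plays the part of the uploaded file here.
fake_upload = io.BytesIO(b"name,age\nAda,36\nGrace,45\n")
fake_upload.read()  # simulate earlier validation leaving the pointer at the end

rows = rows_from_upload(fake_upload)
for row in rows:
    print(row["name"], row["age"])
```

In a real view you would pass in the file from request.FILES instead of the fake one.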
What’s going on here?
The seek statement ensures the pointer is at the beginning of the file. This may or may not be required in your case. In my case, I’d already read the file in my forms.py in order to validate it, so my file pointer was at the end. You’ll be able to tell that you need to seek() if your csv.DictReader() doesn’t throw any errors, but when you try to loop over the lines of the file you don’t even enter the for loop (e.g. print() statements you put in it never print): there’s nothing left to loop over if you’re at the end of the file.
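Here is the file-pointer behaviour in isolation, again using io.BytesIO as an assumed stand-in for the uploaded file. The second read comes back empty because the first one left the pointer at the end.

```python
import io

stream = io.BytesIO(b"name,age\nAda,36\n")

first_read = stream.read()   # consumes everything, pointer is now at the end
second_read = stream.read()  # nothing left: b""

stream.seek(0)               # back to the beginning
third_read = stream.read()   # the full contents again

print(len(first_read), len(second_read), len(third_read))
```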
read() gives you the file contents as a bytes object, on which you can call decode().
decode('utf-8') turns your bytes into a string, with known encoding. (Make sure that you know how your CSV is encoded to start with, though! That’s why I was doing validation on it myself. Unicode, Dammit is going to be my friend here. Even if I didn’t want an excuse to use it because of its title alone. Which I do.)
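One crude way to check the encoding with nothing but the standard library is to attempt the decode and catch the failure. This is only a sketch, and not what Unicode, Dammit does (it tries to guess the encoding for you); decode_or_none is a hypothetical name.

```python
def decode_or_none(raw_bytes, encoding="utf-8"):
    """Hypothetical helper: return the decoded text, or None if the bytes don't fit."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None


good = decode_or_none("café,1\n".encode("utf-8"))    # decodes cleanly
bad = decode_or_none("café,1\n".encode("latin-1"))   # not valid UTF-8

print(repr(good), repr(bad))
```

A None here is your cue to reject the upload in validation, or to try another encoding, before DictReader ever sees the data.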
io.StringIO() gives you the iterator that DictReader needs, while ensuring that your content remains stringy.
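To see why the wrapper matters, compare handing DictReader the StringIO against handing it the bare string. A plain str is iterable too, but it iterates one character at a time, so the reader never sees whole lines. The sample data is made up.

```python
import csv
import io

text = "name,age\nAda,36\n"

wrapped = csv.DictReader(io.StringIO(text))  # iterates line by line
bare = csv.DictReader(text)                  # iterates character by character

wrapped_fields = wrapped.fieldnames  # ['name', 'age']
bare_fields = bare.fieldnames        # not the header row you wanted

print(wrapped_fields, bare_fields)
```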
tl;dr I wrote two lines of code (but eight lines of comments) for a problem that took me hours to solve. Hopefully now you can copy these lines, and spend only a few minutes solving this problem!