Tutorial: Why doesn't appending binary pickles work?



Question:

I know this isn't exactly how the pickle module was intended to be used, but I would have thought this would work. I'm using Python 3.1.2.

Here's the background code:

    import pickle

    FILEPATH = '/tmp/tempfile'

    class HistoryFile():
        """
        Persistent store of a history file

        Each line should be a separate Python object
        Usually, pickle is used to make a file for each object,
            but here, I'm trying to use the append mode of writing a file to store a sequence
        """

        def validate(self, obj):
            """
            Returns whether or not obj is the right Pythonic object
            """
            return True

        def add(self, obj):
            if self.validate(obj):
                with open(FILEPATH, mode='ba') as f:    # appending, not writing
                    f.write(pickle.dumps(obj))
            else:
                raise ValueError("Did not validate")

        def unpack(self):
            """
            Go through each line in the file and put each python object
            into a list, which is returned
            """
            lst = []
            with open(FILEPATH, mode='br') as f:
                # problem must be here, does it not step through the file?
                for l in f:
                    lst.append(pickle.loads(l))
            return lst

Now, when I run it, it only prints out the first object that is passed to the class.

    if __name__ == '__main__':

        L = HistoryFile()
        L.add('a')
        L.add('dfsdfs')
        L.add(['dfdkfjdf', 'errree', 'cvcvcxvx'])

        print(L.unpack())       # only prints the first item, 'a'!

Is this because it's seeing an early EOF? Maybe appending is intended only for ASCII? (In which case, why is it letting me do mode='ba'?) Is there a much simpler duh way to do this?


Solution:1

Why would you think appending binary pickles would produce a single pickle?! Pickling lets you put (and get back) several items one after the other, so obviously it must be a "self-terminating" serialization format. Forget lines and just get them back! For example:

    >>> import pickle
    >>> import io
    >>> s = io.BytesIO()    # in-memory binary file (cStringIO.StringIO on Python 2)
    >>> pickle.dump(23, s)
    >>> pickle.dump(45, s)
    >>> s.seek(0)
    0
    >>> pickle.load(s)
    23
    >>> pickle.load(s)
    45
    >>> pickle.load(s)
    Traceback (most recent call last):
       ...
    EOFError
    >>>

Just catch the EOFError to tell you when you're done unpickling.
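As a minimal sketch of that idea (the helper name iter_pickles and the sample values are only illustrative, not part of the pickle module), a generator can keep calling pickle.load until EOFError signals that no pickle is left:

```python
import io
import pickle


def iter_pickles(stream):
    """Yield each object from a binary stream of back-to-back pickles."""
    while True:
        try:
            yield pickle.load(stream)
        except EOFError:
            # pickle.load raises EOFError when no further pickle remains.
            return


# Demo on an in-memory buffer; a file opened with mode='rb' works the same way.
buf = io.BytesIO()
for item in ('a', 'dfsdfs', ['dfdkfjdf', 'errree', 'cvcvcxvx']):
    pickle.dump(item, buf)
buf.seek(0)

loaded = list(iter_pickles(buf))
print(loaded)
```

A generator keeps the EOFError handling in one place, so callers can simply loop over the objects or collect them with list().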


Solution:2

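The answer is that it DOES work. Append mode adds nothing to the data (no newlines, with or without the '+'), so mode='ab' is fine for writing and 'a+b' works too. The real problem was reading the file back line by line: a pickle is binary data, so a newline byte can turn up anywhere inside it, or not at all, and a "line" is not one pickle. While I was at it, I also switched to pickle.dump, which writes straight to the file. Change this line: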

    with open(FILEPATH, mode='ab') as f:    # appending, not writing
        f.write(pickle.dumps(obj))

to

    with open(FILEPATH, mode='a+b') as f:    # appending, not writing
        pickle.dump(obj, f)

Alex also points out that mode='r+b' gives more flexibility, but it requires the appropriate seeking. Since I wanted a history file that behaves like a first-in, last-out sequence of Python objects, appending objects to a file actually made sense for me. I just wasn't doing it correctly :)
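To see why reading appended pickles line by line goes wrong, here is a small check. It is only a sketch: the sample value 'x' * 10 is chosen for illustration because its length happens to be encoded as the byte 0x0A, and the exact bytes depend on the pickle protocol in use.

```python
import io
import pickle

payload = 'x' * 10          # arbitrary sample value, chosen for illustration
data = pickle.dumps(payload)

# The newline byte (0x0A) appears inside the pickle as ordinary data,
# so it cannot serve as a record separator.
print(b'\n' in data)

# Two pickles written back to back: splitting on newlines yields fragments
# whose boundaries do not match the boundaries of the two records.
stream = io.BytesIO(data + pickle.dumps(payload))
fragments = stream.readlines()
print(len(fragments), [len(chunk) for chunk in fragments])

# pickle.load, by contrast, stops exactly at the end of each record.
stream.seek(0)
first = pickle.load(stream)
second = pickle.load(stream)
print(first == payload, second == payload)
```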

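There is no need to step through the file line by line because (duh!) each pickle is serialized with its own end marker, so pickle.load knows where one object stops and the next begins. So replace: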

    for l in f:
        lst.append(pickle.loads(l))

with

    while True:
        try:
            lst.append(pickle.load(f))
        except EOFError:    # raised by pickle.load at the end of the file
            break
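Putting the pieces together, here is a minimal sketch of the history class that appends with pickle.dump and reads back with pickle.load until EOFError. The path constructor argument and the temporary directory are assumptions made so the example runs without touching /tmp/tempfile; the validate step from the question is left out for brevity.

```python
import os
import pickle
import tempfile


class HistoryFile:
    """Append-only store of pickled objects (sketch, not the original class)."""

    def __init__(self, path):
        # Hypothetical parameter; the question used a module-level FILEPATH.
        self.path = path

    def add(self, obj):
        # Append one self-terminating pickle to the end of the file.
        with open(self.path, mode='ab') as f:
            pickle.dump(obj, f)

    def unpack(self):
        # Load pickles one after another until pickle.load reports EOF.
        lst = []
        with open(self.path, mode='rb') as f:
            while True:
                try:
                    lst.append(pickle.load(f))
                except EOFError:
                    break
        return lst


with tempfile.TemporaryDirectory() as tmpdir:
    history = HistoryFile(os.path.join(tmpdir, 'history.pkl'))
    history.add('a')
    history.add('dfsdfs')
    history.add(['dfdkfjdf', 'errree', 'cvcvcxvx'])
    result = history.unpack()

print(result)
```

Reopening the file for every add keeps each write short-lived, which matches the question's code; a long-running program might prefer to keep the file open instead.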
