
pickle, dump, load

Naranjito 2023. 10. 24. 17:10
  • pickle

pickle is the Python standard library module for object serialization and deserialization. Serialization (pickling) converts an object into a byte stream that can be saved to a file, and deserialization (unpickling) is the reverse. When pickling to a file, the file must be opened in binary mode ('wb' for writing, 'rb' for reading); otherwise, you will run into an error. Many kinds of objects can be pickled, including lists, tuples, dictionaries, and machine learning models and transformers, among many others.
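
As a minimal sketch of the binary-mode requirement, the snippet below pickles a dictionary to a temporary file, reads it back, and then shows what happens in text mode. The variable names and sample values are made up for illustration.

```python
import os
import pickle
import tempfile

# Sample built-in objects; the keys and values are made up for illustration.
record = {"ids": [1, 2, 3], "point": (0.5, 1.5), "label": "demo"}

path = os.path.join(tempfile.mkdtemp(), "record.pkl")

# Binary write mode works because pickle produces bytes.
with open(path, "wb") as f:
    pickle.dump(record, f)

# Binary read mode gives the same object back.
with open(path, "rb") as f:
    restored = pickle.load(f)

# Text mode fails: a text file only accepts str, not bytes.
try:
    with open(path, "w") as f:
        pickle.dump(record, f)
except TypeError as exc:
    text_mode_error = exc
else:
    text_mode_error = None

print(restored == record)
print(type(text_mode_error).__name__)
```

The text-mode attempt raises a TypeError, which is the error the paragraph above warns about.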

 

What happens if we want to store structured data in a file? For example, suppose we want to store employee details such as an identification number (int), a name (str) and a salary (float). This data is well structured and mixes several types. To store it, we create a class Emp with the instance variables id, name and sal, as shown below.

class Emp:
    def __init__(self, id, name, sal):
        self.id = id
        self.name = name
        self.sal = sal

    def display(self):
        print("{:5d}{:20s}{:10.2f}".format(self.id, self.name, self.sal))
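
To get a feel for what such an object carries, here is a small sketch that builds one Emp instance with sample values and inspects it. By default, pickle stores an instance's attribute dictionary together with a reference to its class, not the class's source code, so `vars(e)` is roughly the state that would be saved. The names `line` and `state` are only for illustration.

```python
class Emp:
    def __init__(self, id, name, sal):
        self.id = id
        self.name = name
        self.sal = sal

    def display(self):
        print("{:5d}{:20s}{:10.2f}".format(self.id, self.name, self.sal))


# Sample values, chosen only for illustration.
e = Emp(123, "orange", 1000.0)

# The same format string display() uses: 5 + 20 + 10 = 35 characters wide.
line = "{:5d}{:20s}{:10.2f}".format(e.id, e.name, e.sal)

# The instance attributes as a plain dictionary.
state = vars(e)

e.display()
print(state)
```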

 

 

 

  • dump
pickle.dump(object, file)


Next, we create an object of this class and store the actual data in it. The object is then written to a binary file as bytes. This is called pickling, or object serialization: the process of converting a class object into a byte stream so that it can be stored in a file.

import pickle

class Emp:
    def __init__(self, id, name, sal):
        self.id = id
        self.name = name
        self.sal = sal

    def display(self):
        print("{:5d}{:20s}{:10.2f}".format(self.id, self.name, self.sal))

# Open the file in binary write mode; the with block closes it automatically
with open('emp.dat', 'wb') as f:
    n = int(input('How many employees? '))
    for i in range(n):
        id = int(input('Enter id: '))
        name = input('Enter name: ')
        sal = float(input('Enter salary: '))
        # Create an Emp object and append it to the file
        e = Emp(id, name, sal)
        pickle.dump(e, f)
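
If you want to try this without typing at the prompts, the following is a non-interactive sketch of the same idea. The rows reuse the sample values from the output shown later in this post, and the temporary path is an assumption made so the sketch does not overwrite a real emp.dat.

```python
import os
import pickle
import tempfile


class Emp:
    def __init__(self, id, name, sal):
        self.id = id
        self.name = name
        self.sal = sal

    def display(self):
        print("{:5d}{:20s}{:10.2f}".format(self.id, self.name, self.sal))


# Sample rows standing in for the values typed at the prompts.
rows = [(123, "orange", 1000.0), (456, "apple", 2000.0), (789, "banana", 3000.0)]

# Temporary location (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "emp.dat")

# Each dump() call appends one pickled object to the same binary file.
with open(path, "wb") as f:
    for emp_id, name, sal in rows:
        pickle.dump(Emp(emp_id, name, sal), f)

print(os.path.getsize(path), "bytes written")
```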

 

  • load
object = pickle.load(file)

Unpickling is the process of converting a byte stream back into a class object. In other words, unpickling means reading class objects from the file. It is also called deserialization.

 

Here, pickle.load() reads one object from the binary 'file' and returns it, and the result is assigned to 'object'. Each call reads the next object in the file; when no objects remain, load() raises EOFError. Remember that pickling and unpickling must use files opened in binary mode, since pickle works with byte streams. The word stream refers to a flow of data, so a byte stream is a flow of bytes.
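
The byte stream idea can be seen without touching the disk. In this sketch, `pickle.dumps` returns the pickled bytes directly, and `io.BytesIO` plays the role of a file opened in binary mode. The sample values are arbitrary.

```python
import io
import pickle

# dumps() serializes to an in-memory bytes object instead of a file.
payload = pickle.dumps([1, "two", 3.0])

# BytesIO behaves like a file opened in binary mode.
stream = io.BytesIO()
pickle.dump("first", stream)
pickle.dump("second", stream)

# Rewind to the start of the stream before reading.
stream.seek(0)

# Each load() call consumes exactly one pickled object from the stream.
items = [pickle.load(stream), pickle.load(stream)]

print(type(payload).__name__)
print(items)
```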

 

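import pickle

# The Emp class must be defined (or importable) before Emp objects are loaded
with open('emp.dat', 'rb') as dbfile:
    print('Employees details:')
    while True:
        try:
            db = pickle.load(dbfile)
            db.display()
        except EOFError:
            # No more objects in the file
            break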
        
>>>
Employees details:
  123orange                 1000.00
  456apple                  2000.00
  789banana                 3000.00
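
The read-until-EOFError loop above can be wrapped in a small reusable helper. This is only a sketch: `load_all` is a hypothetical name, not part of the pickle module, and plain dictionaries are used as records to keep it short.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Yield every pickled object stored in the file at path, in order."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # End of file reached: no more objects to read.
                return


# Write two sample records to a temporary file (values are illustrative).
path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "wb") as f:
    for rec in [{"id": 123, "name": "orange"}, {"id": 456, "name": "apple"}]:
        pickle.dump(rec, f)

records = list(load_all(path))
print(records)
```

Using a generator means the caller does not need to know in advance how many objects the file holds.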

  • allow_pickle

allow_pickle is a parameter of numpy.load. It allows loading pickled object arrays stored in .npy files.

Reasons for disallowing pickles include security, as loading pickled data can execute arbitrary code. 

If pickles are disallowed, loading object arrays will fail. 

Default: False
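
A short sketch of this behavior, assuming NumPy is installed: an object array is saved to a temporary .npy file, loading it with the default setting is refused, and loading succeeds once allow_pickle=True is passed. The array contents and file name are made up for illustration.

```python
import os
import tempfile

import numpy as np

# An object array (dtype=object) holding arbitrary Python objects.
arr = np.array([{"id": 123}, "orange", 1000.0], dtype=object)

path = os.path.join(tempfile.mkdtemp(), "objects.npy")

# Object arrays are stored inside the .npy file using pickle.
np.save(path, arr)

# With the default allow_pickle=False, loading an object array is refused.
try:
    np.load(path)
except ValueError:
    refused = True
else:
    refused = False

# Opting in explicitly; only do this for files from a trusted source.
loaded = np.load(path, allow_pickle=True)

print(refused)
print(loaded)
```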

 

https://www.gkindex.com/python-advanced/python-pickle.jsp

https://medium.com/mlearning-ai/saving-your-machine-learning-model-in-python-pickle-dump-b01ae60a791c

https://korbillgates.tistory.com/173
