Home >Backend Development >Python Tutorial >Detailed explanation of JSON and pickle for python serialization
JSON module
JSON (JavaScript Object Notation) is a lightweight data exchange format. It is based on a subset of ECMAScript. JSON uses a completely language-independent text format, but also uses conventions similar to the C language family (including C, C++, Java, JavaScript, Perl, Python, etc.). These properties make JSON an ideal data exchange language. It is easy for humans to read and write, and it is also easy for machines to parse and generate (generally used to increase network transmission rates).
JSON consists of list and dict respectively in python.
1. Convert python type data and JSON data format to each other
pthon The str type is converted to JSON to unicode type, None is converted to null, dict corresponds to object
2. Data encoding and decoding
1. Simple Type data encoding and decoding
The so-called simple types refer to the python types that appear in the above table.
dumps: Serialize the object
#coding:utf-8 import json # 简单编码=========================================== print json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}]) # ["foo", {"bar": ["baz", null, 1.0, 2]}] #字典排序 print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True) # {"a": 0, "b": 0, "c": 0} #自定义分隔符 print json.dumps([1,2,3,{'4': 5, '6': 7}], sort_keys=True, separators=(',',':')) # [1,2,3,{"4":5,"6":7}] print json.dumps([1,2,3,{'4': 5, '6': 7}], sort_keys=True, separators=('/','-')) # [1/2/3/{"4"-5/"6"-7}] #增加缩进,增强可读性,但缩进空格会使数据变大 print json.dumps({'4': 5, '6': 7}, sort_keys=True,indent=2, separators=(',', ': ')) # { # "4": 5, # "6": 7 # } # 另一个比较有用的dumps参数是skipkeys,默认为False。 # dumps方法存储dict对象时,key必须是str类型,如果出现了其他类型的话,那么会产生TypeError异常,如果开启该参数,设为True的话,会忽略这个key。 data = {'a':1,(1,2):123} print json.dumps(data,skipkeys=True) #{"a": 1}
dump: Serialize the object and save it to the file
#Serialize the object and save it to the file obj = ['foo', {'bar': ('baz', None , 1.0, 2)}]
with open(r"c:json.txt","w+") as f:
json.dump(obj,f)
loads: Deserialize the serialized string
import json obj = ['foo', {'bar': ('baz', None, 1.0, 2)}] a= json.dumps(obj) print json.loads(a) # [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
load: Read and deserialize the serialized string from the file
with open(r"c:json.txt","r") as f: print json.load(f)
3. Customize complex data type encoding and decoding
For example, when we encounter data types such as datetime objects or custom class objects that are not supported by json by default, we need to customize encoding and decoding functions. There are two ways to implement custom codecs.
1. Method 1: Customize the encoding and decoding function
#! /usr/bin/env python # -*- coding:utf-8 -*- # __author__ = "TKQ" import datetime,json dt = datetime.datetime.now() def time2str(obj): #python to json if isinstance(obj, datetime.datetime): json_str = {"datetime":obj.strftime("%Y-%m-%d %X")} return json_str return obj def str2time(json_obj): #json to python if "datetime" in json_obj: date_str,time_str = json_obj["datetime"].split(' ') date = [int(x) for x in date_str.split('-')] time = [int(x) for x in time_str.split(':')] dt = datetime.datetime(date[0],date[1], date[2], time[0],time[1], time[2]) return dt return json_obj a = json.dumps(dt,default=time2str) print a # {"datetime": "2016-10-27 17:38:31"} print json.loads(a,object_hook=str2time) # 2016-10-27 17:38:31
2. Method 2: Inherit the JSONEncoder and JSONDecoder classes and rewrite related methods
#! /usr/bin/env python # -*- coding:utf-8 -*- # __author__ = "TKQ" import datetime,json dt = datetime.datetime.now() dd = [dt,[1,2,3]] class MyEncoder(json.JSONEncoder): def default(self,obj): #python to json if isinstance(obj, datetime.datetime): json_str = {"datetime":obj.strftime("%Y-%m-%d %X")} return json_str return obj class MyDecoder(json.JSONDecoder): def __init__(self): json.JSONDecoder.__init__(self, object_hook=self.str2time) def str2time(self,json_obj): #json to python if "datetime" in json_obj: date_str,time_str = json_obj["datetime"].split(' ') date = [int(x) for x in date_str.split('-')] time = [int(x) for x in time_str.split(':')] dt = datetime.datetime(date[0],date[1], date[2], time[0],time[1], time[2]) return dt return json_obj # a = json.dumps(dt,default=time2str) a =MyEncoder().encode(dd) print a # [{"datetime": "2016-10-27 18:14:54"}, [1, 2, 3]] print MyDecoder().decode(a) # [datetime.datetime(2016, 10, 27, 18, 14, 54), [1, 2, 3]]
pickle module
Python’s pickle module implements all data sequences and decoding of python Serialization. Basically, the function usage is not much different from the JSON module, and the methods are also dumps/dump and loads/load. cPickle is a relatively faster C-language compiled version of the pickle module.
Different from JSON, pickle is not used for data transmission between multiple languages. It is only used as a persistence method for python objects or a method for transferring objects between python programs. Therefore, it supports all python data types.
The object deserialized by pickle is an equivalent copy object to the original object, similar to deepcopy.
dumps/dump serialization
from datetime import date try: import cPickle as pickle #python 2 except ImportError as e: import pickle #python 3 src_dic = {"date":date.today(),"oth":([1,"a"],None,True,False),} det_str = pickle.dumps(src_dic) print det_str # (dp1 # S'date' # p2 # cdatetime # date # p3 # (S'\x07\xe0\n\x1b' # tRp4 # sS'oth' # p5 # ((lp6 # I1 # aS'a' # aNI01 # I00 # tp7 # s. with open(r"c:\pickle.txt","w") as f: pickle.dump(src_dic,f)
loads/load deserialization
from datetime import date try: import cPickle as pickle #python 2 except ImportError as e: import pickle #python 3 src_dic = {"date":date.today(),"oth":([1,"a"],None,True,False),} det_str = pickle.dumps(src_dic) with open(r"c:\pickle.txt","r") as f: print pickle.load(f) # {'date': datetime.date(2016, 10, 27), 'oth': ([1, 'a'], None, True, False)}
The difference between JSON and pickle modules
1. JSON can only handle basic data types. pickle can handle all Python data types.
2. JSON is used for character conversion between various languages. Pickle is used for persistence of Python program objects or network transmission of objects between Python programs, but there may be differences in serialization of different versions of Python.