Home  >  Article  >  Backend Development  >  Detailed explanation of JSON and pickle for python serialization

Detailed explanation of JSON and pickle for python serialization

高洛峰
高洛峰Original
2016-10-29 10:17:041409browse

JSON module

JSON (JavaScript Object Notation) is a lightweight data exchange format. It is based on a subset of ECMAScript. JSON uses a completely language-independent text format, but also uses conventions similar to the C language family (including C, C++, Java, JavaScript, Perl, Python, etc.). These properties make JSON an ideal data exchange language. It is easy for humans to read and write, and it is also easy for machines to parse and generate (generally used to increase network transmission rates).
JSON consists of list and dict respectively in python.

1. Convert python type data and JSON data format to each other

Detailed explanation of JSON and pickle for python serialization

pthon The str type is converted to JSON to unicode type, None is converted to null, dict corresponds to object

2. Data encoding and decoding

1. Simple Type data encoding and decoding

The so-called simple types refer to the python types that appear in the above table.

dumps: Serialize the object

#coding:utf-8
import json

# 简单编码===========================================
print json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
# ["foo", {"bar": ["baz", null, 1.0, 2]}]

#字典排序
print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)
# {"a": 0, "b": 0, "c": 0}

#自定义分隔符
print json.dumps([1,2,3,{'4': 5, '6': 7}], sort_keys=True, separators=(',',':'))
# [1,2,3,{"4":5,"6":7}]
print json.dumps([1,2,3,{'4': 5, '6': 7}], sort_keys=True, separators=('/','-'))
# [1/2/3/{"4"-5/"6"-7}]

#增加缩进,增强可读性,但缩进空格会使数据变大
print json.dumps({'4': 5, '6': 7}, sort_keys=True,indent=2, separators=(',', ': '))
# {
#   "4": 5,
#   "6": 7
# }


# 另一个比较有用的dumps参数是skipkeys,默认为False。
# dumps方法存储dict对象时,key必须是str类型,如果出现了其他类型的话,那么会产生TypeError异常,如果开启该参数,设为True的话,会忽略这个key。
data = {'a':1,(1,2):123}
print json.dumps(data,skipkeys=True)
#{"a": 1}

dump: Serialize the object and save it to the file

#Serialize the object and save it to the file obj = ['foo', {'bar': ('baz', None , 1.0, 2)}]
with open(r"c:json.txt","w+") as f:
json.dump(obj,f)

loads: Deserialize the serialized string

import json

obj = ['foo', {'bar': ('baz', None, 1.0, 2)}]
a= json.dumps(obj)
print json.loads(a)
# [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]

load: Read and deserialize the serialized string from the file

with open(r"c:json.txt","r") as f: print json.load(f)

3. Customize complex data type encoding and decoding

For example, when we encounter data types such as datetime objects or custom class objects that are not supported by json by default, we need to customize encoding and decoding functions. There are two ways to implement custom codecs.

1. Method 1: Customize the encoding and decoding function

#! /usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = "TKQ"
import datetime,json

dt = datetime.datetime.now()



def time2str(obj):
    #python to json
    if isinstance(obj, datetime.datetime):
        json_str = {"datetime":obj.strftime("%Y-%m-%d %X")}
        return json_str
    return obj

def str2time(json_obj):
    #json to python
    if "datetime" in json_obj:
        date_str,time_str = json_obj["datetime"].split(' ')
        date = [int(x) for x in date_str.split('-')]
        time = [int(x) for x in time_str.split(':')]
        dt = datetime.datetime(date[0],date[1], date[2], time[0],time[1], time[2])
        return dt
    return json_obj


a = json.dumps(dt,default=time2str)
print a
# {"datetime": "2016-10-27 17:38:31"}
print json.loads(a,object_hook=str2time)
# 2016-10-27 17:38:31

2. Method 2: Inherit the JSONEncoder and JSONDecoder classes and rewrite related methods

#! /usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = "TKQ"
import datetime,json

dt = datetime.datetime.now()
dd = [dt,[1,2,3]]

class MyEncoder(json.JSONEncoder):
    def default(self,obj):
        #python to json
        if isinstance(obj, datetime.datetime):
            json_str = {"datetime":obj.strftime("%Y-%m-%d %X")}
            return json_str
        return obj

class MyDecoder(json.JSONDecoder):
    def __init__(self):
        json.JSONDecoder.__init__(self, object_hook=self.str2time)

    def str2time(self,json_obj):
        #json to python
        if "datetime" in json_obj:
            date_str,time_str = json_obj["datetime"].split(' ')
            date = [int(x) for x in date_str.split('-')]
            time = [int(x) for x in time_str.split(':')]
            dt = datetime.datetime(date[0],date[1], date[2], time[0],time[1], time[2])
            return dt
        return json_obj


# a = json.dumps(dt,default=time2str)
a =MyEncoder().encode(dd)
print a
# [{"datetime": "2016-10-27 18:14:54"}, [1, 2, 3]]
print MyDecoder().decode(a)
# [datetime.datetime(2016, 10, 27, 18, 14, 54), [1, 2, 3]]

pickle module

Python’s pickle module implements all data sequences and decoding of python Serialization. Basically, the function usage is not much different from the JSON module, and the methods are also dumps/dump and loads/load. cPickle is a relatively faster C-language compiled version of the pickle module.

Different from JSON, pickle is not used for data transmission between multiple languages. It is only used as a persistence method for python objects or a method for transferring objects between python programs. Therefore, it supports all python data types.

The object deserialized by pickle is an equivalent copy object to the original object, similar to deepcopy.

dumps/dump serialization

from datetime import date

try:
    import cPickle as pickle    #python 2
except ImportError as e:
    import pickle   #python 3


src_dic = {"date":date.today(),"oth":([1,"a"],None,True,False),}
det_str = pickle.dumps(src_dic)
print det_str
# (dp1
# S'date'
# p2
# cdatetime
# date
# p3
# (S'\x07\xe0\n\x1b'
# tRp4
# sS'oth'
# p5
# ((lp6
# I1
# aS'a'
# aNI01
# I00
# tp7
# s.
with open(r"c:\pickle.txt","w") as f:
    pickle.dump(src_dic,f)

loads/load deserialization

from datetime import date

try:
    import cPickle as pickle    #python 2
except ImportError as e:
    import pickle   #python 3


src_dic = {"date":date.today(),"oth":([1,"a"],None,True,False),}
det_str = pickle.dumps(src_dic)
with open(r"c:\pickle.txt","r") as f:
    print pickle.load(f)
# {'date': datetime.date(2016, 10, 27), 'oth': ([1, 'a'], None, True, False)}

The difference between JSON and pickle modules

1. JSON can only handle basic data types. pickle can handle all Python data types.

2. JSON is used for character conversion between various languages. Pickle is used for persistence of Python program objects or network transmission of objects between Python programs, but there may be differences in serialization of different versions of Python.


Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn