Please explain how bytearray and memoryview are used in Python, and which scenarios they are suited to

x = bytearray(b'abcde')
y = memoryview(x)
y[1:3] = b'yz'
x[1:3] = b'ab'
y[3] = ord(b'e')
x[3] = ord(b'f')

x = bytearray(b'abcde')
while len(x) > 0:
    x = x[1:]

x = bytearray(b'abcde')
y = memoryview(x)
while len(y) > 0:
    y = y[1:]

PHPz · 2898 days ago · 609

  • 高洛峰

    高洛峰 2017-04-18 10:02:13

    I happened to use memoryview recently, so I'll answer this question.

    bytearray is a mutable sequence of bytes. It is the mutable counterpart of str in Python 2, where str is an immutable byte string.
    In Python 3, str is Unicode text by default, so byte-level access goes through bytes (immutable) or bytearray (mutable).
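To illustrate the point above, here is a minimal sketch for Python 3 (the variable names are made up for this example): bytes rejects item assignment, bytearray accepts it, and a str has to be encoded before you can touch its bytes.

```python
text = "h\u00e9llo"             # str holds Unicode code points, not bytes
raw = text.encode("utf-8")      # bytes: an immutable byte sequence
buf = bytearray(raw)            # bytearray: a mutable copy of the same bytes

try:
    raw[0] = ord("H")           # bytes does not support item assignment
    error = None
except TypeError as exc:
    error = type(exc).__name__

buf[0] = ord("H")               # in-place modification works on bytearray
result = buf.decode("utf-8")

print(len(text), len(raw))      # 5 characters, 6 bytes (the accented e takes two bytes in UTF-8)
print(error)                    # TypeError
print(result)
```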

    memoryview provides a byte-level memory access interface to objects that support the buffer protocol [1,2]. Its advantage is that no memory is copied.
    Python 2's str (bytes in Python 3) and bytearray support the buffer protocol out of the box.
    Put simply, slicing a str or bytearray creates a new str or bytearray and copies the data, whereas slicing a memoryview does not.
    Compare the two behaviors:

    1. Without memoryview

      >>> a = b'aaaaaa'
      >>> b = a[:2]     # creates a new bytes object (a new str in Python 2)

      >>> a = bytearray(b'aaaaaa')
      >>> b = a[:2]     # creates a new bytearray
      >>> b[:2] = b'bb' # changing b does not affect a
      >>> a
      bytearray(b'aaaaaa')
      >>> b
      bytearray(b'bb')
      
    2. With memoryview

      >>> a = b'aaaaaa'
      >>> ma = memoryview(a)
      >>> ma.readonly   # read-only memoryview
      True
      >>> mb = ma[:2]   # does not create a new bytes object

      >>> a = bytearray(b'aaaaaa')
      >>> ma = memoryview(a)
      >>> ma.readonly   # writable memoryview
      False
      >>> mb = ma[:2]       # does not create a new bytearray
      >>> mb[:2] = b'bb'    # a change to mb is a change to ma (and to a)
      >>> mb.tobytes()
      b'bb'
      >>> ma.tobytes()
      b'bbaaaa'
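The two while loops in the question differ in exactly this way. Here is a rough sketch for Python 3 (the names `data`, `tail`, `view`, `remaining` and `steps` are made up for illustration): slicing the bytearray copies the tail each time, while slicing the memoryview only creates another small view onto the same buffer.

```python
data = bytearray(b'abcde')

# Slicing a bytearray copies: the tail is an independent object.
tail = data[1:]
tail[0] = ord('z')
print(data)                  # still bytearray(b'abcde')

# Slicing a memoryview does not copy: the view shares data's buffer.
view = memoryview(data)[1:]
print(view.obj is data)      # True
view[0] = ord('z')
print(data)                  # bytearray(b'azcde')

# The second loop from the question: each pass creates a new view object,
# but the underlying bytes are never copied.
remaining = memoryview(data)
steps = 0
while len(remaining) > 0:
    remaining = remaining[1:]
    steps += 1
print(steps)                 # 5
```

With a long buffer, the bytearray version of the loop copies the remaining bytes on every pass, so the total work grows quadratically with the length. The memoryview version only does a constant amount of work per pass.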
      

    My use case is receiving data from a socket and parsing it in network programs:

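    1. Before using memoryview, the socket receiving code looked like this (simplified)

      def read(size):
          # sock is assumed to be a connected socket defined elsewhere
          ret = b''
          remain = size
          while True:
              data = sock.recv(remain)
              if not data:
                  raise EOFError('connection closed before all data arrived')
              ret += data     # each concatenation creates a new bytes object
              if len(data) == remain:
                  break
              remain -= len(data)
          return ret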
      
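    2. With memoryview, the repeated concatenation and the new objects it creates are avoided

      def read(size):
          # sock is assumed to be a connected socket defined elsewhere
          ret = memoryview(bytearray(size))   # preallocate the whole buffer once
          remain = size
          while True:
              data = sock.recv(remain)
              if not data:
                  raise EOFError('connection closed before all data arrived')
              length = len(data)
              ret[size - remain: size - remain + length] = data   # copy the chunk into place
              if length == remain:
                  break
              remain -= length
          return ret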
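As a possible variant of the function above (not the code from the answer, just a sketch): `socket.recv_into` can write directly into a memoryview slice, so no intermediate bytes object is created per chunk. The function name `read_exact` and its `sock` parameter are assumptions for this example, and the socket pair only stands in for a real connection.

```python
import socket

def read_exact(sock, size):
    """Read exactly `size` bytes from `sock` into one preallocated buffer."""
    buf = bytearray(size)
    view = memoryview(buf)
    received = 0
    while received < size:
        # recv_into writes straight into the slice and returns the byte count
        n = sock.recv_into(view[received:], size - received)
        if n == 0:
            raise ConnectionError("socket closed before %d bytes arrived" % size)
        received += n
    return view

# Usage: a local socket pair stands in for a real network connection
left, right = socket.socketpair()
left.sendall(b'hello world')
payload = read_exact(right, 5)
print(payload.tobytes())
left.close()
right.close()
```

Like the original function, this returns a memoryview, so callers can keep slicing it without copying.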
      

    Returning a memoryview has another advantage: struct.unpack accepts a memoryview directly, which is very efficient because it avoids the many slice copies made when parsing a large str piece by piece.

    For example:

        import struct

        mv = memoryview(b'\x00\x01\x02\x00\x00\xff...')
        # msg_type and msg_len avoid shadowing the built-ins type and len
        msg_type, msg_len = struct.unpack('!BI', mv[:5])
        ...
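To make the parsing idea concrete, here is a self-contained sketch. The wire format (1-byte type, 4-byte big-endian length, then the payload) and the names `HEADER` and `parse_messages` are assumptions for illustration, not part of the original program. It uses `unpack_from` with an offset, so even the header slice is unnecessary.

```python
import struct

# Hypothetical wire format: 1-byte type, 4-byte big-endian length, then payload
HEADER = struct.Struct('!BI')

def parse_messages(buf):
    """Yield (msg_type, payload_view) pairs without copying payload bytes."""
    view = memoryview(buf)
    offset = 0
    while offset + HEADER.size <= len(view):
        msg_type, msg_len = HEADER.unpack_from(view, offset)
        start = offset + HEADER.size
        yield msg_type, view[start:start + msg_len]   # a view, not a copy
        offset = start + msg_len

stream = HEADER.pack(1, 3) + b'abc' + HEADER.pack(2, 2) + b'xy'
messages = [(t, p.tobytes()) for t, p in parse_messages(stream)]
print(messages)
```

This sketch does not handle a truncated final payload. A real parser would need to check that `start + msg_len` does not run past the end of the buffer.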
    

    [1] https://jakevdp.github.io/blo...
    [2] http://legacy.python.org/dev/...
