Home  >  Article  >  Backend Development  >  Can Multiprocessing Share Read-Only Shared Data Without Replication?

Can Multiprocessing Share Read-Only Shared Data Without Replication?

Barbara Streisand
Barbara StreisandOriginal
2024-10-25 02:46:30154browse

Can Multiprocessing Share Read-Only Shared Data Without Replication?

Does Multiprocessing Replicate Read-Only Shared Data?

Introduction

In multiprocessing scenarios, it's crucial to optimize resource usage by ensuring that shared data is not duplicated across multiple processes. Understanding how read-only data is handled in these situations can save significant memory and performance overhead.

Question

Consider the following Python code:

<code class="python">glbl_array = # a 3 Gb array

def my_func(args, def_param=glbl_array):
    # do stuff on args and def_param

if __name__ == '__main__':
    pool = Pool(processes=4)
    pool.map(my_func, range(1000))</code>

Can we guarantee or encourage that the different processes share the glbl_array without creating individual copies?

Answer

To ensure shared access without duplication, we can utilize the shared memory mechanism provided by the multiprocessing module in Python. Here's how it can be implemented:

<code class="python">import multiprocessing
import ctypes
import numpy as np

shared_array_base = multiprocessing.Array(ctypes.c_double, 10 * 10)
shared_array = np.ctypeslib.as_array(shared_array_base.get_obj())
shared_array = shared_array.reshape(10, 10)

# Parallel processing
def my_func(i, def_param=shared_array):
    shared_array[i, :] = i

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    pool.map(my_func, range(10))

    print(shared_array)</code>

Implementation Details

The code creates a shared memory array (shared_array_base) using the multiprocessing.Array class. It then converts it into a Numpy array (shared_array) for convenient manipulation.

The main function (my_func) takes shared_array as a default parameter to avoid unnecessary copying, and Linux's copy-on-write semantics ensure that data duplication only occurs when modifications are made to the shared area.

By running the code, you'll notice that the shared shared_array is printed without any duplication, indicating that the processes shared the same memory object.

The above is the detailed content of Can Multiprocessing Share Read-Only Shared Data Without Replication?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn