Optimizing Python Code for Parallel Computing
Techniques for writing Python code that efficiently utilizes multicore processors and parallel computing frameworks.
Rule Content
---
description: Enforce best practices for writing Python code that efficiently utilizes multicore processors and parallel computing frameworks.
globs: "**/*.py"
tags: [python, parallel-computing, performance]
priority: 3
version: 1.0.0
---

# Optimizing Python Code for Parallel Computing

## Context

- Applicable when developing Python applications intended to leverage multicore processors for improved performance.
- Relevant for tasks involving data processing, scientific computations, and other CPU-bound operations.

## Requirements

- **Use Appropriate Parallel Computing Libraries**: Utilize libraries such as `multiprocessing`, `concurrent.futures`, `joblib`, or `Dask` to manage parallelism effectively.
- **Minimize Inter-Process Communication (IPC)**: Design applications to reduce the overhead associated with IPC by batching data and minimizing shared state.
- **Avoid Global Interpreter Lock (GIL) Limitations**: For CPU-bound tasks, prefer multiprocessing over multithreading to bypass GIL constraints.
- **Implement Efficient Data Partitioning**: Divide tasks into independent, smaller chunks that can be processed concurrently to maximize resource utilization.
- **Manage Process Pools Effectively**: Use process pools to control the number of worker processes, balancing between resource availability and task requirements.
- **Leverage Shared Memory When Appropriate**: For large datasets, consider using shared memory to reduce data duplication and improve performance.
- **Profile and Benchmark Code**: Regularly profile code to identify bottlenecks and ensure that parallelization leads to actual performance gains.

## Examples

<example>

**Good Practice**: Using `multiprocessing` to perform parallel computations on a list of numbers.
```python
from multiprocessing import Pool

def square_number(n):
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]
    with Pool(processes=4) as pool:
        results = pool.map(square_number, numbers)
    print(results)  # Output: [1, 4, 9, 16, 25]
```

</example>

<example type="invalid">

**Bad Practice**: Using multithreading for CPU-bound tasks, leading to suboptimal performance due to the GIL.

```python
from threading import Thread

def square_number(n):
    return n * n

numbers = [1, 2, 3, 4, 5]
threads = []
results = []
for n in numbers:
    # Threads contend for the GIL, so CPU-bound work gains no speedup.
    thread = Thread(target=lambda q, arg1: q.append(square_number(arg1)),
                    args=(results, n))
    threads.append(thread)
    thread.start()
for thread in threads:
    thread.join()
print(results)  # Output order may be inconsistent, and no faster than serial
```

</example>