# Tutorial: MPI-parallelized calculation of spectra

01/2021: updated to pyGDM v1.1+

In this short tutorial we demonstrate how to accelerate the calculation of spectra via MPI multi-processing parallelization.

[1]:

## --- for benchmark: limit openmp to 1 thread (for parallel scipy/numpy routines)
import os
os.environ["OMP_NUM_THREADS"] = "1"

from pyGDM2 import structures
from pyGDM2 import materials
from pyGDM2 import fields
from pyGDM2 import propagators

from pyGDM2 import core

import numpy as np

## --- Note: It is not necessary to load mpi4py within the simulation script, this
## --- will be done automatically by pyGDM2 prior to the actual MPI-simulation.
## --- We do it however at this point to do some output to stdout only from
## --- the master process (rank == 0).
from mpi4py import MPI
rank = MPI.COMM_WORLD.rank


## Config of the simulation

We will demonstrate the MPI-parallelized spectrum calculation on the simple example of a small dielectric sphere.

[2]:

## ---------- Setup structure
mesh = 'cube'
step = 20.0
## --- geometry: small sphere (radius R in units of step; example value)
geometry = structures.sphere(step, R=3.5, mesh=mesh)
material = materials.dummy(2.0)
struct = structures.struct(step, geometry, material)

## ---------- Setup incident field
field_generator = fields.planewave
wavelengths = np.linspace(400, 800, 20)
kwargs = dict(theta = [0.0])
efield = fields.efield(field_generator, wavelengths=wavelengths, kwargs=kwargs)

## ---------- vacuum environment
n1, n2 = 1.0, 1.0

## ---------- Simulation initialization
dyads = propagators.DyadsQuasistatic123(n1=n1, n2=n2)
sim = core.simulation(struct, efield, dyads)

structure initialization - automatic mesh detection: cube
structure initialization - consistency check: 203/203 dipoles valid
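
The field container expands the wavelength list and the `kwargs` values into one field configuration per combination; here, 20 wavelengths times a single incidence angle give 20 configurations, which is exactly what the MPI wrapper later distributes. A minimal sketch of this combinatorics in plain Python (not the pyGDM2 API itself):

```python
import numpy as np

## --- same spectral sampling and angle list as in the setup above
wavelengths = np.linspace(400, 800, 20)
theta = [0.0]

## --- each (wavelength, angle) pair is one field configuration
configs = [(wl, th) for wl in wavelengths for th in theta]
print(len(configs))   # 20 configurations to distribute over the MPI ranks
```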


## Run the simulation with the MPI wrapper to core.scatter

The only difference from a non-parallelized run of the simulation is that we use core.scatter_mpi instead of core.scatter. scatter_mpi automatically distributes the calculation of the different wavelengths in the spectrum over the available MPI processes.

[3]:

## --- mpi: print in process with rank=0, to avoid flooding of stdout
if rank == 0:
    print("performing MPI parallel simulation... ")

core.scatter_mpi(sim)

if rank == 0:
    print("simulation done.")


performing MPI parallel simulation...

/home/hans/.local/lib/python3.8/site-packages/pyGDM2/core.py:443: UserWarning: Executing only one MPI process! Should be run using e.g. 'mpirun -n X python scriptname.py', where X is the number of parallel processes.
warnings.warn("Executing only one MPI process! Should be run using" +
/home/hans/.local/lib/python3.8/site-packages/numba/core/dispatcher.py:237: UserWarning: Numba extension module 'numba_scipy' failed to load due to 'ValueError(No function '__pyx_fuse_0pdtr' found in __pyx_capi__ of 'scipy.special.cython_special')'.
entrypoints.init_all()

simulation done.


IMPORTANT NOTE: To be run with MPI, the script must be launched via the program mpirun:

$ mpirun -n 4 python3 pygdm_script_using_mpi.py


where, in this example, the argument “-n 4” tells MPI to launch 4 parallel processes.

Note: in case the number of wavelengths in the spectrum is not divisible by the number of MPI processes, some processes will be idle for part of the execution. In this case scatter_mpi will print a warning:

UserWarning: Efficiency warning: Number of wavelengths (20) not divisable by Nr of processes (3)!
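
This load imbalance can be illustrated without running MPI at all: when the wavelength list does not divide evenly, some ranks receive one wavelength more than others and the remaining ranks idle while those finish. A minimal sketch using numpy's array_split (the actual distribution logic inside pyGDM2 may differ in detail):

```python
import numpy as np

## --- same spectrum as above: 20 wavelengths
wavelengths = np.linspace(400, 800, 20)

## --- hypothetical distribution over 3 MPI processes
n_procs = 3
chunks = np.array_split(wavelengths, n_procs)

for rank, chunk in enumerate(chunks):
    print("rank {}: {} wavelengths".format(rank, len(chunk)))
## --- the ranks get 7, 7 and 6 wavelengths: the last rank idles
## --- while the other two compute their extra wavelength
```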


## Timing of the above example

• MPI-run: 1.25 s

• sequential-run: 4.05 s

• speed-up: x3.2

The speed-up should be closer to x4 for larger simulations, where the MPI overhead becomes small compared to the total simulation runtime.
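
The numbers above correspond to a parallel efficiency of roughly 80%; a quick check of the arithmetic (assuming 4 MPI processes, matching the mpirun -n 4 example):

```python
## --- timings from the benchmark above
t_sequential = 4.05   # s
t_mpi = 1.25          # s
n_processes = 4       # assumed, as in the mpirun example

speedup = t_sequential / t_mpi
efficiency = speedup / n_processes

print("speed-up:   x{:.1f}".format(speedup))     # x3.2
print("efficiency: {:.0%}".format(efficiency))   # 81%
```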