Python parallel processing on Unity HPC.¶
Objectives¶
- Learn how to request a parallel processing 'job' and set up a Python environment that will use that job.
- Write a short function to test and confirm that parallel processing is taking place (in-class).
- Use these concepts to modify landsatexplore.py from Week09 and implement the NDVI calculation (take-home).
- Use the Python library Dask to extend NumPy array and Pandas DataFrame calculations over multiple processors.
The Dask documentation explains how Dask works with parallel computing: https://tutorial.dask.org/02_array.html
Use the Unity documentation to learn more about using HPC resources: https://docs.unity.uri.edu/documentation/
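As a quick, hedged taste of what Dask provides (not part of this assignment): a Dask array behaves like a NumPy array, but splits its work into chunks that can be computed on several processors at once.
import dask.array as da
# Build a 10,000 x 10,000 array of random numbers, split into 1,000 x 1,000 chunks.
x = da.random.random((10000, 10000), chunks=(1000, 1000))
# The mean is computed chunk by chunk, potentially on many processors in parallel.
print(x.mean().compute())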
Step 1: Log in and request resources for your compute node.¶
Use your login info to connect to https://ood.unity.rc.umass.edu/ as before.
The Unity documentation describes how to request cluster computing resources, or jobs, which are categorized into several distinct partitions. Unity uses SLURM to manage and allocate resources, but we won't dedicate much time to understanding how SLURM works; the Python library for distributed processing of array data (Dask) will be the tool we focus on.
The salloc command allows you to request cluster jobs on Unity. Documentation for both is linked above. For salloc, you can request the number of CPUs -c, the amount of time you want the job to last --time, the RAM or memory --mem, and which partition to use -p. A simple request looks like this:
]$ salloc -c 5 # Request 5 CPUs
The command below requests 12 CPUs and 150 GB of RAM for 60 minutes on the partition called 'cpu'. In my experience, the fewer processors you request, the sooner your job is allocated. NOTE: Please do not request more than 24 processors.
]$ salloc -J interact -c 12 --time=0:60:00 --mem=150G -p cpu
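Once the job is allocated, you can sanity-check what you were given. As an aside (these environment variables are set by SLURM inside the allocation):
]$ echo $SLURM_JOB_ID          # ID of the job you were allocated
]$ echo $SLURM_CPUS_PER_TASK   # Should match the -c value you requested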
Step 2: Load your conda environment.¶
Load your conda environment, following the same steps as in Week10.
]$ module load anaconda/latest
]$ conda activate your_env_here
Remember to type conda deactivate to close your Anaconda environment when you are finished.
Aside. You can put all of these commands in a text file called a shell script and just run the script to speed up the process. The script must begin with the line #!/usr/bin/bash. You can use nano to create this shell script. The convention is to give it the file extension .sh, e.g., start.sh. After you have created the shell script, you need to make it executable with the chmod command.
]$ chmod a+x start.sh
The script can be run at the command line using:
]$ source start.sh
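For example, a minimal start.sh might look like the following sketch, which simply reuses the commands from Steps 1 and 2 (your_env_here is a placeholder for your own environment name):
#!/usr/bin/bash
# Request a job: 12 CPUs, 150 GB of RAM, 60 minutes, on the 'cpu' partition.
salloc -J interact -c 12 --time=0:60:00 --mem=150G -p cpu
# Load Anaconda and activate your environment (placeholder name).
module load anaconda/latest
conda activate your_env_here
Note that salloc opens an interactive shell on the allocated resources, so the module and conda lines will not run until that shell exits; some people prefer to run salloc separately and keep only the module and conda lines in the script.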
Step 3: Make a coreclock script to confirm parallel processing of computations.¶
- Download the script coreclock.py and examine the comments and contents.
- Add a function called coreclock() to the script, following the comments in the script.
- Upload the script to Unity.
- Open the Unity OOD Shell.
- Request compute resources for your job following Step 1.
- Load modules, activate your conda environment.
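Before looking at the notes below, here is a minimal sketch of what coreclock() might look like (the exact behavior should follow the comments in the script you downloaded; the one-second delay is an assumption). The key idea is that each call takes a known amount of wall time, so the total runtime reveals how many calls ran at once:
import time

def coreclock(i):
    # Simulate a fixed amount of work by sleeping for one second,
    # then return the task number so results can be collected.
    time.sleep(1)
    return i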
Notes about the code in coreclock.py¶
# Client() and LocalCluster() will be used to connect to the job resources that
# were requested.
from dask.distributed import Client, LocalCluster
# Progress function reports the computation status to the screen
from dask.distributed import progress
# Use time library for sleep
import time
# Connect to resources.
cluster = LocalCluster()
job = Client(cluster)
print(job)
The code block above creates a connection to the salloc resources that were requested before starting Python. The resources can be viewed with print(job).
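A sketch of how the mapping step might look, using the job client created above and the coreclock() sketch from Step 3 (the variable names futures and results are assumptions, not part of coreclock.py):
# Submit 500 calls to coreclock(); Dask spreads them across the workers.
futures = job.map(coreclock, range(500))
# Report completion status to the screen while the tasks run.
progress(futures)
# Block until everything finishes and collect the return values.
results = job.gather(futures)
If you see errors about spawning processes when running the script, the Dask documentation recommends wrapping this code in an if __name__ == '__main__': block.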
- Use job.map() to execute coreclock() 500 times and distribute them over the requested CPUs (see the sketch above).
- Run coreclock.py at the command line:
]$ python coreclock.py
[#### ] | 10% Completed | 11.6s
- How long does it take for the code to complete execution?
- Based on 500 instances of coreclock() and the delay you added with time.sleep(), how long would it take for a single processor to complete the same task?
- Is parallel computation working as expected?
Step 4: What to turn in?¶
- Answer the questions from Step 3 in this .ipynb.
- Upload this .ipynb.
- Upload your modified version of coreclock.py.