Directory structure, paths, modules¶
Motivation:
- As you begin to write more mature code, you may want to make it reusable, because it will have multiple use cases.
- The best way to make code reusable is to make it into a module or function.
- A module/function has clearly defined inputs and outputs.
- The module has its own variable scope that is kept separate from the workspace.
Modules have three important sections:
- The definition or declaration where inputs are defined. This is like the head.
- The internal operations. This is the body.
- The return statement where outputs are specified.
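The three sections can be seen in a minimal sketch. The name `rectangle_area` and its arguments are hypothetical, chosen only for illustration:

```python
def rectangle_area(width, height):  # head: the inputs are declared here
    area = width * height           # body: the internal operations
    return area                     # return statement: the output

print(rectangle_area(3, 4))  # prints 12
```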
Directory structure is the organization of files and other data objects into drives and folders.¶
- The document root specifies the lowest level in the structure. This is usually equivalent to the hard drive.
- The subfolders sit within the document root.
The path specifies the location of a file in the directory structure.¶
- A path lists the folders that lead to a file, separated by slashes.
- A path can be absolute. Absolute paths are specified with respect to the document root.
- A path can be relative to the current location.
Paths are used to help executable files find the other files they need.
Example: If I have an .ipynb in /pylibrary/spongebob and I want to use it to load an image in /Users/Downloads, I can specify the location as:
imread('/Users/Downloads/myimage.png')
# or
imread('../../Users/Downloads/myimage.png')
cd ../brice # relative path, when I'm in /pylibrary/spongebob
cd /pylibrary/brice # absolute path to the same folder
# The ../ sequence means "go up one level" in the directory structure.
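As a rough check of how a relative path resolves, the sketch below joins the notebook folder from the example with the relative path and normalizes the result. It uses the standard-library `posixpath` module so that it behaves the same on any operating system. The folder names are the ones from the example and need not exist on disk.

```python
import posixpath

notebook_dir = '/pylibrary/spongebob'
relative = '../../Users/Downloads/myimage.png'

# Join the two pieces, then collapse each "../" step.
resolved = posixpath.normpath(posixpath.join(notebook_dir, relative))
print(resolved)  # prints /Users/Downloads/myimage.png

# An absolute path starts at the root; a relative path does not.
print(posixpath.isabs(resolved))  # prints True
print(posixpath.isabs(relative))  # prints False
```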
Python path¶
Python maintains its search path as a list of text strings, which shows all the places where importable modules can be found.
# The sys module gives access to variables and functions of the Python interpreter, including the path
import sys
# Print the python path
print(sys.path)
# Add a folder to your python path
sys.path.append('/path/to/the/')
# Note, the first location on the path is the current directory.
['/Users/huero/GDrive/teaching/OCG404/2021/Lectures+Notes/Week08', '/Users/huero/anaconda3/envs/cart/lib/python38.zip', '/Users/huero/anaconda3/envs/cart/lib/python3.8', '/Users/huero/anaconda3/envs/cart/lib/python3.8/lib-dynload', '', '/Users/huero/.local/lib/python3.8/site-packages', '/Users/huero/anaconda3/envs/cart/lib/python3.8/site-packages', '/Users/huero/anaconda3/envs/cart/lib/python3.8/site-packages/IPython/extensions', '/Users/huero/.ipython', '/path/to/the/']
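Running the append cell more than once adds the same folder to the list repeatedly. A small helper can guard against that. This is only a sketch: `add_to_python_path` is a hypothetical name, and `'/path/to/the/'` is the placeholder folder from the cell above.

```python
import sys

def add_to_python_path(folder):
    """Append folder to sys.path only if it is not already listed."""
    if folder not in sys.path:
        sys.path.append(folder)
    return sys.path

add_to_python_path('/path/to/the/')
add_to_python_path('/path/to/the/')  # the second call changes nothing
print(sys.path.count('/path/to/the/'))  # 1 in a fresh session
```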
Modules use functions to accept inputs, do operations, and pass outputs.¶
# This is a module with no inputs
def Sayit():  # This is the definition in the head.
    Iter = 0  # This is an operation in the body.
    Ntimes = 10  # This is an operation in the body.
    while Iter < Ntimes:
        Iter += 1
        print("Run for your life!")
    return  # This is the return statement where outputs are defined.
# This is an example of default arguments. You can specify default values, in case the user does not supply them.
def Repeatafterme(Message="No msg at this time", Ntimes=5):  # This is the definition in the head.
    count = 0
    while count < Ntimes:  # The loop runs exactly Ntimes times.
        print(Message)
        count += 1
    return  # This is the return statement where outputs are defined.
Repeatafterme(Ntimes=10)
No msg at this time No msg at this time No msg at this time No msg at this time No msg at this time No msg at this time No msg at this time No msg at this time No msg at this time No msg at this time
Repeatafterme("The word is the message.")
The word is the message. The word is the message. The word is the message. The word is the message. The word is the message.
Modules have a scope that differs from the workspace.¶
def addtoa(a):
    b = 10
    c = a + b
    return c
This module takes the variable a as input, adds the variable b to it, and saves the result in c. It then returns c as output. After the module executes, b ceases to exist. The "workspace" and other modules do not know about b or its value.
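A quick way to see this is to call the module and then ask for `b` from the workspace. The sketch below repeats the definition of `addtoa` so that it runs on its own, and it assumes that no variable named `b` already exists in the workspace.

```python
def addtoa(a):
    b = 10      # b exists only while the function runs
    c = a + b
    return c

result = addtoa(5)
print(result)  # prints 15

try:
    print(b)   # b was never defined in the workspace
except NameError as err:
    print("NameError:", err)
```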
# We can place the module definition in a separate Python file (here sayitall.py) and then import it.
# This only works if the file is on the python path.
# from sayitall import Sayit2
# from sayitall import Sayit3, Sayit2
import sayitall as s
s.Sayit2()
Run for your life! Run for your life! Run for your life! Run for your life! Run for your life! Run for your life! Run for your life! Run for your life! Run for your life! Run for your life!
# It is not possible to pass a path to the import command
import('../../Lectures+Notes/Week08/sayitall')
  File "/var/folders/0z/j7ytwmyn37551vn5d49bdcjm0000gn/T/ipykernel_98661/880019498.py", line 3
    import('../../Lectures+Notes/Week08/sayitall')
          ^
SyntaxError: invalid syntax
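# BUT: If the file is not on the Python path, here is a workaround to directly specify the path to your module.
# Note: imp is deprecated and was removed in Python 3.12.
import imp
# load_source returns the module object, so keep a reference to it.
sayitall = imp.load_source('sayitall', '../../Lectures+Notes/Week08/sayitall.py')
sayitall.Sayit3()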
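On Python versions where `imp` is no longer available, the standard-library `importlib.util` can load a file by its path. The sketch below is one way to do it, not the only one. `load_module_from_path` is a hypothetical helper name, and the path is the one used in the cell above, so the call itself is left commented out.

```python
import importlib.util
import sys

def load_module_from_path(name, path):
    """Load the .py file at path and return it as a module object."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {name!r} from {path!r}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module       # register it so later imports find it
    spec.loader.exec_module(module)  # run the code in the file
    return module

# sayitall = load_module_from_path('sayitall', '../../Lectures+Notes/Week08/sayitall.py')
# sayitall.Sayit3()
```

Functions in the loaded file are then reached through the returned module object, in the same way as with `import sayitall as s`.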
Modules can return multiple outputs¶
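def addtoa(a):
    b = 10
    c = a + b
    return c, b  # Multiple outputs are returned together as a tuple.
myC, myB = addtoa(a=10)  # Unpack the two outputs into separate variables.
print(myC)  # prints 20
myCB = addtoa(a=10)  # Without unpacking, both outputs are stored as one tuple.
print(myCB)  # prints (20, 10)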
In-class assignment part 1:¶
Put your basic stats code inside a module. Save that module in a file called basic_stats.py or similar.
The input should be a list, vector, or NumPy array x of unspecified length.
The outputs should be the sample mean $\bar{x}$ and sample standard deviation $s$, as well as a figure containing the histogram of x.
Upload basic_stats.py to Week08 assignment, along with your work from the HPC_Intro.ipynb.