Making Packages for Upload to GitHub¶
TASK: Create a public GitHub repository to share a module or other function as open source.
MOTIVATION:
To maintain the integrity and reproducibility of scientific results, it is important that both data and code are preserved in a repository.
Document and package a module you have written. For example, you can complete this exercise with your fibonacci module, or with the basic_stats code you wrote earlier in the class. (If you want to go through this process with a different module you wrote, that is also good, even excellent.)
Use git to upload and maintain the basic_stats module.
Types of Projects¶
Ryan Abernathey recommends categorizing your intellectual output into one of three types:
- Exploratory analyses: When exploring a new idea, a single notebook or script is often all we need.
- Single Paper or Research Report: The "paper" is a standard unit of scientific output. The code related to a single paper usually belongs together.
- Reusable software elements: In the course of our research computing, we often identify specialized routines that we want to package for reuse in other projects, or by other scientists. This is where "scripts" become "software."
We will focus on the Reusable software element.
Reusable Software Elements¶
Scientific software can perhaps be grouped into two categories: single-use "scripts" that are used in a very specific context to do a very specific thing (e.g., to generate a specific figure for a paper), and reusable components which encapsulate a more generic workflow. Once you find yourself repeating the same chunks of code in many different scripts or projects, it's time to start composing reusable software elements.
Modules:¶
Earlier in the class you have created a module. Here, that code will get copied into a stand-alone Python .py file, so that it can be imported and executed by any other script or in the Python IDE. More info about modules.
Below is an example module to compute the Saturation vapor pressure for some common gases in the atmosphere. I use this code as part of a toolbox on air-sea exchange.
Included with this assignment is a file called air_sea.py, which contains the module definition for the saturation vapor pressure:
"""
Saturation Vapor Pressure in mb
"""
def sat_vp(T):
    """
    Saturation Vapor Pressure in mb
    % function es = sat_vp(T),
    %- INPUT:
    %- 1) Temperature in Deg. C
    %- OUTPUT:
    %- 1) vapor pressure in mb calculate the saturation vapor pressure in mb.
    %-
    """
    import numpy as np
    Lf = 2.453e6 # latent heat (J/kg)
    Rv = 461. # gas constant for water vapor (J/(kg K))
    if (T <= 0):
        Tk = 273.15+T
        # es in hPa
        loges = -7.90298*(373.16/Tk-1) + 5.02808*np.log10(373.16/Tk) \
            - 1.3816e-7*(1011.344*(1-Tk/373.16)-1) + 8.1328e-3*(10**-3.49149*(373.16/Tk-1)-1) \
            + np.log10(1013.246)
        es = 10**loges
    else:
        # Above 0 deg C: integrated Clausius-Clapeyron form, es in hPa
        Tk = 273.15+T
        loges = Lf/Rv*(1/273.15 - 1/Tk)
        es = 6.11*np.exp(loges)
    return es
This module can be imported if it is in the same folder as this Jupyter notebook.
import air_sea
help(air_sea)
Help on package air_sea:

NAME
    air_sea

DESCRIPTION
    Air-sea Package for Python
    Version 1.0
    Oct. 8, 2020
    Author: Brice Loose, Graduate School of Oceanography, URI.
    Email: bloose@uri.edu
    #
    # DISCLAIMER:
    # This software is provided "as is" without warranty of any kind.
    #=========================================================================

PACKAGE CONTENTS
    air_sea

FUNCTIONS
    sat_vp(T)
        Saturation Vapor Pressure in mb
        % function es = sat_vp(T),
        %- INPUT:
        %- 1) Temperature in Deg. C
        %- OUTPUT:
        %- 1) vapor pressure in mb calculate the saturation vapor pressure in mb.
        %-

DATA
    __all__ = ['sat_vp']

FILE
    /Users/huero/GDrive/teaching/OCG404/2023/Lab/Github_Lab/air_sea/__init__.py
And let's try using it to make a calculation:
air_sea.sat_vp(-1)
5.670131452875462
If the module is modified, you must restart the kernel or reload the module.
from importlib import reload
reload(air_sea)
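The effect of reload can be seen with a throwaway module. The sketch below is only an illustration: the module name demo_mod and its answer function are hypothetical, and the file is written to a temporary directory rather than next to a notebook.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip cached bytecode so the rewritten source is always recompiled.
sys.dont_write_bytecode = True

# Create a hypothetical module file in a temporary directory.
workdir = Path(tempfile.mkdtemp())
module_file = workdir / "demo_mod.py"
module_file.write_text("def answer():\n    return 1\n")

sys.path.insert(0, str(workdir))   # make the directory importable
importlib.invalidate_caches()
demo_mod = importlib.import_module("demo_mod")

before = demo_mod.answer()

# Edit the file on disk, as you would in an editor.
module_file.write_text("def answer():\n    return 2\n")

stale = demo_mod.answer()          # still the old definition
importlib.reload(demo_mod)         # re-executes the edited file
after = demo_mod.answer()

print(before, stale, after)        # 1 1 2
```

The middle value shows why the reload is needed: editing the file alone does not change the function that is already loaded in the running kernel.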
Modules are a simple way to share code between different scripts or notebooks in the same project. By default, a module file must reside in the same directory as any script that imports it (or in another directory on Python's import search path). This is a big limitation; it means you can't easily share modules between different projects.
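Where Python looks for modules is recorded in sys.path. As a small illustration (the module name shared_tools_demo and its directory are hypothetical), the sketch below shows that a module outside the search path is not found until its directory is added.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# A hypothetical module living in some other project's directory.
other_project = Path(tempfile.mkdtemp())
(other_project / "shared_tools_demo.py").write_text("VALUE = 42\n")

# The directory is not on the search path, so the module is not found.
found_before = importlib.util.find_spec("shared_tools_demo") is not None

# Adding the directory to sys.path makes the import work, for this session only.
sys.path.append(str(other_project))
importlib.invalidate_caches()
shared_tools_demo = importlib.import_module("shared_tools_demo")

print(found_before, shared_tools_demo.VALUE)   # False 42
```

Editing sys.path by hand works, but it has to be repeated in every script and on every computer, which is one reason to install shared code properly instead.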
Once you have a piece of code that is general-purpose enough to share between projects, you need to create a package.
Packages¶
Packages are Python's way of encapsulating reusable code elements for sharing with others. Packaging is a huge and complicated topic. We will just scratch the surface.
We have already interacted with many packages. Browse the GitHub repositories of some of them to explore the structure of a large Python package.
These packages all have a common basic structure. Imagine we wanted to turn the sat_vp module into a package called air_sea. It would look like this:
README.md
LICENSE
setup.py
air_sea/__init__.py
air_sea/sat_vp.py
The actual package is contained in the air_sea subdirectory. The other files are auxiliary files which help others understand and install your package. Here is an overview of what they do:
File Name | Purpose |
---|---|
README.md | Explains what the package is for |
LICENSE | Defines the legal terms under which others can use the package. Open source is encouraged! |
setup.py | A special Python script which installs your package. (more info) |
The actual package¶
The directory air_sea is the actual package. Any directory that contains an __init__.py file is recognized by Python as a package. This file can be blank, but it needs to be present.
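As a minimal sketch of that rule, the code below builds a throwaway package in a temporary directory and imports it. The names demo_pkg, stats.py, and mean are hypothetical; the __init__.py re-exports one function and lists it in __all__, the same pattern suggested by the help(air_sea) output shown earlier.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"          # hypothetical package directory
pkg_dir.mkdir()

# A submodule holding the actual code.
(pkg_dir / "stats.py").write_text(
    "def mean(values):\n"
    "    return sum(values) / len(values)\n"
)

# __init__.py marks the directory as a package and re-exports mean().
(pkg_dir / "__init__.py").write_text(
    '"""Demo package."""\n'
    "from .stats import mean\n"
    "__all__ = ['mean']\n"
)

sys.path.insert(0, str(root))        # the parent directory goes on the path
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

print(demo_pkg.mean([1, 2, 3]))      # 2.0
print(demo_pkg.__all__)              # ['mean']
```

Because of the re-export in __init__.py, users can call demo_pkg.mean directly instead of demo_pkg.stats.mean.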
setup.py is the magic file that makes your package installable and accessible anywhere. Here is a basic setup.py:
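from setuptools import setup

setup(
    name = "air_sea",
    version = "0.0.1",
    author = "Brice Loose",
    # packages lists the package directories to install, not the functions inside them
    packages=['air_sea'],
    # install_requires takes distribution names as published on PyPI
    install_requires=['numpy','scipy'],
)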
To run the setup script, we call the following from the command line
python setup.py install
The package files are copied to our python library directory. If we plan to keep developing the package, we can install it in "developer mode" as
python setup.py develop
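In this case, the installed package links back to the source directory rather than being copied, so edits take effect without reinstalling. Note that recent setuptools releases deprecate running setup.py directly; the pip equivalents are pip install . for a regular install and pip install -e . for developer mode.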
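After installing, it can be useful to confirm which copy of a package Python will actually import. The helper below is a sketch using only the standard library; where_is is a hypothetical name, and json merely stands in for an installed package such as air_sea.

```python
import importlib.util
from importlib import metadata


def where_is(name):
    """Return (file location, installed version) for a top-level module or package.

    Either element is None when it cannot be determined.
    """
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    return location, version


# "json" ships with Python, so it has a file location.
print(where_is("json"))
# A name that is not installed gives (None, None).
print(where_is("no_such_package_xyz"))
```

For a package installed in developer mode, the reported location should be inside your working copy of the repository rather than in the Python library directory.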
Follow steps similar to the GitHub Pages in-class activity.¶
- Back in a web browser, open your github.com account and make a new empty repository called basic_stats or some other name that describes what the code does. Edit the repository name and description.
- Include a LICENSE and README.md.
- Copy the .git URL to your clipboard in preparation to clone the repository to your local computer.
Clone the empty basic_stats repository and copy in your python files.¶
Use the git clone command to make a local copy of your repository that you can modify and push.
]$ cd ~ # Change to your home directory
]$ cd git_repos/
]$ git clone [repository_url].git #this should be copied from the code dropdown.
]$ cd [repository_name] # Change into the directory you created.
- Add setup.py to the repository folder.
- Make a folder called basic_stats.
- Add __init__.py to this folder. Modify according to the air_sea example.
- Add basic_stats.py to this folder.
- Synchronize your changes using the add, commit, push sequence that we have used.
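Before committing, the layout from the steps above can be checked with a short script. This is only a sketch: missing_files is a hypothetical helper, and the expected file list follows the steps above plus the README.md and LICENSE created with the repository.

```python
import tempfile
from pathlib import Path


def missing_files(repo_dir, package):
    """List expected packaging files that are absent from repo_dir."""
    expected = [
        "README.md",
        "LICENSE",
        "setup.py",
        f"{package}/__init__.py",
        f"{package}/{package}.py",
    ]
    repo = Path(repo_dir)
    return [name for name in expected if not (repo / name).is_file()]


# Demonstration on a throwaway directory standing in for the cloned repository.
repo = Path(tempfile.mkdtemp())
(repo / "README.md").write_text("# basic_stats\n")
(repo / "LICENSE").write_text("license text\n")
(repo / "basic_stats").mkdir()
(repo / "basic_stats" / "__init__.py").write_text("")

print(missing_files(repo, "basic_stats"))
# ['setup.py', 'basic_stats/basic_stats.py']
```

To use it on your own work, pass the path of your cloned repository and the package name; an empty list means every expected file is present.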
Step 6: What to turn in.¶
Upload the contents of your new github repository for basic_stats to the assignment upload. Paste the URL for your repository into the assignment submission text box, so I can have a look at your repo.