The Enthought PDF slides give further detail on the different types of indexing that exist.¶

In [ ]:

# PYTHON
# Import modules
import numpy as np
import matplotlib.pyplot as plt

In [2]:

# R. We only need base R for this, because the matrix library is part of base R.
# Import modules

Creating 1D arrays with array(), arange(), and linspace()¶

In [ ]:

# Python
# Use array() to create arrays of any dimension if you already know or have the values to put 
# into the array.
x = np.array([5,4,3,2,1])

# linspace inputs are start, stop, number of elements (the stop value is included)
xls = np.linspace(0,100,100)

# arange inputs are start, stop, interval (the stop value is excluded)
xar = np.arange(0,100,0.9999)

#print(xar.shape)
#xls.shape

# The index -1 pulls the last value out of the array
#print( xar[-1], xls[-1])
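
The difference between the two spacing conventions is easier to see on a short range. The sketch below uses small, made-up endpoints (0 to 1) rather than the values in the cell above: linspace() takes a number of points and includes the stop value, while arange() takes a step size and excludes it.

```python
import numpy as np

# linspace: 5 evenly spaced points from 0 to 1, stop value included
pts = np.linspace(0, 1, 5)

# arange: steps of 0.25 from 0 up to (but not including) 1
steps = np.arange(0, 1, 0.25)

print(pts)    # 5 elements, the last one is 1.0
print(steps)  # 4 elements, the last one is 0.75
```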

In [ ]:

# R
# Use c() to create a vector if you already know or have the values to put
# into it.
x <- c(5, 4, 3, 2, 1)

# seq inputs are from, to, and length.out (the R counterpart of linspace)
xls <- seq(from = 0, to = 100, length.out = 100)

# seq inputs are from, to, and by (the R counterpart of arange)
xar <- seq(from = 0, to = 100, by = 0.9999)

# dim() returns NULL for a plain vector, so use length() instead
length(xar)

# tail() pulls the last value out. A negative index such as xar[-1] drops that element in R.
#print(c(tail(xar, 1), tail(xls, 1)))

NumPy can store numeric information (usually float or int data types) in 2-, 3-, or even N-dimensional arrays. 2D arrays are indexed as [row #, col #], and because Python counts from 0, a[3,2] gives the element in the fourth row and third column. R counts from 1, so the same element of an R matrix is a[4, 3].
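
As a quick check of the [row #, col #] convention, here is a minimal sketch. The 5 x 4 array is made up for illustration and is not used elsewhere in the notebook.

```python
import numpy as np

# A 5 x 4 array holding 0..19, filled row by row
a = np.arange(20).reshape(5, 4)

# Indices start at 0, so [3, 2] is the fourth row, third column
elem = a[3, 2]

print(a.shape)  # (5, 4)
print(elem)     # 14
```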

Creating 2D arrays with array(), zeros(), ones()¶

In [ ]:

# PYTHON
# Assemble a 2D array from nested lists: each inner list becomes one row.
x = np.array([[1,2,3],[3,4,5]])
print(x.shape)

# Currently this is a 1D array
y = np.array([1,2,3])

# Sometimes you need to set an array up to be 2D, so you can add data to it later.
# This should be a 2D array
y2 = np.array([1,2,3],ndmin = 2)

#print("Y has",y.ndim,"dimensions. Y2 has",y2.ndim,"dimensions")

In [ ]:

# R
# Assemble a 2D array from a vector of values, filling the matrix row by row.
x <- matrix(c(1, 2, 3, 3, 4, 5), nrow = 2, byrow = TRUE)
print(dim(x))

# Currently this is a plain vector. dim(y) is NULL, so length(dim(y)) reports 0 dimensions.
y <- c(1, 2, 3)

# Sometimes you need to set an array up to be 2D, so you can add data to it later.
# This should be a 2D array
y2 <- matrix(c(1, 2, 3), nrow = 1)

# Print the dimensions
print(paste("Y has", length(dim(y)), "dimensions. Y2 has", length(dim(y2)), "dimensions"))

In [ ]:

y*y2

Arithmetic on Arrays (element-wise or linear algebra)¶

By default, NumPy carries out element-wise arithmetic (+, -, *, /) on arrays of like dimension. Where the shapes differ but are compatible, NumPy uses array broadcasting to make the operation work.
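
The following sketch illustrates both ideas on made-up arrays. It also uses the @ operator for the linear-algebra product, which the text above does not mention by name, to contrast it with element-wise multiplication.

```python
import numpy as np

row = np.array([1, 2, 3])            # shape (3,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

# Broadcasting stretches both operands to a common shape of (3, 3)
grid = col + row

M = np.array([[1, 2], [3, 4]])

elementwise = M * M   # multiplies matching elements
matrix_prod = M @ M   # linear-algebra matrix product

print(grid.shape)
print(elementwise)
print(matrix_prod)
```

Broadcasting only works when each pair of trailing dimensions is equal or one of them is 1; otherwise NumPy raises a ValueError.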

In [ ]:

# PYTHON AND R
y + y2  # This is permitted.  The result takes on the higher dimension (1 x 3).
y * y2  # Likewise permitted.  The result takes on the higher dimension (1 x 3).

In [ ]:

# PYTHON
# Examples of array manipulations.

R = np.arange(0,100,12)  # Create a vector of 9 elements

# Element-wise operation.  P is the same size as R.
P = R*R

# Implicit element-wise operation: the scalar is applied to every element.

Q = P.copy() - 3
Q = P*R

# Make R into a 3 x 3 matrix (2D array), and store it in S.
S = R.reshape(3,3)

# An array multiplication with broadcast operation.  
T = S*np.array([3,2,1])
#print(T)

In [ ]:

# R
# Examples of array manipulations.
R <- seq(from = 0, to = 100, by = 12)  # Create a vector of 9 elements

# Element-wise operation.  P is the same size as R.
P <- R * R

# Implicit element-wise operation.
Q <- P - 3
Q <- P * R

# Make R into a 3 x 3 matrix (2D array), and store it in S.
S <- matrix(R, nrow = 3, byrow = TRUE)

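# An array multiplication with broadcast operation.
# R recycles a vector down the columns, so S * c(3, 2, 1) would scale the rows.
# sweep() over margin 2 scales the columns instead, matching the NumPy result.
# Note: T is also R's shorthand for TRUE, which this assignment masks.
T <- sweep(S, 2, c(3, 2, 1), "*")
print(T)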

In [ ]:

# PYTHON
# (Object-oriented notation, Functional notation)
print(T.max(axis=0),np.max(T,axis=0) )  # axis=0 runs down the rows, so this gives the max of each column
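
The axis argument is easy to get backwards, so here is a small sketch on a made-up 2 x 3 array showing what each axis collapses.

```python
import numpy as np

M = np.array([[1, 5, 2],
              [7, 0, 3]])

col_max = M.max(axis=0)  # collapse the rows: one max per column
row_max = M.max(axis=1)  # collapse the columns: one max per row

print(col_max)  # [7 5 3]
print(row_max)  # [5 7]
```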

In [ ]:

# R
# Functional notation only: apply() with margin 2 gives the max of each column,
# matching axis=0 in NumPy (margin 1 would give the max of each row).
print(apply(T, 2, max))

In [ ]:

help(apply)

Combining arrays for data wrangling.¶

In [ ]:

# PYTHON
# In general, arrays being concatenated (merged or pasted together) must have the same number of
# dimensions, and their shapes must match along every axis except the one being joined.

#help(np.concatenate)
#np.concatenate((y,y2))  # Not permitted: y is 1D and y2 is 2D, so this raises a ValueError.

np.concatenate((y[np.newaxis,:],y2))   # Expand the dimensions of y before concatenating.

# Stack vertically.  This has the same effect as the concatenate call above.
np.vstack((y,y2))   # Permitted, because vstack promotes y to 2D and the column dimensions match

# Stack horizontally.
#np.hstack((y,y2))   # Not permitted, because y is 1D and y2 is 2D

np.hstack((y[np.newaxis,:],y2))   # Permitted once y is 2D: both have one row, giving shape (1, 6)
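
To see the failure without stopping the notebook, the error can be caught. This is a minimal sketch that redefines y and y2 locally so it runs on its own; the variable names failed, stacked_rows and stacked_cols are introduced here for illustration.

```python
import numpy as np

y = np.array([1, 2, 3])             # shape (3,)
y2 = np.array([1, 2, 3], ndmin=2)   # shape (1, 3)

# Mixing a 1D and a 2D array fails, so catch the error to inspect it
try:
    np.concatenate((y, y2))
    failed = False
except ValueError as err:
    failed = True
    print("concatenate failed:", err)

# After promoting y to shape (1, 3), joining along either axis works
stacked_rows = np.concatenate((y[np.newaxis, :], y2), axis=0)
stacked_cols = np.concatenate((y[np.newaxis, :], y2), axis=1)

print(stacked_rows.shape)  # (2, 3)
print(stacked_cols.shape)  # (1, 6)
```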

In [ ]:

# R
# In general, when concatenating (merging or pasting together) arrays, their shapes must match
# along every axis except the one being joined.
# In R, the equivalents for concatenating arrays are:
#
#    c(): Flattens its arguments and joins them into a single vector.
#    rbind(): Concatenates vectors, matrices or data frames row-wise.
#    cbind(): Concatenates vectors, matrices or data frames column-wise.
#
# rbind() treats a plain vector as a single row, so no explicit equivalent of
# np.newaxis is needed before stacking y on top of y2.

# help(rbind)
c(y, y2)  # Permitted in R, but the result is a flat vector of 6 elements, not a 2D array.

# Stack vertically.
rbind(y, y2)  # Permitted, because y is treated as a row and the column dimensions match

# Stack horizontally.
# cbind(y, y2) treats y as a column, which does not match the single row of y2.
# Convert y to a 1 x 3 matrix first.
cbind(matrix(y, nrow = 1), y2)

Indexing and boolean operations for 2D arrays¶

In [ ]:

# PYTHON

z = np.ones((100,50)) # Make a 2D array of ones that is 100 x 50.

# Index individual row or column in 2D array

# Save a single row of z to a new variable
zr = z[9,:]

# Save a single column of z to a new variable
zc = z[:,9]

print(zc.shape,zr.shape, z.shape)

In [ ]:

# R

z <- matrix(1, nrow = 100, ncol = 50) # Make a 2D array of ones that is 100 x 50.

# Index individual row or column in 2D array

# Save a single row of z to a new variable
zr <- z[9, ]

# Save a single column of z to a new variable
zc <- z[, 9]

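# zc and zr are plain vectors, so dim() returns NULL for them; use length() instead.
print(c(length(zc), length(zr)))
print(dim(z))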

In [ ]:

#print(z)

In [ ]:

# Make a 2D column vector with 10 elements in it. 
a = 3.2*np.ones([10,1])

# Copy that column vector 10 times to make a square array.
b = np.tile(a,10)


# Make a vector of 10 elements and then place them in the diagonal of a 10 x 10 square array.
c = np.ones(10)*100
d = np.diag(c)

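# Use the Matplotlib spy() function to visualize the array b+d.
# With precision=10, only elements whose absolute value exceeds 10 are marked (here, the diagonal).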
plt.spy(b+d,precision=10,markersize=10)
plt.show()

#print(b+d)


#np.random.randn(10)

In [ ]:

# R
# Make a 2D column vector with 10 elements in it.
a <- 3.2 * matrix(1, nrow = 10, ncol = 1)

# Copy that column vector 10 times to make a square array.
b <- matrix(a, nrow = 10, ncol = 10)


# Make a vector of 10 elements and then place them in the diagonal of a 10 x 10 square array.
c <- 100 * rep(1, 10)
d <- diag(c)

# Use base R's image() function to visualize the array b + d.
# This is a rough stand-in for plt.spy(), not an exact equivalent: elements above 10 are
# drawn in black and the rest in white. The rows are reversed and the matrix transposed
# so that row 1 is drawn at the top, as spy() does.
m <- 1 * ((b + d) > 10)
image(1:10, 1:10, t(m[10:1, ]), col = c("white", "black"),
      axes = FALSE, xlab = "", ylab = "", asp = 1)

# No need for plt.show() in R, the plot is displayed automatically

In [ ]:

#help(t)

In [ ]:

# PYTHON AND R
# Check out shape, ndim, dtype
z = d+b

#print(z.shape)

#print(z.dtype)

#print(z)

In [ ]:

# PYTHON
# Use boolean operators to change values.

z2 = z.copy()

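# Find all the values equal to 1.
# Note: at this point z2 holds 3.2 off the diagonal and 103.2 on it, so no element
# is equal to 1 and id1 is all False.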
id1 = (z2 == 1)

# id1 now has a boolean record of which values are ==1.
print(id1)

In [ ]:

# R
# Use boolean operators to change values.

z2 <- z

# Find all the values equal to 1.
id1 <- z2 == 1

# id1 now has a boolean record of which values are ==1.
print(id1)
print(z2)

In [ ]:

# PYTHON AND R
# Let's change all the elements equal to 1.

z2[id1] = 3.14159

#z2[z2 == 3.14159] = np.nan

print(z2)
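
The same mask-then-assign pattern works with any comparison. The sketch below uses a short made-up vector (v, w and n_small are illustrative names) so the effect of the mask is easy to read off.

```python
import numpy as np

v = np.array([1.0, 2.5, 1.0, 4.0, 0.5])

# Boolean mask: True where the condition holds
mask = v < 2

# Count the matches, then overwrite them in a copy so v stays intact
n_small = mask.sum()
w = v.copy()
w[mask] = 0.0

print(mask)
print(n_small)  # 3
print(w)
```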

Let's try this exercise together. How many ways can this be solved, algorithmically?

Algorithm 1:

1. Create the array a.
2. Divide all elements in a by 3.
3. Look for values with a remainder of zero.
4. Create a boolean array to subindex a.

Algorithm 2:

1. Create the array a.
2. Divide all elements in a by 3.
3. Check to see which elements are equal to their integer counterparts.
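
The two algorithms above can be sketched as follows. The exercise statement itself is not reproduced here, so this assumes the goal is to pick out the elements of a that are divisible by 3, and it assumes a holds the integers 1 to 20. Both are assumptions made for illustration.

```python
import numpy as np

a = np.arange(1, 21)  # assumed example data: integers 1..20

# Algorithm 1: the remainder after dividing by 3 is zero
mask1 = (a % 3) == 0

# Algorithm 2: the quotient equals its integer counterpart
q = a / 3
mask2 = q == q.astype(int)

print(a[mask1])
print(np.array_equal(mask1, mask2))
```

Both masks select the same elements for this integer input. With floating-point data, the equality test in Algorithm 2 can be sensitive to rounding, so the remainder test is usually the safer choice.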

Finish the notebook by solving the cells below with code.¶

In [ ]:

# Convert all the values of z2 that are > 99 into NaNs.

In [ ]:

# Make a 2D NumPy array named Arr of size 10 x 10 and fill it with random values that range between 0 and 99.  You can use NumPy's random module.

In [ ]:

# Use boolean indexing to replace all the values in Arr that are greater than 80 or less than 20 with NaNs.

In [ ]:

# Use the append() or concatenate() commands in numpy to add more columns to Arr.

Concept Review: More looping practice

In [ ]:

# We already saw that np.diag() can insert elements along the diagonal of a square array.
# Use your understanding of for loops to carry out the same operation.
# Create a 10 x 10 array of ones and then modify the main diagonal to be 101 instead of 1.
# Hint: You will need two indices, e.g. i and j, to specify the row and column to modify.
# Hint: You can use a boolean operator to decide which elements in the square array to modify.