These lecture notebooks are here:
https://github.com/spacetelescope/pylunch/tree/master/session3
A prettified version is here:
http://spacetelescope.github.io/pylunch
If you have not yet signed up for the mailing list, please do so here:
http://bit.ly/stsci-pylunch-signup
If you want to do a more challenging set of exercises:
https://github.com/spacetelescope/pylunch/blob/master/session3/numpy100-qs.ipynb
NumPy is short for "Numeric Python" or "Numerical Python". It is an open source extension module for Python that provides fast precompiled functions for mathematical and numerical routines. NumPy also adds powerful data structures for efficient computation on multi-dimensional arrays and matrices, and the implementation is designed to handle very large ones. In addition, the module supplies a large library of high-level mathematical functions that operate on these arrays and matrices.
An important, subtle difference from, e.g., MATLAB or IDL: NumPy arrays are not part of the core language, so they can be developed, extended, and modified without installing a new Python.
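As a rough illustration of the "fast precompiled functions" point, the sketch below squares a sequence of numbers twice: once with a plain Python list comprehension and once with a single NumPy expression. The variable names are made up for this example.

```python
import numpy as np

values = list(range(10))

# Pure Python: loop over the list element by element
squares_list = [v ** 2 for v in values]

# NumPy: one vectorized expression, the loop runs in compiled code
arr = np.array(values)
squares_arr = arr ** 2

print(squares_list)
print(squares_arr)
```

Both give the same numbers. The NumPy version avoids the explicit Python-level loop, which is where the speed-up on large arrays usually comes from.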
Advantages of using Numpy with Python:
But before we dive into Numpy, let's take a detour through Python data types...
# Numbers can be integers (including long), float, complex or boolean
a = 5
print('Integer:', a, type(a))
b = 51924361948403939480293840938
print('Long integer:', b, type(b))
c = 10.7
print('Float', c, type(c))
t, f = True, False
print('Boolean: ',t, type(t))
d = 9.322e-36j
print('Complex:', d, type(d))
Integer: 5 <class 'int'>
Long integer: 51924361948403939480293840938 <class 'int'>
Float 10.7 <class 'float'>
Boolean: True <class 'bool'>
Complex: 9.322e-36j <class 'complex'>
# Strings are straightforward, as we saw with 'Hello World!'
string = 'A moose once bit my sister.'
print(string)
A moose once bit my sister.
# Lists are the ones most similar to arrays, but not quite.
ll = [1.2, 23.6, 'foo', 11] ### <---- lists use square brackets!
print(ll)
# Lists are easy to append! Use in cases where you do not know the size of the input array!
ll.append('temp')
print(ll)
import numpy as np
tt = np.random.rand(20)
num = []
for t in tt:
    if t > 0.5:
        num.append(t)
print(len(num))
[1.2, 23.6, 'foo', 11]
[1.2, 23.6, 'foo', 11, 'temp']
6
# Tuples: similar to lists, but have an interesting quality: once created they cannot be changed.
# i.e., they are "immutable", they cannot be sorted, appended, etc.
# This is good for certain cases (e.g. they can be keys to a dictionary, let you "protect" data),
# but generally they are less useful for scientific computing than lists/arrays
tup = (1,2,3,6.7) ### <---- use rounded brackets for tuples
print(tup, type(tup))
(1, 2, 3, 6.7) <class 'tuple'>
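To make the "immutable" point concrete, here is a small sketch of what happens when you try to change a tuple, and of a tuple used as a dictionary key. The `positions` dictionary and its contents are invented for illustration.

```python
tup = (1, 2, 3, 6.7)

# Assigning to an element of a tuple raises TypeError
try:
    tup[0] = 99
    modified = True
except TypeError:
    modified = False

print('Tuple was modified:', modified)

# Because tuples are immutable (and hashable), they can be dictionary keys
positions = {(0, 0): 'origin', (1, 2): 'somewhere else'}
print(positions[(1, 2)])
```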
# Dictionaries: we are just mentioning here that they exist. More elsewhere.
dd = {}
dd['lock'] = 1
dd['key'] = 2
print(dd)
{'key': 2, 'lock': 1}
Numpy arrays are a different data type, beyond the five above: the ndarray. Numpy arrays can only contain one type of data but there are lots of options as to what that type is. A full list of Numpy data types can be found here:
http://docs.scipy.org/doc/numpy/user/basics.types.html
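A short sketch of the "one type per array" rule: when the input mixes integers and floats, NumPy picks a single type that can hold everything, and `astype` converts explicitly. The variable names are just for this example.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])        # the integers are upcast to float
as_float = ints.astype(np.float64)   # explicit conversion, returns a new array

print(ints.dtype)      # a platform-dependent integer type
print(mixed.dtype)
print(as_float.dtype)
```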
The full list of array creation routines is here: http://docs.scipy.org/doc/numpy-1.10.1/reference/routines.array-creation.html
# values evenly spaced within an interval, specify the STEP:
# np.arange(start, stop, step)
np.arange(0, 10, 1, dtype=float) # np.float was removed from recent NumPy; if you don't specify the data type, NumPy infers it from the arguments (integers here)
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
# values evenly spaced within an interval, specify the NUMBER OF VALUES:
# np.linspace(start, stop, num=10)
np.linspace(0, 9, 10, dtype=int)
# there is also np.logspace
#np.logspace(0,1,10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
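Since `np.logspace` is only mentioned in passing, here is a minimal sketch of how it relates to `np.linspace`: the start and stop arguments are exponents (base 10 by default), not the values themselves.

```python
import numpy as np

# np.logspace(start, stop, num) returns 10**start ... 10**stop, evenly spaced in log
decades = np.logspace(0, 2, 3)
print(decades)

# Equivalent construction from linspace
same = 10 ** np.linspace(0, 2, 3)
print(np.allclose(decades, same))
```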
# create array from existing data:
a = [1,2,3,4,5]
b = np.array(a)
b
array([1, 2, 3, 4, 5])
# another way to get a pre-filled array is to set all values to ones, zeros, or leave them empty:
a = np.ones(10)
print('Ones:', a)
b = np.zeros(10)
print('Zeros:', b)
c = np.empty(10, dtype=str)  # np.str was removed from recent NumPy; the builtin str works
print('Empty:', c)
Ones: [ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Zeros: [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Empty: ['' '' '' '' '' '' '' '' '' '']
# if you already have an array A and want one that is the same size but with different values,
# here are a couple handy ways to accomplish this
a = [[1,2,3],[4,5,6],[7,8,9]]
b = np.zeros_like(a, dtype=float)
print('Zeros: ', b)
c = np.ones_like(a, dtype=float)
print('Ones: ', c)
d = np.empty_like(a, dtype=float)  # contents are uninitialized, so the values are arbitrary
print('Empty: ', d)
Zeros: [[ 0. 0. 0.]
 [ 0. 0. 0.]
 [ 0. 0. 0.]]
Ones: [[ 1. 1. 1.]
 [ 1. 1. 1.]
 [ 1. 1. 1.]]
Empty: [[ 0. 0. 0.]
 [ 0. 0. 0.]
 [ 0. 0. 0.]]
You can explore other options on your own.
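As a starting point for that exploration, two other creation routines that come up often are `np.full` and `np.eye`. This is only a small sample, not a complete tour.

```python
import numpy as np

filled = np.full((2, 3), 7.0)   # 2 x 3 array where every element is 7.0
identity = np.eye(3)            # 3 x 3 identity matrix

print(filled)
print(identity)
```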
1) Create a one-dimensional array, print the length and shape of the array
2) Create a two-dimensional array, print the length and shape of the array
3) Add the two arrays created above -- what happens?
4) Create a 100 x 100 array of integers, and trim off the top/bottom rows, and left/right columns
5) Write out and use an index array to select out positive values from this array
np.array([1, -1, -2, 3, -5])
6) Experiment with arange, ones, and zeros to create arrays of different shapes
7) Using a boolean array mask, select out the elements of the following array between 5 and 10:
np.array([0.6429498677659073, 1.150547235455569, 1.1915607017440888, 8.283179653420964, 5.1635384867953595, 8.06221365954315, 5.941607350505754, 9.426996923221827, 9.828300195624534, 8.061581259382875, 9.350471376998248, 2.5337332496612266, 3.8933693630535062, 7.854245437743151, 0.7965058455412621, 2.7207245408915623, 4.693244676240291, 1.3620057998648716, 8.880004623574631, 6.504379354779315])
If you come from IDL, you probably LOVE the "where" function. A similar function exists in Numpy:
ll = np.linspace(0,20,20)
idx = np.where(ll > 10)
print('Indexes: ', idx)
print('Selection: ', ll[idx])
Indexes: (array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),)
Selection: [ 10.52631579 11.57894737 12.63157895 13.68421053 14.73684211 15.78947368 16.84210526 17.89473684 18.94736842 20. ]
But. You should not use it. Why? Because boolean arrays.
a = np.array([1,2,3,5,8,13])
a > 3 # the result is a boolean array!
array([False, False, False, True, True, True], dtype=bool)
# expressions can be combined:
# "|" == "or"
# "&" == "and"
# must use | and & with numpy arrays
print((a > 3) | (a == 1))
print((a > 2) & (a < 10))
[ True False False True True True]
[False False True True True False]
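Why "must use | and &"? A quick sketch of what goes wrong with the Python keyword `and`: it asks each array for a single True/False value, which NumPy refuses to give for arrays with more than one element. The `keyword_worked` flag is only there to make the outcome visible.

```python
import numpy as np

a = np.array([1, 2, 3, 5, 8, 13])

# Element-wise "and": the parentheses are required because & binds
# more tightly than the comparison operators
mask = (a > 2) & (a < 10)
print(mask)

# The keyword "and" tries to reduce each array to one truth value
try:
    (a > 2) and (a < 10)
    keyword_worked = True
except ValueError as err:
    keyword_worked = False
    print('ValueError:', err)
```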
# "~" is the inverse operator:
~np.array([True, True, False])
array([False, False, True], dtype=bool)
# How is this useful?
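# ra is a stand-in here: random right ascension values in degrees, generated only so the cell runs
ra = np.random.uniform(0, 360, 1000)
idx = (ra > 11.1324) & (ra < 31.5134)
selected = ra[idx]
not_selected = ra[~idx]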
x = np.random.randn(100000)
%timeit np.where(x<3)
The slowest run took 6.62 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 96.7 µs per loop
%timeit x<0
The slowest run took 6.38 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 28.3 µs per loop
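Speed aside, a boolean mask and the output of `np.where` select exactly the same elements. A minimal check, reusing the `linspace` array from the `where` example:

```python
import numpy as np

ll = np.linspace(0, 20, 20)

idx = np.where(ll > 10)   # tuple of index arrays
mask = ll > 10            # boolean array with the same shape as ll

same_selection = np.array_equal(ll[idx], ll[mask])
n_selected = mask.sum()   # True counts as 1, so this is the number of selected elements

print(same_selection, n_selected)
```

The mask also gives you the complement for free via `ll[~mask]`, which takes extra work with index arrays.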
# Randoms
# https://docs.scipy.org/doc/numpy-dev/reference/routines.random.html
# Linear algebra:
# https://docs.scipy.org/doc/numpy-dev/reference/routines.linalg.html
# Stats
# https://docs.scipy.org/doc/numpy-dev/reference/routines.statistics.html
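To give a flavor of those three areas, here is a brief sketch touching each one. It assumes a NumPy version recent enough to have `np.random.default_rng` (1.17 or later); the matrix `A` and vector `b` are made-up numbers.

```python
import numpy as np

# Randoms: a seeded generator gives reproducible draws
rng = np.random.default_rng(42)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# Stats: basic summary statistics of the sample
print('mean:', samples.mean(), 'std:', samples.std())

# Linear algebra: solve the system A x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
print('solution:', x)
```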
http://docs.scipy.org/doc/numpy/
http://docs.scipy.org/doc/numpy/user/quickstart.html
http://docs.scipy.org/doc/numpy/user/basics.html
http://www.python-course.eu/numpy.php