import numpy as np
import pandas as pd
# use pandas to extract rainfall inches as a NumPy array
rainfall = pd.read_csv('data/Seattle2014.csv')['PRCP'].values
inches = rainfall / 254.0  # 1/10mm -> inches
inches.shape
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn; seaborn.set() # set plot styles
plt.hist(inches, 40);
Using +, -, *, /, and other arithmetic operators on arrays leads to elementwise operations.
NumPy also implements comparison operators such as < (less than) and > (greater than) as elementwise ufuncs.
The result of these comparison operators is always an array with a Boolean data type.
All six of the standard comparison operations are available:
x = np.array([1, 2, 3, 4, 5])
x < 3 # less than
x > 3 # greater than
x <= 3 # less than or equal
x >= 3 # greater than or equal
x != 3 # not equal
x == 3 # equal
(2 * x) == (x ** 2)  # elementwise comparison of two array expressions
The comparison operators are implemented as ufuncs: when you write x < 3, internally NumPy uses np.less(x, 3).
A summary of the comparison operators and their equivalent ufuncs is shown here:

Operator    Equivalent ufunc
==          np.equal
!=          np.not_equal
<           np.less
<=          np.less_equal
>           np.greater
>=          np.greater_equal

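As a quick check of the operator and ufunc pairs listed above, the following sketch compares each operator form with its ufunc form on a small array. The names a and pairs are illustrative and do not appear in the original example.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])  # illustrative data

# Each tuple pairs an operator result with the matching ufunc result.
pairs = [
    (a == 3, np.equal(a, 3)),
    (a != 3, np.not_equal(a, 3)),
    (a < 3, np.less(a, 3)),
    (a <= 3, np.less_equal(a, 3)),
    (a > 3, np.greater(a, 3)),
    (a >= 3, np.greater_equal(a, 3)),
]

for op_result, ufunc_result in pairs:
    # Both forms return Boolean arrays with identical contents.
    assert op_result.dtype == bool
    assert np.array_equal(op_result, ufunc_result)

print(a < 3)
```
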
rng = np.random.RandomState(0)
x = rng.randint(10, size=(3, 4))
x
x < 6
We'll work with x, the two-dimensional array we created earlier.
print(x)
To count the number of True entries in a Boolean array, np.count_nonzero is useful:
# how many values less than 6?
np.count_nonzero(x < 6)
Another way to get this count is to use np.sum; in this case, False is interpreted as 0, and True is interpreted as 1:
np.sum(x < 6)
The benefit of sum() is that, as with other NumPy aggregation functions, this summation can be done along rows or columns as well:
# how many values less than 6 in each row?
np.sum(x < 6, axis=1)
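The counting patterns above can be tried on a small hand-written array whose values are easy to verify by eye. This is only a sketch: the array m and the variable names are illustrative assumptions, and np.mean is used here as one convenient way to get the fraction of True entries.

```python
import numpy as np

m = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])  # illustrative values
mask = m < 6

total = np.count_nonzero(mask)   # number of True entries overall
per_row = np.sum(mask, axis=1)   # number of True entries in each row
per_col = np.sum(mask, axis=0)   # number of True entries in each column
fraction = np.mean(mask)         # True counts as 1, so the mean is a fraction

print(total, per_row, per_col, fraction)
```
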
To quickly check whether any or all of the values are true, we can use np.any or np.all:
# are there any values greater than 8?
np.any(x > 8)
# are there any values less than zero?
np.any(x < 0)
# are all values less than 10?
np.all(x < 10)
# are all values equal to 6?
np.all(x == 6)
np.all and np.any can be used along particular axes as well. For example:
# are all values in each row less than 8?
np.all(x < 8, axis=1)
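A minimal sketch of the axis-wise checks, again on an illustrative hand-written array m rather than the random array used in the text. It also shows that the array methods .all() and .any() give the same answers as the NumPy functions.

```python
import numpy as np

m = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])  # illustrative values

rows_all_below_8 = np.all(m < 8, axis=1)   # one Boolean per row
cols_any_above_8 = np.any(m > 8, axis=0)   # one Boolean per column

# The array methods give the same answers as the functions.
assert np.array_equal(rows_all_below_8, (m < 8).all(axis=1))
assert np.array_equal(cols_any_above_8, (m > 8).any(axis=0))

print(rows_all_below_8)
print(cols_any_above_8)
```
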
A quick warning: Python has built-in sum(), any(), and all() functions. These have a different syntax than the NumPy versions, and in particular will fail or produce unintended results when used on multidimensional arrays. Be sure that you are using np.sum(), np.any(), and np.all() for these examples!
Conditions can be combined with Python's bitwise logic operators: &, |, ^, and ~.
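Like the standard arithmetic operators, NumPy overloads these as ufuncs that work elementwise on (usually Boolean) arrays.
np.sum((inches > 0.5) & (inches < 1))
The parentheses here are important: because of operator precedence rules, with the parentheses removed this expression would be evaluated as follows, which results in an error:
inches > (0.5 & inches) < 1
Because A AND B is equivalent to NOT (NOT A OR NOT B), the same result can be computed in a different manner:
np.sum(~((inches <= 0.5) | (inches >= 1)))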
Operator    Equivalent ufunc
&           np.bitwise_and
|           np.bitwise_or
^           np.bitwise_xor
~           np.bitwise_not
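The operator and ufunc forms above can be compared directly. The sketch below uses a small made-up array, vals, in place of the rainfall data (an assumption, so that it runs without the CSV file), and checks that the compound condition and its NOT/OR rewrite select the same entries.

```python
import numpy as np

# Made-up values standing in for the rainfall data.
vals = np.array([0.0, 0.1, 0.6, 0.75, 1.2, 0.5, 0.99])

above = vals > 0.5
below = vals < 1

between = above & below                          # AND of the two conditions
de_morgan = ~((vals <= 0.5) | (vals >= 1))       # NOT (NOT A OR NOT B)
exactly_one = above ^ below                      # XOR: only one condition holds

# Operator forms match their ufunc equivalents.
assert np.array_equal(between, np.bitwise_and(above, below))
assert np.array_equal(above | below, np.bitwise_or(above, below))
assert np.array_equal(exactly_one, np.bitwise_xor(above, below))
assert np.array_equal(~above, np.bitwise_not(above))

print(np.sum(between), np.sum(de_morgan), np.sum(exactly_one))
```
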
print("Number days without rain: ", np.sum(inches == 0))
print("Number days with rain: ", np.sum(inches != 0))
print("Days with more than 0.5 inches:", np.sum(inches > 0.5))
print("Rainy days with < 0.2 inches :", np.sum((inches > 0) &
                                                (inches < 0.2)))
Returning to our x array from before, suppose we want an array of all values in the array that are less than, say, 5:
x
x < 5
x[x < 5]
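What is returned is a one-dimensional array of all the values in positions at which the mask array is True.
# construct a mask of all rainy days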
rainy = (inches > 0)
# construct a mask of all summer days (June 21st is the 172nd day)
days = np.arange(365)
summer = (days > 172) & (days < 262)
print("Median precip on rainy days in 2014 (inches): ",
      np.median(inches[rainy]))
print("Median precip on summer days in 2014 (inches): ",
      np.median(inches[summer]))
print("Maximum precip on summer days in 2014 (inches): ",
      np.max(inches[summer]))
print("Median precip on nonsummer rainy days (inches):",
      np.median(inches[rainy & ~summer]))
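The same masking pattern can be tried without the CSV file. The sketch below uses a short made-up precipitation array; the names precip, wet, and late, and the cutoff at day 4, are illustrative assumptions rather than part of the original analysis.

```python
import numpy as np

# Made-up daily precipitation values (inches), one per day.
precip = np.array([0.0, 0.25, 0.0, 1.5, 0.125, 0.0, 0.75, 0.0])
day = np.arange(precip.size)

wet = precip > 0     # mask of days with any precipitation
late = day >= 4      # mask of days in the second half of the period

wet_values = precip[wet]                      # 1-D array of selected values
median_wet = np.median(wet_values)
median_wet_late = np.median(precip[wet & late])
max_wet_early = np.max(precip[wet & ~late])

print(wet_values)
print(median_wet, median_wet_late, max_wet_early)
```

Combining masks with & and ~ before indexing keeps each condition readable on its own line.
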
One common point of confusion is the difference between the and/or keywords on one hand, and the & and | operators on the other hand.
When would you use one versus the other?
The difference is this: the and/or keywords gauge the truth or falsehood of an entire object, while & and | refer to bits within each object.
When you use either keyword, it's equivalent to asking Python to treat the object as a single Boolean entity.
In Python, all nonzero integers will evaluate as True. Thus:
bool(42), bool(0)
bool(42 and 0)
bool(42 or 0)
When you use & and | on integers, the expression operates on the bits of the element, applying the and or the or to the individual bits making up the number:
bin(42)
bin(59)
bin(42 & 59)
bin(42 | 59)
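To see the two behaviors side by side, the short sketch below applies the keywords and the operators to the same pair of integers. The variable names are illustrative; format(..., '06b') is used only to print fixed-width binary strings.

```python
# Keywords operate on whole objects and return one of the operands.
kw_and = 42 and 0   # 0, because the second operand is falsy
kw_or = 42 or 0     # 42, because the first operand is truthy

# Operators combine the operands bit by bit.
bit_and = 42 & 59
bit_or = 42 | 59

print(kw_and, kw_or)
print(format(42, '06b'))        # 101010
print(format(59, '06b'))        # 111011
print(format(bit_and, '06b'))   # 101010
print(format(bit_or, '06b'))    # 111011
```
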
An array of Boolean values in NumPy can be thought of as a string of bits where 1 = True and 0 = False, and the result of & and | operates similarly to above:
A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)
A | B
Using or on these arrays will try to evaluate the truth or falsehood of the entire array object, which is not a well-defined value:
A or B
ValueError                                Traceback (most recent call last)
<ipython-input-38-5d8e4f2e21c0> in <module>()
----> 1 A or B

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Similarly, when doing a Boolean expression on a given array, you should use | or & rather than the or/and keywords:
x = np.arange(10)
(x > 4) & (x < 8)
Trying to evaluate the truth or falsehood of the entire array will give the same ValueError we saw previously:
(x > 4) and (x < 8)
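The contrast can be captured in a few lines. This sketch uses an illustrative array v and catches the ValueError so that the script runs to completion; np.logical_and is shown as another elementwise option for Boolean arrays and is not part of the original discussion.

```python
import numpy as np

v = np.arange(10)

# Elementwise combination works with the bitwise operator.
elementwise = (v > 4) & (v < 8)
print(v[elementwise])

# np.logical_and gives the same result for Boolean inputs.
assert np.array_equal(elementwise, np.logical_and(v > 4, v < 8))

# The keyword form asks for the truth value of a whole array and raises.
try:
    (v > 4) and (v < 8)
    raised = False
except ValueError:
    raised = True
print(raised)
```
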