import numpy as np
def selection_sort(x):
for i in range(len(x)):
swap = i + np.argmin(x[i:])
(x[i], x[swap]) = (x[swap], x[i])
return x
x = np.array([2, 1, 4, 3, 5])
selection_sort(x)
def bogosort(x):
while np.any(x[:-1] > x[1:]):
np.random.shuffle(x)
return x
x = np.array([2, 1, 4, 3, 5])
bogosort(x)
np.sort
and np.argsort
sort
and sorted
functions to work with lists, we won't discuss them here because NumPy's np.sort
function turns out to be much more efficient and useful for our purposes.
By default np.sort
uses an , quicksort algorithm, though mergesort and heapsort are also available. For most applications, the default quicksort is more than sufficient.np.sort
:x = np.array([2, 1, 4, 3, 5])
np.sort(x)
sort
method of arrays:x.sort()
print(x)
argsort
, which instead returns the indices of the sorted elements:x = np.array([2, 1, 4, 3, 5])
i = np.argsort(x)
print(i)
x[i]
axis
argument. For example:rand = np.random.RandomState(42)
X = rand.randint(0, 10, (4, 6))
print(X)
# sort each column of X
np.sort(X, axis=0)
# sort each row of X
np.sort(X, axis=1)
np.partition
function. np.partition
takes an array and a number K; the result is a new array with the smallest K values to the left of the partition, and the remaining values to the right, in arbitrary order:x = np.array([7, 2, 3, 1, 6, 5, 4])
np.partition(x, 3)
np.partition(X, 2, axis=1)
np.argsort
that computes indices of the sort, there is a np.argpartition
that computes indices of the partition.
We'll see this in action in the following section.argsort
function along multiple axes to find the nearest neighbors of each point in a set.
We'll start by creating a random set of 10 points on a two-dimensional plane.
Using the standard convention, we'll arrange these in a array:X = rand.rand(10, 2)
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn; seaborn.set() # Plot styling
plt.scatter(X[:, 0], X[:, 1], s=100);
dist_sq = np.sum((X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2, axis=-1)
# for each pair of points, compute differences in their coordinates
differences = X[:, np.newaxis, :] - X[np.newaxis, :, :]
differences.shape
# square the coordinate differences
sq_differences = differences ** 2
sq_differences.shape
# sum the coordinate differences to get the squared distance
dist_sq = sq_differences.sum(-1)
dist_sq.shape
dist_sq.diagonal()
np.argsort
to sort along each row. The leftmost columns will then give the indices of the nearest neighbors:nearest = np.argsort(dist_sq, axis=1)
print(nearest)
np.argpartition
function:K = 2
nearest_partition = np.argpartition(dist_sq, K + 1, axis=1)
plt.scatter(X[:, 0], X[:, 1], s=100)
# draw lines from each point to its two nearest neighbors
K = 2
for i in range(X.shape[0]):
for j in nearest_partition[i, :K+1]:
# plot a line from X[i] to X[j]
# use some zip magic to make it happen:
plt.plot(*zip(X[j], X[i]), color='black')