# Data
from sklearn.datasets import make_blobs
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import font_manager
"C:\Windows\Fonts\FiraSans-Regular.ttf")
font_manager.fontManager.addfont("font.family"] = "Fira Sans"
plt.rcParams[283)
np.random.seed(
= 120
n = 3
p_features
= make_blobs(
X, y =n,
n_samples=p_features - 1,
n_features=[(-1.7, -1.7), (1.7, 1.7)]
centers
)
= plt.scatter(X[:, 0], X[:, 1], c=y)
fig = plt.xlabel("Feature 1")
xlab = plt.ylabel("Feature 2") ylab
Image credits: BruceBlaus under CC BY 3.0 via Wikimedia Commons.
Python source: 451-blog/perceptron.py at main · doabell/451-blog
Instructions can be found at Implementing the Perceptron Algorithm.
Experiments
Linearly separable data
Through trial and error, we generate some linearly separable data and test the perceptron algorithm on it.
There is a clear line that separates the two blobs here.
Let us find out if our perceptron algorithm can find that line.
First, we fit the algorithm with at most 1000 steps:
# Algorithm
from perceptron import Perceptron

p = Perceptron()
p.fit(X, y, max_steps=1000)
Then, we can visualize the accuracy of each iteration:
# score history
fig = plt.plot(p.history)
xlab = plt.xlabel("Iteration")
ylab = plt.ylabel("Accuracy")
We can see that for this example, w changed only 4 times before the model reached 100% accuracy and stopped running.
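The Perceptron class itself comes from the post's perceptron.py (linked above). For readers following along without that file, here is a minimal sketch of the interface this post relies on: a fit method that updates w only when a randomly chosen point is misclassified, records the accuracy after every iteration in history, and stops early once the accuracy reaches 100%. This is an illustrative reconstruction consistent with the lines quoted in the complexity note below, not the original implementation.

```python
# Minimal sketch of the Perceptron interface used in this post.
# Illustration only; the actual code lives in perceptron.py.
import numpy as np

class Perceptron:
    def fit(self, X, y, max_steps=1000):
        X_ = np.append(X, np.ones((X.shape[0], 1)), 1)  # append a constant feature (cf. line 39)
        y_ = 2 * y - 1                                   # relabel {0, 1} as {-1, 1} (assumed convention)
        self.w = np.random.rand(X_.shape[1])             # random initial weights
        self.history = []
        for _ in range(max_steps):
            i = np.random.randint(X_.shape[0])           # pick one random point
            # move w only if point i is misclassified (cf. line 51 / Equation 1)
            self.w = self.w + ((y_[i] * self.w @ X_[i]) < 0) * y_[i] * X_[i]
            self.history.append(self.score(X, y))        # record accuracy each iteration
            if self.history[-1] == 1.0:                  # stop early once perfectly separated
                break

    def score(self, X, y):
        X_ = np.append(X, np.ones((X.shape[0], 1)), 1)
        y_ = 2 * y - 1
        return np.mean(y_ * (X_ @ self.w) > 0)           # fraction of points classified correctly
```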
We can then visualize the dividing line found by the perceptron:
# visualize
def draw_line(w, x_min, x_max):
    x = np.linspace(x_min, x_max, 101)
    y = -(w[0]*x + w[2])/w[1]
    plt.plot(x, y, color="black")

fig = plt.scatter(X[:, 0], X[:, 1], c=y)
fig = draw_line(p.w, -2, 2)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")
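As a quick check on the formula inside draw_line: since the constant feature is appended as the last column, the learned boundary is the set of points where $w_0 x_1 + w_1 x_2 + w_2 = 0$, i.e.

$$ x_2 = -\frac{w_0 x_1 + w_2}{w_1}, $$

which is exactly the line the function plots.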
# score, should be 1
p.score(X, y)
1.0
That line divides the two blobs right down the middle, resulting in an accuracy of 1.0. Not bad!
Not linearly separable
Next, we consider 2-d data that is not linearly separable.
np.random.seed(227)

n = 120
p_features = 3

X, y = make_blobs(
    n_samples=n,
    n_features=p_features - 1,
    centers=[(-1.7, -1.7), (1.7, 1.7)]
)

fig = plt.scatter(X[:, 0], X[:, 1], c=y)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")
The dividing line should be at about the same position, but now the two blobs overlap with each other.
Let us observe whether the perceptron algorithm still manages to find that line.
# Algorithm
p = Perceptron()
p.fit(X, y, max_steps=1000)

# score history
fig = plt.plot(p.history)
xlab = plt.xlabel("Iteration")
ylab = plt.ylabel("Accuracy")
We can see that given data that is not linearly separable, w does not converge.
Instead, the algorithm runs for the full 1000 iterations, and the accuracy fluctuates between roughly 0.95 and 0.97.
This is expected, because there is no line that perfectly separates these two blobs.
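To make the fluctuation concrete, we can inspect the tail of the recorded accuracies directly (assuming, as in the sketch above, that p.history is an ordinary list of per-iteration scores):

```python
# range of accuracies over the last 100 iterations (illustrative check)
tail = np.array(p.history[-100:])
print(f"accuracy ranges from {tail.min():.3f} to {tail.max():.3f}")
```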
As for the dividing line:
# visualize
def draw_line(w, x_min, x_max):
    x = np.linspace(x_min, x_max, 101)
    y = -(w[0]*x + w[2])/w[1]
    plt.plot(x, y, color="black")

fig = plt.scatter(X[:, 0], X[:, 1], c=y)
fig = draw_line(p.w, -2, 2)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")
# score, should not be 1
p.score(X, y)
0.95
At the cost of ~6 misclassified data points, the algorithm yielded an accuracy of 0.95. Good enough.
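If we want the exact number of misclassified points rather than inferring it from the accuracy, we can recompute the predictions from the learned weights. This assumes the same conventions as perceptron.py: a constant feature appended as the last column (line 39) and labels converted from {0, 1} to {-1, 1}:

```python
# count points on the wrong side of the learned hyperplane (illustrative check)
X_ = np.append(X, np.ones((X.shape[0], 1)), 1)   # constant feature, as on line 39
y_ = 2 * y - 1                                   # assumed relabeling of {0, 1} to {-1, 1}
print(int(np.sum(y_ * (X_ @ p.w) <= 0)))         # number of misclassified points
```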
Higher dimensions
Finally, we increase the number of features to reach higher dimensions.
Visualization will be hard beyond 3-d, so we will instead look only at the score history during training.
We will be using 7 features (including y) and 1500 data points.
np.random.seed(149)

n = 1500
p_features = 7

# make centers with correct dimensions
c1 = np.empty(p_features - 1)
c1.fill(-1.5)
c2 = np.empty(p_features - 1)
c2.fill(1.5)

X, y = make_blobs(
    n_samples=n,
    n_features=p_features - 1,
    centers=[c1, c2]
)
# Algorithm
p = Perceptron()
p.fit(X, y, max_steps=1000)

# score history
fig = plt.plot(p.history)
xlab = plt.xlabel("Iteration")
ylab = plt.ylabel("Accuracy")
# score
p.score(X, y)
1.0
On this dataset, our perceptron algorithm reached 100% accuracy!
As for the accuracy during model fitting, the process took more than 500 iterations, but ultimately achieved a perfect score.
Because our perceptron converged, we can say that this particular dataset is linearly separable.
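We can confirm how long training actually ran by checking the length of the score history (again assuming p.history is a plain list with one entry per iteration):

```python
# number of iterations the fit ran before stopping (illustrative check)
print(len(p.history))
```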
Note on complexity
What is the runtime complexity of a single iteration of the perceptron algorithm update as described by Equation 1? Assume that the relevant operations are addition and multiplication. Does the runtime complexity depend on the number of data points n? What about the number of features p?
The relevant line is line 51 in perceptron.py:
self.w = self.w + ((y_[i] * self.w @ X_[i]) < 0) * y_[i] * X_[i]
We can use numpy.ndarray.shape to find out the shapes of the variables involved.
# From line 39
X_ = np.append(X, np.ones((X.shape[0], 1)), 1)

print(f"Shape of self.w is {p.w.shape}")
print(f"Shape of y_[i] is {y[0].shape}")
print(f"Shape of X_[i] is {X_[0].shape}")
Shape of self.w is (7,)
Shape of y_[i] is ()
Shape of X_[i] is (7,)
We can see that self.w and X_[i] have length p, while y_[i] is a single integer.
Therefore, the runtime complexity is O(p) and does not depend on the number of data points n.
Intuitively, we do not have to look at all the data points when updating self.w. We only need to do that when determining the score, but that happens elsewhere in the function and is not described by Equation 1.
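As an empirical sanity check (not part of the original post), we can time a single update, mirroring line 51, on datasets of very different sizes; if the cost really is O(p) and independent of n, the timings should be essentially identical:

```python
# Empirical check that one perceptron update does not depend on n.
# Illustrative experiment, not code from perceptron.py.
import timeit
import numpy as np

p = 6  # number of data features before the constant column

def one_update(w, xi, yi):
    # mirrors line 51 of perceptron.py: move w only if point i is misclassified
    return w + ((yi * w @ xi) < 0) * yi * xi

for n in (100, 10_000, 1_000_000):
    rng = np.random.default_rng(0)
    X_ = np.append(rng.normal(size=(n, p)), np.ones((n, 1)), 1)  # constant feature appended
    y_ = rng.choice([-1, 1], size=n)
    w = rng.normal(size=p + 1)
    t = timeit.timeit(lambda: one_update(w, X_[0], y_[0]), number=10_000)
    print(f"n = {n:>9}: {t:.4f} s for 10,000 updates")
```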