What is k-Nearest Neighbors (k-NN)?
k-Nearest Neighbors (k-NN) is one of the simplest and most intuitive supervised learning algorithms.
It’s used for classification (predicting categories) and regression (predicting continuous values).
The idea:
Store all training data.
To predict a new point, look at its k closest neighbors (using distance, usually Euclidean).
For classification: take a majority vote of neighbors’ classes.
For regression: take the average of neighbors’ values.
Example with Iris 🌸
Suppose k=3.
A new flower is measured:
[5.1, 3.5, 1.4, 0.2]. The algorithm finds the 3 closest flowers in the training data.
If 2 are Setosa and 1 is Versicolor, prediction = Setosa.
👉 k-NN is called a “lazy learner” because it doesn’t build a mathematical model; it just stores the training set and uses it when making predictions.
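To make the "lazy learner" point concrete, here is a minimal from-scratch sketch (plain NumPy plus the standard library; an illustration, not the scikit-learn code used later in this post):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # "Training" is just keeping X_train/y_train around; all work happens here,
    # at prediction time.
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))  # Euclidean distances
    nearest = np.argsort(distances)[:k]                         # indices of k closest
    # Classification: majority vote. For regression you would instead
    # return y_train[nearest].mean().
    return Counter(y_train[nearest]).most_common(1)[0][0]

# With the 6-flower training set used later in this post,
# knn_predict(X_train, y_train, np.array([5.7, 3.0, 4.2, 1.2])) returns 1 (Versicolor).
```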
Choosing k
Small k (like 1) → very sensitive to noise (overfits).
Large k → smoother, but may miss details (underfits).
An odd k (3, 5, 7) is usually chosen to help avoid ties (strictly, this only rules out ties for two-class problems).
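In practice, a common way to pick k is to try a few values and compare cross-validated accuracy. A quick sketch, using scikit-learn's built-in full Iris dataset (an assumption here; the worked example below uses only 6 samples):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Compare a few odd values of k with 5-fold cross-validation
for k in [1, 3, 5, 7, 9]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k}: mean accuracy = {scores.mean():.3f}")
```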
Example: Classifying a New Flower with k-NN
Training Data (simplified)
Suppose we only have 6 flowers in our training set:
| Sepal Length | Sepal Width | Petal Length | Petal Width | Species (Label) |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | Setosa (0) |
| 4.9 | 3.0 | 1.4 | 0.2 | Setosa (0) |
| 5.8 | 2.7 | 4.1 | 1.0 | Versicolor (1) |
| 6.0 | 2.7 | 5.1 | 1.6 | Versicolor (1) |
| 6.3 | 3.3 | 6.0 | 2.5 | Virginica (2) |
| 5.8 | 2.7 | 5.1 | 1.9 | Virginica (2) |
🌸 New Flower to Classify
Features = [5.7, 3.0, 4.2, 1.2]
(We don’t know the species; that’s what we want to predict.)
Step 1: Compute Distances
We use Euclidean distance:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
For example, the distance from the new flower to the first Setosa:

$$d = \sqrt{(5.7 - 5.1)^2 + (3.0 - 3.5)^2 + (4.2 - 1.4)^2 + (1.2 - 0.2)^2}$$

Simplifying step by step:

$$d = \sqrt{0.36 + 0.25 + 7.84 + 1.00}$$

$$d = \sqrt{9.45}$$

$$d \approx 3.07$$
👉 Do this for all 6 training samples (I won’t calculate all here, but you get the idea).
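If you would rather not do the arithmetic by hand, a small NumPy sketch (re-declaring the same 6 training flowers) computes all six distances at once:

```python
import numpy as np

X_train = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [5.8, 2.7, 4.1, 1.0],
    [6.0, 2.7, 5.1, 1.6],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
])
x_new = np.array([5.7, 3.0, 4.2, 1.2])

# Euclidean distance from the new flower to each training flower
distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
print(distances.round(2))  # -> [3.07 3.08 0.39 1.07 2.32 1.18]
```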
Step 2: Find Nearest Neighbors
Working through all six training samples, the 3 smallest distances (k=3) are to:
A Versicolor (distance ~0.39)
Another Versicolor (distance ~1.07)
A Virginica (distance ~1.18)
Step 3: Majority Voting
Versicolor = 2 votes
Virginica = 1 vote
Setosa = 0 votes
✅ Predicted Class = Versicolor (1)
Why Scaling Matters in k-NN
1. k-NN is based on distance
Prediction is made by finding the closest neighbors in feature space.
So, features with larger numeric ranges contribute more to the distance.
2. Example of Imbalance
Suppose we have two features to classify fruits:
Weight (in grams) → ranges from 100g to 1000g.
Color (encoded 0=green, 1=red).
Now compare two fruits:
Fruit A = [150, 0] (150 g, green)
Fruit B = [900, 1] (900 g, red)
New Fruit = [160, 1] (160 g, red)
Distances:

To A:

$$d = \sqrt{(160 - 150)^2 + (1 - 0)^2} = \sqrt{101} \approx 10.05$$

To B:

$$d = \sqrt{(160 - 900)^2 + (1 - 1)^2} = \sqrt{547600} = 740$$
👉 The “color” difference (0 vs 1) barely matters compared to the huge “weight” difference.
Even though color is very important to classify fruits, it gets ignored.
3. Effect: Bias in Distance
Features with large scales dominate.
Features with small scales are ignored.
Model may perform badly because it uses the “wrong” feature importance.
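A quick numeric check of that bias, using the fruit vectors from the example above:

```python
import numpy as np

fruit_a = np.array([150, 0])    # 150 g, green
fruit_b = np.array([900, 1])    # 900 g, red
new_fruit = np.array([160, 1])  # 160 g, red

dist_a = np.linalg.norm(new_fruit - fruit_a)
dist_b = np.linalg.norm(new_fruit - fruit_b)
# ~10.05 vs 740.0: the new fruit looks "closest" to the green fruit
# purely because of weight, even though it is red like Fruit B.
print(dist_a, dist_b)
```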
Solution: Feature Scaling
Two common preprocessing techniques:
🔹 Normalization (Min-Max Scaling)
Rescales values into the range [0,1].
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
Example: If weight ranges from 100–1000, then
$$x' = \frac{x - 100}{1000 - 100} = \frac{x - 100}{900}$$
Now both weight and color are in comparable ranges.
🔹 Standardization (Z-score scaling)
Centers features around 0 with standard deviation 1:
$$z = \frac{x - \mu}{\sigma}$$
Example: If petal length mean = 3.7 cm and std = 1.7, then
$$z = \frac{x - 3.7}{1.7}$$
After scaling, each feature contributes equally in distance calculations.
Visual Example (Iris)
Without scaling, suppose:
Sepal length (cm) ranges from 4–8.
Petal length (cm) ranges from 1–7.
Petal length has a larger spread, so distance is mostly determined by petal length.
After scaling, both features are equally important.
✅ Summary
k-NN is distance-based, so scaling is critical.
Without scaling, large-valued features dominate.
Normalization and standardization put features on the same “footing.”
Always scale features when using k-NN, SVM, clustering, PCA, etc.
```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Training data (features)
X_train = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [5.8, 2.7, 4.1, 1.0],
    [6.0, 2.7, 5.1, 1.6],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9]
])
y_train = np.array([0, 0, 1, 1, 2, 2])  # Labels: 0=Setosa, 1=Versicolor, 2=Virginica

# New flower
X_new = np.array([[5.7, 3.0, 4.2, 1.2]])

# Train k-NN with k=3
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Prediction
prediction = knn.predict(X_new)
print("Predicted class:", prediction)
# Output: Predicted class: [1] → Versicolor
```
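To see which neighbors drove that vote, you can ask the fitted classifier directly; the distances should line up with the hand calculation in Steps 1 and 2:

```python
# Inspect the 3 nearest neighbors of X_new and their distances
distances, indices = knn.kneighbors(X_new)
print(indices)    # e.g. [[2 3 5]] -> rows 2 and 3 (Versicolor), row 5 (Virginica)
print(distances)  # roughly [[0.39 1.07 1.18]] (values shown rounded)
```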
Normalization and Standardization in Code
```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Example data: weight (100–1000), color (0 or 1)
X = np.array([[150, 0], [900, 1], [160, 1]])

# Min-Max Normalization
scaler = MinMaxScaler()
X_norm = scaler.fit_transform(X)
print("Normalized:\n", X_norm)

# Standardization
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
print("Standardized:\n", X_std)
```
