Feel free to play around with the code below.
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_boston
from sklearn.pipeline import Pipeline
import matplotlib.pylab as plt
X, y = load_boston(return_X_y=True)
pipe = Pipeline([
("scale", StandardScaler()),
("model", KNeighborsRegressor(n_neighbors=1))
])
pred = pipe.fit(X, y).predict(X)
plt.scatter(pred, y)
Note the effect of setting n_neighbors
. What does the plot tell us?
Is it giving us a trustworthy summary?