Scikit-Learn is possibly the most popular machine learning framework in the world. In this series of videos we'd like to give an overview of its main features and show how you can use the framework to approach most machine learning problems. Do watch all the videos, because we also want to highlight its dangers.


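from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
# Note: load_boston was removed in scikit-learn 1.2, so this needs an older version.
from sklearn.datasets import load_boston
from sklearn.pipeline import Pipeline
import matplotlib.pyplot as plt

X, y = load_boston(return_X_y=True)

# Scale the features first, then predict with a nearest-neighbour model.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", KNeighborsRegressor(n_neighbors=1))
])
# The pipeline has to be fitted before it can predict.
pipe.fit(X, y)
pred = pipe.predict(X)
plt.scatter(pred, y)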

Note the effect of setting n_neighbors. What does the plot tell us? Is it giving us a trustworthy summary?
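
One way to probe that question is to compare the score on the data the model was fitted on with a cross-validated score. The sketch below is only an illustration under some assumptions: it uses synthetic data from make_regression as a stand-in for the housing data, make_pipe is a hypothetical helper name, and 5-fold cross-validation is just one reasonable choice. With n_neighbors=1 every training point is its own nearest neighbour, so the in-sample score should come out as perfect while the cross-validated score does not.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the dataset used above).
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=0)


def make_pipe(n_neighbors):
    # Hypothetical helper: builds the same scale-then-predict pipeline.
    return Pipeline([
        ("scale", StandardScaler()),
        ("model", KNeighborsRegressor(n_neighbors=n_neighbors)),
    ])


# R^2 on the very same data the pipeline was fitted on.
in_sample_r2 = make_pipe(1).fit(X, y).score(X, y)

# Mean R^2 over 5 folds, where each fold is scored on data not used for fitting.
cv_r2 = cross_val_score(make_pipe(1), X, y, cv=5).mean()

print(f"in-sample R^2: {in_sample_r2:.3f}")
print(f"5-fold CV R^2: {cv_r2:.3f}")
```

Try a few other values for n_neighbors in make_pipe and see how the gap between the two numbers changes.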

