Instead of an SGDRegressor you may also consider using a PassiveAggressiveRegressor. It's an algorithm with a different update mechanism, and at times it may converge faster. In this video we're mainly going to play with the stepsize parameter, C, to get a feeling for how the system converges.
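As a bit of intuition for that update mechanism: a passive-aggressive model stays passive while its prediction is within a margin of the target, and otherwise takes an aggressive step toward the new example, with C capping how big that step can be. Below is a minimal sketch of the textbook PA-I update rule (the pa_update helper is hypothetical and for illustration only; scikit-learn's internals differ in the details).

import numpy as np

def pa_update(w, x, y, C=0.1, epsilon=0.1):
    # Hypothetical helper illustrating the PA-I rule, not scikit-learn's code.
    error = y - w @ x
    loss = max(0.0, abs(error) - epsilon)  # epsilon-insensitive loss
    if loss == 0.0:
        return w                           # passive: close enough, no update
    tau = min(C, loss / (x @ x))           # aggressive: stepsize capped by C
    return w + np.sign(error) * tau * x    # move the weights toward the example

The larger C is, the harder the model jumps toward each new example, which is why we start with the bigger cold value and switch to the smaller warm one once the weights have settled.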
We've added an extra cell to the notebook that contains the loop that switches from the cold stepsize to the warm one partway through training.
import numpy as np
import pandas as pd
from sklearn.linear_model import PassiveAggressiveRegressor

# Set the jump coefficients: a large cold stepsize and a small warm one
c_cold, c_warm = 0.1, 0.01

# Train one example at a time and collect stats along the way
mod_pac = PassiveAggressiveRegressor(C=c_cold)
data = []
for i, x in enumerate(X_train):
    mod_pac.partial_fit([x], [y_train[i]])
    data.append({
        'c0': mod_pac.intercept_[0],
        'c1': mod_pac.coef_.flatten()[0],
        'c2': mod_pac.coef_.flatten()[1],
        'mse_test': np.mean((mod_pac.predict(X_test) - y_test)**2),
        'normal_mse_test': normal_mse_test,  # baseline mse from an earlier cell
        'i': i
    })
    if i == 500:
        # Switch to the warm (smaller) stepsize after 500 examples
        mod_pac.C = c_warm

df_stats = pd.DataFrame(data)
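Note that we change the stepsize simply by assigning to mod_pac.C inside the loop: partial_fit picks up the current value of C on each call, so every update after example 500 uses the smaller warm stepsize.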
We've also added a cell that plots the original SGD learning curves alongside the new passive-aggressive ones.
import altair as alt

# Allow altair to plot more than its default maximum number of rows
alt.data_transformers.disable_max_rows()

# Reshape the stats into long format for plotting
pltr1 = pd.melt(df_stats[['i', 'c1', 'c2']], id_vars=["i"])
pltr2 = pd.melt(df_stats[['i', 'normal_mse_test', 'mse_test']], id_vars=["i"])

q1 = (alt.Chart(pltr1, title='PA evolution of weights')
      .mark_line()
      .encode(x='i', y='value', color='variable', tooltip=['i', 'value', 'variable'])
      .properties(width=300, height=150)
      .interactive())

q2 = (alt.Chart(pltr2, title='PA evolution of mse')
      .mark_line()
      .encode(x='i', y='value', color='variable', tooltip=['i', 'value', 'variable'])
      .properties(width=350, height=150)
      .interactive())

# p1 and p2 are the SGD charts from earlier in the notebook
(p1 | p2) & (q1 | q2)
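In altair, the | operator places charts side by side and & stacks them vertically, so the final line shows the SGD charts from the earlier cell on the top row with the new passive-aggressive charts below them.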
More Reading
If you'd like to read more about the effect of the warm/cold stepsizes, you might enjoy reading this blog post.