Diffstat (limited to 'linear_regression/polynomial.org'):
 linear_regression/polynomial.org | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/linear_regression/polynomial.org b/linear_regression/polynomial.org
index 7201122..921d755 100644
--- a/linear_regression/polynomial.org
+++ b/linear_regression/polynomial.org
@@ -15,19 +15,21 @@ h_w(x) = w_1 + w_2x + w_3x^2
Then, we should define a cost function. A common approach is to use the *Mean Square Error*
cost function:
\begin{equation}\label{eq:cost}
- J(w) = \frac{1}{2n} \sum_{i=0}^n (h_w(x^{(i)}) - \hat{y}^{(i)})^2
+ J(w) = \frac{1}{2n} \sum_{i=1}^{n} (h_w(x^{(i)}) - y^{(i)})^2
\end{equation}
-Note that in Equation \ref{eq:cost} we average by $2n$ and not $n$. This is because it get simplify
-while doing the partial derivatives as we will see below. This is a pure cosmetic approach which do
-not impact the gradient decent (see [[https://math.stackexchange.com/questions/884887/why-divide-by-2m][here]] for more informations). The next step is to $min_w J(w)$
-for each weight $w_i$ (performing the gradient decent). Thus we compute each partial derivatives:
+Here $n$ is the number of observations and $x^{(i)}$ is the value of the independent variable
+associated with the observation $y^{(i)}$. Note that in Equation \ref{eq:cost} we average over $2n$
+rather than $n$: the factor of 2 cancels when taking the partial derivatives, as we will see below.
+This is a purely cosmetic choice which does not affect the gradient descent (see [[https://math.stackexchange.com/questions/884887/why-divide-by-2m][here]] for more
+information). The next step is to compute $\min_w J(w)$ by gradient descent (see [[https://towardsdatascience.com/gradient-descent-demystified-bc30b26e432a][here]]), updating
+each weight $w_i$. Thus we compute each partial derivative:
\begin{align}
\frac{\partial J(w)}{\partial w_1}&=\frac{\partial J(w)}{\partial h_w(x)}\frac{\partial h_w(x)}{\partial w_1}\nonumber\\
- &= \frac{1}{n} \sum_{i=0}^n (h_w(x^{(i)}) - \hat{y}^{(i)})\\
+ &= \frac{1}{n} \sum_{i=1}^{n} (h_w(x^{(i)}) - y^{(i)})\\
\text{similarly:}\nonumber\\
- \frac{\partial J(w)}{\partial w_2}&= \frac{1}{n} \sum_{i=0}^n x(h_w(x^{(i)}) - \hat{y}^{(i)})\\
- \frac{\partial J(w)}{\partial w_3}&= \frac{1}{n} \sum_{i=0}^n x^2(h_w(x^{(i)}) - \hat{y}^{(i)})
+ \frac{\partial J(w)}{\partial w_2}&= \frac{1}{n} \sum_{i=1}^{n} x^{(i)}(h_w(x^{(i)}) - y^{(i)})\\
+ \frac{\partial J(w)}{\partial w_3}&= \frac{1}{n} \sum_{i=1}^{n} (x^{(i)})^2(h_w(x^{(i)}) - y^{(i)})
\end{align}
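
For reference, the cost and the three partial derivatives above translate almost line for line into a
small batch gradient descent loop. The following sketch is not part of the patch; the data, learning
rate, and function names are illustrative assumptions, not taken from the repository:

#+begin_src python
import numpy as np

def h(w, x):
    # Hypothesis: h_w(x) = w_1 + w_2*x + w_3*x^2
    return w[0] + w[1] * x + w[2] * x**2

def cost(w, x, y):
    # J(w) = 1/(2n) * sum_i (h_w(x^(i)) - y^(i))^2, as in the cost equation
    n = x.size
    return np.sum((h(w, x) - y) ** 2) / (2 * n)

def gradient(w, x, y):
    # The three partial derivatives derived above, stacked into one vector:
    # dJ/dw_1 = 1/n sum(err), dJ/dw_2 = 1/n sum(x*err), dJ/dw_3 = 1/n sum(x^2*err)
    n = x.size
    err = h(w, x) - y
    return np.array([err.sum(), (x * err).sum(), (x**2 * err).sum()]) / n

def gradient_descent(x, y, lr=0.1, steps=5000):
    # Batch gradient descent: w <- w - lr * grad J(w)
    w = np.zeros(3)
    for _ in range(steps):
        w -= lr * gradient(w, x, y)
    return w

# Toy usage: noisy samples of y = 1 + 2x + 3x^2 (made-up data).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1 + 2 * x + 3 * x**2 + rng.normal(scale=0.1, size=x.size)
print(gradient_descent(x, y))  # roughly [1, 2, 3], up to noise
#+end_src

A learning rate around 0.1 works here because the inputs stay in $[-1, 1]$; for a wider input range
the features $x$ and $x^2$ would need scaling before the same rate converges.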