Let $E(\mathbf{x})$ represent an objective function to be minimized with the constraint that $G(\mathbf{x}) = 0$
If there is some optimal point $\mathbf{x}_0$, then $E(\mathbf{x}_0 + \epsilon \mathbf{v}) \geq E(\mathbf{x}_0)$ for all directions $\mathbf{v}$ and all sufficiently small $\epsilon$ where $G(\mathbf{x}_0 + \epsilon \mathbf{v}) = 0$
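Therefore, to first order, if $G$ does not change in the direction $\mathbf{v}$, then $E$ must not change either: otherwise moving along $\mathbf{v}$ or $-\mathbf{v}$ would lower $E$ while still satisfying the constraint. The converse also holds whenever $\frac{\partial E}{\partial \mathbf{x}} \neq 0$ at $\mathbf{x}_0$.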
$dG|_{\mathbf{x}_0} ⟦ \mathbf{v} ⟧ = 0 $ implies $dE|_{\mathbf{x}_0} ⟦ \mathbf{v} ⟧ = 0$
$dE|_{\mathbf{x}_0} ⟦ \mathbf{v} ⟧ = 0 $ implies $dG|_{\mathbf{x}_0} ⟦ \mathbf{v} ⟧ = 0$
Therefore $\frac{\partial E}{\partial \mathbf{x}}$ is orthogonal to all vectors orthogonal to $\frac{\partial G}{\partial \mathbf{x}}$ at $\mathbf{x}_0$
This can only happen if $\frac{\partial E}{\partial \mathbf{x}}$ is a scalar multiple of $\frac{\partial G}{\partial \mathbf{x}}$
So at the optimal point $\mathbf{x}_0$, there exists some $\lambda$, where $\frac{\partial E}{\partial \mathbf{x}} + \lambda \frac{\partial G}{\partial \mathbf{x}} = 0$
Let $L(\mathbf{z}) = L(\mathbf{x}, \lambda) = E(\mathbf{x}) + \lambda G(\mathbf{x})$
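The optimal value $\mathbf{x}_0$ is where $\frac{\partial L}{\partial \mathbf{z}} = 0$: the $\mathbf{x}$ components give $\frac{\partial E}{\partial \mathbf{x}} + \lambda \frac{\partial G}{\partial \mathbf{x}} = 0$ and the $\lambda$ component gives back the constraint $G(\mathbf{x}) = 0$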
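The stationarity condition can be checked numerically on a small example. The sketch below assumes a hypothetical problem that is not taken from the text: $E(\mathbf{x}) = x_1^2 + x_2^2$ with $G(\mathbf{x}) = x_1 + x_2 - 1$. Solving $\frac{\partial L}{\partial \mathbf{z}} = 0$ by hand for this choice gives $\mathbf{x}_0 = (0.5, 0.5)$ and $\lambda = -1$. The helper `grad` is an illustrative central finite-difference gradient, not part of any library.

```python
def E(x):
    # Hypothetical objective: squared distance from the origin.
    return x[0] ** 2 + x[1] ** 2

def G(x):
    # Hypothetical constraint: the line x1 + x2 = 1.
    return x[0] + x[1] - 1.0

def L(z):
    # z = (x1, x2, lambda), matching L(z) = E(x) + lambda * G(x).
    x, lam = z[:2], z[2]
    return E(x) + lam * G(x)

def grad(func, p, h=1e-6):
    """Central finite-difference gradient of func at the point p."""
    g = []
    for i in range(len(p)):
        up = list(p)
        dn = list(p)
        up[i] += h
        dn[i] -= h
        g.append((func(up) - func(dn)) / (2 * h))
    return g

x0 = [0.5, 0.5]   # candidate optimum, found by hand
lam = -1.0        # corresponding multiplier, found by hand

grad_L = grad(L, x0 + [lam])   # should vanish in all three components
grad_E = grad(E, x0)
grad_G = grad(G, x0)
residual = [gE + lam * gG for gE, gG in zip(grad_E, grad_G)]

# Points x0 + eps * v with v = (1, -1), a direction in which G does not change.
feasible_values = [E([0.5 + eps, 0.5 - eps]) for eps in (-0.1, -0.01, 0.01, 0.1)]

print("dL/dz at (x0, lambda):", grad_L)
print("dE/dx + lambda * dG/dx:", residual)
print("E at nearby feasible points:", feasible_values, "vs E(x0) =", E(x0))
```

In this example the gradient of $E$ at $\mathbf{x}_0$ is $(1, 1)$, which is parallel to the gradient of $G$, and every nearby point on the constraint line has a larger value of $E$.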
Now let us assume that instead of a finite set of parameters, the quantity to be optimized is the path of a particle $y(t)$.
The goal is to choose the optimal value for $y$ such that the energy functional is minimized and the constraint functional is satisfied.
Minimize $E(y) = \int_0^a f(y(t), \dot y (t), t) dt$
such that $G(y) = \int_0^a g(y(t), \dot y (t), t) dt = 0$
Let $L(y, \dot y, t, \lambda) = f(y(t), \dot y(t), t) + \lambda g(y(t), \dot y(t), t)$
Following the Lagrange multiplier argument above, the goal is to find a stationary point of $S(y,\lambda) = \int_0^a L(y, \dot y, t, \lambda) dt$
For convenience, the $\lambda$s will be included as part of $y$
Let $v$ represent a small perturbation of the path $y$, so that the perturbed path is $y(t) + v(t)$. The endpoints of the path are assumed to be fixed, so $v(a) = v(0) = 0$.
If a path $y$ is optimal, then small perturbations of the path which satisfy the constraint cannot have a lower value of $E(y)$, so the first-order change in $S$ along any such $v$ must vanish.
$\frac{\delta S}{\delta y}(v) = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \int_0^a \left[ L(y+\epsilon v, \frac{d}{dt}(y + \epsilon v), t) - L(y, \dot y, t) \right] dt$
$= \int_0^a \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left[ L(y+\epsilon v, \frac{d}{dt}(y + \epsilon v), t) - L(y, \dot y, t) \right] dt $
$= \int_0^a \left[ \frac{\partial L}{\partial y} v(t) + \frac{\partial L}{\partial \dot y} \dot v(t) \right] dt $
$ = \int_0^a \frac{\partial L}{\partial y} v(t) dt + \int_0^a \frac{\partial L}{\partial \dot y} \dot v dt $
Using integration by parts
$ = \int_0^a \frac{\partial L}{\partial y} v(t) dt + [ \frac{\partial L}{\partial \dot y} v(t) ]|_0^a - \int_0^a \frac{d}{dt}(\frac{\partial L}{\partial \dot y}) v(t) dt $
$ = \int_0^a \frac{\partial L}{\partial y} v(t) dt + (0 - 0) - \int_0^a \frac{d}{dt} (\frac{\partial L}{\partial \dot y}) v(t) dt $
$ = \int_0^a v(t)[ \frac{\partial L}{\partial y} - \frac{d}{dt}(\frac{\partial L}{\partial \dot y}) ]dt = 0 $
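The chain of equalities above can be sanity-checked numerically. The sketch below assumes an illustrative integrand $L = \frac{1}{2}\dot y^2 - \frac{1}{2}y^2$, a trial path $y(t) = t^2$ (deliberately not optimal, so the variation is non-zero), a perturbation $v(t) = \sin(\pi t / a)$ and $a = 1$; none of these choices come from the text. It compares a finite-difference estimate of the limit in the first line with the final integral. A central difference is used in place of the one-sided limit, which gives the same value as $\epsilon \to 0$.

```python
import math

A = 1.0  # end of the time interval [0, A] (illustrative value)

def lagrangian(y, ydot, t):
    # Illustrative choice (an assumption): L = ydot^2 / 2 - y^2 / 2.
    return 0.5 * ydot ** 2 - 0.5 * y ** 2

def integrate(func, n=2000):
    """Midpoint-rule approximation of the integral of func over [0, A]."""
    h = A / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

# Trial path with its exact first and second time derivatives.
def y(t):
    return t ** 2

def ydot(t):
    return 2.0 * t

def yddot(t):
    return 2.0

# Perturbation that vanishes at both endpoints: v(0) = v(A) = 0.
def v(t):
    return math.sin(math.pi * t / A)

def vdot(t):
    return (math.pi / A) * math.cos(math.pi * t / A)

def action(eps):
    """S evaluated on the perturbed path y + eps * v."""
    return integrate(
        lambda t: lagrangian(y(t) + eps * v(t), ydot(t) + eps * vdot(t), t)
    )

# Finite-difference estimate of the first variation.
eps = 1e-4
variation_fd = (action(eps) - action(-eps)) / (2 * eps)

# Final form after integration by parts.
# For this L: dL/dy = -y and d/dt(dL/dydot) = yddot.
variation_el = integrate(lambda t: v(t) * (-y(t) - yddot(t)))

print("finite-difference variation:", variation_fd)
print("integral of v * [dL/dy - d/dt(dL/dydot)]:", variation_el)
```

The two numbers agree up to quadrature error, which illustrates that the boundary term drops out only because $v$ vanishes at $0$ and $a$.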
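Since this must hold for every $v(t)$ that vanishes at $0$ and $a$ but is otherwise arbitrary, the bracketed term must vanish at every time point (the fundamental lemma of the calculus of variations):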
$\frac{\partial L}{\partial y} - \frac{d}{dt}(\frac{\partial L }{\partial \dot y}) = 0 $
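As a worked instance, assume the hypothetical choices $f = \frac{1}{2}\dot y^2$ and $g = y - c$ with $a = 1$, $c = 1$ and $y(0) = y(a) = 0$ (these are illustrative and not from the text). Then $L = \frac{1}{2}\dot y^2 + \lambda (y - c)$ and the equation above reads $\lambda - \ddot y = 0$. Imposing the endpoint conditions and the constraint gives $y(t) = 6t(1 - t)$ with $\lambda = -12$. The sketch below checks that this path satisfies the constraint and that perturbations which preserve both the endpoints and the constraint only raise $E$.

```python
import math

A, C = 1.0, 1.0  # interval length and constraint level (illustrative values)

def integrate(func, n=2000):
    """Midpoint-rule approximation of the integral of func over [0, A]."""
    h = A / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def energy(ydot):
    # E(y) with the assumed integrand f = ydot^2 / 2.
    return integrate(lambda t: 0.5 * ydot(t) ** 2)

def constraint(y):
    # G(y) with the assumed integrand g = y - C.
    return integrate(lambda t: y(t) - C)

# Candidate path from solving lambda - yddot = 0 by hand.
def y_opt(t):
    return 6.0 * t * (1.0 - t)

def ydot_opt(t):
    return 6.0 - 12.0 * t

# Perturbation with v(0) = v(A) = 0 and zero integral, so that the
# perturbed path still satisfies G = 0.
def v(t):
    return math.sin(2.0 * math.pi * t / A)

def vdot(t):
    return (2.0 * math.pi / A) * math.cos(2.0 * math.pi * t / A)

def perturbed(eps):
    """Return (E, G) evaluated on the path y_opt + eps * v."""
    return (
        energy(lambda t: ydot_opt(t) + eps * vdot(t)),
        constraint(lambda t: y_opt(t) + eps * v(t)),
    )

e_opt = energy(ydot_opt)
g_opt = constraint(y_opt)
results = [perturbed(eps) for eps in (-0.2, -0.05, 0.05, 0.2)]

print("E(y_opt) =", e_opt, " G(y_opt) =", g_opt)
for e_val, g_val in results:
    print("perturbed: E =", e_val, " G =", g_val)
```

Only one family of perturbations is tried here, so this is a consistency check of the stationarity condition rather than a proof of optimality.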