First, thank you @hanyang-21 for all of your great work!
I'm currently trying to implement the algorithm you describe myself, and I've run into two discrepancies/ambiguities that I'd like clarification on:
1. Definition of the SMC correction term (Switching Control) $\Delta e(t)$
You define the switching control as $-K\cdot\text{sign}(s(t))$ in:
However, you define it as $-K\cdot\frac{s(t)}{|s(t)|_2}$ in:
- Robust Stability Analysis on page 14
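Table 4 claims the two are equivalent. However, they coincide only in the scalar case; for high-dimensional vectors like $s(t),$ an element-wise $\text{sign}(s(t))$ and the unit vector $\frac{s(t)}{|s(t)|_2}$ generally differ in both direction and norm.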
Which should be used? Based on your Robust Stability Analysis, I would assume $-K\cdot\frac{s(t)}{|s(t)|_2},$ though I'd like to confirm this intuition is true.
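To make the difference concrete, here is a minimal sketch comparing the two candidate switching controls on a random vector. It assumes $\text{sign}$ is applied element-wise; the function names and the values of `D` and `K` are placeholders of mine, not taken from the paper or the repository.

```python
import math
import random

def l2(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def sign_control(s, K):
    """Element-wise variant: -K * sign(s_i) for every component (assumed reading)."""
    return [-K * ((x > 0) - (x < 0)) for x in s]

def normalized_control(s, K, eps=1e-12):
    """Unit-vector variant: -K * s / ||s||_2 (eps guards against division by zero)."""
    norm = max(l2(s), eps)
    return [-K * x / norm for x in s]

rng = random.Random(0)
D, K = 4096, 0.2  # hypothetical dimension and gain, chosen only for illustration
s = [rng.gauss(0.0, 1.0) for _ in range(D)]

norm_sign = l2(sign_control(s, K))        # grows like K * sqrt(D)
norm_unit = l2(normalized_control(s, K))  # stays at K regardless of D
print(norm_sign, K * math.sqrt(D))
print(norm_unit, K)
```

With the element-wise form the correction magnitude scales with $\sqrt D,$ while the normalized form keeps it at $K,$ so the same gain means very different things under the two definitions.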
2. Usage of the Error Signal $e(t)$
In Algorithm 1, at a specific timestep $t,$ the previous error signal $e(t+1)$ is used to estimate the sliding surface. However, on line 12, you apply the update $e(t)\gets e(t)+\Delta e_t.$ This leaves two candidate values for $e(t)$: the one before the update and the one after it.
Thus, should $e(t+1)$ be the Error Signal prior to the SMC update, $v_{t+1}(c) - v_{t+1}(\emptyset),$ or post update, $(v_{t+1}(c) - v_{t+1}(\emptyset)) + \Delta e_{t+1}$?
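The two readings can be sketched as a single step with a flag that decides which error signal is carried to the next timestep. This is only my interpretation: the sliding surface $s(t)=(e(t)-e(t+1))+\lambda\cdot e(t+1)$ follows the form used in my derivation under Results, the element-wise sign is assumed, and `smc_step`, `store_post_update`, and all numeric values are hypothetical.

```python
def sliding_surface(e_t, e_prev, lam):
    """s(t) = (e(t) - e(t+1)) + lam * e(t+1), element-wise."""
    return [a - b + lam * b for a, b in zip(e_t, e_prev)]

def smc_step(e_raw, e_prev, lam, K, store_post_update):
    """One SMC correction. Returns (corrected error, error stored for the next step)."""
    s = sliding_surface(e_raw, e_prev, lam)
    delta = [-K * ((x > 0) - (x < 0)) for x in s]  # assumed element-wise sign
    e_corrected = [a + d for a, d in zip(e_raw, delta)]
    # The ambiguity: which of the two values becomes e(t+1) at the next timestep?
    e_stored = e_corrected if store_post_update else e_raw
    return e_corrected, e_stored

e_prev = [0.5, -0.25]  # stands in for e(t+1)
e_raw = [0.4, -0.2]    # stands in for v_t(c) - v_t(∅)
corrected, stored_pre = smc_step(e_raw, e_prev, lam=5.0, K=0.1, store_post_update=False)
_, stored_post = smc_step(e_raw, e_prev, lam=5.0, K=0.1, store_post_update=True)
print(stored_pre, stored_post)
```

Both variants apply the same correction at the current step; they differ only in what the next sliding surface is computed from.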
Results
Following your Python implementation as-is, I run into issues where $s(t)$ never shrinks below certain values that seem to depend on the hyperparameters, resulting in sub-par generations.
I've tried to work out why that is in theory:
- $\Delta e\gets-K\cdot\text{sign}(s(t)).$ For real-valued vectors $s(t),$ the probability that any element is exactly 0 is negligible, so $\text{sign}(s(t))$ is a vector of $\pm1$ entries and $|\Delta e|_2\approx K\sqrt D,$ where $D$ is the number of dimensions of $s(t).$ For typical models, $D\gg1.$
- $e(t)_\text{post-update}\gets e(t)_\text{pre-update}+\Delta e.$ Because $e(t)_\text{pre-update}=v_t(c)-v_t(\emptyset)\to0$ near low noise, $|e(t)_\text{post-update}|_2\approx|\Delta e|_2\approx K\sqrt D\gg0$ near low noise for moderate $K.$
- On the next step, assuming the stored $e(t)$ is the post-update value, $s(t-1)=(e(t-1)-e(t))+\lambda\cdot e(t)=e(t-1)_\text{pre-update}+(\lambda-1)e(t)_\text{post-update}.$ As the pre-update error signal shrinks to 0 at low noise, $|s(t-1)|_2\approx|\lambda-1|\cdot|e(t)_\text{post-update}|_2\approx|\lambda-1|K\sqrt D\gg0.$

For most hyperparameter choices ($\lambda\not\approx1,$ $K\not\approx0$), this contradicts your claim of finite-time convergence to the manifold $s(t)=0,$ which I believe is why I'm getting sub-par generations.
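The floor predicted by this argument can be reproduced in a toy simulation. The sketch below is not the actual sampler: it assumes an element-wise sign, assumes the post-update error is what gets stored, and replaces $v_t(c)-v_t(\emptyset)$ with a synthetic signal that decays geometrically to 0. The `simulate` function and all parameter values are hypothetical.

```python
import math
import random

def l2(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def simulate(D, K, lam, steps, decay=0.5, seed=0):
    """Track ||s||_2 while a synthetic pre-update error decays toward 0."""
    rng = random.Random(seed)
    e0 = [rng.gauss(0.0, 1.0) for _ in range(D)]
    e_prev = list(e0)  # stored error from the previous (noisier) step
    norms = []
    for t in range(1, steps + 1):
        scale = decay ** t
        e_raw = [scale * x for x in e0]  # stand-in for v_t(c) - v_t(∅)
        s = [a - b + lam * b for a, b in zip(e_raw, e_prev)]
        norms.append(l2(s))
        delta = [-K * ((x > 0) - (x < 0)) for x in s]  # element-wise sign
        e_prev = [a + d for a, d in zip(e_raw, delta)]  # post-update is stored
    return norms

D, K, lam = 1024, 0.2, 5.0  # illustrative values only
norms = simulate(D, K, lam, steps=60)
floor = abs(lam - 1.0) * K * math.sqrt(D)
print(norms[-1], floor)
```

In this toy setting the sliding-surface norm settles at $|\lambda-1|K\sqrt D$ rather than approaching 0, which matches the plateau I observe qualitatively.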