Friday, April 1, 2016

For Tom: The GDP Object


[4/3/2016 1:30 update
Tom comments that the drawing shows a "square hyperbola", also known as a "rectangular hyperbola". This is a hyperbola with the relation xy = 1. This is useful in creating a model economy. We can assign one axis to represent money, the second axis to represent transactions. Assuming every transaction can be represented by money, the sum of all transactions multiplied by the sum of all money values will form a square hyperbola if every-possible-combination-that-forms-the-same-constant-value is plotted. This gives us the "GDP Object". 

The GDP Object will be useful in writing a SIM style of spreadsheet model that avoids iteration but yields similar results. This results in easier to understand equations. (I hope!)
(This concept and the enabling equations have not yet passed peer review.)]]

[4/2/2016 10:00 AM update    
"The GDP Object" may not be the best way to characterize the nature of GDP.

GDP (Gross Domestic Product) can be related to at least three different concepts:

1)   A measure of economic activity. It can be considered as the sum of all transactions with a price value. Here, GDP is a defined measurement. If government expenditures are also known, an average tax rate can be calculated.

2)   A theoretical limit. Money supplied by government can be taxed every time it is received. If only one issue is made, money disappears from the private sector and returns to government. Eventually, the entire issue is recovered. Limit GDP is the maximum possible GDP if the tax rate is known.

3)  A  PERIOD theoretical limit. The theoretical limit can be divided into time periods. Each time period will have a different GDP limit based on a period common tax rate and rate for parallel money collectors, and a period-unique beginning money supply.

A characterization of GDP as an "object" nears becoming misdirection. Perhaps we should characterize GDP as a "limiture" (where we combine the words "limit" and "measure"), giving GDP an unique characterization.]

This is for Tom. It is quick and dirty. I am bogged down with detail in another attempt to present the same material.

Figure 1. The GDP Object is the value on the GDP curve at any point in money-transaction space. 

GDP can be considered a limit defined by G and T as in


where T is a dimensionless pure number. G is money and GDP is money. GDP is constant when G and T are defined. Now GDP is an object.

Find GDP for a period

We can use T with a time period dimension to find the GDP expansion for that period. We can count on T being less than one because it takes an infinite time period with infinite transactions to find the GDP limit by series expansion.

If we assume that we have TWO taxing authorities, one authority can be government using the assigned tax rate FOR A PERIOD. The second authority will be assigned a tax rate that captures the remainder of the potential GDP FOR A PERIOD. The remaining GDP potentially available for capture is GDP*(1-T) .

This gives us two equations that capture the entire GDP expansion to limit.

Convert into a series of annual events 

We can convert GDP to annual events (AGDP) by considering every step is a division between two taxing authorities. The primary authority will receive the Annual Tax (AT) share and the second authority (savers) will receive the remainder (AR). Write this in two equations.

(1) AGDP*Tr = AT


(2) AGDP*(1-Tr)*Rr = AR

where Rr is the Remainder "tax" rate.

Notice that the sequence of events is important here. Tax is removed from GDP before a remainder can be calculated.
We know that the sum of the two tax divisions is the original injection by government (G):

(3) AT + AR = G

Substitute  1 and 2 into 3 to get

(4) AGDP*Tr + AGDP*(1-Tr)*Rr = G

Simplify 5 to get

(5) AGDP*( Tr + (1-Tr)*Rr) = G

Now we can define parameter Rr just as we defined the government tax rate. AGDP is constant for the period just as GDP is the constant GDP Object. Once we know AGDP, we can find wealth and every other term as you did using Linear System Analysis.

We can next add wealth to the next period by assuming that wealth is also all spent to create a new GDP Object.  It now becomes repetition to complete the table for as many years as desired.

At this point, I think we may be in correlation with Linear System Analysis

Does this make sense now?



  1. Hi Roger! Thanks for this. Why is T dimensionless? It's taxes, right? That's what it says in your diagram. Shouldn't it be in dollars?

    Have you seen this?
    He explains real GDP in terms of a beer based economy. It's pretty good! Real GDP he measures in "real units" (i.e. "cans of Bud").

    I'll have to think about your post a little more. I'm actually pretty crappy with economics. I don't have a good sense for what all components are. (Young John Handley seems like he's got it nailed down pretty well!).

    You're diagram shows "square hyperbolas" I notice, which reminds me of my conversation with Nick Rowe and Scott Sumner that I turned into a couple of blog posts some time back here:

    Shockingly enough, Frances Coppola left me a few comments there.

    I may be busy with some other things for a few days, but I will come back and revisit, and see if I can sort it out more the second time through. Regarding linear systems analysis: my gut feeling is it won't help you much here. I could be wrong though.

    1. Also I notice there were quite a few more comments and questions left there (on the real GDP explanation post) since I left my question. My question is very specific and he answers in a satisfying way.

    2. Thanks for taking a look!

      You ask "Why is T dimensionless? It's taxes, right?"

      Good question. A very important question. I will think on this and answer tomorrow. Having thought about it some more, I think taxes should have a dimension here. But what in each of three situations?

      I downloaded Octave today. Your code ran the first time I tried it. Amazing. I will study this more.

      Yes, I thought he answered your question well. It seemed to me like he forgot that there was a period when the spent money was freely exchange by money holders. Or did he specifically deny that this possibility existed? I mentioned this, using other words, in a comment.

      I will read your references and give a better answer later.

    3. This comment has been removed by the author.

    4. Octave: great! now you can easily create a random linear system and test it:

      % Parameters: start (feel free to change)
      Nx = 10; % dimension of state space
      Nu = 4; % dimension of input space
      Ny = 2; % dimension of measurement space
      N = 100; % number of samples
      % Parameters: end

      A = randn(Nx,Nx);
      B = randn(Nx,Nu);
      C = randn(Ny,Nx);
      D = randn(Ny,Nu);

      % Notice how the systems matricies nicely fit together
      % in one large (Nx)(Ny) by (Nx)(Nu) rectangle when
      % stacked like this:
      % |A B|
      % |C D|
      % Or to put them together like that for laughs:
      [A B; C D]

      % Now pick a random starting vector (initial state) x0
      x0 = randn(Nx,1);
      % And a random input sequence:
      U = randn(Nu,N);
      % And an impulse (of just one of the inputs):
      ui = zeros(Nu,1);
      ui(1) = 1;
      % Now make space for an output sequence:
      Y = zeros(Ny,N);

      % And finally, let's implement the system:
      % -- first with x0 and the random input U:
      x = x0;
      for k=1:N
      Y(:,k) = C*x + D*U(:,k);
      x = A*x + B*U(:,k);

      % Plot the results:
      figure(1),plot(Y'),title('Random x0 and u');

      % Again but just 1 of the Nu sets of Ny impulse resp.
      x = zeros(Nx,1); % start with an initial x0 = 0;
      U = zeros(Nu,N); % zero out U
      U(:,1) = ui; % just try the one impulse response
      for k=1:N
      Y(:,k) = C*x + D*U(:,k);
      x = A*x + B*U(:,k);

      % Plot the results:
      figure(2),plot(Y'),title('Impulse response');

      % Last thing: did any of the outputs look unstable,
      % i.e. climbing without bound in magnitude? If so then
      % you probably have eigenvalues outside the unit
      % circle. Let's check:

      [V,D] = eig(A); % A = V*D*inv(V), or V*D = A*V
      % V are the eigenvectors and the diagonal elements of
      % D are the corresponding eigenvalues. You can check
      % that each row (eigenvector) of V gets amplified by
      % the corresponding eigenvalue in D. For example:
      Av1 = A*V(:,1); % multiply A by it's first eigenvector
      % Now multiply this vector by it's corresponding
      % eigenvalue (a scalar):
      dv1 = D(1,1)*V(:,1);
      % Compare them side by side (they should be the same):
      [Av1 dv1]

      % D is a diagonal matrix: only non-zero on the diagonal)
      % so extract it's diagonal elements in a vector:
      d = diag(D)
      % Let's plot them against the unit circle:
      theta = [0:0.01:1]*2*pi;
      ucir = exp(1i*theta);
      grid on,
      title('Poles of A (red) wrt unit circle (blue)');

      % Let's move any poles outside the unit circle inside:
      ius = abs(d)>1;

      % And now reassemble our (now stable) A matrix
      if length(ius == 0)
      display('You can stop here');
      d(ius) = 1./d(ius);

      % Assuming you have some poles that we moved:
      % Reconstruct a stable A:
      A = V*diag(d)*inv(V);

      % Take a look at the impulse response again:
      x = zeros(Nx,1); % start with an initial x0 = 0;
      for k=1:N
      Y(:,k) = C*x + D*U(:,k);
      x = A*x + B*U(:,k);

      % Plot the results:
      figure(4),plot(Y'),title('Impulse response (stable)');

      % END script

    5. Now if that works first time I'll be amazed because I didn't test it!

      You should see that any poles in the Z-plane (the unit circle plot) (appearing as red boxes), that are complex should be in complex conjugate pairs (symmetric about the real (x) axis). That's because A is real. If you instead created a random A like this:

      A = (randn(Nx,Nx) + 1i*randn(Nx,Nx))/sqrt(2);
      % sqrt(2) to keep it the same average "size"

      You'll find no such symmetry, and of course your outputs y will be complex as well.

      Note that MatLab uses the following for the unit imaginary number: i, 1i and j. You can overwrite i and j by using them, for example as an index:

      for i=1:10

      % Now i isn't the sqrt(-1) anymore!

      That's why 1i is preferred. I'm not sure that's how it is with Octave.

      Another thing you can do with that random input and a stable A, is look at the magnitude of the FFT of the outputs (you may want to increase the number of samples to something bigger, like 1024 or more):

      FYm = abs(fft(Y')');

      Now you can look at a monte-carlo of it's frequency response:


      or if it's too big a mess, pick out a particular one:


      or a log of that to put it in decibels (dB):

      plot(20*log10(FYm(1,:)')); % 20* gives power

      You can do the same with the any of the impulse responses for a more 'analytic' view.

      In general FFT outputs like that look terrible, so if you can look at a "power spectral density" you'll be better off. See if octave has a function called psd, by typing:

      help psd

      Matlab used to have a nice simple one, but they replaced it with something more complicated I have to look up every time now. It's also easy to make your own... but I won't get into that. It's a good way to smooth out the FFT results so you can easily see the response. I'm almost positive you won't have one called freqz, but you can check. that's great for giving you the analytic frequency response. Again, you can make your own, but I won't get into that either.

      One last thing: here's one way you could use the eigenvalue / eigenvector decomposition to more efficiently calculate what x will be after N iterations:

      [V,D] = eig(A);
      xN = V*(D.^N)*inv(V)*x0

      Likely some imaginary components will sneak in when you do that (since the diagonal of D will be complex in general, and so might V), so to get rid of those bits you could take the real part: real(xN). I'm not sure how numerically stable that is, but you can test it against the straightforward way:

      x = x0;
      for k=1:N
      x = A*x;
      xN = x

      This is an even less efficient way:
      AA = A;
      for k=1:N
      AA = AA*A;
      xN = AA*x0

      This is probably the most efficient and numerically stable way:
      AA = expm(logm(A)*N);
      xN = AA*x0

      You can think of that as converting back to continuous time, and then rediscretizing with a different time step (N times the 1st).

      A square-matrix vector multiply is order of Nx squared multiplies (Nx the dimension of a side of A) and a matrix matrix multiply is order of Nx cubed. You can see that if A was very large in dimension, this could quickly add up, especially if you had lots of time steps.

      Also, there's actually never a reason to use inv() (the matrix inverse function), because you always really need it times something else. A more efficient and numerically stable way to get the same thing is the "left divide"

      xN = V*(D.^N)*(V\x);

    6. This is an even less efficient way:
      Should have been:

      AA = A;
      for k=2:N
      AA = AA*A;
      xN = AA*x0

      Else you get an extra A matrix multiply

    7. My favorite matrix decomposition is the singular value decomposition. Wikipedia has a good article:

      You can take a singular value decomposition of ANY matrix (square, rectangular, complex, singular... anything). I think it's the best way to understand what ANY matrix actually does. For square matrices, the singular values are the absolute values of the eigenvalues, so one good way to generate a guaranteed stable A matrix is:

      [U,S,V] = svd(randn(Nx,Nx));

      The diagonal matrix S has the non-negative singular values on its diagonal, but why not just pick a whole new stable set?:

      A = U*diag(rand(Nx,1))*V'; % A guaranteed stable

      rand() without the "n" (randn()) is a uniform distribution on (0,1) (rather than Gaussian).

      You may have noticed that diag() is like a toggle: for a matrix it returns the vector corresponding to the diagonal, and for a vector, it returns a square diagonal matrix.

      The singular values are always ordered largest to smallest, and they are always real and non-negative. If the matrix is singular (non invertible), then there will be at least 1 zero singular value. The singular values are basically the amplification of the matrix. Bot U and V have unit length, mutually orthogonal columns (i.e. U'*U = I and V'*V = I), so their "gain" is 1 (unitary is the name). S provides all the amplification... and you can think of it like this: express a vector x in the reference frame V (basis V): x = x1 + x2 + x3... etc, out to Nx such vectors aligned with the columns of V. Then
      are the coordinates of x in the V reference frame. The result did not change the length of x. Let's examine x1 in particular. x1 = a1*v1 where v1 is the first column of V and a1 is the scalar representing the 1st coordinate of x in the V frame. Thus V'*x1 = a1*e1, where e1 is a vector that looks like [1 0 0 ... 0]'. e2 would have the 1 in the 2nd row, etc. Then S*V'*x1 = s1*a1*e1 = [s1*a1 0 0 0 ... 0]'. Finally, if the 1st column of U is u1, then the result is s1*a1*u1, where again s1 and a1 are scalars. The the role of A in this case was to multiply a vector lined up with v1 by s1 and map it to u1 (where both v1 and u1 are unit length). So the SVD tells you exactly how A maps vectors from one vector space to another: mapping v1 to u1 and multiplying by s1 along the way. Same for v2, u2, s2, etc. Since any vector x can be expressed as a linear combination of vi vectors, the result is the sum of all those mappings. It even works for non-square matrices (mapping between two different sized vector spaces), and it tells you how to approximate a matrix (just zero out the small singular values). This is handy for calculating a "pseudo inverse" which is what you need for non-square matrices, or for singular matrices. Notice that:
      A = U*S*V'
      inv(A) = V*inv(S)*U'
      but if S has zero (or near zero) elements on it's diagonal, simply exclude those by taking the upper left corner of S, and only keeping the corresponding columns of U and V (call them Us and Vs and Ss, so Us*Ss*Vs' is an approximation to A). Then if you're trying to solve:
      A*x = b
      In a sensible way, even though you can't invert A, you can still calculate:
      Vs'*x = inv(Ss)*Us'*b
      Now what is the smallest x that gets you as close as possible to b?
      x = Vs*inv(Ss)*Us'*b
      If A had more rows than columns to begin with, and the number of "valid" singular values (not just round off noise away from zero) were at least as many as the columns, then V would be square like S (with the "economy" SVD, which is svd(A,0) in Matlab), and this would be the least squares formula:
      x = V*inv(S)*U'*b
      inv(S) is easy, it's just diag(1./diag(S))
      And it's numerically stable. Any singular values you suspect are just noise, just eliminate (along with the corresponding columns of U and V). Those small noisy values make your answer blow up.


    8. And it's numerically stable! The usual least squares (linear regression) formula is:

      x = inv(A'*A)*A*b

      Which can be improved a bit by:

      x = (A'*A)\(A*b)

      But most of the numerical damage is done by A'*A (the Grammian). Forming the Grammian destroys precision. If you had 6 significant figures before forming the Grammian, then you'll have 3 afterwards (it halves the number). The singular value decomposition allows you to avoid forming the Grammian.

      The QR decomposition is probably used more though... it's a poor man's SVD. With QR you add a small diagonal matrix to R if it's close to singular. But if you ever really need to have a handle on what's going on with your matrix A, an SVD is often very handy.

      In my opinion, they should stress the SVD much more in linear algebra classes: understanding it really helps you understand what the world of matrices is all about.


      One last bit, in my pseudo-inverse example above, I mention that gets you as close as possible to B with the smallest possible x, but what is the error?

      A*x - b = Us*Ss*Vs'*Vs*inv(Ss)*Us'*b - b
      = Us*Us'*b - b

      If Us were square (rather than having more rows that columns), then this would just be:

      I*b - b = 0

      But, that's not the case. What is Us*Us' then? It's a "projection matrix" ... it projects b onto the columns of Us... and since those correspond to the largest singular values (or only non-zero ones, as the case may be), that's as close as x can get you to b (by combining columns of A). The remainder are the parts of b (i.e. a vector) that are orthogonal to the subspace spanned by Us. Now if you're unlucky, that could be most of b (or even all of it). Still, we've done our best with a singular A. Think of it in 3 dimensions: A is 3x3 and say has one 0 singular value, so it cannot be inverted. Then x and b are 3x1. We want to find the minimum x to minimize:

      |A*x - b|

      The result, x, multiplied by A will be a projection of b onto the space spanned by the columns of A (which will be a plane (subspace) rather than all of 3D space). You can check this is true by making sure the error is orthogonal to A:

      A'*(A*x - b)

      should be 0

    9. There are many ways to construct such an A. Here's an example: first get any two 3x1 vectors to define a plane:

      A = randn(3,2);

      Now for the 3rd vector, make it any linear combination of the other two (so it lies in the same plane):

      A = [A A*randn(2,1)];

      So now A is singular, it's columns spanning a 2D subspace in 3D space (thus it's rank is 2, and it will have 1 zero eigenvalue and 1 zero singular value, and it's determinant (the product of the eigenvalues) will be 0... and a QR factorization (qr()) will have at least one zero on the diagonal of R, and the cholesky factorization (chol()) will fail, as will the inverse (inv())).

      OK, have fun with Octave!

    10. Wow, lots to think about here! Tom, this will take a while for me to process into my brain! LOL

      Hey, to your question "Why is T dimensionless? It's taxes, right?",

      [I had to think for quite a while to come up with this answer!]

      it becomes dimensionless when the ratio G/GDP is stable.

      Because both G and GDP carry the identical dimensions, in the stable situation, the dimensions divide to equal one(1). In the dynamic situation, the ratio would change in each period so T would carry the period dimensions, such as "money/year".

      But remember, SIM builds in the dimensionless tax rate and uses that tax rate to determine all the other results using an initial G. BUT, by building in the tax rate, SIM also builds in the assumption that the tax rate will be dimensioned with "per replication." This is why the idea of "A GDP Object" becomes worthwhile, to show that a stable tax rate applied to a stable injection of money will ultimately result in a stable GDP. Then, knowing the stable GDP value, we can find all the results without the need for iteration within-the-periods.

    11. Tom, I got a little way into the Octave script. Two changes need so far:

      [A B; C D] to %[A B; C D]





      [rb to r]

      With the two changes in place, I get a blue circle and a number of red lines that seem to have no pattern. I will keep looking later today. An interesting learning experience!

    12. Hi Roger, the red lines are not supposed to be lines. That's supposed to be a scatter plot (just the points). Do a help plot.

      In Matlab, it goes like this 'r', 'g', 'b', etc are colored lines. 'rb' is supposed to be red boxes (with no lines). Try 'r*' or 'r.' or even just '.' or '*'. '.' and '*' are other shaped markers. Actually I'm pretty sure '.' will work, since it worked before for you.

      Tax rate dimension: Ah, thanks. I'll stew on that a bit.

    13. This comment has been removed by the author.

    14. ... yeah, I imagine [A B; C D] would make a mess with 10 states!... I originally had 3, but I bumped it up so you'd be pretty likely to have some unstable poles outside the unit circle. I have no idea what your output looks like: probably just a mess with a random input. The impulse response (or a step response) should look pretty clean, but you probably get gigantic numbers! 100 iterations on one or more unstable poles can send it to infinity fast. You might have to cut down on the number of sample times to avoid overflows, even with double precision numbers.

      Now we could add noise to the plant with covariance G'*G, and noise to the measurements with covariance M'*M:
      G = randn(Nx,Nx);
      M = randn(Ny,Ny);
      % the "random" u is already like a noise with
      % covariance eye(Nx) (i.e. an Nx sized identity matrix)

      y = C*x + B*U(:,k) + M*randn(Ny,1);
      x = A*x + B*U(:,k) + G*randn(Nx,1);

      and then we could build a Kalman filter to optimally track all 10 states and plot the principal axis length of the error covariance based on just those two outputs... but we can leave that for another day (like maybe never, since you've probably had enough!... Lol).

      Fun fact about the function eig(): it knows how to diagonalize a square matrix, but that's it. Of course not all matrices are square, but even the square ones aren't all diagonalizable. The diagonal values being the eigenvalues of course. Once simple one that isn't (which comes up all the time) is the A matrix associated with the states position, velocity, acceleration, rate of acceleration (jerk), etc. Such a simple matrix and system, but one of the extremely rare cases that's not diagonalizable. With just position and velocity, in continuous time that's:

      A = [0 1; 0 0]; % Not invertible or diagonalizable!

      And in discrete time (invertible, but not diagonalizable):

      A = [1 T; 0 T]; % where T is the time step.

      If you try eig(A) in either case I'd guess it'll fail. That doesn't mean there are no eigenvalues... the problem is there's not enough eigenvectors. But here's the thing: in either case, A is just a tiny epsilon away from a diagonalizable matrix:

      A = [1 T; 1e-10 1]

      Now you can diagonalize A just fine. There's no such problem with singular values... but then they don't do EVERYTHING, so you need the eigenvalues in many cases.

      Once you build a state estimator, you'll have a state estimate xhat, and then you can feed that back to stabilize the system instead of manually moving the poles like I did:

      x = A*x + B*U(:,k) + G*randn(Nx,1) + L*xhat;

      Where L is your "control law"... the idea being in the real world you don't have access to A (other than a model of it): you only have y and you know what u is hopefully. So you form xhat from y and u, and use it to stabilize the system.

      That's how they can stabilize that inverted pendulum. They know the angle of each segment (with sensors)... and they can do a transformation on that to move the cart to always restore it to "equilibrium." The feedback moves all the poles inside the unit circle. I'm only talking about when it's already close to equilibrium... I have no idea how they move the cart to stack the segments up.

      A potential economics application? You have an unstable system (or one with undesireable dynamics anyway) you'd like to stabilize/make-more-desireable perhaps? I'm sure I'm only the billionth person to have thought that! Lol. (That's what the Taylor rule is all about, BTW: that's L the control law).

      Here's another application: a servo motor. You might be able to stabilize the system so that a given voltage corresponds to a certain angle of the drive shaft... but stable is not always good enough. Perhaps it oscillates a bit when it gets there... the oscillations die down to zero, but you may not be able to tolerate any. So you can use feedback to dampen those out, or get rid of them completely.

    15. ... economics is a classic case requiring "robust control": even if the system were linearizable (which it may not be), you don't really know A, B, C, D, G or M... so you have to build a feedback control law (and state estimator) that works (to some minimal level of performance: stability at least for example) for a whole set of possible plants! It's not an easy chore. Perhaps impossible. You've got to hand it to Taylor for giving it a stab anyway.

      Or you can look at it as a non-linear control problem... even harder!

      In fact, just figuring out a nominal A and the other matrices and the range of uncertainty on them (system identification) is proving to be pretty much impossible I think.

      BTW, you can handle "colored" noise (not white: white meaning the autocorrelation function is an impulse... i.e. all frequencies represented equally) with this setup by adding extra states designed to color the noise.

      Also, some economists think the system is stable to start out, and what makes it unstable are attempts to add feedback. So there's that too. Others think it's not stable, but you can't improve it any with feedback, so it's no use trying.

    16. I just noticed an error with my expression for y above (not in the script). I wrote:

      y = C*x + B*U(:,k) + M*randn(Ny,1);

      But it should be

      y = C*x + D*U(:,k) + M*randn(Ny,1);

      and of course in term of the script, where we're recording every y, then it's

      Y(:,k) = C*x + D*U(:,k) + M*randn(Ny,1);

      It turns out SIM is a bit of an odd beast in this regard... it actually has a non-zero D. If you think about it for a system like a rocket, servo motor or inverted pendulum that you're trying to control, you don't normally have a non-zero D: that would just be taking a linear combination of the input control signal u (which you presumably already know) and adding that to your measurements. Why would you do that?

      And like I say above, you're lucky if you know A, even approximately, but as a control engineer you can't usually have any say over what A is: it describes the system you've been instructed to control. You might be able to determine that the system is uncontrollable (depends on A and B) or unobservable (depends on A and C)... which could lead to a redesign of the system, or tell something about the stability of states for which there's no controllability or observability (you'd better hope in both cases they are at least stable). In the same manner, your analytical feedback to your customer could lead to adopting different or more sensors (C) or adding more controls/actuators (B), or improving the quality of components to decrease or alter the character of the (thermal) noise (M or G), ... or giving up on the project altogether!

  2. Roger, if you want to see an application of this linear system theory stuff, and particularly a "root locus" (i.e. predicting system behavior based on the location of the systems roots (poles)), Jason tried setting gamma = 10 and noticed that the system oscillated, and thought that was interesting. In terms of how gamma affects the location of the system's pole (just the scalar A in the case of SIM), and how that in turn can explain and predict SIM's behavior, I left a couple more comments in response on his RLC circuit post.

  3. OK, a simple feedback control example: imagine the following single state system:

    x[n+1] = a*x[n] + b*u[n] + g*w[n]
    y[n] = c*x[n] + m*v[n]

    Where c = 1, g = 0.3, m = 0.3, b = 1 and a = 1.01, x0 = 0 and u[n] = 1, n >= 0, else 0, and w and v are N(0,1) (zero mean, unity variance additive white Gaussian noise (AWGN).

    Then xhat = y is an unbiased estimate of x. The pole of this system is simply a, which is just outside the unit circle, so it should be unstable. But now lets apply a gain L to xhat and feed it back:

    x[n+1] = a*x[n] + b*u[n] + L*x[n]

    Where I've eliminated the noise terms for simplicity. So as long as L is on (-0.01,-1.01) we can stabilize the system:

    X(z) = (a+L)*X(z)*z^-1 + b*U(z)

    X(z)/U(z) = 1/(1-(a+L)*z^-1) = z/(z - (a+L)), so the pole is at (a+L). If we don't want it to oscillate, and we want to dampen the control signal u, then a good choice would be a+L on (0,1). Let's put the pole at 0.5, which means L = 0.51.

    a = 1.01;
    b = 1;
    L = 0.51;
    g = 0.1; % "plant noise" standard deviation
    m = 0.1; % measurement noise standard deviation
    u = 1;
    x0 = 0;
    N = 100; % samples

    Y = (1,N);

    % first with no feedback control compensation:
    x = x0;
    for k=1:N
    Y(k) = x + m*randn
    x = a*x + b*u + g*randn

    figure(1),plot(Y),title('No feedback control');

    % Now with feedback control compensation:
    x = x0;
    for k=1:N
    Y(k) = x + m*randn
    x = a*x + b*u + g*randn + L*Y(k);

    figure(2),plot(Y),title('With feedback control');

    % END script

    With more complex systems, you can determine what the poles will do ahead of time with different gains L when you "close the loop." Generally you make a plot of how they move as function of the gain. And sometimes your "gain" has states of it's own.

    Here's a typical diagram in the Z-plane showing how two poles move as a function of the feedback gain in relation to the unit circle (black). You can see (red lines) how they (the poles) move along the real line until they meet, and then they branch off into conjugate complex numbers, circle around, and meet again on the other side of zero, where they again split from each other on the negative real line. Any one of those (pair of) positions on the red curves corresponds to a particular gain. So the designer would choose a gain that puts them where he's like.

    I thought this was a nice diagram too showing what the impulse response of poles look like at different locations around the S-plane (so continuous time this time). Now instead of the unit disk (the unit circle and it's insides) the stable region is the right half plane (and the imaginary axis). This diagram provides a nice map between the S-plane and the Z-plane. This one's good too.

    1. In the script, this:

      Y = (1,N);

      Was supposed to be this:

      Y = zeros(1,N);

      Also instead of

      L = 0.51;

      It should be

      L = -0.51;

  4. Tom, Those Z and S plane figures (that you referenced) are very interesting. They invite careful examination and a lot of study.

    OH, I changed the plot style/color from rb to sr. Matlab "red box' to Octave "square red". Now I get 10 scattered squares.

    Also changed [AB; CD] to [AB; CD]; . Now I also have two curve displays. The curves are not stable and equal replication-to-replication so maybe more needs to be done.

    It astonishes me that you wrote the code nearly error free the very first time. Shows great skill!

    I am trying a new approach on "The GDP Object". My argument must convince folks skilled in math, as well as convincing me. That is a high standard, not yet met.

    OH, more. I spend some time working farther into matrix math. Yes, I see how it works. Now I need to get a better skill level in using the tool.

  5. If you put a semicolon at the end, the line won't print. So
    [A B; C D]

    didn't actually do anything except form that one big matrix, but it wasn't assigned to anything, so it has no function except to print to your console. The thing is, it probably makes a mess, so I can understand you not wanting to see it! :D

    for a cleaner way to print to console look up "disp" and "display" ... I was just too lazy to use those.

    Matrix math... one of the Z and S plane links comes from a page all about mechanical systems: it turns out the RLC circuit Jason was using for an analogy is also analogous to a simple 2nd order mechanical analogy: a mass (usually pictures as a horizontal cart with frictionless wheels & bearings), a massless spring, and a "dashpot" (like a dampening device). The have the diagram right there on one of those pages. It fits in well with your "mechanical perspective" I think.

    My "feedback control" script above doesn't use any matrices (just scalars) so it's easier to follow. It should be clear in that case how adding feedback move the system's pole, but you can run it and verify that for yourself. Since SIM is a one pole system, it's directly analogous, especially when people start moving the pole around with gamma or alpha2 (as Jason and Ramanan do, resp.).

    BTW, here's that page with the mass, spring and dashpot... and a servo motor actuator as well (to provide a forcing function u, much like Jason's voltage generator in his RLC circuit):

    2nd order systems are very popular on exams because they're just complex enough to be interesting, yet you can solve them with nothing more complex than the quadratic formula (to find the roots of the system). Solving 3rd order polynomial problems and higher was not generally a fun or fruitful project with your pocket calculator back when I was in school! And of course, in general there's no closed form solution for 5th order or above (except special cases) as Galois proved more than a century and a half ago. (If you've ever seen the closed form for a 4th order polynomial, it's UGLY!). Matlab easily solves all that numerically though... that's essentially what eig() does.

    1. specifically, the "characteristic equation" is the polynomial equation resulting from this:

      det(A - lambda*I) = 0

      The values of lambda that satisfy that are the roots of the characteristic equation and the poles of your system.

      That reduces to a regular old scalar coefficient polynomial... so if A ia a 2x2, it's a quadratic formula. "det" is the determinant. Even calculating the determinant gets to be a major pain if you do it symbolically with a 3x3 or a 4x4, etc. Except in special cases. There's never a reason to do it by hand really... except on exams!

      Fun fact: If you replace the scalar lambda in the characteristic polynomial, with the *matrix* A, then A satisfies it! (provided, of course, you replace the 0 with a matrix of zeros, and interpret A^n as (n-1) matrix multiplies of A with itself). In other words, A satisfies it's own characteristic equation. Strange, eh?

    2. ... another fun fact, even though 3rd order and higher solutions are notoriously difficult to solve by hand, with a few geometric rules in the S or Z plane, you can still approximately sketch out their "root locus" diagrams as a function of the feedback gain in a system. It's pretty cool...

    3. Tom, I'm deep into a rewrite. I think this effort is better. Hopefully, I will post in draft form tomorrow.

      I tried Octave with the array in. It hangs someplace after figure one. I will play with it more. Next, I will read the instructions. So far, I am just going on intuition based on some C++ programming I have done, and on the comments and wonderful code that you wrote.

    4. With the array in it? You mean [A B; C D]? Don't worry about that... it doesn't do anything. You can comment it out. It shouldn't cause anything to hang though.

      Check it out Roger: you probably won't follow all this (I didn't either actually... I used to draw these things myself, but it's been a while). Anyway, this guy has two videos that are kind of fun to watch: he shows you what happens to the roots of a system as you change the control function. Here's the 1st video:

      It's all actually very geometrical. His "rules" may seem mysterious, but it all comes down the the geometry of the poles and zeros in the plane: their distance and their angles. Anyway, it's kind of fun, because he's got the intuition nailed, so he can draw them fast.

    5. Tom, Thanks for sharing this link to Brian Douglas' video series. Very interesting.

      Sorry that I am so slow to learn. It seems like I need to 'internalize' things, making them part of my mechanical system, the way I perceive things.

      That said, I was thinking about matrix's last night. What a beautiful system of organization. Of course, all they are is a regular system of identification. Each point on the grid corresponds to a unique designation. You understand, I am just opening my eyes to the possibilities! Thanks again for pointing the way.

      I am slowly beginning to understanding the link between SIM and control systems. I recognized one of the equations (a part) that Brian Douglas displayed, so you will (hopefully) like the new approach to my SIM equations. Hopefully, a post today, and more SIM Excel examples later in the week.

    6. Roger, you're not slow to learn!!! Shoot, I've been drowning you in stuff it took me a lot of time to learn.

      Now Brian didn't show in that 1st video (perhaps earlier ones he does) how gets the equations for those "closed loop" transfer functions in the 1st place. It's actually very simple. You can do the exact same thing with Z transforms too BTW. So open loop you may have:

      Y(s) = P(s)*U(s)

      Where y(t) is the output and Y(s) is it's Laplace transform, and same for u(t) and U(s). P(s) is the transform for the "plant" (the system, like SIM is). In the time domain, the "impulse response" of the plant in time (call it p(t)... like what happens if you hit SIM with just $1 of gov spending in the first period and then back to $0 afterwards) you'd say:

      y(t) = p(t)*u(t)

      too, but the "*" symbol means "convolve with" rather than multiply in that case. "convolve with" means for every input in u(t) you put a copy of p(t)'s impulse response. The impulse response times 20 is what you'd get if you didn't accumulate at the end of Jason's version of SIM, because he just has G at $20 for period 1, and $0 after.

      So multiplication in the frequency domain is convolution in the time domain. The system is linear, so to find the response to ANY input is just the sum of impulses responses to a bunch of impulses added together.

      Now the converse is true as well: convolution in the frequency domain, is equal to multiplication in the time domain... so if two signals are multiplied in the time domain (like what a mixer does in radios), then in the frequency domain their spectrums are convolved.

      But back to Brians videos:

      Y(s) = P(s)*U(s), so the transfer function is just (dropping the s argument):

      Y/U = P

      And because most realizable systems have Laplace transforms that are ratios of polynomials, those are just all polynomials in s. Now, these are what are called SISO systems (single input single output). Translating this into a "state space" representation still means that A can be NxN in general (lots of states = lots of terms in the polynomials), but it means that the y output and the u input are both just scalars. Otherwise you have LOTS of impulse responses: from each u to each y. It's likely however, that they will all have the same polynomial in the denominator: only the numerator polynomial will change (most of the time). That denominator polynomial (in s) is found from the characteristic equation of A:

      det(A - s*I) = 0

      But Brian starts with just the polynomials themselves, which is fine.

      The other thing you need to know is that addition in the time domain is addition in the frequency domain. So when you take Y and feed it back to the input by subtracting it from U (negative feedback... the usual) you get after "closing the loop":

      Y = P*(U - Y)


      Y + P*Y = P*U
      Y(1+P) = P*U
      Y = P*U/(1+P)
      Y/U = P/(1+P)

      Now he puts a gain K in there after in the forward path after the summing junction, so it's

      Y/U = K*P/(1-K*P)

      Where that "1" is really a 1 and not an I (not an identity matrix in this case), because of course it's a SISO system. These functions of s are all just scalars.

      Now K can also be a function of s, so it's K(s), but the easiest thing to do is just make it a scalar K. And convolution by a constant K is the same as multiplication by a constant K.

    7. Then he factors P(s) into it's numerator and denominator... so you can see what K does to the roots of the closed loop system. I forget what he calls them, but lets call them N(s) and D(s). Then K the gain moves the roots of the closed loop system from those of D(s) (when K = 0) to those of N(s) (when K = infinity). The zeros of a system (open loop, those are the roots of N(s)) have an effect on a system, but they don't determine it's stability or it's "speed" or oscillatory behavior... those critical things are all determined by the roots of D(s) (again, open loop, not closed loop), which again are the roots of

      det(A - s*I)

      which are the eigenvalues of A, which are the poles of the system, which are the roots of characteristic equation... a lot of names for the same thing!

      But the interesting part is how as K goes from 0 to infinity the poles of the *CLOSED LOOP* system move from D(s) to N(s). The poles move to the zeros.

      Now imagine that K isn't just a constant gain, but also has a form like this:

      K(s) = Nk(s)/Dk(s)

      It's easy to work out (in terms of numerator polynomials and denominator polynomials) what the closed loop system is. It's best to just reduce it to one numerator polynomial divided by one denominator polynomial so it's easy to see (IMO).

      As he shows, K a constant doesn't always do the trick in terms of being able to stabilize a naturally unstable system. Sometimes you have to add dynamics to K as well. In the one case I saw, he added a zero (which is called a differentiator), so K had a form like

      K(s) = a + b*s

      Or what's known as proportional (a) differential (b) control: a PD controller. If you add in an integrator as well (PID), it's something like a + b*s + c/s.

      1/s is the Laplace transform of an integrator, and s is the Laplace transform of a differentiator.

      The problem with differentiators is they tend to amplify noise (they are high pass filters), whereas integrators tend to reduce noise (low pass filters)... but noise may be less important than stability!

      Sometimes what happens with your controller (K) is that it actually has a model of the plant built right into it... so K has within it P(s)... now why would that be? Well that would be to estimate ALL the states of P... recall, we can express P's polynomial as a set of states with x' = A*x.

      That's precisely what a Kalman filter is: it's a little model of the system P which estimates all the states. It's married together with a control law which then tries to modify the states (change the characteristic equation). If you can accurately model all the states, then you can put the poles of the system wherever you'd like.

    8. One last thought: using the state space representation allows you to write the system differential equation as a 1st order linear MATRIX differential equation:

      x' = A*x

      Just 1st order!... but in terms of a scalar differential equation, it's not 1st order, it's order is the dimension of x! So if x is Nx1, the it'll be Nth order. All those terms in the characteristic equation (a*s^n + b*s^(n-1) ... etc, down to c) are all just differentiated x's... in a differential equation they'd be terms like a*x''' + b*x'' + ... + c = 0. So taking the Laplace transform of a differential equations replaced nth derivatives with s^n, and it replaces integrations over time with factors of 1/s, double integrals with 1/s^2, etc.

      All the same applies for Z transforms, except that instead of differentials and integrals you look for cases where the time index is different. If a term is delayed by one x[n-1] then in the Z domain it's X(z)*z^-1. If it's delayed by 2 then it's X(z)*z^-2, etc. A delay in digital circuits is often marked with Z^-n, where n is the magnitude of the delay.

      Laplace transforms and Z-transforms can make life easy once you get the hang of it. You can look up in a table what the transforms are for many common functions, and if you know a few rules, you can adapt those to your circumstance.

      In the end with Z transforms, it boils down to the same thing for SISO systems: a ratio of polynomials in z. So it has a root locus, just like its continuous time cousin in s.

    9. In case you're wondering, the analog of an integrator in the discrete world is an accumulator. So instead of this:

      x' = u

      You have:

      x[n+1] = x[n] + u[n]

      Taking the Laplace transform of the 1st:

      X(s)*s = U(s)

      So that's why an integrator is 1/s: X/U = 1/s

      In the discrete time world, what do we have? Well the way I wrote it, x[n+1] is "delayed" by -1 (i.e. advanced by one) so:

      X(z)*z = X(z) + U(z)

      X/U = 1/(z-1) = z^-1/(1-z^-1)

    10. Above, where I wrote:

      "1st order linear MATRIX differential equation"

      I should have written:

      "1st order linear VECTOR differential equation"

      The vector being x. It needs a square matrix A to fill out the equation, but A is not the state. A matrix differential equation would be a case where a matrix X was involved:

      X' = A*X

      Which is really just a bunch of vector differential equations:

      x1' = A*x1
      x2' = A*x2

      etc, where xi is the ith column of X. You can write the optimal, time varying, continuous time Kalman filter gain equation in a matrix differential equation form, but unfortunately it's non-linear... I forget what it is, but it's something like:

      X' = R*X + X*R + X*Q*X + F

      or something horrible like that, where X is the Kalman gain. (The Kalman gain is a time varying gain you need to perform an optimal state estimate). It's the Riccati equation I mentioned once, only in a matrix differential form.

    11. Tom, I just posted A Master Equation for SIM Models Using the GDP Object . I labeled it draft and welcome comments for improvement. I think I am satisfied with this but are you? My understanding kept evolving as I wrote.

      I will look at your comments next.

    12. It's unfortunate that the star (*) is the normal symbol for convolution, since it's used to represent multiplication in text equations and computer programs. But Wikipedia has a good article on it (complete with animations):

      It's easier to visualize in discrete time: basically everywhere an input signal to a system is non-zero, then for the output, replace that with a scaled and shifted system impulse response, and then add all those scaled and shifted impulse responses together: that's the superposition concept at play: you can't do that with a non-linear system where superposition doesn't hold. (it even works for a time-varying linear system, only the impulse response in that case is a function of time: it's not fixed like for an LTI system).

      Tables of Laplace and Z-transforms (and their inverses) make solving convolution integrals (convolution sums in discrete time) unnecessary if you can describe the impulse response and the system input as shifted and weighted sums of common functions in the table.

      A classic case is the action of a digital to analog converter (DAC). A DAC transitions you from the discrete time world to the continuous time world. Typically they work by a "sample and hold" or zero order hold (ZOH) technique, meaning they hold their output voltage constant during each sample period. You can think of the DAC's impulse response then as a shifted rectangle (which in turn are two step functions, one up and one down, added together: u(t) - u(t-T), where u(t) here is the unit step function). Essentially that impulse response is being convolved with the discrete time input to the DAC. In continuous time a discrete time signal's spectrum is repeated every factor of Fs (the sampling frequency). But the frequency response of a rectangular pulse is a modified sinc() function (having the general form sin(a*f)/(a*f)). So in the frequency domain, you multiply the spectrums (since you're convolving in the time domain), thus the spectral images of the discrete time signal are attenuated as you go further out in frequency in the continuous time output of the DAC. Also a shift in one domain is a phase change in the other. So a time shifted rectangle has a frequency response which is a phase shifted version of the spectrum of a 0 centered rectangle: the magnitude doesn't change. Another principal is that signals which are wide in the time domain, are narrow in the frequency domain, and vice versa. So a wide rectangle (a long sample period) means a narrow sinc in the frequency domain. A Gaussian bell curve is the same in both domains other than its width. So a wide Gaussian in one domain transforms to a narrow Gaussian in the other.

      Still, a DAC will produce many undesirable (though attenuated) harmonics of the fundamental (the desired signal), and you generally add an analog filter to its output to eliminate the ones you don't want.

    13. Tom, Just thinking here. I'm reading your material (and more) about control systems. I can now see the connection between SIM and control loops. So yes, I think Eq. 10 (the Master Equation) would be a control object (programming concept of an object).

      Any feedback into the control object would necessarily be from the past (in a time sense) so it could be added to any of the four controlling parameters. [I wonder if Excel has a iteration loop for a single cell, that could run and continually sample a "future" cell?]

      I think that from a system control viewpoint, the Master Equation is an integrator summing four different inputs including one sample from memory. The G input is master. and inputs 2 and 3 are successive and diminishing in effectiveness.

      More inputs could be added at any of the entry points.

      I am starting to get a glimmer of how the Z transform works (I think).

      This is all very cool. Thanks!

    14. "I wonder if Excel has a iteration loop for a single cell, that could run and continually sample a "future" cell?"

      You can turn something like that on here:

      Excel->options->formulas->'Enable iterative calculation'

      There's also what they call their "Solver" (really a single root finder) and the fancier "non-linear equation solver" which you need to install their analysis package for. The analysis package has a fancier linear regression capability (than the usual one in their formulas, or built into their plots), and it has matrix manipulation capabilities like matrix multiplies and matrix inverse. Octave is much better for matrix manipulations though!

  6. Roger, you're right: there's no "box" marker: it's a square. From Matlab help:

    b blue . point - solid
    g green o circle : dotted
    r red x x-mark -. dashdot
    c cyan + plus -- dashed
    m magenta * star (none) no line
    y yellow s square
    k black d diamond
    w white v triangle (down)
    ^ triangle (up)
    < triangle (left)
    > triangle (right)
    p pentagram
    h hexagram