Why Even Learn Classical Control?
...when optimal control exists and learning-based control is all the rage?
I'm currently sitting near Mirror Lake in Yosemite National Park. The goal was to cap off a week-long road trip with the epic Half Dome hike. Unfortunately, a tendon injury I picked up during a practice hike has been bothering me for the last couple of weeks, so I had to abort :( While my friends continue on their trail, I decided to make the most of this time to write down all my thoughts and learnings from being a course assistant for ENGR 205: Feedback Control Systems, at Stanford.
Although I have a longer post detailing what I learnt from teaching and why this class meant so much to me as both a student and an instructor, one portion of that post grew into a slightly longer, different kind of essay. So here's that part for now: my argument for why those old-timey control design methods still make sense to learn.
I think it is safe to say that there is no "classical control" research happening right now. Frequency-domain control design topics (think root locus, Bode plots), which analyze the direct relation between an input and an output, rest on extremely solid theory. The classical design techniques, implementation methods, and analysis tools were refined and formalized over the better part of the 20th century (especially during the world wars and the space race), and they are so ubiquitous now that there is no practical advancement in these first principles left, no open questions. Any remaining research tends toward abstract topics closer to mathematical philosophy than to useful engineering. Naturally, grad students who want to take classes that apply to their research will think graduate-level classes in classical control are useless in the 2020s.
But these controllers are literally everywhere: your thermostat, telephone lines, your car, planes. Everything. And at the end of the day, they are ridiculously simple to implement. I was watching this keynote speech from Professor Boyd; just listen to his description of PID controllers:
So I don't want the next generation of engineers to see a controls problem where they need a simple actuator to track a known trajectory reliably and efficiently, and think "yeah ok, let me train a controller using reinforcement learning". No! Given the system model (that is the key here, and it's not always available), you can literally design a controller with pen and paper. And it just works.
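To make "design it on pen and paper" concrete, here is a minimal sketch of what such a controller boils down to in code. The gains and the toy first-order plant are entirely made up for illustration; this is not from Boyd's talk or from the course.

```python
# Minimal discrete-time PID sketch (illustrative gains and plant, chosen by me).

def pid_step(error, state, kp, ki, kd, dt):
    """One PID update; `state` carries the running integral and previous error."""
    integral, prev_error = state
    integral += error * dt
    derivative = (error - prev_error) / dt
    u = kp * error + ki * integral + kd * derivative
    return u, (integral, error)

# Track a constant setpoint on a toy first-order plant: x_dot = -x + u
setpoint, x, state, dt = 1.0, 0.0, (0.0, 0.0), 0.01
for _ in range(1000):
    u, state = pid_step(setpoint - x, state, kp=6.0, ki=3.0, kd=0.1, dt=dt)
    x += (-x + u) * dt            # forward-Euler simulation of the plant
print(round(x, 3))                # settles near the setpoint of 1.0
```

That's the whole controller: three gains and a handful of lines, with the integral term taking care of steady-state error.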
What about optimal controllers?
Optimization-based controllers are, of course, awesome. If you can define your control objective (preferably as a convex function) and constrain your problem to some system dynamics, then you don't have to use these ancient control methods! A solver designs a controller for you that meets your objective. In some cases (like infinite-horizon LQR), this takes a nice, simple PID-like form; a quick sketch of that is below. Now that we have the compute power to deploy these easily, they may become more commonplace. But essentially what you have is a machine doing the design for you. To analyze the resulting controller itself, you can still use frequency-domain methods. I like this cartoon from Gunter Stein's lecture "Respect the Unstable" (great lecture btw, you should watch the whole thing):
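On the LQR point: here is a rough sketch of what "a machine doing the design for you" looks like in practice. The double-integrator model and the cost weights are arbitrary choices of mine, not anything from Stein's lecture or the course.

```python
# Infinite-horizon LQR for a double integrator (toy model and weights I picked).
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])        # x1_dot = x2, x2_dot = u
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 0.1])           # state cost (assumed weights)
R = np.array([[1.0]])             # control-effort cost

P = solve_continuous_are(A, B, Q, R)   # solve the algebraic Riccati equation
K = np.linalg.inv(R) @ B.T @ P         # optimal gain for u = -K x
print(K)   # a constant state-feedback law, much like a hand-tuned PD controller
```

The output is just a static gain matrix: the solver did the "tuning", and you can still pull up a Bode plot of the loop to sanity-check margins.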
What about learning-based control?
I'm no ML expert, so some of this is me talking out of my ass. Maybe I will have a different perspective on this once I finish taking AA 203: Optimal Control from Marco Pavone. But in my mind, RL is best when you don't have a model of your system. Robotics has seen incredible progress with this, which is why most modern research is focused on reinforcement or imitation learning methods. Boston Dynamics' new video showcasing RL abilities, for example, blows my mind.
This is harder to reconcile with classical feedback control because an RL policy is highly nonlinear and doesn't map onto these linear design and analysis methods. But I think my main defense comes down to three points:
1. Simple problems should be solved with simple solutions. PID is simpler than RL.
2. PID is deterministic and extremely reliable. There are cases, such as aerospace, where RL makes little to no sense (at least in its current form).
3. It is generally wise to use RL at some higher level, and a deterministic feedback controller at the lower level!
In fact, based on this blog post from Boston Dynamics, the third point rings true. RL seems to work great for higher-level control, mapping states to specific gait patterns, and it seems to allow for much more expressive behavior. However, that gait is then passed to a "locomotion controller", which allocates control effort across all of the humanoid's actuators. That controller is an optimization-based/classical controller, so the previous sections' argument holds. Fight me on this, but I would pick this split over an "end-to-end learning system" any day. A toy sketch of that split is below.
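To illustrate the split I'm arguing for, here is a toy sketch, entirely my own construction and not Boston Dynamics' architecture: a stand-in for a learned high-level policy picks the setpoint, and a plain deterministic feedback loop tracks it.

```python
# Toy "learning on top, feedback underneath" split (hypothetical names and numbers).

def high_level_policy(state):
    """Stand-in for a learned policy: maps the current state to a target setpoint."""
    return 1.0 if state < 0.5 else 0.8

def low_level_controller(target, state, kp=4.0):
    """Deterministic proportional loop that tracks whatever the policy asks for."""
    return kp * (target - state)

x, dt = 0.0, 0.01
for _ in range(500):
    target = high_level_policy(x)          # slow, expressive decision layer
    u = low_level_controller(target, x)    # fast, reliable tracking layer
    x += (-x + u) * dt                     # same toy first-order plant as before
```

The point of the structure is that the learned part only ever hands targets to a loop you can analyze and trust.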
Takeaway
As an aerospace engineer I am clearly biased, but this is a perspective I want my fellow graduate students in robotics research to at least consider: be aware that, at the first-principles level, there is significant, really wonderful theory and engineering built up over the 20th century that works remarkably well. PID controllers are used so widely in industry because of how capable they are at things like trajectory tracking and low-level actuator control. Once you understand the extent of these methods' capabilities, you can focus the newer methods on what the simple ones cannot do.
If you think any of these are hot takes, I’d love to chat in the comments :)
Anshuk
So you disagree with Tesla's end-to-end learning approach?