The Butcher and the Brain Surgeon, Or Theory and Practice

When I took my first class on machine learning, one of the things that struck me was how "hand-wavey" most of the methods presented were. By that, I mean that most methods at the state of the art are pretty crude, worked out by trial and error as opposed to clinical theorizing from first principles. Instead of deriving a general mathematical theory of neural network performance, machine learning journals are rife with people suggesting slightly tweaked architectures for some special case that perform marginally better than the previous best. Machine learning is not the only field that operates this way. Even the most staggeringly intricate medical surgeries are really nothing more than glorified butchery, in the sense that they are crude procedures that seem to have worked well in the past, justified post-facto by our middling understanding of anatomy coupled with various biochemical heuristics. That is to say, surgeons are really nothing more than glorified butchers. And "butchery," here, is not a pejorative (as we will see), but really quite high praise.

The key distinction I'm trying to make here is between fields that are tied to a central and objective source of truth, and concerned with performance relative to that source of truth, and fields that are broad oceans of intellectual inquiry with no clear goal in mind. Machine learning, butchery, and brain surgery are all examples of the first type -- machine learning architectures are judged by their performance on real datasets (e.g., classification results on ImageNet), butchers by the quality of their meat as judged by customers, and brain surgeons by their clinical outcomes (the morbidity and complication rates of their patients). Examples of the latter type of field include mathematics (abstractly playing with number-theoretic ideas without regard for where they may lead), the hard humanities (critiquing an obscure passage by some obscure ancient philosopher for no clear reason), etc. I don't mean to draw the line based on "pragmatism" or "applicability," to be clear. I'm just drawing a distinction between "guild-like" fields where people are looking to improve the performance of something at the end, and more "theorizing-like" fields that involve toying with ideas in the abstract. Another example of the "guild-like" is Renaissance painting: the old masters like Alberti and Botticelli used to correspond by mail, commenting on novel techniques for mixing paint to get more realistic shadows, and such things. Their goal was to produce more lifelike paintings, not to understand the chemical nature of paint from first principles.

Having made this distinction, the point I'm trying to make is that many theorists look down upon the "guild-like" fields for their lack of rigor. Case in point: talk to any theoretical statistician, and they will wax endlessly about the lack of foundations in machine learning, and about how poorly understood and uninterpretable the estimators produced by ML methods are. Of course, this criticism is valid: we don't currently understand (mathematically) why certain networks perform better than others, or the limits of performance of various networks. However, it also misses the point. I came into school being taught by many such theorists (I am a math major) and being told that it is misguided to think in this goal-oriented, trial-and-error sort of way that is endemic to the "guild-like" subjects. Now, I've come to think that this manner of thinking is not only more likely to lead to progress, but is in some sense more humble and honorable, in stark contrast to post-facto theorizing of the kind common in, say, economics.

And this is really the crux of what I'm trying to get at: the despicable post-facto justification. I think the "guild-like" thinking is more moral because of its epistemic humility. ML practitioners trying to improve image classification performance will not pretend that they understand what's happening inside a neural network, but theoretical statisticians will, even though they don't! Similarly, journals in surgery are rife with lists of procedures and interventions that seem to have worked OK in practice, with no shame in not justifying why this is the case, whereas biochemists and other such life scientists co-authoring with surgeons will come around and try to come up with justifications for such practices post-facto, making it seem as if they knew this would be the outcome all along.

The point of the scientific method is to make a priori predictions that can be tested, not to "explain" outcomes post-facto. This is especially egregious in economics, where theories have no predictive power, and instead are used to "explain" phenomena like financial crises or business cycles after the fact. The lack of epistemic humility in scenarios like these can have real consequences, like misguided government monetary policy based on the thinking of said economists, leading to economic hardship for millions of people. Of course theory is a useful tool -- I am a theorist-in-training, after all -- but its soul is found when it's used to study conundrums and make a priori predictions that help practitioners, not to belittle them or hide under the pretense of knowing why everything that works a certain way does so.