I'm going to conflate mathematics with statistics, as Carlo Beenakker did. The earliest application that I know of is decision trees, which were invented by Breiman et al. to analyze the following question: among people who have had a heart attack, which patients are most likely to have another one? I also believe that Ed Frenkel, in his autobiography, claimed to have developed a similar methodology to explain to doctors how to triage. A really beautiful application is James-Stein estimation, which deals with the inadmissibility of the MLE for estimating the (true) means of many variables. The short story is that if you have 3 or more series of observations (4 or more if you shrink toward the grand mean rather than a fixed point), then the individual sample mean of each series is not the best estimate of the real means under total squared-error loss. This was applied by Efron (mentioned in various books and papers, including "Large-Scale Inference") to the question of deciding which genes (via gene expression) were likely to be influential in causing a specific cancer.
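The James-Stein claim can be checked numerically. What follows is only a minimal sketch under simplifying assumptions that are mine, not Efron's: each series is summarized by a single normal observation with known unit variance, the estimate shrinks toward the origin, and the positive-part variant is used. The names `james_stein` and `compare_risk` are hypothetical helpers made up for this illustration.

```python
import numpy as np


def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein estimate, shrinking toward the origin.

    Assumes x[..., i] ~ N(theta_i, sigma2) independently, with sigma2 known.
    """
    x = np.asarray(x, dtype=float)
    p = x.shape[-1]
    if p < 3:
        # Shrinkage toward a fixed point only beats the MLE for 3+ means.
        raise ValueError("James-Stein shrinkage needs at least 3 means")
    norm2 = np.sum(x ** 2, axis=-1, keepdims=True)
    # Shrinkage factor, clipped at zero (the "positive-part" variant).
    shrink = np.maximum(1.0 - (p - 2) * sigma2 / norm2, 0.0)
    return shrink * x


def compare_risk(theta, n_trials=20000, seed=0):
    """Monte Carlo estimate of total squared-error risk: (MLE, James-Stein)."""
    theta = np.asarray(theta, dtype=float)
    rng = np.random.default_rng(seed)
    # One noisy observation per mean, repeated over n_trials simulated datasets.
    x = rng.normal(loc=theta, scale=1.0, size=(n_trials, theta.size))
    mle_loss = np.sum((x - theta) ** 2, axis=1)  # the MLE is x itself
    js_loss = np.sum((james_stein(x) - theta) ** 2, axis=1)
    return mle_loss.mean(), js_loss.mean()


if __name__ == "__main__":
    mle_risk, js_risk = compare_risk(np.linspace(-1.0, 1.0, 10))
    print(f"MLE risk: {mle_risk:.2f}, James-Stein risk: {js_risk:.2f}")
```

The gain is largest when the true means sit close to the shrinkage target and fades as they move far from it, but in this setting the James-Stein risk never exceeds that of the MLE.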