Inaccurate Algorithms Undermine ARIMA Models in Popular Software

By Lisa Wong

Recent research finds pervasive flaws in widely used software packages for ARIMA (AutoRegressive Integrated Moving Average) models, flaws that can produce unreliable forecasts. The problem is not an occasional glitch but a systematic weakness in the algorithms these packages use. Jesse Wheeler, an assistant professor in the mathematics and statistics department at Idaho State University, worked with Edward Ionides, a professor of statistics at the University of Michigan, to identify troubling gaps in the parameter estimation procedures for these models. Their findings, published in PLOS One in a paper titled “Revisiting inference for ARIMA models: Improved fits and superior confidence intervals,” highlight a problem that could mislead researchers and industry professionals who rely on these models for accurate predictions.

ARIMA models are often the first time series models that students encounter, which makes what Wheeler and Ionides found all the more surprising. They discovered that the estimation algorithms in two widely used software environments can return incorrect parameter estimates for up to 60% of realizations, depending on the data and model at hand. The algorithms are meant to maximize the model likelihood, but they do not always reach the maximum. As a result, they produce poor estimates that can significantly degrade forecast accuracy and other statistical inference.

Wheeler emphasized the gravity of the situation, stating, “If the software estimating these models has flaws, it can potentially lead to unexpected results or misguided decisions.” He further noted, “Most practitioners don’t even realize the issue exists. We found that the software’s maximum likelihood estimates were not fully optimized, leading to unreliable parameter estimates.”

Wheeler and Ionides traced the issue to a specific optimization problem. This defect resulted in unreliable parameter estimates, which undermined the overall validity of the models. They compare the flaw to a faulty calculator: “This is like having a calculator that claims to add two plus two correctly, but sometimes it returns an incorrect answer, like two plus two equals three.”
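To make the idea of a likelihood that is "not fully optimized" concrete, here is a minimal Python sketch. It is not the authors' algorithm, and it does not reproduce the behavior of any particular software package; the article does not describe either in detail. It assumes a toy ARMA(1,1) model, a conditional sum-of-squares approximation to the Gaussian likelihood, and a deliberately simple local optimizer. All function names are hypothetical. The sketch compares the likelihood reached from one default starting point with the best likelihood reached from several random starting points. If the restarts ever find a higher likelihood, the single-start fit was not the maximum.

```python
import math
import random


def simulate_arma11(n, phi, theta, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t + theta * e_{t-1} with unit-variance noise."""
    rng = random.Random(seed)
    x, prev_x, prev_e = [], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        cur = phi * prev_x + e + theta * prev_e
        x.append(cur)
        prev_x, prev_e = cur, e
    return x


def css_loglik(params, x):
    """Conditional sum-of-squares Gaussian log-likelihood (noise variance profiled out)."""
    phi, theta = params
    if abs(phi) >= 1.0 or abs(theta) >= 1.0:
        return -math.inf  # outside the stationary/invertible region
    sse, prev_x, prev_e = 0.0, 0.0, 0.0
    for cur in x:
        e = cur - phi * prev_x - theta * prev_e  # one-step-ahead residual
        sse += e * e
        prev_x, prev_e = cur, e
    n = len(x)
    return -0.5 * n * (math.log(2.0 * math.pi * sse / n) + 1.0)


def local_search(x, start, step=0.1, tol=1e-6):
    """Compass search: try +/- step in each coordinate, halve the step when stuck."""
    best, best_ll = start, css_loglik(start, x)
    while step > tol:
        improved = False
        for dphi, dtheta in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = (best[0] + dphi, best[1] + dtheta)
            ll = css_loglik(cand, x)
            if ll > best_ll:
                best, best_ll, improved = cand, ll, True
        if not improved:
            step /= 2.0
    return best, best_ll


def multi_start(x, n_starts=8, seed=1):
    """Return (fit from the default start, best fit over default plus random starts)."""
    rng = random.Random(seed)
    starts = [(0.0, 0.0)] + [
        (rng.uniform(-0.9, 0.9), rng.uniform(-0.9, 0.9)) for _ in range(n_starts)
    ]
    fits = [local_search(x, s) for s in starts]
    return fits[0], max(fits, key=lambda fit: fit[1])


if __name__ == "__main__":
    # Assumed toy setting: AR and MA coefficients that nearly cancel.
    series = simulate_arma11(200, phi=0.5, theta=-0.4)
    single, best = multi_start(series)
    print("single start  (params, loglik):", single)
    print("best restart  (params, loglik):", best)
```

Because the default start is included among the candidates, the multi-start result can never have a lower likelihood than the single-start result. Whether it is strictly higher depends on the data and the model, which is consistent with the article's point that the failure rate varies from case to case.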

The implications of these errors are significant. ARIMA models are widely employed across various fields, including economics, health care, and weather forecasting, making accurate parameter estimation critical for effective decision-making. Wheeler highlighted this widespread use: “ARIMA models are used every day by researchers and industry professionals for forecasting and scientific analysis across many fields.”

Concerned by these findings, Wheeler and Ionides have responded with a proposed fix. They developed a new algorithm designed specifically to correct the optimization shortfall in maximum likelihood estimation, and they report that it outperforms the existing approach, particularly in the R programming environment. That makes it a more reliable option for practitioners.