
Forecasting Model Innovations to Power the AI Transformation
Alberto Gutierrez Chief Data Scientist and AI Solutions Architect | February 17, 2021

Driven by the digital transformation of business processes, time-series forecasting models are experiencing a significant wave of innovation. Though forecasting models are widely used for business planning across all industry sectors, the process is often a small-scale manual exercise. Until recently, business forecasting was performed by specialized forecasting experts through human-in-the-loop analysis, using inflexible or hard-to-tune software models capable of handling one or a few simultaneous (multivariate) time-series. However, the automation of data-driven processes through AI transformation, accelerated by the COVID-19 pandemic, places significant new demands on forecasting methods.

In contrast to classical methods, today's forecasting scenarios require models with easy-to-understand parameters, usability by domain and business experts (not necessarily forecasting experts), automation without a human in the loop, and scalability to thousands or millions of simultaneous time-series.

Because the latest forecasting methods are very new, there is a lack of shared understanding across the business community, data scientists, and classical forecasting experts. This article answers several essential questions needed to align understanding between stakeholders and overcome these new challenges: How is time-series forecasting unique within predictive analytics? Beyond yesterday's small-scale applications, what are examples of complex forecasting use cases that businesses face today? What are the significant model innovations that bring forecasting methods into the age of AI? What are the models' relative merits, and where is it best to apply each model?

What is Time-Series Forecasting?

Time-series forecasting methods are unique and differ from non-time-series predictive analytics. A common yet insightful question is, "What is the difference between predictive analytics and forecasting?" In summary, time-series forecasting, "forecasting" for short, is a sub-discipline of prediction. The key difference is that forecasting considers the temporal dimension. A typical forecast has the form $\hat{y}_{t+1} = f(y_t, y_{t-1}, \ldots, y_{t-p})$, where $p$ is the number of past observations used; that is, the forecast of future values depends on past observations of the same variable. Such a process is known as an autoregressive process. Time-series forecasting models are therefore designed to exploit these autoregressive characteristics, which leads to a body of theory and solutions distinct from non-time-series predictive analytics. Exploiting the time-dependent characteristics of the problem often yields more accurate solutions.
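To make the idea of an autoregressive process concrete, the sketch below fits the simplest case, a first-order model y[t] = c + phi * y[t-1], by ordinary least squares and rolls it forward. The function names and the synthetic series are assumptions made for illustration; they do not come from any of the models discussed in this article.

```python
def fit_ar1(series):
    """Least-squares estimate of (c, phi) in y[t] = c + phi * y[t-1]."""
    x = series[:-1]   # past observations
    y = series[1:]    # the values they are used to predict
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series, c, phi, steps):
    """Roll the fitted recursion forward, feeding each forecast back in."""
    last = series[-1]
    out = []
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


# Synthetic, noise-free series generated by y[t] = 2 + 0.5 * y[t-1].
history = [10.0]
for _ in range(9):
    history.append(2.0 + 0.5 * history[-1])

c, phi = fit_ar1(history)
print(round(c, 6), round(phi, 6))
print(forecast_ar1(history, c, phi, steps=3))
```

Because the synthetic series contains no noise, the fit recovers the generating coefficients; on real data the estimates would only approximate the underlying process.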

Medium- to Large-Scale Applications of Time-Series Forecasting

Below is a non-exhaustive list of medium- to large-scale time-series forecasting problems. These use cases go beyond the often typical list of small-scale problems discussed with forecasting. These examples require automation, without a human in the loop, and are not satisfied by spreadsheet models. The use cases are taken from a review of the literature (referenced below). They are real use cases that organizations are facing and the types of use cases driving forecast innovations:

  • Forecasting the sales demand for retail or grocery stores by sales channel (online, store, distributor), product, city, state, and country
  • Supply chain forecasting of product availability based on suppliers, assembly, and potentially widespread and complex geographic logistic processes
  • Forecasting crop yield based on multivariate time-series variables, such as rainfall, temperature, etc.
  • Forecasting utilization demand on data center servers
  • Multivariate forecasting of vegetation growth and wildfires affecting electrical utilities
  • Repair center monthly parts demand for automobile or airplane parts manufacturing, including seasonal and economic factors
  • Forecasting simultaneous individual and aggregate demand for thousands or millions of electrical customers
  • Forecasting traffic congestion on the traffic lanes for city roads and highways
  • Global daily sales demand for millions of products such as for Amazon e-commerce sales
  • Daily and hourly demand for call support centers based on trends, seasons, holidays, and exogenous variables

Model Innovations for Scalable Forecasting

In recent years, and as recently as the past few months, significant innovations in modeling algorithms have become available. They include:

  • Enhancements to ARIMA models
  • A significant new model, Prophet, for ease of use by non-forecasting experts and up to problems of medium-scale complexity
  • Predictive analytics models, such as Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boost (XGB), for handling medium- to large-scale problems
  • Two new deep-learning models, DeepAR and NeuralProphet, that handle large- to very-large-scale scenarios

Below is a summary of each of these modeling paradigms' salient strengths and weaknesses:

ARIMA Models

Forecasting experts often employ classical models such as ARIMA. Traditionally, the practical application of ARIMA models has required advanced statistical expertise to analyze and configure. For example, configuration includes setting the lag order, the degree of differencing, and the moving-average window, which in turn requires analyzing autocorrelations and partial autocorrelations. Innovations to the ARIMA models improve ease of use and accuracy through automatic discovery of the model parameters. Popular open-source implementations are readily available: the "auto.arima" function in the R "forecast" package and the Python "pmdarima" package. However, these classical models are appropriate only for time-series processes of limited complexity in terms of seasonality and the number of multivariate time-series. For small-scale problems, they can be significantly more efficient and effective than other models.
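Two of the manual steps mentioned above, differencing and inspecting autocorrelations, can be sketched in a few lines. This is a minimal illustration on synthetic data using hypothetical helper names; it is not how auto.arima or pmdarima select parameters internally.

```python
def difference(series, d=1):
    """Apply first differencing d times (the 'd' in ARIMA(p, d, q))."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return cov / var


# Synthetic series: linear trend plus a pattern that repeats every 4 steps.
pattern = [3.0, -1.0, -3.0, 1.0]
raw = [0.5 * t + pattern[t % 4] for t in range(40)]

# One round of differencing removes the linear trend.
stationary = difference(raw, d=1)
for lag in range(1, 5):
    print(lag, round(autocorrelation(stationary, lag), 3))
```

On this series the autocorrelation is strongly positive at lag 4 and strongly negative at lag 2, which is the kind of signature an analyst would read as a period-4 seasonal pattern.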

Predictive Analytics Models

Data scientists schooled in machine-learning-backed predictive analytics methods often apply regression models to the forecast problem. These models treat the forecast variable as the dependent variable and the past observations plus covariates as the independent variables (ML feature variables). This approach brings many potential models to bear on the problem, such as SVM (support vector machine), ensemble tree-based models (e.g., Random Forest), and boosted tree models (e.g., XGBoost). These models can potentially handle complex relationships for medium- to large-scale data volumes with a medium to large number of covariates. However, these methods do not explicitly exploit the time-series characteristics of trend and seasonality. Understanding, deploying, and automating these models takes significant data-science and software expertise.
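The regression framing described above amounts to reshaping the series into rows of lagged observations plus covariates. The sketch below shows one way to do that; the function name, the sales figures, and the promotion flag are all hypothetical, and the resulting rows could be passed to any regression model.

```python
def make_supervised(series, covariates, n_lags):
    """Turn a series into (features, target) rows for a regression model.

    Each row holds the previous n_lags observations followed by the
    covariate values known for the step being predicted.
    """
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        lagged = series[t - n_lags:t]          # y[t-n_lags] ... y[t-1]
        rows.append(lagged + covariates[t])    # append covariates for step t
        targets.append(series[t])
    return rows, targets


# Hypothetical daily sales with one covariate: a promotion flag (0 or 1).
sales = [12.0, 15.0, 14.0, 20.0, 13.0, 16.0]
promo = [[0], [0], [0], [1], [0], [0]]

X, y = make_supervised(sales, promo, n_lags=2)
for features, target in zip(X, y):
    print(features, "->", target)
```

Note that nothing in these rows tells the model that the columns are ordered in time, which is the sense in which this approach does not explicitly exploit time-series structure.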

Prophet

Facebook open-sourced Prophet in 2017. The model is a radical departure from classical methods; it works out of the box with minimal configuration and includes parameters with straightforward human interpretation. The goal is to make forecasting accessible to business domain experts, who are not necessarily forecasting experts. Such ease of use allows domain experts to easily configure and improve models based on domain-level heuristics and best practices. Under the hood, the model works based on an additive regression model, where it decomposes the time-series into trend, seasonality, and holiday components. After independent regression for each component, the sum of the model components forms the forecast. The model learns complex seasonality behavior; the trend methods include a capacity-limited logistic growth model or linear trends with change points and automatic change point selection. Application scenarios include human-in-the-loop analysis and also operation as an automated data process. The model is suitable for small- to medium-sized forecast problems with potentially tens of covariates and complex seasonality.
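The additive structure described above can be sketched as a sum of three functions of time. This is an illustration of the decomposition idea only, not Prophet's implementation: Prophet estimates its components from data, whereas every parameter value here (capacity, growth rate, midpoint, amplitude, holiday bump) is a hand-picked assumption.

```python
import math


def logistic_trend(t, capacity, growth_rate, midpoint):
    """Capacity-limited growth: approaches `capacity` as t increases."""
    return capacity / (1.0 + math.exp(-growth_rate * (t - midpoint)))


def weekly_seasonality(t, amplitude):
    """One sine wave per 7-day week; t is measured in days."""
    return amplitude * math.sin(2.0 * math.pi * t / 7.0)


def holiday_effect(t, holidays, bump):
    """Add a fixed bump on days listed as holidays."""
    return bump if t in holidays else 0.0


def additive_forecast(t, holidays):
    """Forecast = trend + seasonality + holiday effect."""
    return (
        logistic_trend(t, capacity=1000.0, growth_rate=0.1, midpoint=50.0)
        + weekly_seasonality(t, amplitude=20.0)
        + holiday_effect(t, holidays, bump=150.0)
    )


holidays = {60}  # hypothetical holiday on day 60
for day in (59, 60, 61):
    print(day, round(additive_forecast(day, holidays), 1))
```

Each parameter has a direct business reading (a market ceiling, a weekly swing, a holiday uplift), which is the kind of interpretability the paragraph above attributes to the model.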

DeepAR and DeepAR+

Perhaps the most significant forecasting innovations are deep-learning models that make forecasting available to large-scale, complex, big-data use cases. These innovations started with LSTM (Long Short-Term Memory) RNN (Recurrent Neural Network) approaches. However, until recently, the industry had not converged on a best-practice LSTM-based forecasting architecture. With the introduction of DeepAR by AWS in April 2017, the industry now has a general LSTM RNN architecture for time-series forecasting. DeepAR offers unique advantages, such as multivariate forecasts with multivariate inputs and scalability to thousands of covariates. The DeepAR model was benchmarked on realistic big-data scenarios and achieved approximately 15% improved accuracy relative to prior state-of-the-art methods. For example, benchmark use cases include automobile parts demand for 1046 aligned time-series, hourly energy demand for 370 customers, traffic lane congestion for San Francisco Bay Area highways, and Amazon sales demand. DeepAR is available as open source in the PyTorch and TensorFlow AI frameworks and as a service in Amazon SageMaker. The DeepAR+ model, offered by the AWS forecasting service, produces a univariate forecast with multivariate inputs. A significant advantage of the deep-learning methods is that even if one series in a group has little data, the model can apply the learning from similar series to improve the forecast.
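The last point, learning across related series, can be illustrated without any neural network. The sketch below is not DeepAR; it is a deliberately tiny stand-in that fits a single shared coefficient over a group of series, then uses it to forecast an item with only one observation. All names and data are hypothetical.

```python
def scale(series):
    """Divide by the series mean so large and small series weigh equally."""
    mean = sum(series) / len(series)
    return [v / mean for v in series]


def fit_shared_ar1(group):
    """Fit one coefficient phi for y[t] = phi * y[t-1] over all series."""
    num = den = 0.0
    for series in group:
        scaled = scale(series)
        for prev, cur in zip(scaled, scaled[1:]):
            num += prev * cur
            den += prev * prev
    return num / den


def forecast_next(series, phi):
    """One-step forecast for a single series using the shared coefficient."""
    return phi * series[-1]


# Two established products growing 5% per period, at very different volumes.
product_a = [100.0 * 1.05 ** t for t in range(12)]
product_b = [2000.0 * 1.05 ** t for t in range(12)]
# A new product with a single observation: too little data to fit alone.
new_product = [40.0]

phi = fit_shared_ar1([product_a, product_b, new_product])
print(round(phi, 4), round(forecast_next(new_product, phi), 2))
```

The new product contributes no training pairs, yet it still receives a forecast because the growth pattern was learned from its neighbors. DeepAR applies the same principle with a far more expressive model.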

NeuralProphet

Next, a collaboration between Facebook and Stanford University introduced the "NeuralProphet" model in November 2020. "NeuralProphet" is an open-source model built on top of the PyTorch AI framework. The architecture is inspired by the popular Prophet model and by AR-Net, a non-LSTM, feed-forward, autoregressive deep-learning neural network. NeuralProphet is currently in beta release and, at this time, offers a univariate forecast with multivariate inputs. The architecture has not been tested on large-scale multivariate scenarios and, because of the univariate forecast limitation, is significantly less scalable than DeepAR. Because NeuralProphet is not recurrent, it is likely to train significantly faster than recurrent models while potentially achieving similar forecast accuracy. Unlike the predictive analytics methods, however, it does exploit the autoregressive and temporal nature of the forecasting domain and can potentially provide better accuracy.

Where to Apply the Models

Figure 1. Where to apply forecasting models (label: Model complexity).

With all these options, businesses are challenged to choose the most effective model and technology for their forecasting applications. For example, countless technical publications argue the merits of one model over another, reporting better performance for ARIMA methods over Prophet, DeepAR, and ML algorithms, or vice versa. There is no one best model for all scenarios. For human-in-the-loop analysis, analysts often pick a model based on ease of use and its ability to solve the problem, and need not worry too much about optimizing model efficiency. However, as automation and problem complexity scale upwards, picking an effective and efficient model becomes a key concern. Ultimately, considerations based on the time-series data complexity, computational load, and forecasting accuracy will determine the model selection.
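Since forecasting accuracy is one of the deciding factors, candidates are usually compared on data held out from training. The sketch below shows a minimal holdout comparison; the two baseline forecasters and the synthetic series are assumptions chosen for illustration, and any model with the same call signature could be plugged in alongside them.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive_forecast(train, horizon, period=7):
    """Repeat the value observed one season earlier."""
    return [train[-period + (h % period)] for h in range(horizon)]


def backtest(series, horizon, candidates):
    """Hold out the last `horizon` points and score each candidate on them."""
    train, test = series[:-horizon], series[-horizon:]
    return {name: mean_absolute_error(test, model(train, horizon))
            for name, model in candidates.items()}


# Synthetic daily series with a weekly pattern and a slow upward trend.
weekly = [5.0, 6.0, 7.0, 8.0, 9.0, 3.0, 2.0]
series = [weekly[t % 7] + 0.1 * t for t in range(56)]

scores = backtest(series, horizon=14, candidates={
    "naive": naive_forecast,
    "seasonal_naive": seasonal_naive_forecast,
})
best = min(scores, key=scores.get)
print(scores, best)
```

A single holdout window is the simplest possible design; in an automated setting, the same comparison would typically be repeated over several windows and weighed against training cost.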

Based on the review of model characteristics, Figure 1 provides simplified guidance for applying the models. For example, for low-complexity time-series with a few covariates and simple trends and seasonality, ARIMA-based models are likely to be the most efficient and provide good accuracy. The Prophet model handles low- to medium-complexity cases, including tens of covariates, with complex trend and seasonality. For medium- to large-scale cases, including tens to hundreds of covariates with complex trend and seasonality, ML models such as Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boost (XGB) can solve the problem. However, ML predictive models do not explicitly exploit the temporal nature of the data. Therefore, to confirm that they deliver the best achievable accuracy, they should be benchmarked against deep-learning models or Prophet.

Two deep-learning models have recently become popular: DeepAR and NeuralProphet. DeepAR offers significant and unique advantages: it produces multivariate forecasts from multivariate inputs, learns across covariates to improve forecasts, and scales to thousands and potentially millions of covariates. NeuralProphet offers a time-series autoregressive forecasting method for medium- to large-scale scenarios. Training time is a significant factor for recurrent models such as DeepAR; NeuralProphet, which is not recurrent, is likely to train faster. The deep-learning models are significantly more complicated to set up and run than the previous models. Consequently, they are usually not the best choice for medium-scale problems. However, for large-scale problems, deep-learning models should be considered alongside ML models. It is worth emphasizing that there is no one best model; empirical performance studies, along with expertise in model configuration, will determine the best model for a given application.
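The guidance in this section can be read as a rough decision rule. The sketch below encodes it as a function that returns a starting point for evaluation, not a final answer. The numeric cutoffs (10, 100, 1000 covariates) are assumptions standing in for "a few", "tens", and "hundreds"; the article itself gives no exact thresholds.

```python
def suggest_model_family(n_covariates, complex_seasonality=False):
    """Return a starting-point model family; cutoffs are illustrative only."""
    if n_covariates < 10 and not complex_seasonality:
        # Low complexity: few covariates, simple trend and seasonality.
        return "ARIMA"
    if n_covariates < 100:
        # Low to medium complexity: tens of covariates, complex seasonality.
        return "Prophet"
    if n_covariates < 1000:
        # Medium to large scale: benchmark against temporal models as well.
        return "ML regression (SVM, RF, XGB); compare with deep learning or Prophet"
    # Large to very large scale.
    return "Deep learning (DeepAR, NeuralProphet)"


for n, seasonal in [(3, False), (3, True), (40, True), (500, True), (5000, True)]:
    print(n, seasonal, "->", suggest_model_family(n, seasonal))
```

Whatever family the rule suggests, the choice should still be confirmed by an empirical comparison on the actual data.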

Summary and Conclusion

Forecasting is an indispensable tool in the business planning process and a unique problem in which optimized solutions exploit the temporal nature of the data. With the acceleration of AI transformation, forecasting models and AI frameworks enable new use cases beyond the small-scale, spreadsheet-oriented analysis of the past. This article presented a set of use cases drawn from research and customer experience. These realistic use cases are driving new model innovation: they require model usability by non-forecast experts, automation and integration into business processes, and models that scale up to thousands or millions of simultaneous time-series.

This recent wave of innovations has improved models and offers a suite of solutions to handle small-, medium-, large-, and very-large-scale forecasting scenarios. Classical models such as ARIMA are capable of handling time-series of low complexity. For low- to medium-scale problems, Prophet is a powerful model and offers ease of use by non-forecast experts. Data scientists have frequently employed ML engineering and ML methods, such as Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGB), for large-scale problems with potentially hundreds of time-series.

The new deep-learning models offer several advantages. NeuralProphet offers a time-series autoregressive forecasting method for medium to large-scale scenarios, while DeepAR offers improvements that exploit time-series characteristics for very-large-scale forecasting scenarios. Ultimately, the best model for the specific application will result from an empirical evaluation that considers forecasting accuracy and the solution's complexity.

This post provides an overview of the significant advancements in forecasting methods, real use cases, and guidance for where to best apply each type of model. With this summary, the key stakeholders (business planner, data scientist, and forecast expert) are better aligned to integrate intelligence from advanced forecasting methods into business operations.

HCL's Next.ai is your partner for accelerating the AI journey and enhancing business value from scalable business forecasting. As organizations proceed on their AI transformation, they are faced with numerous challenges, such as modernizing data infrastructure, improving data quality, hard-to-find AI modeling expertise, and scalable ML engineering. Next.ai empowers the AI journey of customers with the experience to build solutions such as support center workforce forecasting, sales demand forecasting, data center capacity forecasting, and supply chain forecasting, including that of inventory. As data sets grow and scale, Next.ai offers a full stack of solutions based on AI modeling and ML engineering, including optimization of models at edge, analytics platform engineering, and AI analytics process automation.

For more information about forecasting and a full stack of AI and analytics services, consult the Next.ai web page at https://www.hcltech.com/next-ai and contact us at Next.ai@hcl.com.


References and Further Reading

  1. R. J. Hyndman and G. Athanasopoulos. Forecasting: Principles and Practice. OTexts, 2nd edition. https://otexts.com/fpp2/.
  2. S. J. Taylor and B. Letham. Prophet: Forecasting at Scale. 2017. https://doi.org/10.7287/peerj.preprints.3190v2l.
  3. David Salinas, Valentin Flunkert, and Jan Gasthaus. DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks. February 2019. https://arxiv.org/pdf/1704.04110.pdf, arXiv:1704.04110.
  4. Github. Neural Prophet. November 2020. https://github.com/ourownstory/neural_prophet