Neural Prophet vs. Prophet: Why Neural Prophet Is So Accurate
Introduction
Let us first look back at its predecessor, Prophet.
I love Facebook Prophet: it works well out of the box, and it is a white box that is easy to interpret. I would describe it as the performance summit of classic "models".
That said, "model" is not quite the right word to position Prophet, because its biggest innovation is not the model itself but packaging everything into a handy tool.
Prophet is based on the idea of signal decomposition: it decomposes a time series into a trend component, a seasonal component, a time-events component, and external regressors.
- The trend component is piecewise linear. Compared with the purely linear trend of traditional signal decomposition this is an innovation, but also a pitfall: if the trend has too many changepoints, it can easily conflict with the seasonal component. In my experience, this is the part most prone to overfitting.
- For the seasonal component, the number of Fourier terms can be adjusted for each specified frequency (period), another innovation that improves accuracy.
- Discrete time events and external regressors let users add external covariates. They enter only linearly, but that is better than nothing.
In short, these individually modest improvements add up to a handy tool.
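To make the decomposition idea concrete, here is a toy sketch in plain Python: a piecewise-linear trend plus a truncated Fourier seasonality, assembled additively. All names and numbers are illustrative, not Prophet's API:

```python
import math

def trend(t, changepoint=50, slope1=0.1, slope2=-0.05):
    # Piecewise-linear trend: the slope changes at the changepoint.
    if t < changepoint:
        return slope1 * t
    return slope1 * changepoint + slope2 * (t - changepoint)

def seasonality(t, period=24, order=3):
    # Truncated Fourier series; `order` plays the role of
    # Prophet's adjustable number of Fourier terms.
    return sum(
        0.5 / k * math.sin(2 * math.pi * k * t / period)
        + 0.3 / k * math.cos(2 * math.pi * k * t / period)
        for k in range(1, order + 1)
    )

# A Prophet-style signal: trend + seasonality (events and external
# regressors would be added the same way, omitted here).
series = [trend(t) + seasonality(t) for t in range(100)]
```

Raising `order` makes the seasonal shape more flexible, which is exactly the accuracy knob the bullet above describes.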
The Core of Neural Prophet
This post focuses on Neural Prophet and explains what makes it better than Prophet.
Neural Prophet is an upgraded version of Prophet, but we can treat it as a brand-new tool. Although the official slides highlight many changes, they boil down to two dramatic ones.
The first dramatic change is that AR (autoregression) is added as a component in Neural Prophet, which is a heavyweight addition.
- The AR model holds a central place in classic time series analysis, and bringing it in as an auxiliary component gives Neural Prophet a huge accuracy boost.
- Compared with the traditional AR model, Neural Prophet fits the AR component with a neural network, which is faster and more accurate.
- Regularization is a very useful trick, and evidently the Facebook team is willing to use it everywhere: a regularization term is applied to the AR coefficients as well (the ar_sparsity parameter).
- With regularization, your AR model can see farther: you can feed it more historical lags without worrying about fitting time.
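To see what "AR" means here, a minimal sketch of a traditional AR(1) model in plain Python: the next value is predicted from the previous one, and the coefficient is fitted by least squares. This is illustrative only; Neural Prophet fits many lags at once with a network, and the function names below are mine:

```python
def fit_ar1(y):
    # Least-squares estimate of phi in y[t] ~ phi * y[t-1]
    # (a zero-mean series is assumed for simplicity).
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def forecast_ar1(y, phi, steps):
    # Roll the model forward: each prediction feeds the next one.
    out, last = [], y[-1]
    for _ in range(steps):
        last = phi * last
        out.append(last)
    return out

y = [1.0, 0.8, 0.64, 0.512, 0.4096]  # an exact AR(1) series with phi = 0.8
phi = fit_ar1(y)
preds = forecast_ar1(y, phi, 2)
```

With sparsity regularization (the role of ar_sparsity), most of the coefficients in a long-lag version of this model would be pushed toward zero, keeping only the informative lags.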
The second big change is using PyTorch as the backend.
- Neural Prophet ports all components to PyTorch, which makes the code genuinely open and hackable, and PyTorch is very popular now.
- The trend and seasonal components do not change much; their algorithms remain consistent with Prophet.
- The AR and external-regressor components change greatly. With PyTorch as the backend, you are no longer limited to linear regression.
- From the source code, Neural Prophet uses ReLU activations between layers.
- Moreover, a neural network is easy to restructure: if the result does not look right, changing the number of layers may fix it.
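The idea behind the neural AR component can be sketched as a tiny feed-forward pass in plain Python: lagged values go through a ReLU hidden layer into a forecast. The weights and shapes below are made up purely for illustration; this is not Neural Prophet's actual architecture or API:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def dense(v, weights, bias):
    # One fully connected layer: out[j] = sum_i v[i] * W[i][j] + b[j]
    return [
        sum(v[i] * weights[i][j] for i in range(len(v))) + bias[j]
        for j in range(len(bias))
    ]

def ar_net_forward(lags, w1, b1, w2, b2):
    # AR-Net-style pass: lagged inputs -> hidden ReLU layer -> forecast.
    hidden = relu(dense(lags, w1, b1))
    return dense(hidden, w2, b2)

# Toy shapes: 3 lags -> 2 hidden units -> 1-step forecast (made-up weights).
lags = [0.5, 0.2, -0.1]
w1 = [[1.0, -1.0], [0.5, 0.5], [0.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0], [1.0]]
b2 = [0.1]
yhat = ar_net_forward(lags, w1, b1, w2, b2)
```

Because the hidden layer is nonlinear, this kind of model can capture lag interactions that a purely linear AR regression cannot.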
These two changes are already very substantial, but I am still waiting for a bigger feature: the global model.
A global model is a single model fitted to multiple time series, such as predicting the sales of the iPhone 11, 12, and 13, and even the unreleased iPhone 18, at the same time (the cold-start problem).
We will cover the global model in a later post.
Get started
Here I pick a classic dataset to walk through Neural Prophet step by step, especially to show the effect of the core AR component.
The dataset contains temperature readings with a time resolution of 5 minutes.
First, let us import the modules and the dataset, just as we would with Prophet.
The requirements for the data format are also the same: the frame must contain two columns, ds and y, representing the time information and the target variable, respectively.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from neuralprophet import NeuralProphet

# Load the Yosemite temperature dataset (5-minute resolution).
data_location = "https://raw.githubusercontent.com/ourownstory/neuralprophet-data/main/datasets/"
df = pd.read_csv(data_location + "yosemite_temps.csv")
df['ds'] = pd.to_datetime(df['ds'])

# Plot the raw series.
plt.figure(figsize=(20, 16))
sns.lineplot(data=df.set_index('ds'))
plt.savefig('raw_y.png', format='png')
plt.show()
Use Prophet Only
We don't need to install Prophet at all: with the AR component disabled (its default), Neural Prophet effectively downgrades to Prophet.
Here we split the dataset into 80% training and 20% validation so that we can detect overfitting.
As you can see in the code below, the coding style is inherited from Prophet.
Since Neural Prophet has a neural-network training loop, a plot_live_loss option is conveniently built in.
# A default NeuralProphet (no AR lags) behaves essentially like Prophet.
m_baseline = NeuralProphet()
df_train, df_test = m_baseline.split_df(df, freq='5min', valid_p=0.20)
plt.figure(figsize=(20, 16))
metrics = m_baseline.fit(df_train, freq='5min', validation_df=df_test, plot_live_loss=True)
Validate Prophet Forecast
The MAE on the training set is about 3.75, but MAE_val rises to 28+, which is typical overfitting. We can compare the real values with the predictions in a plot to investigate further.
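For reference, the MAE reported in the metrics is just the mean absolute error of the residuals; a minimal sketch in plain Python (the helper name is mine):

```python
def mae(actual, predicted):
    # Mean absolute error: the average magnitude of the residuals.
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

gap = mae([10, 12, 11], [9, 12, 13])  # residuals are 1, 0, 2
```

A large gap between the training MAE and the validation MAE, as seen here (3.75 vs. 28+), is the classic signature of overfitting.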
import plotly.graph_objects as go

forecast = m_baseline.predict(df)
fig = go.Figure()
fig.add_trace(go.Scatter(mode='markers', x=forecast['ds'], y=forecast['y'], name='y', opacity=0.5, marker=dict(size=2)))
fig.add_trace(go.Scatter(mode='lines', x=forecast['ds'], y=forecast['yhat1'], name='y_hat'))
fig.show()
The dark-themed scatter plot of the Prophet family is hard on the eyes, so here I use Plotly to plot the result. The issue is obvious in the plot: beyond the training data, the forecast shoots toward the sky (it increases all the way).
This is because the trend component is fitted in a piecewise-linear way. Within the training dataset, (Neural) Prophet can fit the data accurately by adjusting the changepoints. But on the test dataset the model is lost: without any information about new changepoints, the trend component can only extend its last slope all the way.
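A toy sketch of why a piecewise-linear trend extrapolates badly: past the last changepoint, the final slope is simply extended forever. Names and numbers below are made up for illustration:

```python
def piecewise_trend(t, breaks, slopes):
    # slopes[i] applies between breaks[i] and breaks[i+1];
    # slopes[-1] is extended without bound past the last break,
    # because no new changepoint can be learned outside the data.
    y = 0.0
    for i, slope in enumerate(slopes):
        start = breaks[i]
        end = breaks[i + 1] if i + 1 < len(breaks) else float("inf")
        if t <= start:
            break
        y += slope * (min(t, end) - start)
    return y

# Training range ends at t=80 with a steep final slope; forecasts
# beyond 80 keep climbing linearly with no limit.
breaks, slopes = [0, 40, 80], [0.1, -0.1, 0.5]
```

Evaluating this at t = 100 or t = 120 shows the trend growing without bound, which is exactly the runaway tail seen in the plot.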
We can check plot_parameters as well; the first panel is the trend component. Note its magnitude (30+), and then look at the tail. Now you get it: the trend is the dominant component, and its misleading tail is what makes the forecast worse on the test dataset.
Enable Neural Prophet AR model
In my opinion, the AR component is designed to eliminate exactly this failure mode and make the model more robust. AR uses recent history to predict future data; intuitively speaking, it acts as a baseline. With such a baseline, are you still worried about how bad the forecast can be?
The resolution of this dataset is 5 minutes, and the temperature follows a daily cycle, so we use one day of data as the autoregressive window.
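Conceptually, an AR lag window turns the series into sliding (history, next value) pairs. A minimal sketch in plain Python, with made-up helper names (this is not Neural Prophet's internal API):

```python
SAMPLES_PER_HOUR = 60 // 5       # 5-minute resolution -> 12 samples per hour
N_LAGS = SAMPLES_PER_HOUR * 24   # one full day of history = 288 lags

def make_lag_windows(y, n_lags):
    # Each training example is (previous n_lags values, next value) --
    # conceptually what an n_lags setting provides to the AR component.
    return [(y[i - n_lags:i], y[i]) for i in range(n_lags, len(y))]

series = list(range(300))
windows = make_lag_windows(series, N_LAGS)
```

So n_lags=12*24 in the code below means every prediction can look back over one full daily cycle of temperatures.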
Let us try again with the same code, but with the AR model enabled.
# 12 samples per hour x 24 hours = 288 lags, i.e. one day of history.
m_add_lag = NeuralProphet(n_lags=12*24)
df_train, df_test = m_add_lag.split_df(df, freq='5min', valid_p=0.20)
m_add_lag.fit(df_train, freq='5min', validation_df=df_test, plot_live_loss=True)
m_add_lag.plot_parameters()
plt.savefig('ar_params.png', format='png')
plt.show()
forecast = m_add_lag.predict(df)
fig = go.Figure()
fig.add_trace(go.Scatter(mode='markers', x=forecast['ds'], y=forecast['y'], name='y', opacity=0.5, marker=dict(size=2)))
fig.add_trace(go.Scatter(mode='lines', x=forecast['ds'], y=forecast['yhat1'], name='y_hat'))
fig.show()
Comparison
The forecast is excellent compared with the original Prophet: training MAE=0.394 and MAE_val=0.304.
The AR component completely corrects the misleading trend. In my opinion, with the AR component in place, the trend component is effectively regularized.
Recap
This post introduced the core of Neural Prophet and explained, with examples, why it can greatly improve forecast accuracy.
If you are interested in Neural Prophet's other parameters, refer to the official documentation.