Time Series Anomaly Detection Using Only Pandas, Do You Get It?
Foreword
There are tons of time series anomaly detection methods, but to this day the simplest and most efficient ones remain the most widely used in industry.
This post introduces these simple and efficient methods, which are suitable for production environments.
Notice:
- No modeling, no machine learning, just using Pandas.
- The principles are simple: just read on and you will get it.
Case 1: Raw measurement with fixed threshold
We need to detect anomalies in a temperature trend.
This type of anomaly is the simplest and can be determined by setting a threshold.
Here we can set fixed thresholds: a maximum and a minimum.
Here’s a little trick: use the clip function to determine whether the data is within the range.
# Fixed thresholds chosen from domain knowledge
min_t = 12
max_t = 30
# clip() caps values to [min_t, max_t]; any value that gets changed was out of range
df[col+'threshold_alarm'] = (df[col].clip(lower=min_t, upper=max_t) != df[col])
# Mark the flagged points on the trend
plot_anomaly(df[col], anomaly_pred=df[df[col+'threshold_alarm']==True][col+'threshold_alarm'], anomaly_true=None, file_name='file')
Case 2: Dynamic threshold setting
The fixed threshold setting requires experience. Sometimes we can calculate the thresholds automatically from historical data. A common method is to use percentiles.
# Derive the thresholds automatically from historical percentiles
min_t = df[col].quantile(0.03)
max_t = df[col].quantile(0.97)
df[col+'threshold_alarm'] = (df[col].clip(lower=min_t, upper=max_t) != df[col])
Alternatively, a best practice widely used in industry is to use the IQR to set the thresholds.
The IQR is the difference between the third and first quartiles; this difference is then used as a benchmark for setting the thresholds.
# Interquartile range of the historical data
Q1 = df[col].quantile(0.25)
Q3 = df[col].quantile(0.75)
IQR = Q3 - Q1
# c controls how tolerant the thresholds are (1.5 is the textbook value; 2 is more lenient)
c = 2
min_t = Q1 - c*IQR
max_t = Q3 + c*IQR
df[col+'threshold_alarm'] = (df[col].clip(lower=min_t, upper=max_t) != df[col])
Case 3: Detecting Abnormal Changes
Thresholds can be applied not only to the raw measurement, but also to the changes derived from it.
Here, as an example, the IQR method is applied to the change value, where the change is computed with respect to the previous value.
Of course, you are free to compute the change against another period in history, such as the same time yesterday (see the sketch after the code below).
# Change relative to the previous point
window = 1
df[col+'_diff'] = df[col].diff(periods=window).fillna(0)
# Apply the IQR thresholds to the change value
Q1 = df[col+'_diff'].quantile(0.25)
Q3 = df[col+'_diff'].quantile(0.75)
IQR = Q3 - Q1
c = 2
min_t = Q1 - c*IQR
max_t = Q3 + c*IQR
df[col+'diff_alarm'] = (df[col+'_diff'].clip(lower=min_t, upper=max_t) != df[col+'_diff'])
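To compare against another period instead, only the diff period needs to change. A minimal sketch, assuming the data is sampled hourly so that 24 rows correspond to the same time yesterday (the '_diff_daily' column name is hypothetical):
# Hypothetical seasonal variant: change versus the same time yesterday,
# assuming one row per hour (adjust the period to your sampling rate)
df[col+'_diff_daily'] = df[col].diff(periods=24).fillna(0)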
Because the amount of change is computed over a window, here we mark the abnormal time window in the plot.
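Here is a minimal sketch of how that window plot can be produced with the plot_anomaly_window helper defined at the end of this post, assuming the DataFrame has a DatetimeIndex and that '1h' matches the width of the diff window:
# Shade the window preceding each point flagged by the diff alarm
plot_anomaly_window(df[col],
                    anomaly_pred=df[df[col+'diff_alarm']==True][col+'diff_alarm'],
                    window='1h')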
Case 4: Processing Raw Data with Noise
The dataset above is relatively simple; below we will work with a more complicated one.
This is CPU resource usage, and you can see that it contains a lot of irregular fluctuations.
We first apply the IQR method recommended above as a baseline.
# The same IQR-based thresholds, now applied to the noisy CPU usage series
Q1 = df[col].quantile(0.25)
Q3 = df[col].quantile(0.75)
IQR = Q3 - Q1
c = 2
min_t = Q1 - c*IQR
max_t = Q3 + c*IQR
df[col+'threshold_alarm'] = (df[col].clip(lower=min_t, upper=max_t) != df[col])
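As in Case 1, the flagged points can be visualized with the plot_anomaly helper defined at the end of this post; a minimal sketch:
# Mark the points flagged by the IQR baseline on the CPU usage trend
plot_anomaly(df[col],
             anomaly_pred=df[df[col+'threshold_alarm']==True][col+'threshold_alarm'])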
You can see that many anomalies are detected, such as sudden CPU spikes or levels that stay high for a long period of time.
Case 5: How to avoid false alarms
Suppose customer A says that it is normal for the above CPU consumption to sit at a high level (80%), and only the spikes are abnormal.
What can be done to avoid such false alarms?
The solution is simple: add a window, and check whether the time series within the window is stable.
# Moving average over a short, left-closed window (the current point is excluded)
window = 5
df[col+'ma'] = df[col].rolling(window=window, closed='left').mean()
# KPI: how far the current value deviates from its recent moving average
kpi_col = col+'ma'+'diff'
df[kpi_col] = (df[col]-df[col+'ma']).fillna(0)
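Once this KPI is computed, the spikes can be flagged by applying the same IQR recipe as in the earlier cases to the deviation column. A minimal sketch (the '_alarm' suffix is an arbitrary name):
# Assumption: reuse the IQR thresholding from the earlier cases on the new KPI
Q1, Q3 = df[kpi_col].quantile(0.25), df[kpi_col].quantile(0.75)
IQR = Q3 - Q1
c = 2
df[kpi_col+'_alarm'] = (df[kpi_col].clip(lower=Q1 - c*IQR, upper=Q3 + c*IQR) != df[kpi_col])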
Case 6: Again, How to Avoid False Alarms
Now suppose customer B says: I do not agree with A. CPU consumption sitting at a high level (80%) is abnormal, and spikes are not worth alerting on. What can we do now?
The solution is simple as well: add a window, and then another window. Use the median to determine whether two adjacent windows are stationary.
# Rolling median over a longer window, plus the same statistic shifted back by one window
window = 10
df[col+'ma'] = df[col].rolling(window=window, closed='left').median()
df[col+'ma_shift'] = df[col+'ma'].shift(periods=window)
# KPI: level shift between the current window and the previous window
kpi_col = col+'ma'+'shift'+'diff'
df[kpi_col] = (df[col+'ma']-df[col+'ma_shift']).fillna(0)
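As before, the level-shift KPI can be turned into an alarm with the same IQR thresholding, and because a shift spans a whole window, plot_anomaly_window is the natural way to show it. A minimal sketch, assuming hourly samples so that the 10-point window corresponds to '10h':
# Assumption: IQR thresholding on the level-shift KPI, then highlight the affected windows
Q1, Q3 = df[kpi_col].quantile(0.25), df[kpi_col].quantile(0.75)
IQR = Q3 - Q1
c = 2
df[kpi_col+'_alarm'] = (df[kpi_col].clip(lower=Q1 - c*IQR, upper=Q3 + c*IQR) != df[kpi_col])
plot_anomaly_window(df[col],
                    anomaly_pred=df[df[kpi_col+'_alarm']==True][kpi_col+'_alarm'],
                    window='10h')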
Case 7: Don’t Limit Yourself to MEAN Aggregation
For some signals, such as vibration or heartbeat, the mean may be stable while the amplitude is not. In this case, we need to change the aggregation from the mean to the variance or standard deviation.
# Rolling standard deviation captures amplitude instability
# (the column name is kept as 'ma' for consistency with the earlier cases)
window = 5
df[col+'ma'] = df[col].rolling(window=window, closed='left').std()
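A minimal sketch of turning the rolling standard deviation into an alarm, again reusing the IQR thresholding from the earlier cases (the 'std_alarm' column name is arbitrary):
# Assumption: flag points whose amplitude (rolling std) falls outside the IQR band
kpi = df[col+'ma'].fillna(0)
Q1, Q3 = kpi.quantile(0.25), kpi.quantile(0.75)
IQR = Q3 - Q1
c = 2
df[col+'std_alarm'] = (kpi.clip(lower=Q1 - c*IQR, upper=Q3 + c*IQR) != kpi)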
Define the plotting functions for anomalies
To better highlight the anomalies in the trend, I defined two plotting functions: one marks abnormal points, and the other marks abnormal intervals/windows.
import plotly.graph_objects as go
import pandas as pd

def plot_anomaly(ts, anomaly_pred=None, anomaly_true=None, file_name='file'):
    # Plot the series as a line, then mark predicted (red) and true (yellow) anomalies as points
    fig = go.Figure()
    yhat = go.Scatter(x=ts.index, y=ts, mode='lines', name=ts.name)
    fig.add_trace(yhat)
    if anomaly_pred is not None:
        status = go.Scatter(x=anomaly_pred.index,
                            y=ts.loc[anomaly_pred.index],
                            mode='markers', name=anomaly_pred.name,
                            marker={'color': 'red', 'size': 10, 'symbol': 'star', 'line_width': 0})
        fig.add_trace(status)
    if anomaly_true is not None:
        status = go.Scatter(x=anomaly_true.index,
                            y=ts.loc[anomaly_true.index],
                            mode='markers', name=anomaly_true.name,
                            marker={'color': 'yellow', 'size': 10, 'symbol': 'star-open', 'line_width': 2})
        fig.add_trace(status)
    fig.show()
def plot_anomaly_window(ts, anomaly_pred=None, file_name='file', window='1h'):
    # Plot the series as a line and shade the window that precedes each predicted anomaly
    fig = go.Figure()
    yhat = go.Scatter(x=ts.index, y=ts, mode='lines', name=ts.name)
    fig.add_trace(yhat)
    if anomaly_pred is not None:
        for i in anomaly_pred.index:
            fig.add_vrect(x0=i - pd.Timedelta(window), x1=i,
                          line_width=0, fillcolor="red", opacity=0.2)
    fig.show()
To recap
Time series anomaly detection can be handled using only Pandas. Do you get it?
CREDIT: THE IDEA IS INSPIRED BY ADTK, https://github.com/arundo/adtk