In the dynamic world of advertising technology, detecting anomalies quickly and accurately is crucial for maintaining platform stability and customer satisfaction.
By combining Machine Learning (ML) and time series analysis methods with intuitive data presentation, we created a powerful tool that enables proactive issue resolution and helps maintain the health of our ad exchange ecosystem.
Anomaly detection isn't just about identifying issues; it's about proactively addressing potential problems before they escalate. Implementing a robust system for this purpose enables us to catch regressions early, limit their impact on customers, and keep the platform healthy.
As a data science team in an Ad Tech company that connects app developers with advertisers, we see firsthand how sudden anomalies can disrupt this delicate balance. Sometimes, this is a direct consequence of our experiments.
Machine learning techniques have revolutionized our ability to analyze and interpret complex systems across various domains, including anomaly detection, with classics such as Isolation Forest (2008), One-Class Classification (1996), DBSCAN (1996), and Local Outlier Factor (2000).
In our case, we use ML to model customers' performance by predicting the expected values and comparing them with the actual performance metrics for our biggest publishers and advertisers.
Using ML to predict our customers' performance allows us to identify patterns related to changing trends and seasonality in the Ad Tech landscape.
To do this, we frame customers' performance as a time series problem.
At the heart of our anomaly detection system lies time series analysis. This statistical technique allows us to capture the trends, seasonality, and recurring patterns that drive each metric.
By understanding the underlying patterns in our data, we can more accurately identify when something truly unusual occurs.
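For illustration, a classical decomposition makes these underlying patterns explicit. The sketch below applies statsmodels' seasonal_decompose to a hypothetical hourly series; both the library choice and the synthetic data are assumptions for the example, while our actual pipeline uses Prophet, described below.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

## Hypothetical hourly metric: a daily cycle plus a mild trend and noise.
idx = pd.date_range("2024-07-01", periods=7 * 24, freq="h")
values = (
    1.0
    + 0.5 * np.sin(2 * np.pi * idx.hour / 24)  # daily seasonality
    + 0.001 * np.arange(len(idx))              # slow upward trend
    + np.random.default_rng(0).normal(0, 0.05, len(idx))
)
series = pd.Series(values, index=idx)

## Split the series into trend, seasonal, and residual components.
decomposition = seasonal_decompose(series, model="additive", period=24)
decomposition.plot()
plt.show()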
Perfection is an elusive goal when predicting KPI behavior. The dynamic nature of the real-time bidding (RTB) auction arena means that even the most sophisticated models can't achieve 100% accuracy. However, this doesn't mean we can't effectively identify and address anomalies.
In our approach, we embraced this inherent uncertainty and used it to classify suspect discrepancies as anomalies. Discrepancies between our predicted values and actual observations are analyzed and categorized. This binary classification (anomaly or not) allows us to quickly identify potential issues while acknowledging the natural variation in a publisher's or advertiser's performance.
For our specific use case, we implemented Meta's open-source Prophet model.
Prophet is particularly well-suited for forecasting time series data with strong seasonal effects and multiple seasons of historical data. Some key advantages of Prophet include robust handling of missing data, outliers, and trend shifts; built-in modeling of multiple seasonalities and holiday effects; and uncertainty intervals around every forecast.
We use Prophet to generate expected values for various performance metrics. These forecasts serve as our baseline for anomaly detection.
Not every discrepancy is something we need to worry about; we classify anomalies based on the magnitude and the direction of the mismatch between expected and actual values. This classification helps us prioritize our response, as the sketch below illustrates.
By categorizing anomalies, we can allocate resources effectively and promptly address the most pressing issues.
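To make the idea concrete, here is a minimal sketch of such a prioritization rule; the labels and thresholds are hypothetical, not our production values:

def classify_anomaly(actual, lower, upper, std):
    ## Inside the prediction interval: nothing to flag.
    if lower <= actual <= upper:
        return "normal"
    ## Direction: a drop may mean lost traffic or revenue,
    ## while a spike may mean runaway spend or a reporting glitch.
    direction = "drop" if actual < lower else "spike"
    ## Magnitude: how far past the interval, in units of the series' std.
    distance = (lower - actual) if actual < lower else (actual - upper)
    severity = "critical" if distance > 2 * std else "warning"
    return f"{severity} {direction}"

For example, classify_anomaly(0.1, 0.4, 0.9, 0.27) returns "warning drop": the point falls below the interval, but by less than two standard deviations.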
Detecting anomalies is only half the battle; presenting this information in an actionable format is equally important. Visualizations make the information accessible to a broader audience, which fosters a data-driven culture across the company.
We implemented a robust visualization dashboard to bring our anomaly detection results to life, giving the team an at-a-glance view of expected versus actual performance and of every anomaly flagged between them.
The combination of advanced anomaly detection algorithms and user-friendly visualization tools ensures that our team can stay ahead of potential issues, providing a proactive approach to healthy platform management and customer satisfaction.
Here is an example of using Prophet to model and predict an individual customer's behavior, compare the predictions with actual results, and classify anomalies.
To start, we need the following libraries:
from prophet import Prophet
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
We start by selecting one entity and analyzing its performance on the selected KPI.
In this example, we will use a Pandas DataFrame that contains only two columns, timestamp and performance, where performance represents the selected metric.
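The rest of the walkthrough assumes train_period and test_period DataFrames already exist. Since the underlying data isn't shown here, a purely synthetic stand-in like the following (hypothetical values, shaped like the real data: hourly rows with the last 18 hours held out for testing) is enough to run the snippets:

## Hypothetical stand-in for the real data: hourly observations with a daily cycle.
rng = np.random.default_rng(7)
timestamps = pd.date_range("2024-07-11", periods=166, freq="h")
performance = (
    1.0
    + 0.5 * np.sin(2 * np.pi * timestamps.hour / 24)
    + rng.normal(0, 0.05, len(timestamps))
)
df = pd.DataFrame({"timestamp": timestamps, "performance": performance})

## Hold out the last 18 hours as the test period.
train_period = df.iloc[:-18].copy()
test_period = df.iloc[-18:].copy()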
To see what it looks like over time:
plt.figure(figsize=(14, 7))
plt.grid(alpha=0.2)
plt.plot(
    train_period.timestamp,
    train_period.performance,
    linewidth=2,
)
We can easily identify a strong hourly seasonality.
Now let's take a look at the test period (18 hours):
plt.figure(figsize=(14, 7))
plt.grid(alpha=0.2)
plt.plot(
    test_period.timestamp,
    test_period.performance,
    linewidth=2,
)
We aim to identify any anomalies during the test period. In this case, there are issues around 22:00 on 07-17.
Now, let's train a Prophet model with a confidence level (interval width) of 80%:
predictor = Prophet(
    interval_width=0.8
)
train_period.columns = ['ds', 'y']  ## Prophet requires its input columns to be named 'ds' and 'y'.
predictor.fit(train_period)
## Rename the test columns the same way, then collect the prediction and the uncertainty interval boundaries.
test_period.columns = ['ds', 'y']
forecast = predictor.predict(test_period)[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]
## Merge to actuals:
combined_df = pd.merge(
    test_period, forecast, on="ds", how="left"
)
## Plot everything
plt.figure(figsize=(14, 8))
# Scatter plot for actual values
plt.scatter(combined_df['ds'], combined_df['y'], label='Actual Values', color='black', s=15)
# Line plot for predicted values (yhat)
plt.plot(combined_df['ds'], combined_df['yhat'], label='Predicted Values (yhat)', color='blue', linewidth=2)
# Line plots for upper and lower confidence bounds
plt.plot(combined_df['ds'], combined_df['yhat_upper'], color='red', linestyle='--', linewidth=1)
plt.plot(combined_df['ds'], combined_df['yhat_lower'], color='red', linestyle='--', linewidth=1)
# Fill between upper and lower bounds
plt.fill_between(combined_df['ds'], combined_df['yhat_lower'], combined_df['yhat_upper'], color='lightskyblue', alpha=0.2)
# Add labels, title, and legend
plt.title('Actual vs Predicted Values with Confidence Intervals', fontsize=16)
plt.xlabel('Datetime', fontsize=14)
plt.ylabel('Values', fontsize=14)
plt.legend(fontsize=12)
plt.grid(alpha=0.3)
plt.xticks(rotation=45, size=12)
plt.yticks(size=12)
plt.tight_layout()
# Show the plot
plt.show()
The prediction interval is visualized in light blue, with a notable anomaly observed significantly below this range. To analyze these outliers comprehensively, we construct a unified data frame that covers both the training and testing periods, along with the corresponding predictions.
## Combine the training actuals with the test actuals and predictions.
results_to_plot = pd.concat([train_period, combined_df], axis=0)
With the predicted behavior and actual values in one place, we can check whether any given point constitutes an anomaly. As a simple heuristic, we define the anomaly threshold as one standard deviation beyond the prediction boundaries; any point outside that margin is classified as a potential anomaly.
series_std = results_to_plot.y.std()
print(series_std)
>>> 0.277
results_to_plot['is_anomaly'] = results_to_plot.apply(
    lambda row: 1 if ((row.y < (row.yhat_lower - series_std))
                      or (row.y > (row.yhat_upper + series_std))) else 0,
    axis=1,
)
results_to_plot['is_anomaly'].value_counts()
>>> is_anomaly
0    166
1      2
This is what our data looks like:
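A minimal sketch to reproduce this view from results_to_plot, plotting the full series and highlighting the flagged points:

plt.figure(figsize=(14, 7))
plt.grid(alpha=0.2)
## Full series of actuals (train + test periods).
plt.plot(results_to_plot['ds'], results_to_plot['y'], linewidth=2, label='Actual')
## Highlight the points flagged as anomalies.
anomalies = results_to_plot[results_to_plot['is_anomaly'] == 1]
plt.scatter(anomalies['ds'], anomalies['y'], color='red', s=40, zorder=3, label='Anomaly')
plt.legend(fontsize=12)
plt.show()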
By leveraging machine learning and time series analysis, we created a robust system that detects anomalies and classifies them based on severity, allowing for efficient resource allocation and swift response times.
The implementation of Prophet and our custom visualization tools has empowered our team to proactively address potential issues before they escalate. We believe this approach will significantly enhance our platform's stability, improve customer satisfaction, and optimize revenue streams.