Intended usage
This dashboard is intended for internal use only.
If you have questions about metrics that can be used for external communication, please contact Data Science for the latest messaging.
Prediction assessment process
Basic
- For each recommendation metric
- Iterate over all customer tenants
- For each tenant, randomly select 25% of completed campaigns
- Withhold all offers within these campaigns and any identical offers in other campaigns from the training data. Rebuild all prediction models used by the current tenant
- Use the rebuilt models to predict the relative results for the withheld campaigns
- Measure pairwise accuracy for all withheld campaigns: any pair of offers from the same campaign that is directionally correct (e.g., offers A and B were both tested in the same campaign, offer A was predicted to perform better than offer B, and offer A did perform better than offer B) counts as one correct prediction. Any pair of offers that is directionally incorrect (e.g., offer A was predicted to perform better than offer B, but offer B performed better than offer A) counts as one incorrect prediction.
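The pairwise accuracy scoring described above can be sketched as follows (a minimal illustration; the function name and the `(predicted, actual)` input shape are assumptions, not the dashboard's actual implementation):

```python
import itertools

def pairwise_accuracy(offers):
    """Score directional agreement between predicted and actual offer results.

    `offers` is a list of (predicted, actual) result values for offers tested
    in the same campaign. A pair whose predicted ordering matches the actual
    ordering counts as correct; a pair whose ordering disagrees counts as
    incorrect; ties in either value are skipped (no direction to compare).
    Returns (correct, incorrect) counts.
    """
    correct = incorrect = 0
    for (pred_a, act_a), (pred_b, act_b) in itertools.combinations(offers, 2):
        if pred_a == pred_b or act_a == act_b:
            continue  # tied pair: no directional prediction to score
        if (pred_a > pred_b) == (act_a > act_b):
            correct += 1
        else:
            incorrect += 1
    return correct, incorrect
```

For example, with offers A = (predicted 0.9, actual 0.8) and B = (predicted 0.4, actual 0.5), prediction and outcome agree that A beats B, so the pair counts as one correct prediction.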
Cold start
- For each recommendation metric
- Iterate over all customer tenants
- Turn on data sharing settings allowing all tenants to share data (not persisted and only impacts this analysis), subject to category restrictions
- Withhold all completed campaigns from the current tenant. Rebuild all prediction models used by the current tenant
- Use the rebuilt models to predict the relative results for all withheld campaigns
- Measure the pairwise accuracy for all withheld campaigns
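The cold start holdout amounts to a leave-one-tenant-out split: every campaign from the current tenant is withheld, and models are rebuilt only from the other tenants' shared data. A minimal sketch of that split (the function name and input shape are assumptions, and category restrictions on sharing are omitted):

```python
def cold_start_split(campaigns_by_tenant, held_out_tenant):
    """Leave-one-tenant-out split mirroring the Cold Start assessment.

    `campaigns_by_tenant` maps each tenant to its list of completed
    campaigns. All campaigns from `held_out_tenant` are withheld for
    evaluation; campaigns from every other tenant form the training set.
    Returns (withheld, training) campaign lists.
    """
    withheld = list(campaigns_by_tenant[held_out_tenant])
    training = [campaign
                for tenant, campaigns in campaigns_by_tenant.items()
                if tenant != held_out_tenant
                for campaign in campaigns]
    return withheld, training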
The distinction between cases where error is known and where error is approximated
For some recommendation metrics, the error of each measured value is known. For pairs of offers, this makes it possible to calculate an expected accuracy: given the values measured for both offers, the probability that offer A would beat offer B (or vice versa) if the experiment were repeated. This expected accuracy matters because it is the upper bound on prediction accuracy. Raising that bound requires gathering a larger sample or making test platform improvements.
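For a single pair, the expected accuracy can be sketched as follows, assuming for illustration that each measured value is normally distributed around its true value with a known standard error (the function name and the normal approximation are assumptions, not necessarily the dashboard's actual calculation):

```python
import math

def expected_accuracy(mean_a, se_a, mean_b, se_b):
    """Probability that offer A would again beat offer B on a repeat of the
    experiment, treating each measurement as normal with the given standard
    error (a modeling assumption for this sketch).

    The difference A - B is then normal with mean (mean_a - mean_b) and
    standard deviation sqrt(se_a^2 + se_b^2); the probability that the
    difference is positive is the standard normal CDF at the z-score.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (mean_a - mean_b) / se_diff
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

When the two offers measure identically, the expected accuracy is 0.5 (a coin flip); as the measured gap grows relative to the combined error, it approaches 1, which is why it bounds the achievable prediction accuracy.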
For other recommendation metrics, the error of each measured value is unknown (e.g., a retailer platform may report only rolled-up numbers without any accompanying error). For these cases, error is approximated for modeling purposes, but an exact expected accuracy cannot be reported.
Inputs
Platforms: restricts output by platform
Tenants: restricts output by tenant
Start Date: restricts output such that it is only based on campaigns whose end date is greater than or equal to the entered date
End Date: restricts output such that it is only based on campaigns whose end date is less than or equal to the entered date
Cold Start: if selected, limits output to cold start results. If not selected, limits results to basic results. See 'Prediction assessment process' section above.
Recommendation Metric: determines which recommendation metric to report on
Group: controls the level at which results are aggregated
Latest Results section
Aggregates data from the last time each campaign was assessed.
Overall Accuracy Over Time section
Plots the values from each time that prediction assessment was run. Note that these values may fluctuate because a different random sample of campaigns is chosen each time prediction assessment runs.
Detailed Results section
Contains exportable data from each time that a particular campaign is assessed.
Feedback and questions
Please contact Engineering or Data Science if you have any questions or feedback about this dashboard or the overall prediction process.