A Request-level Guaranteed Delivery Advertising Planning: Forecasting and Allocation

Zhang, Hong, et al. "A Request-level Guaranteed Delivery Advertising Planning: Forecasting and Allocation." Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020. #kdd2020 #kdd

https://dl.acm.org/doi/10.1145/3394486.3403348

Abstract

PMPのProgramatic Gualanteed、在庫保証取引の配信を行なうための技術スタックについて

既存技術は現在のトレンドのニーズを満たしていない。

1) advertisers pursue more precise targeting, which requires not only user-level attributes but also request-level attributes;

2) users pre- fer more friendly ad serving, which imposes more diverse serving constraints;

3) the bottleneck of the publisher’s revenue growth lies in not only the ad serving, but also the forecast accuracy and sales strategy.

これらの問題を解決するため

we present a holistic design of a request-level guaranteed delivery advertising planning system with careful optimization for all three critical components including impression forecasting, selling and serving.

さらに、提案手法はテンセントのオンライン広告システムにデプロイした。1年経つ。

Introduction

A large portion of online display advertising is sold in a guaranteed delivery way.

中国ではそうなのか?? (日本ではPGは全然無いのだが……)

保証型配信を行なうためのパブリッシャーのタスク

1）指定された期間の在庫（対象となるImp機会数）を予測する

2）注文に対して保証できるImp数の最大量を数ヶ月または数週間前に決定し、契約を締結する

3）実際のユーザーからのリクエストが届いたときに、数千の対象となる契約のうち、各機会に対してどの契約を表示すべきかを一瞬の待ち時間で決定

確かにこのタスクをこなせるパブリッシャーは限られてくると思う。DSP並のシステムが必要。

続いて既存研究で足りていない点

First, most work formulate and solve the forecasting and allocation problems at the crowd level 2, 4, 7, 8, 10, 12. Some

In reality, advertisers always pursue more precise targeting. For example, an advertiser may tar- get Shanghai female iPhone users watching a specified popular TV show through a Wi-Fi connection.

もっと細かい属性まで指定して広告主は買いつけたいが、既存研究は crowd level (細かくないってこと?)

むしろユーザーターゲンティングが今後難しく (iOS14以降) なるから既存手法の方が良くなったりしないのかな?

https://gyazo.com/d0356f04887d78884ad1ccdb0272a326

同じユーザーでもその時の端末やメディアによって対応するオーダーが分かれる図

テンセントの既存オーダーはほぼリクエストレベルのターゲティングが必要だった。

Second, previous work in the guaranteed delivery adver- tising mainly focus on how to efficiently achieve optimal allocation for online serving, assuming that impression forecasts and contracts are already available. Though impression forecast errors and sales strategy have non-ignorable effects on the performance of ad serving 2, 4, 5, 12, few work has delved into the large-scale impression opportunity forecasting and selling problems at the user level10, let alone at the request level. Third,

既存研究はIMP予測と契約情報はすでに使える前提とする設定にフォーカスしている

リクエストレベル・ユーザーレベルのインプレッション予測は今までになかった

Third, more customized and user-friendly serving constraints are desired by both advertis- ers and users. Most allocation algorithms are tailored for specific constraints,

よりカスタマイズされたユーザーフレンドリーな配信制約が求めれているが、ほとんどのアルゴリズムは特定の制約にあわせて作られている。

産業界のトレンドはより細やかなターゲティングとフレンドリーな配信制約を求めているため、Request-level guaranteed delivery Advertising Planning system (RAP) を設計した。このシステムはインプレッション予測・販売・配信の3つの重要な要素を最適化するのが狙い。

https://gyazo.com/474427c6642179ede0c90f9ee6b95b66

メインコンポーネントは次の2つの挑戦にフォーカスしている

Large-scale request-level impression forecasting.

属性の組合せパターンが膨大になる、既存研究の組合せよりも自分達の組合せパターンは桁違い

In Tencent advertising system, there are billion level impressions per day.

The number of valid user attribute combinations is 63,901, and the number of valid request attribute combinations is 4,254,119, which results in an extremely large space for possible targeting.

ターゲティング可能な属性の組合せパターンが膨大になる、それを計算する資源も膨大に必要

Large-scale request-level impression allocation for selling and serving under customized linear constraints.

数百億のImpと数千の契約(deal)の直積になるため、短いレイテンシーで最適な割り当てをするのが非常に困難

販売段階では1分未満

アロケーション時には数百msec以内

任意にカスタマイズされた配信制約は、大規模な最適化問題の解決をさらに困難にする

この論文の貢献は次の3つ

We present a holistic design of a large-scale request-level guaranteed delivery advertising planning system

We propose a method combining clustering and tensor factorization to achieve accurate request-level impression forecasting and obviously reduce the time complexity.

RAP has been deployed in the Tencent online guaranteed delivery advertising system for nearly one year.

1年近くtencentのImp保証配信システムに載せて評価した

2 Related Work

時系列予測

伝統的な時系列予測の手法は疎な高次元データに適していない

行列分解を利用する手法が提案されている

Allocation

一般的には確率的最適化として考えられる

Imp保証広告への適用も提案されている

標準的な手法は大規模商業広告システムには遅すぎる

既存研究は群集レベルまたはユーザレベルで割り当てをモデル化している

3 Problem and system overview

広告プランニングは二部グラフ

インプレッション機会のprofile

ユーザー属性 (e.g., gender, age, income level and area)

attributes of the request itself (e.g., time, network condition, content type and con- tent name).

同じユーザーであってもWifi接続で映画をみていたり、4G接続でニュースをみたりする

インプレッションのタイプが異なるので、対応するDemandのオーダーは別になる

ユーザー i, Request Type j, オーダー k,

Notations

ユーザー: $ i

リクエストタイプ: $ j

オーダー: $ k

オーダーの要求Imp数: $ dk

オーダーの配信制約: $ qk (e.g. reach and frequency)

ユーザー i のリクエストタイプ j がオーダーkに割り当てられる割合: $ x_{ijk}

RAPのCore Components

Impression forecasting

契約前に予測する

契約期間における長期的なインベントリ $ s_{ij} の予測

配信前に予測する

配信最適化のための短期的な $ s_{ij} の予測

Impression allocation for selling

新規オーダーが到着したときに長期予測sと制約qもとでアロケーションプランxを求める

negative impactを最小化する

Impression allocation for serving

to achieve optimal ad serving, the publisher needs to generate a compact allocation plan xi jk based on the short-term forecast ofsi j . When a request arrives, given a set ofeligible contracts, the allocation plan and the real-time feedbacks, the publisher must decide the optimal contracts to serve the impressions in the request

多分、予測をミスると契約を受けすぎてImp保証の契約不履行となってしまう

4 Impression Forecasting

一般的に配信保証型広告は1日単位で販売するので予測の最小単位は1日とする。

$ s^t_{ij} を t日目のユーザーiの属性jのImp

t日目の総Impは $ D^t

将来M日の予測の誤差を最小化する

tensor factorizatioon

ただしこのロス関数はimpracticalだし既存のtensor factorization modelが適用できない

D の次元が大きくて疎だから

この問題に対処するため

テンソル分解にクラスタリングを組みこんだ

既存オーダーのターゲティング設定を考慮してロス関数を改良した

Tensor Factorization with clustering

https://gyazo.com/9507553d4a0716073bc8d93c778d0295

行列A, テンソルB, 行列C に分解する

次元K1はユーザークラスタ

Aはユーザーがどのクラスタに属するかの条件付き確率

次元K2はリクエスト属性のクラスタ

Cはリクエスト属性がどのクラスタに属するかの条件付き確率

Bはユーザークラスタとリクエスト属性クラスタの同時確率

元の予測問題を2つのクラスタリング問題と1つの低次元の予測問題に分解する

Reweighting forecast errors for target attribute combinations

ロスのweight調整

すべてのリクエスト属性の組合せについて同様に予測しなければならないわけではない

オーダーに含まれるリクエスト属性だけ当てればよい

$ \gamma を target request attributes

属性combinationがオーダーに含まれる確率で損失を補正する

この損失は配信不足の可能性を示すので、最小化したい

User Clustering

ユーザーグループ (〜10^4)

e.g. 上海の女性iPhoneユーザー

より密なテンソルになる

この損失はオーバーフィットを防ぐ

Request Clustering

https://gyazo.com/76695dd332fe6489758d622694c8e2d6

K-Meansのロス関数とおなじ

クラスタ数は15が得られた

クラスタは頻繁なupdateは不要

クラスタリング結果の一貫性を評価するために、2ヶ月間（2018年の10月から11月まで）の実際のデータを使って検証した

毎週のクラスタリング結果は、任意の2つの週の間で少なくとも12/15のクラスタ中心が重なっていた

最終的な損失関数

組合せ

過去のすべてのユーザーに基づいてオリジナルのテンソルを構築する必要はなく、直近のアクティブユーザーにフォーカスすればよい。

深層学習モデルのトレーニングでは、トレーニングデータのサイズを小さくするために、時間ウィンドウ内の1000人に1人のユーザーをトレーニングサンプルとして

5 Impression Allocation

5.1 Basic Allocation Problem Formulation

5.2 Optimization Algorithm

ラグランジュ緩和問題を経由して双対問題の導出

勾配降下や座標降下法で解けるけど時間がかかる

もっと良い方法がある

GPUで並列化

契約時のアロケーション

既存契約の割り当て済みImpについて配置換えは行なうが不足はしないように

既存契約を優先

6 System Evaluation

6.1 Datasets

6.2 Impression Forecasting

6.3 Impression Allocation

貪欲法 (既存契約のImpを移動しない) と比較すると圧倒的

オンラインナップサック相当?

貪欲法に非常に不利な問題?

7 Conclustion

#adtechnology