Challenges and Solutions in User Interest Modeling and Identification for Video Recommendation (2017)


User Interest in Video Recommendation and Search
Li Yu, Director of the Data Intelligence Department, head of search, recommendation, and content intelligence at Youku

Agenda
- An overview of Youku's personalized video search and recommendation
- The challenges of expressing user interest in personalized video search and recommendation
- A discussion of the problems with common industry methods
- The approaches we have tried

Personalized services at Youku
- More than half of all video plays are distributed through personalized search and recommendation technology
- Significant lifts in CTR, plays per user, watch time per user, retention, and similar metrics
- Helps users discover good content, and helps high-quality content reach its precise audience
- 600M+ videos, 500M+ users

The challenge of expressing user interest in video recommendation
Technical challenges:
- Dramas, variety shows, films, animation: the cost of choosing is high, users follow only a few shows, so the recommendation success rate is low
- Users are strongly goal-driven, with little appetite for discovery, browsing, or wandering

- Long-form content offers only a limited selection space
- User behavior on head titles is sparse: a large share of users watch no more than 3 titles per month. For contrast, a short-video feed recommends 30 items from hundreds of watch events, while Youku's head titles must support recommending 30 items from only 3 or 4 watch events
- The data is noisy, popularity-skewed, and highly biased; common recommendation models lack the descriptive power for this regime

The challenge of expressing user interest in video recommendation (cont.)
Technical challenges: interest in video content is complex, emotional, subtle, and fragmented into diverse subcultures, so serendipity and diversity within the broad direction of a user's interest matter even more. For contrast:
- E-commerce: interest is explicit (a 4K TV, jeans, a dress) and highly structured, with a clean category taxonomy
- Video: interest is emotional and subtle: a user may love Hong Kong wuxia films yet dislike Jackie Chan, or love Japanese animation such as Satoshi Kon's yet dislike Hayao Miyazaki; interests evolve, develop, and split into finer segments, for example:

- Crosstalk: from Guo Degang and Xiao Yueyue to Fang Qingping, or on to Wang Yuebo's pingshu storytelling, or back to the traditional masters Hou Baolin, Liu Baorui, and Ma Sanli
- Sci-fi fans deepen over time: shallow (Star Wars, Gravity) → intermediate (Interstellar) → deep (Blade Runner, Arrival, The Three-Body Problem)
- Subtle subcultures: ACG/anime, gaming, livestreaming; art-house youth; BL fandom and closeted audiences; binge-watchers, K-drama fans, horror fans
- Interest expresses the user's personal identity
- Interest is orthogonal across multiple dimensions, e.g. watching only "big productions", or only content with the production values of US dramas
- Users dislike repetition and expect serendipity

Why identifying and expressing user interest matters
- Retargeting ("watched it, watch it again"): recommending content the user has already interacted with. The hit rate is high but the long-term value is low; it is a local lift, not a global one (it steals traffic from other channels); the high hit rate yields a high CTR, and it easily gets stuck in a local optimum
- Trending recommendation: recommending recently hot content; it also easily gets stuck in a local optimum

- Personalized interest recommendation: recommending content that matches each user's interests. The hit rate is lower, hence CTR is lower, but the long-term value is greater; short-term gains may be small, yet it converges well over the long run
- Recommendation hit rate: retargeting > trending > personalized discovery
- Value of a hit (and cost of a miss): personalized discovery > trending > retargeting
- A vicious cycle: few personalized recommendations → the model predicts interest inaccurately → few interest hits → not enough positive samples

Problems with common industry methods
The common industry pipeline for personalized recommendation: recall (match), then ranking. Common features: statistical features; user profiles (demographics, per-tag frequency and recency); high-dimensional crossed features; item-based similarity (i2i).

Common Algo Framework (with Youku's corresponding methods), from data up to ranking:
- Rank: FTRL, DNN, XGBoost, FFM; Ensemble; Rerank
- Feature: item/user/user2item statistics; user profile (demographics, interest profile, search profile, view history); item tags, categories, topics; item/tag/topic relevance scores
- Match: item-based CF, DNN CF, SLIM CF; tag-to-item, user2user2item, star2item; popularity, trending
- Data: ETL, offline/streaming
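To fix ideas, here is a minimal sketch of the recall-then-rank flow in the framework above. The recall sources and the ranker are illustrative stubs, not the production components.

```python
# Minimal recall-then-rank sketch of the framework above. `sources` are
# recall functions (e.g. i2i CF, tag-to-item, trending) returning ranked
# item lists; `ranker` scores (user, item) pairs. All names are illustrative.
def recall(user, sources, k_per_source=100):
    candidates = set()
    for source in sources:
        candidates.update(source(user)[:k_per_source])   # merge recall sets
    return candidates

def recommend(user, sources, ranker, n=30):
    scored = [(ranker(user, item), item) for item in recall(user, sources)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # rank stage
    return [item for _, item in scored[:n]]
```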

Problems of common methods for expressing video interest
Demographics (age, gender, region), device type, city:
- Problem: users' content interests correlate only weakly with these attributes
- Problem: a 50-year-old man in a tier-3 city may share viewing habits with a 30-year-old woman in a tier-1 city

Tag-based user profiles:
- Manual content tags: horror, action, comedy, Hong Kong films, Korean films
- Topic-modeling tags: LDA topics extracted from video titles and descriptions (the content data is noisy)
- User tags built with statistical methods (frequency, recency)
- Problem: manual tags are subjective and noisy
- Problem: manual tags tend to be far too coarse
- Problem: topic-modeling tags are noisy and the data is sparse
- Problem: purely statistical methods can rarely describe a user's interest precisely
- Problem: easily distorted by popularity skew
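As a concrete reference point, here is a minimal sketch of such a statistics-based tag profile, combining frequency counts with an exponential recency decay. The half-life and the input format are hypothetical.

```python
import time
from collections import defaultdict

# A minimal statistics-based tag interest profile: each watch event adds a
# recency-decayed count to every tag of the watched video. The half-life
# below is a hypothetical choice, not a value from the talk.
HALF_LIFE_DAYS = 14.0

def tag_interest_profile(watch_events, now=None):
    """watch_events: iterable of (tags, timestamp) per watched video."""
    now = now or time.time()
    scores = defaultdict(float)
    for tags, ts in watch_events:
        age_days = (now - ts) / 86400.0
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)  # recency weight
        for tag in tags:
            scores[tag] += decay                    # decayed frequency
    return dict(scores)
```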

Problems of common methods for expressing user interest (cont.)
High-dimensional crossed features: combining the features above yields richer information
- Problem: easily corrupted by noise
- Problem: computationally expensive

Item-based similarity (i2i): CF similarity, SVD++ / MF, SLIM, DNN; simple and efficient

Problems of i2i
Item-based CF is among the most effective methods in both academia and industry. Item-based methods beat user-based ones mainly because behavior along the user dimension is sparser and noisier, while the item dimension accumulates more historical behavior and has smaller variance.
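For reference, a minimal sketch of item-based CF similarity over a user-item interaction matrix, assuming binary watch data; the dense-matrix form is purely illustrative.

```python
import numpy as np

# Minimal item-based CF similarity sketch: cosine similarity between item
# columns of a binary user-item watch matrix (users x items).
def item_similarity(user_item: np.ndarray) -> np.ndarray:
    norms = np.linalg.norm(user_item, axis=0, keepdims=True)
    norms[norms == 0.0] = 1.0            # guard items nobody watched
    normalized = user_item / norms
    sim = normalized.T @ normalized      # items x items cosine similarity
    np.fill_diagonal(sim, 0.0)           # drop self-similarity
    return sim
```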

- Problem 1: because similarity is a global statistic along the item dimension, the different reasons users watch an item are averaged away. For one video, some users watch it because it is popular, some because of its theme or genre, some because of its lead actors or director
- Problem 2: different user groups' different tastes are smoothed out in the global item-similarity computation
- Problem 3: behavior data for long-tail items is too sparse
- Problem 4: the granularity is too fine, data is sparse, and generalization is weak
- Problem 5: popularity skew, i.e. the Harry Potter effect

Our attempts

The baseline user interest profile
- Content tags: the content's tag and category taxonomy, plus metadata such as actors and directors
- Interest profile: an interest-strength score per tag and category, derived from the frequency and recency of the user's watches
- Problem: being purely statistical, it cannot separate popularity, genre, and star effects; the granularity is too coarse

User Interest Latent Vector
- End-to-end black-box models are not globally convergent here, owing to noise and broken distributional assumptions, so the search space must be narrowed
- Rather than having machine learning solve one big end-to-end problem, decompose it into several easier subproblems
- The traditional end-to-end approach is vulnerable to data sparsity and noise: an end-to-end model (watch history → show recommendation) is easily swayed by noise
- Decomposed into subproblem predictors (watch history → broad-interest-category latent vector → show recommendation), the pipeline is more robust to noise
- The broad-interest latent vector rests on a manually built category taxonomy plus human review, to reduce noise
- [Diagram: ? → Latent Vector → ?]

Prior work on modeling user interest: CTR
- "Collaborative Topic Modeling for Recommending Scientific Articles"

Prior work on modeling user interest: CTPF
- "Content-based Recommendations with Poisson Factorization"
- "A Practical Algorithm for Solving the Incoherence Problem of Topic Models in Industrial Applications"

Our model: CTPF with popularity, stars, tags, and queries
- Performance-optimized implementation, scalable to internet scale
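For intuition, a toy NumPy sketch of plain Poisson matrix factorization follows: multiplicative updates for KL-divergence NMF, whose objective coincides with the Poisson likelihood. This is not the team's distributed CTPF; rank, iteration count, and initialization are illustrative.

```python
import numpy as np

# Toy Poisson matrix factorization: multiplicative KL-NMF updates on a
# user-item count matrix. Illustrative only, not the CTPF system above.
def poisson_mf(counts: np.ndarray, k: int = 20, iters: int = 100, eps: float = 1e-9):
    n_users, n_items = counts.shape
    rng = np.random.default_rng(0)
    theta = rng.random((n_users, k)) + eps   # user interest loadings
    beta = rng.random((k, n_items)) + eps    # topic-item loadings
    for _ in range(iters):
        ratio = counts / (theta @ beta + eps)
        theta *= (ratio @ beta.T) / (beta.sum(axis=1, keepdims=True).T + eps)
        ratio = counts / (theta @ beta + eps)
        beta *= (theta.T @ ratio) / (theta.sum(axis=0, keepdims=True).T + eps)
    return theta, beta
```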

- Distributed implementation on a parameter-server architecture
- EM is not globally convergent, so each topic is manually reviewed and the reviewed topics are fed back as initial values for further iteration
- Extended to text + tags + metadata + popularity
- Personalized i2i similarity built on the interest vectors

Balancing long-term and short-term interest: Phased GRU RecNet
- Based on: "Session-based Recommendations with Recurrent Neural Networks" (ICLR 2016)
- Listwise loss: BPR / TOP1 loss
- Captures temporal regularities in user interest and balances long-term against short-term: once some short-term interests are satisfied, the demand for diversity strengthens, and after a while the same demand reappears periodically
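For reference, a minimal sketch of the two losses named above as defined in the GRU4Rec paper, scoring one positive item against a set of sampled negatives.

```python
import numpy as np

# BPR and TOP1 pairwise ranking losses from the GRU4Rec paper (ICLR 2016),
# for one positive item scored against N sampled negative items.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(pos_score: float, neg_scores: np.ndarray) -> float:
    return -np.mean(np.log(sigmoid(pos_score - neg_scores)))

def top1_loss(pos_score: float, neg_scores: np.ndarray) -> float:
    # ranking term plus a regularizer pushing negative scores toward zero
    return np.mean(sigmoid(neg_scores - pos_score) + sigmoid(neg_scores ** 2))
```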

Balancing long-term and short-term interest: Phased GRU RecNet (cont.)

GRU's default assumption is equidistant sampling. Excerpt from the underlying paper (Hidasi et al., "Session-based Recommendations with Recurrent Neural Networks", ICLR 2016), sections 2.2, 3, and 3.1:

2.2 Deep Learning in Recommenders

One of the first related methods in the neural networks literature is the use of Restricted Boltzmann Machines (RBM) for Collaborative Filtering (Salakhutdinov et al., 2007). In this work an RBM is used to model user-item interaction and perform recommendations. This model has been shown to be one of the best performing Collaborative Filtering models. Deep models have been used to extract features from unstructured content such as music or images that are then used together with more conventional collaborative filtering models. In Van den Oord et al. (2013) a convolutional deep network is used to extract features from music files that are then used in a factor model. More recently Wang et al. (2015) introduced a more generic approach whereby a deep network is used to extract generic content features from any type of item; these features are then incorporated in a standard collaborative filtering model to enhance recommendation performance. This approach seems to be particularly useful in settings where there is not sufficient user-item interaction information.

3 Recommendations with RNNs

Recurrent Neural Networks have been devised to model variable-length sequence data. The main difference between RNNs and conventional feedforward deep models is the existence of an internal hidden state in the units that compose the network. Standard RNNs update their hidden state $h$ using the following update function:

$h_t = g(W x_t + U h_{t-1})$  (1)

where $g$ is a smooth and bounded function such as a logistic sigmoid and $x_t$ is the input of the unit at time $t$. An RNN outputs a probability distribution over the next element of the sequence, given its current state $h_t$.

A Gated Recurrent Unit (GRU) (Cho et al., 2014) is a more elaborate model of an RNN unit that aims at dealing with the vanishing gradient problem. GRU gates essentially learn when and by how much to update the hidden state of the unit. The activation of the GRU is a linear interpolation between the previous activation and the candidate activation $\tilde{h}_t$:

$h_t = (1 - z_t) h_{t-1} + z_t \tilde{h}_t$  (2)

where the update gate is given by:

$z_t = \sigma(W_z x_t + U_z h_{t-1})$  (3)

while the candidate activation $\tilde{h}_t$ is computed in a similar manner:

$\tilde{h}_t = \tanh(W x_t + U (r_t \odot h_{t-1}))$  (4)

and finally the reset gate $r_t$ is given by:

$r_t = \sigma(W_r x_t + U_r h_{t-1})$  (5)

3.1 Customizing the GRU Model

We used the GRU-based RNN in our models for session-based recommendations. The input of the network is the actual state of the session while the output is the item of the next event in the session. The state of the session can either be the item of the actual event or the events in the session so far. In the former case 1-of-N encoding is used, i.e. the input vector's length equals the number of items and only the coordinate corresponding to the active item is one, the others are zeros. The latter setting uses a weighted sum of these representations, in which events are discounted if they have occurred earlier. For the sake of stability, the input vector is then normalized. We expect this to help because it reinforces the memory effect: the reinforcement of very local ordering constraints which are not well captured by the longer memory of the RNN. We also experimented with adding an additional embedding layer, but the 1-of-N encoding always performed better.

The core of the network is the GRU layer(s), and additional feedforward layers can be added between the last layer and the output. The output is the predicted preference of the items, i.e. the likelihood of being the next in the session for each item. When multiple GRU layers are used, the hidden state of the previous layer is the input of the next one. The input can also be optionally connected […]
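To make the update rules concrete, here is a minimal NumPy sketch of one GRU step implementing equations (2) to (5); weight matrices are passed in explicitly and biases are omitted, mirroring the paper's notation.

```python
import numpy as np

# One GRU step, following equations (2)-(5) of the excerpt above.
# Weight shapes are illustrative; biases are omitted as in the paper.
def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, W, U):
    sigma = lambda a: 1.0 / (1.0 + np.exp(-a))
    z_t = sigma(Wz @ x_t + Uz @ h_prev)              # update gate, eq. (3)
    r_t = sigma(Wr @ x_t + Ur @ h_prev)              # reset gate, eq. (5)
    h_cand = np.tanh(W @ x_t + U @ (r_t * h_prev))   # candidate, eq. (4)
    return (1.0 - z_t) * h_prev + z_t * h_cand       # interpolation, eq. (2)
```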


[Figure: the GRU unit, with the reset gate and the update gate marked]

Balancing long-term and short-term interest: Phased GRU RecNet (cont.)

- Real sessions are not sampled evenly: some sessions contain 100 actions in a single day, while other sessions contain one action in an entire month
- Phased GRU introduces a time gate k that controls how much the state variables are updated according to the sampling interval (while also coarsening the sampling interval to some degree)
- Based on: "Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences"
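The slides do not spell out the gate, so below is a sketch of how the Phased LSTM time gate k_t could modulate a GRU state update. tau (period), s (phase shift), r_on (open ratio), and alpha (leak rate) are the Phased LSTM parameters; their use here with a GRU is an assumption.

```python
import numpy as np

# Phased LSTM time gate (Neil et al., 2016) applied to a GRU state update,
# as the "Phased GRU" above suggests. tau, s, r_on, alpha are per-unit
# parameters; this adaptation is a sketch, not the talk's exact formulation.
def time_gate(t: float, tau: float, s: float, r_on: float, alpha: float) -> float:
    phi = ((t - s) % tau) / tau                  # phase within the cycle
    if phi < 0.5 * r_on:
        return 2.0 * phi / r_on                  # gate opening
    if phi < r_on:
        return 2.0 - 2.0 * phi / r_on            # gate closing
    return alpha * phi                           # closed: small leak

def phased_update(h_prev: np.ndarray, h_new: np.ndarray, k: float) -> np.ndarray:
    # only a fraction k of the usual GRU update is applied at this timestamp
    return k * h_new + (1.0 - k) * h_prev
```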

Interest prediction for sparse-behavior users with an epidemic model
- A large share of users have very sparse behavior, watching no more than 3 titles per month
- The interest evolution of a user population follows a mechanism similar to epidemic spreading
- Prediction: [formula not recoverable from the slide]

Exploration based on Nystrom/CUR
- Many entries of the N x N i2i matrix are very sparse; exploration to collect that data would need a great deal of traffic and is costly
- Nystrom/CUR: c landmark items can represent the entire i2i similarity matrix
- The c items are chosen by statistical leverage scores
- Exploration focuses on users who have watched one of the c items
- [Figure: the n x n similarity matrix approximated through an n x c landmark block]
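A sketch of landmark selection by statistical leverage scores, in the spirit of CUR column selection; the rank k, the count c, and sampling without replacement are illustrative choices rather than the talk's exact procedure.

```python
import numpy as np

# Choose c landmark items by statistical leverage scores over the top-k
# right singular subspace of the similarity matrix. k, c are illustrative.
def select_landmarks(sim: np.ndarray, c: int, k: int = 50, seed: int = 0):
    _, _, vt = np.linalg.svd(sim, full_matrices=False)
    lev = (vt[:k] ** 2).sum(axis=0) / k        # leverage score per column
    probs = lev / lev.sum()
    rng = np.random.default_rng(seed)
    return rng.choice(sim.shape[1], size=c, replace=False, p=probs)
```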

Interest identification with HIN graphs and clustering
Algorithm idea: build a bipartite graph from users' play records on shows; each node's label propagates to its adjacent nodes by similarity. At every propagation step, each node updates its own label according to its neighbors' labels, and the larger a neighbor's similarity to the node, the larger that neighbor's influence on the label. When the labels of the vast majority of nodes stop changing, the network has partitioned itself into communities by label.

Weight settings:
- An item node's weight is the reciprocal of the number of viewers of that show
- A user node's weight is the reciprocal of the number of shows that user has watched
- A user-item edge's weight is the user's completion rate on that show
- A random factor is added to the user-item edge weights

Evaluation:
- 100% of items were successfully attached to a cluster
- 99.48% of clusters contain a single item; the largest cluster contains 32 shows
- All users were partitioned into 35,830 clusters
- [Figure: histogram of the number of users per cluster]

A typical case:

No. | Show ID | Show title
1 | 323580 | 汽车城之建筑队
2 | 323577 | 汽车城之车特洛伊
3 | 318953 | 和迷你卡车学习
4 | 323581 | 汽车城之汤姆的油漆店
5 | 323573 | 汽车城之超级变形卡车
6 | 323571 | 汽车城之拖车汤姆
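A toy sketch of the weighted label-propagation step described above; the adjacency structure, visiting order, and stopping rule are illustrative.

```python
import random

# Toy weighted label propagation over the user-item bipartite graph above.
# `adj` maps node -> list of (neighbour, edge_weight); every node starts in
# its own community and repeatedly adopts the label with the highest total
# neighbour weight, until labels stop changing.
def propagate_labels(adj, max_sweeps=20, seed=0):
    random.seed(seed)
    labels = {node: node for node in adj}
    nodes = list(adj)
    for _ in range(max_sweeps):
        random.shuffle(nodes)            # asynchronous, random visiting order
        changed = 0
        for node in nodes:
            tally = {}
            for nbr, weight in adj[node]:
                tally[labels[nbr]] = tally.get(labels[nbr], 0.0) + weight
            if tally:
                best = max(tally, key=tally.get)
                if best != labels[node]:
                    labels[node] = best
                    changed += 1
        if changed == 0:                 # the vast majority stopped updating
            break
    return labels
```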

Hierarchical View Feedback Aggregation
- Model capacity is limited: end-to-end models have limited ability to precisely capture personalized features
- The optimum lies in a very high-dimensional space; because of noise and limited model convergence, human help is needed to shrink the search space
- Using statistics over crossed features works better than using raw discrete crossed-id features
- Business understanding helps the model capture personalized features better
- Noise is filtered using the variance of the statistics
- Cross statistics better capture different user groups' interest in different video types, e.g.: average plays per user (vv) of Korean-drama fans on Taiwanese idol dramas; of Japanese-horror fans on American horror films; of 20-year-old tier-1-city women on gaming videos
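A minimal sketch of one such cross-statistic feature, with a minimum-support filter standing in for the variance-based noise filtering above; the threshold and field names are hypothetical.

```python
from collections import defaultdict

# Cross-statistic sketch: average views per user (vv) of one audience group
# on one content group, with a crude support threshold as a noise guard.
MIN_GROUP_SIZE = 1000  # hypothetical support threshold

def cross_group_vv(events, user_group, item_group):
    """events: iterable of (user_id, item_id, vv); *_group: membership sets."""
    per_user = defaultdict(int)
    for user_id, item_id, vv in events:
        if user_id in user_group and item_id in item_group:
            per_user[user_id] += vv
    if len(user_group) < MIN_GROUP_SIZE:
        return None                      # too few users: statistic unreliable
    return sum(per_user.values()) / len(user_group)
```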

Ranking feature set:
- User: age, gender, geo
- Video: tag, popularity, category, source, exclusive, purchased
- User interest: category, topic, tag
- Match type: relevance, popularity, trending
- Context: time of day, day of week, location, user id

Personalized ranking in Youku video search
- Network structure: sparse fully connected layer → a second encoding of the information within each feature domain → concat → global fully connected layers
- Feature-domain partition and encoding: query, user, and video; an id domain, a statistics domain, the user's watch sequence, tag interests, and text
- Ultra-high-dimensional sparse codes represent individual entities, and the neural network fits what individuals have in common; the video representation is the foundation
- Domains are split by feature importance and relatedness; the model has parameters in the hundreds of millions
- Challenges: feature dimensionality is high; the model takes a lot of storage, offline training is computationally expensive, online serving consumes heavy resources, and the forward pass cannot meet the response-time (RT) budget
- Mitigations: feature-domain partitioning, random encoding, attachment ("挂靠") encoding, and sampling techniques

We Are Hiring (contact: ly136216alibaba). Thanks.
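A compact NumPy sketch of the structure just described: sparse per-domain lookup, in-domain re-encoding, concat, and a global fully connected score. Domain names, vocabulary sizes, dimensions, and the sum pooling are all hypothetical; this is a shape-level illustration, not the production network.

```python
import numpy as np

# Shape-level sketch of a domain-partitioned ranking net: per-domain sparse
# embedding lookup, per-domain re-encoding, concat, global fully connected
# scoring. All sizes and the three domains are illustrative.
rng = np.random.default_rng(0)
DOMAINS = {"query": 10000, "user": 10000, "video_id": 10000}  # hashed vocab sizes
EMB, HID = 32, 64

emb = {d: rng.normal(0, 0.01, (v, EMB)) for d, v in DOMAINS.items()}
enc = {d: rng.normal(0, 0.01, (EMB, HID)) for d in DOMAINS}   # in-domain re-encoding
w_out = rng.normal(0, 0.01, (HID * len(DOMAINS), 1))          # global layer

def forward(feature_ids):
    """feature_ids: dict domain -> list of hashed feature indices."""
    parts = []
    for d in DOMAINS:
        pooled = emb[d][feature_ids[d]].sum(axis=0)     # sparse lookup + sum pool
        parts.append(np.maximum(pooled @ enc[d], 0.0))  # in-domain ReLU encoding
    x = np.concatenate(parts)                           # concat across domains
    return float((x @ w_out)[0])                        # global fully connected score

score = forward({"query": [3, 17], "user": [42], "video_id": [7]})
```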
