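ALS Scoring
The Alternating Least Squares (ALS) algorithm factorizes a sparse matrix and estimates the values of its missing entries to produce a basic trained model. Among collaborative filtering approaches, ALS is a User-Item CF (Collaborative Filtering) algorithm: it takes both the User and the Item into account, and is also called hybrid CF. This topic describes how to use the results of ALS matrix factorization to score User and Item pairs.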
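For orientation, matrix factorization models of this kind usually predict a score as the inner product of a user factor vector and an item factor vector. The formula below is a sketch under that assumption; the symbols p_u, q_i, and K are introduced here for illustration only, and this page does not state whether the component applies any additional terms.

```latex
\hat{r}_{ui} = \mathbf{p}_u^{\top}\mathbf{q}_i = \sum_{k=1}^{K} p_{uk}\, q_{ik}
```

Here p_u is the factor vector of user u, q_i is the factor vector of item i, and K is the number of factors (each factor vector in the example tables on this page has 10 entries).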
Limitations
The supported compute engines are MaxCompute and Flink.
Configure component parameters visually
Input ports
Input port (left to right)
Data type
Recommended upstream component
Required
user factor table
None
Yes
item factor table
None
Yes
Input data to be scored
None
Yes
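Component parameters
Tab | Parameter | Description |
Field settings | user column name | The name of the user ID column in the input data source. The data in this column must be of the BIGINT type. |
Field settings | item column name | The name of the item column in the input data source. The data in this column must be of the BIGINT type. |
Parameter settings | Prediction result column name | The name of the column in the output table that stores the scoring results. |
Parameter settings | Output table lifecycle | The lifecycle of the output table. |
Execution tuning | Number of nodes | Valid values: 1 to 9999. |
Execution tuning | Memory size per node | Valid values: 1024 MB to 64*1024 MB. |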
Output ports
Output port (left to right) | Data type | Downstream component |
Scoring result table | None | None |
Example
The user factor table and item factor table used for scoring:
Output user factor table
user_id | factors |
8528750 | [0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,0.02046867,0.022253247,0.027391396,0.018985065,0.04889483] |
282500 | [0.116156064,0.07193632,0.090851225,0.017075706,0.025412979,0.047022138,0.12534861,0.05869226,0.11170533,0.1640192] |
4895250 | [0.038429666,0.061858658,0.04236993,0.055866677,0.031814687,0.0417443,0.012085311,0.0379342,0.10767074,0.028392972] |
... | ... |
Output item factor table
item_id | factors |
24601 | [0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,0.0059205987,0.008568814,0.0015981696,0.0,0.013601779] |
26699 | [0.0027524426,0.0043066847,0.0031336215,0.00269448,0.0022347474,0.0020477585,0.0027995422,0.0025390312,0.0033011117,0.003957773] |
20751 | [0.03902271,0.050952066,0.032981463,0.03862796,0.048720762,0.027976315,0.02721664,0.018149626,0.0149896275,0.026251089] |
... | ... |
Scoring result table:
user_id | item_id | pred |
19500 | 143 | 1.882628425846633E-4 |
19500 | 2610 | 1.1106864974408381E-4 |
19500 | 2655 | 8.975836536251336E-6 |
19500 | 3190 | 1.6171501181361236E-4 |
19500 | 3720 | 2.3276544959571766E-4 |
19500 | 5254 | 2.420645481606698E-4 |
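The following Python sketch illustrates how the three inputs (user factor table, item factor table, and the pairs to be scored) could be combined outside the component. It assumes the score is the plain inner product of the two factor vectors, which is the usual convention for matrix factorization but is not confirmed by this page. The function names `parse_factors` and `score_pairs`, the item ID 99999, and the choice to return `None` for pairs without factors are all hypothetical and are not part of the component.

```python
import json


def parse_factors(text):
    """Parse a factor string such as "[0.1,0.2]" into a list of floats."""
    return [float(x) for x in json.loads(text)]


def score_pairs(user_factors, item_factors, pairs):
    """Return (user_id, item_id, pred) rows for the requested pairs.

    Assumption: pred is the inner product of the two factor vectors.
    Pairs whose user or item has no factors get pred = None; how the
    real component treats such pairs is not documented here.
    """
    rows = []
    for user_id, item_id in pairs:
        u = user_factors.get(user_id)
        v = item_factors.get(item_id)
        if u is None or v is None:
            rows.append((user_id, item_id, None))
            continue
        if len(u) != len(v):
            raise ValueError("user and item factor vectors differ in length")
        rows.append((user_id, item_id, sum(a * b for a, b in zip(u, v))))
    return rows


# Factor rows copied from the example tables above.
user_factors = {
    8528750: parse_factors(
        "[0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,"
        "0.02046867,0.022253247,0.027391396,0.018985065,0.04889483]"
    ),
}
item_factors = {
    24601: parse_factors(
        "[0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,"
        "0.0059205987,0.008568814,0.0015981696,0.0,0.013601779]"
    ),
}

# 99999 is a hypothetical item ID with no factor row.
pairs = [(8528750, 24601), (8528750, 99999)]
for row in score_pairs(user_factors, item_factors, pairs):
    print(row)
```

The IDs in the truncated factor tables do not overlap with the IDs shown in the scoring result table, so this sketch cannot be used to reproduce the listed pred values.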