How much do we know about the User-Item Matrix?: Deep Feature Extraction for Recommendation

Lim, Taejun

Access status:

Open Access

Field	Value	Language
dc.contributor.author	Lim, Taejun
dc.date.accessioned	2022-11-29T04:39:47Z
dc.date.available	2022-11-29T04:39:47Z
dc.date.issued	2022	en
dc.identifier.uri	https://hdl.handle.net/2123/29763
dc.description.abstract	Collaborative filtering-based recommender systems typically operate on a high-dimensional, sparse user-item matrix. Matrix completion is one of the most common formulations: rows and columns represent users and items, and predicting a user's ratings of items corresponds to filling in the missing entries of the matrix. In practice, it is very challenging to predict one user's interests from millions of other users, each of whom has seen only a small subset of thousands of items. We consider how to extract the key features of users and items from the rating matrix into low-dimensional vectors, and how to create embeddings that represent the characteristics of users and items well, by exploring which user/item information in the matrix to use. Recent studies have focused on utilising side information, such as a user's age or a movie's genre, but such information is not always available and is hard to extract. More importantly, there has been no recent research on how to efficiently extract the important latent features from a sparse data matrix with no side information (the first problem). The second problem is that most matrix completion techniques focus on semantic similarity between users and items, transforming the rating matrix into a user/item similarity matrix or a graph and neglecting the position of each element (user, item and rating) in the matrix. We argue, however, that position is fundamental to matrix completion, since the entry to be filled is identified by the positions of its row and column in the matrix. To address the first problem, we aim to represent each entry of a high-dimensional sparse user-item matrix in a low-dimensional space with a small number of important features, and we propose a Global-Local Kernel-based matrix completion framework, named GLocal-K, which is divided into two major stages.
First, we pre-train an autoencoder with a local kernelised weight matrix, which transforms the data from one space into the feature space using a 2d-RBF kernel. Then, the pre-trained autoencoder is fine-tuned with the rating matrix produced by a convolution-based global kernel, which captures the characteristics of each item. GLocal-K outperforms the state-of-the-art baselines on three collaborative filtering benchmarks. However, it cannot demonstrate its superior feature extraction ability when the data is very large or extremely sparse. To address the second problem and this limitation of GLocal-K, we propose a novel position-enhanced user/item representation training model for recommendation, SUPER-Rec. We first capture the rating position in the matrix using relative positional rating encoding, and store the position-enhanced rating information and its user-item relationship in a fixed-dimension embedding that is not affected by the matrix size. Then, we apply the trained position-enhanced user and item representations to the simplest traditional machine learning models to isolate the contribution of the SUPER-Rec representation. We contribute the first formal introduction and quantitative analysis of position-enhanced user/item representations in the recommendation domain, and provide a principled discussion of SUPER-Rec, whose RMSE/MAE/NDCG/AUC results (i.e., both rating and ranking prediction accuracy) improve by a large margin on various state-of-the-art matrix completion models on both explicit and implicit feedback datasets. For example, SUPER-Rec achieved a 28.2% decrease in RMSE on ML-1M compared with the best baseline, whereas error decreases of 0.3% to 4.1% were typical among the baselines.	en
dc.language.iso	en	en
dc.rights	The author retains copyright of this thesis
dc.subject	Recommender systems	en
dc.subject	Collaborative filtering	en
dc.subject	Matrix completion	en
dc.subject	Feature extraction	en
dc.subject	Feature representation	en
dc.subject	Deep learning	en
dc.title	How much do we know about the User-Item Matrix?: Deep Feature Extraction for Recommendation	en
dc.type	Thesis
dc.type.thesis	Masters by Research	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Computer Science	en
usyd.degree	Master of Philosophy M.Phil	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Poon, Josiah
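
The matrix-completion formulation summarised in the abstract (rows as users, columns as items, missing entries to be predicted) can be illustrated with a minimal sketch. This is not GLocal-K or SUPER-Rec: it substitutes a plain low-rank factorisation fitted by gradient descent, and the toy matrix, the convention that 0 marks a missing rating, and the names masked_rmse and factorise are all assumptions made for illustration only.

```python
import numpy as np

def masked_rmse(ratings, predictions, mask):
    """RMSE over observed entries only (where mask is True)."""
    diff = (ratings - predictions)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def factorise(ratings, mask, rank=2, steps=2000, lr=0.01, reg=0.01, seed=0):
    """Fit ratings ~ U @ V.T on the observed entries by gradient descent.

    A generic low-rank baseline, not the kernel-based method of the thesis.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = 0.1 * rng.standard_normal((n_users, rank))  # user embeddings
    V = 0.1 * rng.standard_normal((n_items, rank))  # item embeddings
    for _ in range(steps):
        # Residual on observed entries; missing entries contribute nothing.
        err = np.where(mask, ratings - U @ V.T, 0.0)
        # Simultaneous update so both gradients use the previous U and V.
        U, V = U + lr * (err @ V - reg * U), V + lr * (err.T @ U - reg * V)
    return U, V

# Toy 4-user x 5-item rating matrix; 0 marks a missing rating (assumption).
R = np.array([[5, 3, 0, 1, 0],
              [4, 0, 0, 1, 1],
              [1, 1, 0, 5, 4],
              [0, 1, 5, 4, 0]], dtype=float)
observed = R > 0

U, V = factorise(R, observed)
completed = U @ V.T                               # dense matrix of predictions
baseline = np.full_like(R, R[observed].mean())    # predict the global mean

print("factorisation RMSE:", masked_rmse(R, completed, observed))
print("global-mean RMSE:  ", masked_rmse(R, baseline, observed))
```

The entries of completed at positions where observed is False are the model's predictions for unseen user-item pairs. Measuring RMSE only over observed entries mirrors how rating-prediction accuracy is typically reported, although a proper evaluation would hold out a test split rather than score the training entries.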