A common concept-based representation space for recommendation: Matching user preferences and items.

Advisors: Dr. A. García Serrano, Dr. J. Cigarrán

Content-based recommendation offers items whose content is, to some degree, similar to the content of the items that users have already consumed or are otherwise related to. In this context, two problems arise: creating an accurate content representation, and linking the user-related information to that representation in the most adequate way. To cope with the latter problem, we propose a common latent space, based on Formal Concept Analysis (FCA), in which both users and items are represented. Recommendation then becomes straightforward: it amounts to looking for the items closest to the user representation in the common space. To create an accurate content representation, the FCA-based representation is built on the semantic information related to the data, coming from Linked Open Data (LOD) resources (mainly DBpedia) and from WordNet. Our hypothesis is that the abstraction and the linking between data that this semantic information provides will represent the data better than unstructured text or an isolated user profile. Similar ideas, based on ontology representations, can be found in the state of the art. However, ontology creation is an expensive and time-consuming process; therefore, these approaches are often limited by the quality of the generated ontology (i.e., the number of objects, classes and relationships included in it). To address this problem, probabilistic methodologies, such as Latent Dirichlet Allocation (LDA), have been proposed to create a latent conceptual space based on the implicit relationships between the data. These techniques also present some problems: their complexity, the need to set in advance the number of concepts to be detected in the latent space, and the impossibility of linking the latent concepts to concepts in the real world (they are mathematical formalizations of the latent data relationships).
Our proposal intends to go a step further in the area of recommendation by creating a data-driven common latent space for the user and item models. The aim is to reach the adaptability of the probabilistic methodologies while avoiding their drawbacks, and to offer a level of formalization and structure similar to that of an ontology-based methodology.
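
As a rough illustration of the idea of a common concept space, the following minimal Python sketch builds the formal concepts of a toy formal context and places a user in the same space through the attributes shared by the items they consumed. It is not the method developed in the thesis: the context `CONTEXT`, the item and attribute names, and the helper functions `extent`, `intent`, `formal_concepts` and `recommend` are all hypothetical, the concept enumeration is a brute-force one suitable only for tiny examples, and the recommendation rule (suggest the unseen items in the extent of the user's concept) is a simplifying assumption.

```python
from itertools import combinations

# Hypothetical toy formal context: item -> set of semantic attributes.
# In the thesis the attributes would come from DBpedia/WordNet; here they are made up.
CONTEXT = {
    "film_a": {"sci-fi", "space"},
    "film_b": {"sci-fi", "space", "robots"},
    "film_c": {"sci-fi", "robots"},
    "film_d": {"drama"},
}


def extent(attrs, context):
    """Return the items that have every attribute in attrs."""
    return {item for item, item_attrs in context.items() if attrs <= item_attrs}


def intent(items, context):
    """Return the attributes shared by every item in items."""
    if not items:
        # By FCA convention, the empty item set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[i] for i in items))


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) by closing every item subset.

    Brute force (exponential in the number of items): fine for a toy example only.
    """
    concepts = set()
    items = list(context)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            shared = intent(set(subset), context)
            concepts.add((frozenset(extent(shared, context)), frozenset(shared)))
    return concepts


def recommend(consumed, context):
    """Assumed rule: map the user to the concept generated by the consumed items
    and suggest the other items in that concept's extent."""
    user_intent = intent(set(consumed), context)
    return sorted(extent(user_intent, context) - set(consumed))


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
    print(recommend({"film_a", "film_c"}, CONTEXT))
```

In this sketch, users and items live in the same structure because a user is described only through the concept generated by their consumed items, so no separate user model has to be aligned with the item representation.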

Angel Castellanos