Theoretical and empirical analysis of similarity measures

Advisor: Dr. E. Amigó Cabrera

In many information access tasks, such as document clustering, filtering, and text evaluation, measuring the similarity between texts is a core issue. I will describe our work on three aspects: how to combine similarity measures, what the basic axioms of similarity are and what empirical effects they have, and how to exploit similarity training data. Regarding the first aspect, I will briefly describe my collaboration on the formal and empirical analysis of unsupervised combining functions. This work is closely related to ranking fusion, voting, and averaging techniques. Regarding the second aspect, I will describe a proposed theory that explains the relations between probabilistic, set-theoretic, and information-theoretic models. The resulting axioms help us analyze state-of-the-art measures. I will show some experiments and point out the way forward. In the area of semi-supervised clustering, I will describe a proposal that takes into account both the content of the texts (a direct measure) and their proximity to a set of previously grouped texts, and I will show the experiments performed.

Fernando Giner Martínez


Videos in the series
Biometric Authentication Based On Retinal Vascular Network
Advisor: Dr. Enrique J. Carmona Suárez
1 Jun. 2015
Towards Affective States Detection in Educational Contexts
Advisors: Dr. Jesús González Boticario and Dr. Olga C. Santos Martín
1 Jun. 2015
Alert Detection and Event Detection for Reputation Management in Twitter
Advisors: Dr. J. Gonzalo, Dr. E. Amigó
2 Jun. 2015
Co-Occurrence Graphs for Multilingual Word Sense
Advisors: Dr. L. Araujo, Dr. J. Martínez Romo
2 Jun. 2015
Theoretical and empirical analysis of similarity measures
Advisor: Dr. E. Amigó Cabrera
2 Jun. 2015