A1 Journal article (refereed)
Social Distance metric : from coordinates to neighborhoods (2017)
Terziyan, V. (2017). Social Distance metric : from coordinates to neighborhoods. International Journal of Geographical Information Science, 31(12), 2401-2426. https://doi.org/10.1080/13658816.2017.1367796
JYU authors or editors
Publication details
All authors or editors: Terziyan, Vagan
Journal or series: International Journal of Geographical Information Science
ISSN: 1365-8816
eISSN: 1365-8824
Publication year: 2017
Volume: 31
Issue number: 12
Pages range: 2401-2426
Publisher: Taylor & Francis
Publication country: United Kingdom
Publication language: English
DOI: https://doi.org/10.1080/13658816.2017.1367796
Publication open access: Not open
Abstract
The choice of a distance metric is key to success in many machine learning and data processing tasks. The distance between two data samples traditionally depends on the values of their attributes (coordinates) in a data space. Some metrics also take into account the distribution of samples within the space (e.g. local densities), aiming to improve classification or clustering performance. In this paper, we suggest the Social Distance metric, which can be used on top of any traditional metric. For a pair of samples x and y, it averages two numbers: the rank that sample y holds in the ordered list of nearest neighbors of x and, vice versa, the rank of x in the list of nearest neighbors of y. The average is the contraharmonic (Lehmer) mean, which penalizes the difference between the two numbers by giving values greater than the arithmetic mean for unequal arguments. We consider the normalized average as a distance function and prove it to be a metric. We present several modifications of this metric and show that their properties are useful for a variety of classification and clustering tasks in data spaces or graphs, in a Geographic Information Systems context and beyond.
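The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, Euclidean distance is assumed as the underlying traditional metric, ties in the neighbor lists are broken by sample index, and the normalization (division by the largest possible rank, n - 1) is an assumption, since the abstract does not state which normalization is used.

```python
import numpy as np


def neighbor_ranks(points):
    """ranks[i, j] = position of sample j in the nearest-neighbor list of sample i.

    Assumption: Euclidean distance is the base metric; ties are broken by index.
    A sample holds rank 0 in its own list (for distinct points).
    """
    diffs = points[:, None, :] - points[None, :, :]
    base = np.sqrt((diffs ** 2).sum(axis=-1))
    order = np.argsort(base, axis=1, kind="stable")
    n = len(points)
    ranks = np.empty((n, n), dtype=int)
    # Invert the ordering: the k-th entry of row i's sorted list gets rank k.
    ranks[np.arange(n)[:, None], order] = np.arange(n)
    return ranks


def contraharmonic_mean(a, b):
    """Contraharmonic (Lehmer, p = 2) mean (a^2 + b^2) / (a + b); 0 where a + b = 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.zeros_like(total)
    np.divide(a * a + b * b, total, out=out, where=total > 0)
    return out


def social_distance_matrix(points):
    """Pairwise contraharmonic mean of mutual ranks, scaled to [0, 1].

    Assumption: normalization divides by the maximum rank n - 1.
    """
    ranks = neighbor_ranks(points)
    return contraharmonic_mean(ranks, ranks.T) / (len(points) - 1)


# Four samples on a line; the last one is an outlier.
points = np.array([[0.0], [1.0], [3.0], [10.0]])
d = social_distance_matrix(points)
print(np.round(d, 3))
# d[0, 1] is about 0.333: samples 0 and 1 are each other's nearest neighbor.
# d[2, 3] is about 0.833: sample 2 is nearest to 3, but 3 is farthest from 2,
# so the unequal ranks (1 and 3) give a mean of 2.5 rather than the arithmetic 2.
```

In this toy example the outlier at 10 remains socially distant from sample 2 even though sample 2 is its nearest neighbor, which is the penalizing effect of the contraharmonic mean that the abstract describes.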
Keywords: data mining; cluster analysis; geographic information systems; density; classification; graphs (network theory)
Free keywords: metric; Lehmer mean; distance function; social neighborhood; clustering
Contributing organizations
Ministry reporting: Yes
VIRTA submission year: 2017
JUFO rating: 2