Reference: Terziyan V., Social Distance Metric: From Coordinates to Neighborhoods, International Journal of Geographical Information Science, 31 (12), 2401-2426, Taylor & Francis, 2017. (doi: 10.1080/13658816.2017.1367796)
For any two points $A$ and $B$ in some data space with metric $d$, we define the “mutual social ranking” function $R_d$ as follows: $R_d(A, B) = k$ if point $B$ is the $k$-th nearest neighbor of point $A$ in metric $d$, and $R_d(A, B) = 0$ if $A = B$. Similarly, $R_d(B, A) = m$ if point $A$ is the $m$-th nearest neighbor of point $B$ in metric $d$. The values $R_d(A, B)$ and $R_d(B, A)$ are not necessarily equal (see the example of mutual social ranking asymmetry in the figure below, where point $B$ is the eighth closest point to $A$, while point $A$ is the fifth closest point to $B$).
Special case (“rule of tie”): if several points (including point $B$) tie for the position of the $k$-th nearest neighbor of point $A$, then: .
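The mutual ranking can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the article: the Euclidean metric stands in for the underlying metric, the function name `mutual_rank` and the sample points are hypothetical, and the rule of tie is not implemented (a tied point simply receives the lowest of the competing ranks).

```python
import math

def mutual_rank(points, a, b):
    """Rank of points[b] among the nearest neighbors of points[a].

    Assumptions (not from the article): Euclidean metric, no tie rule.
    Returns 0 when a and b refer to the same point.
    """
    if a == b:
        return 0
    d_ab = math.dist(points[a], points[b])
    # Count the points strictly closer to points[a] than points[b] is;
    # the rank of points[b] is that count plus one.
    closer = sum(
        1
        for i, p in enumerate(points)
        if i != a and math.dist(points[a], p) < d_ab
    )
    return closer + 1

# Hypothetical sample: one isolated point and a small cluster.
points = [(0.0, 0.0), (4.0, 0.0), (5.0, 0.0), (6.0, 0.0)]

print(mutual_rank(points, 0, 1))  # 1: point 1 is the nearest neighbor of point 0
print(mutual_rank(points, 1, 0))  # 3: points 2 and 3 are both closer to point 1
```

The two printed ranks differ, which is the asymmetry described above: the isolated point sees the cluster member as its nearest neighbor, while the cluster member has two closer neighbors of its own.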
Given $R_d(A, B)$ and $R_d(B, A)$ obtained with some metric $d$, and choosing a parameter $p$, we can compute the Social Distance between $A$ and $B$ as follows:
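I. First we average $R_d(A, B)$ and $R_d(B, A)$ in a special way: we compute their Lehmer mean, which is:
$$L_p\big(R_d(A, B), R_d(B, A)\big) = \frac{R_d(A, B)^{p} + R_d(B, A)^{p}}{R_d(A, B)^{p-1} + R_d(B, A)^{p-1}}.$$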
Notice that this averaging function includes most of the well-known means, depending on $p$. E.g., it equals the arithmetic mean for $p = 1$ and the contraharmonic mean for $p = 2$. See more cases and details in the referenced article.
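As a quick numerical check of these special cases, here is a minimal Python sketch. It assumes the standard two-argument Lehmer mean, (x^p + y^p) / (x^(p-1) + y^(p-1)); the function name `lehmer_mean` is hypothetical, and the inputs 8 and 5 are the two ranks from the asymmetry example above.

```python
def lehmer_mean(x, y, p):
    """Lehmer mean of two positive numbers x and y with parameter p.

    Assumes the standard definition (x^p + y^p) / (x^(p-1) + y^(p-1)).
    """
    return (x**p + y**p) / (x**(p - 1) + y**(p - 1))

# Ranks from the asymmetry example: eighth closest and fifth closest.
r_ab, r_ba = 8, 5

arithmetic = lehmer_mean(r_ab, r_ba, 1)      # (8 + 5) / 2
contraharmonic = lehmer_mean(r_ab, r_ba, 2)  # (64 + 25) / (8 + 5)

print(arithmetic)      # 6.5
print(contraharmonic)  # 89/13, roughly 6.846
```

Larger values of the parameter pull the mean toward the larger of the two ranks, so the choice of parameter controls how strongly the less favorable rank dominates the average.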
II. Using the Lehmer mean above, we compute the Social Distance as follows:
.
… and we have proven it (and several modifications of it) to be a metric suitable for many intelligent data processing tasks (classification, clustering, etc.). See details in the article.
See examples of computing the Social Distance for different neighborhoods in 2D space for the same pair of data points $A$ and $B$: cases (a)–(c) have different configurations that result in the same distance; case (d) has some data points at the border of the neighborhood (ties); cases (e) and (f) have symmetric neighborhoods.