Reference: Terziyan V., Social Distance Metric: From Coordinates to Neighborhoods, International Journal of Geographical Information Science, 31 (12), 2401-2426, Taylor & Francis, 2017. (doi: 10.1080/13658816.2017.1367796)
For any two points $A$ and $B$ in some data space with metric $d$, we define the “mutual social ranking” function $R_d(A, B)$ as follows: $R_d(A, B) = k$, if point $B$ is the $k$-th nearest neighbor of the point $A$ in metric $d$; and
$R_d(A, B) = 0$, if $A = B$. Similarly:
$R_d(B, A) = q$, if point $A$ is the $q$-th nearest neighbor of the point $B$ in metric $d$. It is evident that $R_d(A, B)$ and $R_d(B, A)$ are not necessarily equal (see the example of the mutual social ranking asymmetry in the figure below, where the point $B$ for the point $A$ is the eighth closest one while the point $A$ for the point $B$ is the fifth closest one).
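The asymmetry of the mutual ranking can be illustrated with a minimal Python sketch. It assumes the Euclidean metric, and the function name `rank_of`, the sample points, the rank 0 for a point with respect to itself, and the handling of equidistant points (tied points share the lowest rank) are all assumptions made for illustration; in particular, the sketch does not implement the article's rule of tie.

```python
import math

def rank_of(points, a, b):
    """Rank of points[b] among the nearest neighbors of points[a] (1 = nearest).

    Assumptions (not taken from the article): Euclidean metric, rank 0 for
    a point with respect to itself, and tied points share the lowest rank.
    """
    if a == b:
        return 0
    d_ab = math.dist(points[a], points[b])
    # Count the points, other than points[a] itself, strictly closer to it.
    closer = sum(
        1
        for i, p in enumerate(points)
        if i != a and math.dist(points[a], p) < d_ab
    )
    return closer + 1

# Hypothetical 2D data: A is isolated, B sits next to two other points.
points = [(0, 0), (3, 0), (4, 0), (5, 0)]  # indices: A = 0, B = 1

print(rank_of(points, 0, 1))  # B is the 1st nearest neighbor of A -> 1
print(rank_of(points, 1, 0))  # A is the 3rd nearest neighbor of B -> 3
```

Here B is the closest point to A, yet A is only the third closest point to B, so the two rankings differ.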
Special case (“rule of tie”): If several points (including point $B$) tie for the position of the $k$-th nearest neighbor of point $A$, then:
.
Having the mutual rankings $R_d(A, B)$ and $R_d(B, A)$ of two points $A$ and $B$ due to the use of some metric $d$, and choosing a parameter $p$, we can compute the Social Distance between $A$ and $B$ as follows:
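I. First, we average the two mutual rankings $R_d(A, B)$ and $R_d(B, A)$ in a special way. We compute their Lehmer mean with parameter $p$, which is:
$$L_p\bigl(R_d(A, B), R_d(B, A)\bigr) = \frac{R_d(A, B)^{p} + R_d(B, A)^{p}}{R_d(A, B)^{p-1} + R_d(B, A)^{p-1}}.$$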
Notice that this averaging function includes most of the well-known means, depending on the parameter $p$. E.g., it is equal to the Arithmetic mean for $p = 1$, and to the Contraharmonic mean for $p = 2$. See more cases and details in the referred article.
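A short Python sketch can make the role of the parameter concrete. It uses the standard two-argument Lehmer mean, $(a^p + b^p) / (a^{p-1} + b^{p-1})$; the function name `lehmer_mean` and the parameter name `p` are choices made here, and the input rankings 8 and 5 are borrowed from the asymmetry example above. Both inputs are assumed to be positive, since a zero ranking leads to division by zero for $p \le 1$.

```python
def lehmer_mean(a, b, p):
    """Standard Lehmer mean of two positive numbers a and b with parameter p."""
    return (a ** p + b ** p) / (a ** (p - 1) + b ** (p - 1))

# Mutual rankings from the asymmetry example: eighth and fifth closest.
r_ab, r_ba = 8, 5

arithmetic = lehmer_mean(r_ab, r_ba, 1)      # (8 + 5) / 2 = 6.5
contraharmonic = lehmer_mean(r_ab, r_ba, 2)  # (64 + 25) / (8 + 5) = 89 / 13
harmonic = lehmer_mean(r_ab, r_ba, 0)        # 2 / (1/8 + 1/5) = 80 / 13

print(arithmetic, contraharmonic, harmonic)
```

The mean is symmetric in its two arguments, so the asymmetric pair of rankings collapses into a single value that does not depend on the order of the points, and larger values of `p` pull the result toward the larger of the two rankings.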
II. Using the Lehmer mean above, we compute the Social Distance as follows:
.
… and we have proven it (and several modifications of it) to be a metric suitable for many intelligent data processing tasks (classification, clustering, etc.). See details in the article.
See examples of computing the Social Distance for different neighborhoods in 2D space for the same pair of data points $A$ and $B$: cases (a)–(c) have different configurations resulting in the same distance; case (d) has some data points on the border of the neighborhood (ties); cases (e) and (f) have symmetric neighborhoods.