Social Distance Metric

Reference:   Terziyan V., Social Distance Metric: From Coordinates to Neighborhoods, International Journal of Geographical Information Science, 31 (12), 2401-2426, Taylor & Francis, 2017. (doi: 10.1080/13658816.2017.1367796)

 

 

For any two points $A$ and $B$ in some data space with metric $d$, we define the “mutual social ranking” function $r(A, B)$ as follows: $r(A, B) = k$ if point $B$ is the $k$-th nearest neighbor of point $A$ in metric $d$; and $r(A, B) = 0$ if $A = B$. Similarly, $r(B, A) = n$ if point $A$ is the $n$-th nearest neighbor of point $B$ in metric $d$. The values $r(A, B)$ and $r(B, A)$ are not necessarily equal (see the example of mutual social ranking asymmetry in the figure below, where point $B$ is the eighth closest point to $A$ while point $A$ is the fifth closest point to $B$).
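The ranking described above can be sketched in Python. This is only an illustration under stated assumptions: the metric is taken to be Euclidean distance, no two points are equidistant from the reference point (the tie rule is not implemented), and a point is given rank 0 with respect to itself. The function name `social_rank` and the sample points are hypothetical and do not come from the article.

```python
import math

def social_rank(points, a, b):
    """Rank of points[b] among the neighbors of points[a] (1 = nearest).

    Assumptions: Euclidean metric, no ties, rank 0 for a point to itself.
    """
    if a == b:
        return 0
    d_ab = math.dist(points[a], points[b])
    # Count the other points lying strictly closer to points[a] than points[b].
    closer = sum(
        1
        for i, q in enumerate(points)
        if i not in (a, b) and math.dist(points[a], q) < d_ab
    )
    return closer + 1

# Hypothetical sample data: five points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.5, 0.0), (4.2, 0.0)]

r_ab = social_rank(points, 0, 2)  # rank of points[2] as seen from points[0]
r_ba = social_rank(points, 2, 0)  # rank of points[0] as seen from points[2]
print(r_ab, r_ba)  # prints "2 4": the two rankings differ
```

The sample shows the asymmetry: the second point sits in a denser region, so the first point ranks lower in its neighborhood than the other way round.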

 

Special case (“rule of tie”): If $m$ points (including point $B$) tie for the position of the $k$-th nearest neighbor of point $A$, then: .

 

 

Given the mutual social rankings $r(A, B)$ and $r(B, A)$ obtained with some metric $d$, and choosing a parameter $p$, we can compute the Social Distance between $A$ and $B$ as follows:

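I. First we average the two rankings $r(A, B)$ and $r(B, A)$ in a special way. We compute their Lehmer mean with parameter $p$, which is:

$$L_p\big(r(A, B),\, r(B, A)\big) = \frac{r(A, B)^{p} + r(B, A)^{p}}{r(A, B)^{p-1} + r(B, A)^{p-1}}.$$

Notice that this averaging function covers many well-known means, depending on $p$. E.g., it equals the arithmetic mean for $p = 1$ and the contraharmonic mean for $p = 2$. See more cases and details in the referred article.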
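As a quick numerical check of the averaging step, the sketch below assumes the standard two-argument Lehmer mean, $(x^p + y^p) / (x^{p-1} + y^{p-1})$, and applies it to the rankings 8 and 5 from the asymmetry example. The function name `lehmer_mean` is hypothetical, and the code covers only the averaging, not the final Social Distance formula.

```python
def lehmer_mean(x, y, p):
    """Standard Lehmer mean of two positive numbers with parameter p."""
    return (x**p + y**p) / (x**(p - 1) + y**(p - 1))

r_ab, r_ba = 8, 5  # rankings from the asymmetry example

arithmetic = lehmer_mean(r_ab, r_ba, 1)      # (8 + 5) / 2
contraharmonic = lehmer_mean(r_ab, r_ba, 2)  # (64 + 25) / (8 + 5)

print(arithmetic)      # 6.5
print(contraharmonic)  # 89/13, about 6.846
```

Larger values of $p$ pull the result toward the larger of the two rankings, smaller values toward the smaller one. Both arguments must be positive here, since $p < 1$ puts them in a denominator.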

II. Using the Lehmer mean above, we compute the Social Distance as follows:

.

We have proven it (and several modifications of it) to be a metric suitable for many intelligent data processing tasks (classification, clustering, etc.). See details in the article.

The figure below shows examples of computing the Social Distance for different neighborhoods in 2D space for the same pair of data points $A$ and $B$: cases (a)–(c) have different configurations resulting in the same distance; case (d) has some data points at the border of the neighborhood (ties); cases (e) and (f) have symmetric neighborhoods.

 

Figure