Spatial weight matrix impact on real estate hierarchical clustering in the process of mass valuation

Keywords: agglomerative clustering, entropy, property mass appraisal, market analysis


Research background: The value of a property can be determined on an individual or a mass basis. In many situations, the uniform and relatively fast results obtained by mass valuation outweigh the advantages of the individual approach. In the literature and in practice there are many different models of mass valuation of real estate. Some of them recommend or require that the valued properties be grouped into homogeneous subsets according to various criteria. One such model is the Szczecin Algorithm of Real Estate Mass Appraisal (SAREMA). When using this algorithm, the area to be valued should be divided into so-called location attractiveness zones (LAZ). Such a division can be made in many ways. Regardless of the clustering method, its result should be assessed according to how well it satisfies the adopted criterion of division quality. A better division of real estate will translate into more accurate valuation results.

Purpose of the article: The aim of the article is to present an application of a hierarchical clustering algorithm with spatial constraints to the creation of LAZ. This method requires a spatial weight matrix to be specified before the clustering process can be carried out. Because such a matrix can be specified in a number of ways, the impact of the proposed types of matrices on the clustering process will be described. A modified measure of information entropy will be used to assess the clustering results.
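To illustrate why the choice of spatial weight matrix matters, the following minimal sketch builds two common binary specifications for the same set of points: a distance-band matrix and a k-nearest-neighbours matrix. The coordinates, the threshold, the value of k and the function names are assumptions made purely for illustration; they are not the data or the matrix types used in the article.

```python
import math

def distance_band_weights(coords, threshold):
    """Binary weights: 1 if two distinct points lie within `threshold` of each other."""
    n = len(coords)
    return [[1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
             for j in range(n)] for i in range(n)]

def knn_weights(coords, k):
    """Binary weights: 1 between each point and its k nearest neighbours (symmetrised)."""
    n = len(coords)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(coords[i], coords[j]))
        for j in others[:k]:
            w[i][j] = w[j][i] = 1
    return w

# Hypothetical property locations (illustrative only).
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
w_band = distance_band_weights(coords, threshold=1.5)
w_knn = knn_weights(coords, k=1)

# The two specifications disagree on whether properties 1 and 2 are neighbours.
print(w_band[1][2], w_knn[1][2])
```

Because the two matrices encode different neighbourhood relations, a clustering procedure constrained by one may permit merges that the other forbids.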

Methods: The article uses an agglomerative clustering algorithm that takes spatial constraints into account, which is particularly important in the context of real estate valuation. The homogeneity of clusters will be determined by means of information entropy.
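The general idea of the method can be sketched as follows, under stated assumptions: clusters are merged only if they are spatially adjacent according to a binary weight matrix, and homogeneity is scored with plain Shannon entropy. The article uses a modified entropy measure whose form is not given here, so the standard formula is a stand-in. The linkage rule (closest cluster means of a single attribute), the data and all names are hypothetical.

```python
import math
from collections import Counter

def constrained_agglomerative(values, w, n_clusters):
    """Repeatedly merge the two spatially adjacent clusters with the closest
    mean value until n_clusters remain or no adjacent pair is left."""
    clusters = [[i] for i in range(len(values))]
    mean = lambda c: sum(values[i] for i in c) / len(c)
    adjacent = lambda a, b: any(w[i][j] for i in a for j in b)
    while len(clusters) > n_clusters:
        candidates = [(abs(mean(a) - mean(b)), ia, ib)
                      for ia, a in enumerate(clusters)
                      for ib, b in enumerate(clusters)
                      if ia < ib and adjacent(a, b)]
        if not candidates:
            break  # the spatial constraints forbid any further merge
        _, ia, ib = min(candidates)
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters

def shannon_entropy(labels):
    """Shannon entropy (in bits) of the label distribution; 0 means fully homogeneous."""
    counts = Counter(labels)
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Hypothetical data: six properties along a street, each adjacent to the next.
values = [10, 11, 12, 30, 31, 32]          # illustrative unit prices
classes = ["low"] * 3 + ["high"] * 3       # illustrative price classes
n = len(values)
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

clusters = constrained_agglomerative(values, w, n_clusters=2)
entropies = [shannon_entropy([classes[i] for i in c]) for c in clusters]
overall = shannon_entropy(classes)
print(clusters, entropies, overall)
```

Lower entropy within clusters than in the undivided set indicates that the division has produced more homogeneous groups.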

Findings & Value added: The main contribution of the study is an assessment of whether the type of distance matrix has a significant impact on the clustering of the properties being valued.





How to Cite
Gnat, S. (2019). Spatial weight matrix impact on real estate hierarchical clustering in the process of mass valuation. Oeconomia Copernicana, 10(1), 131-151.