A prototype machine learning residential land-use classifier using housing market dynamics


  • Shivani Raghav University of Toronto Transportation Research Institute
  • Stepan Oskin Prodigy
  • Eric J. Miller University of Toronto




Keywords: Housing market, Land use, Machine learning


There is ample evidence that interactions between land use and transportation help determine urban spatial structure. The increasing digitization of human activity produces a wealth of new data that can support longitudinal studies of changes in land-value distributions and integrated urban microsimulation models. Producing a comprehensive dataset requires merging information from various sources at the land-parcel level, enriching records with additional attributes while keeping data storage and retrieval simple and respecting spatial and temporal relationships. This paper proposes a prototype workflow that augments a historical dataset of real estate transactions with data from multiple urban sources and uses machine learning to classify the land use of each record based on housing market dynamics. The study finds that engineered parcel-level attributes capturing housing market dynamics have stronger predictive power for classifying property land use than aggregated socio-economic variables.
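As a rough illustration of what "engineered parcel-level attributes capturing housing market dynamics" might look like, the sketch below derives a few per-parcel features from a toy transaction history. The record layout, the parcel identifiers, the prices, and the three features (sale count, mean years between sales, annualized price growth) are all assumptions made for this example; they are not the paper's actual variables or data. The resulting table is the kind of input that could then be passed to a classifier.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (parcel_id, sale_date, price).
# Identifiers and values are illustrative only, not taken from the paper.
transactions = [
    ("P1", date(2010, 5, 1), 400_000),
    ("P1", date(2015, 5, 1), 600_000),
    ("P2", date(2012, 3, 15), 250_000),
    ("P2", date(2013, 3, 15), 275_000),
    ("P2", date(2014, 3, 15), 300_000),
    ("P3", date(2016, 8, 20), 900_000),
]

DAYS_PER_YEAR = 365.25


def parcel_features(records):
    """Aggregate a transaction list into one feature row per parcel."""
    by_parcel = defaultdict(list)
    for parcel_id, sale_date, price in records:
        by_parcel[parcel_id].append((sale_date, price))

    features = {}
    for parcel_id, sales in by_parcel.items():
        sales.sort()  # chronological order (tuples sort by date first)
        mean_gap_years = None
        annual_growth = None
        if len(sales) > 1:
            # Years elapsed between each pair of consecutive sales.
            gaps = [
                (later[0] - earlier[0]).days / DAYS_PER_YEAR
                for earlier, later in zip(sales, sales[1:])
            ]
            mean_gap_years = sum(gaps) / len(gaps)
            # Compound annual price growth between first and last sale.
            span_years = (sales[-1][0] - sales[0][0]).days / DAYS_PER_YEAR
            annual_growth = (sales[-1][1] / sales[0][1]) ** (1 / span_years) - 1
        features[parcel_id] = {
            "n_sales": len(sales),
            "mean_gap_years": mean_gap_years,
            "annual_growth": annual_growth,
        }
    return features


features = parcel_features(transactions)
for parcel_id, row in sorted(features.items()):
    print(parcel_id, row)
```

Parcels with a single recorded sale get `None` for the two dynamics features, so a downstream model would need an explicit strategy for those missing values.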

Author Biography

Eric J. Miller, University of Toronto

Eric J. Miller, Ph.D.
Professor, Department of Civil & Mineral Engineering
Director, University of Toronto Transportation Research Institute (UTTRI)
Research Director, Data Management Group
Research Director, Travel Modelling Group
University of Toronto






How to Cite

Raghav, S., Oskin, S., & Miller, E. (2022). A prototype machine learning residential land-use classifier using housing market dynamics. Journal of Transport and Land Use, 15(1), 355–374. https://doi.org/10.5198/jtlu.2022.1905