A prototype machine learning residential land-use classifier using housing market dynamics
Keywords: Housing market, Land use, Machine learning
There is ample evidence of the role of land use and transportation interactions in determining urban spatial structure. The increased digitization of human activity produces a wealth of new data that can support longitudinal studies of changes in land-value distributions and integrated urban microsimulation models. To produce a comprehensive dataset, information from various sources needs to be merged at the land-parcel level to enhance datasets with additional attributes, while maintaining the ease of data storage and retrieval and respecting spatial and temporal relationships. This paper proposes a prototype workflow that augments a historical dataset of real estate transactions with data from multiple urban sources and uses machine learning to classify the land use of each record based on housing market dynamics. The study finds that engineered parcel-level attributes that capture housing market dynamics have stronger predictive power for classifying property land use than aggregated socio-economic variables.
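The parcel-level merging and feature engineering described above can be illustrated with a minimal sketch. It is not the authors' workflow: the records, the field names (`n_sales`, `median_price`, `mean_years_between_sales`, `zone_median_income`), and the helper functions are all hypothetical, chosen only to show how per-parcel market attributes and a zone-level socio-economic attribute might be attached to each transaction record before a classifier is trained. The classification step itself is left out.

```python
from collections import defaultdict
from statistics import median

# Hypothetical transaction records: (parcel_id, sale_year, price).
# All values are invented for illustration only.
transactions = [
    ("P1", 2001, 200_000),
    ("P1", 2006, 260_000),
    ("P1", 2012, 390_000),
    ("P2", 2010, 900_000),
    ("P3", 2003, 150_000),
    ("P3", 2004, 165_000),
]

# Hypothetical zone-level (aggregated) attribute, joined via each parcel's zone.
parcel_zone = {"P1": "Z1", "P2": "Z1", "P3": "Z2"}
zone_median_income = {"Z1": 72_000, "Z2": 54_000}


def engineer_parcel_features(records):
    """Derive simple per-parcel market attributes from the sale history."""
    by_parcel = defaultdict(list)
    for parcel_id, year, price in records:
        by_parcel[parcel_id].append((year, price))

    features = {}
    for parcel_id, sales in by_parcel.items():
        sales.sort()  # chronological order, respecting the temporal relationship
        years = [y for y, _ in sales]
        prices = [p for _, p in sales]
        gaps = [later - earlier for earlier, later in zip(years, years[1:])]
        features[parcel_id] = {
            "n_sales": len(sales),
            "median_price": median(prices),
            # Undefined for parcels that sold only once.
            "mean_years_between_sales": sum(gaps) / len(gaps) if gaps else None,
        }
    return features


def augment(records, features, parcel_to_zone, zone_attr):
    """Attach parcel-level and zone-level attributes to every transaction."""
    rows = []
    for parcel_id, year, price in records:
        row = {"parcel_id": parcel_id, "year": year, "price": price}
        row.update(features[parcel_id])
        row["zone_median_income"] = zone_attr[parcel_to_zone[parcel_id]]
        rows.append(row)
    return rows


features = engineer_parcel_features(transactions)
rows = augment(transactions, features, parcel_zone, zone_median_income)
```

Each element of `rows` is one augmented transaction record that carries both kinds of predictor, so the two groups of variables could be compared on the same records in a downstream classifier.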
Copyright (c) 2022 Shivani Raghav, Stepan Oskin, Eric J. Miller
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish with JTLU agree to the following terms: 1) Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution-Noncommercial License 4.0 that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal. 2) Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal. 3) Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.