Thirty years of Australian housing: a PCA-ARIMAX econometric study

Tags: Research, ARIMAX, PCA, R

Author: Yiran.Y

Published: April 7, 2026

Property Market · Forecast Modelling · Research

Australian house prices have risen strongly since the mid-1990s, but that growth has been anything but uniform. Melbourne consistently amplifies national movements. Sydney tracks the national market closely. Regional areas tend to dampen national movements. Perth swings widely depending on where the mining cycle sits.

The question this paper sets out to answer is: how much of the difference in regional house price growth reflects persistent structural trends, and how much is just cyclical noise?

The approach

The paper builds a three-factor model applied to regional repeat-sales log price indexes spanning 1995 to 2024, covering roughly 3 million house transactions across Australia.

The three factors are:

  • Market: The national price trend common to all regions, proxied by the national house price index. Each region’s sensitivity to this factor is captured by a loading coefficient βr. A region with βr > 1 amplifies national movements; βr < 1 dampens them.

  • Mining: A mean-reverting spread constructed from the Perth–Sydney price differential, capturing the resource-sector cycle. Perth loads positively; Sydney negatively. This goes a long way toward explaining why the two cities behave so differently over time.

  • Lifestyle: A spread capturing amenity-driven demand in coastal and regional areas. It reflects internal migration toward sea-change and tree-change destinations, which intensified sharply during and after COVID-19.

These three factors are grounded in PCA evidence from Sijp et al. (2025), with the leading principal components mapping almost perfectly to the Market, Mining, and Lifestyle interpretations. The model is estimated via ARIMAX, which accounts for the autocorrelated structure of housing price residuals.
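
Written out, the three-factor structure described above might look like the following. This is a sketch in assumed notation: only the Market loading βr appears in the text, while the intercept, the Mining and Lifestyle loadings, and the factor symbols are introduced here for illustration and may differ from the paper's own.

```latex
p_{r,t} = \alpha_r + \beta_r M_t + \gamma_r N_t + \delta_r L_t + u_{r,t},
\qquad u_{r,t} \sim \mathrm{ARMA}
```

Here p_{r,t} is the log price index of region r at time t, M_t is the Market factor, N_t the Mining spread, L_t the Lifestyle spread, and u_{r,t} the region-specific residual. In this notation, the statement that Perth loads positively and Sydney negatively on Mining corresponds to γ > 0 for Perth and γ < 0 for Sydney.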

The methodology

  • Approximating the factors: Rather than using PCA components directly, portable factor proxies were constructed from standard city-level price indexes: the national index for Market, the Perth–Sydney spread for Mining, and a weighted spread of high-Lifestyle versus low-Lifestyle regions. These proxies were validated against their PCA counterparts, with correlations of 1.00, 0.98, and 0.98 respectively.

  • Building the ARIMAX: Each regional price index was modelled as a linear combination of the three factors plus a city-specific residual. Because the residuals are autocorrelated, ARIMAX was used rather than OLS, allowing the idiosyncratic component to follow an ARMA process. The Market factor absorbs the unit root in the system, while Mining and Lifestyle enter as stationary spreads. Model orders were selected by AICc separately for each region, with a seasonal MA(1) component included at a 12-month lag.

  • Stability testing and uncertainty quantification: Factor loadings were tested for temporal stability using expanding estimation windows from 1995 to 2024, stepping forward in three-month increments. Forecast uncertainty was decomposed across three sources: Mining exposure, Lifestyle exposure, and the idiosyncratic city-specific remainder, allowing the source of regional risk to be attributed and compared across cities and time horizons.
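
The proxy construction and the expanding-window stability check can be illustrated with a small, self-contained Python sketch. It is not the paper's code. All data are synthetic, the city series and the `ols` helper are hypothetical, the Lifestyle factor is omitted for brevity, and plain least squares stands in for the ARIMAX estimation, so it shows the mechanics only.

```python
import random

def ols(y, X):
    """Least-squares coefficients for y = X b, solved via the normal equations."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef

random.seed(0)
T = 360  # 30 years of monthly observations (synthetic)

# Synthetic Market factor (random walk with drift) and mining cycle (AR(1)).
national, cycle, noise = [0.0], [0.0], [0.0]
for _ in range(T - 1):
    national.append(national[-1] + 0.004 + random.gauss(0, 0.01))
    cycle.append(0.97 * cycle[-1] + random.gauss(0, 0.02))
    noise.append(0.5 * noise[-1] + random.gauss(0, 0.005))  # autocorrelated residual

# Hypothetical city log indexes with opposite exposure to the cycle.
sydney = [m - 0.5 * c for m, c in zip(national, cycle)]
perth = [m + 0.5 * c for m, c in zip(national, cycle)]

# Mining proxy: the Perth-Sydney spread of log indexes.
mining = [p - s for p, s in zip(perth, sydney)]

# A synthetic region that amplifies the national trend (assumed loadings).
TRUE_BETA, TRUE_GAMMA = 1.2, 0.1
region = [TRUE_BETA * m + TRUE_GAMMA * g + e
          for m, g, e in zip(national, mining, noise)]

# Expanding estimation windows, stepping forward three months at a time.
MIN_WINDOW, STEP = 120, 3
betas = []
for end in range(MIN_WINDOW, T + 1, STEP):
    X = [[1.0, national[t], mining[t]] for t in range(end)]
    coef = ols(region[:end], X)  # [intercept, Market loading, Mining loading]
    betas.append(coef[1])

print(f"windows: {len(betas)}, first beta: {betas[0]:.3f}, last beta: {betas[-1]:.3f}")
```

If the Market loading is stable, the sequence in `betas` should stay close to a constant as the window grows. In a real application the least-squares step would be replaced by an ARIMAX fit so that the autocorrelated residual is modelled rather than ignored.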

Key findings

The Market loadings are stable across major economic shocks including the mining boom, the Global Financial Crisis, and COVID-19. This stability is the central finding: it means the cross-sectional ranking of cities by sensitivity to national growth is persistent and informative for scenario planning.

Among the major cities, Melbourne amplifies national growth the most, followed by Brisbane, while Sydney tracks the national market closely. Regional areas dampen national growth. Perth is the most volatile, driven by its large positive exposure to the Mining factor.

My contribution

This paper is the work of Dr Willem Sijp at Neoval Pty Ltd and the University of Technology Sydney. I contributed as a research assistant in the following ways:

  • Data and preparation: Processed and prepared the underlying house price dataset used throughout the analysis.

  • Factor construction: Participated in constructing the three factor proxies, working through the linear combination approach and validating the proxies against their PCA counterparts.

  • ARIMAX model building: Assisted in identifying model orders and running diagnostic tests in parallel with Willem, including Ljung-Box tests and AICc-based selection across regions.

  • Manuscript and discussions: Edited and refined the manuscript, and contributed to ongoing discussions on methodology and interpretation throughout the project.

  • Stationarity analysis: Wrote a full technical analysis of the Mining factor’s stationarity. Using structural break detection, the apparent unit root was shown to be a consequence of regime shifts around the mining boom rather than genuine non-stationarity, supporting the ARIMAX specification.
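
The intuition behind the stationarity point can be shown with a toy Python simulation. It is not the analysis from the project: the series is synthetic, the break date is assumed known rather than detected, and a simple lag-1 autoregressive coefficient stands in for a formal unit-root test. The sketch shows how a single shift in the mean can make a stationary series look almost like a unit-root process.

```python
import random

def ar1_coef(x):
    """Slope from regressing x[t] on x[t-1] with an intercept."""
    prev, curr = x[:-1], x[1:]
    mp, mc = sum(prev) / len(prev), sum(curr) / len(curr)
    cov = sum((a - mp) * (b - mc) for a, b in zip(prev, curr))
    var = sum((a - mp) ** 2 for a in prev)
    return cov / var

def demean(segment):
    """Subtract the segment's own mean."""
    m = sum(segment) / len(segment)
    return [v - m for v in segment]

random.seed(1)
T, BREAK, RHO, SHIFT = 360, 120, 0.8, 10.0  # all values are illustrative

# Stationary AR(1) fluctuations.
u = [0.0]
for _ in range(T - 1):
    u.append(RHO * u[-1] + random.gauss(0, 1))

# The same fluctuations around a mean that shifts once, at BREAK.
spread = [ui + (SHIFT if t >= BREAK else 0.0) for t, ui in enumerate(u)]

# Ignoring the regime shift: persistence looks close to 1.
rho_naive = ar1_coef(spread)

# Removing each regime's mean first: persistence falls back toward RHO.
adjusted = demean(spread[:BREAK]) + demean(spread[BREAK:])
rho_adjusted = ar1_coef(adjusted)

print(f"naive AR(1) coefficient:    {rho_naive:.3f}")
print(f"adjusted AR(1) coefficient: {rho_adjusted:.3f}")
```

The naive estimate is pulled toward 1 because the level shift dominates the variance of the series, which is the same mechanism that can lead unit-root tests to mistake a regime shift for non-stationarity.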