Real estate accounts for 61% of France's national net wealth, and housing is the largest item of expenditure for French households. Indices that track the evolution of real estate prices are thus crucial instruments for decision makers of all kinds: households, investors, the scientific community, local governments, etc. Yet the available public statistics fail to capture the heterogeneity of housing price dynamics across the country.

In France, the Notaire-Insee indices are considered the reference, especially because their methodology and indices are open source. Every quarter, the institute produces indices for apartments and houses in large agglomerations. With only 9 indices for house prices in France, the division proposed by this methodology hides many disparities. For instance, the “Province” house index, which represents 36% of the French housing stock, covers more than 25,000 cities as diverse as Toulouse (450k inhabitants) and Malroy (350 inhabitants). This indicator cannot reveal the differences in dynamics between cities that are geographically distinct and driven by different fundamentals because of different economic conditions.

This work aims to produce a library of open-data real estate price indices that track price evolution at a fine geographical scale. To do so, we develop a methodology for computing real estate price indices and then apply it to geographical clusters that approximate local markets.

This work follows an open-source approach: the methodology will be published, and all the indices will be made freely available to all.

The proposed method is applied to DV3F, the fiscal database of real estate transactions, which contains all transactions in France (except Alsace, Moselle and Mayotte) between 2010 and 2020.

Our approach is based on classic hedonic price index methods. Each aspect and hypothesis of the hedonic method has been justified in order to produce precise indices.
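As an illustration of what a classic hedonic index can look like, the sketch below implements one common variant, the time-dummy hedonic regression: log prices are regressed on property characteristics plus one dummy per period, and the index is read off the exponentiated dummy coefficients. This abstract does not state which hedonic variant or which characteristics are used, so the function name `time_dummy_hedonic_index`, the single log-area feature, and the data are all assumptions made for the example.

```python
import numpy as np

def time_dummy_hedonic_index(log_price, period, features):
    """Fit log(price) = a + features @ b + sum_t d_t * 1[period == t] by OLS.

    Returns {period: index}, with the first period as base 100.
    Hypothetical helper for illustration, not the paper's specification.
    """
    log_price = np.asarray(log_price, dtype=float)
    period = np.asarray(period)
    features = np.asarray(features, dtype=float).reshape(len(log_price), -1)

    periods = np.unique(period)  # sorted period labels
    # One dummy column per period except the first, which is the base period.
    dummies = (period[:, None] == periods[None, 1:]).astype(float)
    X = np.column_stack([np.ones(len(log_price)), features, dummies])

    coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
    n_dummies = dummies.shape[1]
    delta = coef[len(coef) - n_dummies:]  # time-dummy coefficients
    index = 100.0 * np.exp(np.concatenate([[0.0], delta]))
    return dict(zip(periods.tolist(), index.tolist()))

# Synthetic, noise-free data (made up): log prices rise by 0.05 per year.
area = np.array([30, 50, 80, 40, 60, 100, 35, 70, 90], dtype=float)
period = np.array([2018] * 3 + [2019] * 3 + [2020] * 3)
true_delta = {2018: 0.0, 2019: 0.05, 2020: 0.10}
log_price = 8.0 + 0.9 * np.log(area) + np.array([true_delta[int(p)] for p in period])

index = time_dummy_hedonic_index(log_price, period, np.log(area))
```

Because the synthetic data contain no noise, the fitted index recovers the assumed growth path; with real transactions, the choice of characteristics and functional form drives the precision of the result.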

Producing indices close to local markets requires working in a low-data environment, which increases the probability of encountering outliers. Because hedonic methods are very sensitive to outliers, we tackle this issue by testing the impact of different dynamic filtering methods.
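To make the outlier problem concrete, here is a minimal sketch of one generic filter that remains usable on small samples: a modified z-score based on the median and the median absolute deviation (MAD). The filters actually compared in this work are not named in the abstract, so this technique, the helper name `mad_filter`, the 3.5 threshold, and the prices are assumptions chosen for illustration.

```python
import math
from statistics import median

def mad_filter(values, threshold=3.5):
    """Return one keep/drop flag per value, based on the modified z-score.

    Hypothetical helper: the median and MAD are used instead of the mean and
    standard deviation because they are not distorted by the outliers themselves.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [True] * len(values)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are normally distributed.
    return [abs(0.6745 * (v - med) / mad) <= threshold for v in values]

# Made-up prices per square metre for a small cluster; the last one is aberrant.
price_per_m2 = [2000, 2100, 1950, 2050, 2200, 1900, 20000]
log_values = [math.log(p) for p in price_per_m2]
keep = mad_filter(log_values)
filtered = [p for p, k in zip(price_per_m2, keep) if k]
```

Working on log values keeps the filter symmetric between abnormally high and abnormally low prices, which suits a regression fitted on log prices.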

To reduce heteroscedasticity and improve the precision of the model, different forms and combinations of the regression have been tested.

This method is applied to two divisions of France: one for apartments, another for houses. To produce indices close to local markets, a clustering of French cities is computed as finely as possible, based on socioeconomic and local housing stock criteria. To preserve the quality of the indices, all clusters satisfy minimum transaction volume constraints. The division is obtained by clustering urban areas with Ascending Hierarchical Classification and Kohonen algorithms.
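The interplay between fine clusters and a minimum transaction volume can be sketched with a simplified bottom-up procedure: every city starts as its own cluster, and any cluster below the threshold is folded into its nearest neighbour in feature space. This is a deliberately reduced stand-in, not the Ascending Hierarchical Classification and Kohonen procedure used in this work; the function name `merge_until_min_volume`, the two-dimensional features, the volumes, and the threshold are all hypothetical.

```python
import math

def merge_until_min_volume(features, volumes, min_volume):
    """Greedy bottom-up merge under a minimum-volume constraint.

    features: one feature vector per city (e.g. standardized indicators).
    volumes: number of transactions per city.
    Returns a list of clusters, each a sorted list of city indices.
    """
    clusters = [
        {"members": [i], "centroid": list(f), "volume": v}
        for i, (f, v) in enumerate(zip(features, volumes))
    ]
    while len(clusters) > 1:
        small = [c for c in clusters if c["volume"] < min_volume]
        if not small:
            break  # every cluster satisfies the constraint
        # Handle the cluster furthest below the threshold first.
        a = min(small, key=lambda c: c["volume"])
        others = [c for c in clusters if c is not a]
        b = min(others, key=lambda c: math.dist(a["centroid"], c["centroid"]))
        total = a["volume"] + b["volume"]
        w = a["volume"] / total if total else 0.5
        # Volume-weighted centroid of the merged cluster.
        b["centroid"] = [
            w * x + (1 - w) * y for x, y in zip(a["centroid"], b["centroid"])
        ]
        b["members"] += a["members"]
        b["volume"] = total
        clusters = others
    return [sorted(c["members"]) for c in clusters]

# Made-up example: five cities, two indicators each, threshold of 100 transactions.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.2, 0.1)]
volumes = [300, 50, 400, 80, 20]
result = merge_until_min_volume(features, volumes, min_volume=100)
```

In this toy case the three small cities are absorbed by their nearest large neighbours, leaving two clusters that each exceed the threshold. A real partition would also need to check that merged cities form a plausible local market, which this sketch ignores.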

This clustering resulted in the computation of 350 apartment indices and 400 house indices. Applying our approach to these geographical clusters reveals a great diversity of house price dynamics. For instance, the area covered by the “Province” index produced by Notaire-Insee is divided into 220 clusters, whose index variations between 2015 and 2020 range from 2% at the first decile to 29% at the ninth decile.

By highlighting the plurality of real estate price dynamics across France and its urban centers, our approach emphasizes that indices must be computed on a local scale to be useful.