With the rapid growth of news, information and opinionated data available in digital form, accompanied by swift progress in textual analysis techniques, sentiment analysis has become a hotspot in natural language processing. In addition, researchers can now draw on increased computational power to study textual documents.
These developments have allowed real estate researchers to advance beyond traditional sentiment measures such as closed-end fund discounts and survey-based measures (see e.g., Lin et al. (2009) as well as Jin et al. (2014)) and have facilitated the development of new sentiment proxies. As an example, Google search volume data has been used successfully to forecast commercial real estate market developments (Dietzel et al. (2014)) and to predict market volatility (Braun (2016)) as well as housing market turning points (Dietzel (2016)). Using sentiment dictionaries and content-analysis software, Walker (2014) examined the relationship between media coverage and the boom in the UK housing market. In a similar fashion, Soo (2015) showed that local housing media sentiment can predict future house prices in US cities.

However, in contrast to related research in finance, sentiment analysis in real estate still lags behind. The real estate literature has so far not applied more advanced machine learning techniques, such as supervised learning algorithms, to extract sentiment from news items. Using a dataset of about 54,000 headlines from the S&P Global Market Intelligence database collected over a 12-year timespan (01/2005 – 12/2016), this paper examines the relationship between media sentiment and movements of both direct and indirect commercial real estate markets in the United States. It thereby aims to explore the performance and potential of a support vector machine as a classification algorithm (see Cortes and Vapnik (1995)). By mapping headlines into a high-dimensional feature space, we can identify the polarity of individual news items and aggregate the results into three different sentiment measures. Controlling for other influencing factors and sentiment indices, we show that these 'tone' measures indeed have the potential to explain real estate market movements over time.
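
The classification-and-aggregation idea described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: it assumes a bag-of-words representation, a linear soft-margin support vector machine without a bias term trained by a simple subgradient method on the hinge loss, and a single hypothetical monthly "net tone" ratio standing in for the sentiment measures. The toy headlines, the function names and all parameter values are invented for illustration.

```python
from collections import defaultdict

# Toy labelled headlines (invented for illustration): +1 positive, -1 negative.
TRAIN = [
    ("office rents rise on strong demand", 1),
    ("reit profits surge to record growth", 1),
    ("mall vacancies climb amid weak sales", -1),
    ("developer defaults deepen loss fears", -1),
]

def featurize(headline):
    """Bag-of-words counts: one feature dimension per distinct token."""
    counts = defaultdict(float)
    for token in headline.lower().split():
        counts[token] += 1.0
    return counts

def score(weights, features):
    """Signed distance proxy w . x; tokens unseen in training contribute 0."""
    return sum(weights.get(tok, 0.0) * val for tok, val in features.items())

def train_linear_svm(data, epochs=50, eta=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss (assumed settings)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for headline, label in data:
            x = featurize(headline)
            margin = label * score(weights, x)
            # Regularisation step: shrink every weight slightly.
            for tok in weights:
                weights[tok] *= 1.0 - eta * lam
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                for tok, val in x.items():
                    weights[tok] += eta * label * val
    return weights

def predict(weights, headline):
    """Polarity of one headline: 1, -1, or 0 if the score is exactly zero."""
    s = score(weights, featurize(headline))
    return 1 if s > 0 else (-1 if s < 0 else 0)

def monthly_net_tone(weights, dated_headlines):
    """Hypothetical measure: (positive - negative) / (positive + negative)."""
    tally = defaultdict(lambda: [0, 0])
    for month, headline in dated_headlines:
        label = predict(weights, headline)
        if label > 0:
            tally[month][0] += 1
        elif label < 0:
            tally[month][1] += 1
    return {m: (p - n) / (p + n) for m, (p, n) in tally.items() if p + n > 0}

weights = train_linear_svm(TRAIN)

# Unlabelled headlines tagged with an invented publication month.
NEWS = [
    ("2005-01", "strong demand lifts office rents"),
    ("2005-01", "record growth for reit profits"),
    ("2005-02", "weak sales deepen loss fears"),
    ("2005-02", "office demand strong"),
]
tone = monthly_net_tone(weights, NEWS)
print(tone)
```

In practice one would more likely rely on an established library implementation and a much richer feature representation; the sketch only shows how headline-level polarity can be rolled up into a time series that enters a regression alongside other controls.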

To our knowledge, this paper is the first to explicitly explore the potential of a support vector machine for extracting media sentiment, not only for the United States but also for real estate markets in general.