Overview
Central bank communication is one of the richest, and most underexploited, sources of information about the future path of monetary policy. When the Bank of Korea (BOK) publishes a Monetary Policy Committee statement, the specific words chosen signal hawkish or dovish intent that markets price in before any rate decision takes effect. This talk introduces two automated methods for extracting that signal from Korean-language text, and presents the eKoNLPy library built to support this analysis.
The NLP Challenge
Korean financial text poses specific difficulties for standard NLP pipelines. General-purpose Korean morphological analysers do not recognise domain-specific monetary policy vocabulary, producing incorrect segmentations that corrupt downstream sentiment classification. The first contribution of this work is a custom NLP library, eKoNLPy, which extends existing Korean NLP tools with an economics-specific dictionary, enabling accurate tokenisation of central bank communications.
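To make the segmentation problem concrete, here is a minimal sketch of dictionary-augmented tokenisation. It is not eKoNLPy's implementation or API. The `DOMAIN_TERMS` set and the `tokenize` function are hypothetical names chosen for illustration, and the character-by-character fallback merely stands in for whatever a general-purpose analyser would produce on out-of-vocabulary text.

```python
# Hypothetical domain dictionary: monetary policy terms that should survive
# segmentation as single tokens (base rate, monetary policy, price stability).
DOMAIN_TERMS = {"기준금리", "통화정책", "물가안정"}


def tokenize(text, domain_terms=DOMAIN_TERMS):
    """Greedy longest-match segmentation against a domain dictionary.

    Characters not covered by any dictionary entry are emitted one at a
    time; a real pipeline would hand them to a morphological analyser.
    """
    max_len = max((len(term) for term in domain_terms), default=1)
    tokens = []
    for chunk in text.split():
        i = 0
        while i < len(chunk):
            # Try the longest candidate first, shrinking down to one character.
            for size in range(min(max_len, len(chunk) - i), 0, -1):
                piece = chunk[i:i + size]
                if size == 1 or piece in domain_terms:
                    tokens.append(piece)
                    i += size
                    break
    return tokens


print(tokenize("기준금리 인상"))
```

With the domain dictionary in place, "기준금리" (base rate) is kept whole instead of being split into unrelated pieces, which is the property the downstream sentiment step depends on.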
Two Automated Dictionary Construction Methods
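The core methodological contribution is a pair of automated approaches to building hawkish/dovish lexicons from a large corpus of BOK documents. Each starts from a small set of terms with known polarity rather than a fully hand-labelled vocabulary: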
- Machine Learning Classification: Train a classifier on seed terms with known polarity, then propagate labels to semantically similar n-grams.
- Word Embeddings: Use distributional semantics to identify terms that cluster with known hawkish or dovish anchors in the embedding space.
Neither method requires manual annotation of the full vocabulary, a significant advantage when domain experts are scarce.
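The embedding-based idea can be sketched as follows, under loud assumptions: the two-dimensional vectors are invented toy values (real ones would come from an embedding model trained on the BOK corpus), the English tokens stand in for Korean n-grams, and the scoring rule (mean cosine similarity to hawkish seeds minus mean similarity to dovish seeds, with an arbitrary threshold) is one plausible choice rather than the method used in the talk.

```python
import math

# Toy vectors invented for illustration only.
VECTORS = {
    "tightening": (0.9, 0.1),
    "easing": (0.1, 0.9),
    "overheating": (0.8, 0.3),
    "slowdown": (0.2, 0.8),
}
HAWKISH_SEEDS = ["tightening"]  # hypothetical anchor terms
DOVISH_SEEDS = ["easing"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def polarity(term):
    """Mean similarity to hawkish seeds minus mean similarity to dovish seeds."""
    vec = VECTORS[term]
    hawk = sum(cosine(vec, VECTORS[s]) for s in HAWKISH_SEEDS) / len(HAWKISH_SEEDS)
    dove = sum(cosine(vec, VECTORS[s]) for s in DOVISH_SEEDS) / len(DOVISH_SEEDS)
    return hawk - dove


def build_lexicon(threshold=0.1):
    """Label every non-seed term whose polarity score clears the threshold."""
    seeds = set(HAWKISH_SEEDS) | set(DOVISH_SEEDS)
    lexicon = {}
    for term in VECTORS:
        if term in seeds:
            continue
        score = polarity(term)
        if score > threshold:
            lexicon[term] = "hawkish"
        elif score < -threshold:
            lexicon[term] = "dovish"
    return lexicon


print(build_lexicon())
```

Terms whose score falls inside the threshold band are left unlabelled, which keeps ambiguous vocabulary out of the lexicon at the cost of lower coverage.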
Results
Sentiment indicators built from these lexicons outperform indicators derived from English-translated BOK texts, media-based economic policy uncertainty measures, and standard macroeconomic uncertainty indices when predicting current and future monetary policy decisions within an augmented Taylor rule framework. The text-mining measure of monetary policy surprise, computed as the change in sentiment around policy announcements, better explains movements in long-term interest rates than the actual base rate change does, while the base rate change remains more closely tied to short-term rate fluctuations. This pattern suggests that the textual measure captures forward guidance and market expectations that rate changes alone do not convey.
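As a rough illustration of how a lexicon turns into a sentiment indicator and a surprise measure, consider the sketch below. The net-tone formula, (hawkish - dovish) / (hawkish + dovish), is an assumed convention rather than one stated in the talk, and the `LEXICON` entries, token lists, and function names are all hypothetical.

```python
# Hypothetical lexicon; in practice it would be produced by one of the
# dictionary construction methods described earlier.
LEXICON = {
    "tightening": "hawkish",
    "overheating": "hawkish",
    "easing": "dovish",
    "slowdown": "dovish",
}


def tone(tokens, lexicon=LEXICON):
    """Net tone in [-1, 1]: +1 is fully hawkish, -1 is fully dovish."""
    hawk = sum(1 for t in tokens if lexicon.get(t) == "hawkish")
    dove = sum(1 for t in tokens if lexicon.get(t) == "dovish")
    total = hawk + dove
    # A document with no lexicon hits is treated as neutral.
    return 0.0 if total == 0 else (hawk - dove) / total


def surprise(before_tokens, after_tokens):
    """Change in tone from the text before an announcement to the text after."""
    return tone(after_tokens) - tone(before_tokens)


before = ["slowdown", "easing", "overheating"]       # tone = -1/3
after = ["tightening", "overheating", "slowdown"]    # tone = +1/3
print(surprise(before, after))
```

A positive value indicates that the language turned more hawkish around the announcement, and a negative value that it turned more dovish.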
The 103-slide presentation walks through the full pipeline from corpus construction to asset price impact analysis, providing a replicable blueprint for applying similar methods to other central banks’ communications.