Machine Language Acquisition. Reinforcement Learning of Minimalist Grammars (MinGLear)

Maschineller Spracherwerb. Verstärkungslernen minimalistischer Grammatiken

Abstract

Motivation and Background

We are developing a reinforcement learning algorithm for the numeral words of natural languages. Numerals show an obvious connection between syntax and semantics, e.g. "two hundred and forty-five" means 2 * 100 + 40 + 5. Our algorithm is intended not only to learn numeral words but also to predict their integer meaning.

State-of-the-art algorithms approach this by filling digit slots, i.e. they learn to decide that 'two' fills the hundreds slot, 'four' the tens slot, and so on. This requires a high degree of language-specific supervision, especially for languages whose number systems are not decimal. Instead, we use more general constraints on numeral word structure that are derived from Hurford's theory of the Packing Strategy [Hur06].
Our algorithm forms new words by exchanging subnumerals, and their meaning is predicted by linear regression, see figure.
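The regression step can be illustrated with a minimal sketch. It assumes that words of the shape "X hundred and Y" have already been split into the integer meanings of their two subnumerals, and it fits meaning = a*X + b*Y + c by ordinary least squares. The function names `fit_linear` and `predict` and the training triples are hypothetical and chosen for illustration only; this is not the project's implementation.

```python
from fractions import Fraction

def fit_linear(samples):
    """Least-squares fit of meaning = a*x + b*y + c from (x, y, meaning) triples."""
    rows = [(Fraction(x), Fraction(y), Fraction(1)) for x, y, _ in samples]
    targets = [Fraction(m) for _, _, m in samples]
    n = 3
    # Normal equations (A^T A) w = A^T t, stored as an augmented 3x4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("samples do not determine a unique fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return tuple(aug[i][n] for i in range(n))

def predict(weights, x, y):
    """Predicted integer meaning for subnumeral meanings x and y."""
    a, b, c = weights
    return a * x + b * y + c

# Hypothetical training words of the shape "X hundred and Y",
# given as (meaning of X, meaning of Y, meaning of the whole word).
samples = [(2, 45, 245), (3, 12, 312), (7, 1, 701), (5, 99, 599)]
weights = fit_linear(samples)
print([int(w) for w in weights])      # [100, 1, 0]
print(int(predict(weights, 4, 20)))   # 420, i.e. "four hundred and twenty"
```

Exact fractions are used here so that the recovered coefficients come out as clean integers; a floating-point least-squares solver would serve equally well for larger data sets.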

The exchange of subnumerals works by generalizing expressions like "two hundred and forty-five" into templates that yield a set of output words: (X,Y) |--> "X hundred and Y".
To decide which subnumerals are generalized and which are not, we use general size and divisibility arguments. We represent the numeral systems by a minimalist grammar (MG).
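A minimal sketch of the generalization step follows. The size and divisibility criteria are not spelled out here, so the sketch substitutes a deliberately simple stand-in rule: the known subnumeral with the largest value stays fixed and every other known subnumeral becomes a slot. The functions `generalize` and `instantiate` and the small lexicon are hypothetical.

```python
def generalize(numeral, lexicon):
    """Turn a numeral word into a template plus the meanings bound to its slots.

    Stand-in rule (an assumption, not the project's criterion): the known
    subnumeral with the largest value is kept, all others become slots.
    """
    tokens = numeral.split()
    known = [t for t in tokens if t in lexicon]
    if not known:
        return numeral, {}
    base = max(known, key=lexicon.get)
    slot_names = iter("XYZUVW")  # supports at most six slots in this sketch
    template, bindings = [], {}
    for token in tokens:
        if token in lexicon and token != base:
            name = next(slot_names)
            bindings[name] = lexicon[token]
            template.append(name)
        else:
            template.append(token)
    return " ".join(template), bindings

def instantiate(template, fillers):
    """Form a new word by exchanging the slots for other subnumerals."""
    return " ".join(fillers.get(t, t) for t in template.split())

# Hypothetical lexicon of already learned subnumerals and their meanings.
lexicon = {"two": 2, "hundred": 100, "forty-five": 45}
template, bindings = generalize("two hundred and forty-five", lexicon)
print(template)   # X hundred and Y
print(bindings)   # {'X': 2, 'Y': 45}
print(instantiate(template, {"X": "three", "Y": "twelve"}))  # three hundred and twelve
```

The slot bindings returned by `generalize` are exactly the kind of (X, Y) pairs on which a regression over integer meanings can then be trained.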

Moreover, we want to investigate the possibility of using search engines as supervisors: searching for the incorrect word “fiveteen” returns significantly fewer results than searching for a correct word such as “sixteen”. Based on that, we attempt to find a model that decides whether or not a word is correct.
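One conceivable decision model is sketched below, purely as an assumption: a candidate word is accepted if its hit count is not more than a fixed number of orders of magnitude below the median hit count of words already known to be correct. The function `is_plausible`, the threshold `max_log_gap`, and the hit counts are hypothetical; the counts are invented toy values, not measurements, and no real search engine API is called.

```python
import math

def is_plausible(word, hit_count, reference_words, max_log_gap=2.0):
    """Accept word if its log10 hit count is at most max_log_gap below
    the median log10 hit count of the reference words."""
    ref_logs = sorted(math.log10(hit_count(w) + 1) for w in reference_words)
    median = ref_logs[len(ref_logs) // 2]
    return median - math.log10(hit_count(word) + 1) <= max_log_gap

# Invented toy hit counts standing in for real search engine results.
toy_counts = {
    "fifteen": 1_000_000,
    "sixteen": 900_000,
    "seventeen": 800_000,
    "fiveteen": 1_000,
}

def toy_hit_count(word):
    """Hypothetical stand-in for a search engine query; unknown words get 0."""
    return toy_counts.get(word, 0)

reference = ["fifteen", "sixteen", "seventeen"]
print(is_plausible("sixteen", toy_hit_count, reference))   # True
print(is_plausible("fiveteen", toy_hit_count, reference))  # False
```

Passing the hit-count lookup in as a function keeps the decision rule independent of any particular search engine, so the same rule can be evaluated against cached counts.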

Scientific Hypotheses
  • Natural numeral grammars can be efficiently represented as Minimalist Grammars
  • Those Minimalist Grammars can be efficiently generated by a learning algorithm that
    • reinforces itself by linear regression and
    • is supervised by large search engines
References
[Hur06] J. R. Hurford. "A performed practice explains a linguistic universal: Counting gives the Packing Strategy". Elsevier. May 2, 2006
[Zabb05] Y. Zabbal. "The Syntax of Numeral Expressions". Second Generals Paper. University of Massachusetts - Department of Linguistics. Amherst, MA 01003. May 19, 2005
[IM06] T. Ionin and O. Matushansky. "The Composition of Complex Cardinals". Journal of Semantics 23: 315–360. November 16, 2006
[FW00] G. Flach and M. Wolff. "Automatische Generierung von Zahlwortgrammatiken". Technische Universität Dresden, Fakultät Elektrotechnik, Institut für Akustik und Sprachkommunikation. October 5, 2000
[Ham04] H. Hammarström. "Deduction of Numeral Grammars". Proceedings of the Ninth ESSLLI Student Session. 2004
[bG+19RL] P. beim Graben et al. "Reinforcement Learning of Minimalist Numeral Grammars". Brandenburgische Technische Universität Cottbus – Senftenberg, Institute of Electronics and Information Technology, Department of Communications Engineering. June 11, 2019
[Flac00] G. Flach, M. Holzapfel, C. Just, A. Wachtler and M. Wolff. "Automatic learning of numeral grammars for multi-lingual speech synthesizers". 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), vol. 3, pp. 1291–1294. 2000. doi: 10.1109/ICASSP.2000.861814
[bG+19UMT] P. beim Graben et al. "Bidirektionale Utterance–Meaning–Transducer für Zahlworte durch kompositionale minimalistische Grammatiken". Tagungsband der 30. Konferenz Elektronische Sprachsignalverarbeitung (ESSV). Volume: 91. January 2019
[Men18] J. A. Mendia. "Epistemic numbers". Proceedings of SALT 28: 493–511. March 2018
[And15] C. Anderson. "Numerical Approximation Using Some". Proceedings of Sinn und Bedeutung 19. July 2015
Project Data
Period: 11/2020–10/2022
Funding: DFG Sachbeihilfe (research grant)
- Grant #: WO 819/3-1
- Total: 182.7 TEUR

Mission:

To devise a symbolic machine learning algorithm for minimalist grammars

Researchers: