Approaches to improving reinforcement learning based on intrinsic motivation
Authors: Balitskaya A.V.
Published in issue: #6(47)/2020
DOI: 10.18698/2541-8009-2020-6-620
Category: Informatics, Computer Engineering and Control | Chapter: System Analysis, Control, and Information Processing, Statistics
Keywords: reinforcement learning, multi-agent learning, intrinsic motivation algorithms, deep learning, neural networks, agents, behavioral psychology, StarCraft, SMAC
Published: 11.07.2020
Today, reinforcement learning is one of the most promising areas of machine learning. However, it faces a number of problems, such as abstracting over actions or exploring environments with sparse rewards, which can be addressed with the help of intrinsic motivation. Intrinsic motivation encourages an agent to engage in exploration, play, and other curiosity-driven activities in the absence of external rewards. The ability to learn effectively on its own is one of the hallmarks of intelligence and allows an agent to operate successfully over long periods in dynamic, complex environments about which little prior knowledge is available. The article reviews the role of intrinsic motivation and describes approaches to improving agent training based on it.
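As an illustration of this idea (a minimal sketch, not taken from the article itself), the Python fragment below shows one common way to inject intrinsic motivation into reinforcement learning: a count-based novelty bonus added to the sparse environment reward. The class and function names (CountBasedCuriosity, shaped_reward) are hypothetical and only stand in for whatever exploration bonus a particular method defines.

# Minimal sketch: a count-based intrinsic reward added to the extrinsic reward.
# States visited rarely yield a larger bonus, encouraging exploration even when
# the environment itself returns zero reward most of the time.
from collections import defaultdict
import math

class CountBasedCuriosity:
    """Intrinsic reward r_int(s) = beta / sqrt(N(s)), where N(s) is the visit count."""
    def __init__(self, beta=0.1):
        self.beta = beta
        self.visit_counts = defaultdict(int)

    def bonus(self, state):
        # state must be hashable, e.g. a tuple of discretised observation features
        self.visit_counts[state] += 1
        return self.beta / math.sqrt(self.visit_counts[state])

def shaped_reward(extrinsic_reward, state, curiosity):
    """Total reward the agent learns from: extrinsic plus intrinsic bonus."""
    return extrinsic_reward + curiosity.bonus(state)

# Usage: inside the training loop, replace the raw environment reward with the
# shaped one before the policy update.
curiosity = CountBasedCuriosity(beta=0.1)
r_total = shaped_reward(0.0, (3, 5), curiosity)  # sparse environment gave zero reward

Other approaches discussed in the literature replace the visit count with a learned prediction error (curiosity-driven or surprise-based bonuses), but the overall structure of adding an intrinsic term to the extrinsic reward stays the same.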