Difference between revisions of "Rémi Munos"
Revision as of 22:27, 3 June 2018
Rémi Munos,
a French mathematician and computer scientist at Google DeepMind, from 2000 to 2006 Associate Professor at the Centre de Mathématiques Appliquées, École Polytechnique, and later affiliated with INRIA Lille [2]. His research interests cover reinforcement learning, multi-armed bandits, and dynamic programming. Rémi Munos was a contributor to the Go playing program MoGo, which uses Monte-Carlo Tree Search with patterns in the simulations and improvements in UCT.
Selected Publications
1996
- Rémi Munos (1996). A convergent reinforcement learning algorithm in the continuous case : the finite-element reinforcement learning. In International Conference on Machine Learning. Morgan Kaufmann
2005 ...
- Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud (2006). Modification of UCT with Patterns in Monte-Carlo Go. INRIA
- Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2007). Tuning Bandit Algorithms in Stochastic Environments. pdf
- Yizao Wang, Jean-Yves Audibert, Rémi Munos (2008). Algorithms for Infinitely Many-Armed Bandits. Advances in Neural Information Processing Systems, pdf, Supplemental material - pdf
- Rémi Munos, Csaba Szepesvári (2008). Finite time bounds for sampling based fitted value iteration. Journal of Machine Learning Research, 9:815-857, 2008. pdf, pdf
- Raphaël Maîtrepierre, Jérémie Mary, Rémi Munos (2008). Adaptive play in Texas Hold'em Poker. ECAI 2008
- Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2009). Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 410:1876-1902, 2009, pdf
- Vincent Berthier, Amine Bourki, Matthieu Coulm, Guillaume Chaslot, Christophe Fiter, Sylvain Gelly, Jean-Baptiste Hoock, Rémi Munos, Julien Pérez, Arpad Rimmel, Philippe Rolet, Olivier Teytaud, Paul Vayssière, Yizao Wang, Ziqin Yu (et al.) (2009). Computer-Go is not only for Go. Korea, August 2009 slides as pdf
2010 ...
- Rémi Munos (2010). Approximate dynamic programming. In Olivier Sigaud and Olivier Buffet, editors, Markov Decision Processes in Artificial Intelligence, chapter 3, pages 67-98. ISTE Ltd and John Wiley & Sons Inc., pdf
- Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos (2014). Regret bounds for restless Markov bandits. Theoretical Computer Science 558, pdf
- Rémi Munos (2014). From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning. Foundations and Trends in Machine Learning, Vol. 7, No 1, hal-00747575v5, slides as pdf
2015 ...
- Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves (2016). Memory-Efficient Backpropagation Through Time. arXiv:1606.03401
- Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick (2016). Learning to reinforcement learn. arXiv:1611.05763
- Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver (2018). Learning to Search with MCTSnets. arXiv:1802.04697