Difference between revisions of "Rémi Munos"
Latest revision as of 17:53, 17 January 2019
Rémi Munos,
a French mathematician and computer scientist at Google DeepMind, from 2000 to 2006 Associate Professor at the Centre de Mathématiques Appliquées, Ecole Polytechnique, and later affiliated with INRIA Lille [2]. His research interests cover reinforcement learning, multi-armed bandits, and dynamic programming. Rémi Munos was a contributor to the Go playing program Mogo, which uses Monte-Carlo Tree Search with patterns in the simulations and improvements to UCT.
Selected Publications
1996
- Rémi Munos (1996). A convergent reinforcement learning algorithm in the continuous case: the finite-element reinforcement learning. In International Conference on Machine Learning. Morgan Kaufmann
2005 ...
- Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud (2006). Modification of UCT with Patterns in Monte-Carlo Go. INRIA
- Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2007). Tuning Bandit Algorithms in Stochastic Environments. pdf
- Yizao Wang, Jean-Yves Audibert, Rémi Munos (2008). Algorithms for Infinitely Many-Armed Bandits. Advances in Neural Information Processing Systems, pdf, Supplemental material - pdf
- Rémi Munos, Csaba Szepesvári (2008). Finite time bounds for sampling based fitted value iteration. Journal of Machine Learning Research, 9:815-857, 2008. pdf, pdf
- Raphaël Maîtrepierre, Jérémie Mary, Rémi Munos (2008). Adaptive play in Texas Hold'em Poker. ECAI 2008
- Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2009). Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 410:1876-1902, 2009, pdf
- Vincent Berthier, Amine Bourki, Matthieu Coulm, Guillaume Chaslot, Christophe Fiter, Sylvain Gelly, Jean-Baptiste Hoock, Rémi Munos, Julien Pérez, Arpad Rimmel, Philippe Rolet, Olivier Teytaud, Paul Vayssière, Yizao Wang, Ziqin Yu (et al.) (2009). Computer-Go is not only for Go. Korea, August 2009 slides as pdf
2010 ...
- Rémi Munos (2010). Approximate dynamic programming. In Olivier Sigaud and Olivier Buffet, editors, Markov Decision Processes in Artificial Intelligence, chapter 3, pages 67-98. ISTE Ltd and John Wiley & Sons Inc., pdf
- Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos (2014). Regret bounds for restless Markov bandits. Theoretical Computer Science 558, pdf
- Rémi Munos (2014). From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning. Foundations and Trends in Machine Learning, Vol. 7, No 1, hal-00747575v5, slides as pdf
- Tor Lattimore, Rémi Munos (2014). Bounded Regret for Finite-Armed Structured Bandits. arXiv:1411.2919
2015 ...
- Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves (2016). Memory-Efficient Backpropagation Through Time. arXiv:1606.03401
- Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matthew Botvinick (2016). Learning to reinforcement learn. arXiv:1611.05763
- Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver (2018). Learning to Search with MCTSnets. arXiv:1802.04697