Postdoctoral Position in Deep Reinforcement Learning at the French National Institute for Research in Computer Science and Automation (Inria)
POST-DOCTORAL RESEARCH VISIT F/M HIGH PERFORMANCE DEEP REINFORCEMENT LEARNING
Inria
Job Description
Contract type: Fixed-term contract
Level of qualifications required: PhD or equivalent
Function: Post-Doctoral Research Visit
About the research centre or Inria department
The Grenoble Rhône-Alpes Research Center brings together just under 650 people in 37 research teams and 8 research support departments.
Staff are located on 5 campuses in Grenoble and Lyon and work in close collaboration with labs, research and higher education institutions in Grenoble and Lyon, as well as with the economic players in these areas.
Active in the fields of software, high-performance computing, Internet of Things, image and data, as well as simulation in oceanography and biology, the center takes part in top-level international scientific achievements and collaborations in Europe and the rest of the world.
Assignment
The goal of reinforcement learning is to learn a task autonomously by interacting with simulations while trying to maximize a reward (a game score, for instance). Recently, researchers have successfully introduced deep neural networks, making it possible to address more complex problems. This is often referred to as Deep Reinforcement Learning (DRL). DRL has, for instance, learned to play many Atari games. The most visible success of DRL is probably AlphaGo Zero, which outperformed the best human players (and earlier versions of AlphaGo) after being trained without data from human games, solely through reinforcement learning. The process requires an advanced infrastructure for the training phase. For instance, AlphaGo Zero trained for more than 70 hours using 64 GPU workers and 19 CPU parameter servers, playing 4.9 million games of self-play with 1,600 simulations for each Monte Carlo Tree Search.
The general workflow is the following. To speed up learning and enable a wide but thorough exploration of the parameter space, the learner neural network interacts in parallel with several actors. Each actor consists of a simulation of the task being learned and a neural network that interacts with this simulation using the best winning strategy it knows. Periodically, the actor neural networks are updated from the learner neural network. This workflow has evolved through various research works combining parallelization, asynchronism and novel learning strategies (GORILA, A3C, IMPALA, ...).
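The actor/learner workflow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy task (a 3-armed bandit with the hypothetical win probabilities in `ARM_MEANS`), the class names `Actor` and `Learner`, and the tabular value estimates that stand in for neural networks. It is also synchronous per round, which is simpler than the asynchronous schemes of GORILA, A3C or IMPALA.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy task: a 3-armed bandit; each arm pays 1.0 with this probability.
ARM_MEANS = [0.2, 0.5, 0.8]


class Actor:
    """One simulation instance plus a local, possibly stale, copy of the parameters."""

    def __init__(self, seed, n_arms):
        self.rng = random.Random(seed)
        self.values = [0.0] * n_arms

    def sync(self, learner_values):
        # Periodic update of the actor from the learner.
        self.values = list(learner_values)

    def rollout(self, steps, epsilon=0.1):
        """Play `steps` rounds, mostly following the best strategy known locally."""
        batch = []
        for _ in range(steps):
            if self.rng.random() < epsilon:
                arm = self.rng.randrange(len(self.values))  # explore
            else:
                arm = max(range(len(self.values)), key=self.values.__getitem__)
            reward = 1.0 if self.rng.random() < ARM_MEANS[arm] else 0.0
            batch.append((arm, reward))
        return batch


class Learner:
    """Aggregates experience from all actors into one set of parameters."""

    def __init__(self, n_arms):
        self.values = [0.0] * n_arms
        self.counts = [0] * n_arms

    def update(self, batch):
        for arm, reward in batch:
            self.counts[arm] += 1
            # Incremental mean of the rewards observed for this arm.
            self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def train(n_actors=4, rounds=50, steps=20, sync_every=5):
    learner = Learner(len(ARM_MEANS))
    actors = [Actor(seed=i, n_arms=len(ARM_MEANS)) for i in range(n_actors)]
    with ThreadPoolExecutor(max_workers=n_actors) as pool:
        for r in range(rounds):
            if r % sync_every == 0:
                for actor in actors:
                    actor.sync(learner.values)
            # Actors run in parallel; the learner consumes their batches in order.
            for batch in pool.map(lambda a: a.rollout(steps), actors):
                learner.update(batch)
    return learner
```

Calling `train()` returns the learner, whose `values` list holds the estimated payoff of each arm. Between two synchronizations the actors act on stale parameters, which is the trade-off that asynchronous methods have to manage at scale.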
The goal of this postdoc is to push forward the scalability of these approaches and to propose novel learning strategies that learn faster and handle more complex tasks (multiple heterogeneous tasks at once, non-deterministic games, simulations of complex industrial or living systems). This work will be performed in close collaboration between the SequeL INRIA team, specialized in DRL (https://team.inria.fr/sequel/), and the DataMove team, specialized in HPC (https://team.inria.fr/datamove). DataMove has developed the Melissa solution (https://melissa-sa.github.io/) to manage large ensembles of parallel simulations and aggregate their data on-line in a parallel server. Melissa has been used to run thousands of simulations on up to 30,000 cores. So far Melissa has been used to compute advanced statistics, but we expect this framework to be a sound base for a DRL workflow. The SequeL team has strong activities in reinforcement learning, deep or not, ranging from theoretical aspects to applications. Among other projects, SequeL has collaborated with Mila (Montréal) to design and develop the GuessWhat?! experiment (https://guesswhat.ai/). As early as 2006, SequeL worked on Go and designed the first Go program (Crazy Stone) able to challenge a human expert player.
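As a loose illustration of what aggregating simulation data on-line can mean, the Python sketch below keeps a running mean and variance that are updated as each result arrives, so no result has to be stored. It uses Welford's algorithm and the standard pairwise merge formula. It is not Melissa's API or implementation, and the `RunningStats` class is a hypothetical name introduced here.

```python
class RunningStats:
    """Running mean and variance, updated one value at a time (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        """Fold one new simulation result into the statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Combine with statistics accumulated elsewhere, e.g. on another server process."""
        if other.n == 0:
            return
        n = self.n + other.n
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.n * other.n / n
        self.mean += delta * other.n / n
        self.n = n

    @property
    def variance(self):
        """Sample variance (0.0 when fewer than two values have been seen)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```

Each server process could keep one `RunningStats` per quantity of interest and call `merge` when partial results are combined. Memory use stays constant however many simulations contribute.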
References
AlphaGo Zero: https://deepmind.com/blog/alphago-zero-learning-scratch/
TensorFlow: https://www.tensorflow.org/
Gorila: https://arxiv.org/pdf/1507.04296
A3C: https://arxiv.org/abs/1602.01783
Rainbow: https://arxiv.org/abs/1710.02298
Impala: https://arxiv.org/abs/1802.01561
Elf: https://arxiv.org/abs/1707.01067
RAY/Rllib: https://ray.readthedocs.io/en/latest/rllib.html
Melissa: https://hal.inria.fr/hal-01607479v1
Main activities
Requirement: PhD in Computer Science
Location: Grenoble or Lille
Hosting Teams:
SequeL (INRIA Lille): https://team.inria.fr/sequel/
DataMove (INRIA Grenoble): https://team.inria.fr/datamove
Contact: Bruno.Raffin@inria.fr and Philippe.Preux@inria.fr
Period: to start sometime in 2019
Duration: 24 months
We are looking for a candidate with a PhD in deep learning, reinforcement learning or high performance computing (a combination of these areas of expertise would be ideal) for a 24-month contract at INRIA. The candidate will be able to join either the SequeL team in Lille or the DataMove team in Grenoble.
The postdoc will have access to large supercomputers equipped with multiple GPUs for experiments. We expect this work to lead to international publications supported by advanced software prototypes.
Benefits package
Subsidized meals
Partial reimbursement of public transport costs
Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
Professional equipment available (videoconferencing, loan of computer equipment, etc.)
Social, cultural and sports events and activities
Access to vocational training
Social security coverage
Remuneration
Salary: 2 653 € gross/month.
Monthly salary after taxes : around 2 136,39 € (medical insurance included, income tax excluded).
General Information
Theme/Domain : Distributed and High Performance Computing Statistics (Big data) (BAP E)
Town/city : St Martin d'Heres
Inria Center : CRI Grenoble - Rhône-Alpes
Starting date : 2021-01-01
Duration of contract : 2 years
Deadline to apply : 2021-02-10
Contacts
Inria Team : DATAMOVE
Recruiter : Raffin Bruno / bruno.raffin@inria.fr