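Postdoctoral Position in Deep Reinforcement Learning at Inria (French National Institute for Research in Computer Science and Automation)

November 18, 2020

Source: compiled by 知识人网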

POST-DOCTORAL RESEARCH VISIT F/M HIGH PERFORMANCE DEEP REINFORCEMENT LEARNING

Inria

Job Description

Contract type: Fixed-term contract

Level of qualifications required: PhD or equivalent

Function: Post-Doctoral Research Visit

About the research centre or Inria department

The Grenoble Rhône-Alpes Research Center brings together just under 650 people in 37 research teams and 8 research support departments.

Staff are located on 5 campuses in Grenoble and Lyon and work in close collaboration with laboratories, research and higher education institutions, and economic players in both areas.

Active in the fields of software, high-performance computing, the Internet of Things, image and data, as well as simulation in oceanography and biology, the center takes part at the highest level in international scientific achievements and collaborations, both in Europe and in the rest of the world.

Assignment

The goal of reinforcement learning is to self-learn a task by interacting with simulations while trying to maximize a reward (a game score, for instance). Recently, researchers have successfully introduced deep neural networks, making it possible to address more complex problems. This is often referred to as Deep Reinforcement Learning (DRL). DRL has managed, for instance, to play many Atari games. The most visible success of DRL is probably AlphaGo Zero, which outperformed the best human players (and previous versions of AlphaGo) after being trained without any data from human games, solely through reinforcement learning. The process requires an advanced infrastructure for the training phase. For instance, AlphaGo Zero trained for more than 70 hours using 64 GPU workers and 19 CPU parameter servers, playing 4.9 million games of self-play with 1,600 simulations for each Monte Carlo Tree Search.

The general workflow is as follows. To speed up the learning process and enable a wide but thorough exploration of the parameter space, the learning neural network interacts in parallel with several actor instances, each consisting of a simulation of the task being learned and a neural network that interacts with this simulation using the best winning strategy it knows. Periodically, the actor neural networks are updated from the learner neural network. This workflow has evolved through various research works combining parallelization, asynchronism and novel learning strategies (GORILA, A3C, IMPALA, ...).
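As a rough illustration of the actor/learner workflow described above, here is a minimal sketch in Python. It is not taken from any of the cited systems: the task is a toy three-armed bandit standing in for a simulation, the "networks" are plain value tables, the actors run one after another rather than in parallel, and all names (`BanditSim`, `Actor`, `Learner`, `train`, `sync_every`) are hypothetical. It only shows the structure: actors act with a possibly stale copy of the learner's parameters, the learner updates from their experience, and the actor copies are refreshed periodically.

```python
import random


class BanditSim:
    """Toy stand-in for a task simulation: each action wins with a fixed probability."""

    def __init__(self, probs, seed):
        self.probs = probs
        self.rng = random.Random(seed)

    def step(self, action):
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


class Actor:
    """Interacts with its own simulation using a local copy of the learner's values."""

    def __init__(self, sim, seed, epsilon=0.2):
        self.sim = sim
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.values = []  # local, possibly stale, copy of the learner's estimates

    def sync(self, values):
        self.values = list(values)

    def rollout(self, n_steps):
        experience = []
        for _ in range(n_steps):
            if self.rng.random() < self.epsilon:
                action = self.rng.randrange(len(self.values))  # explore
            else:
                # exploit the best action known to this actor
                action = max(range(len(self.values)), key=self.values.__getitem__)
            experience.append((action, self.sim.step(action)))
        return experience


class Learner:
    """Keeps the reference estimates and updates them from actor experience."""

    def __init__(self, n_actions):
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions

    def update(self, experience):
        for action, reward in experience:
            self.counts[action] += 1
            # incremental average of the rewards observed for this action
            self.values[action] += (reward - self.values[action]) / self.counts[action]


def train(n_actors=4, n_rounds=200, steps_per_rollout=10, sync_every=5, seed=0):
    probs = [0.2, 0.5, 0.8]  # assumed win probabilities of the toy task
    learner = Learner(len(probs))
    actors = [Actor(BanditSim(probs, seed + i), seed + 100 + i) for i in range(n_actors)]
    for actor in actors:
        actor.sync(learner.values)
    for round_idx in range(n_rounds):
        for actor in actors:  # a real system would run these in parallel
            learner.update(actor.rollout(steps_per_rollout))
        if (round_idx + 1) % sync_every == 0:
            for actor in actors:  # periodic refresh of the actor copies
                actor.sync(learner.values)
    return learner
```

Calling `train()` returns the learner, whose `values` list can then be inspected. In a distributed setting the inner loop over actors is the part that would be spread across workers, and the synchronization period controls how stale the actor copies are allowed to become.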

The goal of this postdoc is to push forward the scalability of these approaches and to propose novel learning strategies that learn more rapidly and handle more complex tasks (multiple heterogeneous tasks at once, non-deterministic games, simulations of complex industrial or living systems). This work will be performed in close collaboration between the SequeL INRIA team, specialized in DRL (https://team.inria.fr/sequel/), and the DataMove team, specialized in HPC (https://team.inria.fr/datamove). DataMove has developed the Melissa solution (https://melissa-sa.github.io/) to manage large ensembles of parallel simulations and aggregate their data on-line in a parallel server. Melissa has made it possible to run thousands of simulations on up to 30,000 cores. So far Melissa has been used to compute advanced statistics, but we expect this framework to be a sound base for a DRL workflow. The SequeL team has strong activities in reinforcement learning, deep or not, ranging from theoretical aspects to applications. Among other projects, SequeL has collaborated with Mila (Montréal) to design and develop the GuessWhat?! experiment (https://guesswhat.ai/). As early as 2006, SequeL worked on Go and designed the first Go program (Crazy Stone) able to challenge a human expert player.
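To give an idea of what aggregating simulation data on-line can look like, the sketch below computes a running mean and variance without storing the samples, and merges partial results coming from different workers. This is not Melissa's API or necessarily its method: it uses the standard Welford update and the pairwise merge formula as an assumed stand-in, and the class name `RunningStats` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean/variance; samples are folded in one at a time and then discarded."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def push(self, x):
        # Welford's update for a single new sample
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        # Combine with statistics accumulated independently (e.g. by another worker)
        if other.count == 0:
            return
        total = self.count + other.count
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.count * other.count / total
        self.mean += delta * other.count / total
        self.count = total

    @property
    def variance(self):
        # unbiased sample variance; 0.0 when fewer than two samples were seen
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def stats_of(values):
    """Hypothetical helper: accumulate the statistics of one worker's results."""
    stats = RunningStats()
    for v in values:
        stats.push(v)
    return stats
```

For example, `a = stats_of([1, 2, 3, 4])` and `b = stats_of([5, 6, 7, 8])` followed by `a.merge(b)` leaves `a` holding the statistics of all eight values, so each worker only needs to send three numbers to the server rather than its full output.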

References

 AlphaGo Zero: https://deepmind.com/blog/alphago-zero-learning-scratch/

 TensorFlow: https://www.tensorflow.org/

 Gorila: https://arxiv.org/pdf/1507.04296

 A3C: https://arxiv.org/abs/1602.01783

 Rainbow: https://arxiv.org/abs/1710.02298

 Impala: https://arxiv.org/abs/1802.01561

 ELF: https://arxiv.org/abs/1707.01067

 Ray/RLlib: https://ray.readthedocs.io/en/latest/rllib.html

 Melissa: https://hal.inria.fr/hal-01607479v1

Main activities

 Requirement: PhD in Computer Science

 Location: Grenoble or Lille

 Hosting Teams:

 SequeL (INRIA Lille): https://team.inria.fr/sequel/

 DataMove (INRIA Grenoble): https://team.inria.fr/datamove

 Contact: Bruno.Raffin@inria.fr and Philippe.Preux@inria.fr

 Period: to start sometime in 2019

 Duration: 24 months

We are looking for a candidate with a PhD in deep learning, reinforcement learning or high performance computing (a combination of these areas of expertise would be ideal) for a 24-month contract at INRIA. The candidate will be able to join either the SequeL team in Lille or the DataMove team in Grenoble.

The postdoc will have access to large supercomputers equipped with multiple GPUs for experiments. We expect this work to lead to international publications supported by advanced software prototypes.

Benefits package

 Subsidized meals

 Partial reimbursement of public transport costs

 Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)

 Possibility of teleworking (after 6 months of employment) and flexible organization of working hours

 Professional equipment available (videoconferencing, loan of computer equipment, etc.)

 Social, cultural and sports events and activities

 Access to vocational training

 Social security coverage

Remuneration

Salary: 2 653 € gross/month.

Monthly salary after taxes: around 2 136,39 € (medical insurance included, income tax excluded).

General Information

 Theme/Domain: Distributed and High Performance Computing Statistics (Big data) (BAP E)

 Town/city: St Martin d'Heres

 Inria Center: CRI Grenoble - Rhône-Alpes