It is widely known that policies trained using reinforcement learning (RL) to solve simulated robotics problems (MuJoCo) are

The question we address: how can we develop physics-informed reinforcement learning algorithms that guarantee safety and interpretability?

Current experimental challenges include: demonstrating interpretability of the trained policies using

perform a

Work towards theorems providing convergence guarantees of the

Establish link between

\[ Q^{\pi, CM}(s_t, a_t) = \mathbb{E}_{r_{i\ge t},s_{i>t}\sim E,\ z_{i>t}\sim \pi,\ a_i = CM(s_i,z_i)}{\left[R_t|s_t,a_t\right]}; \]

\[ 0 = \min\left\{ l(x) - V(x,t),\ \frac{\partial V}{\partial t} + \max_{u\in\mathcal{U}} \nabla_x V(x,t)^{\top} f(x,u) \right\}. \]
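
For intuition, the variational inequality above can be approximated on a grid by the backward recursion V(x, t) = min{ l(x), max_u V(x + f(x,u) dt, t + dt) } with terminal condition V(x, T) = l(x). The sketch below is only an illustration and not the project's method. It assumes that l(x) is a safety margin (non-negative inside the safe set) and that a state with V >= 0 can be kept safe until time T. The toy dynamics f(x,u) = x + u, the margin l(x) = 1 - |x|, the control bounds, the grid, and the name `safety_value` are all hypothetical choices made for this example.

```python
import numpy as np

def safety_value(xs, us, f, l, T=1.0, dt=0.01):
    """Grid approximation of V(x, 0) via the backward recursion
    V(x, t) = min{ l(x), max_u V(x + f(x, u) * dt, t + dt) }."""
    margin = l(xs)
    V = margin.copy()  # terminal condition V(x, T) = l(x)
    for _ in range(int(round(T / dt))):
        # Value at the Euler successor state for each control (linear interpolation;
        # successors outside the grid are clamped to the boundary value).
        nxt = np.stack([np.interp(xs + f(xs, u) * dt, xs, V) for u in us])
        # Best control keeps the value high; the margin caps it from above.
        V = np.minimum(margin, nxt.max(axis=0))
    return V

# Hypothetical toy problem: unstable 1D system x' = x + u with |u| <= 0.5,
# safe set |x| <= 1 encoded by the margin l(x) = 1 - |x|.
xs = np.linspace(-2.0, 2.0, 401)
us = np.linspace(-0.5, 0.5, 11)
V0 = safety_value(xs, us, f=lambda x, u: x + u, l=lambda x: 1.0 - np.abs(x))

for x in (0.0, 0.3, 0.9):
    print(f"V({x}, 0) = {np.interp(x, xs, V0):+.3f}")
```

In this toy setting the control can cancel the drift only while |x| <= 0.5, so states close to the boundary of the safe set receive a negative value even though they are currently safe.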

Project in collaboration with Prof. Henryk Michalewski (Google & University of Warsaw).

- the position is suitable for students who want to get involved in ML/RL research,
- full-time monthly salary of at least 6000 PLN gross, negotiable and highly dependent on the prospective candidate's qualifications,
- possibility of combining the position with the PhD school run at the University of Warsaw, Poland,
- work towards results publishable at major CS conferences, aiming at CAV, ICML, NIPS, ICLR, ICRA, AAAI,
- worldwide collaboration with renowned academic institutions (including UC San Diego, Stony Brook, Rutgers, TU Wien) and industry (Google, esportslab.gg),
- access to a personal computer and computational resources.

- MSc degree, or MSc studies in progress,
- passion for research,
- proficiency in Python programming,
- knowledge of the fundamentals of machine learning, neural networks and reinforcement learning algorithms (e.g. the Deep Learning book),
- knowledge of the fundamentals of mathematics (calculus, linear algebra, basic real analysis),
- ability to work independently and to self-study.

Please see the detailed project description below, and reach out to me in case of any questions.

To apply, send your CV and a motivation letter explaining why you are interested in this position to my jcyranka at gmail account.

Best, Jacek Cyranka