<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="ru">
	<id>https://www.wikicshse.ru/index.php?action=history&amp;feed=atom&amp;title=RL_2023</id>
	<title>RL 2023 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://www.wikicshse.ru/index.php?action=history&amp;feed=atom&amp;title=RL_2023"/>
	<link rel="alternate" type="text/html" href="https://www.wikicshse.ru/index.php?title=RL_2023&amp;action=history"/>
	<updated>2026-06-06T11:52:01Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://www.wikicshse.ru/index.php?title=RL_2023&amp;diff=656&amp;oldid=prev</id>
		<title>imported&gt;Ivlevin: Migrated current public revision from wiki.cs.hse.ru</title>
		<link rel="alternate" type="text/html" href="https://www.wikicshse.ru/index.php?title=RL_2023&amp;diff=656&amp;oldid=prev"/>
		<updated>2023-11-25T16:19:35Z</updated>

		<summary type="html">&lt;p&gt;Migrated current public revision from wiki.cs.hse.ru&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== Lecturers and Seminar Instructors ==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align:center&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|| Lecturer || [https://www.hse.ru/staff/anaumov Alexey Naumov] || [anaumov@hse.ru] || T924&lt;br /&gt;
|- &lt;br /&gt;
|| Seminar instructor || Ilya Levin || [tg: @levensons] || T926&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== About the course ==&lt;br /&gt;
This page contains materials for the 2023 course Mathematical Foundations of Reinforcement Learning, an elective for second-year Master's students of the Math of Machine Learning program (HSE and Skoltech).&lt;br /&gt;
&lt;br /&gt;
== Grading == &lt;br /&gt;
The final grade consists of two components (each a real number from 0 to 10, with no intermediate rounding):&lt;br /&gt;
* O&amp;lt;sub&amp;gt;HW&amp;lt;/sub&amp;gt; for the homework assignments&lt;br /&gt;
* O&amp;lt;sub&amp;gt;Exam&amp;lt;/sub&amp;gt; for the exam&lt;br /&gt;
The formula for the final grade is &lt;br /&gt;
* O&amp;lt;sub&amp;gt;Final&amp;lt;/sub&amp;gt; = 0.6*O&amp;lt;sub&amp;gt;HW&amp;lt;/sub&amp;gt; + 0.4*O&amp;lt;sub&amp;gt;Exam&amp;lt;/sub&amp;gt;&lt;br /&gt;
with the usual arithmetic rounding rule.&lt;br /&gt;
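&lt;br /&gt;
As an illustrative sketch (not an official grading script), the formula above can be computed in Python as follows. It assumes that arithmetic rounding means rounding the weighted sum to the nearest integer with halves rounded up; the function name final_grade is hypothetical. Decimal is used because Python's built-in round rounds halves to the nearest even number.&lt;br /&gt;

```python
from decimal import Decimal, ROUND_HALF_UP

# Weights taken from the grading formula on this page.
HW_WEIGHT = Decimal("0.6")
EXAM_WEIGHT = Decimal("0.4")


def final_grade(o_hw, o_exam):
    """Return 0.6*O_HW + 0.4*O_Exam as an integer.

    Assumption: the result is rounded to the nearest integer,
    with halves rounded up.
    """
    hw = Decimal(str(o_hw))
    exam = Decimal(str(o_exam))
    for score in (hw, exam):
        # Each component must lie in the range from 0 to 10.
        if score != min(max(score, Decimal(0)), Decimal(10)):
            raise ValueError("each component must be between 0 and 10")
    # No intermediate rounding: round only the weighted sum.
    raw = HW_WEIGHT * hw + EXAM_WEIGHT * exam
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
```

For example, final_grade(6.5, 6.5) gives 7, whereas the built-in round(6.5) gives 6.&lt;br /&gt;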
&lt;br /&gt;
== Course materials ==&lt;br /&gt;
*[https://www.overleaf.com/read/kbzmvxdzbrxq &amp;#039;&amp;#039;&amp;#039;Lecture and seminar notes&amp;#039;&amp;#039;&amp;#039;]&lt;br /&gt;
*[https://colab.research.google.com/drive/10qBq7Ot_1ZpnTeD11P5AnE8jFVj0OLXl?usp=sharing &amp;#039;&amp;#039;&amp;#039;Notebook for the first seminar&amp;#039;&amp;#039;&amp;#039;]&lt;br /&gt;
&lt;br /&gt;
== Homeworks ==&lt;br /&gt;
*[https://disk.yandex.ru/i/C8hwvvS5us09sA &amp;#039;&amp;#039;&amp;#039;HW 1&amp;#039;&amp;#039;&amp;#039;]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Recommended literature ==&lt;br /&gt;
&lt;br /&gt;
* Sébastien Bubeck, Nicolò Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Chapter 2. http://sbubeck.com/SurveyBCB12.pdf&lt;br /&gt;
* Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. Chapter 2. http://incompleteideas.net/book/the-book-2nd.html&lt;br /&gt;
* Botao Hao et al. Bootstrapping Upper Confidence Bound. https://arxiv.org/abs/1906.05247&lt;br /&gt;
* Aleksandrs Slivkins. Introduction to Multi-Armed Bandits. Chapter 1. https://arxiv.org/abs/1904.07272&lt;br /&gt;
&lt;br /&gt;
== Projects ==&lt;/div&gt;</summary>
		<author><name>imported&gt;Ivlevin</name></author>
	</entry>
</feed>