









This document covers the basics of machine learning.
rise to the credit assignment problem and is thus more difficult]
might provide training examples; the learner might suggest interesting examples and ask the teacher for their outcome; or the learner can be completely on its own with no access to correct outcomes]
experience representative of the task the system will actually have to solve? It is best if it is, but such a situation cannot systematically be achieved]
• Given a set of legal moves, we want to learn how to choose the best move [since the best move is not necessarily known, this is an optimization problem].
• ChooseMove : B --> M is called a Target Function [ChooseMove, however, is difficult to learn. An easier, related target function to learn is V : B --> R, which assigns a numerical score to each board. The better the board, the higher the score.]
• Operational versus Non-Operational Description of a Target Function [An operational description must be given.]
• Function Approximation [The actual function often cannot be learned and must be approximated.]
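To make the link between ChooseMove and V concrete, here is a minimal sketch of how a move could be chosen once some scoring function V is available: score the board that each legal move leads to and pick the highest. The names `choose_move` and `apply_move`, and the toy integer "boards", are hypothetical and used only for illustration; they do not come from the slides.

```python
def choose_move(board, legal_moves, apply_move, v):
    """Pick the legal move whose resulting board gets the highest V score.

    board       -- current board state (any representation)
    legal_moves -- iterable of candidate moves
    apply_move  -- hypothetical helper: (board, move) -> successor board
    v           -- evaluation function V: board -> number
    """
    return max(legal_moves, key=lambda move: v(apply_move(board, move)))


# Toy stand-in for a game: a "board" is an integer, a "move" adds to it,
# and V prefers boards close to 10. This is an assumption for the demo only.
toy_v = lambda b: -abs(b - 10)
toy_apply = lambda b, m: b + m

best = choose_move(7, [1, 2, 3, 4], toy_apply, toy_v)
print(best)
```

With this arrangement, learning ChooseMove reduces to learning V, which is why the easier target function is useful.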
[e.g. <<x1=3, x2=0, x3=1, x4=0, x5=0, x6=0>, +100> (= black won)]
Useful and Easy Approach: Vtrain(b) <- V(Successor(b))
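The rule Vtrain(b) <- V(Successor(b)) can be sketched as follows. This assumes a game trace is stored as a list of feature tuples (one per board the learner faced, in order), that the last board receives the known game outcome (such as +100), and that every earlier board receives the current estimate V of the board that follows it. The function name `training_examples` and the trace layout are assumptions made for this example.

```python
def training_examples(trace, final_score, v):
    """Build (features, Vtrain) pairs from one game trace.

    trace       -- list of feature tuples, one per board, in playing order
    final_score -- known outcome assigned to the last board (e.g. +100)
    v           -- current estimate of the target function: features -> number
    """
    examples = []
    for i, features in enumerate(trace):
        if i + 1 < len(trace):
            # Intermediate board: Vtrain(b) <- V(Successor(b))
            target = v(trace[i + 1])
        else:
            # Final board: the outcome of the game is known directly
            target = final_score
        examples.append((features, target))
    return examples


# Toy estimate of V over two features (an assumption for the demo only).
toy_estimate = lambda x: x[0] - x[1]
demo = training_examples([(3, 1), (2, 0), (3, 0)], 100, toy_estimate)
print(demo)
```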
Define a criterion for success [What is the error that needs to be minimized?]
Choose an algorithm capable of finding the weights of a linear function that minimize that error [e.g. the Least Mean Square (LMS) training rule].
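As an illustration of the LMS training rule, the sketch below assumes V is a linear function V(b) = w0 + w1*x1 + ... + wn*xn of the board features and that the error to minimize is the squared difference between Vtrain(b) and V(b). Each update then moves every weight by eta * (Vtrain(b) - V(b)) * xi. The learning rate `eta` and the function names are assumptions chosen for this example.

```python
def predict(weights, features):
    """Linear evaluation: w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights, features, target, eta=0.1):
    """One LMS step: w_i <- w_i + eta * (target - V(b)) * x_i.

    The bias weight w0 is paired with a constant input of 1.
    Returns a new weight list; the input list is not modified.
    """
    error = target - predict(weights, features)
    inputs = (1.0,) + tuple(features)
    return [w + eta * error * x for w, x in zip(weights, inputs)]


# One step on a toy example: the prediction moves toward the target.
w_before = [0.0, 0.0, 0.0]
w_after = lms_update(w_before, (1, 2), target=10.0, eta=0.1)
print(predict(w_before, (1, 2)), predict(w_after, (1, 2)))
```

Repeating this update over many training examples with a small `eta` gradually reduces the squared error on those examples.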
• The Performance Module: Takes as input a new board and outputs a trace of the game it played against itself.
• The Critic: Takes as input the trace of a game and outputs a set of training examples of the target function.
• The Generalizer: Takes as input training examples and outputs a hypothesis which estimates the target function. Good generalization to new cases is crucial.
• The Experiment Generator: Takes as input the current hypothesis (the currently learned function) and outputs a new problem (an initial board state) for the performance system to explore.
In this course, we are mostly concerned with the generalizer.
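The way the four modules feed one another can be sketched as a simple loop. This is only a structural outline: each module is passed in as a function, and handing the current hypothesis to the performance module and the critic is an assumption of this sketch (the slides give only each module's main input and output). The toy stand-ins at the bottom are placeholders and do not play any game.

```python
def learning_loop(experiment_generator, performance, critic, generalizer,
                  hypothesis, rounds):
    """Run the four-module cycle for a fixed number of rounds."""
    for _ in range(rounds):
        board = experiment_generator(hypothesis)         # new problem to explore
        trace = performance(board, hypothesis)           # game played against itself
        examples = critic(trace, hypothesis)             # training examples from the trace
        hypothesis = generalizer(examples, hypothesis)   # updated estimate of the target
    return hypothesis


# Toy stand-ins (assumptions for the demo): the "hypothesis" is just a counter
# that grows by the number of training examples produced in each round.
final = learning_loop(
    experiment_generator=lambda h: h,
    performance=lambda board, h: [board, board + 1],
    critic=lambda trace, h: [(b, b + 1) for b in trace],
    generalizer=lambda examples, h: h + len(examples),
    hypothesis=0,
    rounds=3,
)
print(final)
```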