Interacting with the Environment: Connectionism & Feedback in Learning Systems, Slides of Artificial Intelligence

These slides cover the concepts of connectionism, feedback, and credit assignment in the context of Collective Learning Systems (CLS). They discuss the role of automata, algedonic algorithms, and the environment in CLS, as well as the importance of selection and compensation methods. The slides also include the update rules for implementing CLS and explanations of reward and punishment compensation.


Interacting with the environment

Connectionism

Stimulus

Response

Feedback

Feedback at work: playing tic-tac-toe

[Figure: a tree of possibilities, boards and plays, ending in a won game.]

Credit Assignment and Connectionism

  • For all the actions, the environment delivers one single composite feedback
  • The automaton must distribute the reward or punishment among the network, generating a credit assignment model
  • In this way the automaton generates an adequate internal pattern of behavior
  • This is what we call learning
  • There are many methodologies to model such behavior
  • Here we shall use CLS (Collective Learning Systems)

So far…

Artificial Intelligence deals with knowledge and learning.

Artificial learning is obtained by

  • Traversing knowledge bases (rule-based and logical programming)
  • Artificial selection (genetic and evolutionary algorithms)
  • Adaptive methods (connectionism and feedback)

Adaptive behavior studies have their roots in Pavlov's studies of animal conditioning.

CLS Formalization

CLS = [ AUTOMATA, MA ]

Where AUTOMATA = { I, O, STM, A }

I : Is a vector of possible inputs or stimuli

O : Is a vector of possible responses or actions

STM : Is the transition matrix where the Probability Pij of choosing Response Oj is stored for each Stimulus Ii

A : Is an algedonic algorithm (punishment / reward) that modifies the probabilities Pij according to the compensation policy of the automaton; it is precisely this algorithm that represents learning

MA : Is the environment, which emits a series of stimuli I and evaluates the responses O of the AUTOMATA; that evaluation determines the values applied to Pij through the algorithm A, updating the matrix STM
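
As a concrete illustration of this formalization, here is a minimal Python sketch (the class and method names are mine, not from the slides) of an AUTOMATA with a stimulus vector I, a response vector O, and a stochastic transition matrix STM whose rows start out uniform:

    import random

    class Automaton:
        """Minimal CLS automaton: stimuli I, responses O, transition matrix STM.
        STM[i][j] holds the probability of choosing response O[j] given stimulus I[i]."""

        def __init__(self, stimuli, responses):
            self.I = list(stimuli)
            self.O = list(responses)
            n = len(self.O)
            # Each row starts as a uniform distribution: purely descriptive information.
            self.STM = [[1.0 / n for _ in self.O] for _ in self.I]

        def select(self, i):
            """Selection: draw a response index j for stimulus index i according to STM[i]."""
            return random.choices(range(len(self.O)), weights=self.STM[i])[0]

The algedonic algorithm A is not shown here; it is sketched together with the compensation formulas further below.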

CLS mapping

[Figure: the game mapped onto the CLS, showing the possible moves, the other player's turn, and the second turn; an initial method, a selection method, and a compensation method turn the descriptive STM of possible moves (example transition probabilities shown for each move) into a prescription of the best moves.]

Looking at the options, the CLS selects its move and gives the board to the other player. After the other player selects a move, the CLS takes a second move, and so on until the game is over.

After the game is over and the winner is determined, the compensation method modifies the probabilities and the STM becomes more prescriptive (knowledge) rather than just descriptive (information).
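
A sketch of that game loop, assuming the Automaton class above and a purely hypothetical env object that encapsulates the game rules (initial_stimulus, apply, opponent_move, game_over, and cls_won are invented names); the reward and punish helpers are the algedonic updates given on the next slide:

    def play_one_game(automaton, env, beta=0.1):
        """One game of the CLS against the environment.
        Every (stimulus, selected response) pair is remembered, and only after the
        single composite feedback (won or lost) arrives is the whole trajectory
        compensated: the credit assignment step."""
        trajectory = []
        i = env.initial_stimulus()                # index of the starting board
        while not env.game_over():
            k = automaton.select(i)               # selection method: draw from STM[i]
            env.apply(automaton.O[k])             # the CLS plays and hands the board over
            trajectory.append((i, k))
            if env.game_over():
                break
            i = env.opponent_move()               # other player's turn yields a new stimulus
        won = env.cls_won()
        for i, k in trajectory:                   # compensation method
            (reward if won else punish)(automaton.STM, i, k, beta)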

Algedonic compensation

In case of a Reward (with 0 < ß < 1):

For the selection i -> k in the STM (the selected play)

STM(t+1)i,k = STM(t)i,k + ß*(1 - STM(t)i,k)

For the other transitions i -> j, with j ≠ k (n being the number of possible responses)

STM(t+1)i,j = STM(t)i,j - ß*(1 - STM(t)i,k)/(n-1)

In case of a Punishment (with 0 < ß < 1):

For the selection i -> k in the STM (the selected play)

STM(t+1)i,k = STM(t)i,k - ß*STM(t)i,k

For the other transitions i -> j, with j ≠ k

STM(t+1)i,j = STM(t)i,j + ß*STM(t)i,k/(n-1)
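
A direct Python sketch of these two updates (the function names are mine; n is the number of possible responses, and the delta is computed from the pre-update value, as the formulas require):

    def reward(STM, i, k, beta):
        """Algedonic reward: move probability toward the selected response k and
        take the same total mass, split evenly, from the other responses."""
        n = len(STM[i])
        delta = beta * (1.0 - STM[i][k])
        for j in range(n):
            STM[i][j] += delta if j == k else -delta / (n - 1)

    def punish(STM, i, k, beta):
        """Algedonic punishment: take probability away from the selected response k
        and redistribute it evenly among the other responses."""
        n = len(STM[i])
        delta = beta * STM[i][k]
        for j in range(n):
            STM[i][j] += -delta if j == k else delta / (n - 1)

Note that each row of the STM keeps summing to 1: the mass added to (or removed from) the chosen transition is exactly the mass removed from (or added to) the rest.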

CLS Non-linear Compensation

Think about the case of a reward or punishment (R/P) in my everyday life. How much do I listen to an R/P? It depends on:

  • Who is giving it to me
  • What my expectation is
  • The recent evaluations I have had

We can take such concerns into account, for example:

  • The domain of ß is 0 < ß < 1: a value of 0 will cause no learning, while a value of 1 will saturate the STM, driving it to one selection only
  • A reward/inaction scheme is achieved by using ß = 0 for the punishment update
  • When punishing, using ß/2 reduces the chances of wrongly updating probabilities

Rewards more at the beginning:

STM[i][k] = STM[i][k] + ß*(1 - STM[i][k])*(1 - STM[i][k])

Rewards more at the end:

STM[i][k] = STM[i][k] + ß*(1 - STM[i][k])*STM[i][k]

Punishes more at the beginning:

STM[i][k] = STM[i][k] - (ß/2)*STM[i][k]*(1 - STM[i][k])

Punishes more at the end:

STM[i][k] = STM[i][k] - (ß/2)*STM[i][k]*STM[i][k]
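
These four variants could be sketched as a single hypothetical helper (the mode names are mine; the slides only give the update of the selected entry STM[i][k], so redistributing the mass evenly over the other entries of row i is my assumption, kept as in the linear case):

    def nonlinear_compensate(STM, i, k, beta, mode):
        """Non-linear algedonic update of the selected transition STM[i][k]."""
        p = STM[i][k]
        if mode == "reward_early":        # rewards more at the beginning
            delta = beta * (1 - p) * (1 - p)
        elif mode == "reward_late":       # rewards more at the end
            delta = beta * (1 - p) * p
        elif mode == "punish_early":      # punishes more at the beginning
            delta = -(beta / 2) * p * (1 - p)
        elif mode == "punish_late":       # punishes more at the end
            delta = -(beta / 2) * p * p
        else:
            raise ValueError(mode)
        STM[i][k] = p + delta
        n = len(STM[i])
        for j in range(n):                # keep the row summing to 1 (assumption)
            if j != k:
                STM[i][j] -= delta / (n - 1)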

Boltzmann entropy

As a measurement of order, entropy can be defined as

S = - Σ STM[i][j] * log2(STM[i][j]) over all i, j

Using entropy, we can check how well organized the STM is: the lower the entropy, the more uneven the probabilities in the STM.
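
A direct sketch of that measure in Python (terms with zero probability are skipped, taking 0·log 0 as 0):

    import math

    def entropy(STM):
        """S = - sum over all i, j of STM[i][j] * log2(STM[i][j]).
        A uniform, purely descriptive STM gives the highest value; as the
        probabilities concentrate on the best moves, S falls toward 0."""
        s = 0.0
        for row in STM:
            for p in row:
                if p > 0:
                    s -= p * math.log2(p)
        return s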

STM in time

The initial STM represents a descriptive matrix of possible actions in the game, mostly the rules of the game (information).

[Figure: the STM, with inputs i as rows and output options j as columns, evolving over Δ time.]

As time goes by, the STM becomes more prescriptive of the game, showing the right moves (knowledge).

In reality it is not only Δ time but also Δ information that reduces the entropy and guides the STM towards the right answers; this is the reason why information is also called negentropy.
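
As a toy illustration (not from the slides), repeatedly rewarding the same transition with the reward and entropy sketches above shows exactly this drop in entropy:

    STM = [[0.25, 0.25, 0.25, 0.25]]   # one stimulus, four equally likely responses
    print(entropy(STM))                # 2.0 bits: purely descriptive, no knowledge yet
    for _ in range(20):
        reward(STM, 0, 2, beta=0.3)    # the feedback keeps favoring response 2
    print(STM[0])                      # probability mass has concentrated on index 2
    print(entropy(STM))                # entropy is now close to 0: prescriptive knowledge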

Knowledge Representation (basic)

In addition to the transitions in a game, it is also necessary to represent other aspects of the game, such as:

  • The board or state of the game
  • The value of each board state

This becomes an important part of knowledge representation.

The board could be represented as a string of nine positions, for example ----O---- (an O in the center) or --X-O----.

[Figure: several drawn boards with Xs and Os in different positions.]

Yet it pays to represent equal boards with the same notation, e.g. --X-O----.

Why?

If we use only arrays, you will see that the number of boards makes the amount of information to represent quite big.

As a matter of fact, this is where data structures first began to be used instead of arrays, and the concept of the linked list was developed in languages such as IPL, Lisp, and Snobol.
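
A sketch of that idea for tic-tac-toe, with the board kept as a nine-character string such as "--X-O----": all rotations and reflections of a board are generated and the lexicographically smallest one is used as its single, shared notation (the helper names are mine, not from the slides):

    def rotate(board):
        """Rotate a nine-character board string 90 degrees clockwise."""
        return "".join(board[r * 3 + c] for c in range(3) for r in (2, 1, 0))

    def mirror(board):
        """Mirror the board left to right."""
        return "".join(board[r * 3 + c] for r in range(3) for c in (2, 1, 0))

    def canonical(board):
        """One shared notation for all boards that are equal up to symmetry,
        so each distinct position is stored only once."""
        variants = []
        b = board
        for _ in range(4):
            variants.append(b)
            variants.append(mirror(b))
            b = rotate(b)
        return min(variants)

    # An X in a corner is the same position no matter which corner it is in.
    print(canonical("X--------") == canonical("--------X"))   # True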

Learning versus Preprogrammed Knowledge

[Figure: Δ performance (up to 100 %) plotted against Δ time: the learning system starts with slow learning, below the preprogrammed level, and then comes back to stability.]

A preprogrammed system will remain limited until it is reprogrammed; the learning system, however, will relearn.