









This document covers the concepts of connectionism, feedback, and credit assignment in the context of Collective Learning Systems (CLS). It discusses the role of automata, algedonic algorithms, and the environment in a CLS, as well as the importance of selection and compensation methods. The document also includes pseudocode for implementing a CLS and explanations of reward and punishment compensation.
Typology: Slides
Connectionism
Stimulus
Response
Feedback
Feedback at work: playing tic-tac-toe
(Diagram: the tree of possibilities, boards and plays, explored until the game is won.)
A single composite feedback (reward or punishment) is distributed among the network, generating a credit-assignment model: an internal pattern of behavior that drives the observable behavior.
So Far ….
Artificial Intelligence deals with knowledge and learning
Artificial learning is obtained through feedback
Adaptive-behavior studies have their roots in Pavlov's studies of animal conditioning
CLS Formalization
CLS = [ AUTOMATA, MA ]
Where AUTOMATA = { I, O, STM, A }
I : A vector of possible entries, or stimuli
O : A vector of possible responses, or actions
STM : The state-transition matrix, storing for each stimulus Ii the probability Pij of choosing response Oj
A : An algedonic algorithm (punishment/reward) that modifies the individual Pij according to the compensation policy of the automaton; it is precisely this algorithm that represents learning
MA : The environment, which emits a series of stimuli I, evaluates the responses O of the AUTOMATA, and thereby determines the values applied to the Pij through algorithm A and the matrix STM
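As a sketch only (not the author's implementation; the class and method names here are illustrative assumptions), the four-tuple AUTOMATA = {I, O, STM, A} and its selection method might look like:

```python
import random

class CLSAutomaton:
    """Sketch of AUTOMATA = {I, O, STM, A}.
    I: list of stimuli; O: list of responses;
    STM: row-stochastic matrix, STM[i][j] = P(response O[j] | stimulus I[i]).
    The algedonic algorithm A (compensation) is applied externally to STM rows."""

    def __init__(self, stimuli, responses):
        self.I = stimuli
        self.O = responses
        n = len(responses)
        # Start descriptive: every response equally likely for every stimulus.
        self.STM = [[1.0 / n] * n for _ in stimuli]

    def select(self, i):
        """Selection method: draw a response index j for stimulus index i
        according to the probabilities stored in row i of the STM."""
        r, acc = random.random(), 0.0
        for j, p in enumerate(self.STM[i]):
            acc += p
            if r <= acc:
                return j
        return len(self.O) - 1  # guard against floating-point round-off
```

For example, `CLSAutomaton(['board0'], ['move_a', 'move_b'])` starts with a uniform row `[0.5, 0.5]`, which learning later sharpens.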
CLS mapping
Possible moves
Other player ’ s turn
Second turn
Selection method
Compensation method
Initial method
After the game is over and the winner is determined, the compensation method modifies the probabilities, and the STM becomes more prescriptive (knowledge) rather than merely descriptive (information)
Looking at the options, the CLS selects its move and gives the board to the other player. After the other player selects a move, the CLS takes a second move, and so on until the game is over.
(Diagram: a description of the possible moves, an example STM row of probabilities, and the resulting prescription of the best moves.)
Algedonic compensation
In case of a Reward (with 0 < β < 1)
For the selected transition i → k in the STM (the selected play):
STM(t+1)i,k = STM(t)i,k + β·(1 − STM(t)i,k)
For the other transitions i → j, j ≠ k:
STM(t+1)i,j = STM(t)i,j − β·(1 − STM(t)i,k)/(n−1)
In case of a Punishment (with 0 < β < 1)
For the selected transition i → k in the STM (the selected play):
STM(t+1)i,k = STM(t)i,k − β·STM(t)i,k
For the other transitions i → j, j ≠ k:
STM(t+1)i,j = STM(t)i,j + β·STM(t)i,k/(n−1)
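The four update rules above translate directly into code. This is an illustrative sketch of the slide's formulas (the function name `compensate` is an assumption); note that the mass added to or removed from the selected entry is spread evenly over the other n−1 entries, so each row keeps summing to 1:

```python
def compensate(stm_row, k, beta, reward):
    """Algedonic compensation on one STM row (a list of n probabilities).
    k is the index of the selected play; 0 < beta < 1.
    Reward raises P[k] and spreads the loss over the n-1 others;
    punishment lowers P[k] and spreads the gain over the n-1 others."""
    n = len(stm_row)
    p_k = stm_row[k]
    if reward:
        delta = beta * (1.0 - p_k)
        new_row = [p - delta / (n - 1) for p in stm_row]
        new_row[k] = p_k + delta
    else:
        delta = beta * p_k
        new_row = [p + delta / (n - 1) for p in stm_row]
        new_row[k] = p_k - delta
    return new_row
```

A practical implementation would also clip entries at 0, since the reward rule as written can drive an already-tiny probability slightly negative.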
CLS Non-linear Compensation
Think about the case of a reward or punishment (R/P) in my everyday life. How much do I listen to an R/P? It depends on the circumstances, for example on when it arrives.
We can take such concerns into account, for example with a schedule that:
Rewards more at the beginning
Rewards more at the end
Punishes more at the beginning
Punishes more at the end
A value of β = 0 will cause no learning, while a value of β = 1 will saturate the STM, driving it to one selection only; intermediate values trade learning speed against the chances of wrongly updating probabilities.
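One way to realize such schedules (an illustrative assumption, not the author's method, including the name `beta_schedule` and the decay constant) is to make β a function of elapsed time:

```python
import math

def beta_schedule(t, t_max, beta_max=0.5, early=True):
    """Non-linear compensation strength beta as a function of time.
    t is the current step in [0, t_max].
    early=True  -> large beta at the start ("more at the beginning")
    early=False -> large beta at the end   ("more at the end")."""
    frac = t / t_max
    if early:
        return beta_max * math.exp(-3.0 * frac)  # influence decays over time
    return beta_max * frac                       # influence grows over time
```

Separate schedules could be used for reward and punishment, covering all four cases on the slide.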
Boltzmann entropy
As a measurement of order, entropy can be defined as
S = − Σi,j STM[i][j] · log2(STM[i][j]) for all i, j
Using entropy, we can check how well organized the STM is: the lower the entropy, the more uneven (concentrated) the probabilities in the STM.
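A direct sketch of this measure (the helper name `stm_entropy` is an assumption; the 0·log 0 terms are taken as 0 by convention):

```python
import math

def stm_entropy(stm):
    """S = -sum_ij STM[i][j] * log2(STM[i][j]).
    A uniform (purely descriptive) STM maximizes S; as learning
    concentrates probability on the best moves, S drops toward 0."""
    s = 0.0
    for row in stm:
        for p in row:
            if p > 0.0:  # 0 * log2(0) contributes nothing
                s -= p * math.log2(p)
    return s
```

For a single row, `[0.5, 0.5]` gives S = 1 bit, while the fully learned `[1.0, 0.0]` gives S = 0.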
STM in time
The initial STM represents a descriptive matrix of possible actions in the game, mostly the rules of the game (information)
(Matrix rows: inputs i; columns: output options j.)
As time goes by the STM becomes more prescriptive of the game, showing the right moves (knowledge)
Δ time
In reality it is not only Δ time but also Δ information that reduces the entropy and guides the STM towards the right answers; this is the reason why information is also called negentropy
Knowledge Representation (basic)
In addition to the transitions in a game, it is also necessary to represent other aspects of the game, such as the board positions themselves.
This becomes an important part of knowledge representation
Could be represented as
'----O----'
'--X-O----'
(Drawn boards corresponding to these strings: O in the center, then X added.)
Yet it pays to represent equal boards with the same notation, '--X-O----'.
Why?
If we use only arrays, you will see that the number of boards makes the amount of information to represent quite big.
As a matter of fact, this is where data structures first began to be used instead of arrays, and the concept of the linked list was developed in languages such as IPL, Lisp, and Snobol.
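A minimal sketch of the "equal boards, same notation" idea for tic-tac-toe, assuming 9-character board strings and treating the 8 rotations/reflections of a board as equal (all function names here are illustrative assumptions):

```python
def rotate(board):
    """Rotate a 9-char tic-tac-toe board string 90 degrees clockwise."""
    idx = [6, 3, 0, 7, 4, 1, 8, 5, 2]
    return ''.join(board[i] for i in idx)

def reflect(board):
    """Mirror the board left-right."""
    idx = [2, 1, 0, 5, 4, 3, 8, 7, 6]
    return ''.join(board[i] for i in idx)

def canonical(board):
    """Return one fixed representative of the 8 symmetric variants,
    so equal boards share a single STM row instead of up to eight."""
    variants = []
    b = board
    for _ in range(4):  # 4 rotations, each with its mirror image
        variants.append(b)
        variants.append(reflect(b))
        b = rotate(b)
    return min(variants)  # any deterministic choice works
```

This shrinks the state space roughly eightfold, which matters given how fast the number of boards grows.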
Learning versus Preprogrammed Knowledge
(Graph: Δ performance over Δ time, approaching 100%.)
A learning system shows slow learning, but after a change in the environment it comes back to stability: it will relearn.
A preprogrammed system, however, will remain limited until it is reprogrammed.