The Principle of Presence: a heuristic for Growing Knowledge Structured Neural Networks







Neural Networks

  • Efficient at learning a single problem

    • Fully connected
    • Convergence in W³
  • Lifelong learning:

    • Specific cases can be important
    • More knowledge, more weights
    • Catastrophic forgetting
  • -> Full connectivity not suitable

  • -> Need locality



How can people learn so fast?

  • Focus, attention

  • Raw table storage?

    • Frog and
    • Car and
    • Running woman
  • With generalization



What do people memorize? (1)

  • 1 memory: a set of « things »

  • Things are made of other, simpler things

  • Thing = concept

  • Basic concept = perceptual event



What do people memorize? (2)

  • Remember only what is present in mind at the time of memorization:



What do people memorize? (3)

  • Not what is not in mind!

    • Too many concepts are known
    • What is present:
      • Few things
      • Probably important
    • What is absent:
      • Many things
      • Probably irrelevant
  • Good, but not always true -> a heuristic



Presence in everyday life

  • Easy to see what is present, hard to notice what is absent

  • Infants lose attention to a ball that has just disappeared

  • The number zero was invented long after the other digits

  • Etc.



The principle of presence

  • Memorization = creating a new concept from only the currently active concepts

  • Independent of the number of known concepts

  • Only a few concepts are active at a time



Implications

  • A concept can be active or inactive.

  • Activity must reflect importance and must be rare

  • ~ an event (in the programming sense)

  • New concept = conjunction of the active ones

  • Concepts must be re-usable (lifelong learning):

    • Re-use = create a link from this concept
    • 2 independent concepts = 2 units
  • -> More symbolic than an MLP, where a single neuron can represent too many things (see the sketch below)
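
As a rough illustration of concepts as discrete, re-usable units, here is a minimal Python sketch; it is an assumption of this summary, not the author's code, and the class and function names are invented:

    # Concepts are discrete units; memorizing builds a new concept only from
    # the concepts that are active at that moment, so the cost depends on the
    # number of active concepts, not on the number of known ones.

    class Concept:
        def __init__(self, name, parts=()):
            self.name = name
            self.parts = tuple(parts)   # links to the simpler concepts it is built from
            self.active = False

    def memorize(known_concepts, name):
        """Create a new concept as a conjunction of the currently active concepts."""
        return Concept(name, parts=[c for c in known_concepts if c.active])

    # Thousands of concepts may be known, but only the few active ones get linked.
    frog, pond, car = Concept("frog"), Concept("pond"), Concept("car")
    frog.active = pond.active = True
    scene = memorize([frog, pond, car], "frog-in-pond")
    print([p.name for p in scene.parts])   # ['frog', 'pond']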



Implementation: NN

  • Nonlinearity

  • Graph properties: local or global connectivity

  • Weights:

  • But more symbolic:

    • Inactivity: piecewise continuous activation function
    • Knowledge not too distributed
    • Concepts not overlapping too much (see the activation sketch below)
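
One way to realize the "inactivity" bullet is an activation function that outputs exactly zero below a threshold, so that a unit can be truly inactive rather than merely small. The sketch below only illustrates that idea; the threshold value and the linear ramp are assumptions, not the paper's exact function:

    import numpy as np

    # Piecewise continuous activation with a genuine inactivity region:
    # exactly 0 below `threshold`, then a continuous ramp up to 1.
    def presence_activation(x, threshold=0.5):
        x = np.asarray(x, dtype=float)
        return np.clip((x - threshold) / (1.0 - threshold), 0.0, 1.0)

    print(presence_activation([0.2, 0.5, 0.75, 1.2]))   # [0.  0.  0.5 1. ]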


First implementation

  • Inputs: basic events

  • Output: target concept

  • No macro-concepts yet:

  • -> 3-layer network

  • Hidden neuron = conjunction,

    • unless explicit (supervised learning),
    • -> DNF (disjunctive normal form)
  • Output weights simulate priorities between conjunctions (see the sketch below)
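
The 3-layer DNF structure can be pictured with the following minimal sketch (my assumed reading of the slide, not the paper's code): binary input events, hidden units that each encode a conjunction, and an output taking the priority-weighted disjunction (max) of the hidden activations:

    def conjunction(inputs, mask):
        """Active (1.0) only if every input selected by `mask` is active."""
        return 1.0 if all(inputs[i] for i in mask) else 0.0

    def dnf_output(inputs, hidden_masks, priorities):
        """Disjunction over the hidden conjunctions, weighted by priority."""
        return max(conjunction(inputs, m) * w for m, w in zip(hidden_masks, priorities))

    # The target concept "AB" stored as three specific conjunctions (cf. the AB example later).
    A, B, C, D, E = range(5)
    hidden = [{A, B, C}, {A, B, D}, {A, B, E}]
    print(dnf_output([1, 1, 0, 1, 0], hidden, [1.0, 1.0, 1.0]))   # 1.0: ABD is recognized
    print(dnf_output([1, 1, 0, 0, 0], hidden, [1.0, 1.0, 1.0]))   # 0.0: AB alone, not yet generalized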



Locality in learning

  • Only one neuron is modified at a time:

    • Nearest = the most activated one
  • If the target concept is not activated when it should be:

    • Generalize the nearest connected neuron
    • Add a neuron for that specific case
  • If the target is active, but not enough or too much:

    • Generalize the most activating neuron (a sketch follows below)
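
The following hedged sketch implements this local rule under assumptions of this summary: hidden neurons are conjunctions stored as sets of input events, "generalizing" drops the inputs absent from the current example, and the near-miss test that decides between generalizing and adding a neuron is an illustrative choice, so the trace may differ in detail from the slides that follow:

    # Local learning: at most one neuron is modified per example.
    def learn_step(neurons, active_inputs, target_should_fire):
        # "Nearest" = the neuron sharing the most active inputs (the most activated one).
        nearest = max(neurons, key=lambda n: len(n & active_inputs), default=None)
        fires = any(n <= active_inputs for n in neurons)   # is some conjunction satisfied?

        if target_should_fire and not fires:
            if nearest is not None and len(nearest & active_inputs) >= len(nearest) - 1:
                # Near miss: generalize the nearest neuron toward this case.
                nearest.intersection_update(active_inputs)
            else:
                # Otherwise memorize the specific case as a new conjunction.
                neurons.append(set(active_inputs))

    # Trace on the AB example of the next slides: examples ABC, ABD, ABE; target AB.
    neurons = []
    for example in [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "B", "E"}]:
        learn_step(neurons, example, target_should_fire=True)
    print(neurons)   # a single neuron remains, generalized to {'A', 'B'}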


Learning: example (0)

  • Must learn AB.

  • Examples: ABC, ABD, ABE, but not AB.



Learning: example (1)

  • ABC:



Learning: example (2)

  • ABD:



Learning: example (3)

  • ABE: N1 is slightly active for AB



Learning: example (4)

  • Final state: N1 has generalized and is now active for AB



NETtalk task

  • TDNN: 120 neurons, 25,200 connections, 90% accuracy

  • Presence: 753 neurons, 6,024 connections, 74% accuracy

  • Then learns the remaining cases by heart

  • If the input activity is reversed (active <-> inactive):

    • -> catastrophic!
  • Are many cognitive tasks heavily biased toward the principle of presence?



Advantages w.r.t. standard NNs

  • As many inputs as wanted; only the active ones are used

  • Lifelong learning:

    • Large-scale networks
    • Learns specific cases and generalizes, both quickly
  • Weights can be lowered without causing wrong predictions -> imitation



But…

  • Few data, which limits the number of created neurons:

    • not as good as backprop
  • Creates many neurons (but they can be deleted)

  • No negative weights



Work in progress

  • Negative cases, which must stay rare:

    • Inhibitory links
  • Re-use of concepts:

    • Macro-concepts: each concept can become an input (see the sketch below)
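
Since these are only directions named on the slide, the sketch below is speculative: it assumes an inhibitory link simply vetoes a conjunction, and that a macro-concept is a learned concept re-injected as an input event. All names are invented for illustration:

    def concept_activation(active_inputs, positive, negative):
        """1.0 iff every positive input is active and no inhibitory input is."""
        ok = all(i in active_inputs for i in positive)
        vetoed = any(i in active_inputs for i in negative)
        return 1.0 if ok and not vetoed else 0.0

    # Macro-concept: once "AB" is recognized, it becomes an input event itself
    # and can feed a higher-level concept such as "AB-F".
    active = {"A", "B", "F"}
    if concept_activation(active, positive={"A", "B"}, negative={"X"}) == 1.0:
        active.add("AB")
    print(concept_activation(active, positive={"AB", "F"}, negative=set()))   # 1.0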

