Cognitive Psychology and Cognitive Neuroscience/Knowledge Representation and Hemispheric Specialisation

Introduction

Most human cognitive abilities rely on or interact with what we call knowledge. How do people navigate through the world? How do they solve problems, how do they comprehend their surroundings, and on what basis do they make decisions and draw inferences? For all these questions, knowledge, the mental representation of the world, is part of the answer.

What is knowledge? According to Merriam-Webster's online dictionary, knowledge is “the range of one’s information and understanding” and “the circumstance or condition of apprehending truth or fact through reasoning”. Thus, knowledge is a structured collection of information that can be acquired through learning, perception or reasoning.

This chapter deals with the structures, both in human brains and in computational models, that represent knowledge about the world. First, the idea of concepts and categories as a model for storing and sorting information is introduced. Then semantic networks and closely related ideas are presented as attempts to explain the way humans store and handle information. Apart from the biological aspect, we also discuss knowledge representation in artificial systems, which can be helpful tools to store and access knowledge and to draw quick inferences.

After looking at how knowledge is stored and made available in the human brain and in artificial systems, we will take a closer look at the human brain with regard to hemispheric specialisation. This topic is connected not only to knowledge representation, since the two hemispheres differ in which type of knowledge is stored in each of them, but also to many other chapters of this book. Where, for example, is memory located, and which parts of the brain are relevant for emotions and motivation? In this chapter we focus on the general differences between the right and the left hemisphere. We consider the question of whether they differ in what information they process and how they process it, and give an overview of experiments that contributed to the scientific progress in this field.

Knowledge Representation in the Brain

Concepts and Categories

For many cognitive functions, including memory, reasoning and using and understanding language, concepts are essential. Concepts are mental representations. One function of concepts is the categorisation of knowledge, which has been studied intensely. In the course of this chapter, we will focus on this function of concepts.

Imagine you wake up every single morning and start wondering about all the things you have never seen before. Think about how you would feel if an unknown car parked in front of your house. You have seen thousands of cars but since you have never seen this specific car in this particular position, you would not be able to provide yourself with any explanation. Since we are able to find an explanation, the questions we need to ask ourselves are: How are we able to abstract from prior knowledge and why do we not start all over again if we are confronted with a slightly new situation? The answer is easy: We categorise knowledge. Categorisation is the process by which things are placed into groups called categories.

Categories are so-called “pointers to knowledge”. You can imagine a category as a box in which similar objects are grouped and which is labeled with common properties and other general information about the category. Our brain does not only memorise specific examples of members of a category, but also stores general information that all members have in common and which therefore defines the category. Coming back to the car example, this means that our brain does not only store what your car, your neighbors’ car and your friends’ car look like, but also provides us with the general information that most cars have four wheels, need to be fueled and so on. Because categorisation immediately gives us a general picture of a scene by allowing us to recognise new objects as members of a category, it saves us much time and energy that we would otherwise have to spend investigating new objects. It helps us to focus on the important details in our environment and enables us to draw the correct inferences. To make this obvious, imagine yourself standing at the side of a road, wanting to cross it. A car approaches from the left. Now, the only thing you need to know about this car is the general information provided by the category: that it will run you over if you don't wait until it has passed. You don't need to care about the car's color, number of doors and so on. If you were not able to immediately assign the car to the category "car" and infer the necessity to step back, you would get hit because you would still be busy examining the details of that specific and unknown car. Categorisation has therefore proved very helpful for survival in the course of evolution and allows us to navigate through our environment quickly and efficiently.

Categories provide a lot of information about their members

Definitional Approach

Take a look at the following picture! You will see four different kinds of cars. They differ in shape, color and other features, nonetheless you are probably sure that they are all cars.

Four different objects but all are cars

What makes us so convinced about the identity of these objects? Maybe we can try to find a definition which describes all these cars. Do all of them have four wheels? No, there are some which have only three. Do all cars run on petrol? No, that is not true for all cars either. Apparently we will fail to come up with a definition. The reason for this failure is that we have to generalise to make a definition. That would work perhaps for geometrical objects, but obviously not for natural things. The members of one category do not share a completely identical set of features, which makes it problematic to find an appropriate definition. There are, however, similarities between members of one category, so what about this similarity? The famous philosopher Ludwig Wittgenstein asked himself this question and claimed to have found a solution. He developed the idea of family resemblance: members of a category resemble each other in several ways. For example, cars differ in shape, color and many other properties, but every car somehow resembles other cars. The following two approaches determine categories by similarity.

Prototype Approach

The prototype approach was proposed by Rosch in 1973. A prototype is an average case of all members of a particular category, but it is not an actual, really existing member of the category. Even members with extremely varied features within one category can be explained by this approach. Different degrees of prototypicality represent differences among category members. Members which resemble the prototype very strongly are high-prototypical. Members which differ in a lot of ways from the prototype are low-prototypical. There seem to be connections to the idea of family resemblance, and indeed some experiments showed that high prototypicality and high family resemblance are strongly connected. The typicality effect describes the fact that high-prototypical members are recognised faster as members of a category. For example, participants had to decide whether statements like “A penguin is a bird.” or “A sparrow is a bird.” are true. Their decisions were much faster for “sparrow”, a high-prototypical member of the category “bird”, than for an atypical member such as “penguin”. Participants also tend to prefer prototypical members of a category when asked to list objects of a category. In the bird example, they list “sparrow” rather than “penguin”, which is a quite intuitive result. In addition, high-prototypical objects are more strongly affected by priming.
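The idea of a prototype as an average, and of typicality as closeness to that average, can be illustrated with a minimal sketch. The feature names and all numeric values below are invented for illustration and are not taken from any experiment; representing category members as feature vectors is itself an assumption of this sketch.

```python
from math import dist

# Hypothetical feature vectors: (can_fly, sings, body_size). All values are invented.
birds = {
    "sparrow": (1.0, 1.0, 0.1),
    "robin": (1.0, 1.0, 0.2),
    "blackbird": (1.0, 1.0, 0.3),
    "penguin": (0.0, 0.0, 0.8),
}

def prototype(members):
    """Average each feature over all members of the category."""
    vectors = list(members.values())
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))

def typicality_ranking(members):
    """Sort members from most to least typical (closest to the prototype first)."""
    proto = prototype(members)
    return sorted(members, key=lambda name: dist(members[name], proto))

bird_prototype = prototype(birds)
ranking = typicality_ranking(birds)
print(bird_prototype)
print(ranking)
```

In this toy data the computed prototype does not coincide with any stored bird, and the penguin ends up last in the ranking, which mirrors the description of a low-prototypical member.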

Exemplar Approach

The typicality effect can also be explained by a third approach, which is concerned with exemplars. Similar to a prototype, an exemplar is a typical member of the category. The difference is that exemplars are actually existing members of a category that a person has encountered in the past. Like the prototype approach, the exemplar approach involves the similarity of an object to a standard. Here, however, the standard consists of many examples, each one called an exemplar, rather than a single average.

Again we can show the typicality effect: objects that are similar to many examples we have encountered are classified faster than objects which are similar to few examples. You have seen a sparrow more often in your life than a penguin, so you should recognise the sparrow faster.
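A minimal sketch of exemplar-based categorisation follows. It assumes that every encountered exemplar is stored as a feature vector and that the evidence for a category is the summed similarity of a new object to all stored exemplars of that category. The exponential similarity function, the `sensitivity` parameter and all values are assumptions made for this illustration.

```python
from math import dist, exp

# Hypothetical memory of encountered exemplars: (category, (can_fly, body_size)).
memory = [
    ("bird", (1.0, 0.1)),   # a sparrow seen in the garden
    ("bird", (1.0, 0.15)),  # another sparrow
    ("bird", (1.0, 0.2)),   # a robin
    ("bird", (0.0, 0.8)),   # the one penguin seen at the zoo
    ("fish", (0.0, 0.3)),   # a trout
    ("fish", (0.0, 0.9)),   # a tuna
]

def similarity(a, b, sensitivity=3.0):
    """Similarity falls off exponentially with distance in feature space."""
    return exp(-sensitivity * dist(a, b))

def evidence(item, category):
    """Summed similarity of the item to every stored exemplar of the category."""
    return sum(similarity(item, features)
               for label, features in memory if label == category)

def classify(item):
    """Choose the category whose stored exemplars are most similar overall."""
    categories = {label for label, _ in memory}
    return max(categories, key=lambda c: evidence(item, c))

new_sparrow = (1.0, 0.12)
new_penguin = (0.0, 0.75)
print(classify(new_sparrow))
print(evidence(new_sparrow, "bird"), evidence(new_penguin, "bird"))
```

Because many stored bird exemplars resemble the new sparrow and only one resembles the new penguin, the evidence for "bird" is much larger for the sparrow. If evidence is read as a stand-in for classification speed, this corresponds to the typicality effect.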

For both the prototype and the exemplar approach there are experiments whose results support that approach. Some people claim that the exemplar approach has fewer problems with variable categories and with atypical cases within categories. For example, the category “games” is quite difficult to capture with the prototype approach: how would you find an average case for all games, such as football, golf and chess? The reason for the advantage could be that “real” category members are used and that all the information about the individual exemplars, which can be useful when encountering other members later, is stored. Another point on which the approaches can be compared is how well they work for categories of different sizes. The exemplar approach seems to work better for smaller categories, and prototypes do better for larger categories.

Some researchers concluded that people may use both approaches: when we initially learn about a category, we average the exemplars we have seen into a prototype. Early in learning it would be a hindrance to take into account already which exceptions a category has. As we get to know some of these exemplars in more detail, the exemplar information becomes strengthened.

“We know generally what cats are (the prototype), but we know specifically our own cat the best (an exemplar).” (Minda & Smith, 2001)

Hierarchical Organization of Categories

Now that we know about the different approaches of how we go about forming categories, let us look at the structure of a category and the relationship between categories. The basic idea is that larger categories can be split up into more specific and smaller ones.

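Rosch stated that by this process three levels of categorization are created: the superordinate level (e.g. “animal”), the basic level (e.g. “dog”) and the subordinate level (e.g. “retriever”).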

It is interesting that a great deal of information is lost when moving from the basic level up to the superordinate level, but comparatively little is gained when moving from the basic level down to the subordinate level. Scientists wanted to find out if one of these levels is preferred over the others. They asked participants to name presented objects as quickly as possible. The result was that the subjects tended to use the basic-level name, which includes the optimal amount of stored information. Therefore a picture of a retriever would be named “dog” rather than “animal” or “retriever”. It is important to note that the levels are different for each person depending on factors such as expertise and culture.

One factor which influences our categorization is knowledge itself. Experts pay more attention to specific features of objects in their area than non-experts do. For example, when shown pictures of birds, bird experts tend to give the subordinate name (blackbird, sparrow) while non-experts just say "bird". The basic level in an expert's area of interest is lower than the basic level of a layperson. Therefore the knowledge and experience of people affect categorization.

Another factor is culture. Imagine a people who live in close contact with their natural environment and therefore have a greater knowledge about plants etc. than, for example, students in Germany. If you ask the latter what they see in nature, they use the basic level ‘tree’, and if you give the same task to the people closer to nature, they will tend to answer in terms of lower-level concepts such as ‘oak tree’.

Representation of Categories in the Brain

There is evidence that some areas in the brain are selective for different categories, but it is not very probable that there is a corresponding brain area for each category. Results of neurophysiological research point to a kind of double dissociation for living and non-living things. Evidence has been found in fMRI studies that they are indeed represented in different brain areas. It is important to note that there is nevertheless much overlap between the brain areas activated by different categories. Moreover, at the level of single neurons there is a connection to mental categories, too. There seem to exist neurons which respond better to objects of a particular category, so-called “category-specific neurons”. These neurons fire not only in response to one object but to many objects within one category. This leads to the idea that many neurons probably fire when a person recognises a particular object, and that the combined firing pattern of these neurons may represent the object.

Semantic Networks

The "Semantic Network approach" proposes that concepts of the mind are arranged in networks, in other words, in a functional storage system for the "meanings" of words. Of course, the concept of a semantic net is very flexible. In a graphical illustration of such a semantic net, concepts of our mental dictionary are represented by nodes, each of which represents a piece of knowledge about our world.

The properties of a concept could be placed, or "stored", next to a node representing that concept. Links between the nodes indicate the relationship between the objects. The links can not only show that there is a relationship, they can also indicate the kind of relation by their length, for example.

Every concept in the net is in a dynamic relationship with other concepts, which may have prototypically similar characteristics or functions.

Collins and Quillian's Model

Semantic Network according to Collins and Quillian with nodes, links, concept names and properties.

One of the first scientists who thought about structural models of human memory that could be run on a computer was Ross Quillian (1967). Together with Allan Collins, he developed the Semantic Network with related categories and with a hierarchical organisation.

In the picture on the right-hand side, Collins and Quillian's network with added properties at each node is shown. As already mentioned, the skeleton nodes are interconnected by links. At the nodes, concept names are added. As in the section "Hierarchical Organization of Categories", general concepts are at the top and more particular ones at the bottom. By looking at the concept "car", one gets the information that a car has four wheels, has an engine, has windows, and furthermore moves around, needs fuel and is manmade.

These pieces of information must be stored somewhere. It would take too much space if every detail had to be stored at every level. So the information about cars in general is stored at the node "car", and further information about specific cars, e.g. a BMW, is stored at the lower level, where the fact that the BMW has four wheels is not needed if you already know that it is a car. This way of storing shared properties at a higher-level node is called cognitive economy.

In order not to produce redundancies, Collins and Quillian thought of this as an information inheritance principle. Information that is shared by several concepts is stored at the highest parent node to which it applies. All child nodes below this node can then also access the information about the properties. However, there are exceptions. Sometimes a special car has not four wheels, but three. This specific property is stored at the child node.
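Property inheritance with locally stored exceptions can be sketched as a small lookup over a parent hierarchy. The node names, the properties and the dictionary layout are hypothetical choices made for this illustration and are not part of Collins and Quillian's original program.

```python
# Hypothetical mini-network; node names and properties are illustrative only.
network = {
    "vehicle": {"parent": None,
                "properties": {"moves around": True, "is manmade": True}},
    "car": {"parent": "vehicle",
            "properties": {"wheels": 4, "has engine": True}},
    "BMW": {"parent": "car",
            "properties": {"brand": "BMW"}},
    # The exception is stored at the child node and overrides the inherited value.
    "three-wheeler": {"parent": "car",
                      "properties": {"wheels": 3}},
}

def lookup(concept, prop):
    """Walk up the hierarchy until the property is found.

    Returns (value, links_traversed); value is None if no node stores the property.
    """
    links = 0
    node = concept
    while node is not None:
        properties = network[node]["properties"]
        if prop in properties:
            return properties[prop], links
        node = network[node]["parent"]
        links += 1
    return None, links

print(lookup("BMW", "wheels"))            # inherited from "car"
print(lookup("BMW", "is manmade"))        # inherited from "vehicle"
print(lookup("three-wheeler", "wheels"))  # local exception
```

The second return value counts how many links had to be followed. In the model, this distance is what is assumed to correspond to retrieval time.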

The logical structure of the network is convincing, since it can be shown that the time needed to retrieve a concept correlates with the distances in the network. The correlation is supported by the sentence-verification technique. In experiments, participants had to respond to statements about concepts with "yes" or "no". It did indeed take longer to say "yes" if the nodes bearing the concepts were further apart.

The phenomenon that adjacent concepts are activated is called spreading activation. These concepts are far more easily accessed in memory; they are "primed". This was studied and supported by David Meyer and Roger Schvaneveldt (1971) with a lexical-decision task. Participants had to decide whether both members of a pair of letter strings were words. They were faster at confirming real word pairs if the concepts of the two words were close together in the assumed network.

While having the ability to explain many questions, the model has some flaws.

The typicality effect is one of them. It is known that "reaction times for more typical members of a category are faster than for less typical members" (MITECS). This contradicts the assumption of Collins and Quillian's model that the distance in the net is responsible for reaction time. Experiments also indicated that some properties are stored directly at specific nodes rather than only at a higher-level node, which calls cognitive economy into question. Furthermore, there are examples of faster concept retrieval although the distances in the network are longer.

These points led to another version of the Semantic Network approach: the Collins and Loftus Model.

Collins and Loftus Model

Collins and Loftus (1975) tried to overcome these problems by using shorter or longer links depending on the relatedness of concepts, and by adding interconnections between concepts that were formerly not directly linked. The former hierarchical structure was also replaced by a more individual structure that depends on the person. These are only a few of the extensions. As shown in the picture on the right, the new model represents interpersonal differences, such as those acquired during a human's lifespan. They manifest themselves in the layout and the various lengths of the links between the same concepts.

An example: The concept "vehicle" is connected to car, truck or bus by short links, and to fire engine or ambulance with longer links.
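The effect of link length on spreading activation can be sketched as follows. The link lengths, the `decay` factor and the `threshold` are invented for this illustration; the model itself does not specify such numbers, which is exactly the point of the criticism discussed in this section.

```python
# Hypothetical link lengths (shorter = more closely related); values are invented.
links = {
    ("vehicle", "car"): 1.0,
    ("vehicle", "truck"): 1.0,
    ("vehicle", "bus"): 1.0,
    ("vehicle", "fire engine"): 3.0,
    ("vehicle", "ambulance"): 3.0,
    ("fire engine", "red"): 1.0,
}

def neighbours(node):
    """Yield (other_node, link_length) for every link touching the node."""
    for (a, b), length in links.items():
        if a == node:
            yield b, length
        elif b == node:
            yield a, length

def spread(source, decay=0.5, threshold=0.05):
    """Spread activation outward from the source concept.

    Each unit of link length multiplies the activation by `decay`;
    activation below `threshold` is not passed on.
    """
    activation = {source: 1.0}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for other, length in neighbours(node):
            incoming = activation[node] * decay ** length
            if incoming > threshold and incoming > activation.get(other, 0.0):
                activation[other] = incoming
                frontier.append(other)
    return activation

activation = spread("vehicle")
print(activation)
```

With these assumed values, "car" receives more activation from "vehicle" than "fire engine" does, so in this sketch "car" would be primed more strongly.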

After these enhancements, the model is so powerful that some researchers criticised it for being too flexible. In their opinion, the model is no longer a scientific theory, because it cannot be falsified. Furthermore, we do not know how long these links are in us. How could they be measured, and can they be measured at all?

Connectionist Approach

Every concept in a semantic net is in a dynamic relationship with other concepts, which can have prototypically similar characteristics or functions. The neural networks in the brain are organised similarly. Furthermore, it is useful to include the features of “spreading activation” and “parallel distributed activity” in the concept of such a semantic net in order to explain how a very complex environment can be handled.

Basic Principles of Connectionism

The connectionists did this by modeling their networks after neural networks in the nervous system. Every node of the diagram represents a neuron-like processing unit. These units can be divided into three subgroups: input units, which become activated by stimulation from the environment; hidden units, which receive signals from input units and pass them on to output units; and output units, which show a pattern of activation that represents the initial stimulus. Excitatory and inhibitory connections between units, just like synapses in the brain, allow the input to be analyzed and evaluated. To compute the outcome of such a system, a certain “weight” is attached to each connection of the connectionist system, which determines how strongly the signal passed along it stimulates the receiving unit.

It needs to be emphasized that connectionist networks are not models of how the nervous system works. The approach of connectionist networks is a hypothetical approach to represent categories in network patterns. Another name for the connectionist approach is Parallel Distributed Processing approach, for short PDP, since processing takes place in parallel lines and the output is distributed across many units.

Operation of Connectionist Networks

First a stimulus is presented to the input units. Then the links pass on the signal to the hidden units, which distribute the signal to the output units via further links. In the first trial, the output units show a wrong pattern. After many repetitions, the pattern finally is correct. This is achieved by back propagation: the error signals are sent back to the hidden units and the signals are reprocessed. During these repeated trials, the “weights” are gradually adjusted according to the error signals until the output pattern is finally correct. After having achieved a correct pattern for one stimulus, the system is ready to learn a new concept.
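The cycle of presenting a stimulus, comparing the output with the target and sending the error back can be sketched with a very small network. This is an illustrative sketch only: the training patterns, the network size, the sigmoid activation function, the learning rate and the number of repetitions are all assumptions, and the code is not a model of any particular published network.

```python
import math
import random

random.seed(0)  # fixed seed so that the run is repeatable

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical training set: inputs are (has_wheels, has_wings), target is "is a car".
patterns = [((1, 0), 1), ((0, 1), 0), ((1, 1), 0), ((0, 0), 0)]

N_IN, N_HID = 2, 3
# The last weight in each row is a bias weight (its input is always 1.0).
w_hidden = [[random.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_output = [random.uniform(-1, 1) for _ in range(N_HID + 1)]

def forward(inputs):
    """Pass the stimulus from the input units via the hidden units to the output unit."""
    x = list(inputs) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_output, hidden + [1.0])))
    return hidden, output

def mean_squared_error():
    return sum((t - forward(i)[1]) ** 2 for i, t in patterns) / len(patterns)

def train(epochs=5000, rate=0.5):
    for _ in range(epochs):
        for inputs, target in patterns:
            hidden, output = forward(inputs)
            # Error signal at the output unit.
            delta_out = (target - output) * output * (1 - output)
            # Error signals sent back to the hidden units.
            delta_hid = [delta_out * w_output[j] * hidden[j] * (1 - hidden[j])
                         for j in range(N_HID)]
            # Adjust each weight slightly in the direction that reduces the error.
            for j, h in enumerate(hidden + [1.0]):
                w_output[j] += rate * delta_out * h
            for j in range(N_HID):
                for k, v in enumerate(list(inputs) + [1.0]):
                    w_hidden[j][k] += rate * delta_hid[j] * v

error_before = mean_squared_error()
train()
error_after = mean_squared_error()
print(error_before, error_after)
```

The printed error after training should be lower than the error before training, which corresponds to the output pattern gradually becoming correct over many repetitions.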

Evaluating Connectionism

The PDP approach is important for knowledge representation studies. It is far from perfect, but it is steadily being developed further. The process of learning enables the system to make generalizations, because similar concepts create similar patterns. After knowing one car, the system can recognize similar patterns as other cars, or may even predict what other cars look like. Furthermore, the system is protected against total failure. Damage to single units will not cause the system’s total breakdown, but will only disrupt the patterns which use those units. This is called graceful degradation and is often found in patients with brain lesions. These two arguments lead to a third: the PDP approach is organized similarly to the human brain. Some effective computer programs have been developed on this basis that were able to predict the consequences of human brain damage.

On the other hand, the connectionist approach is not without problems. Formerly learned concepts can be overwritten by new concepts. In addition, PDP cannot explain processes more complex than learning concepts. Neither can it explain the phenomenon of rapid learning, which does not require extensive training. It is assumed that rapid learning takes place in the hippocampus, and that conceptual and gradual learning is located in the cortex.

In conclusion, the PDP approach can explain some features of knowledge representation very well but fails for some complex processes.

Mental Representation

There are different theories on how living beings, especially humans, encode information into knowledge. We may think of diverse mental representations of the same object. When reading the written word "car", we call this a discrete symbol. It matches all imaginable cars and is therefore not bound to a special vehicle. It is an abstract, or amodal, representation. This is different if instead we see a picture of a car. It might be a red sports car. Now we speak of a non-discrete symbol, an imaginable picture that appears in front of our inner eye and that fits only certain cars of sufficiently similar appearance.

Propositional Approach

The Propositional Approach is one possible way to model mental representations in the human brain. It works with discrete symbols which are strongly connected among each other. The usage of discrete symbols necessitates clear definitions of each symbol, as well as information about the syntactic rules and the context dependencies in which the symbols may be used. The symbol "car" is only comprehensible for people who do understand English and have seen a car before and therefore know what a car is about. The Propositional Approach is an explicit way to explain mental representation.

Definitions of propositions differ between fields of research and are still under discussion. One possibility is the following: “Traditionally in philosophy a distinction is made between sentences and the ideas underlying those sentences, called propositions. A single proposition may be expressed by an almost unlimited number of sentences. Propositions are not atomic, however; they may be broken down into atomic concepts called ‘Concepts’.”

In addition, mental propositions deal with the storage, retrieval and interconnection of information as knowledge in the human brain. There is an ongoing debate about whether the brain really works with propositions or whether it processes information to and from knowledge in another way, or perhaps in more than one way.

Imagery Approach

One possible alternative to the Propositional Approach is the Imagery Approach. Since here the representation of knowledge is understood as the storage of images as we see them, it is also called the analogical or perceptual approach. In contrast to the Propositional Approach, it works with non-discrete symbols and is modality specific. It is an implicit approach to mental representation. The picture of the sports car implicitly includes seats of some kind. If it is additionally mentioned that they are off-white, the image changes to a more specific one. How two non-discrete symbols are combined is not as predetermined as it is for discrete symbols. The picture of the off-white seats may exist without the red car around them, just as the red car existed before without the off-white seats. The Imagery and the Propositional Approaches are also discussed in chapter 8.

Computational Knowledge Representation

Computational knowledge representation is concerned with how knowledge can be represented symbolically and how it can be manipulated in automated ways. Almost all of the theories mentioned above evolved in symbiosis with computer science. On the one hand, computer science uses the human brain as an inspiration for computational systems, on the other hand, artificial models are used to further our understanding of the biological basis of knowledge representation.

Knowledge representation is connected to many other fields related to information processing, e.g. logic, linguistics, reasoning, and the philosophical aspects of these fields. In particular, it is one of the crucial topics of Artificial Intelligence, as it deals with information encoding, storing and usage for computational models of cognition.

There are three main points that need to be addressed with regard to computational knowledge representation: The process, the formalisms and the applications of knowledge engineering.

Knowledge Engineering

The process of developing computational knowledge-based systems is called knowledge engineering. This process involves assessing the problem, developing a structure for the knowledge base and implementing actual knowledge into the knowledge base. The main task for knowledge engineers is to identify an appropriate conceptual vocabulary.

There are different kinds of knowledge, for instance rules of games, attributes of objects and temporal relations, and each type is expressed best by its own specific vocabulary. Related conceptual vocabularies that are able to describe objects and their relationships are called ontologies. These conceptual vocabularies are highly formal and each is able to express meaning in specific fields of knowledge. They are used for queries and assertions to knowledge bases and make sharing knowledge possible. In order to represent different kinds of knowledge in one framework, Jerry Hobbs (1985) proposed the principle of ontological promiscuity. Thereby several ontologies are mixed together to cover a range of different knowledge types.

A query to a system that represents knowledge about a world made of everyday items and that can perform actions in this world may look like this: “Take the cube from the table!”. This query could be processed as follows: First, since we live in a temporal world, the action needs to be processed in a way that can be broken down into successive steps. Secondly, we make general statements about the rules for our system, for example that gravitational forces have a certain effect. Finally, we try out the chain of tasks that have to be done to take the cube from the table: 1) reach out for the cube with the hand, 2) grab it, 3) raise the hand with the cube, etc. Logical reasoning is well suited to this task, because a logical system can also recognise whether the task is possible at all.

There is a problem with the procedure described above. It is called the frame problem. The system in the example deals with changing states. The actions that take place change the environment: the cube changes its place. Yet the system has not made any propositions about the table so far. We need to make sure that, after picking up the cube from the table, the table does not change its state. It should not disappear or break down. As far as the system is concerned this could happen, since the table is no longer needed: the system states that the cube is in the hand and omits any information about the table. In order to tackle the frame problem, special axioms or similar devices have to be stated. The frame problem has not been solved completely, and there are different approaches to a resolution. Some add spatial and temporal boundaries of objects to the system/world (Hayes 1985). Others try more direct modeling and apply transformations to state descriptions. For example: before the transformation the cube is on the table; after the transformation the table still exists, but independently of the cube.
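The idea of applying transformations to state descriptions can be sketched in the style of STRIPS-like planning operators, a technique the text above does not name and which is used here only as one possible illustration. An action lists what it requires, what it deletes and what it adds; every fact that is not mentioned is simply carried over. The fact names and the dictionary layout are hypothetical.

```python
# Hypothetical world state as a set of facts; the fact names are illustrative.
state = frozenset({
    ("on", "cube", "table"),
    ("exists", "table"),
    ("empty", "hand"),
})

# An action lists only what it requires, what it deletes and what it adds.
take_cube = {
    "preconditions": {("on", "cube", "table"), ("empty", "hand")},
    "delete": {("on", "cube", "table"), ("empty", "hand")},
    "add": {("holding", "hand", "cube")},
}

def apply(action, state):
    """Return the successor state, or None if the action is not possible."""
    if not action["preconditions"] <= state:
        return None
    # Every fact that is not explicitly deleted is carried over unchanged.
    return frozenset((state - action["delete"]) | action["add"])

after = apply(take_cube, state)
print(sorted(after))
print(apply(take_cube, after))  # taking the cube a second time is not possible
```

In this sketch the table persists after the action without any axiom stating so, because unchanged facts are copied by default. The precondition check also shows how such a system can recognise that a task is not possible.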

Knowledge Representation Formalisms

The type of knowledge representation formalism determines how information is stored. Most knowledge representation applications are developed for a specific purpose, for example a digital map for robot navigation or a graph-like account of events for visualizing stories.

Each knowledge representation formalism needs a strict syntax, semantics and inference procedure in order to be clear and computable. Most formalisms offer the following features to express information more clearly: semantic networks, hierarchies of concepts (e.g. vehicle -> car -> BMW) and property inheritance (e.g. red cars have four wheels since cars have four wheels). There are also features that make it possible to add new information to the system without creating inconsistencies, and the possibility to make a "closed-world" assumption, under which everything that is not stated in the knowledge base is taken to be false. For example, if the information that there is gravitation on earth is omitted, a closed-world system concludes that there is none, so the assumption does not hold for our world unless the knowledge base is complete.
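The "closed-world" assumption can be made concrete with a very small sketch. The facts and the two query functions are hypothetical. Under the closed-world reading a missing fact counts as false, while under the open-world reading it counts as unknown, which is represented here by `None`.

```python
# Hypothetical knowledge base; it deliberately says nothing about gravitation.
facts = {
    ("is_a", "car", "vehicle"),
    ("has", "car", "four wheels"),
}

def holds_closed_world(fact):
    """Closed world: whatever is not stated in the knowledge base is false."""
    return fact in facts

def holds_open_world(fact):
    """Open world: whatever is not stated is unknown (None), not false."""
    return True if fact in facts else None

gravity = ("has", "earth", "gravitation")
print(holds_closed_world(gravity))  # the closed-world system denies gravitation
print(holds_open_world(gravity))    # the open-world system does not know
```

The sketch shows why the closed-world assumption is only safe when the knowledge base is complete for the domain it is meant to describe.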

A problem for knowledge representation formalisms is that expressive power and efficient deductive reasoning trade off against each other. If a formalism has great expressive power, it can describe a wide range of different information, but inference over it becomes hard. An example is second-order logic. If, on the other hand, a formalism is restricted, it has a poor range of what it can describe but allows efficient inference. An example is propositional logic restricted to Horn clauses. A Horn clause is a disjunction of literals with at most one positive literal. This fragment has a very good decision procedure but cannot express generalisations. Horn clauses, in their first-order form, are also the basis of the logic programming language Prolog. So the formalism has to be tailored to the application of the KR system. This is achieved by a compromise between expressiveness and deductive complexity: in order to gain deductive power, expressiveness is sacrificed, and vice versa.
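Why propositional Horn clauses have a good decision procedure can be illustrated with forward chaining. A Horn clause with exactly one positive literal can be read as a rule "if all atoms in the body hold, then the head holds", and a fact is a rule with an empty body. The atom names are hypothetical, and the simple loop below is a sketch rather than an optimised algorithm.

```python
# Each rule is (body, head): if all atoms in the body hold, the head holds.
# Facts are rules with an empty body. The atom names are hypothetical.
rules = [
    (set(), "is_car"),
    ({"is_car"}, "has_wheels"),
    ({"is_car"}, "has_engine"),
    ({"has_engine", "has_fuel"}, "can_drive"),
]

def forward_chain(rules):
    """Repeatedly apply rules until no new atom can be derived."""
    known = set()
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in known and body <= known:
                known.add(head)
                changed = True
    return known

def entails(rules, query):
    """Decide whether the query atom follows from the rules."""
    return query in forward_chain(rules)

print(entails(rules, "has_wheels"))
print(entails(rules, "can_drive"))
print(entails(rules + [(set(), "has_fuel")], "can_drive"))
```

Each pass of the loop either derives a new atom or stops, so the procedure always terminates. The price is the limited expressiveness noted above: a propositional rule such as "is_car implies has_wheels" speaks about one fixed atom and cannot state a generalisation over all cars.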

With the growth of the field of knowledge bases, many different standards have been developed. They all have different syntactic restrictions. To allow intertranslation, different "interchange" formalisms have been created. One example is the Knowledge Interchange Format which is basically first-order set theory plus LISP (Genesereth et al. 1992).

Applications of Knowledge Representation

Computational knowledge representation is mostly not used as a model of cognition but to make pools of information accessible, i.e. as an extension of database technology. In these cases general rules and models are not needed. The information is stored in the form of sentential knowledge, that is, knowledge saved in the form of sentences comparable to propositions and program code. Knowledge is seen as a reservoir of useful information rather than as supporting a model of cognitive activity. With growing storage media and available memory, it has become feasible to use "compute-intensive" representations that simply list all the particular facts rather than stating general rules. These allow the use of statistical techniques such as Markov simulation, but seem to abandon any claim to psychological plausibility.

Artificial Intelligence

Artificial intelligence (AI) is intelligence exhibited by an artificial system, which is generally taken to be a computer. Intelligence is created and built into a machine so that it can do work as human beings can. Fields that use artificial intelligence include expert systems, computer games, fuzzy logic, artificial neural networks and robotics. Many things that seem difficult for human intelligence are relatively unproblematic for computer science, for example transforming equations, solving integral equations, or playing chess or backgammon. On the other hand, things that seem to demand little intelligence from humans are still difficult to realise in computer science, for example object and face recognition or playing soccer.

Although AI has a strong connotation of science fiction, it forms a very important branch of computer science, dealing with intelligent behaviour, learning and adaptation in machines. Research in AI involves building machines that automate tasks requiring intelligent behaviour. Examples include control, planning and scheduling, the ability to answer diagnostic and customer questions, and handwriting, speech and face recognition. These have become separate disciplines that focus on providing solutions to real-life problems. AI systems are now often used in economics, medicine, engineering and the military, and are built into many home computer and video game applications. AI aims not only to understand what an intelligent system is, but also to construct one. There is no satisfactory definition of 'intelligence'. Two common attempts are: 1. intelligence is the ability to acquire knowledge and use it; 2. intelligence is what is measured by an intelligence test.

Broadly speaking, AI is divided into two camps, conventional AI and computational intelligence (CI). Conventional AI mostly involves methods now classified as machine learning, which are characterised by formalism and statistical analysis. It is also known as symbolic AI, logical AI, neat AI and GOFAI (Good Old-Fashioned Artificial Intelligence). Its methods include:

1. Expert systems: apply reasoning capabilities to reach conclusions. An expert system can process a large amount of known information and provide conclusions based on it.
2. Case-based reasoning
3. Bayesian networks
4. Behaviour-based AI: a modular method for building AI systems by hand

Computational intelligence involves iterative development or learning (e.g. tuning parameters, as in connectionist systems). This learning is based on empirical data and is associated with non-symbolic AI, scruffy AI and soft computing. The main methods include:

1. Neural networks: systems with very strong pattern recognition capabilities
2. Fuzzy systems: techniques for reasoning under uncertainty, which have been used extensively in modern industrial and consumer product control systems
3. Evolutionary computation: applies biologically inspired concepts such as populations, mutation and "survival of the fittest" to produce better solutions to a problem. These methods are mainly divided into evolutionary algorithms (e.g. genetic algorithms) and swarm intelligence (e.g. ant algorithms)

With hybrid intelligent systems, attempts have been made to combine these two groups. Expert inference rules can be generated through neural networks, or production rules through statistical learning as in ACT-R. A promising new approach, intelligence amplification, tries to achieve artificial intelligence in an evolutionary development process as a side effect of amplifying human intelligence through technology.
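The evolutionary idea mentioned above (a population, mutation and "survival of the fittest") can be illustrated with a very small sketch. Everything in it is an assumption made for illustration: the function name `evolve`, the bit-string genomes, the parameter values, and the toy fitness function (the number of 1-bits). It is not meant as a faithful implementation of any particular genetic algorithm.

```python
import random


def evolve(fitness, genome_length=20, population_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Toy evolutionary loop: selection of the fitter half plus mutation."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    initial_best = max(fitness(genome) for genome in population)

    for _ in range(generations):
        # "Survival of the fittest": keep the better half unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[:population_size // 2]
        # Each survivor produces one child whose bits may flip (mutation).
        children = [
            [1 - bit if rng.random() < mutation_rate else bit for bit in parent]
            for parent in survivors
        ]
        population = survivors + children

    best = max(population, key=fitness)
    return initial_best, best


# Hypothetical fitness: count the 1-bits in the genome.
initial_best, best = evolve(fitness=sum)
print(initial_best, sum(best))
```

Because the survivors are carried over unchanged, the best fitness in the population can never decrease from one generation to the next. Crossover between parents, which many genetic algorithms use, is deliberately left out to keep the sketch short.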

History of artificial intelligence

In the early 17th century, René Descartes argued that the bodies of animals were nothing but complicated machines. Blaise Pascal invented the first mechanical digital calculating machine in 1642. In the 19th century, Charles Babbage and Ada Lovelace worked on programmable mechanical calculating machines. Bertrand Russell and Alfred North Whitehead published Principia Mathematica, which revolutionised formal logic. In 1943 Warren McCulloch and Walter Pitts published "A Logical Calculus of the Ideas Immanent in Nervous Activity", which laid the foundation for neural networks.

The 1950s were a period of active effort in AI. Alan Turing introduced the "Turing test" as a way to operationalise a test of intelligent behaviour. The first working AI programs were written in 1951 to run on the Ferranti Mark I machine at the University of Manchester (UK): a draughts (checkers) program written by Christopher Strachey and a chess program written by Dietrich Prinz. John McCarthy coined the term "artificial intelligence" at the first conference devoted to the subject, in 1956. He also invented the Lisp programming language. Joseph Weizenbaum built ELIZA, a chatterbot that imitates a Rogerian psychotherapist.

During the 1960s and 1970s, Joel Moses demonstrated the power of symbolic reasoning for integration problems in the Macsyma program, the first successful knowledge-based program in mathematics. Marvin Minsky and Seymour Papert published Perceptrons, which demonstrated the limits of simple neural networks, and Alain Colmerauer developed the programming language Prolog. Ted Shortliffe demonstrated the power of rule-based systems for knowledge representation and inference in medical diagnosis and therapy, in what is sometimes referred to as the first expert system. Hans Moravec developed the first computer-controlled vehicle to negotiate cluttered obstacle courses autonomously.

In the 1980s, neural networks came into wide use with the backpropagation algorithm, first described by Paul John Werbos in 1974. In 1982, physicists such as Hopfield used statistical techniques to analyse the storage and optimisation properties of neural networks. The psychologists David Rumelhart and Geoff Hinton continued their research on neural network models of memory. In 1985 at least four research groups rediscovered the backpropagation learning algorithm, which has been applied successfully in computer science and psychology. The 1990s marked major achievements in many areas of AI and demonstrations of various applications. Most notably, Deep Blue, a chess-playing computer, defeated Garry Kasparov in a famous six-game match in 1997. DARPA stated that the costs saved by applying AI methods to the scheduling of units in the first Gulf War had repaid the US government's entire investment in AI research since 1950. The DARPA Grand Challenge, first held in 2004, is a race for a $2 million prize in which vehicles drive themselves, without communication with humans, using GPS, computers and sophisticated sensors, across several hundred miles of challenging desert terrain.

Hemispheric Distribution

After having dealt with how knowledge is stored in the brain, we now turn to the question of whether the brain is specialised and, if it is specialised, which functions are located where and which knowledge is present in which hemisphere. These questions can be subsumed under the topic “hemispheric specialisation” or “lateralisation of processing” which looks at the differences in processing between the two hemispheres of the human brain.

Differences between the hemispheres can be traced back as far as 3.5 million years. Evidence for this comes from fossils of australopithecines (ancient ancestors of Homo sapiens). Because these differences have been present for so long and have survived selective pressure, they are presumably useful in some way for our cognitive processes.

Differences in Anatomy and Chemistry

Although at first glance the two hemispheres look identical, they differ in various ways.

Concerning anatomy, some areas are larger, and their tissue contains more dendritic spines, in one hemisphere than in the other. An example is “Broca’s area” in the left hemisphere. This area, which is important for speech production among other things, shows greater branching than the corresponding area in the right hemisphere. Given the left hemisphere’s importance for language, with which we will deal later, this suggests that anatomical differences have consequences for the lateralisation of function.

Neurochemistry is another domain in which the hemispheres differ: the left hemisphere is dominated by the neurotransmitter dopamine, whereas the right hemisphere shows higher concentrations of norepinephrine. Theories suggest that modules specialised for particular cognitive processes are distributed over the brain according to the neurotransmitter they need. Thus, a cognitive function relying on dopamine would be located in the left hemisphere.

The Corpus Callosum

The two hemispheres are interconnected via the corpus callosum, the major cortical connection. With its 250 million nerve fibres it is like an Autobahn for neural data connecting the two hemispheres. There are in fact smaller connections between the hemispheres but these are little paths in comparison. All detailed higher order information must pass through the corpus callosum when being transferred from one hemisphere to the other. The transfer time, which can be measured with ERP, lies between 5 and 20 ms.

Historic Approaches

Hemispheric specialisation has been of interest since the days of Paul Broca and Karl Wernicke, who discovered the importance of the left hemisphere for speech in the 1860s and 1870s. Broca examined a number of patients who could not produce speech but whose understanding of language was not impaired, whereas Wernicke examined patients who showed the opposite symptoms (i.e. who could produce speech but did not understand anything). Both Broca and Wernicke found that their patients’ brains had damage to distinct areas of the left hemisphere.

Because in those days language was seen as the cognitive process superior to all others, the left hemisphere was believed to be superior to the right, a view expressed in the “cerebral dominance theory” developed by J.H. Jackson. The right hemisphere was seen as a “spare tire [...] having few functions of its own” (Banich, p. 94). This view was not challenged until the 1930s. In that decade and the following ones, research dramatically changed this picture. Of special importance for showing the role of the right hemisphere was Sperry, who conducted several experiments in 1974 for which he won the Nobel Prize in Physiology or Medicine in 1981.

Experiments with Split-Brain Patients

Sperry’s experiments were carried out with people who had a condition called “split brain syndrome” because they had undergone a commissurotomy. In a commissurotomy the corpus callosum is sectioned, so that communication between the hemispheres is severed in these patients. With his pioneering experiments, Sperry wanted to find out whether the left hemisphere really plays as important a role in speech processing as Broca and Wernicke had suggested.

Sperry used different experimental designs in his studies, but the basic assumption behind all experiments of this type was that perceptual information received at one side of the body is processed in the contra-lateral hemisphere of the brain. In one of the experiments the subjects, while blindfolded, had to recognise objects by touching them with only one hand. Sperry then asked the patients to name the object they felt and found that they could not name it when touching it with the left hand (which is linked to the right hemisphere). The question that arose was whether this inability was due to the right hemisphere being a mere “spare tire” or due to something else. Sperry therefore changed the design of his experiment so that patients had to show that they recognised the objects by using them in the right way. For example, if they recognised a pencil they would use it to write. With this changed design, no difference in performance between the two hands was found.

In a different experiment conducted by Sperry et al., the word “sky” was shown to one visual field and the word “scraper” to the other. The patients then had to draw the whole word they had seen with one hand. They were not able to synthesise the two parts into a skyscraper; instead they drew a scraper overlapped by some clouds. It was therefore concluded that each hemisphere took control of the hand to draw what it had seen.

Experiments with Patients with Other Brain Lesions

Other experiments have been conducted to gain more knowledge about hemispheric specialisation. They were carried out with epileptic individuals who were about to undergo surgery in which parts of one of their hemispheres were going to be removed. Before the surgery it was important to find out which hemisphere is responsible for speech in the individual. This was done using the Wada technique, in which a barbiturate is injected into one of the arteries supplying the brain with blood. Shortly after the injection, the contra-lateral side of the body is paralysed. If the person is still able to speak, the anaesthetised hemisphere is not responsible for speech production in this individual. From the results of this technique it has been estimated that 95% of all adult right-handers use their left hemisphere for speech.

Research with people who have brain lesions or have undergone a commissurotomy has some major drawbacks. The reason why they had to undergo such surgery is usually epileptic seizures. Because of this, it is possible that their brains are not typical or were damaged in other areas during the surgery. In addition, these studies have been performed with very limited numbers of subjects, so their statistical reliability might not be high.

Experiments with Neurologically Intact Individuals

In addition to experiments with patients with brain lesions, studies with neurologically intact individuals have been conducted to measure perceptual asymmetries. These are usually performed with one of three methods, namely the “divided visual field technique”, “dichaptic presentation” and “dichotic presentation”. Each of them again rests on the basic assumption that perceptual information received at one side of the body is processed in the contra-lateral hemisphere.

Highly simplified picture of the visual pathway.

The divided visual field technique is based on the fact that the visual field can be divided into the right visual field (RVF) and the left visual field (LVF). Each visual field is processed independently of the other in the contra-lateral hemisphere. The technique includes two different experimental designs: the experimenter can present one picture in just one of the visual fields and then let the subject respond to this stimulus, or show two different pictures, one in each visual field.

A problem that can occur using the visual field technique is that the stimulus must be presented for less than 200 ms because this is how long the eyes can look at one point without shifting of the visual field.

In the dichaptic presentation technique the subject is presented with two objects at the same time, one in each hand (cf. Sperry’s experiments).

The dichotic presentation technique enables researchers to study the processing of auditory information. Here, different information is presented simultaneously to each ear. Experiments with these techniques found that a sensory stimulus is processed 20 to 100 ms faster when it is initially directed to the specialised hemisphere for that task and the response is 10% more accurate.

Explanations for this include three hypotheses, namely the direct access theory, the callosal relay model and the activating-orienting model. The direct access theory assumes that information is processed in the hemisphere to which it is initially directed. This may result in less accurate responses if the initial hemisphere is the unspecialised one. The callosal relay model states that information initially directed to the unspecialised hemisphere is transferred to the specialised hemisphere via the corpus callosum. This transfer takes time and causes a loss of information. The activating-orienting model assumes that a given input activates the specialised hemisphere. This activation then directs additional attention to the side contra-lateral to the activated hemisphere, “making perceptual information on that side even more salient” (Banich).
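As a rough illustration of the callosal relay model only, the sketch below adds a transfer cost whenever a stimulus first reaches the unspecialised hemisphere. The function name `predicted_response_ms` and the processing time of 400 ms are hypothetical. The transfer time of 15 ms is simply one value from the 5 to 20 ms range given earlier in this chapter.

```python
def predicted_response_ms(initial_hemisphere, specialised_hemisphere,
                          processing_ms, transfer_ms):
    """Response time under the callosal relay model (toy version).

    If the stimulus first reaches the specialised hemisphere, only the
    processing time counts. Otherwise the information must first cross
    the corpus callosum, which adds the transfer time.
    """
    if initial_hemisphere == specialised_hemisphere:
        return processing_ms
    return processing_ms + transfer_ms


# Hypothetical verbal task for which the left hemisphere is specialised.
direct = predicted_response_ms("left", "left", processing_ms=400, transfer_ms=15)
relayed = predicted_response_ms("right", "left", processing_ms=400, transfer_ms=15)
print(relayed - direct)
```

Under the direct access theory, by contrast, no transfer term would be added. The difference between the two conditions would instead come from the unspecialised hemisphere processing the stimulus itself, more slowly or less accurately.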

Common Results

All the experiments mentioned above have some basic findings in common: the left hemisphere is superior at verbal tasks such as the processing of speech, speech production and the recognition of letters, whereas the right hemisphere excels at non-verbal tasks such as face recognition, tasks that involve spatial skills such as judging line orientation, and distinguishing different pitches of sound. This is evidence against the cerebral dominance theory, which regarded the right hemisphere as a spare tire. In fact both hemispheres are distinct and excel at different tasks, and neither can be omitted without a strong impact on cognitive performance.

Although the hemispheres are so distinct and are experts at their assigned functions, each also has limited abilities in performing the tasks for which the other hemisphere is specialised. The picture above gives an overview of which hemisphere gives rise to which ability.

Differences in Processing

Experiment on local and global processing with patients with left- or right-hemisphere damage

There are two sets of approaches to the question of hemispheric specialisation. One set of theories approaches the topic by asking the question “What tasks is each hemisphere specialised for?”. Theories that belong to this set attribute the hemispheres’ different levels of ability in higher cognitive skills to their different levels of ability in processing sensory information. One theory that belongs to this set is the “spatial frequency hypothesis”. This hypothesis states that the left hemisphere is important for fine detail analysis and high spatial frequencies in visual images, whereas the right hemisphere is important for low spatial frequencies. We have pursued this approach above.

The other approach does not focus on what type of information is processed by each hemisphere but rather on how each hemisphere processes information. This set of theories assumes that the left hemisphere processes information in an analytic, detail- and function-focused way and that it places more importance on temporal relations between information, whereas the right hemisphere is believed to go about the processing of information in a holistic way, focusing on spatial relations and on appearance rather than on function.

The picture above shows an exemplary response to different target stimuli in an experiment on global and local processing with patients who have right- or left-hemisphere damage. Patients with damage to the right hemisphere often show a lack of attention to the global form but recognise details without problems. For patients with left-hemisphere damage the reverse is true. This experiment supports the assumption that the hemispheres differ in the way they process information.

Interaction of the Hemispheres

Why is transfer between the hemispheres needed at all if the hemispheres are so distinct in functioning, anatomy and chemistry, and if the transfer takes time and degrades the quality of the information? The reason is that the hemispheres, although so different, do interact. This interaction has important advantages because, as studies by Banich and Belger have shown, it may “enhance the overall processing capacity under high demand conditions” (Banich). (Under low demand conditions the transfer does not make as much sense, because the cost of transferring the information to the other hemisphere is higher than the advantage of parallel processing.)

The two hemispheres can interact over the corpus callosum in different ways. This is measured by first computing performance of each hemisphere individually and then measuring the overall performance of the whole brain. In some tasks one hemisphere may dominate the other in the overall performance, so the overall performance is as good or bad as the performance of one of the single hemispheres. What's surprising is that the dominating hemisphere may very well be the one that is less specialised, so here is another example of a situation where parallel processing is less effective than processing in just one half of the brain.

A second way in which the hemispheres can interact is that overall performance is the average of the performance of the two individual hemispheres.

The third and most surprising way the hemispheres can interact is that, when performing a task together, they behave totally differently from when they perform the same task individually. This can be compared to the social behaviour of people: individuals behave differently in groups than they would by themselves.

Individual Factors Influencing Lateralisation

After having looked at hemispheric specialisation from a general point of view, we now want to focus on differences between individuals concerning hemispheric specialisation. Aspects that may have an impact on lateralisation are age, gender and handedness.

Age could be one factor that determines the extent to which each hemisphere is used for specific tasks. Researchers have suggested that lateralisation develops with age until puberty. If so, infants should not have functionally lateralised brains. Here are four pieces of evidence that speak against this hypothesis:

Infants already show the same brain anatomy as adults. This means that the brain of a newborn is already lateralised. Following the hypothesis that anatomy is linked to function, this means that lateralisation does not develop at a later period in life.

Differences in perceptual asymmetries, that is, in superior performance at processing verbal vs. non-verbal material in the different hemispheres, cannot be observed among children aged 5 to 13, i.e. children aged 5 process the material the same way 13-year-olds do.

Experiments with 1-week-old infants showed that they responded with more interest to verbal material when it was presented to the right ear than when it was presented to the left ear, and with more interest to non-verbal material when it was presented to the left ear. The infants’ interest was measured by the frequency of soother sucking.

Although children who have undergone hemispherectomy (the surgical removal of one hemisphere) do develop the cognitive skills of the missing hemisphere (in contrast to adults or adolescents, who can only partly compensate for missing brain parts), they do not develop these skills to the same extent as a child in whom the other hemisphere has been removed. For example, a child whose right hemisphere has been removed will develop spatial skills, but not to the same extent as a child whose left hemisphere has been removed and who thus still possesses the right hemisphere.

Handedness is another factor that might influence brain lateralisation. There is statistical evidence that left-handers have a different brain organisation from right-handers. 10% of the population is left-handed. Whereas 95% of right-handed people process verbal material in a superior manner in the left hemisphere, the figure for left-handers is lower: 70% of left-handers process verbal material in the left hemisphere, 15% process verbal material in the right hemisphere (the functions of the hemispheres are simply switched around), and the remaining 15% are not lateralised, meaning that they process language in both hemispheres. Thus, as a group, left-handers seem to be less lateralised. However, a single left-handed individual can be just as lateralised as the average right-hander.
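The percentages above can be combined into a rough population-wide estimate. The short calculation below is only a sketch: it assumes that everyone who is not left-handed counts as right-handed (ignoring ambidextrous people), and the variable names are chosen for illustration.

```python
# Figures taken from the paragraph above.
left_handed = 0.10
left_language_in_right_handers = 0.95
left_language_in_left_handers = 0.70

# Assumption: everyone who is not left-handed is treated as right-handed.
right_handed = 1.0 - left_handed

# Weighted average over the two handedness groups.
left_language_overall = (right_handed * left_language_in_right_handers
                         + left_handed * left_language_in_left_handers)
print(f"{left_language_overall:.1%}")
```

Under these assumptions, about 92.5% of the whole population would process verbal material mainly in the left hemisphere, so the group difference between left- and right-handers changes the overall figure only slightly.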

Gender is also believed to have an impact on hemispheric specialisation. In animal studies, it was found that hormones create brain differences between the genders that are related to reproductive functions. In humans it is hard to determine to what extent hormones really cause the differences and to what extent culture and schooling are responsible.

One brain area for which a difference between the genders has been observed is the corpus callosum. Although one study found that the corpus callosum is larger in women than in men, these results could not be replicated. Instead it was found that the posterior part of the corpus callosum is more bulbous in women than in men. This might, however, be related to the fact that the average woman has a smaller brain than the average man, so the bulbousness of the posterior section might be related to brain size and not to gender.

In experiments that measure performance in various tasks between the genders the cultural aspect is of great importance because men and women might use different problem solving strategies due to schooling.

Summary

Although the two hemispheres look like each other’s mirror images at first glance, this impression is misleading. On closer inspection, the hemispheres differ not only in their anatomy and chemistry but, most importantly, in their function. Although both hemispheres can perform all basic cognitive tasks, each is specialised for specific cognitive demands. In most people, the left hemisphere is an expert at verbal tasks, whereas the right hemisphere has superior abilities in non-verbal tasks. Despite their functional distinctness, the hemispheres communicate with each other via the corpus callosum.

This fact was exploited in Sperry’s experiments with split-brain patients. These stand out among experiments measuring perceptual asymmetries because they were the first to refute the hemispheric dominance theory, and they received recognition through the Nobel Prize in Physiology or Medicine.

Individual factors such as age, gender or handedness have little or no impact on hemispheric functioning.

References

Wilson, Robert A. and Keil, Frank C. (Eds.) (online version July 2006). The MIT Encyclopedia of the Cognitive Sciences (MITECS). Bradford Books.

Knowledge Representation

Goldstein, E. Bruce (2005). Cognitive Psychology - Connecting Mind, Research, and Everyday Experience. Thomson Wadsworth. Ch 8 Knowledge, 265-308.

Sowa, John F.(2000). Knowledge Representation - Logical, Philosophical, and Computational Foundations. Brooks/Cole.

Slides concerning Knowledge from: http://www.cogpsy.uos.de/ , Knowledge: Propositions and images. Knowledge: Concepts and categories.

Minda, J. P. & Smith, J. D. (2001). Prototypes in category learning: The effects of category size, category structure, and stimulus complexity. Journal of Experimental Psychology: Learning, Memory, & Cognition, 27, 775–799.

Hemispheric Distribution

Banich, Marie T. (1997). Neuropsychology - The Neural Bases of Mental Function. Houghton Mifflin Company. Ch 3 Hemispheric Specialisation, 90-123.

Hutsler, J. J., Gillespie, M. E., and Gazzaniga, M. S. (2002). The evolution of hemispheric specialisation. In Bizzi, E., Calissano, P. and Volterra, V. (Eds.) Frontiers of Life, Volume III: The Intelligent Systems. Academic Press: New York.

Birbaumer, Schmidt (1996). Biologische Psychologie. Springer Verlag Berlin-Heidelberg. 3. Auflage. Ch 24 Plastizität, Lernen, Gedächtnis. Ch 27 Kognitive Prozesse (Denken).

Kandel, Eric R.; Schwartz, James H.; Jessell, Thomas M. (2000). Principles of Neural Science. McGraw-Hill. 4th edition. Part IX, Ch 62 Learning and Memory.

Ivanov, Vjaceslav V. (1983). Gerade und Ungerade - Die Asymmetrie des Gehirns und der Zeichensysteme. S. Hirzel Verlag Stuttgart.

Green, David W.; et al. (1996). Cognitive Science - An Introduction. Blackwell Publishers Ltd. Ch 10 Learning and Memory (David Shanks).