Saturday, September 4, 2010

Precursors to Language

[added on Feb 2011]

Hominids experience reality and classify it. They classify what they see into visual objects that participate in visual scenes. Some of the visual scenes are action-scenes. Action-scenes involve other hominids visibly and audibly behaving, with their faces, eye-gaze and hand movements, and with extended vocalizations segmented into parts with different articulation. [The precursor to linguistic classification capacities is the segmental, and thus discrete, classification of display action-scenes. The breakthrough to linguistic classifications can work without vocalization, as demonstrated by the repeated emergence of Deaf sign languages in many communities throughout history. However, the typical case, which likely interacted with the biological evolution and cultural development of behavioral modernity, is that display action-scenes incorporate vocalizations classified phonemically. The breakthrough builds on prior cognitive capacities related to social behavior, conventionalized routines and intentionality. Behavior in the linguistic domain is unambiguously attained with the systematic intentionality of conventionalized phoneme sequences (or, in the less common case, conventionalized sign lexical items). The intentionality of linguistic actions is marked by a sharp demarcation between the display situation of the source signal and the described situation that becomes the cognitively accessible "content" of a linguistic display or utterance. The link is mediated by a shared lexical scheme of individuation, so that recovery of the source signal during the linguistic recognition performance of a listener depends critically on the conventionalized-arbitrary lexical association between phonemically-classified display and "cognitively established content". It is the shared lexical scheme that determines which pragmatically available significance of a vocalization display becomes established.
The established lexical senses extract a semantic level of meaning that is independent of the particular utterance situation, and indeed of the broader linguistic context of usage. For a sense to be established, it is sufficient to associate the phonemic form of a lexical item with a discrete "semantic contribution" in terms of other (more or less basic) lexical items in the scheme.]

[The level of "content" in the described situation is thus abstracted from particularities of the utterance display situations, with the important exception of indexicals. Content is in general practice transparently referential. (But see Perry's reflexive-referential theory of content, which analyzes the importance of additional levels of content to explicate indexicality and other reflexive phenomena, even proper names.)]

Hominids classify a visual scene as an action-scene if it involves a hominid (the Agent a) behaving in a known pattern of behavior [as mentioned by a verb], with specific success-conditions. For example, an a-grabs-f situation is one where an agent a moves parts of their body to make a significant change in the scene. Before the action, object f was located in the scene but not located in the hands of a. If the scene satisfies the conditions to be an a-grabs-f situation, the scene changes so that at a later time f is in the hands of a.


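Event: a-grabs-f, occurring over interval t
Prior sit: at t1, f is located in the scene and f is not in the hands of a
Resulting sit: at t2, f is in the hands of a
t1 part-of t, t2 part-of t, t1 before t2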
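The grab conditions described above can be sketched as a small check over two observed moments of a scene. This is only an illustration under our own assumptions: the Snapshot record, the is_grab function and the numeric times are hypothetical names and choices, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """What an observer registers at one moment (hypothetical representation)."""
    time: float
    located: frozenset  # individuals present in the scene
    held: frozenset     # (holder, object) pairs


def is_grab(a, f, prior, result, start, end):
    """Check the a-grabs-f conditions for an event spanning [start, end]."""
    # t1 part-of t, t2 part-of t
    within = start <= prior.time <= end and start <= result.time <= end
    # t1 before t2
    ordered = prior.time < result.time
    # Prior sit: f is in the scene but not in the hands of a
    prior_ok = f in prior.located and (a, f) not in prior.held
    # Resulting sit: f is in the hands of a
    result_ok = (a, f) in result.held
    return within and ordered and prior_ok and result_ok


before = Snapshot(1.0, frozenset({"a", "f"}), frozenset())
after = Snapshot(2.0, frozenset({"a", "f"}), frozenset({("a", "f")}))

print(is_grab("a", "f", before, after, 0.0, 3.0))  # True
print(is_grab("a", "f", after, before, 0.0, 3.0))  # False: wrong temporal order
```

The second call fails because the temporal ordering is reversed, which is the sense in which a grab is a change in the scene rather than a mere pair of states.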

We can say that an action-scene or event is the part of the world that can be classified according to such conditions. Let us say there is a hominid s that classifies the visual scene before it as an a-grabs-f scene. Then s can visually track the various individuals involved in the event, and track the n-ary condition [which underlies the sense of the verb] picked out by that classification, that is, the relations that individuals stand in (or not) in the scene. If s registers a scene as falling under the classification a-grabs-f, they can remember it as such. If a few days later they see a grabbing another object f2 of the same type F as f (a physical object, let us say food), they can classify the new scene as the same type of situation as earlier; call this situation-type a-grabs-F. The a-grabs-F situation-type is more general than the earlier a-grabs-f action-scene (a singular scene, involving fully identified individuals), because the grabbed participant is parametrized to a type of object F. Similarly, s can classify the actions of another hominid b in a similar way, according to a still more general situation-type A-grabs-F. [A is the type of some particular hominid who takes the role of an Agent in the action-situation so classified.]
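The step from a singular scene to increasingly general situation-types can be sketched by letting each role be either a fully identified individual or a parameter restricted to a type. The class names (Individual, Param, SituationType) and the kind labels "hominid" and "food" are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    name: str
    kind: str  # e.g. "hominid" or "food" (illustrative labels)


@dataclass(frozen=True)
class Param:
    """A role parametrized to a type of individual."""
    kind: str


def fits(slot, individual):
    """A parameter accepts any individual of its kind; otherwise require identity."""
    if isinstance(slot, Param):
        return individual.kind == slot.kind
    return slot == individual


@dataclass(frozen=True)
class SituationType:
    relation: str
    agent: object    # Individual or Param
    patient: object  # Individual or Param

    def classifies(self, relation, agent, patient):
        return (relation == self.relation
                and fits(self.agent, agent)
                and fits(self.patient, patient))


a = Individual("a", "hominid")
b = Individual("b", "hominid")
f = Individual("f", "food")
f2 = Individual("f2", "food")

a_grabs_f = SituationType("grab", a, f)                           # singular scene
a_grabs_F = SituationType("grab", a, Param("food"))               # object parametrized
A_grabs_F = SituationType("grab", Param("hominid"), Param("food"))  # agent too

print(a_grabs_f.classifies("grab", a, f2))  # False
print(a_grabs_F.classifies("grab", a, f2))  # True
print(a_grabs_F.classifies("grab", b, f2))  # False
print(A_grabs_F.classifies("grab", b, f2))  # True
```

Each generalization replaces one identified individual with a parameter, so every scene classified by the more specific type is also classified by the more general one.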

The world in which hominids live is full of regularities, and their ability to recognize and remember those regularities allows a troop of hominids to be successful in survival, reproduction and maintaining group cohesion. Some regularities are related to others. For example, the action of grabbing results in a situation where A-holds-F. For purposes of illustration, we do not treat hold as a significant change in a scene, but as a significant continuity or stative relation. An event that involves some A holding an F involves a stative relation rather than a proper change-based action (admittedly, the boundary between actions and statives can be fuzzy or arbitrary; at some earlier time, the agent must have acted to come to hold the object).

Event:
Prior sit:
Resulting sit:
t1 part-of t, t2 part-of t, t1 before t2

Still another situation-type related to the grab action is the catch action, where the object of type F was moving in the prior situation, and once it is caught and held it no longer moves. We can distinguish (at least) two variations of the catch action. If the object is of an animate type G (it is a hominid, or an animal), it is able to move on its own and avoid the catching action of the agent. The agent will often have to chase the object it wants to catch, and may have to use some instrument like a rope or net to restrain its movement. We will call this variant a catch-capture situation. If the object is of an inanimate type H, it is typically moving because it is falling through the air due to gravity. We call this variant with inanimate objects a catch-seize situation. We can characterize the conditions for a scene to be classified as a catch situation as follows:

• Event:
  o Sense1-event:
    Prior sit:
    Resulting sit:
    OR
      • <>
      • <>
    t1 part-of t, t2 part-of t, t1 before t2
  o Sense2-event:
    Prior sit:
      • <>
    Resulting sit:
      • <>
      • <>
    t1 part-of t, t2 part-of t, t1 before t2
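The two catch variants can be sketched as a single classification with a shared core (moving before, held and still afterwards) and a split on animacy. The sketch follows only the prose description of the catch action; the function name classify_catch and its boolean arguments are hypothetical simplifications, and instruments such as ropes or nets are left out.

```python
from typing import Optional


def classify_catch(object_animate: bool,
                   moving_before: bool,
                   held_after: bool,
                   moving_after: bool) -> Optional[str]:
    """Return the catch variant a scene falls under, or None if it is not a catch."""
    # Shared conditions: the object moved in the prior situation and is
    # held and no longer moving in the resulting situation.
    if not (moving_before and held_after and not moving_after):
        return None
    # Animate objects (type G) can evade the agent; inanimate ones (type H) cannot.
    return "catch-capture" if object_animate else "catch-seize"


print(classify_catch(True, True, True, False))    # catch-capture
print(classify_catch(False, True, True, False))   # catch-seize
print(classify_catch(False, False, True, False))  # None
```

The last call returns None because the object was not moving beforehand, so the scene is at most a grab, not a catch.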

Our hominid s can observe the scenes around her, and if they involve individuals of the appropriate type, s can classify situations that involve grabbing, holding and catching. We say that s has a long-term cognitive memory, with a scheme of individuation (for situations involving individuals and relations) that is attuned to precisely those types of situations, and many others. Our hominid can track scenes, classifying them as being of a certain situation-type, or as not being of that situation-type, using her working memory to judge whether a situation sit1 is of a certain type (for example: a-grabs-f) or not. We say that hominid s believes that sit1 is of Situation-Type-A if her working memory tracks the relevant individuals as standing in the relations specified in the conditions for Situation-Type-A.
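This notion of belief can be sketched as a containment test: s believes a situation is of a given type when every condition of that type appears among the facts her working memory is tracking. The encoding of a fact as a (relation, individuals, holds) tuple and the name believes are our own assumptions for the illustration.

```python
# A fact is (relation, individuals, holds); holds=False records that the
# relation is tracked as NOT obtaining. This encoding is an assumption.
def believes(working_memory, conditions):
    """True if every condition of the situation-type is tracked in working memory."""
    return set(conditions) <= set(working_memory)


a_holds_f = {("in-hands-of", ("f", "a"), True)}

memory_of_s = {
    ("located-in-scene", ("f",), True),
    ("in-hands-of", ("f", "a"), True),
}

print(believes(memory_of_s, a_holds_f))                           # True
print(believes(memory_of_s, {("in-hands-of", ("f", "b"), True)}))  # False
```

On this reading, a belief is relative to what is currently tracked: a condition that is true of the scene but not registered in working memory does not count toward the belief.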

Hominid s lives in a troop with other hominids, and classifies their actions in ways that are relevant to the continuing social life of the troop. A hominid can call the attention of others in the troop to a situation of a certain type, perhaps by a vocalization display, by shifting their gaze in a way that can be observed by others, or by pointing. We call these attention-directing displays referring actions. If the belief state of s is that the scene they ["they" = s and troop members who can observe her display] can see is significant because it is of a certain type, s can call attention to the scene or to the individuals involved by referring displays. Members of the troop are attuned to the situations that significant others find important, which matters for the cohesion of the group. Young hominids are socialized to be aware of the same types of situations as adult members of the troop, and develop the same sort of cognitive scheme of individuation as the others in the troop.

[In support of the idea of referring actions using monitored gaze-shift, we note that hominids apparently gain some selective advantage from having white scleras in their eyes. White scleras make a referring display that shifts the gaze from one location to another more prominent.]

This framework allows us to propose a scenario for the evolution of language among hominids.
