Thiessen (in press): In this paper, a strong argument is voiced in support of relatively domain-general statistical learning. Nevertheless, several cases are cited (e.g., babies are only poorly conditioned to fear a cloth; rats are poorly conditioned to use auditory cues as a sign of nauseating food) which indicate that not all learning situations are equal. My question is: what are the underlying neural/computational mechanisms responsible for better learning of some stimulus contingencies than others, when the statistical redundancies across the different contingencies are held constant (i.e., only the stimuli themselves are changed across experiments)?
One issue raised in the paper is whether different kinds of statistical information are learned by different learning mechanisms, and how to go about probing this question. To date, there has been little success in dissociating different kinds of statistical knowledge, because most species that have been tested appear to have both. However, imagine a scenario in which tests on a particular animal species showed that the animal had one mechanism and not another. How could you be sure? This is a particularly thorny question because, as discussed in my first question, some stimulus contingencies could be better suited to some types of learning, which could spuriously lead researchers to deny the presence of a particular mechanism. Other than convergent results from many tasks, are there any other ways to address this issue?
A quick thought about the "less is more" concept. If children are sensitive to transition probabilities, "less is more" would significantly reduce computational load: because the size of a transition probability matrix increases exponentially with the number of past states you condition on, it might be beneficial to only be able to consider the immediately preceding state. A toy illustration of this growth is sketched below.
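A minimal sketch of that growth (toy syllables and numbers of my own; nothing here is from the paper): with an inventory of V syllables, conditioning on k past states means estimating V^(k+1) probabilities, while first-order transitional probabilities remain cheap to estimate from a short stream.

```python
# Toy illustration: the transition table grows exponentially with the
# number of past states conditioned on (all names/numbers are invented).
from collections import Counter

syllables = ["pa", "bi", "ku", "go", "la", "ti"]  # toy inventory, V = 6
for k in (1, 2, 3):  # number of past states conditioned on
    print(f"conditioning on {k} past state(s): "
          f"{len(syllables) ** (k + 1)} probabilities to estimate")

# First-order transitional probabilities from a short toy stream:
stream = ["pa", "bi", "ku", "pa", "bi", "ku", "go", "la", "ti"]
pairs = Counter(zip(stream, stream[1:]))
contexts = Counter(stream[:-1])
tp = {(a, b): n / contexts[a] for (a, b), n in pairs.items()}
print(tp[("pa", "bi")])  # 1.0: "bi" always follows "pa" here
```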
Thiessen (in press): As the paper points out, there is an analogy between conditional and distributional statistics and between the concepts of posterior and prior probabilities in Bayesian inference. One principle of Bayesian learning is that the prior has greater influence under conditions of stimulus uncertainty (i.e., people fall back on their expectations when there is high uncertainty about what was actually perceived); a toy numerical illustration follows below. Is there evidence that distributional statistics play a greater role in statistical learning when there is noise in the stimuli, or when words go by too fast to parse easily? How would this affect children's language learning? Another issue is the role of motherese (the simplified speech directed at infants and small children) in language learning. What is the evidence that it is necessary for language learning? Could it be that motherese helps accelerate the learning of the conditional statistics of the language?
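A minimal numerical sketch of that principle (toy numbers of my own, not from Thiessen): with two candidate words, the noisier the acoustic evidence, the more the posterior is pulled toward the prior, i.e., toward distributional expectations.

```python
# Toy Bayes: P(word A | evidence) for two candidate words A and B.
def posterior_a(prior_a, like_a, like_b):
    num = prior_a * like_a
    return num / (num + (1 - prior_a) * like_b)

prior_a = 0.8  # word A is distributionally much more frequent

# Clear speech: the acoustics strongly favor word B, and the evidence wins.
print(posterior_a(prior_a, like_a=0.1, like_b=0.9))   # ~0.31

# Noisy or fast speech: the acoustics barely discriminate, the prior wins.
print(posterior_a(prior_a, like_a=0.45, like_b=0.55)) # ~0.77
```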
McClelland: The article says that the universal grammar model predicts that children will acquire certain language abilities "rapidly." This seems, however, to be a relative term (are we talking about a time span of a few hours, a few weeks, or a few months?). Is there a way to establish a concrete definition of how fast a process needs to be in order to lend support to the universal grammar hypothesis?
Indeed, defining "quickly" is a difficult thing to do. I think it would be reasonable to say that a period of a couple of months would be the maximum one could reasonably propose; however, there is much evidence showing that tasks can be learned in far less time than that.
Perhaps an alternative means of assessing the universal grammar vs. PDP explanations is necessary. For example, it is known that as a PDP network trains, it can frequently backpedal slightly (i.e., revert to a more erroneous state), whereas universal grammar does not explicitly predict such a phenomenon. Thus, I would suggest that if performance increases monotonically, this would provide at least some support for the universal grammar hypothesis. The sketch below shows such backpedaling in a toy network.
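A minimal sketch of the backpedaling claim (a generic toy network trained on XOR with plain stochastic gradient descent, not McClelland's actual model): log the whole-epoch error and count how often it rises from one epoch to the next.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # 2-4-1 network
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr, errors = 2.0, []

for epoch in range(300):
    for i in rng.permutation(len(X)):       # one pattern at a time (SGD)
        x, t = X[i:i + 1], y[i:i + 1]
        h = sigmoid(x @ W1 + b1)            # forward pass
        o = sigmoid(h @ W2 + b2)
        d_o = (o - t) * o * (1 - o)         # backprop of squared error
        d_h = (d_o @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_o; b2 -= lr * d_o.sum(0)
        W1 -= lr * x.T @ d_h; b1 -= lr * d_h.sum(0)
    out = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    errors.append(float(((out - y) ** 2).mean()))

ups = sum(b > a for a, b in zip(errors, errors[1:]))
print(f"error rose on {ups} of {len(errors) - 1} epoch-to-epoch steps")
```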
This is a question related to Pat Kuhl's talk. She mentioned that children with autism do not prefer "motherese" speech, and actually prefer a computer-generated voice. She also mentioned that infants learn new phonetic distinctions best from social interaction, as opposed to TV or audio. What sorts of interventions could be used to get children with autism to speak, since they can't pick up on social information? Does the preference for a computer-generated voice over motherese predict whether an autistic child will eventually be verbal or nonverbal?
When children are brought up bilingual and one of the languages has a different rule for sentence structure (like adjectives after nouns rather than before), do they show more grammatical errors in their native language?
Do you think it's better to teach children a second language in preschool or elementary school as opposed to high school? Is there any evidence to support either position?
Blair-- Actually, I think the UG explanation would have no trouble with child data showing kids moving from correct irregular forms at first, to overregularization, and then back eventually to correct production. All you'd need to suggest is that the qualitative change between the irregular and overregularized stages was the setting of a parameter in the LAD, which initiated use of the rule. Once the rule is in use, irregulars would be learned as usual.
I am a bit confused about how statistical learning is applied in language acquisition. Statistical learning implies building a Bayesian probability space that includes all expectations, weighted by their constituent probabilities — a template, if you will, for how things should behave on a normal day for the learning mechanism in question. Based on the input, there is then convergence on one small area of the Bayesian space corresponding to the output that best correlates with the input. In a Bayesian model, if the space is broad enough, or if there happens to be more than one area of convergence, a hybrid is usually produced as the output. An example is the face recognition mechanism: if I merge inputs from two distinct faces, the output will be neither face 1 nor face 2 but something that subjects usually describe as the two faces' brother. Similar results are observed in the motion perception system. I somehow do not see that happening in a language system. If, for example, we give the following ambiguous headlines as inputs to the language system:
MINERS REFUSE TO WORK AFTER DEATH or PROSTITUTES APPEAL TO POPE
The system has to settle on only one meaning. Even if the context of the sentence is unfamiliar, the system has to choose one of the meanings, or at best alternate between them. Do we ever see hybrid meanings, or any sort of hybrid outputs, as we do with other systems (e.g., motion perception, face perception)? The contrast is sketched below.
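A minimal sketch of the contrast (my own framing and toy numbers, not from any paper): in a continuous domain, a bimodal posterior can be read out as a weighted average, yielding a hybrid; over discrete sentence meanings, the system can only commit to one reading at a time, or alternate by sampling.

```python
import random
random.seed(1)

# Continuous domain (motion perception): a bimodal posterior over
# directions can be averaged, producing a hybrid percept.
directions = {45.0: 0.5, 135.0: 0.5}  # degrees -> posterior probability
print(sum(d * p for d, p in directions.items()))  # 90.0, a blend

# Discrete domain (sentence meaning): the two readings of the headline
# cannot be averaged, so the system commits to one, or alternates.
readings = {"miners kill themselves after a death": 0.5,
            "miners stop working after a death": 0.5}
for _ in range(3):
    print(random.choices(list(readings), weights=list(readings.values()))[0])
```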
How would Chomsky explain how people read long, complex sentences, such as the paragraph-long sentences in GRE reading passages? I don't think one really needs to recover the structure of the whole sentence, and I don't think doing so is even possible, given working memory limits.
I was wondering whether there are differences in language learning due to differences in linguistic features. Although the statistical learning mechanism seems to be robust, some features, like rhythmic stress, might help to distinguish word boundaries. For example, Korean does not have stress accent in speech, whereas English does, and Chinese even has lexical tone. Would there be differences in the rate of learning corresponding to such differences in features?
Also, for infants in a bilingual (or multilingual) environment, the experience they receive would be different from a monolingual environment: the transitional probabilities might be weakened by the greater variety of words; a toy illustration of this dilution is sketched below. In this case, would it cause infants/children to develop language more slowly than those who grow up in a monolingual environment?
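A minimal sketch of the dilution idea (toy syllables; the assumption that a syllable is shared across the two languages is mine): the within-word transitional probability of the same word drops once the stream is mixed with a second language.

```python
from collections import Counter

def tp(stream, a, b):
    """Transitional probability P(b | a) estimated from a syllable stream."""
    pairs = Counter(zip(stream, stream[1:]))
    return pairs[(a, b)] / Counter(stream[:-1])[a]

mono = ["pa", "bi", "ku"] * 6                            # L1 word "pabiku"
mixed = ["pa", "bi", "ku"] * 3 + ["pa", "go", "la"] * 3  # plus L2 word "pagola"
print(tp(mono, "pa", "bi"))   # 1.0 in the monolingual stream
print(tp(mixed, "pa", "bi"))  # 0.5 once "pa" also begins an L2 word
```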
Domain-general theorists explain children's superior language learning relative to adults by positing that adults' perception and memory of language are just too good and hinder performance — evidence comes from studies showing that adults learn language better while distracted. If adult language learning is facilitated by distraction, then wouldn't domain-general theorists advocate for instruction that takes place while students are distracted? This would be an entirely different way to approach college language courses than what currently exists. Also, is there any evidence that language learning is facilitated by listening to a tape of someone speaking a different language while sleeping?
Also, in reference to Saffran et al., it seems important to me to test whether statistical learning extends beyond language before claiming that it is not an experience-independent mechanism. Was there a control condition that looked at domains outside language and established that statistical learning in those domains was comparable to statistical learning in language?
In Chapter 6, the personal labels "mommy" and "daddy" are mentioned as one of the earliest examples of recognition of familiar sounds. Six-month-old infants did not generalize those labels to other women and men; however, those individuals eventually acquire the knowledge that personal labels such as "mommy" and "daddy" can refer to anybody's mother or father. I have a question about the mechanism of acquisition for personal labels, including proper names and personal pronouns. Personal pronouns, such as "I" and "you", alternate in whom they refer to according to the context. Richard et al. (1999) found a positive correlation between visual perspective-taking skills (such as those seen in Piaget's Three Mountains Problem) and pronoun acquisition. In addition to visual perspective-taking skills, the concept of self-awareness must be necessary to understand the meaning of personal proper names and personal pronouns. Are there any other cognitive abilities necessary to acquire personal labels?
In Saffran's paper, the author left open the question of whether the observed statistical learning reflects a language-specific mechanism or a general learning mechanism. Is there any way to determine this? Can connectionist accounts make the same argument?
I am not certain about Pinker's claim. What evidence did Pinker employ to claim the existence of grammatical categories? In sum, if there is an innate language device, which aspects of language acquisition is it reasonable to believe are innate? Or, instead of dividing language into components, should we find another way to approach the innate vs. non-innate debate? In the connectionist account, is the structure of the model considered innately determined?
I could see how that could be used to argue for a relatively slow monotonic increase in performance, but not for the highly variable, non-monotonic data described in the McClelland paper.
If my suggestion can't separate rule-based from PDP explanations, then what can?
In response to Valentinos' question about statistical learning resulting in hybrids of multiple meanings of ambiguous words: I think statistical learning applies best to early language learning — how to parse strings of sounds into words and sentences and to assign meaning. Later, more complex language learning, which can account for multiple meanings of ambiguous words, is implemented through different learning systems. Therefore, I think it may be possible for infants to produce a hybrid of sounds/meanings in an ambiguous situation, but this is not the case for the more complicated ambiguity in adult speech.
To Jamie: A study described in Chapter 6 examined ultimate grammatical competence in a second language as a function of age of arrival in the country where it is spoken. The graph shows that ultimate grammatical competence starts declining for arrivals after age 7. This suggests that children have a much better chance of success at second-language learning if they start before age 7 — which would suggest languages should start being taught in kindergarten or first grade!
I imagine that if you are learning a second language with a very similar grammatical structure, you would be able to achieve good competence even if you start learning the language later in life.
This is just a repost of my question from last week, but I thought Jamie might find it interesting, relating to her first question:
On the link between action and perception: Jana Iverson at Pitt conducts research on the relationship between movement and language learning. She and her colleagues have found that movement is a necessary component of language acquisition. During the babbling stage there is much flailing of the arms and bouncing. Later in development, when a child is close to producing their first two-word sentence, they can be observed to first use gesture in place of one of the words. For example, the progression might be: Pre-transition: "Bop". Transition: *point* "Bop". Post-transition: "Want Bop".
Clearly, movement is incorporated in language development, but how important is it? Interestingly, in a separate study Iverson found that children who were likely to be autistic (i.e., had autistic siblings but were too young to be diagnosed themselves) did not display as much movement during the babbling stage as typical children, and also experienced language development delays. This is a bit of a chicken-and-egg problem, as we do not know whether the language delays are due to the developmental disorder itself, to the attenuated movement, or to some interaction between them. It is also possible that the mothers' parenting behavior was somehow changed as a result of having an older autistic child, and this somehow influences the progression of language development in the younger child.
In any case, I have since wondered about the relationship between gesture and language development and just how important it is. For example, if a parent encouraged movement during babbling, would this help language acquisition? Conversely, if a parent discouraged movement (probably unknowingly), would this cause delays or make language learning more difficult?
How far back in development is the gesture-vocalization link important to later language development? For example, when babies cry and flail around, many parents swaddle their child — wrap them tightly in a blanket. This minimizes movement and soothes the baby, with the result that they stop crying. But is it possible that this crying and flailing is actually an important developmental activity, and that inhibiting it may be somewhat harmful?
Erika, have you heard of baby signs? Haha, I know researchers from UC Davis developed it! It was in Meet the Fockers, I think. Anyways, it's a program where you teach pre-verbal infants sign language. I vaguely remember that the researchers did a randomized experiment and found that baby-sign infants had higher IQs at age 8 than non-baby-sign infants. Kinda interesting, and perhaps relevant to your question... I hope. -bryan
Here's the YouTube video of the part in Meet the Fockers... hope this works... if not, here's the link: http://www.youtube.com/watch?v=JgJ9dXGtXns&feature=related
Bryan - I have heard of that! I don't know too much about it, but actually I have been wondering if baby signing has an effect on later language development. If a baby already has a somewhat sufficient way to communicate, perhaps it feels less motivated to follow the normal developmental trajectory. Or, it could be that because it can actively communicate, some parts of the language system might have a head start in development. Did they report any abnormalities in latency of language development in baby signers? Erika
In the learning-constraints response to the riddle of induction (Siegler & Alibali, Chapter 6, p. 201), the authors explain that children learn the meanings of words from context by considering only a few logical hypotheses about what an adult might mean when he/she says a word. One constraint is the whole-object constraint: a child assumes that a label applies to the whole object being referred to. For example, if an adult points to a dog and says "dog", the child will assume the label "dog" applies to the whole object. Likewise, if the child has never heard the label "dog" before, and an adult pointed to a dog and said "fur", the child would assume the label "fur" applies to the whole dog.
Another constraint is the mutual exclusivity constraint: if a child is shown two objects and hears a label word, the child will assume that the new label applies to the object that is NOT the one it already knows a label for. The final constraint is the taxonomic constraint, under which the child assumes the label refers to a basic-level kind (like a whole animal called a dog, rather than a part of the animal like a leg or fur).
My question is: how can these constraints account for the situation where a child sees a single object for which it already knows a label, yet is presented with a different label? For example, show a child a picture of a dog. If the child already knows that the word "dog" applies to it, then under the learning constraints it assumes that "dog" means the whole animal. Assuming the child knows this much, show him/her a dog and say "fur". Under the learning-constraints hypothesis, what does the child do with the label "fur"?
Will the child assume it is a specific feature label, because the whole-object slot is already filled? A toy formalization of the constraints is sketched below.
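A minimal sketch formalizing the three constraints as an ordered decision procedure (entirely my own formalization, purely illustrative; the function name and the fallback in the last branch are assumptions, not claims from Siegler & Alibali):

```python
def interpret_label(label, objects, known_labels):
    """objects: whole objects in view; known_labels: objects the child can already name."""
    unlabeled = [o for o in objects if o not in known_labels]
    if len(objects) > 1 and unlabeled:
        # Mutual exclusivity: the new label goes to the unlabeled object.
        return f"'{label}' = the whole {unlabeled[0]}"
    if objects[0] not in known_labels:
        # Whole-object + taxonomic: a label for a novel object names the
        # whole, basic-level thing, not a part or property of it.
        return f"'{label}' = the whole {objects[0]}"
    # The puzzle above: the whole-object slot is already filled, so one
    # guess is that the child falls back to a part/property reading.
    return f"'{label}' = a part or property of the {objects[0]}?"

known = {"dog"}
print(interpret_label("dax", ["dog", "wug"], known))  # mutual exclusivity -> wug
print(interpret_label("dog", ["dog"], set()))         # whole object -> the dog
print(interpret_label("fur", ["dog"], known))         # underdetermined fallback
```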