Experiments

Soliciting Donations

Examples of Experiments on Important Social Topics 
An Illustrative Social Psychology Experiment 
Remaining Questions
Key Considerations in Experiments
Laboratory Versus Field Experiments
Research Participants 
Control of Conditions
Experimental Procedures 
Experimental Artifacts
Results Obtained
Internal Validity and External Validity 
Long-Term Effects
Ethical Issues
The Need for an Experimenting Society 
Suggested Readings

It is the major objective of an experiment, lab or field, to have the greatest possible impact on a subject within the limits of ethical considerations and requirements of control.
-Aronson, Brewer, & Carlsmith (1985, p. 482)

Suppose that you are going door-to-door, soliciting donations for the Heart Association or the Cancer Society. In order to collect the maximum amount for your organization, you want to make the best possible opening statement when people come to the door. What should you say? Should you emphasize that their neighbors have given? Or request a set amount of money as a goal? Or ask for any contribution, no matter how small? Or what?
Most people handle such situations by following their hunches, or using a rule-of-thumb handed down by some more experienced collector. However, to get a really valid answer to such questions, the best method would be to conduct an experiment, trying out several different solicitation approaches under controlled conditions. In this chapter we will describe in some detail one such experiment on soliciting donations, using it as a springboard for analyzing the characteristic strengths and weaknesses of social psychology experiments.

Examples of Experiments on Important Social Topics
Many recent experiments have had important social applications. In this respect the situation has changed markedly since Meltzer (1972) lamented the lack of high-quality applied social research. Here are some examples of interesting experiments on a wide variety of important topics:
• Effects of erotic and violent films on viewers' aggressiveness (Sinclair, Lee, & Johnson, 1995)
• Effects of a product labeling system on environmentally conscious buying by supermarket shoppers (Linn, Vining, & Feeley, 1994)
• Increasing drivers' use of seat belts (Jonah, Dawson, & Smith, 1982)
• Reducing HIV risk behaviors among injection drug users over a 2-year period (Latkin et al., 1996)
• Improving the reading and mathematics performance of poor elementary school children over several years (Bushell, 1978)
• Effects of beer commercials on drinking expectations among 3rd, 5th, and 8th grade children (Lipsitz et al., 1993)
• Reducing the amount of welfare payments by giving recipients job training and assistance in job searches (Friedlander & Burtless, 1995)
• Increasing blood donations among high school students (Sarason et al., 1992)
• Increasing long-term energy conservation (Pollak, Cook, & Sullivan, 1980)
• Increasing the amount of money contributed to charities (Aune & Basil, 1994; Cialdini & Schroeder, 1976)

We will focus our analysis on the last of these topics.
Historically, experiments have been the most frequent research method used by social psychologists, and also the one with the highest prestige. To summarize their key characteristics, experiments are studies in which there is a high degree of planned manipulation of conditions by the investigator, and research participants (often termed subjects) are randomly assigned to groups. Random assignment is essential to an experiment because it creates separate groups of research participants that should be nearly identical on any variable: height, weight, self-esteem, intelligence, aggressiveness, and so on.
In an experiment the researcher manipulates (regulates the levels of) one or more variables, controls or holds constant other variables that might otherwise have an effect, and measures or observes the levels of one or more variables that are expected to be affected by the manipulation. The manipulated variables are called independent variables, and the variables expected to be influenced by the manipulation are called dependent variables. If random assignment produces essentially identical groups to start with, and other procedures are conducted carefully, we can conclude that any difference between the groups on the dependent variable must be caused by the manipulation.

Experiments typically exceed all other research methods in their degree of control of conditions and in the precision of measurement possible. As a result, experiments are better than other methods for determining the direction of causal relationships. Because the researcher establishes and controls all the experimental conditions, the obtained results are more likely to be due to those conditions than to some unknown or uncontrolled factor. On the other hand, experiments rarely attempt to simulate real-world conditions in any detail; instead they aim at isolating one or a few factors that may be important causes of some effect from the multitude of other factors and conditions with which they are inextricably mixed in real life. As a result, two major disadvantages of experiments for the study of social behavior are artificiality (in the sense that the conditions do not occur outside the laboratory) and doubt about the generalizability of results. A detailed look at a carefully done social psychological experiment will reveal other common characteristics.
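The equating power of random assignment can be illustrated with a short simulation. This is a minimal sketch in Python; the "aggressiveness" scores, group sizes, and seeds are hypothetical illustrations, not data from any study. Shuffling a subject pool and dealing it into groups tends to balance the groups on any trait, measured or not.

```python
import random
import statistics

def random_assign(subjects, n_groups=2, seed=42):
    """Randomly assign subjects to n_groups of (nearly) equal size."""
    pool = subjects[:]                 # copy so the original list is untouched
    rng = random.Random(seed)
    rng.shuffle(pool)
    # Deal the shuffled subjects into groups round-robin
    return [pool[i::n_groups] for i in range(n_groups)]

# Hypothetical "aggressiveness" scores for 200 subjects
rng = random.Random(0)
scores = [rng.gauss(50, 10) for _ in range(200)]

groups = random_assign(scores)
means = [statistics.mean(g) for g in groups]
# With random assignment, the group means should be close to each other,
# so any later difference can be attributed to the manipulation.
```

With larger samples, the group means converge even more closely, which is why randomized experiments with adequate sample sizes need no matching on individual characteristics.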

An Illustrative Social Psychology Experiment
A study by Cialdini and Schroeder (1976) was aimed at testing some techniques that might increase the contributions that charities could collect in door-to-door campaigns. From their own personal experience with solicitors for charities, the authors realized that requests which allowed and encouraged even small donations were particularly hard to turn down (Cialdini, 1980). 
This phenomenon is related to the foot-in-the-door effect (see Chapter 2), in which small initial requests often produce greater compliance to later larger requests (Dillard, 1991; Freedman & Fraser, 1966). However, a request for small donations might succeed in increasing the number of donors but fail to increase the total amount of money collected because most of the contributions would be small ones. Most charitable organizations would not view such an outcome as a success.

The authors theorized that a solicitation strategy which legitimized minimal contributions but did not directly request them might increase the number of donors without reducing the size of the average donation. If so, that would clearly produce a greater total collection.
Accordingly, the experimenters designed a solicitation approach featuring the key words, "Even a penny will help," at the end of the standard request format. 
This was a small change, but one with great theoretical importance, since it should make it harder for people to refuse to donate. An initial experiment compared this approach with the standard appeal format in a door-to-door solicitation for donations to the American Cancer Society. Results showed that the "even a penny" approach was highly successful.
However, other experimental conditions were needed in order to investigate other possible explanations of the hypothesized effect. In order to test whether the legitimization of minimal contributions was crucial, a third condition substituted the statement, "Even a dollar will help." A dollar was not a minimal amount, since it had been the median amount given by those who did contribute in the initial experiment, and therefore this condition was expected to make it easier for people to refuse to donate. A fourth condition tested whether the key factor might be the apparent need of the charitable organization. That is, the statement "Even a penny will help" might not just legitimize small donations, but it might also suggest that the American Cancer Society was currently very hard up for funds. To avoid that connotation while retaining the legitimization of small donations, the fourth condition gave information that people could view as a model for their own behavior, using the wording, "We've already received contributions ranging from a penny on up." These four conditions comprised the key theoretical manipulations in the experiment.

Through the cooperation of the local branch of the American Cancer Society, the solicitors were provided with official identification badges, information pamphlets, and donation envelopes. The solicitors were four pairs of college-age research assistants, a male and a female in each pair, who were unaware of the hypotheses of the research. They went to a middle-income suburban housing area that had not been recently canvassed by the Cancer Society, at times when both men and women were likely to be home (late afternoons, evenings, and weekends). When the first adult came to the door, the solicitor of the same sex as the householder gave the appeal for funds, using one of the four prescribed formats, recited verbatim. The four pairs of solicitors contacted 123 subjects, 30 or 31 in each condition. In each case they recorded whether the subject made a contribution and, if so, the size of the contribution.

Results Obtained
Table 4-1 displays the results of the experiment for frequency and amount of donations. The gender of the target subjects was also considered as a factor, but no significant differences were found, so results for men and women were combined.
The pattern of findings in Table 4-1 was exactly as expected: the even-a-penny and social legitimization conditions were markedly higher, both in percentage of donors and in total donations. To verify the statistical significance of this pattern, three orthogonal comparisons were conducted on both dependent variables (the frequency and total amount of donations). For both variables there was no significant difference between the first two conditions, control and even-a-dollar, nor between the last two conditions, social legitimization and even-a-penny. But, as predicted, for both variables the last two conditions combined were significantly higher than the first two conditions combined (p < .02 and p < .05, respectively). Also as predicted, there was no significant difference in the size of the average donation actually given in the four conditions.
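The logic of comparing the two combined conditions against the other two can be illustrated with a two-proportion z-test, a standard way to compare donor rates between pooled groups. This is a sketch with hypothetical donor counts, not the actual Table 4-1 data, and it is not necessarily the exact contrast analysis the authors used.

```python
import math

def two_prop_z(donors_a, n_a, donors_b, n_b):
    """Two-proportion z-test: do groups A and B differ in donor rate?"""
    p_a, p_b = donors_a / n_a, donors_b / n_b
    p_pool = (donors_a + donors_b) / (n_a + n_b)      # pooled donor rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value
    return z, p_value

# Hypothetical counts: combined legitimization conditions (A)
# vs. combined control and even-a-dollar conditions (B)
z, p = two_prop_z(32, 61, 20, 62)
# Here z exceeds 1.96, so the difference in donor rates is
# significant at the .05 level, mirroring the pattern reported above.
```

The same function could be applied to any pair of pooled conditions; with the small per-condition samples in such field studies, only fairly large differences in donor rates reach significance.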
The practical implications of these results are clear-cut. The two conditions that legitimized paltry contributions yielded 55% more donors and 50% greater total donations than the other two conditions. Any charity would certainly be overjoyed to increase its revenues by that amount. However, even that difference may be something of an underestimate, for the control condition had one freak contribution of $10, which was $8 more than the second-highest contribution and accounted for nearly half the total revenues in that condition. If this atypical donation were removed, and the highest donation in each of the other conditions were similarly removed as being possibly unrepresentative, the average donations in the four conditions would be almost identical, but the number of contributions would still be very different. By such a procedure the last two conditions would be estimated to produce 66% larger total revenues than the first two conditions.

One other feature of the results might be mentioned. An additional 46 subjects were contacted during the same time period and in the same area, but instead of being requested to contribute, they were asked to answer a survey question about a charitable organization's apparent need for money. Each subject was read one of the four standard appeals used by the solicitors in this experiment and asked to rate the charity's apparent "need for money" on a 7-point scale. Though the even-a-penny condition displayed the highest mean score, as expected, the four conditions did not differ significantly, and this finding cast doubt on the hypothesis that the charity's need was an important basis for people's contributions.

Remaining Questions
Though this experiment was carefully done, not everything in the report is crystal clear. One ambiguity concerns the four pairs of solicitors and the four conditions. Did each pair use only one of the four appeals, or did each pair use all four appeals in a preselected order? The former procedure would be somewhat simpler in making arrangements and in training the solicitors, but it would be undesirable because it would confound the four conditions with the characteristics of the solicitors. Thus, if one pair of solicitors was more attractive or better dressed or more persuasive or added other ideas to their appeal for funds, such differences would be mistaken for an effect of the superiority of one of the standard appeals. By contrast, if each pair of solicitors used all four appeals, preferably in a preselected random order, any differences between the solicitors would be spread across all four appeals and would not contribute to any differences between the appeals. Presumably, this was what was actually done in the experiment, but the report does not mention it, so we can't be sure.

Another set of questions has to do with the generality of the findings. There is certainly no question that they apply to real-life situations, since they were obtained in an actual door-to-door solicitation procedure. 
There might be some doubt about geographical and social-class generalizability. Probably these results, obtained in an Arizona middle-income suburb, would be applicable to other U.S. communities like a Chicago suburb or an inner-city neighborhood in Los Angeles, even though the absolute level of giving might be different. In fact, several subsequent studies have replicated the basic finding in U.S. suburban neighborhoods (Weyant & Smith, 1987). There would, however, be more concern about the generality of these results in other countries, a topic that would take much research to verify.
A related question is the extent to which the results found by Cialdini and Schroeder would apply to other situations, for example, a campaign seeking large contributions from local businesses or rich citizens. If the results did generalize, what would be the best minimal contribution to mention? Probably not a penny, and perhaps not a dollar; some larger amount, such as $10, might be minimal for well-to-do donors. Similarly, would the technique apply to solicitations by telephone or by mail? The only way to know would be to try it out under these various circumstances. In fact, Weyant and Smith (1987) did so in a mail solicitation for the Cancer Society to a high-income neighborhood, using $5, $10, and $25 as the "small" options suggested to one group of residents, and $50, $100, and $250 as the "large" options suggested to another group. Consistent with the basic principle of the even-a-penny experiment, these authors found that the "small" suggested donations yielded well over twice as many donors without decreasing the average size of donations (which was around $12), a very useful generalization of the earlier study's findings.
We can also raise more abstract questions about the nature of the process that produced the results of the even-a-penny study. Does the legitimization of tiny contributions work by eliminating people's excuses for not complying with the solicitor's request, or by pressuring them to avoid establishing a public image as someone who is especially unhelpful, or both? Which of these processes is more important, or are there some situations where one is dominant and some where the other one is primary? Although recent studies have begun to examine some of these questions, more experimental research is needed in order to fully explicate the conditions under which various solicitation procedures will effectively produce compliance, and the types of people who are likely to comply.
Would the same effect be found if the study were conducted in Germany, France, China, or Mexico? Although a cross-national replication of the study described above has not been conducted, cross-national research on the related foot-in-the-door technique suggests that the findings may have limited generality. Kilbourne (1989) conducted experimental studies on compliance in Paris, Frankfurt, and Amsterdam, and found a different pattern of results in each country.

Key Considerations in Experiments
All experiments require a great deal of preparation and planning, most of which is usually not apparent in the final research report. Just as with surveys, pretesting of every phase of the procedure is essential in order to avoid problems that could invalidate the obtained results. False starts and revisions of procedures are frequent, and practical problems of scheduling and logistics always occur. For instance, in the Cialdini and Schroeder experiment, weather, transportation, and scheduling problems must have occurred, though they are not mentioned in the report. A thorough and helpful guide for novice investigators on how to plan and conduct an experiment has been provided by Aronson, Brewer, and Carlsmith (1985).

Laboratory Versus Field Experiments
Laboratory experiments are apt to be considerably more convenient for the investigator because they are conducted on home ground, where the necessary equipment and subjects are often readily available. 
They frequently allow more control over conditions, more isolation of variables, and greater precision of measurement than field experiments. However, they generally have the offsetting disadvantage of being more artificial situations, and that can produce a number of undesirable artifacts which are discussed later in this chapter. A related problem is that both practical and ethical limitations often prevent the use of strong manipulations (for example, of fear, happiness, or anger) in laboratory studies, and consequently the effects that are obtained are apt to be rather weak as well. When nonsignificant findings result, they leave doubt about whether the experimental hypothesis was wrong or the manipulations were merely too weak to have a significant effect.
The laboratory experiment's frequent artificiality has been termed lack of mundane realism by Aronson et al. (1985), that is, a lack of similarity to any situation in the world outside the laboratory, as in giving subjects electric shocks. Despite being artificial in this sense, experiments can nevertheless have strong experimental realism, that is, convincingness and impact on the subject. Though this is a controversial issue, Aronson et al. (1985) argued that experimental realism is a more important consideration than mundane realism. As they pointed out,

The fact that an event is similar to events that occur in the real world does not endow it with importance. Many events that occur in the real world are boring and unimportant in the lives of the actors or observers. Thus, it is possible to put a subject to sleep if an experimental event is high on mundane realism but remains low on experimental realism. (p. 482)

Most authorities agree, however, that because of laboratory experiments' typical artificiality and frequently weak effects, their results should be taken back to the field and checked against the results of other types of research in real-world situations. Another reason for this procedure is the likelihood of interaction effects between variables studied in the laboratory and other variables that were not included. (Interaction effects are discussed in more detail later in this chapter.)
In any case, many experiments in applied social psychology, like the even-a-penny experiment, are field experiments, that is, ones conducted in a setting that is natural for the participants. Such experiments are likely to have mundane realism, and if well designed, they can have high experimental realism as well. In many field experiments the participants are not even aware that they are subjects in an experiment, since an event such as a solicitor coming to their door seems completely natural to them. However, field experiments pose their own unique problems for the investigators, such as finding an appropriate setting and getting permission to conduct research from the relevant authorities such as the police or an employer; in this case the researchers had to obtain the cooperation of the American Cancer Society.

Instructions are often used to manipulate one or more of the independent variables in an experiment. In the charity solicitation experiment, the four standard appeals were the main experimental manipulations. They were simple, but very effective. Even here the instructions had to be pretested, while in more complex studies they must be carefully checked for clarity and believability. In this experiment there was no deception of the participants, but in some studies, to conceal the real point of the research, the instructions have to include a false but convincing "cover story" about why the study is being done and what its topic area and purposes are.

+ + + +++ + ++ + + + ++ + + + + + + ++ + + ++ + + + ++ + ++ + + ++ + + + ++ + + ++ + ++ +++ +
Famous as a social researcher, and also as a writer and teacher, Elliot Aronson was born in 1932 in Massachusetts. He was attracted to psychology by Abraham Maslow at Brandeis, where he received his B.A. Subsequently, he earned an M.A. under David McClelland at Wesleyan University and a Ph.D. in social psychology under Leon Festinger at Stanford in 1959. After brief periods on the faculty at Harvard and at the University of Minnesota, in 1965 he was appointed Professor and Director of the Social Psychology Program at the University of Texas at Austin. In 1974 he moved to the University of California at Santa Cruz, where he remains.
     Aronson is widely known for his laboratory research on dissonance theory, attitude change, interpersonal attractiveness, group interaction, and social influence. His applied research on intergroup relations began when the public schools in Austin, Texas, were suddenly desegregated amidst rioting and turmoil. The resulting program of action research, focused on fostering cooperative relationships and decreasing prejudice among elementary school students, is described in Chapter 8. More recently Aronson has also conducted applied social psychological research aimed at persuading people to conserve energy and to protect themselves from AIDS and other sexually transmitted diseases.
     Among his many honors, Aronson has received the American Association for the Advancement of Science's Prize for Creative Research, the APA's National Media Award for writing, and its Distinguished Teaching Award. He served as co-editor of the massive Handbook of Social Psychology in 1968-1969 and in 1985, and he has authored texts on social psychology and on experimental research. Among his other well-known volumes are The Jigsaw Classroom and Burnout.

+ + + +++ + ++ + + + ++ + + + + + + ++ + + ++ + + + ++ + ++ + + ++ + + + ++ + + ++ + ++ +++ +

Aronson et al. (1985) advocate that, wherever possible, instead of using instructions, experimental conditions should be manipulated by an "event," because it is likely to have much more impact than instructions do. 
For example, in the famous study by Asch (1958), subjects who underwent the personal experience of having several other subjects disagree with them about the length of some lines would be much more affected by this event than if the experimenter had merely told them that several other people disagreed with their judgments.

Research Participants
The choice and assignment of participants are important steps in an experiment. An unfortunate fact is that most social psychological research has been conducted with college students, in academic laboratories, and frequently using materials that require a high level of educational background (Sears, 1986). Because college students differ from the overall human population on many dimensions, there are undoubtedly many instances where the generality of these results is limited. 
For instance, compared to older adults, college students tend to be more willing to comply with authority, to have less structured attitudes, and to have less formulated notions of self. It is possible that these qualities have contributed to the common social psychological view of people as inconsistent and easily influenced (Sears, 1986). In addition, a large majority of college students are White, and it is often unclear how generalizable results would be from White college students to non-White college students, even within the same age cohort. Usually, generalizability of findings is increased by having a diverse sample of research participants.
As we pointed out earlier in this chapter, randomization in the assignment of participants to experimental conditions is the greatest advantage of an experiment over other research methods, because it equates the conditions by balancing all the personal characteristics (the subjects' sex, education, race, religion, personality, and so forth) that might affect the dependent variable. Even characteristics not yet suggested as important, and therefore uncontrollable in any other way, will be evenly spread across the experimental conditions by randomization.
In addition to random assignment of subjects, important personal characteristics are sometimes controlled by systematic choice of participants. For instance, for some purposes, only women may be chosen as subjects, or only college graduates. This procedure reduces the error variance within experimental conditions that results from the diversity of participants, but as noted above, it decreases the generality of findings. (These concepts are discussed at greater length below in the section on internal validity and external validity.) An excellent, though complex, way to combine the advantages of diversity of participants with control of participant characteristics is a randomized blocks design. In such a design, each block of people with a given level of one characteristic (for example, college graduates, high school graduates, and other educational levels) is randomly assigned across several experimental conditions, so that there are equal numbers of the given level in each condition. These various ways of equating experimental groups all help to minimize the error variance in the measures of the dependent variables.
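A randomized blocks assignment can be sketched as follows. This is a minimal illustration; the subject pool, education levels, and condition names are hypothetical, though the condition labels echo the solicitation study.

```python
import random
from collections import defaultdict

def randomized_blocks(subjects, block_of, conditions, seed=1):
    """Within each block, shuffle subjects and deal them evenly across conditions."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for s in subjects:
        blocks[block_of(s)].append(s)      # group subjects by their block level
    assignment = defaultdict(list)          # condition -> list of subjects
    for members in blocks.values():
        rng.shuffle(members)                # randomize order within the block
        for i, s in enumerate(members):
            # Deal round-robin so each condition gets equal numbers per block
            assignment[conditions[i % len(conditions)]].append(s)
    return assignment

# Hypothetical subjects: (id, education level), 12 at each of 3 levels
subjects = [(i, level) for level in ("college", "high_school", "other")
            for i in range(12)]
conditions = ("control", "even-a-dollar", "legitimization", "even-a-penny")
assignment = randomized_blocks(subjects, block_of=lambda s: s[1],
                               conditions=conditions)
# Each condition ends up with 9 subjects: exactly 3 from each education level.
```

The payoff is that education level cannot differ between conditions, so it contributes nothing to the error variance in comparisons between conditions.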
In the charity solicitation study, the characteristics of participants were controlled in two ways. First, a homogeneous middle-income housing area was chosen as the scene of the study. This was a reasonable procedure since such areas are often targeted for door-to-door solicitation by the Cancer Society, and it certainly made the results more relevant than if only college students had been the subjects. 
Second, though the report did not specify it, presumably each successive household was randomly assigned to one of the four experimental appeals.

Control of Conditions
In an experiment, the investigator first determines and arranges all the aspects of the independent variable manipulation: using instructions, events, equipment, the help of research confederates, or other appropriate techniques. Second, the experimenter decides what kinds of behavior will constitute the dependent variable and exactly how it will be measured and recorded. In this regard, Aronson et al. (1985) argue strongly for more use of behavioral measures of the dependent variable, rather than self-report measures such as attitude scales or questionnaire answers, which quite often have dubious reliability and validity. Third, the experimenter decides what other aspects of the situation need to be held constant or explicitly controlled in some other way so that the measurements will reflect the true variables of interest and not some accidental effects of the situation. This often includes arranging circumstances to block the occurrence of certain kinds of behavior and encourage other kinds.
Cialdini and Schroeder's experiment controlled conditions in several ways. The time of day was specified so that both men and women would be at home.
Solicitors worked in male-female pairs, which allowed the important control of always having the appeal made by the experimenter of the same sex as the person who came to the door. Since the sex of the solicitor can influence the respondent's degree of identification and compliance with the request, this was a very wise procedure. Using four pairs of solicitors made the findings more broadly generalizable than if only one pair of especially persuasive (or especially inept) experimenters had been used. Having each pair of solicitors use all four appeals, in random order, would be necessary in order not to confound the success of one appeal with the characteristics of one pair of solicitors.
One important type of control is to have all the experimenters dress and behave in the same ways, particularly across the various experimental conditions. 
This precept includes the direction of their gaze, their degree of enthusiasm or friendliness, their inflection when speaking instructions, their equipment and clothing, and all other aspects of their behavior. For instance, other charity solicitation experiments have found that donations were greater when the solicitor gazed directly at the potential donor's eyes rather than at the collection box, and when a simple identification poster was carried rather than a more complex and distracting poster with a photograph of a handicapped child (Bull & Gibson-Robinson, 1981; Isen & Noonberg, 1979).

Experimental Procedures
In the charity solicitation experiment, the dependent variables were objective behavioral measures (donation of money and amount donated), so the authors did not have to be concerned with the problems of self-report measures, or the reliability and validity issues of rating scales. Similarly, because the independent variable was a simple phrase in the appeal, there was no need for a manipulation check-an essential procedure with complex manipulations, in order to see whether the experimental conditions actually affected the subjects in the expected ways. Likewise, though many attitude-change experiments use both a pretest and a posttest measurement, there was no need for a premeasure in this posttest-only design. Since no complex experimental script was used to vary the conditions, there was also no need for the frequently used postexperimental questionnaire to check on the subjects' perception of the situation. As in many field experiments, there was no deception, so there was no debriefing session at the end of the experiment to explain the purpose of the study and the need for deception.
One especially important precaution was observed: the solicitors were "blind to" (unaware of) the hypotheses of the study. This was essential because otherwise they might have subtly changed their verbal inflection, enthusiasm, or general approach in a way favoring the hypothesized best experimental condition, thus spuriously supporting the hypothesis. In experiments where the investigators cannot be blind to the hypotheses, Aronson et al. (1985) strongly recommend that they be unaware of which experimental condition each subject is in. If necessary, this can be accomplished by using teams of researchers who each handle a different phase of the experiment. In the solicitation study, of course, the solicitors could not be kept ignorant of the conditions since they delivered the appeal orally, so their ignorance of the hypotheses of the study was essential.
However, there is some question whether the investigators' ignorance of the hypotheses was sufficient. In a replication of Cialdini and Schroeder's study, Weyant (1984) found evidence that even though the pairs of student researchers did not know the research hypotheses, their expectations influenced the results. 
Weyant's researchers conducted door-to-door solicitations for the American Cancer Society in a middle-class neighborhood. They randomly assigned each of 360 households to one of five experimental conditions, and as expected, the results showed that more people contributed, and more money was collected, in the even-a-penny condition than in any of the other four, a very useful replication. However, 68 of the householders interrupted the researchers to respond before the key sentence constituting the experimental manipulation was delivered. Of these, the percentage of participants who donated money in each of the conditions was 0%, 7%, 10%, and 23%, compared to 28% in the even-a-penny condition. Because there were no differences in the manipulation actually delivered to these households, the most plausible explanation is that the student researchers had developed their own hypotheses about the condition manipulations and were subtly acting in such a way as to confirm their expectations. It is unclear how the expectations of the solicitors affected the results obtained by Cialdini and Schroeder, but that issue is certainly important in determining the reason for their findings.
A problem that does not apply to the solicitation study, but is common in research where new procedures are being tried out on established groups of people, is the Hawthorne effect. Named for the Hawthorne plant of the Western Electric Company, where it was first noticed by Mayo (1933) and Roethlisberger and Dickson (1939), this effect is an augmented response of participants due to their enthusiasm for being in a new or experimental program and the resulting attention they receive. Such an increase in production levels, or in attitude scores, can easily be mistaken for the effect of the experimental manipulation (for example, changes in work lighting in the Hawthorne studies) rather than being recognized as an experimental artifact. A typical safeguard against this effect is a reversal design, in which the independent variable is first added and later removed, or first increased and later decreased. If the dependent variable measure does not revert to an earlier level when the independent variable is reversed, this may indicate a Hawthorne effect, though it could also indicate a long-lasting effect of the independent variable if such a carry-over effect seemed possible.
Sometimes experiments cannot use a reversal design (for instance, when people have been given some benefit, such as pay raises or better health care, that cannot be taken away, for either practical or ethical reasons). In such cases a multiple baseline design is very helpful. In this design, several different independent variables (IVs) are introduced at different points in time. Each IV is expected to change its own dependent variable (DV), but if other DVs change when one IV is introduced, something like a Hawthorne effect may be at work. An excellent example of a multiple baseline design is seen in a study by Pierce and Risley (1974) on reducing disruptions in a youth recreation center.
When rules against several different kinds of disruptions were made and consistently enforced at several different points in time, the researchers found, as predicted, that each enforced rule largely eliminated its own kind of disruption but did not appreciably reduce other types of unruly behavior.

Experimental Artifacts
Unreliability of measures may be a problem in some experiments, depending on the types of measuring instruments that are used. It is less of a problem where the measures are objective-that is, ones on which all observers would agree-such as the monetary donations in the Cialdini and Schroeder experiment. However, it is always an issue when measures are subjective, such as judgmental rating scales or self-report measures. In such situations, investigators should try to determine and report reliability coefficients for their measures (either internal consistency, or stability measures, or both). If dependent variable measures are low in reliability, the results obtained using those measures may be experimental artifacts, which would not be obtained again if the study were repeated. Of course, this potential problem applies just as much, if not more, to nonexperimental studies.
Demand characteristics are present in any situation, not just in experiments. They are the perceptual cues, both explicit and implicit, that communicate what kind of behavior is expected there (Orne, 1969). For example, the demand characteristics in a library encourage one to sit and read, to speak and walk softly, and so on. In an experiment, demand characteristics that are too obvious can produce experimental artifacts by signaling to subjects the kinds of results the experimenter wants or expects. In such cases subjects who are so inclined can either be overcooperative and display the expected behavior without a genuine basis, or be negativistic and do just the opposite. That is why experimenters often go to great lengths to conceal the true purpose of the experiment, or even use deception, so that subjects cannot guess the purpose and subsequently follow or oppose the research hypothesis. Demand characteristics in an experiment can never be wholly eliminated, but they can be minimized, and useful suggestions on how to do so have been offered by Aronson et al. (1985) and by Orne (1969). Fortunately, most research has shown that relatively few subjects are inclined to be either overcooperative or negativistic (Weber & Cook, 1972). In the charity solicitation experiment, the demand characteristics seem to be about equal in all four conditions, and thus not a problem in interpreting the findings.
Experimenter effects are distortions in the results of an experiment produced by the behavior or characteristics of an experimenter (Rosenthal, 1976). They are similar to interviewer effects in surveys, and many of the same safeguards can be applied to minimize their operation. The type of experimenter effect that has been most thoroughly studied is the impact of the experimenter's expectations on the behavior of subjects. 
There is considerable research evidence that this artifact can sometimes distort experimental results (Rosenthal, 1969; Seaver, 1973), and many recommendations have been made about how to combat or minimize its effects (Aronson et al., 1985). Some of these suggestions are summarized in Table 4-2. In the charity solicitation study, the authors attempted to minimize experimenter expectancy effects by using completely objective measures and by keeping the solicitors, as well as the subjects, ignorant of the experimental hypothesis (termed a double-blind procedure). However, as Weyant (1984) has demonstrated, experimenter effects may still have been a problem.

TABLE 4-2 Suggestions for Controlling Experimenter Expectancy Effects
1. Increase number of experimenters.
2. Observe behavior of experimenters.
3. Analyze experiment for order effects (changes in results between early and late data).
4. Analyze experiment for computational errors.
5. Develop experimenter-selection procedures.
6. Develop experimenter-training procedures.
7. Maintain blind conditions (in which experimenter does not know which group subjects are in).
8. Minimize experimenter-subject contact.
9. Use expectancy control groups (in which expectancies are created or not created to determine their effect separate from the effects of other variables).

Source: Oskamp, S. (1977). Methods of studying human behavior. In L. S. Wrightsman, Social Psychology (2nd ed.) (Adapted from Rosenthal, 1966). Copyright © 1972, 1977 by Wadsworth Publishing Company, Inc. Reprinted by permission of the publisher, Brooks/Cole Publishing Company, Monterey, California.

Subject effects are distortions in research results caused by various response sets or temporary behavior patterns adopted by subjects contrary to the intentions of the researcher. Though some experiments have shown that subjects can be induced to be negativistic or cooperative (Christensen, 1977; Earn & Kroger, 1976), a thorough review of the relevant research literature has indicated that by far the most common subject effect is caused by evaluation apprehension (Weber & Cook, 1972). As the name implies, this produces attempts by subjects to act in socially desirable ways and "look good" out of concern about how other people (the experimenter or other research subjects) will evaluate them (Rosenberg, 1969). This can occur in any kind of research method, not just in experiments. It can be combatted by taking steps to reassure subjects and reduce their anxiety, and by disguising the experimental hypothesis and the expected behavior. Experiments in natural settings where subjects are unaware of being studied are a particularly good way of avoiding undue evaluation apprehension. In the charity solicitation experiment, whatever evaluation apprehension was present should have been equal in all conditions, and thus would not affect the study's hypotheses.

Results Obtained
The results of the charity solicitation experiment were objective, simple, and straightforward. They leave little room for challenge or need for interpretation, but often this is not the case. In contrast to the large effect shown here, many studies display only small effects, which may not be practically important. The size of an empirical finding is usually evaluated in two ways: by a significance test to show whether the result could have occurred by chance, and by stating the effect size, which indicates the practical importance of the findings. Often in this book, we will mention both the statistical significance and the effect size when discussing research findings, and we will explain each procedure briefly here.
If the descriptive findings of a study show a difference between conditions, we want to know whether that is a "true" difference-i.e., one that would hold for the underlying population and not just for the sample of individuals that we studied. There is always some possibility that our findings could have occurred by chance, just because of the characteristics of the individuals that happened to be sampled, rather than because of our manipulation of conditions. A significance test indicates the likelihood that an obtained finding did not occur just by chance-i.e., that it is true of the population, and thus is dependable and real. Typically in psychology, we set a 95% confidence level for our findings; that is, we accept a 5% probability that our results are due to chance.
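The logic of such a test can be made concrete with a short computation. The numbers below are assumed for illustration only, loosely patterned on donation rates like those reported in the solicitation studies; the procedure sketched is a standard two-proportion z-test, not an analysis from the original experiment.

```python
import math

# Hypothetical counts: donors out of households approached in two conditions
# (assumed numbers, not data from the actual experiment).
control_donors, control_n = 29, 100   # standard appeal
penny_donors, penny_n = 50, 100       # "even a penny" appeal

p1 = control_donors / control_n
p2 = penny_donors / penny_n

# Pooled proportion under the null hypothesis of no real difference
p_pool = (control_donors + penny_donors) / (control_n + penny_n)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / penny_n))

z = (p2 - p1) / se
significant = abs(z) > 1.96   # critical value at the 95% confidence level (two-tailed)

print(f"z = {z:.2f}, significant at p < .05: {significant}")
```

With these made-up counts the z value comfortably exceeds the 1.96 cutoff, so the difference would be declared statistically significant; with much smaller samples, the same proportions might not reach significance, which is exactly why sample size matters in the next section's discussion of effect size.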
If our results are significant, a second issue is "how big is the effect?" An effect size measure shows the strength of the relationship between an independent and dependent variable, but is not influenced by the size of the sample. In contrast, significance tests are sensitive to the size of the sample, so that for large sample sizes (e.g., N = several hundred), small and even trivial effects can reach statistical significance. Of the several measures of effect size, the two most commonly used are r and d (Chow, 1988; Keppel, 1991).
The first measure, r, is simply the correlation coefficient, which represents the strength of the relationship between two variables. Its value can range from +1 (a perfect positive relationship) to -1 (a perfect negative relationship), with 0 indicating no relationship at all. 
The second measure, d, is the standardized size of the difference between two conditions-the number of standard deviations between the means of the experimental and control conditions (for instance, 1.5 would indicate 1½ standard deviations, and 0 would indicate no difference). Cohen (1992) has provided useful terms for evaluating both these measures-for example, a medium r is .30, and a medium d is .50 (half a standard deviation).
In considering research findings, it is often important to examine not only the main effect of one variable on another, but also interaction effects, in which two or more variables in combination have different effects than when alone. Interaction effects often occur with complex DVs (such as industrial production rates) which have multiple and complex determinants, so that changes in even a powerful single IV (such as wages) may not affect them sharply. As examples of interaction effects, wage increases might affect production levels for new workers but not for old ones, or they might increase productivity in the morning but not in the afternoon when workers are tired, or they might raise employee morale but not productivity. In charity solicitation research, Bull and Gibson-Robinson (1981) found an interaction effect showing that directness of the solicitor's gaze affected the amount of obtained donations more for casually dressed collectors than for smartly dressed ones.
Such interaction effects involve moderator variables, which are defined as variables that affect the strength or the direction of a relationship between an independent and a dependent variable (Baron & Kenny, 1986). In contrast to moderators, a mediator variable is an intervening variable that totally accounts for the relationship between an independent and a dependent variable, in the sense of showing the process by which one variable affects the other. For instance, hypothetically, if women solicitors received larger donations than men, but that effect was totally accounted for by their having greater eye contact with householders (that is, the gender difference disappeared when amount of eye contact was held constant), their eye contact would be a mediator variable, showing how and why the gender effect occurred.

Internal Validity and External Validity
If confounding variables, experimental artifacts, inadequate controls, and careless procedures are successfully avoided or minimized, an experiment may have internal validity, its most important requisite. Internal validity means that the study's conclusions can be accepted as correct-that is, the results were really caused by the operation of the independent variables.
Cook, Campbell, and Peracchio (1990) have described internal validity as the most essential requirement of an experiment, without which no amount of social importance or real-world applicability has any value.
Cialdini and Schroeder's solicitation study seems to meet the test of internal validity very well. It has no apparent artifacts, careless procedures, or lack of controls that would cause us to challenge its results, except possibly the experimenter expectancy effect as an alternative explanation of the findings. However, in many studies that is not the case. A famous bad example was provided by some of the evaluation studies of the Head Start preschool program, which used control groups that were not really equivalent to the experimental group. As a result they were destined from the very beginning to find small or null effects of the Head Start program, but their findings were invalid because of the inadequate control procedures (Campbell & Erlebacher, 1970).
External validity refers to the generalizability of research results to other populations, settings, treatment variables, and measurement methods (Campbell & Stanley, 1966). Poor external validity can occur even when a study has excellent internal validity. In fact, this is a common occurrence in laboratory experiments because of their frequent artificiality, though artificiality alone is not a sure indication of poor generalizability (Aronson et al., 1985). If there is doubt about the generalizability of findings, the only safe approach is to try them out in other situations in which you are interested. 
That is the reason for the recommendation that laboratory findings be checked in field studies, and that the ideas developed in field studies be taken back into the laboratory for further analysis and more precise testing. However, even field research can be poor in generalizability, if it confines itself to atypical individuals, settings, or behaviors (e.g., only to male company presidents).
We have already discussed the external validity of the charity solicitation experiment at some length. 
Though its results clearly apply to real-life situations such as the one where it was carried out, there are undoubtedly some solicitation tasks and settings where its findings would not apply or would have to be appropriately modified. For example, Weyant and Smith (1987) showed that $5, rather than a penny, was an effective minimal contribution to mention in mail solicitations to residents of well-to-do neighborhoods.

Long-Term Effects
In many studies we want to know not just about immediate changes, but also, and often more importantly, about long-term effects of the research variables. That question did not apply to the solicitation experiment, however, because the only change desired was the immediate one of contributing money at the moment when the request was made.
Unfortunately, though long-term change is often of great interest, it is rarely studied in research, largely because it is difficult and time-consuming to follow up research subjects and obtain additional data later on. 
As a consequence, much of our knowledge about attitude and behavior change is actually confined to immediate change. We should be aware that findings about how to create immediate changes may not apply at all to the issue of creating long-term changes. Or the findings may apply in part, but it would take additional research to show which parts apply and which do not (Cialdini, Petty, & Cacioppo, 1981).
Let us list a few examples of short-term versus long-term follow-up methods in the other interesting experiments mentioned at the beginning of this chapter. Studies of television violence, beer commercials, and blood donations have focused almost exclusively on immediate changes in attitudes and behaviors. A study that involved labeling of supermarket products according to their effect on the environment found that the labels led shoppers to change the products they purchased.
However, an educational intervention aimed at increasing the effectiveness of the product labels did not produce any further changes in behavior (Linn et al., 1994).
Some studies have used longer follow-up periods. A Canadian study involving well-publicized police enforcement of laws requiring the use of auto seat belts found over half the increase in seat belt use still persisted six months after the enforcement program ended (Jonah et al., 1982). At the high end of the scale, Friedlander and Burtless (1995) examined the effects of job training and assistance in job searches on the work income and amount of welfare funds received by families on welfare over a 5-year period. The results showed significant increases in annual income, and significant reductions in the amount of welfare money received, compared to the control conditions. Similarly, Bushell (1978) traced the elementary school performance of children in an experimental Follow Through program (subsequent to a summer school Head Start experience) for periods as long as six years, and he found continuing effects of the Follow Through training even after the children had left the special program.

Ethical Issues
The ethical guidelines discussed in Chapter 1 are particularly applicable to experimental research. In experiments, more than in any other research method, the investigator has real power over the subject, and this power must be used fairly, sympathetically, and unharmfully. Fortunately, it is now rare for psychological experiments to pose any real risk of harmful consequences to subjects, and most psychologists are sensitive to their responsibility to protect participants from any such risks. However, in the past, some experiments have used noxious situations that were physically or psychologically stressful to subjects (cf. Farr & Seaver, 1975; Greenberg & Ruback, 1992). An extreme example involved some military studies of stress, which exposed unsuspecting subjects to highly realistic simulations of radioactive contamination, aircraft engine failure in flight, or false information that their own errors with explosives had injured other soldiers (Berkun et al., 1962). Such stressful experiments would probably not be allowed under the federal regulations that have been in effect since the early 1970s. Since then, because of the ethical and legal issues involved in research with human participants, all proposed research procedures must be approved by an institutional review board (IRB) before being carried out.
Most experimenters in recent years have recognized their obligation to obtain informed consent from research participants, and the federal government has promulgated instructions about the way this should be done-for example, telling subjects the essential facts about the study, and informing them that they can refuse to take part or withdraw from participation at any time (U.S. Department of Health, Education, and Welfare, 1971). However, in field experiments, such as the solicitation experiments, obtaining informed consent is generally considered unnecessary if the experimental conditions are relatively natural situations to which the subjects might be exposed in everyday life. 
Though this procedure is somewhat controversial, it is similar to the situation in survey research, where informed consent is considered to have been given by the respondent's act of answering an interviewer's questions.
The two most common ethical problems in experiments are deception and debriefing. Because of the need to avoid experimental artifacts and to keep the subjects ignorant of the research hypotheses, deception has often been used in social psychological experiments. Recent surveys of the prevalence of deception in social psychological research showed that it is still widely used, though less than at its peak (Nicks, Korn, & Mainieri, in press; Sieber, Iannuzzo, & Rodriguez, 1995). For instance, in the Journal of Personality and Social Psychology, deception was used in 66% of the research reports published in 1969, 47% in 1978, 32% in 1986, and 47% in 1992. Undoubtedly deception has been overused and misused at times, and the practice has been severely criticized (Kelman, 1967). In recent decades, the scrutiny of proposed research procedures by IRBs has made investigators more careful to avoid unnecessary deception. Only where deception is necessary for the completion of the research and justified by the potential value of the research findings is it considered ethical, and there is still vigorous debate about this degree of permissiveness (Baumrind, 1985; Christensen, 1988; Soliday & Stanton, 1995).
Debriefing, on the other hand, is considered an ethical requirement for laboratory experiments. Even if a study involves no deception, it is still important to answer the participants' questions afterward and to be sure that they do not leave with any unallayed anxieties or misconceptions (Gurman, 1994). Helpful suggestions for a thorough debriefing routine have been offered by Mills (1976) and Rosnow and Rosenthal (1996). However, it is important to realize that the false impressions instilled in deception experiments can survive the traditional type of debriefing and therefore require much more thorough discussion with the research participants about the processes which cause false impressions to persist (Ross, Lepper, & Hubbard, 1975). In field experiments, debriefing is normally carried out only if the participants are aware that they have been in a study. In studies involving unobtrusive observation of public behavior, where the subjects are unaware of being studied and might become more upset to learn about it than to remain ignorant of it, both informed consent and debriefing are usually omitted.
That was the case, for instance, in the charity solicitation experiment by Cialdini and Schroeder.
Finally, like atomic scientists after World War II, social researchers must face ethical issues about the use of their findings. Our research results are fortunately not so deadly, but they can certainly have major effects on some people's lives. Knowing that, what is our responsibility to the people who will use them, to the people on whom they will be used, and to the advancement of scientific knowledge? To take a mild example, if you were Cialdini and Schroeder, would you be just as content to have your research findings used by the Hare Krishnas (as has actually happened) as by the American Cancer Society? Or if, like Milton Rokeach (1979), you developed a strong method of changing people's values, would you be willing to have it used by the Ku Klux Klan or the American Nazi Party? If not, should you give up experimental research and turn to some less controversial occupation? Obviously there can be no final answer to these questions; all investigators must decide for themselves. However, the ethical code of the American Psychological Association (1992, p. 1600) reminds us:
Psychologists are aware of their professional and scientific responsibilities to the community and the society in which they work and live. They apply and make public their knowledge of psychology in order to contribute to human welfare . . . [and] try to avoid misuse of their work.

The use, or misuse, of findings is a particularly crucial question for applied social psychologists. Because they are applied scientists, they are even more eager than other researchers to see their work used for the benefit of humanity, not just for the advancement of scientific knowledge.
One of the researcher's strongest tools is the experiment, precisely because it so effectively demonstrates the nature and direction of causal relationships between variables. Despite the limitations of experiments, they can be effectively applied in answering vital social questions and evaluating important social programs. Donald Campbell (1969, 1988) advocated just such an approach in influential articles calling for social scientists to help build an "experimenting society." In such a society, experimental methods would be used rather than guesswork, superstition, popular stereotypes, or "common sense" in deciding important issues of public policy. For instance, research could be applied to questions such as:
• Do stiffer punishments reduce the amount of crime?
• Are single-parent families less successful than two-parent families in raising children to be responsible adults?
• Would a guaranteed annual income reduce people's incentive to work?
Often the results of such public policy research have provided surprising new insights about our society, and much more research evidence could be brought to bear on such questions if social investigators were encouraged to study them intensively. In future chapters, particularly Chapter 17, we will examine some findings on these kinds of questions.

Experiments are studies in which there is a high degree of planned manipulation of conditions, control of other conditions, and precision of measurement. As a result they are the best research tool for isolating the effects of specific independent variables and determining the direction of causal relations between variables. The goal of all the many careful procedures involved in experiments is to achieve high internal validity of the results. However, experiments are often low in mundane realism, and because of this artificiality, there frequently are questions about how far their findings can be generalized to real-life situations.
The experimental realism (impact) of procedures in laboratory experiments is often limited by practical and ethical considerations, and consequently their results may be rather weak. Field experiments, on the other hand, though more natural in their settings and procedures, must also follow ethical procedures and may not allow as much control and precision as laboratory experiments. Social research advances best when a variety of research methods is employed, checking the results of any one method by the use of other methods and settings.
Cialdini and Schroeder's study on the wording of appeals for charitable donations illustrates many of the characteristics of experiments: theorizing, careful procedural planning, controls to achieve the desired impact on subjects and avoid experimental artifacts, statistical tests of the results, and checks on alternative hypotheses. As a field experiment that involved no deception, it had an unusually high degree of mundane realism and external validity (generalizability), and its results were clear-cut.
Experiments that attempt to change socially important behaviors should be concerned with and measure long-range effects in addition to immediate changes. 
Fortunately, increasing numbers of social psychologists are successfully applying careful, long-range procedures in the study of vital social issues. In doing so, they are helping to foster an "experimenting society." 

Aronson, E., Brewer, M., & Carlsmith, J. M. (1985). Experimentation in social psychology. In G. Lindzey & E. Aronson (Eds.), The handbook of social psychology (3rd ed., Vol. 1, pp. 441-486). New York: Random House.-Presents a thorough rationale for experimental procedures combined with a helpful how-to guide that can aid experimenters in avoiding many costly mistakes.
Campbell, D. T. (1969). Reforms as experiments. American Psychologist, 24, 409-429.-An influential paper, which advocates applying experimental methods to solving social problems and deciding issues of public policy.
Rosnow, R., & Rosenthal, R. (1996). Beginning behavioral research: A conceptual primer (2nd ed.). New York: Macmillan.-A practical guide to experimental research, including sections on manipulation of independent variables, quantifying and measuring dependent variables, and ethical concerns.

Second Edition
Stuart Oskamp
Claremont Graduate University
P. Wesley Schultz
California State University, San Marcos

