In line with these authors, the following three categories were distinguished: blatant explicit statement about the inferiority of one group, e. We were further interested in the tasks used to assess performance. Finally, we closely inspected the available studies for how they assessed immigrant group status.

One method is to have participants self-identify as members of a certain immigrant group. Another frequent option is to identify immigrants via demographic characteristics such as their own place of birth or that of their parents or grandparents. With the latter method, individuals can be ascribed an immigrant background even though they define themselves as belonging to the mainstream culture. Because their link between self and group is weak, these individuals might be unaffected by a stereotype threat manipulation that targets their ethnic group of origin.

As a consequence, studies that used methods other than self-identification might yield weaker effects than studies that relied on self-identification. Beyond immigrant status categorization, we were particularly interested in identity aspects, as previous research suggests competing predictions on the role of ethnic identity strength.

To identify relevant studies, a literature search was conducted in December , which was repeated in December and in March Our database search for literature published until resulted in references. Finally, we asked for additional published or unpublished studies via e-mail lists. Full texts for potentially relevant studies were retrieved. Thus, studies that manipulated the salience of a stereotype about immigrants but focused exclusively on non-immigrants did not meet this criterion e. Likewise, studies that did not report separate results for immigrants and other groups, most notably studies that reported combined average scores for Latino and African American participants, were not included e.

Second, the activated immigrant stereotype needed to be negative.

This excluded research on the consequences of stereotypes regarding Asian immigrants e. Third, the study had to be experimental and had to follow the stereotype threat or social identity threat family of experimental treatments. Immigrant participants were supposed to be randomly assigned to two or more experimental groups. Fourth, a measure of cognitive performance served as a dependent variable.

Fifth, we inspected all studies for the quality of the applied methods and measures. This included an analysis of the operationalization of the independent variable and the dependent measures. Specifically, theory-based preconditions for stereotype threat to occur were inspected, such as sufficient domain identification and substantial task difficulty (Schmader et al.). We identified 18 texts (published and unpublished articles and dissertations) containing 21 experiments that met our criteria.

In addition to 19 English-language texts, one report was written in German and one in French (both languages were intelligible to us). In two cases the identical study was used for two separate journal articles. These results were included in our analysis only once.

Thus, our final sample consisted of 19 experiments, 10 of which were unpublished (Table 1). All three authors read and coded all available studies. A coding sheet was developed to gather the relevant information. Discrepancies between the judgments were very rare and were resolved through discussion. When available, the effect size calculations were based on descriptive data (Ms, SDs, ns); when these were unavailable, formulas to calculate the standardized mean difference from t-test statistics or from F statistics with one numerator degree of freedom were employed (Lipsey and Wilson; Wilson). We ensured that our standardized mean difference score reflected the mean difference between immigrants under conditions of low versus high stereotype threat.
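The three routes to a standardized mean difference described above can be sketched as follows. This is a minimal illustration that assumes the common textbook conversion formulas for two independent groups; the function names are hypothetical, and the sign convention (high threat minus low threat) is chosen so that negative values mean worse performance under high stereotype threat.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def d_from_descriptives(m_high, sd_high, n_high, m_low, sd_low, n_low):
    """Standardized mean difference from means, SDs, and group sizes.

    Computed as high threat minus low threat, so a negative value means
    the high-threat group performed worse.
    """
    return (m_high - m_low) / pooled_sd(sd_high, n_high, sd_low, n_low)


def d_from_t(t, n_high, n_low):
    """Standardized mean difference from an independent-samples t value.

    The sign of t must already follow the high-minus-low convention.
    """
    return t * math.sqrt((n_high + n_low) / (n_high * n_low))


def d_from_f(f, n_high, n_low, high_scored_lower=True):
    """Standardized mean difference from F with one numerator df (F = t squared).

    F carries no sign, so the direction has to be supplied by the coder.
    """
    d = math.sqrt(f * (n_high + n_low) / (n_high * n_low))
    return -d if high_scored_lower else d
```

Because an F statistic is unsigned, the direction of the difference has to be read from the reported means or the text of the primary study before the value is entered.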

Our standardized mean difference scores never represented an interaction effect e. We did not consider such interaction effects because they can be driven, in part or completely, by stereotype lift effects (Walton and Cohen) among non-immigrant groups. We also did not consider multi-group comparisons; when the stereotype threat treatment involved more than one group e. In some studies several performance scores were reported e. To preserve independence of effect sizes, we averaged the scores before they were included in the meta-analysis (Headrick). Whereas nine studies were published in outlets of educational or social psychology, 10 studies were unpublished.

Five of the European studies investigated adolescents. Based on the distinction of stereotype threat-activating cues in the meta-analysis by Nguyen and Ryan, seven of the 19 experiments included in our meta-analysis used indirect and subtle cues, five used moderately explicit cues, and seven used blatant stereotype threat-activating cues. There are different ways to distinguish immigrant group members from non-immigrants.

Demographic surveys and statistical reports e. Another common method is to ask individuals with which group they identify most. Our dataset diverges remarkably from the data obtained in previous meta-analyses on stereotype threat. In addition to differences in the main aims and methodological approaches outlined above (focus on immigrants, quantification of stereotype threat main effects, and moderating variables), there are specific differences that appear noteworthy. We decided to meta-analyze the main effects or the simple main effects of the stereotype threat treatment on immigrants.

As a consequence, the effect sizes integrated in our meta-analysis differ in part from the effect sizes reported in the meta-analysis by Nguyen and Ryan. One previous meta-analysis (Nadler and Clark) identified six studies on stereotype threat effects among Latino samples in the US. We did not include two of these six studies in our analysis because they did not meet our inclusion criteria.

In one study (Stone), a negative stereotype with respect to White Americans was examined; Latinos served as a control group. In the second study Good et al. Thus, only four out of the 19 identified studies were already included in the meta-analysis by Nadler and Clark. The meta-analytic procedure followed the recommendations by Lipsey and Wilson (see also Wilson). All standardized effect sizes were adjusted for small sample bias (Hedges). The inverse variance served as the weight assigned to each effect size (see Lipsey and Wilson for the respective formulas).
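The small-sample adjustment and inverse-variance weighting described above can be sketched as follows. This is an illustration under stated assumptions: it uses the widely cited correction factor 1 - 3/(4N - 9) and the usual variance approximation for a standardized mean difference, and it combines studies with a fixed-effect weighted mean. The function names are hypothetical, and the passage does not say whether the authors used a fixed-effect or a random-effects model.

```python
import math


def hedges_g(d, n1, n2):
    """Apply the small-sample bias correction to a standardized mean difference."""
    n = n1 + n2
    return d * (1 - 3 / (4 * n - 9))


def variance_g(g, n1, n2):
    """Approximate sampling variance of a bias-corrected effect size."""
    return (n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2))


def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean effect and its standard error.

    Assumes a fixed-effect model: each weight is 1 / variance.
    """
    weights = [1 / v for v in variances]
    total = sum(weights)
    mean = sum(w * g for w, g in zip(weights, effects)) / total
    se = math.sqrt(1 / total)
    return mean, se
```

Under this scheme, studies with larger samples have smaller variances and therefore contribute more to the average effect.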

Negative effect sizes indicate worse performance in the high stereotype threat condition than in the low stereotype threat condition (Table 2). All studies except for one Wicherts et al. With respect to outliers, the individual effect size of one study Berjot et al. Results with and without this particular study were inspected.

This indicates an average effect in support of stereotype threat theory among immigrant samples.

According to the interpretation by Cohen, the overall effect is medium in size. We further examined the homogeneity of our sample of effect sizes. Our goal in the meta-analysis was to include as many studies as possible, published or unpublished, written in English, Spanish, German, or French. More than half of our datasets originated from unpublished research. Despite our efforts, however, it is unlikely that we uncovered every study conducted so far that would have met our criteria. To estimate a potential sampling bias in our set of studies, we first plotted the meta-analytic data for visual inspection (Figure 1).

The funnel plot illustrates that the great majority of studies yielded a negative effect size, indicating that the direction of the effect was regularly in support of stereotype threat effects. It further shows that studies with larger samples were more likely to yield null effects. The plot points to a lack of small-sample studies with effects that do not support a stereotype threat hypothesis. One reason for this finding could be selective reporting of small-scale studies.

To gauge the file drawer problem (Rosenthal), we first calculated the number of studies confirming the null hypothesis that would be needed to conclude that the effect is small.

Figure 1. Funnel plot based on effect size d and sample size. In studies with negative effect sizes, low stereotype threat groups outperformed high stereotype threat groups.

Taken together, our sampling analysis pointed to a notable lack of null effects in small-sample studies. If such studies were conducted, they were unavailable to us. A file-drawer analysis showed that the number of studies in support of the null hypothesis that would be needed to reduce the average effect size to small or even to insubstantial is rather large.
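One way to picture the file-drawer calculation is an Orwin-style fail-safe number: how many unseen studies with an average effect of zero would pull the mean effect down to a chosen criterion value. The sketch below is an assumption about the general form of such a calculation, not a reproduction of the authors' exact procedure; the function name is hypothetical, and it treats all studies as equally weighted.

```python
def orwin_failsafe_n(k, mean_effect, criterion):
    """Number of zero-effect studies needed to shrink the mean effect to `criterion`.

    k           -- number of studies in the meta-analysis
    mean_effect -- observed average effect size (sign is ignored)
    criterion   -- effect size regarded as small or negligible (non-zero)

    Assumes unweighted averaging and that the missing studies average zero.
    Returns a float; round up when reporting a study count.
    """
    if criterion == 0:
        raise ValueError("criterion must be non-zero")
    observed = abs(mean_effect)
    target = abs(criterion)
    if observed <= target:
        return 0.0  # the observed mean is already at or below the criterion
    return k * (observed - target) / target
```

The formula follows from setting k times the observed mean, divided by k plus the number of added null studies, equal to the criterion value and solving for the number of added studies.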

Thus, we conclude that the average effect size in support of a stereotype threat effect among people with an immigrant background is not severely challenged by potentially existing but unaccounted-for studies. As a complement to our discussion of publication bias, the following moderator analyses included publication status (published versus unpublished) as one possibly influential factor.

Because the effect sizes were significantly heterogeneous, we inspected the influence of factors with theoretical relevance. Before turning to these conceptually relevant aspects, differences between published and unpublished studies are addressed. More than half of our studies were unpublished, so it seemed warranted to contrast published with unpublished effects. This result holds with or without the study by Berjot et al.
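The heterogeneity check and a categorical moderator contrast such as published versus unpublished can be sketched with the Q statistic. This is a minimal illustration assuming a fixed-effect partition of Q into within-group and between-group parts; the function names and the example numbers in the comments are hypothetical and are not taken from the studies analyzed here.

```python
def weighted_mean(effects, weights):
    """Weighted mean of effect sizes (weights are inverse variances)."""
    return sum(w * g for g, w in zip(effects, weights)) / sum(weights)


def q_statistic(effects, weights):
    """Homogeneity statistic Q: weighted squared deviations from the weighted mean.

    Under homogeneity, Q is compared against a chi-square distribution
    with len(effects) - 1 degrees of freedom.
    """
    mean = weighted_mean(effects, weights)
    return sum(w * (g - mean) ** 2 for g, w in zip(effects, weights))


def q_between(groups):
    """Between-group Q for a categorical moderator.

    groups -- dict mapping a label to a tuple (effects, weights),
              e.g. {"published": ([-0.6, -0.4], [10, 10]), ...} (made-up values)
    Returns (Q_between, degrees of freedom).
    """
    all_effects = [g for effects, _ in groups.values() for g in effects]
    all_weights = [w for _, weights in groups.values() for w in weights]
    q_total = q_statistic(all_effects, all_weights)
    q_within = sum(q_statistic(e, w) for e, w in groups.values())
    return q_total - q_within, len(groups) - 1
```

A large between-group Q relative to its degrees of freedom would suggest that the moderator accounts for part of the heterogeneity; obtaining a p value requires a chi-square distribution function, which is left out of this sketch.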

One main goal of the meta-analysis was to examine potential differences in effect size between studies conducted with Latino samples in the US and studies conducted with immigrants in Europe, in order to gauge whether stereotype threat is a sufficiently replicated phenomenon with immigrant samples outside the US.