E-book, English, Vol. 3, 336 pages
Schweizer / DiStefano, Principles and Methods of Test Construction
2016
ISBN: 978-1-61334-449-1
Publisher: Hogrefe Publishing
Format: EPUB
Copy protection: ePub Watermark
Standards and Recent Advances
Series: Psychological Assessment – Science and Practice
This latest volume in the series Psychological Assessment – Science
and Practice describes the current state of the art in test development
and construction. The past 10–20 years have seen substantial
advances in the methods used to develop and administer tests. In this
volume many of the world’s leading authorities collate these advances
and provide information about current practices, thus equipping researchers
and students to successfully construct instruments using
the latest standards and techniques. The volume is organized into five
related sections. The first explains the benefits of considering the underlying
theory when designing tests, with a focus on factor analysis
and item response theory in construction. The second section looks at
item format and test presentation. The third discusses model testing
and selection, while the fourth goes into statistical methods to identify
group-specific biases. The final section discusses current topics of
special relevance, such as multi-trait multi-state analyses and development
of screening instruments.
Target audience
For psychologists concerned with assessment, psychometricians, and students.
Authors/Editors
Subject areas
Further information & material
Chapter 3: Using Factor Analysis in Test Construction

Deborah L. Bandalos (Department of Graduate Psychology, James Madison University, Harrisonburg, VA, USA) and Jerusha J. Gerstner (Marriott Corporation, Washington, DC, USA)

Exploratory and confirmatory factor analytic methods are two of the most widely used procedures in the scale development process. For example, in a recent review of 75 articles reporting on the development of new or revised scales in the educational and psychological research literature, 77% used one or both of these methods (Gerstner & Bandalos, 2013). In this chapter we first review the basic concepts of exploratory and confirmatory factor analysis. We then discuss current recommendations for the use of both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) taken from commonly used textbooks and from articles on these topics. In the remainder of this chapter we discuss the conceptual and statistical bases for EFA and CFA, practical considerations in using the methods, and ways in which both EFA and CFA can be used to improve the scale development process. Finally, we illustrate these topics with an example.

Review of Basic Concepts

Factor analysis, both exploratory and confirmatory, allows researchers to explicate a model that underlies the observed correlations or covariances among scores on a set of variables. In essence, these methods answer the question, “Why are these scores correlated in the way we observe?” In factor analysis, the answer to this question is that the scores are correlated due to a common cause: the factor(s). The diagram in Figure 3.1 represents such a model in which there are six item (or variable) scores (in this chapter, we use the terms item and variable somewhat interchangeably) labeled X1–X6; two factors, F1 and F2; and six error terms, one for each item score. The two factors are correlated, as indicated by the curved arrow between them.
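The two-factor structure just described can be made concrete with a small simulation. The sketch below (Python with NumPy; all loadings, the factor correlation, and the sample size are illustrative values chosen by us, not taken from the chapter) generates scores on six items from two correlated factors and shows that items sharing a factor correlate more strongly than items on different factors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical loading pattern for a Figure 3.1-style model:
# X1-X3 load on F1, X4-X6 load on F2 (illustrative values).
loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
])
phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])  # correlated factors (the curved arrow)

# Draw correlated factor scores, then add unique (error) variance
# scaled so that every item has unit variance.
factors = rng.multivariate_normal([0.0, 0.0], phi, size=n)
unique_sd = np.sqrt(1 - (loadings ** 2).sum(axis=1))
items = factors @ loadings.T + rng.normal(size=(n, 6)) * unique_sd

corr = np.corrcoef(items, rowvar=False)
print(np.round(corr, 2))
```

Under this model the implied correlation between X1 and X2 is 0.8 × 0.7 = 0.56, while the cross-factor correlation between X1 and X4 is attenuated by the factor correlation: 0.8 × 0.4 × 0.8 ≈ 0.26.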
Note that single-headed arrows run from the factors to the item scores (Xs). This means that, according to the factor analytic model, variation in the factors results in variation in the item scores. For example, if the item scores were measures of respondents’ underlying levels of depression and the factors represented two aspects of depression, such as somatic and physical complaints, an increase (or decrease) in level of depression is posited to result in a corresponding increase (or decrease) in the level of the item response. This is, in fact, why researchers take an interest in answers to items on a scale; they are thought to be a window into respondents’ levels of some attribute of interest. The same is the case with achievement or aptitude tests, or any measurements that are used in an attempt to capture an unobservable entity, or construct, such as creativity or motivation.

Figure 3.1. Hypothetical two-factor model.

There is an additional influence on each of the items: an error component. This component is a residual in the sense that it captures the variation in an item score that is not explained through the item’s relationship with the factor. In EFA, the residual components are typically called uniquenesses, because they represent the unique, or idiosyncratic, variation in the score. Although well-developed items of the same construct should be highly correlated, they should not be perfectly correlated. Thus even items developed to measure the same construct will have some unique variation, because items are typically written to tap into slightly different aspects of the construct of interest. This results in some lack of correlation among the items, which is reflected in the error, or uniqueness, component. Having said this, the uniqueness also captures random error, or lack of reliability, in the items. The residual component is therefore a combination of specific, or unique, variance and random error.
A factor model such as that depicted in Figure 3.1 gives rise to the mathematical model for factor analysis:

X_iv = Σ_f w_vf F_fi + w_vu U_iv    (1)

where X_iv is the score of person i on variable v, w_vf is the weight, or loading, of variable v on factor f (i.e., the factor loading), F_fi is the score on factor f of person i, w_vu is the weight of variable v on the error component, or uniqueness, and U_iv is the score of person i on the unique factor for variable v. The same equation holds for CFA, although the terms w, F, and U are usually replaced with λ, ξ (or η), and δ (or ε). Equation 1 represents the score on an item (X) of one person. However, as we noted previously, factor analysis is a methodology for studying the relationships among variable scores; thus, individuals’ scores on the Xs are collected across all items and respondents, and the relationships between variables are summarized for the set of respondents in the form of a correlation or covariance matrix. This correlation or covariance matrix of the variable scores is typically analyzed instead of the raw data. The model of Equation 1 results in the following matrix equation:

Σ_x = Λ Φ Λ′ + Θ_δ    (2)

Here, Σ_x is the matrix of correlations or covariances among the items, Λ is a v by f matrix of factor loadings, Φ is an f by f matrix of correlations or covariances among the factors, and Θ_δ is a v by v matrix of the variables’ unique variances. As can be seen from Figure 3.1, the uniquenesses are assumed to be uncorrelated, making Θ_δ a diagonal matrix. From Equation 2 it can be seen that the correlations or covariances among item scores can be decomposed into components due to the factor loadings, represented by Λ, to the correlations or covariances among the factors, Φ, and to error (Θ_δ). Factor analysis can thus be seen as a method of modeling the covariation among a set of observed variable scores as a function of one or more latent variables or factors, the correlations among these factors, and error.
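The matrix decomposition in Equation 2 can be checked numerically. The NumPy sketch below (loading and factor-correlation values are our illustrative choices, not the chapter’s) assembles the model-implied matrix Σ_x from Λ, Φ, and a diagonal Θ_δ, with uniquenesses chosen so that each item has unit variance.

```python
import numpy as np

# Equation 2 in matrix form: Sigma_x = Lambda @ Phi @ Lambda' + Theta_delta.
# Illustrative values: six items, two factors with simple structure.
Lambda = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
])
Phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])

# Diagonal Theta_delta (uncorrelated uniquenesses), scaled so each
# item's total variance is 1: uniqueness = 1 - communality.
Theta = np.diag(1 - (Lambda ** 2).sum(axis=1))

Sigma = Lambda @ Phi @ Lambda.T + Theta
print(np.round(Sigma, 3))
```

The diagonal of Σ_x recovers the unit item variances, and each off-diagonal entry is the product of the two loadings involved, attenuated by the factor correlation when the items belong to different factors.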
Here, we use the term latent variable to refer to an unobservable but theoretically defensible entity, such as intelligence, self-efficacy, or creativity. These variables are considered to be latent in the sense that they are not directly observable (see Bollen, 2002, for a more detailed discussion of latent variables). The purpose of factor analysis is to assist researchers in identifying and/or understanding the nature of the latent variables underlying the items of interest. Technically, these descriptions exclude component analysis, which is a method for reducing the dimensionality of a set of observed variables through the creation of an optimum number of weighted composites. In this chapter, we confine discussion to factor, and not component, models.

Differentiating EFA and CFA

As their names imply, one difference between exploratory and confirmatory factor analyses lies in the manner in which they are used. However, although the exploratory/confirmatory distinction is often treated as a dichotomy, in practice it is more of a continuum. Any statistical analysis can be used in a manner that is more exploratory or more confirmatory, and despite their nomenclature, the same is true of EFA and CFA. For example, researchers employing EFA often have strong theory to support a particular factor model, with hypotheses about the number of factors, the variables that should load on these, and even the level of correlation among the factors. On the other hand, a researcher may have limited or conflicting theory or evidence to support a particular factor model. The first situation represents a more confirmatory use of EFA and the second a more exploratory use. Having said this, CFA does represent a more confirmatory approach because it allows researchers to specify, or restrict, more aspects of the model than in EFA.
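To illustrate the exploratory end of this continuum, the sketch below fits a two-factor EFA using scikit-learn’s FactorAnalysis on simulated data. The data-generating values, the sample size, and the choice of varimax rotation are our assumptions for the example; the point is that only the number of factors is specified, and the loading pattern, including any cross-loadings, is estimated freely rather than restricted in advance.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n = 5000

# Simulate six items from two uncorrelated factors (illustrative loadings).
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
])
factors = rng.normal(size=(n, 2))
unique_sd = np.sqrt(1 - (true_loadings ** 2).sum(axis=1))
X = factors @ true_loadings.T + rng.normal(size=(n, 6)) * unique_sd

# Exploratory fit: only n_components is fixed; no variable is forced
# to have a zero loading on either factor.
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
loadings = fa.components_.T  # 6 items x 2 factors
print(np.round(loadings, 2))
```

Because nothing is constrained, an item that unexpectedly loaded on both factors would show up directly in this loading matrix, which is the kind of diagnostic visibility the chapter attributes to EFA output.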
In addition, CFA more easily allows for tests of the hypothesized model, both as a whole and of specific parameters, such as factor loadings or correlations/covariances. However, it is precisely because of the restrictive nature of the typical CFA model that it may be best to use EFA for situations in which little previous research and/or theory exists to guide specification of the factor model. This is because specification of a CFA model requires the researcher to indicate not only which variables load on each factor, but also which do not. Typically, a simplified model structure in which variables load only on one factor is specified. If such a model does not hold, it can result in a serious misfit of the model to the data. In such situations the researcher would likely attempt to determine a more appropriate structure. Within the CFA approach there are so many options for respecification of parameters that the researcher could easily be led down an unprofitable path. Particularly during the beginning stages of scale construction, variables often display an annoying tendency to load onto more than one factor, or to fail to load on the factor for which they are written. Such tendencies can be difficult to detect from the output of a typical CFA, but are readily apparent from any EFA output. A final reason to use EFA in the beginning stages of scale construction is that the exploration of different models, as is recommended under the EFA paradigm, may reveal interesting and conceptually plausible structures that can help the researcher to more thoroughly explore the nature of the construct of interest. As a general guideline, therefore, we recommend that EFA be used for situations...