Not to be confused with psychrometrics, the measurement of the heat and water vapor properties of air.
For other uses of this term and similar terms, see Psychometry (disambiguation).

Psychometrics is the field of study concerned with the theory and technique of psychological measurement. One part of the field is concerned with the objective measurement of skills and knowledge, abilities, attitudes, personality traits, and educational achievement. For example, psychometric research has concerned itself with the construction and validation of assessment instruments such as questionnaires, tests, raters' judgments, and personality tests. Another part of the field is concerned with statistical research bearing on measurement theory (e.g., item response theory; intraclass correlation).

Thus psychometrics involves two major research tasks: (i) the construction of instruments and procedures for measurement; and (ii) the development and refinement of theoretical approaches to measurement. Those who practice psychometrics are known as psychometricians. All psychometricians possess a specific psychometric qualification, and most are psychologists with advanced graduate training in psychometric testing. Many work in human resources departments. Others specialize as learning and development professionals.

19th century foundation

Psychological testing has come from two streams of thought: the first, from Darwin, Galton, and Cattell, on the measurement of individual differences, and the second, from Herbart, Weber, Fechner, and Wundt, on psychophysical measurements of a similar construct. The research of the second group led to the development of experimental psychology and standardized testing.[1]

Victorian stream

Charles Darwin was the inspiration behind Sir Francis Galton, whose work led to the creation of psychometrics. In 1859, Darwin published his book "On the Origin of Species", which dealt with individual differences in animals. The book discussed how individual members of a species differ and how some possess characteristics that are more adaptive and successful while others possess characteristics that are less so. Those who are adaptive and successful are the ones that survive and give way to the next generation, who would be just as adaptive and successful, or more so. This idea, studied previously in animals, led to Galton's interest in human beings, in how they differ from one another and, more importantly, in how to measure those differences.

Galton wrote a book entitled "Hereditary Genius" about different characteristics that people possess and how those characteristics make them more "fit" than others. Today these differences, such as sensory and motor functioning (reaction time, visual acuity, and physical strength), are important domains of scientific psychology. Much of the early theoretical and applied work in psychometrics was undertaken in an attempt to measure intelligence. Francis Galton, often referred to as "the father of psychometrics," devised mental tests and included them among his anthropometric measures. James McKeen Cattell, who is considered a pioneer of psychometrics, went on to extend Galton's work. Cattell also coined the term mental test, and is responsible for the research and knowledge that ultimately led to the development of modern tests (Kaplan & Saccuzzo, 2010).

German stream

The origin of psychometrics also has connections to the related field of psychophysics. Around the same time that Darwin, Galton, and Cattell were making their discoveries, J.E. Herbart was also interested in "unlocking the mysteries of human consciousness" through the scientific method (Kaplan & Saccuzzo, 2010). Herbart was responsible for creating mathematical models of the mind, which were influential in educational practices in years to come.

E.H. Weber built upon Herbart's work and tried to prove the existence of a psychological threshold, holding that a minimum stimulus was necessary to activate a sensory system. G.T. Fechner then expanded upon the knowledge he gleaned from Herbart and Weber to devise the law that the strength of a sensation grows as the logarithm of the stimulus intensity. A follower of Weber and Fechner, Wilhelm Wundt is credited with founding the science of psychology. It is Wundt's influence that paved the way for others to develop psychological testing.[1]
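
In its usual textbook form, Fechner's logarithmic law can be written as below. The symbols are notation assumed here for illustration and do not come from the text above: S is the strength of the sensation, I is the stimulus intensity, I_0 is the threshold intensity below which no sensation arises, and k is a constant that depends on the sensory modality.

```latex
S = k \,\ln\!\left(\frac{I}{I_0}\right), \qquad I \ge I_0
```

Under this form, multiplying the stimulus intensity by a fixed factor always adds the same fixed amount to the sensation, which is what "grows as the logarithm" means in practice.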

20th century

The psychometrician L. L. Thurstone, founder and first president of the Psychometric Society in 1936, developed and applied a theoretical approach to measurement referred to as the law of comparative judgment, an approach that has close connections to the psychophysical theory of Ernst Heinrich Weber and Gustav Fechner. In addition, Spearman and Thurstone both made important contributions to the theory and application of factor analysis, a statistical method developed and used extensively in psychometrics.[citation needed] In the late 1950s, Leopold Szondi made a historical and epistemological assessment of the impact of statistical thinking on psychology during the previous few decades: "in the last decades, the specifically psychological thinking has been almost completely suppressed and removed, and replaced by a statistical thinking. Precisely here we see the cancer of testology and testomania of today."[2]

More recently, psychometric theory has been applied in the measurement of personality, attitudes, beliefs, and academic achievement. Measurement of these unobservable phenomena is difficult, and much of the research and accumulated science in this discipline has been developed in an attempt to properly define and quantify such phenomena. Critics, including practitioners in the physical sciences and social activists, have argued that such definition and quantification are impossibly difficult, and that such measurements are often misused, as with psychometric personality tests used in employment procedures:

"For example, an employer wanting someone for a role requiring consistent attention to repetitive detail will probably not want to give that job to someone who is very creative and gets bored easily."[3]

Figures who made significant contributions to psychometrics include Karl Pearson, Henry F. Kaiser, Carl Brigham, L. L. Thurstone, Georg Rasch, Eugene Galanter, Johnson O'Connor, Frederic M. Lord, Ledyard R Tucker, Arthur Jensen, and David Andrich.

Definition of measurement in the social sciences

The definition of measurement in the social sciences has a long history. A currently widespread definition, proposed by Stanley Smith Stevens (1946), is that measurement is "the assignment of numerals to objects or events according to some rule." This definition was introduced in the paper in which Stevens proposed four levels of measurement. Although widely adopted, this definition differs in important respects from the more classical definition of measurement adopted in the physical sciences, namely that scientific measurement entails "the estimation or discovery of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute" (p. 358).[4]

Indeed, Stevens's definition of measurement was put forward in response to the British Ferguson Committee, whose chair, A. Ferguson, was a physicist. The committee was appointed in 1932 by the British Association for the Advancement of Science to investigate the possibility of quantitatively estimating sensory events. Although its chair and other members were physicists, the committee also included several psychologists. The committee's report highlighted the importance of the definition of measurement. While Stevens's response was to propose a new definition, which has had considerable influence in the field, this was by no means the only response to the report. Another, notably different, response was to accept the classical definition, as reflected in the following statement:

Measurement in psychology and physics are in no sense different. Physicists can measure when they can find the operations by which they may meet the necessary criteria; psychologists have but to do the same. They need not worry about the mysterious differences between the meaning of measurement in the two sciences. (Reese, 1943, p. 49)

These divergent responses are reflected in alternative approaches to measurement. For example, methods based on covariance matrices are typically employed on the premise that numbers, such as raw scores derived from assessments, are measurements. Such approaches implicitly entail Stevens's definition of measurement, which requires only that numbers are assigned according to some rule. The main research task, then, is generally considered to be the discovery of associations between scores, and of factors posited to underlie such associations.

On the other hand, when measurement models such as the Rasch model are employed, numbers are not assigned based on a rule. Instead, in keeping with Reese's statement above, specific criteria for measurement are stated, and the goal is to construct procedures or operations that provide data that meet the relevant criteria. Measurements are estimated based on the models, and tests are conducted to ascertain whether the relevant criteria have been met.
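
As a minimal sketch of what such a model looks like, the dichotomous Rasch model is commonly written as a logistic function of the difference between a person's location and an item's location on the same scale. The function name and the parameter names theta (person ability) and b (item difficulty) are illustrative assumptions, not notation taken from this article.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct (or endorsed) response under the
    dichotomous Rasch model.

    theta: person location (ability) on the latent scale (assumed name)
    b:     item location (difficulty) on the same scale (assumed name)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# When ability equals difficulty, success and failure are equally likely.
p_matched = rasch_probability(theta=0.0, b=0.0)

# An able person facing an easy item is more likely than not to succeed.
p_easy_item = rasch_probability(theta=1.0, b=-1.0)
```

Because only the difference theta - b enters the formula, persons and items are placed on one common scale. Whether observed data actually fit this form is what the tests of fit mentioned above are meant to check.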

Instruments and procedures

The first psychometric instruments were designed to measure the concept of intelligence. The best known historical approach involved the Stanford-Binet IQ test, developed originally by the French psychologist Alfred Binet. Intelligence tests are useful tools for various purposes. An alternative conception of intelligence is that cognitive capacities within individuals are a manifestation of a general component, or general intelligence factor, as well as cognitive capacity specific to a given domain.

Psychometrics is applied widely in educational assessment to measure abilities in domains such as reading, writing, and mathematics. The main approaches in applying tests in these domains have been Classical Test Theory and the more recent Item Response Theory and Rasch measurement models. These latter approaches permit joint scaling of persons and assessment items, which provides a basis for mapping of developmental continua by allowing descriptions of the skills displayed at various points along a continuum. Such approaches provide powerful information regarding the nature of developmental growth within various domains.

Another major focus in psychometrics has been on personality testing. There has been a range of theoretical approaches to conceptualizing and measuring personality. Some of the better known instruments include the Minnesota Multiphasic Personality Inventory, the Five-Factor Model (or "Big 5"), and tools such as the Personality and Preference Inventory and the Myers-Briggs Type Indicator. Attitudes have also been studied extensively using psychometric approaches. A common method in the measurement of attitudes is the use of the Likert scale. An alternative method involves the application of unfolding measurement models, the most general being the Hyperbolic Cosine Model (Andrich & Luo, 1993).

Theoretical approaches

Psychometricians have developed a number of different measurement theories. These include classical test theory (CTT) and item response theory (IRT).[5][6] An approach that is mathematically similar to IRT, but quite distinctive in its origins and features, is represented by the Rasch model for measurement. The development of the Rasch model, and the broader class of models to which it belongs, was explicitly founded on requirements of measurement in the physical sciences.[7]

Psychometricians have also developed methods for working with large matrices of correlations and covariances. Techniques in this general tradition include: factor analysis,[8] a method of determining the underlying dimensions of data; multidimensional scaling,[9] a method for finding a simple representation for data with a large number of latent dimensions; and data clustering, an approach to finding objects that are like each other. All these multivariate descriptive methods try to distill large amounts of data into simpler structures. More recently, structural equation modeling[10] and path analysis represent more sophisticated approaches to working with large covariance matrices. These methods allow statistically sophisticated models to be fitted to data and tested to determine if they are adequate fits.

One of the main deficiencies in various factor analyses is a lack of consensus on cutting points for determining the number of latent factors. A usual procedure is to stop factoring when eigenvalues drop below one because the original sphere shrinks. The lack of agreed cutting points also affects other multivariate methods.[citation needed]

Key concepts

Key concepts in classical test theory are reliability and validity. A reliable measure is one that measures a construct consistently across time, individuals, and situations. A valid measure is one that measures what it is intended to measure. Reliability is necessary, but not sufficient, for validity.

Both reliability and validity can be assessed statistically. Consistency over repeated measures of the same test can be assessed with the Pearson correlation coefficient, and is often called test-retest reliability.[11] Similarly, the equivalence of different versions of the same measure can be indexed by a Pearson correlation, and is called equivalent forms reliability or a similar term.[11]
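
The following is a minimal sketch of how test-retest reliability might be computed as a Pearson correlation between two administrations of the same test. The function name pearson_r and the score lists are hypothetical and invented purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists.

    Raises ValueError for mismatched or too-short input; a constant list
    (zero variance) would cause a division by zero and is not handled here.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of five people on the same test at two occasions.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

test_retest = pearson_r(first_administration, second_administration)
```

The same function would serve for equivalent forms reliability by passing scores from two versions of the measure instead of two occasions.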

Internal consistency, which addresses the homogeneity of a single test form, may be assessed by correlating performance on two halves of a test, which is termed split-half reliability; the value of this Pearson product-moment correlation coefficient for two half-tests is adjusted with the Spearman–Brown prediction formula to correspond to the correlation between two full-length tests.[11] Perhaps the most commonly used index of reliability is Cronbach's α, which is equivalent to the mean of all possible split-half coefficients. Other approaches include the intra-class correlation, which is the ratio of variance of measurements of a given target to the variance of all targets.
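
The indices above can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the function names, the odd/even item split, and the response matrix are all hypothetical, and Cronbach's α is computed with the common variance-based formula α = k/(k-1) · (1 - Σ item variances / variance of total scores).

```python
import math
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(scores):
    """Correlate odd-item and even-item half scores, then adjust with
    the Spearman-Brown formula. scores: one row of item scores per person."""
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    return spearman_brown(pearson_r(odd_half, even_half))


def cronbach_alpha(scores):
    """Cronbach's alpha from a persons-by-items score matrix."""
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses: five people, four items scored 1 to 5.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
    [4, 4, 4, 3],
]

split_half = split_half_reliability(responses)
alpha = cronbach_alpha(responses)
```

The odd/even split is only one of many possible ways to halve a test, which is part of why an index such as α, which does not depend on a particular split, is often preferred.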

There are a number of different forms of validity. Criterion-related validity can be assessed by correlating a measure with a criterion measure theoretically expected to be related. When the criterion measure is collected at the same time as the measure being validated, the goal is to establish concurrent validity; when the criterion is collected later, the goal is to establish predictive validity. A measure has construct validity if it is related to measures of other constructs as required by theory. Content validity is a demonstration that the items of a test do an adequate job of covering the domain being measured. In a personnel selection example, test content is based on a defined statement or set of statements of knowledge, skill, ability, or other characteristics obtained from a job analysis.

Item response theory models the relationship between latent traits and responses to test items. Among other advantages, IRT provides a basis for obtaining an estimate of the location of a test-taker on a given latent trait as well as the standard error of measurement of that location. For example, a university student's knowledge of history can be deduced from his or her score on a university test and then be compared reliably with a high school student's knowledge deduced from a less difficult test. Scores derived by classical test theory do not have this characteristic, and actual ability (rather than ability relative to other test-takers) must be assessed by comparing scores to those of a "norm group" randomly selected from the population. In fact, all measures derived from classical test theory are dependent on the sample tested, while, in principle, those derived from item response theory are not.
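
To make the idea of a location estimate with a standard error concrete, here is a hedged sketch that uses the one-parameter (Rasch) form of IRT with item difficulties assumed to be already known. It finds the maximum-likelihood ability by Newton-Raphson iteration and reports the standard error as one over the square root of the test information. All names, the starting value, and the difficulty values are assumptions made for illustration.

```python
import math


def p_correct(theta, b):
    """Rasch probability of a correct response (ability theta, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, difficulties, max_iter=50, tol=1e-8):
    """Maximum-likelihood ability estimate and its standard error.

    responses:    list of 0/1 item scores
    difficulties: known item difficulties on the same scale (assumed given)
    Returns (theta, standard_error).
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        # The likelihood has no finite maximum for these response patterns.
        raise ValueError("no finite ML estimate for all-0 or all-1 responses")

    theta = 0.0  # arbitrary starting value
    for _ in range(max_iter):
        probs = [p_correct(theta, b) for b in difficulties]
        gradient = raw_score - sum(probs)                # d logL / d theta
        information = sum(p * (1.0 - p) for p in probs)  # Fisher information
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break

    # Standard error evaluated at the final estimate.
    information = sum(
        p_correct(theta, b) * (1.0 - p_correct(theta, b)) for b in difficulties
    )
    return theta, 1.0 / math.sqrt(information)


# Hypothetical example: two people answer tests of different difficulty,
# yet both estimates are expressed on the same latent scale.
hard_test = [0.5, 1.0, 1.5, 2.0]
easy_test = [-2.0, -1.5, -1.0, -0.5]

theta_hard, se_hard = estimate_ability([1, 1, 0, 0], hard_test)
theta_easy, se_easy = estimate_ability([1, 1, 0, 0], easy_test)
```

In this sketch the same raw score of 2 out of 4 yields a higher ability estimate on the harder test than on the easier one, which is the sense in which results from tests of differing difficulty can be compared on one scale, provided the item difficulties themselves have been calibrated on that scale.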

Standards of quality

The considerations of validity and reliability typically are viewed as essential elements for determining the quality of any test. However, professional and practitioner associations frequently have placed these concerns within broader contexts when developing standards and making overall judgments about the quality of any test as a whole within a given context. A consideration of concern in many applied research settings is whether or not the metric of a given psychological inventory is meaningful or arbitrary.[12]

Testing standards

In this field, the Standards for Educational and Psychological Testing[13] place standards about validity and reliability, along with errors of measurement and related considerations under the general topic of test construction, evaluation and documentation. The second major topic covers standards related to fairness in testing, including fairness in testing and test use, the rights and responsibilities of test takers, testing individuals of diverse linguistic backgrounds, and testing individuals with disabilities. The third and final major topic covers standards related to testing applications, including the responsibilities of test users, psychological testing and assessment, educational testing and assessment, testing in employment and credentialing, plus testing in program evaluation and public policy.

Evaluation standards

In the field of evaluation, and in particular educational evaluation, the Joint Committee on Standards for Educational Evaluation[14] has published three sets of standards for evaluations. The Personnel Evaluation Standards[15] was published in 1988, The Program Evaluation Standards (2nd edition)[16] was published in 1994, and The Student Evaluation Standards[17] was published in 2003.

Each publication presents and elaborates a set of standards for use in a variety of educational settings. The standards provide guidelines for designing, implementing, assessing and improving the identified form of evaluation.[18] Each of the standards has been placed in one of four fundamental categories to promote educational evaluations that are proper, useful, feasible, and accurate. In these sets of standards, validity and reliability considerations are covered under the accuracy topic. For example, the student accuracy standards help ensure that student evaluations will provide sound, accurate, and credible information about student learning and performance.

Non-human: animals and machines

Psychometrics addresses human abilities, attitudes, traits, and educational evolution. Notably, the study of the behavior, mental processes, and abilities of non-human animals is usually addressed by comparative psychology or, when a continuum between non-human animals and the rest of animals is assumed, by evolutionary psychology. Nonetheless, there are some advocates for a more gradual transition between the approach taken for humans and the approach taken for (non-human) animals.[19][20][21][22]

The evaluation of abilities, traits and learning evolution of machines has been mostly unrelated to the case of humans and non-human animals, with specific approaches in the area of artificial intelligence. A more integrated approach, under the name of universal psychometrics, has also been proposed.[23]

Bibliography

  • Andrich, D. & Luo, G. (1993). "A hyperbolic cosine model for unfolding dichotomous single-stimulus responses". Applied Psychological Measurement 17 (3): 253–276. doi:10.1177/014662169301700307. 
  • Michell, J. B (1997). "Quantitative science and the definition of measurement in psychology". British Journal of Psychology 88 (3): 355–383. doi:10.1111/j.2044-8295.1997.tb02641.x. 
  • Michell, J. (1999). Measurement in Psychology. Cambridge: Cambridge University Press.
  • Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen, Danish Institute for Educational Research, expanded edition (1980) with foreword and afterword by B.D. Wright. Chicago: The University of Chicago Press.
  • Reese, T.W. (1943). "The application of the theory of physical measurement to the measurement of psychological magnitudes, with three experimental examples". Psychological Monographs 55: 1–89. 
  • Stevens, S. S. (1946). "On the theory of scales of measurement". Science 103 (2684): 677–80. doi:10.1126/science.103.2684.677. PMID 17750512. 
  • Thurstone, L.L. (1927). "A law of comparative judgement". Psychological Review 34 (4): 278–286. doi:10.1037/h0070288. 
  • Thurstone, L.L. (1929). The Measurement of Psychological Value. In T.V. Smith and W.K. Wright (Eds.), Essays in Philosophy by Seventeen Doctors of Philosophy of the University of Chicago. Chicago: Open Court.
  • Thurstone, L.L. (1959). The Measurement of Values. Chicago: The University of Chicago Press.
  • Psychometric Assessments. University of Melbourne. http://www.services.unimelb.edu.au/careers/student/interviews/test.html
  • S.F. Blinkhorn (1997). "Past imperfect, future conditional: fifty years of test theory". Br. J. Math. Statist. Psychol 50 (2): 175–185. doi:10.1111/j.2044-8317.1997.tb01139.x. 

Notes

  1. ^ a b Kaplan, R.M., & Saccuzzo, D.P. (2010). Psychological Testing: Principles, Applications, and Issues. (8th ed.). Belmont, CA: Wadsworth, Cengage Learning.
  2. ^ Leopold Szondi (1960) Das zweite Buch: Lehrbuch der Experimentellen Triebdiagnostik. Huber, Bern und Stuttgart, 2nd edition. Ch.27, From the Spanish translation, B)II Las condiciones estadisticas, p.396. Quotation:

    el pensamiento psicologico especifico, en las ultima decadas, fue suprimido y eliminado casi totalmente, siendo sustituido por un pensamiento estadistico. Precisamente aqui vemos el cáncer de la testología y testomania de hoy.

  3. ^ Psychometric Assessments. University of Melbourne.
  4. ^ Michell, J. (1997). Quantitative science and the definition of measurement in psychology. British Journal Of Psychology, 88(3), 355-383. doi:10.1111/j.2044-8295.1997.tb02641.x
  5. ^ Embretson, S.E., & Reise, S.P. (2000). Item Response Theory for Psychologists. Mahwah, NJ: Erlbaum.
  6. ^ Hambleton, R.K., & Swaminathan, H. (1985). Item Response Theory: Principles and Applications. Boston: Kluwer-Nijhoff.
  7. ^ Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen, Danish Institute for Educational Research, expanded edition (1980) with foreword and afterword by B.D. Wright. Chicago: The University of Chicago Press.
  8. ^ Thompson, B.R. (2004). Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications. American Psychological Association.
  9. ^ Davison, M.L. (1992). Multidimensional Scaling. Krieger.
  10. ^ Kaplan, D. (2008). Structural Equation Modeling: Foundations and Extensions, 2nd ed. Sage.
  11. ^ a b c Reliability definitions at the University of Connecticut
  12. ^ Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1), 27-41.
  13. ^ The Standards for Educational and Psychological Testing
  14. ^ Joint Committee on Standards for Educational Evaluation
  15. ^ Joint Committee on Standards for Educational Evaluation. (1988). The Personnel Evaluation Standards: How to Assess Systems for Evaluating Educators. Newbury Park, CA: Sage Publications.
  16. ^ Joint Committee on Standards for Educational Evaluation. (1994). The Program Evaluation Standards, 2nd Edition. Newbury Park, CA: Sage Publications.
  17. ^ Joint Committee on Standards for Educational Evaluation. (2003). The Student Evaluation Standards: How to Improve Evaluations of Students. Newbury Park, CA: Corwin Press.
  18. ^ Author guidelines for reporting scale development and validation results in the Journal of the Society for Social Work and Research
  19. ^ Humphreys, L.G. (1987). "Psychometrics considerations in the evaluation of intraspecies differences in intelligence". Behav Brain Sci 10: 668–669. doi:10.1017/s0140525x0005514x. 
  20. ^ Eysenck, H.J. (1987). "The several meanings of intelligence". Behav Brain Sci 10: 663. doi:10.1017/s0140525x00055060. 
  21. ^ Locurto, C. and Scanlon, C (1987). "Individual differences and spatial learning factor in two strains of mice". Behav Brain Sci 112: 344–352. 
  22. ^ King, James E and Figueredo, Aurelio Jose (1997). "The five-factor model plus dominance in chimpanzee personality". Journal of research in personality 31 (2): 257–271. doi:10.1006/jrpe.1997.2179. 
  23. ^ J. Hernández-Orallo, D.L. Dowe, M.V. Hernández-Lloreda (2013). "Universal Psychometrics: Measuring Cognitive Abilities in the Machine Kingdom". Cognitive Systems Research. 

Original courtesy of Wikipedia: http://en.wikipedia.org/wiki/Psychometrics — Please support Wikipedia.
This page uses Creative Commons Licensed content from Wikipedia. A portion of the proceeds from advertising on Digplanet goes to supporting Wikipedia.