Concept inventories

A concept inventory is a criterion-referenced test designed to evaluate whether a student has an accurate working knowledge of a specific set of concepts. To ensure interpretability, it is common to have multiple items that address a single idea. Typically, concept inventories are organized as multiple-choice tests in order to ensure that they are scored in a reproducible manner, a feature that also facilitates administration in large classes. Unlike a typical, teacher-made multiple-choice test, questions and response choices on concept inventories are the subject of extensive research. The aims of the research include ascertaining (a) the range of what individuals think a particular question is asking and (b) the most common responses to the questions. Concept inventories are evaluated to ensure test reliability and validity. In its final form, each question includes one correct answer and several distractors. The distractors are incorrect answers that are usually (but not always) based on students’ commonly held misconceptions.

Ideally, a score on a criterion-referenced test reflects the amount of content knowledge a student has mastered. Criterion-referenced tests differ from norm-referenced tests in that (in theory) the former are not used to compare an individual's score to the scores of the group. Ordinarily, the purpose of a criterion-referenced test is to ascertain whether a student has mastered a predetermined amount of content knowledge; upon obtaining a test score that is at or above a cutoff score, the student can move on to the body of content knowledge that comes next in a learning sequence. In general, item difficulty values (the percentage of test-takers who answer an item correctly) between 30% and 70% are best able to provide information about student understanding.
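As an illustration of the 30% to 70% guideline, the following is a minimal Python sketch, not a method taken from any particular inventory. It assumes that item difficulty is computed in the classical way, as the proportion of students who answer an item correctly. The function names, the 0/1 score matrix, and the example data are hypothetical.

```python
from typing import List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Return the proportion of students answering each item correctly.

    `responses` is a hypothetical students-by-items matrix of 0/1 scores
    (1 = the correct answer was chosen, 0 = a distractor was chosen).
    """
    if not responses:
        raise ValueError("responses must contain at least one student")
    n_students = len(responses)
    n_items = len(responses[0])
    if any(len(row) != n_items for row in responses):
        raise ValueError("every student must have a score for every item")
    return [sum(row[i] for row in responses) / n_students
            for i in range(n_items)]


def flag_items(difficulties: List[float],
               low: float = 0.30,
               high: float = 0.70) -> List[int]:
    """Return indices of items whose difficulty falls outside [low, high]."""
    return [i for i, p in enumerate(difficulties) if p < low or p > high]


# Illustrative data only: five students, four items.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
]

difficulties = item_difficulty(scores)  # [1.0, 0.6, 0.2, 0.6]
flagged = flag_items(difficulties)      # [0, 2]
print(difficulties, flagged)
```

In this made-up data set, the first item is answered correctly by everyone and the third by only one student in five, so both fall outside the range and would be candidates for review. The thresholds are parameters because the 30% and 70% bounds are a general guideline rather than a fixed rule.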

Distractors are often based on ideas commonly held by students, as determined by years of research on misconceptions. Test developers often research student misconceptions by examining students' responses to open-ended essay questions and conducting "think-aloud" interviews with students. The distractors chosen by students help researchers understand student thinking and give instructors insights into students' prior knowledge (and, sometimes, firmly held beliefs). This foundation in research underlies instrument construction and design, and helps educators obtain clues about students' ideas, scientific misconceptions, and didaskalogenic (that is, teacher-induced) confusions and conceptual lacunae that interfere with learning.

Concept inventories in use
The first concept inventory was developed in 1987. It concerned photosynthesis and respiration in plants. The concept inventory not only used misconceptions as distractors, but also employed two-tiered items. First-tier items ask ‘what happens when ...?’ (which students often know); second-tier items ask ‘why does this happen?’ (which students often don’t know). Tiers of questions based on the distinction between student knowledge of outcome and of mechanism provide an additional source of information for instructors. Hestenes, Halloun, and Wells developed the first of the concept inventories to be widely disseminated, the Force Concept Inventory (FCI). The FCI was designed to assess student understanding of the Newtonian concepts of force. Hestenes (1998) found that while “nearly 80% of the [students completing introductory college physics courses] could state Newton’s Third Law at the beginning of the course … FCI data showed that less than 15% of them fully understood it at the end”. These results have been replicated in a number of studies involving students at a range of institutions (see sources section below), and have led to greater recognition in the physics education research community of the importance of students' "active engagement" with the materials to be mastered.

Since the development of the FCI, other physics instruments have been developed. These include the Force and Motion Conceptual Evaluation developed by Thornton and Sokoloff and the Brief Electricity and Magnetism Assessment developed by Ding et al. For a discussion of how a number of concept inventories were developed, see Beichner. Information about physics concept tests can be found at the NC State Physics Education Research Group website (see the external links below).

Concept inventories have been developed in physics, statistics, chemistry, astronomy, basic biology, natural selection, genetics, engineering, and geoscience. A review of many concept inventories can be found in two papers (by Libarkin and by Reed-Rhoads) commissioned by the National Research Council.

A different type of conceptual assessment has been created by the Thinking Like a Biologist research group at Michigan State University. To date, they have created approximately 80 items exploring students’ understanding of matter and energy, organized into Diagnostic Question Clusters that are available for download. These items are valuable for engaging students in collaborative problem-solving activities in class. Another approach is illustrated by the Biological Concepts Instrument (BCI), a 24-item, multiple-choice, research-based instrument (available online) designed to reveal students' (and teachers') understanding of foundational ideas within the (primarily) molecular biological arena. For example, results from the administration of the BCI indicate that students have difficulty grasping the implications of random processes in biological systems.

In many areas, foundational scientific concepts transcend disciplinary boundaries. An example of an inventory that assesses knowledge of such concepts is an instrument developed by Odom and Barrow (1995) to evaluate understanding of diffusion and osmosis. In addition, there are conceptual instruments that are not multiple-choice, such as the essay-based approach suggested by Wright et al. (1998) and the essay and oral exams used by Nehm and Schonfeld (2008).

Caveats associated with concept inventory use
Some concept inventories are problematic. Some inventories created by scientists do not align with best practices in scale development. Concept inventories created to simply diagnose student ideas may not be viable as research-quality measures of conceptual understanding. Users should be careful to ensure that concept inventories are actually testing conceptual understanding, rather than test-taking ability, English skills, or other factors that can influence test performance.

The use of multiple-choice exams as concept inventories is not without controversy. The very structure of multiple-choice concept inventories raises questions about the extent to which complex and often nuanced situations and ideas must be simplified or clarified to produce unambiguous responses. For example, a multiple-choice exam designed to assess knowledge of key concepts in natural selection does not meet a number of standards of quality control. One problem with the exam is that it contains several pairs of parallel items, each pair designed to measure one key concept in natural selection, and the two items in a pair sometimes have very different levels of difficulty. Another problem is that the multiple-choice exam overestimates knowledge of natural selection relative to student performance on a diagnostic essay exam and a diagnostic oral exam, two instruments with reasonably good construct validity. Although scoring concept inventories in the form of essay or oral exams is labor intensive, costly, and difficult to implement with large numbers of students, such exams can offer a more realistic appraisal of the actual levels of students' conceptual mastery as well as their misconceptions. Recently, however, computer technology has been developed that can score essay responses on concept inventories in biology and other domains (Nehm, Ha, & Mayfield, 2011), promising to facilitate the scoring of concept inventories organized as (transcribed) oral exams as well as essays.