Test Development and Analysis

How do developers make a test a good test?

Developers make a test good by establishing certain properties, including reliability, validity, generalizability, standardization, and sound norming. A test is also better when test users can administer, score, and interpret it easily, and ideally when it is economical in time and money. A reliable test is one that gives precise and consistent results. For example, if a test says you’re an introvert the first time you take it, it should say the same thing the next time you take it; if it says you’re an extrovert the second time, the test would be considered unreliable. A test is considered valid if it actually measures what it purports to measure.
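
The idea of consistency across administrations can be illustrated with a test-retest correlation. The following is only a minimal sketch: the scores are made-up illustration data, the names `pearson_r`, `first_administration`, and `second_administration` are hypothetical, and the Pearson correlation is assumed here as one common way to quantify test-retest reliability.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cross / (spread_x * spread_y)


# Hypothetical scores for the same five people on two occasions
first_administration = [12, 18, 25, 30, 22]
second_administration = [13, 17, 26, 29, 23]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))
```

A coefficient close to 1 would suggest that people keep roughly the same relative standing from one administration to the next, which is what a reliable test is expected to show.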

When a researcher develops a test, he or she should consider the inferences that can reasonably be made from administering it. Developers also standardize the test, meaning that the same procedures for administering and scoring it are established and applied under the same conditions, and they administer it to a representative sample of test-takers to establish norms. Established norms are necessary for a test to be good. Norming can be “modified to describe norm derivation” (e.g., race norming), and norms can be classified in different ways, such as age norms, grade norms, national anchor norms, local norms, norms from a fixed reference group, subgroup norms, and percentile norms (Cohen & Swerdlik, 2018, pp. 126-132).
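
As a rough illustration of percentile norms, a raw score can be located within the scores of a norm sample. This is only a sketch under stated assumptions: the norm sample is invented, the name `percentile_rank` is hypothetical, and the percentile is taken here as the percentage of the norm sample scoring below the raw score, which is one common definition among several.

```python
def percentile_rank(norm_scores, raw_score):
    """Percentage of the norm sample that scored below raw_score."""
    if not norm_scores:
        raise ValueError("norm sample must not be empty")
    # Count only scores strictly below the raw score (assumed definition)
    below = sum(1 for score in norm_scores if score < raw_score)
    return 100.0 * below / len(norm_scores)


# Hypothetical raw scores from a small norm sample
norm_sample = [10, 12, 15, 15, 18, 20, 22, 25, 27, 30]

print(percentile_rank(norm_sample, 22))  # 6 of 10 scores fall below 22
```

In practice a norm sample would be much larger and representative of the intended test-takers; the small list above only shows how a raw score is converted into a position relative to the group.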

Cohen, R. J., & Swerdlik, M. E. (2018). Psychological testing and assessment (9th ed.). Boston, MA: McGraw-Hill Education.

Yildirim, K., & Yenipinar, S. (2017). Psychological unsafety in schools: The development and validation of a scale. Journal of Education and Training Studies, 5(6), 167-176.


