The effect of differential motivation on IRT linking

  1. Abramowitz, M., & Stegun, I. A. (1972). Handbook of mathematical functions. New York, NY: Dover Publications.
  2. Béguin, A. A. (2000). Robustness of equating high-stakes tests (Doctoral dissertation). Twente University, Enschede, The Netherlands.
  3. Béguin, A. A. (2005, July). Bayesian IRT equating with correction for unmotivated respondents on the anchor-test. Paper presented at the meeting of the Psychometric Society, Tilburg, The Netherlands.
  4. Béguin, A. A., & Hanson, B. A. (2001). Effect of noncompensatory multidimensionality on separate and concurrent estimation in IRT observed score equating (Report No. 01–02). Arnhem, The Netherlands: Cito.
  5. Béguin, A. A., & Maan, A. (2007, April). IRT linking of high-stakes tests with a low-stakes anchor. Paper presented at the meeting of the National Council on Measurement in Education (NCME), Chicago, IL.
  6. Bolt, D. M., Cohen, A. S., & Wollack, J. A. (2002). Item parameter estimation under conditions of test speededness: Application of a mixture Rasch model with ordinal constraints. Journal of Educational Measurement, 39, 331–348.
  7. Cook, L. L., & Petersen, N. S. (1987). Problems related to the use of conventional and item response theory equating methods in less than optimal circumstances. Applied Psychological Measurement, 11, 225–244.
  8. Davey, T., Nering, M. L., & Thompson, T. (1997). Realistic simulation of item response data (ACT Research Report Series 97–4). Iowa City, IA: ACT. Retrieved November 16, 2013, from https://www.act.org/research/researchers/reports/pdf/ACT_RR97-04.pdf
  9. Dinero, T. E., & Haertel, E. (1977). Applicability of the Rasch model with varying item discriminations. Applied Psychological Measurement, 1, 581–592.
  10. Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum.
  11. Forsyth, R., Saisangjan, U., & Gilmer, J. (1981). Some empirical results related to the robustness of the Rasch model. Applied Psychological Measurement, 5, 175–186.
  12. Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26, 3–24.
  13. Holland, P. W., & Rubin, D. B. (1982). Test equating. New York, NY: Academic Press.
  14. Holland, P. W., & Wightman, L. E. (1982). Section pre-equating: A preliminary investigation. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 271–297). New York, NY: Academic Press.
  15. Kingston, N. M., & Dorans, N. J. (1984). Item location effects and their implications for IRT equating and adaptive testing. Applied Psychological Measurement, 8, 147–154.
  16. Klein, L. W., & Jarjoura, D. (1985). The importance of content representation for common-item equating with nonrandom groups. Journal of Educational Measurement, 22, 197–206.
  17. Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking (2nd ed.). New York, NY: Springer.
  18. Lord, F. M., & Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings.” Applied Psychological Measurement, 8, 453–461.
  19. Meyer, J. P. (2010). A mixture Rasch model with item response time components. Applied Psychological Measurement, 34, 521–538.
  20. Mislevy, R. J., & Verhelst, N. (1990). Modeling item responses when different subjects employ different solution strategies. Psychometrika, 55, 195–215.
  21. Mittelhaëuser, M.-A., Béguin, A. A., & Sijtsma, K. (2015). Selecting a data collection design for linking in educational measurement: Taking differential motivation into account. In R. E. Millsap, L. A. van der Ark, D. M. Bolt, & W.-C. Wang (Eds.), Quantitative psychology research: The 78th Annual Meeting of the Psychometric Society (pp. 181–193). New York, NY: Springer.
  22. Reise, S. P., & Flannery, W. P. (1996). Assessing person-fit on measures of typical performance. Applied Measurement in Education, 9, 9–26.
  23. Rost, J. (1997). Logistic mixture models. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 449–463). New York, NY: Springer.
  24. Rost, J., Carstensen, C., & von Davier, M. (1997). Applying the mixed Rasch model to personality questionnaires. In J. Rost & R. Langeheine (Eds.), Applications of latent trait and latent class models in the social sciences (pp. 324–332). Münster, Germany: Waxmann.
  25. Rupp, A. A., & Zumbo, B. D. (2006). Understanding parameter invariance in unidimensional IRT models. Educational and Psychological Measurement, 66, 63–84.
  26. Sundre, D. L. (1999, April). Does examinee motivation moderate the relationship between test consequences and test performance? Paper presented at the meeting of the American Educational Research Association, Montreal, Canada (ERIC Document Reproduction Service No. ED432588).
  27. Van Boxtel, H., Engelen, R., & de Wijs, A. (2011). Wetenschappelijke verantwoording van de Eindtoets 2010 [Scientific Report of the End of Primary School Test 2010]. Arnhem, The Netherlands: Cito.
  28. van der Linden, W. J., & Hambleton, R. K. (Eds.). (1997). Handbook of modern item response theory. New York, NY: Springer.
  29. Verhelst, N. D., Glas, C. A. W., & Verstralen, H. H. F. M. (1995). OPLM: Computer program and manual. Arnhem, The Netherlands: Cito.
  30. von Davier, M., & Yamamoto, K. (2004). Partially observed mixtures of IRT models: An extension of the generalized partial-credit model. Applied Psychological Measurement, 28, 389–406.
  31. Wells, C. S., Subkoviak, M. J., & Serlin, R. C. (2002). The effect of item parameter drift on examinee ability estimates. Applied Psychological Measurement, 26, 77–87.
  32. Wise, S. L., & DeMars, C. E. (2005). Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10, 1–17.
  33. Wise, S. L., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18, 163–183.
  34. Wolf, L. F., Smith, J. K., & Birnbaum, M. E. (1995). The consequence of performance, test, motivation, and mentally taxing items. Applied Measurement in Education, 8, 341–351.
  35. Wollack, J. A., Cohen, A. S., & Wells, C. S. (2003). A method for maintaining scale stability in the presence of test speededness. Journal of Educational Measurement, 40, 307–330.
  36. Yamamoto, K., & Everson, H. (1997). Modeling the effects of test length and test time on parameter estimation using the HYBRID model. In J. Rost & R. Langeheine (Eds.), Applications of latent trait and latent class models in the social sciences (pp. 89–98). Münster, Germany: Waxmann.
  37. Yamamoto, K., & Mazzeo, J. (1992). Item response theory scale linking in NAEP. Journal of Educational Statistics, 17, 155–173.
  38. Zeng, L., & Kolen, M. J. (1995). An alternative approach for IRT observed-score equating of number-correct scores. Applied Psychological Measurement, 19, 231–240.