Multiple dichotomous-scored items in second language testing: investigating the multiple true-false item type under norm-referenced conditions