INNOVATIVE METHOD FOR GRADUATE ATTRIBUTE ASSESSMENT IN LARGE CLASSES

926 | Page December 8, 2014

Said M. Easa
Professor, Department of Civil Engineering, Ryerson University
Toronto, Ontario, Canada M5B 2K3
Email: seasa@ryerson.ca

ABSTRACT
Assessing graduate attributes in large classes is a time-consuming task. The assessment requires a carefully designed random sampling that ensures the sample is representative of all students in the class. In addition, the assessment becomes more difficult when soft-skill graduate attributes are involved. The purpose of this paper is to present an efficient method for assessing graduate attributes in large classes without sampling. The proposed method involves defining an indicator (learning objective) by knowledge elements (topics) that the student should know or by interaction elements in a case study that represent the principles related to the indicator. Multiple-choice questions are then developed for the knowledge or interaction elements and processed using scantron sheets. The method involves a weighted-score procedure and performance scales for determining class performance. Application of the method for assessing two graduate attributes (lifelong learning and professionalism) in a fourth-year common engineering course is illustrated in this paper. The results show that class performance is sensitive to the weights assigned to the questions and therefore these weights should be carefully established by the instructors. The proposed method has been shown to be useful in identifying the indicators, and the specific topics within each indicator, that need improvement.


INTRODUCTION
Outcome-based (OB) assessment focuses on empirically measured outcomes that include a range of skills and knowledge. OB education is a requirement for joining the Washington Accord (WA) [1], which requires its signatory countries to implement outcome-based education. The accord was established in 1989 by six countries (United States, Canada, United Kingdom, Australia, Ireland, and New Zealand). According to this international accreditation agreement, graduates of accredited programs in any of the signatory countries are recognized by the other signatory countries as having met the academic requirements for entry to engineering practice. Numerous countries have shifted to OB accreditation to be able to join the accord and benefit from the global mobility of engineers. The accord was joined by four countries during 1995-2006 (Hong Kong China, South Africa, Japan, and Singapore) and five countries during 2007-2012 (Chinese Taipei, Korea, Malaysia, Russia, and Turkey). The WA also includes six countries that hold provisional status (Bangladesh, China, India, Pakistan, Philippines, and Sri Lanka). These countries have demonstrated that their accreditation systems are similar to those of the signatories and have the potential to develop toward OB education.
The OB assessment involves several elements, including leadership structure, assessment elements, assessment design, data collection and analysis, and feedback for continual improvement. The improvements may include modifying course content, changing the order of material within a course, changing the order of courses or adding new courses to the curriculum, and increasing the number of lab/problem sessions within a course; see Spurlin et al. [2]. The linkage of graduate attributes to course design is addressed in Barrie [3] and Jackson [4]. A framework and guidelines for OB assessment in engineering education can be found in Easa [5].
In Canada, the Canadian Engineering Accreditation Board (CEAB) introduced an OB criterion for the accreditation of engineering programs in 2011. The criterion includes the following graduate attributes [6]:

1. A knowledge base for engineering: Demonstrated competence in university level mathematics, natural sciences, engineering fundamentals, and specialized engineering knowledge appropriate to the program.
2. Problem analysis: An ability to use appropriate knowledge and skills to identify, formulate, analyze, and solve complex engineering problems in order to reach substantiated conclusions.
3. Investigation: An ability to conduct investigations of complex problems by methods that include appropriate experiments, analysis and interpretation of data, and synthesis of information in order to reach valid conclusions.
4. Design: An ability to design solutions for complex, open-ended engineering problems and to design systems, components or processes that meet specified needs with appropriate attention to health and safety risks, applicable standards, economic, environmental, cultural and societal considerations.
5. Use of engineering tools: An ability to create, select, apply, adapt, and extend appropriate techniques, resources, and modern engineering tools to a range of engineering activities, from simple to complex, with an understanding of the associated limitations.
6. Individual and team work: An ability to work effectively as a member and leader in teams, preferably in a multidisciplinary setting.
7. Communication skills: An ability to communicate complex engineering concepts within the profession and with society at large. Such abilities include reading, writing, speaking and listening, and the ability to comprehend and write effective reports and design documentation, and to give and effectively respond to clear instructions.
8. Professionalism: An understanding of the roles and responsibilities of the professional engineer in society, especially the primary role of protection of the public and the public interest.
9. Impact of engineering on society and the environment: An ability to analyse social and environmental aspects of engineering activities. Such abilities include an understanding of the interactions that engineering has with the economic, social, health, safety, legal, and cultural aspects of society; the uncertainties in the prediction of such interactions; and the concepts of sustainable design and development and environmental stewardship.
10. Ethics and equity: An ability to apply professional ethics, accountability, and equity.
11. Economics and project management: An ability to appropriately incorporate economics and business practices including project, risk and change management into the practice of engineering, and to understand their limitations.
12. Lifelong learning: An ability to identify and to address their own educational needs in a changing world, sufficiently to maintain their competence and contribute to the advancement of knowledge.
The first six attributes are hard skills, while the last six are soft skills. According to the OB criterion, each engineering program in Canada must have a system in place for continuously assessing graduate attributes and using the assessment results to improve the program. The attributes and continual improvement process will not form the basis for accreditation decisions by CEAB until 2015.
Four CEAB graduate attributes were assessed at Ryerson University in the fourth-year common engineering course, which addresses the legal and ethical aspects of engineering practice. The graduate attributes were as follows: professionalism (Attribute 8), impact of engineering on society and environment (Attribute 9), ethics and equity (Attribute 10), and lifelong learning (Attribute 12). For these graduate attributes, eight indicators (learning objectives) were identified and assessed in the midterm exam (law part) and the final exam (ethics part) using multiple-choice questions (MCQ).
The purpose of this paper is to present an efficient method for assessing graduate attributes in large classes without sampling. The method, which is an extension of an earlier method that introduced the concept [7,8], involves defining an indicator by knowledge elements (topics) or by interaction elements in a case study. The proposed method involves a weighted-score scheme and performance scales for determining class performance and defining needed improvements.

PROPOSED METHOD
For large classes, it is often necessary to use MCQ for assessing student performance since other assessment methods may not be feasible. The proposed method uses MCQ for all students in the class to assess the indicators of the graduate attributes. The method involves defining multiple-choice question groups, determining class performance (individual weighted score, performance scales, and aggregate performance), and data processing.

Defining Multiple-Choice Question Groups
The idea of this method is to define a group of MCQ associated with the indicator to be assessed. The group of questions can be defined based on: (a) the knowledge elements associated with the indicator or (b) the interaction elements among different players in a case study. In the knowledge element-based groups, the indicator is represented by a number of knowledge elements (topics) that a student should know to fully understand the indicator (Fig. 1). The number of these elements, n, may vary from one indicator to another and among different assessors. A question is then developed for each of the elements. Each element i has a weight wi associated with it, where i = 1, 2, …, n. The weight of an element represents the importance of that element in engineering practice. For example, if w1 = 2 and w2 = 1, this means that Element 1 is twice as important as Element 2.

Fig 1: Defining an indicator by knowledge elements and their weights
In the interaction-based groups, a case study that involves a number of players (individuals and/or organizations) is developed. The players in the case study have n legal and ethical interactions (Fig. 2). A group of questions related to the interactions of the players is then developed to test student understanding of the underlying principles. Similar to the previous method, there are weights wi (i = 1, 2, …, n) associated with the interactions.

Determining Class Performance

Individual Weighted Score
For each indicator, a group of n questions is designed to address the knowledge elements or the case study interactions that are related to that indicator. Define Qij as a binary variable such that Qij = 1 if student j answers question i correctly and Qij = 0 if student j answers the question incorrectly. Let the number of students assessed be defined as m. Then, given the weight wi of element i, the performance of student j, Rj, is given by

$$R_j = \sum_{i=1}^{n} w_i Q_{ij}, \quad j = 1, 2, \ldots, m \qquad (1)$$

The maximum performance score by any student equals the total weight, W, which is given by

$$W = \sum_{i=1}^{n} w_i \qquad (2)$$
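The weighted score can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `weighted_scores`, the row-per-student layout `Q[j][i]`, and the sample answers are assumptions made here, not part of the original method description.

```python
def weighted_scores(Q, weights):
    """Return (R, W): the weighted score R_j of each student and the total weight W.

    Q[j][i] is 1 if student j answered question i correctly, else 0
    (one row per student; this layout is an assumption of the sketch).
    """
    W = sum(weights)
    R = [sum(w * q for w, q in zip(weights, row)) for row in Q]
    return R, W


# Hypothetical answers of three students to a three-question group.
Q = [
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 1],
]
R, W = weighted_scores(Q, weights=[2, 1, 3])
print(R, W)  # [6, 2, 4] 6
```

A student who answers every question in the group correctly reaches the maximum score W, as the first row shows.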

Performance Scales
In assessing the indicators of the graduate attributes, it is necessary to define scales for assessing class performance. The number of scales depends on the type of the graduate attribute and the indicator. Some methods consider three scales (e.g. poor, average, and excellent), while others consider five scales (e.g. poor, below average, average, above average, and excellent). The definition of the scales may be based on rubrics or some numerical values [2].
The performance scales in the proposed method are defined using the maximum performance score. For simplicity, a three-scale system is implemented: Excellent (E), Average (A), and Poor (P). The limits of the performance scale, Sj, are then defined as follows:

$$S_j = \begin{cases} \mathrm{E}, & R_j \ge 0.8W \\ \mathrm{A}, & 0.5W \le R_j < 0.8W \\ \mathrm{P}, & R_j < 0.5W \end{cases} \qquad (3)$$

Based on Eq. 3, the student performance is considered Excellent if the weighted score is equal to or greater than 80% of the maximum score and is considered Poor if the weighted score is less than 50% of the maximum score. The performance is considered Average if it is between the preceding limits.
The class performance for the indicator is expressed as the percentage of the m students on each scale (PE, PA, and PP):

$$P_E = \frac{100}{m}\,\bigl|\{\, j : S_j = \mathrm{E} \,\}\bigr| \qquad (4)$$

$$P_A = \frac{100}{m}\,\bigl|\{\, j : S_j = \mathrm{A} \,\}\bigr| \qquad (5)$$

$$P_P = \frac{100}{m}\,\bigl|\{\, j : S_j = \mathrm{P} \,\}\bigr| \qquad (6)$$

Once the class performance is determined, a decision is made regarding whether improvements are needed for the indicator. In this regard, it is necessary to consider a threshold performance, TH, and a target performance, T. The threshold is the minimum acceptable level of performance for a given indicator [9] and the target is the intended level of learning proficiency for that indicator. Let the performance for an indicator be defined as the sum of PE and PA (excellent plus average). If this performance is less than the threshold, improvements are needed. Typically, the improvement effort should focus first on such indicators, followed by those with class performances that are below the target.
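The 80% and 50% limits and the threshold check can be expressed as a short Python sketch. The function names and the sample scores below are hypothetical, and the class performance is assumed here to be the percentage of students falling on each scale.

```python
def performance_scale(R_j, W):
    """Classify one weighted score as 'E', 'A', or 'P' using the 80%/50% limits.

    Integer comparisons (5*R_j >= 4*W, 2*R_j >= W) avoid floating-point
    rounding at the boundaries when weights are integers.
    """
    if 5 * R_j >= 4 * W:
        return "E"
    if 2 * R_j >= W:
        return "A"
    return "P"


def class_performance(R, W):
    """Return the percentage of students on each scale: {'E': PE, 'A': PA, 'P': PP}."""
    m = len(R)
    scales = [performance_scale(r, W) for r in R]
    return {s: 100.0 * scales.count(s) / m for s in ("E", "A", "P")}


def needs_improvement(perf, threshold):
    """True if PE + PA is below the threshold TH (all values in percent)."""
    return perf["E"] + perf["A"] < threshold


# Hypothetical weighted scores of five students, with W = 6.
perf = class_performance([6, 5, 4, 3, 2], W=6)
print(perf)                                   # {'E': 40.0, 'A': 40.0, 'P': 20.0}
print(needs_improvement(perf, threshold=85))  # True
```

In this made-up class, PE + PA = 80%, which is below an 85% threshold, so the indicator would be flagged for improvement.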
Although improvements to all elements of the indicator should be made, more attention should be devoted to the particular element (question) in which the students have the most difficulty. This can be determined by defining the deficiency score for each question of the group, which is given by

$$D_i = \frac{m\,w_i - \sum_{j=1}^{m} w_i Q_{ij}}{m\,w_i} \times 100 \qquad (7)$$

where Di = deficiency measure of student performance for question i (%). The first term in the numerator of Eq. 7 (m wi) is the maximum weighted score all students could achieve for question i. The second term is the actual weighted score that has been achieved. The adequacy of student performance for question i (%) equals 100 - Di. In improving the indicator, more attention should be devoted to the questions with large Di.
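A minimal Python sketch of the deficiency measure follows. It assumes that the difference between the maximum and the achieved weighted score is normalized by the maximum weighted score, so that Di is a percentage; the function name and the sample answers are hypothetical.

```python
def deficiency_scores(Q, weights):
    """Return D_i (%) for each question i, given Q[j][i] in {0, 1}."""
    m = len(Q)
    D = []
    for i, w in enumerate(weights):
        max_score = m * w                       # all m students answer correctly
        actual = sum(w * row[i] for row in Q)   # weighted score actually achieved
        D.append(100.0 * (max_score - actual) / max_score)
    return D


# Hypothetical answers of four students to a three-question group.
Q = [
    [1, 1, 1],
    [0, 1, 1],
    [0, 1, 0],
    [1, 1, 1],
]
D = deficiency_scores(Q, weights=[4, 1, 1])
print(D)  # [50.0, 0.0, 25.0]

# Index of the question with the largest deficiency (0-based).
worst = max(range(len(D)), key=D.__getitem__)
print("Question needing most attention:", worst + 1)  # 1
```

Under this normalization the weight cancels within a single question, so Di equals the percentage of students who answered question i incorrectly.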

Data Processing
The multiple-choice midterm and final exams were conducted using scantron sheets on which students mark their answers. The sheets were then processed through the media printing centre, and a report of the results, along with summary statistics, was emailed to the instructor. The report provides the number of exam questions that each student answers correctly. The report also presents for each student a row whose columns are the exam question numbers. In this row, the questions answered incorrectly are indicated in the respective cell by the choice number that the student selected, and for the questions answered correctly the cells are left blank. This format makes it easy to determine the questions in the group that are correctly answered.
The row information for each student is then used to determine the questions correctly answered for each group of questions that are related to the indicator. To simplify the collection of the data from the results report, the questions of each group are made consecutive in the exams. For each question group, the questions correctly answered by each student are recorded in a spreadsheet. The spreadsheet is then used to analyze the data as previously described.
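The extraction of a question group from a student's report row can be sketched as follows. The exact file format of the scantron report is not given, so the row is assumed here to be a list of strings, one cell per exam question, with a blank cell for a correct answer; the function name and the sample row are hypothetical.

```python
def group_answers(report_row, first_question, n):
    """Return [Q_1, ..., Q_n] for the group starting at `first_question` (1-based).

    `report_row` holds one cell per exam question: "" (blank) for a correct
    answer, or the choice the student selected for an incorrect one.
    """
    start = first_question - 1
    cells = report_row[start:start + n]
    if len(cells) != n:
        raise ValueError("question group extends past the end of the row")
    return [1 if cell.strip() == "" else 0 for cell in cells]


# Hypothetical row for a six-question exam: questions 2 and 5 were answered
# incorrectly (the student chose options 3 and 1, respectively).
row = ["", "3", "", "", "1", ""]
print(group_answers(row, first_question=4, n=3))  # [1, 0, 1]
```

Because the questions of each group are consecutive in the exam, a single slice of the row is enough to recover the group's answers.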

APPLICATION
Eight indicators were assessed in the midterm and final exams using MCQ. The exams involved 60 and 80 questions, respectively. The question groups for assessing the indicators were subsets of the total exam questions. As previously mentioned, the assessment was performed for the entire class. The method is illustrated using two examples. Example 1 involves defining the indicator using knowledge elements and Example 2 involves defining the indicator using interaction elements in a case study.

Example 1: Knowledge Elements
To illustrate the application of the proposed method, the results of only one indicator (for the graduate attribute "lifelong learning") are presented here. The indicator is (at the end of the course the student will be able to) "Recognize the need for ongoing professional development to maintain competence in the field." This was the simplest indicator assessed as it was defined by only three knowledge elements as shown in Fig. 3. The assessment results of only 20 students are presented here to aid the illustration of the method. The question score (Qij), weighted score (wiQij), student weighted score (Rj), and student performance scale (Sj) are shown in Table 2. The weights are set as w1 = 2, w2 = 1, and w3 = 3. Based on the student performance scale of Table 2 (last column), the class performance is calculated using Eqs. 4-6 and the results are shown in Table 3.

Table 2. Original data and total weighted score for each student in Example 1 (w1 = 2, w2 = 1, w3 = 3)

Table 3. Class performance for application Example 1 (w1 = 2, w2 = 1, w3 = 3)

The sensitivity of class performance to different weighting scenarios is shown in Table 4. Scenario 1 represents the analysis previously presented. The three other scenarios correspond to weights (w1, w2, w3) of (2, 2, 2), (1, 2, 3), and (4, 1, 1). The results show that class performance is sensitive to the weights assigned to the questions. For example, the class performance of Poor for Scenarios 2, 3, and 4 ranges from 5% to 20%. It is useful to present the results in terms of the cumulative percentage of the three scales (PE, PA, and PP) as shown in Fig. 4. The threshold and target levels for this indicator were selected as 85% and 95%, respectively. If the cumulative performance of excellent and average is below the threshold, this indicates that attention for improving the indicator is needed.
In this example, if Scenario 4 is the one determined by the instructors, the cumulative percentage of this scenario is 80% which lies below the threshold. This means that improvements to this indicator are needed. The deficiencies of the three questions are calculated, based on Eq. 7, as 16%, 3%, and 3%. Thus, more attention should be devoted to the element related to Question 1 (requirement for competence to practice engineering).
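The sensitivity to the weights can be explored with a short Python sketch that re-scores the same answers under the four weighting scenarios. The answer matrix below is hypothetical (the data of Table 2 are not reproduced here), so the percentages it produces are illustrative only and do not match the paper's results.

```python
def percent_poor(Q, weights):
    """Percentage of students whose weighted score is below 50% of the maximum."""
    W = sum(weights)
    scores = [sum(w * q for w, q in zip(weights, row)) for row in Q]
    # 2 * r < W is the integer form of r < 0.5 * W.
    return 100.0 * sum(1 for r in scores if 2 * r < W) / len(Q)


# Hypothetical answers of five students (rows) to three questions (columns).
Q = [
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]
# Weight scenarios (w1, w2, w3) as listed in the text.
scenarios = {1: (2, 1, 3), 2: (2, 2, 2), 3: (1, 2, 3), 4: (4, 1, 1)}
poor = {k: percent_poor(Q, w) for k, w in scenarios.items()}
print(poor)  # {1: 40.0, 2: 60.0, 3: 40.0, 4: 60.0}
```

Even with identical answers, the share of Poor performers changes from one scenario to another, which is why the weights need to be set deliberately.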

Example 2: Interaction Elements (Case Study)
The indicator assessed in this case study was (the student will be able to) "Evaluate ethical issues that may arise in engineering practice in terms of the professional code of ethics for engineers, with a focus on that for Ontario." The study was adapted from [10] and was included in the final exam. The students were asked to read the case study and answer four subsequent MCQ. The case study is as follows:
XYZ orders 5000 custom made parts from ABC for one of its products. When the order is originally made ABC indicates it will charge $75 per part. This cost is based in part on the cost of materials. After the agreement is completed, but before production of the part begins, ABC engineer Christine Carsten determines that a much less expensive metal alloy can be used while only slightly compromising the integrity of the part. Using the less expensive alloy would cut ABC's costs by $18 a part. Christine brings this to the attention of ABC's Vernon Waller, who authorized the sales agreement with XYZ. Vernon asks, "How would anyone know the difference?" Christine replies, "Probably no one would unless they were looking for a difference and did a fair amount of testing. In most cases the performance will be virtually the same, although some parts might not last quite as long." Vernon says, "Great, Christine, you've just made a bundle for ABC." Puzzled, Christine replies, "But shouldn't you tell XYZ about the change?" "Why?" Vernon asks, "The basic idea is to satisfy the customer with good quality parts, and you've just said we will. So what's the problem?"
The MCQ are presented in Appendix 1. For simplicity, equal weights of the interaction elements are considered and the assessment results are presented in Fig. 5a. Overall, the students performed well in this indicator, with only about 8% of the students showing Poor performance. However, the question deficiency measure (Fig. 5b) shows that many students did not perform well in Question 1 (D1 = 28.6%), while they did well in the other three questions (D2 = 11.4%, D3 = 7.2%, and D4 = 4.2%). These results clearly indicate the need for improving student understanding of the elements of the code of ethics, which are the topic of Question 1. It was decided that a specific case study involving ethical issues will be included in the lecture to illustrate the elements of the code of ethics.

CONCLUDING REMARKS
An innovative method for assessing graduate attributes in large classes without sampling is presented in this paper. The method involves defining the indicator by knowledge elements that the student should know or by case studies that involve interactions representing the principles related to the indicator. Multiple-choice questions are established for the knowledge or interaction elements along with their weights. The student answers to the MCQ are processed using scantron sheets. The method involves a weighted-score procedure and performance scales for determining class performance. Based on this research, the following comments are offered:
1. The method performed well in assessing four graduate attributes in the common engineering course Law and Ethics in Engineering Practice. Application of the method to two indicators related to the graduate attributes "lifelong learning" and "professionalism" is illustrated in this paper. The results show that class performance is sensitive to the question weights and therefore such weights should be carefully established. The proposed method can be used for assessing other soft-skill attributes.
2. The method requires setting weights of the knowledge or interaction elements and these weights may vary from one instructor to another. Therefore, it is necessary that a consistent procedure for establishing these weights be put in place at the department or faculty level.
3. The proposed method has a limitation. It cannot be used for assessing all graduate attributes. It is useful only for assessing graduate attributes that require knowledge of facts. In particular, some soft-skill attributes (e.g. communication skills and teamwork skills) are the hardest to assess using knowledge of facts, and other assessment methods such as rubrics would be necessary.
4. Future research should examine the sensitivity of the assessment results to the number of knowledge or interaction elements related to the indicator. In addition, the use of clicker technology to collect and process the answers to the MCQ should be explored.