HOTS assessment in circulatory system learning: Validity, reliability, and item quality


INTRODUCTION
The curriculum plays an important role in the success of education. A curriculum is therefore needed that equips learners with strong thinking and analytical skills so that they become more active and creative during learning. The 2013 curriculum applied in the current education system aims to encourage students to observe, ask questions, reason, and communicate what they learn in school (Fanani, 2018). Assessing student ability requires knowing learners' level of understanding. Assessment is the collection and processing of information to measure the achievement of learners' learning outcomes; it aims to monitor and evaluate the learning process, learning progress, and the improvement of learning outcomes on an ongoing basis (Hairun, 2020; Permana et al., 2021).
To carry out the assessment and reach the stage of analyzing and interpreting the information obtained, learners must have a high level of thinking skills during the learning process (Paidi et al., 2020; Permana & Setyawan, 2020). Higher-Order Thinking Skills (HOTS) are the abilities to elaborate on information, think critically, evaluate, think metacognitively, and solve problems. In the revised Bloom's taxonomy, the assessment of higher-order thinking has three indicators: analyzing, evaluating, and creating.
The higher-order thinking ability of learners in Indonesia is still relatively low, as shown by international studies that measure it, namely the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA). PISA is organized by the Organisation for Economic Co-operation and Development (OECD). In TIMSS 2015, Indonesia scored 397 in science, below the international average of 500, which placed it 45th out of 50 countries.
In the 2018 PISA results, released on December 3, 2019, Indonesia's rank fell compared with 2015: Indonesia ranked 74th out of the 79 countries that took the survey. Its science score was 396, against a top score of 590. These results show that the higher-order thinking ability of Indonesian learners is still very low (Thomson et al., 2019). The low TIMSS and PISA scores indicate that learners lack analytical power, understanding, and reasoning. This stems from a lack of thoroughness in understanding the problem, which leads to (1) transformation errors from choosing the wrong formula, (2) process skill errors, and (3) errors in the final answer from writing conclusions imprecisely (Fazzlah et al., 2020).
The researchers carried out observations at Public Senior High School 1 Nan Sabaris, Padang Pariaman, West Sumatra Province, on September 12, 2019, and administered a trial test of higher-order thinking skills. The trial results showed that learners' higher-order thinking skills were still relatively low: the average score was only 38.45, and according to Prasetyani et al. (2016) scores in the range of 21-40 fall in the "poor" category.
Several factors contribute to learners' low ability in answering such questions. (1) Teachers have not fully designed problems at high cognitive levels and have difficulty turning problems at low cognitive levels (C1-C3) into problems at higher cognitive levels (C4-C6), so students are not used to working on demanding problems. The problems students usually meet remain in the C1-C3 cognitive domain. This is evident from an analysis of the daily test questions compiled by biology teachers at Public Senior High School 1 Nan Sabaris, in which 50% of the problems were at level C2, 40% at C3, and 10% at C4.
(2) Learners are not used to working on problems that involve tables, graphs, and images, so they have difficulty analyzing the data these contain. In line with this, Harisman (2020) notes that the development of learners' higher-order thinking skills is influenced by the precepts and professionalism of teachers in developing instruments of higher-order thinking ability.
One way to develop higher-order thinking skills is to use appropriate assessment, since assessment is an integral part of learning (Rahmi & Alberida, 2017). It is therefore necessary to develop higher-order thinking skills assessment instruments that are valid, reliable, and composed of good-quality items, in order to train learners in higher-order thinking as the 2013 curriculum demands (Otty & Milton, 2019).
Previous research on HOTS assessment shows that educators need particular skills to develop HOTS assessments (Widana, 2019). Research on HOTS assessment in physics shows that HOTS assessment instruments effectively train students' HOTS (Kusuma et al., 2017; Serevina et al., 2018). In addition, research on HOTS instruments for the circulatory system topic shows that such instruments can be used to assess students' HOTS abilities (Madang et al., 2017).
These previous studies indicate that research on HOTS assessment, especially for the circulatory system topic, has not yet been conducted. The novelty of this HOTS assessment research therefore lies in its subject matter, namely the circulatory system. The research offers several benefits, among them that educators can use the resulting HOTS assessment to assess students' HOTS.
In this article, the researchers develop an assessment instrument of higher-order thinking ability on circulatory system material. The circulatory system is covered under basic competency 3.6 in the odd semester of grade XI, with the operational verb "analyze" (C4). The purpose of this research is to produce a Higher-Order Thinking Skills instrument on circulatory system material for grade XI Public Senior High School students that is valid, reliable, and composed of good-quality items.

METHOD
This study is research and development (R&D). It aims to produce an assessment instrument of higher-order thinking ability on circulatory system material for grade XI Public Senior High School students that is valid, practical, reliable, and composed of good-quality items, using the 4-D development model. At the define stage, analyses of the problem, the students, the curriculum, and the learning objectives were carried out. At the design stage, the test blueprint, the question design, the cover, and the general and specific instructions for working on the questions were prepared. At the develop stage, the questions were tried out and item quality was then analyzed to obtain a valid instrument. Because of time and cost constraints, the disseminate stage was not carried out.
The study subjects consisted of expert validators and trial subjects. The experts were two biology lecturers at Universitas Negeri Padang and two biology teachers at Public Senior High School 1 Nan Sabaris; the trial subjects were two biology teachers at Public Senior High School 1 Nan Sabaris and 33 students of grade XI Mathematics and Science 1 at the same school. The object of the research is a higher-order thinking skills instrument on circulatory system material for grade XI Public Senior High School students, in the form of 50 questions. The research was conducted at Universitas Negeri Padang and Public Senior High School 1 Nan Sabaris Padang Pariaman from November 26, 2020, to January 22, 2021.

RESULTS AND DISCUSSION
The logical validation of the higher-order thinking ability instrument aims to establish that the instrument is valid and that its items are representative of all the items that could be written for it. Logical validation was carried out by two biology lecturers from the Faculty of Mathematics and Natural Sciences, Universitas Negeri Padang, and two biology teachers at Public Senior High School 1 Nan Sabaris Padang Pariaman, using validation questionnaires. During validation, the experts gave suggestions for improving the instrument; the improvements can be seen in Table 1. After revision, the experts' validation questionnaires were analyzed; the results can be seen in Table 2. Based on this analysis, the instrument meets the criteria and is valid. This is shown by the average validation score of 87.54, which falls in the valid category (Ngalim, 2009). The higher-order thinking skills assessment developed is therefore valid in terms of material, construction, language, and HOTS.
Item analysis was carried out after the instrument was tried out on 33 students of class XI Mathematics and Natural Science 1 at Public Senior High School 1 Nan Sabaris Padang Pariaman, using ANATES 4.09. The analysis yielded 44 valid questions (88%) and six invalid questions (12%). Of the valid questions, 34 are at cognitive level C4, nine at C5, and one at C6. The percentage of each cognitive level can be seen in Figure 1.
The difficulty level of the items was obtained from the ANATES 4.09 analysis: 41 questions have a moderate difficulty level and nine fall in the difficult category. The difficulty level data can be seen in Table 3.
The discriminating (differentiating) power of the items was also obtained from the ANATES 4.09 analysis: 52% of the items have good discriminating power, 46% sufficient, and 2% poor. The data can be seen in Table 4.
The quality of the answer options, again from the ANATES 4.09 analysis, is excellent for 50%, good for 35.5%, less good for 12%, and bad for 2.5%. The results can be seen in Figure 2.
Logical validity was obtained from the validation questionnaires completed by three experts. Four aspects were assessed: material, construction, language, and higher-order thinking ability.
The data analysis showed that the higher-order thinking skills instrument falls in the valid category according to Arikunto (2013), with an average value of 87.54%. The valid rating on the material aspect, with a value of 87.5%, indicates that the material used can improve learners' higher-order thinking ability. This is in line with Fanani (2018), who states that the material developed must accord with the demands of the core and basic competencies.
On the construction aspect, the assessment instrument is in the valid category with a value of 87.5%. This indicates that the instrument has clearly formulated problems and homogeneous, logical answer options, and is well structured. This accords with Wahyudi et al. (2019), who state that construct validity refers to the extent to which a test instrument measures what it is intended to measure according to the theoretical construct that underlies the preparation of the instrument.
In terms of language, the assessment instrument obtained 87.5%. This indicates that the instrument already uses language that follows the rules, is easy to understand, and does not give rise to double interpretation (Indra et al., 2018). This accords with Afrita & Darussyamsu (2020), who state that each question item must use language that is clear, easy to understand, and in accordance with the enhanced spelling system.
On the higher-order thinking aspect, the average score is 81.67, which meets the valid criteria. This indicates that the problems are already at levels C4-C6; that the discourse, tables, graphs, and images function well within the context of the material; and that the questions use an interesting stimulus (Sayan & Mertoğlu, 2020), so that they measure students' ability to analyze, evaluate, and create. According to Arif et al. (2020), higher-order thinking is a thinking process at higher levels of knowledge, developed from various cognitive concepts and methods and from learning taxonomies such as Bloom's taxonomy. Across the four aspects above, the average from three experts is 87.54%. On this basis, the instrument meets logical validity and can be used with learners.
Overall, the higher-order thinking skills instrument developed meets all four aspects of the validation test, so it can be used for biology assessment, especially on circulatory system material. Before use, empirical and practicality tests must be carried out so that the instrument is valid and can be used to develop learners' higher-order thinking ability.
Empirical validity aims to establish the level of reliability of the instrument developed (Tapsir et al., 2018). It was obtained by analyzing, with ANATES 4.09, the 50 questions tried out on 33 students of class XI Mathematics and Natural Science 1, Public Senior High School 1 Nan Sabaris Padang Pariaman. The analysis shows that the instrument is valid, with 44 valid questions (88%) and 6 invalid questions (12%). The percentages of valid questions at each cognitive level are C4 = 77.27%, C5 = 20.45%, and C6 = 2.27%. An instrument is said to be valid when its elements are relevant to and representative of the construct that the measuring instrument targets for a specific purpose (Ihsan, 2016).
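The percentages reported above follow directly from the item counts given in the text (34, 9, and 1 valid items at C4, C5, and C6, out of 50 items tried out). The short sketch below only reproduces that arithmetic; the variable and function names are illustrative and do not come from ANATES.

```python
# Counts of valid items per cognitive level, as reported in the text.
valid_items_by_level = {"C4": 34, "C5": 9, "C6": 1}
total_items = 50  # number of items tried out


def percentage(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


n_valid = sum(valid_items_by_level.values())
valid_share = percentage(n_valid, total_items)
invalid_share = percentage(total_items - n_valid, total_items)

# Share of each cognitive level among the valid items only.
level_shares = {
    level: percentage(count, n_valid)
    for level, count in valid_items_by_level.items()
}

print(n_valid, valid_share, invalid_share)  # 44 88.0 12.0
print(level_shares)  # {'C4': 77.27, 'C5': 20.45, 'C6': 2.27}
```

Note that the cognitive-level percentages are taken over the 44 valid items, not over all 50 items.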
Reliability indicates the consistency of an instrument (Utama et al., 2020): it is the consistency of measurement results obtained at different times on the same subjects (Ndiung & Jediut, 2020). A reliable instrument gives the same results when tested on the same group at different times. This higher-order thinking skills instrument is highly reliable, with a reliability value of 0.78. A test is said to be reliable if the scores obtained correlate highly with the total score. The validity and reliability of an instrument are influenced by the subjects being measured, the user of the instrument, and the instrument itself (Purwanti et al., 2020), so validity and reliability must always be tested before an instrument is used.
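The text reports a reliability value of 0.78 but does not state which coefficient ANATES computed. As an illustration only, the sketch below uses the Kuder-Richardson 20 (KR-20) formula, a widely used internal-consistency coefficient for items scored 0 or 1. The choice of KR-20, the function name `kr20`, and the toy response matrix are all assumptions, not the authors' procedure or data.

```python
def kr20(responses):
    """KR-20 reliability for rows of 0/1 item scores (one row per student).

    Assumes at least two items and non-zero variance in total scores.
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Total score of each student and its population variance.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Hypothetical toy data: 5 students, 4 items.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
toy_reliability = kr20(toy_responses)  # about 0.8 for this toy matrix
```

Values closer to 1 indicate more consistent items. Other coefficients, such as split-half reliability, can give somewhat different values on the same data.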
The difficulty index of an item ranges from 0.00 to 1.00 (Arifin, 2017). The larger the computed difficulty index, the easier the item. A question with p = 0.00 means that no student answered correctly, and p = 1.00 means that all students answered correctly. The data analysis of this higher-order thinking skills instrument shows 41 questions of moderate difficulty and 9 questions in the difficult category.
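The difficulty index p described above is the proportion of students who answer an item correctly. A minimal sketch follows; the category cut-offs (below 0.30 difficult, above 0.70 easy) are a common convention and are an assumption here, since the text does not state the thresholds ANATES applied. The function names are illustrative.

```python
def difficulty_index(item_scores):
    """Proportion of students answering the item correctly (scores are 0/1)."""
    return sum(item_scores) / len(item_scores)


def difficulty_category(p, hard_below=0.30, easy_above=0.70):
    """Classify an item by its difficulty index.

    The cut-offs are assumed conventional values, not taken from the source.
    """
    if p < hard_below:
        return "difficult"
    if p > easy_above:
        return "easy"
    return "moderate"


# Hypothetical item answered correctly by 2 of 4 students.
p_example = difficulty_index([1, 0, 1, 0])
category_example = difficulty_category(p_example)  # "moderate"
```

An item that nobody answers correctly gets p = 0.00 and one that everybody answers correctly gets p = 1.00, which matches the interpretation given in the text.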
The discriminating (differentiating) power of a higher-order thinking skills instrument serves to distinguish highly capable learners from learners of low or moderate ability (Masitoh & Aedi, 2020). According to the item analysis, 52% of the items have good discriminating power, 46% have sufficient discriminating power, and 2% have poor discriminating power. This is in line with Irmaya et al. (2020), who state that good discriminating power distinguishes students who have mastered the competencies from students who have not, based on certain criteria.
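One common way to quantify differentiating (discriminating) power is the upper-lower group index: rank students by total score, take the top and bottom groups, and subtract the proportion correct in the lower group from that in the upper group. The sketch below assumes this index and a 27% group size, a frequent convention; the text does not state the exact procedure ANATES used, and the names and toy data are hypothetical.

```python
def discrimination_index(responses, item, group_fraction=0.27):
    """Upper-lower discrimination index for one item.

    responses: rows of 0/1 item scores, one row per student.
    item: zero-based index of the item to analyze.
    group_fraction: share of students in each extreme group (assumed 27%).
    """
    # Rank students from highest to lowest total score.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(len(ranked) * group_fraction))

    upper = ranked[:group_size]
    lower = ranked[-group_size:]

    p_upper = sum(row[item] for row in upper) / group_size
    p_lower = sum(row[item] for row in lower) / group_size
    return p_upper - p_lower


# Hypothetical toy data: 5 students, 4 items.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
# With 40% groups, the top two and bottom two students are compared.
d_item0 = discrimination_index(toy_scores, item=0, group_fraction=0.4)  # 0.5
d_item1 = discrimination_index(toy_scores, item=1, group_fraction=0.4)  # 1.0
```

The index ranges from -1 to 1; higher values mean that stronger students answer the item correctly more often than weaker students. With the 33 students reported here, a 27% split would put 9 students in each group.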
The quality of the options in an instrument is judged from how learners choose among the answers to a given question. The aim is to see whether the distractors function properly. For this higher-order thinking skills instrument on circulatory system material, the analysis using ANATES version 4.09 rated the option quality as excellent for 50%, good for 35.5%, less good for 12%, and bad for 2.5%.
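A simple way to check whether the wrong options attract learners is to count how often each one is chosen. The sketch below is a simplified illustration, not the ANATES option-quality criteria, which the text does not detail. The five options A to E, the 5% minimum share, and the sample answers are assumptions.

```python
from collections import Counter


def distractor_shares(choices, key, options="ABCDE"):
    """Share of students choosing each wrong option for one item.

    choices: the option letter picked by each student.
    key: the correct option letter.
    options: all option letters (five options is an assumption).
    """
    counts = Counter(choices)
    n_students = len(choices)
    return {
        option: counts.get(option, 0) / n_students
        for option in options
        if option != key
    }


def functioning_distractors(choices, key, options="ABCDE", min_share=0.05):
    """Wrong options chosen by at least min_share of students (assumed 5%)."""
    shares = distractor_shares(choices, key, options)
    return [option for option, share in shares.items() if share >= min_share]


# Hypothetical answers of 20 students to one item whose key is "A".
sample_choices = list("AAAAAAAAAABBBBCCCDDD")
working = functioning_distractors(sample_choices, key="A")  # ['B', 'C', 'D']
```

In this toy example option E is never chosen, so it would be a candidate for revision.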
This research produced an instrument with appropriate indicators for assessing higher-order thinking ability on circulatory system material. For teachers, the instrument can serve as a means of training learners to become familiar with higher-order thinking and as a collection of quality questions (Utama et al., 2020). For learners, it can be used to gauge and train higher-order thinking skills (Kholis et al., 2020). For schools, it can help familiarize learners with higher-order thinking and thereby improve the quality of learners (Arifin, 2018; Wilson & Narasuman, 2020).

CONCLUSION
The resulting HOTS instrument has an average logical validity of 87.54%, which meets the very valid criteria. Its empirical validity is 88%, with C4 = 77.27%, C5 = 20.45%, and C6 = 2.27% of the valid items. It is highly reliable, with a reliability value of 0.78, and it has moderate difficulty, good discriminating power, and excellent option quality. In sum, the HOTS instrument on circulatory system material for grade XI Public Senior High School students is logically valid, empirically valid, and reliable. The instrument can therefore be used to measure learners' higher-order thinking ability, and it is recommended for promoting students' HOTS.