More on IRR
I apologize for the long gaps between posts these past few weeks. My new semester starts today and I had to finish up an online course, so my time to spend on my research/this blog was cut short. I am ready to really dive back in now though 🙂
First off, I would like to address some of the wonderful comments on my last post:
1) Mike – Thank you so much for the keyboard shortcut tips! I’ve been using a Mac for the past 8 years, and I still don’t have them all down. 😉
2) Ryan – Thank you for the solution for Row A or Row B or Row C – This will work perfectly. As you all have shown me, there are multiple ways to accomplish this. I’m sure there will be times where one method is easier/better than another.
3) Owen – You asked if I needed to see the row for Independent Working, and I’m not sure yet. If I wind up using any text labels that attach to the Independent Working code, then yes, I will need it for scripting. As it is used now, I shouldn’t need it. I’ve been slowly getting the hang of the scripting and of using mathematics to calculate values in the scripts.
4) Phillip – You raise an excellent point about some other ways to calculate IRR with more subjective rating systems. In terms of measuring independence, I don’t need to calculate a magnitude of independence, but there are definitely some behavioral observations for which magnitude would be a very appropriate measurement.
Here is an example where frequency, magnitude, and duration would all need to be coded and evaluated for IRR:
If I were coding a student for aggressive behavior, I would want my raters to look at the frequency, magnitude, and duration of the behavior. I would want to see if they marked the same incidents as “aggression” (the frequency IRR), and I would want to compare the lengths of the incidents (the duration IRR). I would also want operational definitions for different magnitudes of aggression (common magnitude levels for aggression are mild, moderate, and severe). I think I would use text labels for the magnitude categories so each incident would be labeled with its level.
For the duration, I would calculate IRR the same way I calculated it in my last post: (A & B)/(A or B) x 100. For frequency and magnitude it gets a little trickier. I can easily look at the timelines and compare incident by incident to determine whether the same incidents were coded and whether the magnitude levels match, but this doesn’t save me much time because I still have to analyze each list by hand. With that method, StudioCode doesn’t actually do anything more for me than a simple matrix or a checklist would. Does anyone have any ideas for ways to use scripting to report out IRR for those measurements? I’m going to think about this one and see what I come up with.
Right now I can easily figure out a way to see if their frequency counts are equal in number, but that wouldn’t necessarily tell me if my raters were counting the same incidents. One method in traditional IRR calculation is to divide the observation period into intervals and simply mark whether the behavior did or didn’t occur in each interval, but StudioCode allows us to be so much more accurate than general intervals do. Also, I would be able to code the events leading up to the aggression to look for patterns in the function of the behavior (why the student was acting aggressively). Any thoughts on some easy ways to report some of this data with scripting?
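One possible way to think about the logic, sketched in Python rather than StudioCode scripting: it assumes each rater's timeline can be written out as a list of (start, end, magnitude) incidents, which is a made-up format for illustration and not something StudioCode is known to export. All of the function names are hypothetical too. Duration uses the (A & B)/(A or B) x 100 formula, while frequency and magnitude rely on pairing up incidents that overlap in time, which is only one of several reasonable matching rules.

```python
from typing import List, Tuple

# Hypothetical incident format: (start_sec, end_sec, magnitude_label).
Incident = Tuple[float, float, str]


def overlap(x: Incident, y: Incident) -> float:
    """Seconds shared by two incidents (0 if they do not overlap)."""
    return max(0.0, min(x[1], y[1]) - max(x[0], y[0]))


def duration_irr(a: List[Incident], b: List[Incident]) -> float:
    """(A & B) / (A or B) x 100.

    Assumes incidents within a single rater's row never overlap each other.
    """
    both = sum(overlap(x, y) for x in a for y in b)
    either = sum(e - s for s, e, _ in a) + sum(e - s for s, e, _ in b) - both
    # Assumption: if neither rater coded anything, call it full agreement.
    return 100.0 * both / either if either else 100.0


def match_incidents(a: List[Incident], b: List[Incident]):
    """Pair each A incident with the unused B incident it overlaps most."""
    pairs, used = [], set()
    for x in a:
        best, best_overlap = None, 0.0
        for j, y in enumerate(b):
            shared = overlap(x, y)
            if j not in used and shared > best_overlap:
                best, best_overlap = j, shared
        if best is not None:
            used.add(best)
            pairs.append((x, b[best]))
    return pairs


def frequency_irr(a: List[Incident], b: List[Incident]) -> float:
    """Matched incidents / all distinct incidents either rater coded x 100."""
    matched = len(match_incidents(a, b))
    total = len(a) + len(b) - matched
    return 100.0 * matched / total if total else 100.0


def magnitude_agreement(a: List[Incident], b: List[Incident]) -> float:
    """Percent of matched incidents given the same magnitude label."""
    pairs = match_incidents(a, b)
    if not pairs:
        return float("nan")  # nothing to compare
    same = sum(1 for x, y in pairs if x[2] == y[2])
    return 100.0 * same / len(pairs)


# Made-up timelines for two raters (times in seconds).
rater_a = [(10, 20, "mild"), (40, 55, "severe"), (80, 90, "moderate")]
rater_b = [(12, 22, "mild"), (42, 50, "moderate")]

print(round(duration_irr(rater_a, rater_b), 1))
print(round(frequency_irr(rater_a, rater_b), 1))
print(magnitude_agreement(rater_a, rater_b))
```

With these made-up timelines, the raters share 16 of 37 coded seconds (about 43% duration IRR), agree on 2 of 3 distinct incidents (about 67% frequency IRR), and give the same magnitude to 1 of the 2 matched incidents (50%). Matching on overlap rather than on raw counts is what addresses the worry about raters reaching the same total while counting different incidents.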