Running Behavioral Studies with Human Participants: A Practical Guide
Publication Year: 2013
Running Behavioral Studies With Human Participants: A Practical Guide provides a concrete, practical roadmap for the implementation of experiments and controlled observation using human participants. Covering both conceptual and practical issues critical to implementing an experiment, the book is organized to follow the standard process in experiment-based research, addressing such issues as potential ethical problems, risks to validity, experimental setup, running a study, and concluding a study.
The detailed guidance on each step of an experiment is ideal for those in both universities and industry who have had little or no previous practical training in research methodology. The book provides example scenarios to help readers organize how they run experimental studies and anticipate problems, and example forms that can serve as effective initial “recipes.” Examples and ...
To our teachers, colleagues, students, and subjects who have taught us this material both explicitly and implicitly.
Copyright © 2013 by SAGE Publications, Inc.
All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.
Printed in the United States of America
Library of Congress Cataloging-in-Publication Data
Running behavioral studies with human participants: a practical guide / Frank Ritter … [et al.].
Includes bibliographical references and index.
ISBN 978-1-4522-1742-0 (pbk.)
1. Human experimentation in psychology. 2. Psychology, Experimental. 3. Psychology—Research. I. Ritter, Frank E.
This book is printed on acid-free paper.
SAGE Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
SAGE Publications Ltd.
1 Oliver's Yard
55 City Road
London EC1Y 1SP
SAGE Publications India Pvt. Ltd.
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road, New Delhi 110 044
SAGE Publications Asia-Pacific Pte. Ltd.
3 Church Street
#10-04 Samsung Hub
Acquisitions Editor: Reid Hester
Editorial Assistant: Sarita Sarak
Production Editor: Laureen Gleason
Copy Editor: Megan Granger
Typesetter: C&M Digitals (P) Ltd.
Proofreader: Vicki Reed-Castro
Indexer: Karen Wiley
Cover Designer: Anupama Krishnan
Marketing Manager: Lisa Brown
Permissions Editor: Karen Ehrmann
List of Figures and Tables
Tables
- Table 1.1. Definitions.
- Table 1.2. Summary of example studies used across chapters.
- Table 4.1. Type I and II errors in testing the null (H0) and experimental (Ha) hypotheses.
- Table 6.1. Response time in seconds by learning trial (hypothetical data). Italics indicate the first trial after discovering a shortcut; bold indicates the trials before discovering the shortcut.
Figures
- Figure 1.1. A pictorial summary of the research process. This is similar to but developed separately from Bethel and Murphy's (2010) figure describing the process for human–robotic studies.
- Figure 2.1. A pictorial summary of the study preparation process, along with the sections (§§) that explain that step.
- Figure 2.2. A screenshot of logged data recorded in RUI.
- Figure 2.3. Example interfaces RUI can be run on (ER1 robot and the dismal spreadsheet).
- Figure 2.4. Subject wearing a head-mounted eye tracker.
- Figure 2.5. Example diagrams of space for running studies. Hollow walls indicate sound-proofed walls, and a triangle on the door indicates a sweep to help block sound.
- Figure 2.6. Example eye-tracking traces showing different strategies for solving the same problem.
- Figure 3.1. Pictorial summary of potential ethical risks, along with the section (§) or sections (§§) that explain each risk.
- Figure 4.1. Pictorial summary of potential risks to validity, along with the section (§) or sections (§§) that explain that risk.
- Figure 5.1. A pictorial summary of preparing a research session, along with the section (§) or sections (§§) that explain each step.
- Figure 5.2. A storage space used as a single-subject data-collection station.
- Figure 5.3. An office space used to house multiple data-collection stations.
- Figure 5.4. A pictorial summary of running a research session, along with the section (§) or sections (§§) that explain each step.
- Figure 6.1. A pictorial summary of concluding a research session and concluding a study, along with the section (§) or sections (§§) that explain each step.
- Figure 6.2. Mean response time as a function of trial, with power law fit (data from Table 6.1; left) and the individual learning curves (right) superimposed on the average response time.
There are few practical guides on how to prepare and run experiments with human participants in a laboratory setting. In our experience, we have found that students are taught how to design experiments and analyze data in courses such as Design of Experiments and Statistics. On the other hand, the dearth of materials available to students preparing and running experiments has often led to a gap between theory and practice in this area, which is particularly acute outside of psychology departments. Consequently, labs frequently must impart these practical skills to students informally.
This guide presents advice that can help young experimenters and research assistants run experiments more effectively and more comfortably with human participants. In this book, our purpose is to provide hands-on knowledge about and actual procedures for experiments. We hope this book will help students of psychology, engineering, and the sciences run studies with human participants in a laboratory setting. This will particularly help students (or instructors and researchers) who are not in large departments or who are running participants in departments that do not have a large or long history of experimental studies of human behavior. This book is also intended to help people who are starting to run user and usability studies in industry.
The book is an attempt to make the implicit knowledge in this area, “the just common sense” as one reviewer called it, more explicit and more common. David Foster Wallace noted this effect in his retelling of a parable of how fish don't know they are in water.¹ The same effect happens in our field, where the knowledge of how to implement and run a study is often known implicitly, and thus it is hard to learn if you are outside of the community that uses that knowledge.
We have addressed this book to advanced undergraduates and early graduate students starting to run experiments without previous experience, but we believe this guide will be useful to anyone who is starting to run research studies, training people to run studies, or studying the experimental process. It should also be useful to researchers in industry who are also starting to run studies.
We are generally speaking here from our background running cognitive psychology, cognitive ergonomics, and human–computer interaction studies. Because it is practical advice, we do not cover experimental design or data analyses. This practical advice will be less applicable in more distant areas, or when working in more complex situations, but may still be of use. For example, we do not cover how to use complex apparatus, such as functional magnetic resonance imaging (fMRI) or event-related potential (ERP). We also do not cover field studies or studies that in the United States require a full review by an Institutional Review Board (IRB). This means that we do not cover how to work with unusual populations such as prisoners, animals, and children, or how to take and use measures that include risks to the subjects or experimenter (e.g., saliva, blood samples, or private information).
This book arose during a discussion at Jong Kim's PhD graduation. Ritter asked Kim where he thought more training might have been helpful; the conversation turned to experimental methods and the tactics and details of running studies. During the graduation ceremony, they outlined this book—a worthy genesis for a book, we think. Since then, we and others have used it to teach both in classrooms and as conference tutorials, and it has been expanded, corrected, and extended.
When running an experiment, ensuring its repeatability is of greatest importance—addressing variations in either method or participant behavior is critical. Running an experiment in exactly the same way regardless of who is conducting it or where (e.g., different research teams or laboratories) is essential. In addition, reducing unanticipated variance in the participants’ behavior is key to an experiment's repeatability. This book will help you achieve these requirements, increasing both your comfort and that of the participants in your experiments.
This book consists of several sections, with multiple appendices. As an advance organizer, we briefly describe each section's contents.
Chapter 1, Introduction, provides an overview of the research process and describes where experiments and controlled observation fit into the research process. If you have taken either an experimental methods course or a research design course, you can skip this chapter. If, on the other hand, you are either a new research assistant or working on a project in which you are unclear of your role or how to proceed, this chapter may provide some helpful context. This chapter also introduces several running examples.
Chapter 2, Preparation for Running Experiments, describes pertinent topics for preparing to run your experiment—such as supplemental reading materials, recruitment of participants, choosing experimental measures, and getting IRB approval for experiments involving participants.
Chapter 3, Potential Ethical Problems, describes ethical considerations necessary for safely running experiments with human participants: how to recruit participants ethically, how to handle data gathered from participants, how to use those data, and how to report them. Being vigilant and aware of these topics is an important component of rigorous, as well as ethical, research.
Chapter 4, Risks to Validity to Avoid While Running an Experiment, describes risks that can invalidate your experimental data. If you fail to avoid these types of risks, you may obtain either false or uninterpretable results from your study. Thus, before starting your study, you should be aware of these risks and how to avoid them.
Chapter 5, Running a Research Session, describes practical information about what you have to do when you run a research session. This section will give an example procedure that you can follow.
Chapter 6, Concluding a Study, describes practical information about what to do at the conclusion of each experimental session and at the end of a study.
The Afterword briefly summarizes the book and describes the appendices.
The Appendices include an example checklist for preparing a study, a checklist for setting up a study, an example consent form, an example debriefing form, and an example IRB form. The details and formats of these forms will vary by lab and IRB committee, but the materials in the appendices provide examples of the style and tone. There is also an appendix on how this material could apply to online studies.
A website holding supplementary material is available at http://www.frankritter.com/rbs/.
1 We thank Josh Gross for pointing out this story to us.
Christine Cardone at SAGE provided some encouragement when we needed it, and the production team at SAGE has been very helpful and has greatly improved this book. Numerous people have given useful comments, and when they have used it in teaching we note that here as well. Ray Adams (Middlesex), Michelle Ahrens, Susanne Bahr (Florida Institute of Technology, who suggested the figures at the start of each chapter), Ellen Bass, Gordon Baxter (St. Andrews), Stephen Broomell, Karen Feigh (Georgia Institute of Technology, several times), Katherine Hamilton, William (Bill) Kennedy, Alex Kirlik (U. of Illinois), Michelle Moon, Geoffrey Morgan (CMU), Razvan Orendovici, Erika Poole, Michael (Q) Qin (NSMRL/U. of Connecticut), Joseph Sanford, Robert West (Carleton), Hongbin Wong (U. of Texas/Houston), Kuo-Chuan (Martin) Yeh, Xiaolong (Luke) Zhang (PSU), and several anonymous reviewers have provided useful comments. Ryan Moser and Joseph Sanford in the ACS Lab at Penn State have helped prepare this manuscript. Any incompleteness or inadequacies remain the fault of the authors.
Preparation of this manuscript was partially sponsored by a grant from the Division of Human Performance Training and Education at the Office of Naval Research, under Contracts W911QY-07-01-0004 and N00014-11-1-0275. The views and conclusions contained in this report are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government or The Pennsylvania State University.
About the Authors
Frank E. Ritter is a professor of information sciences and technology, of psychology, and of computer science and engineering at Penn State. He researches the development, application, and methodology of cognitive models—particularly as applied to interface design, predicting the effect of behavioral moderators, and understanding learning—and conducts experiments to test these models. With Martin Yeh, he has developed a popular iPhone app, CaffeineZone, for predicting the time course and effects of caffeine. His lab is building and testing tutors on several topics. His report on applying cognitive models in synthetic environments was published by the Human Systems Information Analysis Center as a State of the Art Report (2003). His book on order effects on learning was published in 2007 by Oxford, and he contributed to a National Research Council report on how to use cognitive models to improve human-system design (Pew & Mavor, 2007). He is working on a textbook addressing the ABCs of the psychology that systems designers need to know (to be published by Springer), which has repeatedly won awards at the Human–Computer Interaction Consortium's annual meeting.
His papers on modeling have won awards; one on high-level languages, with St. Amant, was selected for the “Siegel-Wolf Award for Best Applied Modeling Paper” at the International Conference on Cognitive Modeling, and four have won awards at the Behavior Representation in Modeling and Simulation Conference. He currently edits the Oxford Series on Cognitive Models and Architectures for Oxford University Press. He serves on the editorial boards of Cognitive Systems Research, Human Factors, and IEEE SMC, Part A: Systems and Humans.
Jong W. Kim is a research faculty member in the Department of Psychology at the University of Central Florida. He received his PhD in the Department of Industrial Engineering at the Pennsylvania State University. His academic pursuit is to improve cognitive systems supporting optimized user performance. To that end, he runs experiments with human subjects and models human cognition. His recent research, sponsored by the Office of Naval Research, has investigated skill learning and forgetting, and he has developed a theory of skill retention that is being applied to a couple of intelligent tutoring systems. His current research projects focus on the influence of affect on the three stages of learning by an understanding of non-vocal expressions. Particularly, he is interested in helping autistic children learn social communication skills with human-centered computer systems.
Jonathan H. Morgan is a research assistant and lab manager for Penn State's Applied Cognitive Science Lab, where he manages people running studies about learning, retention, and usability. Morgan has published in Computational and Mathematical Organization Theory, received two paper awards from the Behavior Representation in Modeling and Simulation conference committee, and coauthored papers published in the proceedings of the annual conference of the Cognitive Science Society, the International Conference on Cognitive Modeling, and the annual conference of the Biologically Inspired Cognitive Architectures Society. He has also contributed to the design, development, and testing of two tutors. His current research includes modeling socio-cognitive processes and examining the acquisition of procedural knowledge in complex tasks.
Richard A. Carlson is professor of psychology at Penn State University, where he has been on the faculty for 27 years. He received his BSS from Cornell College and his PhD from the University of Illinois. He conducts experiments examining cognitive control, cognitive skill, and conscious awareness, focusing on control at the time scale of 1 second or less. Previous research has addressed topics such as causal thinking, the development of troubleshooting skill, task switching, the role of gesture in mental arithmetic, and the structure of conscious intentions. Current research projects focus on the role of affect in working memory and cognitive control, the effect of cognitive workload on metacognition, and changes in metacognition with increasing skill. He has published in journals such as Journal of Experimental Psychology: Learning, Memory, and Cognition; Memory & Cognition; and Human Factors. His book, Experienced Cognition (1998), which describes a theory of consciousness and cognitive skill, won a CHOICE Outstanding Academic Book award. Professor Carlson currently serves as associate head and director of Undergraduate Studies in Penn State's Department of Psychology. He is the founding coordinator of the department's online psychology major. In 2009, he received an Outstanding Faculty Adviser award. He serves on the editorial boards of the Journal of Experimental Psychology: Learning, Memory, and Cognition; Behavior Research Methods; and the American Journal of Psychology. He is a fellow of the American Psychological Association. His website is http://psych.la.psu.edu/directory/faculty-bios/carlson.html.
There are many books available concerning research methods and related statistical analyses. We realized, however, that students usually have few opportunities to learn and practice the implicit knowledge associated with running their own experiments, and there are no books that we are aware of that either formalize or teach students about these practical considerations (e.g., preparing experimental environments and scripts, coordinating participants, or managing data).
Students charged with running experiments frequently lack specific domain knowledge regarding experimental methods. Consequently, young researchers chronically make preventable mistakes. With this book, we hope to provide practical knowledge about running experiments for students and industrial researchers. The topics and guidance contained in this book arise from the authors' collective experience in both running experiments and mentoring students.
To summarize, the book covered four major topics. First, we discussed how to run a study safely. This included how to recruit participants ethically and methods for minimizing risks to participants and experimenters.
Second, we discussed experimental repeatability and validity. We described methods for ensuring repeatability so that others can replicate and validate your work. For example, we described how to present the same experience to each subject across every session (and how to minimize differences between study sessions). This is an important aspect of being able to replicate and interpret the data. We also discussed strategies for mitigating risks, such as experimenter or demand effects that might jeopardize the uniformity of this experience.
Third, we discussed how to potentially gain further insights from the experimental process. These insights may or may not be outside the strict bounds of your experiment, but in either case, they can lead to new and often very productive work. We described approaches to piloting, debriefing, and data analysis that make further insights more likely, and we provided anecdotal examples of these approaches in action. You may not be able to examine all of the insights with the study you are running, but you can analyze many of them in later studies.
Fourth, we discussed recording data and reporting results. The purpose of setting up and running a study is to learn something new and to share it. The work is not completely done until it is published. Some publications can come much later than we might anticipate or prefer, so documenting the steps, the data, and the analyses will help with this payoff. We discussed matching study goals to publishing goals, including the potential ramifications of publishing goals for the Institutional Review Board process. Examples of the forms you will need are provided in the appendices.
Stepping back for a moment, we recognize that further methods of gathering, analyzing, and visualizing data are being developed. Though these changes will impact the development of future experimental procedures, the gross structures of a study and the aspects we have discussed here (piloting, scripts, anonymizing data, etc.) are unlikely to change.
As you venture into research, you will find new topics that will interest you. In this text, we are unable to examine all the populations or touch on all the measures and tools necessary for exploring research questions in cognitive science, human factors, human–computer interaction, or human–robot interaction. Consequently, we are not able to cover in detail the collection of eye-tracking data, biological specimens, or functional magnetic resonance imaging data. Nevertheless, we believe this book will allow you to anticipate many of the risks associated with research design and implementation in these areas. In essence, we believe you will ask the right questions that will allow you to successfully run studies using these techniques, when supplemented by further reading, consultation with colleagues, and practice.
Running studies is often exciting work, and it helps us understand how people think and behave. It offers a chance to improve our understanding. We wish you good luck, bonne chance, as you endeavor to learn and share a little bit more about the world.
Appendix 1: A Checklist for Preparing Studies
This checklist contains high-level steps that are nearly always necessary for conducting studies with human participants. As an experimenter or a principal investigator for your project, you need to complete the tasks below to set up a study. You might use this list verbatim, or you might modify it to suit your study. The list is offered in serial order, but work might go on in parallel or in a different order.
Appendix 2: Example Scripts for Running Studies
A High-Level Script for a Human–Computer Interaction Study
This is an example high-level summary script for an experiment. While experiments and controlled observations will differ across types and across studies, this script includes many common elements. It was used for Kim's (2008) PhD thesis study and has been slightly revised.
Experimenter's Guide
Every experimenter should follow these procedures to run our user study about skill retention.
- Before you come to the lab, dress appropriately for the experiment. [Also see section 5.1.3 covering the role of a dress code.]
- Before your participants arrive, you need to set up a set of the experiment apparatus.
- Start RUI (Recording User Input) in the Terminal Window. (See details below.)
- Start the Emacs text editor.
- Prepare disposable materials and handouts, such as the informed consent form.
- Turn on the monitor located in the experimental room so that you can monitor the participant outside the room.
- Welcome your participants when they arrive.
- Put a sign on the door indicating that you are running subjects when the experiment starts.
- Give the Institutional Review Board approved consent form to the participants and have them read it.
- If they consent, start the experiment.
- Briefly explain what they are going to do.
- Give them the study booklet. Participants can use 30 min. maximum to study the booklet.
- While participants are reading the booklet, you can answer their questions about the task.
- When the session is finished, give an explanation about the payment or extra credit. Thank them; give them a debriefing form. Also, if there are any additional sessions, remind them.
- Take down the sign on the door when the experiment is done.
- Copy the data to the external hard drive.
- Shut down the apparatus.
- Prepare supplies for the next subject.
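The step "Copy the data to the external hard drive" in the list above is easy to get wrong at the end of a long session. The following is a minimal sketch of one way to make it safer, not the procedure used in the original study: it copies a data file, refuses to overwrite an earlier backup, and confirms that the copy matches the original. The function names and any paths passed to them are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_session_data(source, backup_dir):
    """Copy one data file to backup_dir and confirm the copy is identical.

    Hypothetical helper: refuses to overwrite an existing backup so that
    data from earlier sessions stay intact.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    if target.exists():
        raise FileExistsError(f"{target} already exists; not overwriting")
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"copy of {source} does not match the original")
    return target
```

Refusing to overwrite is a deliberate choice here: a session file that reuses a name is more likely a mistake than an intended replacement, so the experimenter has to resolve it by hand.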
RUI will be used to log keystrokes and mouse actions of the participant. RUI requires Mac OS X 10.3 (Panther) or later versions. It has been tested up to Mac OS X 10.5.8 (Leopard). For RUI to record user inputs, “Enable access for assistive devices” must be enabled in the Universal Access preference pane. To start RUI:
- Launch Terminal.
- In Terminal, type the following command: ./rui -s "SubjectID" -r ~/Desktop/ruioutput.txt
- You will get this message: “rui: standing by—press ctrl+r to start recording …”
- Press “CTRL+r”
- To stop recording, press “CTRL+s”
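The launch steps above can be sketched as a small helper that assembles the same command line for a given subject, which reduces typing errors in the subject ID. This is only an illustration: the function names are hypothetical and not part of RUI, and the sketch uses only the -s and -r options shown in the steps above. The second helper sets the execute bit on the rui file, which is the equivalent of running chmod a+x on it.

```python
import os
import shlex
import stat


def build_rui_command(subject_id, rui_dir=".", output_path="~/Desktop/ruioutput.txt"):
    """Return the argument list for the RUI launch command shown above.

    Hypothetical helper: rui_dir is assumed to be the directory holding rui.
    """
    if not subject_id.strip():
        raise ValueError("subject_id must not be empty")
    rui = os.path.join(rui_dir, "rui")
    return [rui, "-s", subject_id, "-r", os.path.expanduser(output_path)]


def ensure_executable(path):
    """Add execute permission for all users (the equivalent of chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    # Print a command that can be pasted into Terminal; the experimenter
    # still presses CTRL+r there to start recording.
    print(shlex.join(build_rui_command("S01")))
```

The helper returns a list rather than a single string so that subject IDs containing spaces are passed as one argument if the command is later run from a program.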
Note: If you see the message “-bash: ./rui: Permission denied” in the Terminal window, you need to type “chmod a+x rui” while you are in the RUI directory.
A More Detailed Script for a Cognitive Psychology Experiment
This script, slightly revised, was used in conducting an experiment reported in Carlson and Cassenti (2004).
- Access the names of participants from the subject pool. Go to subject pool under “favorites” in Explorer, type in experiment number 1013 and password ptx497. Click on the button labeled “view (and update) appointments.” Write down the names of participants on the log sheet before they start arriving.
- Turn on computers in the subject running rooms if they aren't already on. If a dialog box comes up asking for you to log in, just hit cancel.
- As participants arrive, check off their names on your list of participants. Make sure they are scheduled for our experiment—sometimes students go to the wrong room.
- Give each participant two copies of the informed consent form (found in the wooden box under the bulletin board). Make sure they sign both copies and you sign both copies. Make sure to ask if the participant has ANY questions about the informed consent form.
- Fill out the subject running sheet with subject's FIRST name only, handedness (right or left), gender, the room in which he or she will be run, and your name.
- Begin the experiment by clicking on “simple counting” file on desktop. Once the program opens, press F7. Enter the subject number from the subject running sheet. When it asks for session number, you should always enter “1.” Double-check the information when the confirmation box comes up. If the next screen asks you if it's okay to overwrite data, click “no” and put in a different subject number, changing the experiment sheet as needed. If you want to do all this while the participant is reading the informed consent to save time, go right ahead, but make sure to answer any informed-consent–related questions the participant may have.
- Take the participant to the room and say the following: “This experiment is entirely computerized, including the instructions. I'll read over the instructions for the first part of the experiment with you.” Read the instructions on the screen verbatim. Ask if the participant has any questions. After answering any questions the participant may have, leave the room and shut the door behind you. Place the “Experiment in Progress” sign on the door.
- At two points during the experiment, subjects will see a screen asking them to return to Room 604 for further instructions. When they come out, you can lead them back to the room, taking along the paper for Break #1 and a pen. Read aloud to them the instructions printed on the top of the sheet and ask if they have any questions. Give the participants 2 minutes to work on their list, then come back in and press the letter g (for “go on”). This will resume the experiment where they left off. Ask again if they have any questions, then leave the room again and allow them to resume the experiment. The second time the subject returns to 604, follow the same procedure, this time with the instructions and paper for Break #2.
- Fill out the log sheet if you haven't done so. You should have the necessary information from the subject pool. If somebody is signed up but doesn't show up, fill out the log sheet for that person anyway, writing “NS” next to the updated column.
- Fill out a credit slip for each participant, and be sure to sign it.
- Update participants on the web. Anyone who doesn't show up (and hasn't contacted us beforehand) gets a no-show. People who do show up on time should be given credit. If they come too late to be run, you may cancel their slot.
- Participants should leave with three things: a filled out credit receipt, a signed informed consent form, and a debriefing sheet. Ask them if they have any other questions, and do your best to answer them. If you don't know the answer, you can refer them to Rachel or Rich (info at the bottom of debriefing). Make sure to thank them for their participation.
- When done for the day, lock up subject running rooms (unless someone is running subjects immediately after you and is already there when you leave). If you are the last subject runner of the day, please turn off the computers. Always lock up the lab when you leave unless someone else is actually in the lab.
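The script above tells the experimenter to pick a different subject number rather than overwrite existing data. A data-collection program can apply the same safeguard itself. The following is a minimal sketch, assuming one data file per subject and session; the file-naming scheme and function names are hypothetical and do not describe the program used in Carlson and Cassenti (2004).

```python
from pathlib import Path


def data_file_for(data_dir, subject_number, session=1):
    """Return the (hypothetical) data file path for a subject and session."""
    return Path(data_dir) / f"subject{subject_number:03d}_session{session}.txt"


def next_free_subject_number(data_dir, requested, session=1):
    """Return `requested` if it has no data file yet, else the next free number."""
    number = requested
    while data_file_for(data_dir, number, session).exists():
        number += 1
    return number
```

If the returned number differs from the one requested, the subject running sheet should be updated to match, as the script instructs.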
Appendix 3: Example Consent Form
You can refer to this example of an informed consent form, taken from Kim's (2008) thesis study, when you need to generate one for your experiment.
Appendix 4: Example Debriefing Form
This is the debriefing form, very lightly edited, used in the study reported in Ritter, Kukreja, and St. Amant (2007).
Appendix 5: Example Institutional Review Board Application
Your Institutional Review Board (IRB) will have its own review forms. These forms are based on each IRB's institutional history and the types of studies and typical problems (and atypical problems) they have had to consider over time. Thus, the form we include here can be seen only as an example form. We include it to provide you with an example of the types of questions and, more important, the types of answers characteristic of the IRB process (at least at PSU). You are responsible for the answers, but it may be useful to see how long answers are and how detailed they need to be.
Following is a form used in one of our recent studies in the lab (Paik, 2011), slightly revised to correct some errors.
Appendix 6: Considerations when Running a Study Online
Many studies are now moving “online”—that is, the subjects are interacting with experiments that are run online through a web browser (the Social Psychology Network provides a list of such studies at http://www.socialpsychology.org/expts.htm). Using online studies, when properly done, has the ability to greatly increase your sample size, and these studies certainly offer the possibility of a much more diverse sample. You can, however, lose experimental control (you won't actually know who is participating in many circumstances), and some technical sophistication may be required to create and use an online study.
Online studies have some special considerations. This section notes a few considerations to keep in mind when running these studies. This section does not consider the choice of tools to run a study, such as Amazon's Mechanical Turk or commercial tools to create surveys, because the book focuses on how to start running studies, not how to design, implement, or analyze them, per se. This appendix is also not complete because online studies are a growing area, and this appendix is designed only to introduce you to some of the issues in this area. For more complete treatments, see the references in the “Further Readings” section.
Recruiting Subjects
If you are recruiting subjects to participate in a study, you might choose to go online to recruit them. If you do so, keep in mind that the request should be fair, and if your study is under an Institutional Review Board (IRB), how you recruit must go through the IRB as well.
Sharing information and drawing attention to opportunities appropriately requires a delicate balance, one that most people understand in the real world but that we are still learning in the online world. We have argued previously (Cheyne & Ritter, 2001) that you should not recruit subjects through unsolicited direct e-mail, although our university does this to distraction at times. For example, it seems inappropriate to send announcements about competitions to create “learning badges” to “professors at universities we could find,” as a private university in Durham, North Carolina, recently did.
Putting the flyer (or announcement) on a very relevant mailing list can be acceptable if such a mailing list is available and appropriate. Posting announcements of studies on related websites can also be very appropriate. It may also be advisable, though it is perhaps overlooked, to disseminate announcements for online studies through the same channels you would use for a non-online study, such as flyers and class announcements. But it is very inappropriate to send such materials where there is little chance of finding suitable subjects or where the people being recruited are reluctant to participate. Seek advice if you have doubts.
If your subjects are recruited in such a way that you don't see them, you might wish to take a few more demographic measures, depending on your theory and hypothesis: for example, what countries subjects are in (if your software can't tell from the IP addresses of their machines), their level of education, and their first language. One of the clearest summaries of the underlying problem, that an online sample is self-selected, appeared in Lock Haven University's student newspaper (October 14, 2010, p. A7) in a note about its online poll:
This … poll is not scientific and reflects the opinions of only those Internet users who have chosen to participate. The results cannot be assumed to represent the opinions of Internet users in general, nor the public as a whole.
If you can work around this restriction (for example, if you are looking for best performance or for examples rather than for representative estimates), then your results will be worthwhile. If you present the results as representative, then you are subject to this restriction.
If the link to your software has been widely disseminated, you should have the software fail gracefully after the study is done. For example, if your survey is no longer up on your web server, you could put a page up noting this and thanking those who have participated.
Apparatus
Because the apparatus for gathering the data will be automatic and you will not be able to answer questions that arise (in most cases), the interaction needs to be particularly clear and correct. So you should run more extensive pilot studies than you would for other studies, examine the interaction experience yourself, and have the principal investigator and other research assistants use the apparatus to make sure there are no typos, unclear wordings, or other potential problems. You should also back up information from the server you are using to another machine daily.
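As one illustration of the daily backup suggested above, the following Python sketch copies a study's data folder into a dated folder and then checks the copy. It is a minimal example under stated assumptions, not a prescribed procedure: the function names `backup_study_data` and `verify_backup` are hypothetical, the backup location (`backup_root`) is assumed to be a folder that actually lives on another machine (such as a mounted network drive), and something like a scheduled task is assumed to run the script once a day.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import List, Optional


def backup_study_data(data_dir: Path, backup_root: Path,
                      today: Optional[date] = None) -> Path:
    """Copy data_dir into backup_root/<YYYY-MM-DD>/ and return that folder.

    Hypothetical helper: backup_root is assumed to be storage on another
    machine (for example, a mounted network drive).
    """
    today = today or date.today()
    target = backup_root / today.isoformat()
    # dirs_exist_ok=True lets a second run on the same day refresh the copy.
    shutil.copytree(data_dir, target, dirs_exist_ok=True)
    return target


def verify_backup(data_dir: Path, backup_dir: Path) -> List[str]:
    """Return relative paths of files that are missing or differ in size."""
    problems = []
    for src in data_dir.rglob("*"):
        if src.is_file():
            rel = src.relative_to(data_dir)
            dst = backup_dir / rel
            if not dst.is_file() or dst.stat().st_size != src.stat().st_size:
                problems.append(rel.as_posix())
    return sorted(problems)
```

Checking the copy is worth the extra step because a backup that fails silently is easy to miss. Comparing file sizes is only a quick check and does not prove that the contents match.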
If your apparatus is taking timing information, you should test this and not take it for granted. A timer that reports user interaction times with millisecond precision doesn't necessarily generate time stamps that are accurate to a millisecond. This can be difficult to test, but before you report timing data, you should attempt to measure its accuracy.
Gaming your Apparatus
You should check your data daily. This will help you tell how subject recruitment is going. It will also help you see if a person or group is gaming the experiment. Subjects might complete the study multiple times because it is fun (but this might not provide useful data or might slow down your server), or they might enjoy “messing up” your experiment. If you find anomalies, you should contact your principal investigator with these concerns. You should also discuss the criteria for removing data that you believe were not provided in earnest.
Further Readings
Kraut et al. (2004). Psychological research online: Report of Board of Scientific Affairs’ Advisory Group on the Conduct of Research on the Internet. American Psychologist, 59(2), 105–117. http://dx.doi.org/10.1037/0003-066X.59.2.105
Yeager et al. (2011). Comparing the accuracy of RDD telephone surveys and Internet surveys conducted with probability and non-probability samples. Public Opinion Quarterly, 75, 709–747. http://dx.doi.org/10.1093/poq/nfr020
These papers by Kraut et al. and Yeager et al., available online, describe some of the theoretical differences between real-world and Internet studies, and online and telephone surveys, including the need to understand who your respondents are.
(2007). Oxford handbook of Internet psychology. New York: Oxford University Press. http://dx.doi.org/10.1093/oxfordhb/9780199561803.001.0001
This book includes a section (eight chapters) on doing research on the Internet.
References
(2005). Effects of pixel shape and color, and matrix pixel density of Arabic digital typeface on characters’ legibility. International Journal of Industrial Ergonomics, 35(7), 652–664.
American Foundation for the Blind. (2012, January). Interpreting Bureau of Labor Statistics employment data. Retrieved from http://www.afb.org/Section.asp?SectionID=15&SubTopicID=177
American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). New York: Author.
(2004). Eye movements do not reflect retrieval processes. Psychological Science, 15(4), 225–231. http://dx.doi.org/10.1111/j.0956-7976.2004.00656.x
(2002). Using multidisciplinary expert evaluations to test and improve cognitive model interfaces. In Proceedings of the 11th Computer Generated Forces Conference (pp. 553–562). Orlando: University of Central Florida.
(2010). Review of human studies methods in HRI and recommendations. International Journal of Social Robotics, 2, 347–359. http://dx.doi.org/10.1007/s12369-010-0064-9
(1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13, 3–16.
(2001). The Spiral Model as a tool for evolutionary acquisition. Crosstalk: The Journal of Defense Software Engineering, 14(5), 4–11.
(2003). Averaging learning curves across and within participants. Behavior Research Methods, Instruments and Computers, 35, 11–21. http://dx.doi.org/10.3758/BF03195493
(1897). Studies in the physiology and psychology of the telegraphic language. Psychological Review, 4, 27–53. http://dx.doi.org/10.1037/h0073806
(1963). Experimental and quasi-experimental designs for research. Boston: Houghton Mifflin.
(2007). What do the hands externalize in simple arithmetic? Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 747–756. http://dx.doi.org/10.1037/0278-73184.108.40.2067
(2004). Intentional control of event counting. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1235–1251. http://dx.doi.org/10.1037/0278-73220.127.116.115
Carroll, J. M. (Ed.). (2000). HCI models, theories, and frameworks: Toward a multidisciplinary science. Burlington, MA: Morgan Kaufmann.
(2009). Transfer of computer-based training to simulated driving in older adults. Applied Ergonomics, 40, 943–952. http://dx.doi.org/10.1016/j.apergo.2009.02.001
Cheyne & Ritter (2001). Targeting respondents on the Internet successfully and responsibly. Communications of the ACM, 44(4), 94–98. http://dx.doi.org/10.1145/367211.367276
(1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
(1992). A power primer. Psychological Bulletin, 112, 155–159. http://dx.doi.org/10.1037/0033-2909.112.1.155
(2004). Methods in behavioral research (8th ed.). New York: McGraw-Hill.
(1959). A theory of the acquisition of speed-skill. Ergonomics, 2, 153–166. http://dx.doi.org/10.1080/00140135908930419
Darley, J. M., Zanna, M. P., & Roediger, H. L. (Eds.). (2003). The compleat academic: A practical guide for the beginning social scientist (2nd ed.). Washington, DC: American Psychological Association.
(1996). Perception and memory in chess. Assen, Netherlands: Van Gorcum.
(1998). The strategy specific nature of improvement: The power law applies by strategy within task. Psychological Science, 9(1), 1–8. http://dx.doi.org/10.1111/1467-9280.00001
(2004). The role of representative design in an ecological approach to cognition. Psychological Bulletin, 130, 959–988. http://dx.doi.org/10.1037/0033-2909.130.6.959
(1994). Equity in authorship: A strategy for assigning credit when publishing. Social Science & Medicine, 38(1), 55–58.
(1964). Memory: A contribution to experimental psychology. New York: Dover. (Original work published 1885) http://dx.doi.org/10.1037/10011-000
(1980). Protocol analysis: Verbal reports as data. Psychological Review, 87, 215–251. http://dx.doi.org/10.1037/0033-295X.87.3.215
(1993). Protocol analysis: Verbal reports as data (2nd ed.). Cambridge, MA: MIT Press.
(1956). The problem of inference from group data. Psychological Bulletin, 53, 134–140. http://dx.doi.org/10.1037/h0045156
(2003). When your eyes have a wet nose: The evolution of the use of guide dogs and establishing the seeing eye. Survey of Ophthalmology, 48(4), 452–458. http://dx.doi.org/10.1016/S0039-6257%2803%2900052-3
(1954). The information capacity of the human motor system in controlling amplitude of movement. Journal of Experimental Psychology, 47(6), 381–391. http://dx.doi.org/10.1037/h0055392
(2008). Implementierung von schematischen Denkstrategien in einer höheren Programmiersprache: Erweitern und Testen der vorhandenen Resultate durch Erfassen von zusätzlichen Daten und das Erstellen von weiteren Strategien [Implementing diagrammatic reasoning strategies in a high level language: Extending and testing the existing model results by gathering additional data and creating additional strategies]. Faculty of Information Systems and Applied Computer Science, University of Bamberg, Germany.
(1988). Development of the NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human mental workload (pp. 139–185). Amsterdam: North Holland.
(2000). Repealing the power law: The case for an exponential law of practice. Psychonomic Bulletin & Review, 7, 185–207. http://dx.doi.org/10.3758/BF03212979
(1976). Orientation and mobility techniques: A guide for the practitioner. New York: American Foundation for the Blind.
(2008). Fundamental statistics for the behavioral sciences (6th ed.). Belmont, CA: Thomson Wadsworth.
(1993). Handbook of individual differences, learning, and instruction. Hillsdale, NJ: Erlbaum.
(2000). Using a cognitive architecture to examine what develops. Psychological Science, 11(2), 93–100. http://dx.doi.org/10.1111/1467-9280.00222
(1989). Using video in the BNR usability lab. SIGCHI Bulletin, 21(2), 92–95. http://dx.doi.org/10.1145/70609.70624
(2004). Design and analysis: A researcher's handbook. Upper Saddle River, NJ: Prentice Hall/Pearson Education. http://dx.doi.org/10.1007/978-1-4419-6766-4
(2009). Drop-off detection with the long cane: Effects of different cane techniques on performance. Journal of Visual Impairment & Blindness, 103(9), 519–530.
(2008). Procedural skills: From learning to forgetting. Unpublished doctoral dissertation, Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA.
(2007). Investigation of procedural skills degradation from different modalities. In Proceedings of the 8th International Conference on Cognitive Modeling (pp. 255–260). Oxford, UK: Taylor & Francis/Psychology Press.
(2007). Automatically recording keystrokes in public clusters with RUI: Issues and sample answers. In Proceedings of the 29th Annual Cognitive Science Society (p. 1787). Austin, TX: Cognitive Science Society.
(2010). Brunswikian theory and method as a foundation for simulation-based research on clinical judgment. Simulation in Healthcare, 5(5), 255–259. http://dx.doi.org/10.1097/SIH.0b013e3181f12f03
(2006). RUI: Recording user input from interfaces under Windows and Mac OS X. Behavior Research Methods, 38(4), 656–659. http://dx.doi.org/10.3758/BF03193898
(1989). Simulator design and instructional features for air-to-ground attack: A transfer study. Human Factors, 31, 87–99.
(1995). Ethics, lies and videotape. In Proceedings of ACM CHI ′95 Human Factors in Computing Systems (pp. 138–145). Denver, CO: ACM Press.
(2001). STEP: A system for teaching experimental psychology using E-Prime. Behavior Research Methods, Instruments, & Computers, 33(2), 287–296. http://dx.doi.org/10.3758/BF03195379
(1989). The Space Fortress game. Acta Psychologica, 71, 17–22. http://dx.doi.org/10.1016/0001-6918%2889%2990003-6
(1982). Visual factors and orientation-mobility performance. American Journal of Optometry and Physiological Optics, 59(5), 413–426. http://dx.doi.org/10.1097/00006324-198205000-00009
(2003). Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology, 57, 203–220.
(2012). Research design explained (8th ed.). Belmont, CA: Wadsworth.
(2001). Design and analysis of experiments (5th ed.). New York: John Wiley & Sons.
(2011). Using a cognitive model to provide instruction for a dynamic task. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 2283–2288). Austin, TX: Cognitive Science Society.
(in press). A design, tests, and considerations for improving keystroke and mouse loggers. Interacting with Computers.
(2011). Falling on deaf ears. The Psychologist, 24(3), 178–181. Retrieved from http://www.thepsychologist.org.uk/archive/archive_home.cfm/volumeID_24-editionID_198-ArticleID_1806-getfile_getPDF/thepsychologist/0311munro.pdf
(2002). AAAI/RoboCup-2001 Urban Search and Rescue Events: Reality and competition. AI Magazine, 23(1), 37–42.
NASA. (1987). NASA Task Load Index (NASA-TLX) Version 1.0: Computerized version. Moffett Field, CA: Human Performance Research Group, NASA Ames Research Center. Retrieved from http://humansystems.arc.nasa.gov/groups/TLX/downloads/TLX_comp_manual.pdf
(1997). A cognitive model of agents in a commons dilemma. In Proceedings of the 19th Annual Conference of the Cognitive Science Society (pp. 560–565). Mahwah, NJ: Erlbaum.
(1972). Human problem solving. Englewood Cliffs, NJ: Prentice Hall. http://dx.doi.org/10.1037/h0048495
(1994). Usability laboratories. Behaviour & Information Technology, 13(1–2), 3–8. http://dx.doi.org/10.1080/01449299408914577
(1990). Heuristic evaluation of user interfaces. In Proceedings of CHI ′90 (pp. 249–256). New York: ACM.
(1992). Artificial instruction: A method for relating learning theory to instructional design. In M. Jones & P. H. Winne (Eds.), Adaptive learning environments: Foundations and frontiers (pp. 55–83). Berlin: Springer-Verlag.
(2000). Demand characteristics. In A. E. Kazdin (Ed.), Encyclopedia of psychology (pp. 469–470). Washington, DC: American Psychological Association and Oxford University Press.
Paik (2011). A novel training paradigm for knowledge and skills acquisition: Hybrid schedules lead to better learning for some but not all tasks. Unpublished doctoral thesis, Industrial Engineering, The Pennsylvania State University, University Park, PA.
(1978). Exploring predecisional behavior: An alternative approach to decision research. Organizational Behavior and Human Performance, 22, 17–44. http://dx.doi.org/10.1016/0030-5073%2878%2990003-X
Pew, R. W., & Mavor, A. S. (Eds.). (2007). Human-system integration in the system development process: A new look. Washington, DC: National Academy Press. Retrieved from http://books.nap.edu/catalog.php?record_id=11893 http://dx.doi.org/10.1518/155534308X377063
(1995). Skill acquisition and human performance. Thousand Oaks, CA: Sage.
(2003). Methods: Toward a science of behavior and experience (7th ed.). Belmont, CA: Wadsworth/Thomson Learning.
(1988). Feeling of knowing and strategy selection for solving arithmetic problems. Bulletin of the Psychonomic Society, 26(6), 495–496.
(1992). What determines initial feeling of knowing? Familiarity with question terms, not the answer. Journal of Experimental Psychology: Learning, Memory & Cognition, 18(3), 435–451. http://dx.doi.org/10.1037/0278-7318.104.22.1685
(2007). The effects of visual display distance on eye accommodation, head posture, and vision and neck symptoms. Human Factors, 49(5), 830–838. http://dx.doi.org/10.1518/001872007X230208
(1989). The effect of feature frequency on feeling-of-knowing and strategy selection for arithmetic problems. Unpublished master's thesis, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA.
(in press). The basics of human-system interaction: What system designers really need to know about people. New York: Springer.
(2011). Practical aspects of running experiments with human participants. In Universal Access in HCI, Part I, HCII 2011, LNCS 6765 (pp. 119–128). Berlin: Springer-Verlag.
(2007). Including a model of visual processing with a cognitive architecture to model a simple teleoperation task. Journal of Cognitive Engineering and Decision Making, 1(2), 121–147. http://dx.doi.org/10.1518/155534307X232811
(1994). Developing process models as summaries of HCI action sequences. Human-Computer Interaction, 9, 345–383. http://dx.doi.org/10.1207/s15327051hci0903&4_4
(2011). Determining the number of model runs: Treating cognitive models as theories by not sampling their behavior. In L. Rothrock & S. Narayanan (Eds.), Human-in-the-loop simulations: Methods and practice (pp. 97–116). London: Springer-Verlag.
(2001). The learning curve. In W. Kintsch, N. Smelser, & P. Baltes (Eds.), International encyclopedia of the social and behavioral sciences (Vol. 13, pp. 8602–8605). Amsterdam: Pergamon.
(2004). What should they be called? APS Observer, 17(4), 46–48.
(1987). Learning by chunking, a production system model of practice. In D. Klahr, P. Langley, & R. Neches (Eds.), Production system models of learning and development (pp. 221–286). Cambridge, MA: MIT Press.
(2002). Usability engineering: Scenario-based development of human-computer interaction. San Francisco: Morgan Kaufmann.
(2001). An integrated model of eye movements and visual encoding. Cognitive Systems Research, 1(4), 201–220. http://dx.doi.org/10.1016/S1389-0417%2800%2900015-2
(2009). Rapid prototyping and evaluation of in-vehicle interfaces. ACM Transactions on Computer-Human Interaction, 16(2), Article 9, 33 pages.
(2000). Identifying fixations and saccades in eye-tracking protocols. In Proceedings of the Eye Tracking Research and Applications Symposium (pp. 71–78). New York: ACM Press.
(1994). Exploratory sequential data analysis: Foundations. Human-Computer Interaction, 9(3–4), 251–317. http://dx.doi.org/10.1207/s15327051hci0903&4_2
(2000). Argus Prime: Modeling emergent microstrategies in a complex simulated task environment. In Proceedings of the 3rd International Conference on Cognitive Modeling (pp. 260–270). Veenendaal, Netherlands: Universal Press.
(2001). Argus: A suite of tools for research in complex cognition. Behavior Research Methods, Instruments, & Computers, 33(2), 130–140. http://dx.doi.org/10.3758/BF03195358
(1993). Thoughts beyond words: When language overshadows insight. Journal of Experimental Psychology: General, 122, 166–183. http://dx.doi.org/10.1037/0096-3422.214.171.124
(1963). Discrimination reaction time for a 1,023-alternative task. Journal of Experimental Psychology, 66(3), 215–226. http://dx.doi.org/10.1037/h0048914
(1987). The perils of averaging data over strategies: An example from children's addition. Journal of Experimental Psychology, 115, 250–264.
(2005). Naïve Realism: Misplaced faith in the utility of realistic displays. Ergonomics in Design, 13(3/Summer), 6–13. http://dx.doi.org/10.1177/106480460501300303
(2000). Foundations of privacy protection from a computer science perspective. In Proceedings, Joint Statistical Meeting. Indianapolis, IN: American Association for the Advancement of Science.
(1975). What is an article worth? Journal of Political Economy, 83(5), 951–967. http://dx.doi.org/10.1086/260371
(2007). Getting out of order: Avoiding lesson effects through instruction. In F. E. Ritter, J. Nerb, T. O'Shea, & E. Lehtinen (Eds.), In order to learn: How the sequences of topics affect learning (pp. 169–179). New York: Oxford University Press.
(2006). A Bayesian perspective on hypothesis testing: A comment on Killeen (2005). Psychological Science, 17, 641–642. http://dx.doi.org/10.1111/j.1467-9280.2006.01757.x
(1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604. http://dx.doi.org/10.1037/0003-066X.54.8.594
(1990). Robert Sessions Woodworth and the “Columbia Bible”: How the psychological experiment was redefined. American Journal of Psychology, 103(3), 391–401. http://dx.doi.org/10.2307/1423217
Wisconsin Department of Health Services. (2006). Sighted guide techniques. Madison, WI: Author. Retrieved from http://www.dhs.wisconsin.gov/blind/adjustment/sightedguide.pdf
(1938). Experimental psychology. Oxford, UK: Holt. http://dx.doi.org/10.1097/00005053-194006000-00068
(2010). One Laptop per Child: Polishing up the XO Laptop user experience. Ergonomics in Design, 18(3), 8–13. http://dx.doi.org/10.1518/106480410X12793210871708