Reaction Time Effects in Lab- Versus Web-Based Research: Experimental Evidence

    • 00:10

      BEN HILBIG: So good morning, everyone. What I want to talk about is something that I never really thought about intensively, and a growing number of times, reviewers or editors pestered me with this. And I figured that sometime I have to look at this a bit more systematically. So needless to say, and I'm going

    • 00:30

      BEN HILBIG [continued]: to start with the trivial part, is that over the past, say, at least one decade, maybe two, web-based research has really grown exponentially. And I guess there's really good reasons for this, and those have been documented well, particularly that we can assess large samples, heterogeneous samples in very little time.

    • 00:50

      BEN HILBIG [continued]: It costs less and so forth. So there's pretty good incentives for doing web-based research in general. And also, if you look at the comparability of web-based and more traditional approaches, you usually find that no matter what area we're talking about, so be it personality assessment, ability, perception, cognition,

    • 01:12

      BEN HILBIG [continued]: that the stuff we find on the web is pretty much comparable to what we're traditionally looking at in the lab and with paper-pencil-based methods. However, there is a bit of a problem with web-based research, at least in the perception of some of us, namely that if you do web-based research, and particularly if you do it exclusively,

    • 01:34

      BEN HILBIG [continued]: you can still encounter quite a bit of skepticism from reviewers and editors. Now this has declined a bit, but there's one particular area where it hasn't really declined much, and that's when we're looking at response latencies. And these are just some citations here of people who've basically said the same thing, namely that when you want to measure reaction times,

    • 01:55

      BEN HILBIG [continued]: and you're interested in relatively small reaction time effects of a few hundred milliseconds, then if you do it on the web, you're bound to get some very basic criticism from some reviewers or editors saying that you can't really do this. Or to put it in a different way, there is this preconception that the technology is not really ready for reaction time

    • 02:17

      BEN HILBIG [continued]: measurement. Now I think there are good arguments for this, and there are two main arguments that I think underlie this kind of skepticism. One is-- and this is indubitable-- that there is an increase in technical and situational variance if you move from the lab to the web. So people are using different computers, displays, input

    • 02:38

      BEN HILBIG [continued]: devices, and you have no control over certain aspects of lighting conditions, viewing position, time of day, what else. So I think this argument is, per se, true, that you have this increase in technical and situational variance. And the other one is also generally true, namely that some of the technologies, or software solutions that you might use,

    • 02:59

      BEN HILBIG [continued]: come with certain problems. Certain technologies you might want to use are not generally available, so they require your potential participants to have certain plug-ins or certain software installed. Not all of them have that, and, more importantly, there might be confounds between who has a certain plug-in installed and certain person

    • 03:20

      BEN HILBIG [continued]: characteristics. Also, some of these technologies have been shown to produce slightly inaccurate timing, although you can take countermeasures. At the same time, however, we do need to acknowledge that if you look at reaction time effects and studies that have tried this on the web, many of the classical reaction time effects we're interested in in cognitive psychology

    • 03:42

      BEN HILBIG [continued]: have been found on the web. So this is just a very small selection. There's several more, so basically, many of these kinds of effects, you can get them on the web. And also, some of the technologies you might use, especially JavaScript, are, in fact, widely available to practically all participants, and they've been shown to be accurate in terms of how

    • 04:04

      BEN HILBIG [continued]: well they control for timing. So these arguments would tell us, well, should we really be generally skeptical? Now when you put forward these arguments, usually people say, well, this isn't really strong evidence. There's two reasons for this. One is that many of the studies investigating timing accuracy

    • 04:24

      BEN HILBIG [continued]: have been done with non-human response systems, so a mere technological solution looking at how accurate the timing is. And obviously, although they do tell us something about the technology as such, they don't tell us anything about the variation we will find with human response data. Or as de Leeuw and Motz put it recently,

    • 04:45

      BEN HILBIG [continued]: what we're really interested in is whether, when we're doing a real experiment using real participants and looking at real behavior, does the web produce more variance to the effect that whatever we're interested in, it will be, for example, less easy to find? The other thing is that if you just replicate well-established reaction time effects

    • 05:07

      BEN HILBIG [continued]: on the web, it's really difficult to draw conclusions about the causal or specific effect of technical and situational variation. And the reason is that what we're doing here is what we tell every undergraduate she or he should never do-- namely, it's a cross-experimental comparison. We're making comparisons across different samples. We have no way of statistically knowing whether it actually

    • 05:30

      BEN HILBIG [continued]: makes a difference whether we're doing this in the lab or the web, or to put it differently, as [INAUDIBLE] and colleagues have said, that really if we don't find a difference between lab and web, this doesn't really tell us much unless we really randomly assign participants to the lab versus the web doing the same tasks. So we can safely assume that what we're looking at

    • 05:51

      BEN HILBIG [continued]: is the all-else-being-equal thing. So what we really need-- it's pretty simple-- is we need true experiments. We need experiments with human response data and with random assignment to lab versus web, if we want to really know, does it make a difference? Now surprisingly, there's very few true experiments out there, so really, studies that have randomly assigned participants

    • 06:15

      BEN HILBIG [continued]: to lab versus web and looked at whether some effect of interest can be found equivalently. One recent exception is an important paper by de Leeuw and Motz, who did a visual search experiment with random assignment. It was a within-subjects design, and what they compared was, on the one hand, standard software typically

    • 06:36

      BEN HILBIG [continued]: being used in the lab, the Psychophysics Toolbox, running in the lab. And on the other hand, a JavaScript/HTML-based experimental version of the same task, running in the browser, also in the lab. So both of their conditions were within the lab, but comparing these two types of software. So by doing so, they can inform us about,

    • 06:58

      BEN HILBIG [continued]: is there any effect of the technology or software that we're using? And what they basically find is no. There's no strong difference between the two, which tells us that generally speaking, using JavaScript for reaction time measurement is not, per se, problematic. You get the same sort of effects in typical human response data that you would get with standard experimental software running

    • 07:19

      BEN HILBIG [continued]: in a lab. But I think in this setup, there's one thing missing, and it's also an important step, namely what happens if we take this type of technology and actually go to the web, because what we also want to know about is, is there an effect of the increase in technical and situational variance that I talked about? So this is basically what we don't know yet,

    • 07:41

      BEN HILBIG [continued]: and this is what we set out to do in our own experiment. In addition, we figured it might be a good idea to use a different task and paradigm, so not to stick with visual search, but do something else. And also, to use a different software for comparison over here so that we're not limiting ourselves to the comparison with the Psychophysics Toolbox. This is basically what we did, so we randomly

    • 08:03

      BEN HILBIG [continued]: assigned our participants to one of three conditions. One was a classical lab-based condition-- in our case, running the E-Prime software, which is pretty widely used, I guess. The second, in parallel to de Leeuw and Motz, we had the same task running JavaScript/HTML-based in a browser but in the lab, and then third, the same experiment

    • 08:26

      BEN HILBIG [continued]: as used here, but running on the web. So we randomly assigned participants to this, and the task we used was a very classical one, namely the lexical decision task. So people complete 140 trials in which they get a string of letters, and all they're required to say is whether this is a word or not. So it's a two-alternative forced choice. And we used nouns, half of which were

    • 08:48

      BEN HILBIG [continued]: of high versus low frequency in the natural language. And, of course, the effect of interest that we want to look at here is a classical within-subject effect, namely the word frequency effect. It's been well known for many years that high frequency words, words that are more frequent in the natural language, are detected faster to be words than low frequency words.
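(To make the earlier technical-variance argument concrete, here is a toy simulation. All parameters are invented, not taken from the talk: a within-subject effect of roughly 150 ms is measured once with no added timing noise and once with extra uniform jitter standing in for web-based technical variance. Because the jitter hits both conditions alike, the within-subject difference survives.)

```python
import random
import statistics

# Toy simulation (all parameters invented, not taken from the talk):
# a within-subject RT effect of ~150 ms, measured with and without
# extra uniform timing jitter standing in for web-based variance.
random.seed(42)

N_TRIALS = 5000      # trials per frequency condition (assumed)
BASE_LOW = 850.0     # mean RT for low-frequency words, ms (assumed)
EFFECT = 150.0       # assumed word frequency effect, ms
NOISE_SD = 100.0     # trial-to-trial noise, ms (assumed)

def median_rts(jitter_max):
    """Median RTs (high-frequency, low-frequency) with added jitter."""
    high = [random.gauss(BASE_LOW - EFFECT, NOISE_SD) + random.uniform(0, jitter_max)
            for _ in range(N_TRIALS)]
    low = [random.gauss(BASE_LOW, NOISE_SD) + random.uniform(0, jitter_max)
           for _ in range(N_TRIALS)]
    return statistics.median(high), statistics.median(low)

lab_high, lab_low = median_rts(jitter_max=0)    # no extra jitter ("lab")
web_high, web_low = median_rts(jitter_max=50)   # up to 50 ms jitter ("web")

lab_effect = lab_low - lab_high
web_effect = web_low - web_high
print(f"lab effect: {lab_effect:.1f} ms, web effect: {web_effect:.1f} ms")
```

The jitter shifts both conditions by about the same amount and mostly cancels out of the difference; it inflates variance rather than biasing the effect.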

    • 09:10

      BEN HILBIG [continued]: And this is a within-subject effect, but it's typically around 150 to 200 milliseconds. And it's well established, and because it's a within-subject effect, this gives us the enormous advantage of a lot of statistical power in this experiment. So I'm going to jump right to the results. Concerning the task itself, most of the measures we took

    • 09:31

      BEN HILBIG [continued]: look very normal. Accuracy was generally high, because this is an easy task. And the latencies were also very typical, somewhere between 700 and 1,000 milliseconds. But that's not really what we're interested in. What we really want to know is, what about the word frequency effect, as this is the effect of interest here, which I'm going to display as the individual median reaction

    • 09:51

      BEN HILBIG [continued]: time of high frequency versus low frequency words. And more importantly, we want to know whether this effect looks different depending on the experimental conditions. So depending on whether we're doing it in the lab using E-Prime, in the lab using a browser, or on the web using a browser. First off, just sort of to give you a feeling of the data, this is the effect, so this is the word frequency effect

    • 10:13

      BEN HILBIG [continued]: that is, the absolute difference between high versus low frequency words in milliseconds. This is the confidence interval of this difference, and as you can see, they're all substantial. Of course, they're all significant, and the effect size is pretty large. But from this, I think it's difficult to say whether there really is a difference between these conditions or not.
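(The effect measure just described, each participant's median RT for high- versus low-frequency words and a confidence interval on the difference, can be sketched as follows. The RT values and the normal-approximation interval are made up for illustration, not taken from the study.)

```python
import statistics
from statistics import NormalDist

# Sketch of the effect measure: per-participant word frequency effect
# as the difference between median RTs, plus a normal-approximation
# 95% CI. RT values (ms) below are invented for illustration.
rts = {
    "p1": {"high": [610, 655, 590, 700], "low": [780, 820, 760, 905]},
    "p2": {"high": [720, 690, 745, 705], "low": [880, 850, 910, 835]},
    "p3": {"high": [655, 640, 710, 665], "low": [800, 845, 790, 815]},
}

effects = []
for conds in rts.values():
    med_high = statistics.median(conds["high"])
    med_low = statistics.median(conds["low"])
    effects.append(med_low - med_high)   # positive = high-frequency faster

mean_effect = statistics.mean(effects)
se = statistics.stdev(effects) / len(effects) ** 0.5
z = NormalDist().inv_cdf(0.975)          # large-sample approximation
ci = (mean_effect - z * se, mean_effect + z * se)
print(f"mean effect {mean_effect:.1f} ms, 95% CI [{ci[0]:.1f}, {ci[1]:.1f}]")
```

With a real sample one would use a t-based or bootstrap interval; the normal quantile here just keeps the sketch in the standard library.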

    • 10:35

      BEN HILBIG [continued]: The effect sizes do differ a bit, but they're all very large. So, of course, the right way to look at this-- this is just the raw data again-- is in a mixed analysis of variance kind of setup. And what you get is basically the following. You get a very strong word frequency effect, so generally, high frequency words are detected faster as words than low frequency words.
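(A useful way to see the key test in this 2-within by 3-between mixed design: the condition-by-frequency interaction is equivalent to a one-way between-groups F test on per-participant difference scores. A minimal sketch, with invented difference scores rather than the study's data:)

```python
import statistics

# In a 2 (within) x 3 (between) mixed design, the condition-by-frequency
# interaction reduces to a one-way between-groups F test on each
# participant's difference score (low-frequency RT minus high-frequency
# RT, in ms). The scores below are invented for illustration.
groups = {
    "lab_eprime":  [148, 162, 155, 171, 140, 158],
    "lab_browser": [150, 165, 172, 159, 168, 161],
    "web_browser": [155, 170, 149, 166, 174, 160],
}

def one_way_f(groups):
    """F statistic for a one-way between-groups ANOVA."""
    all_scores = [x for g in groups.values() for x in g]
    grand = statistics.mean(all_scores)
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2
                     for g in groups.values())
    ss_within = sum((x - statistics.mean(g)) ** 2
                    for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

f_stat = one_way_f(groups)
print(f"interaction F = {f_stat:.2f}")
```

An F near 1, as with these made-up numbers, is what a null interaction looks like: group means of the difference scores vary no more than chance would predict.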

    • 10:57

      BEN HILBIG [continued]: That, I think we knew before. There's no main effect of condition, so this doesn't make a difference. And most importantly, for our question, there's no interaction between the experimental condition and the word frequency effect, and note that the power to detect this difference was very high. Still, if you are a bit more, or, let's say,

    • 11:18

      BEN HILBIG [continued]: religious and prefer to look at this in a Bayesian way, here are all the Bayes factors for these three effects. So there's a very large Bayes factor for the word frequency effect, which we would interpret as definitive. No main effect of condition, or actually substantial evidence against a main effect of the condition, and most importantly, substantial evidence

    • 11:41

      BEN HILBIG [continued]: against any interaction here. Note that in this case, this is comparing a model with the main effects and interaction versus a model with the main effects only. So what can we conclude from this type of data? I think, first of all, what we see here is that the word frequency effect, the classical effect,

    • 12:01

      BEN HILBIG [continued]: is highly comparable across the web and the lab. It's a within effect that is strong in all of our between conditions, and most importantly, we don't find any interaction. So in essence, it doesn't really make a difference for the word frequency effect whether we're looking at it in the lab using standard experimental software,

    • 12:25

      BEN HILBIG [continued]: in the lab using a browser, or on the web using a browser. In Bayesian terms, you could say, actually, there's substantial evidence against any such interaction. If anything, if you look a bit more closely at the data, you will see that there is a small trend towards finding a larger word frequency effect, or stronger effect, in the browser-based conditions, so those

    • 12:46

      BEN HILBIG [continued]: running HTML and JavaScript as compared to E-Prime. I was a bit surprised by this, but it's not a statistically reliable effect. I think that most importantly and most generally, we can say that based on this kind of experimental data, it's a bit difficult to argue that,

    • 13:07

      BEN HILBIG [continued]: sort of, to uphold skepticism about web-based reaction time measurement in general. Of course, there might be specific conditions under which this is not a good idea, and I think that at some point, when you go down to effects that are like 20 milliseconds in size, you might run into trouble. But the word frequency effect is around 150, 200 milliseconds. That's already depending on what kind of research you do.

    • 13:30

      BEN HILBIG [continued]: That's not very long latencies. And it's not a problem doing this on the web using JavaScript and HTML. So I would hope that in the future, an increasing number of reviewers and editors would not react with this kind of a priori general skepticism about web-based reaction time measurement. That's all I'm saying. I'm not trying to advocate that we

    • 13:51

      BEN HILBIG [continued]: should do all of our cognitive research on the web instead of in the lab. And there's many good reasons for the lab, especially the type of control we have there. But approaching this type of data with skepticism, per se, I think, is really not a good idea, and I think the data of this type really showed that. Well, thank you very much for your attention.

    • 14:13

      BEN HILBIG [continued]: I'm happy to take questions. [APPLAUSE]

After facing skepticism about the reliability of his web-based research, Professor Benjamin Hilbig set out to compare reaction times in lab-based and web-based trials. He found no evidence of an effect caused by experimental condition.

SAGE Video Forum