4REAL Workshop 2018

Workshop on
Replicability and Reproducibility of Research Results
in Science and Technology of Language

12 May 2018
Miyazaki, Japan


Collocated with LREC2018
11th Language Resources and Evaluation Conference


Important dates:

10 January 2018: Authors of submissions to LREC main conference notified
25 January 2018 (extended from 15 January): Deadline for submissions to 4REAL workshop
9 February 2018: Notification of authors of submissions to 4REAL workshop
15 February 2018: Deadline for early-bird registration to both LREC main conference and workshops
1 March 2018: Deadline for camera-ready
9-11 May 2018: LREC main conference (Wednesday to Friday)
12 May 2018: 4REAL workshop (Saturday)



Branco, António, Nicoletta Calzolari and Khalid Choukri (eds.), 2018, Proceedings of the 4REAL 2018 – Workshop on Replicability and Reproducibility of Research Results in Science and Technology of Language, Paris, European Language Resources Association, ISBN: 979-10-95546-21-4, EAN: 9791095546214.



09:15 – 09:30 – Welcome address

09:30 – 10:30 – Session 1

Machine Learning Approach to Bilingual Terminology Alignment: Reimplementation and adaptation
Andraž Repar, Matej Martinc and Senja Pollak

Examining a Hate Speech Corpus for Hate Speech Detection and Popularity Prediction
Filip Klubička and Raquel Fernández

10:30 – 11:00 – Coffee break

11:00 – 12:00 – Session 2

Annotation Contributions: Sharing derived research data
Steve Cassidy and Dominique Estival

Replicating Speech Rate Convergence Experiments on the Switchboard Corpus
Simone Fuscone, Benoit Favre and Laurent Prévot

12:00 – 13:00 – Invited talk

Validation of Results in Linguistic Science and Technology: Terminology, problems, and solutions
Mark Liberman

13:00 – Farewell


Invited talk – abstract:

Validation of Results in Linguistic Science and Technology: Terminology, problems, and solutions
Mark Liberman

Everyone agrees that there’s a problem: very often, results and conclusions in experimental science and some areas of engineering turn out to be unreliable or false. And everyone agrees that the solution is to put more effort into verifying such results and conclusions, by having other people re-do aspects of the research and analysis.

There can be many reasons for unreliability: outright fraud, programming errors, “p-hacking”, the “file drawer effect”, or unrecognized co-variates in complex situations. And there are many types of solutions, from checking or re-writing the scripts used to analyze the original data, to trying new analysis techniques on the original data, to redoing human or machine coding of raw specimens or recordings, to collecting new datasets using the original techniques, to collecting new relevant data in new ways or in new contexts.

Unfortunately, the terminology in this area is a mess. The two ends of this spectrum of “doing over” are commonly described using the terms replicate/replication/replicability vs. reproduce/reproduction/reproducibility — but different groups use these terms in diametrically opposite ways. This makes discussion of the issues confused and confusing, in a situation where we need to be precise about diagnosing possible problems and prescribing best practices for different types of research and in different subdisciplines.

With respect to the “reproducibility crisis”, under whatever name, corpus-based speech and language analysis is decades ahead of psychology, biology, and medicine. Everyone agrees that researchers should make various types of validation easier by publishing their data and methods, and by using well-defined evaluation techniques that are resistant to over-fitting — and we’ve (mostly) been doing this for 30 years! But there’s still room for improvement. In this talk, I’ll try to clarify the terminology, assess the remaining problems in our field, and suggest directions for improvement.


Call for papers:

Reproduction and replication of research results are at the heart of the validation of scientific knowledge and of the scientific endeavor. Reproduction of results entails arriving at the same overall conclusions: that is, to appropriately validate a set of results, scientists should strive to reproduce the same answer to a given research question by different means, e.g. by reimplementing an algorithm or by evaluating it on a new dataset. Replication has a more limited aim, typically involving running the exact same solution or approach under the same conditions in order to arrive at the same output result.

Despite their key importance, reproduction and replication have not been sufficiently encouraged given the prevailing procedures and priorities for the reviewing, selection and publication of research results.

The immediate motivation for the increased interest in reproducibility and replicability can be found in a number of factors: the realization that the replication of some published results cannot be obtained (e.g. Prinz et al., 2011; Begley and Ellis, 2012); the indication that there may be problems with commonly accepted reviewing procedures, whereby deliberately falsified submissions, with fabricated errors and fake authors, get accepted even in respectable journals (e.g. Bohannon, 2013); the finding that the rate of misconduct among researchers, as revealed in surveys of scientists on questionable practices, is higher than one might expect or be ready to accept (e.g. Fanelli, 2009); among several others.

This workshop seeks to foster discussion and progress on a topic that has received insufficient attention in the research area of language processing tools and resources, but that has emerged as an important topic in other scientific areas, continuing the objectives of the first edition of the 4REAL workshop at LREC 2016. We thus invite submissions of articles that present cases, with either positive or negative results, of actual replication or reproduction exercises of previously published results in our area.

Specific topics within the scope of the call include, but are not limited to:

– Reproduction and replication of previously published systems, resources, results, etc.

– Analysis of reproducibility challenges in system-oriented and user-oriented evaluation.

– Reproducibility challenges on private or proprietary data.

– Reproducibility challenges on ephemeral data, like streaming data, tweets, etc.

– Reproducibility challenges on online experiments.

– Reproducibility in evaluation campaigns.

– Evaluation infrastructures and Evaluation as a Service (EaaS).

– Experiments on data management, data curation, and data quality.

– Reproducible experimental workflows: tools and experiences.

– Data citation: citing experimental data, dynamic data sets, samples, etc.

We are also interested in articles discussing the challenges, risk factors, and appropriate procedures that are specific to our area or that should be adopted, or adapted, from neighboring areas, including methodologies for monitoring, maintaining, or improving the citation of language resources and tools, and for assessing the importance of data citation for research integrity. This also includes, of course, the new risks raised by replication articles themselves and by their own integrity, in view of preserving the reputation of the colleagues and works whose results are reported as having been replicated.


“Replicability Focus” in Language Resources and Evaluation Journal:

Authors of the best papers will be invited to submit an extended version to the Language Resources and Evaluation Journal under its new “Replicability Focus”.


“Identify, Describe and Share your LRs!” initiative:

Describing your Language Resources (LRs) in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), when submitting a paper, authors will have the possibility to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.

As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2018 endorses the need to uniquely Identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.



Please submit your papers, duly anonymized, at https://www.softconf.com/lrec2018/4REAL. Accepted papers will be presented as oral presentations or as posters. All accepted papers will be published in the workshop proceedings.

Papers should be formatted according to the stylesheet to be provided on the LREC 2018 website and should not exceed 8 pages, including references and appendices. Papers should be submitted in PDF format through the URL mentioned above.


Organization committee:

António Branco (University of Lisbon)
Nicoletta Calzolari (ILC)
Khalid Choukri (ELRA)


Program committee (to be confirmed):

Aljoscha Burchardt (DFKI)
António Branco (University of Lisbon)
German Rigau (University of the Basque Country)
Gertjan van Noord (University of Groningen)
Joseph Mariani (CNRS/LIMSI)
Kevin Cohen (University of Colorado)
Khalid Choukri (ELRA)
Maria Gavrilidou (ILSP)
Marko Grobelnik (Jozef Stefan Institute)
Marko Tadić (University of Zagreb)
Nancy Ide (Vassar College)
Nicoletta Calzolari (ILC)
Patrick Paroubek (CNRS/LIMSI)
Piek Vossen (VU University Amsterdam)
Senja Pollak (Jozef Stefan Institute)
Simon Krek (Jozef Stefan Institute)
Stelios Piperidis (ILSP)
Thierry Declerck (DFKI)
Yohei Murakami (Language Grid Japan)


Proceedings of the Workshop 4REAL 2016:

Branco, António, Nicoletta Calzolari and Khalid Choukri (eds.), 2016, Proceedings of the Workshop on Research Results Reproducibility and Resources Citation in Science and Technology of Language.


Website of the Workshop 4REAL 2016:



References:

Begley, C. Glenn and Lee M. Ellis, 2012, Drug development: Raise standards for preclinical cancer research, Nature.
Bohannon, John, 2013, Who’s Afraid of Peer Review?, Science.
Fanelli, Daniele, 2009, How Many Scientists Fabricate and Falsify Research?, PLoS ONE.
Prinz, Florian, et al., 2011, Believe it or not: how much can we rely on published data on potential drug targets?, Nature Reviews Drug Discovery 10, 712.