Goals
AD/AE Process
Criteria for Artifact Description (AD)
Criteria for Artifact Evaluation (AE)
Organizing Your Research Object
Badging Process FAQ
- Is participation in the badging process mandatory?
- What is the set of badge labels that a paper can apply for?
- Will SC host the artifacts of reproduced papers?
Goals
Overview
The goal of this committee is to encourage and promote reproducible research within the SC community. To that end, we aim to assist SC authors in providing the documentation that describes their artifacts and to evaluate those artifacts so that we can assign badges to them.
Why Should You Participate?
You will make it easier for other researchers to compare against, adopt, and extend your work. This translates directly into more recognition, visible through the badges attached to your paper, and into higher impact. As described in this SC20 survey, thirty-five percent (35%) of respondents have used the appendix information from papers in the SC conference proceedings in their own research. There will be a general announcement of all accepted papers that were reproduced and badged.
What Is An Artifact/Research Object?
We use the terms artifact and research object as synonyms. A paper consists of several computational artifacts that extend beyond the submitted article itself: software, datasets, environment configuration, mechanized proofs, benchmarks, test suites with scripts, etc. A complete artifact package must contain (1) the computational artifacts and (2) instructions/documentation describing the contents and how to use them. Further details and guidance about artifacts and research objects are provided throughout this page.
AD/AE Process
Overview
The Artifact Description/Artifact Evaluation (AD/AE) process is single-blind, unlike paper submissions, which are reviewed double-blind. Authors need not remove identifying information from artifacts or papers. The committee may provide feedback to authors in a single-blind arrangement. The AD/AE Committee will not share any information with the Program Committee other than to confirm whether artifacts meet the criteria.
Evaluation Timeline
The review process takes place in two phases: in Phase 1, Artifact Descriptions are checked for completeness and accessibility; in Phase 2, Artifact Evaluation is carried out for accepted papers that applied for badges.
Artifact descriptions are mandatory and are submitted in conjunction with the paper. The artifact description provides information about a paper's artifacts. All SC papers must either (1) provide an artifact description or (2) give a reason why an artifact description cannot be provided (see the Artifact Description criteria below).
Based on the author’s application for badges during the AD submission, the AD/AE committee starts to evaluate the artifact. This step relies on cooperation between paper authors and the committee. We will use Linklings, the SC conference submission system, for single-blind messaging between the two parties. Via this communication, the committee may ask for access to special hardware, ask for missing components, provide failure/error messages, and generally keep the author’s posted on their evaluation status.
Important Dates
- April 15, 2022: AD (mandatory)/AE (optional) submissions close (two weeks after the paper submission deadline)
- June 10, 2022: Revised AD submission deadline (Phase 2 of AD/AE evaluation)
- June 15, 2022: Artifact badge evaluation starts for accepted papers
- July 29, 2022: Artifact freeze
- August 19, 2022: Artifact badge decision
How to Submit
Artifacts are submitted via the Artifact Description submission form; the submission includes the application for badges evaluated in Phase 2. Artifact freeze means the artifact must not be changed after that date; if later changes are unavoidable, a tagged version corresponding to the frozen state must be provided.
Criteria for Artifact Description (AD)
Overview
The Artifact Description is a mandatory step for all submitted papers. All authors must provide descriptions of their artifacts. If you are unable to provide an artifact description (e.g., for proprietary reasons), please explain in detail why you cannot do so. Failure to provide a detailed explanation will lead to further questions from the AD/AE Committee. The AD/AE Committee will provide its feedback to the SC Technical Program Committee, and inadequate explanations will count against the overall paper review.
The criterion for the Artifact Description phase is a completed AD/AE Appendix Form reflecting the current state of your artifact. The AD/AE Committee will check the form for completeness and verify that the artifact is accessible via any links included in the form. At this stage, authors can also apply for badges: if you wish to obtain a badge for your artifact, you must select the appropriate badges in the AD/AE form.
Criteria for Artifact Evaluation (AE)
Overview
The goal of Artifact Evaluation is to award badges to artifacts of accepted papers. All badges are based on the NISO Reproducibility Badging and Definitions Standard. In 2022, badges will be assigned per the ACM Reproducibility Standard.
Authors must apply for a badge a priori, during the AD phase. Authors can apply for one or more of the three badges we offer. The badges and the criteria for each are explained next.
Artifacts Available
The following are necessary to receive this badge:
- A DOI assigned to your research object by the Artifact Freeze deadline (07/29/2022). DOIs can be acquired via Zenodo, FigShare, Dryad, or Software Heritage. Zenodo provides an integration with GitHub that automatically generates DOIs from Git tags (see the sketch at the end of this subsection).
- Links to code and data repositories on a hosting platform that supports versioning, such as GitHub or GitLab. In other words, please do NOT provide Dropbox links or gzipped files hosted on personal web pages.
Note that, for physical objects relevant to the research, the metadata about the object should be made available.
What do we mean by accessible? Artifacts used in the research (including data and code) are permanently archived in a public repository that assigns a global identifier and guarantees persistence, and are made available via standard open licenses that maximize artifact availability.
What is a DOI? Check this out.
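For authors who prefer to script the archiving step, the following is a minimal sketch, assuming the Zenodo REST deposition API (documented at developers.zenodo.org); the access token, file name, and metadata are placeholders. The Zenodo web interface or the GitHub release integration achieves the same result without any code.

```python
# Minimal sketch: archive an artifact on Zenodo and obtain a DOI.
# Assumes the Zenodo REST deposition API (https://developers.zenodo.org);
# the token, file name, and metadata below are placeholders.
import json
import requests

ACCESS_TOKEN = "replace-with-your-zenodo-token"
params = {"access_token": ACCESS_TOKEN}

# 1. Create an empty deposition.
r = requests.post("https://zenodo.org/api/deposit/depositions", params=params, json={})
r.raise_for_status()
deposition = r.json()

# 2. Upload the artifact archive into the deposition's file bucket.
with open("artifact.tar.gz", "rb") as fp:
    requests.put(f"{deposition['links']['bucket']}/artifact.tar.gz",
                 data=fp, params=params).raise_for_status()

# 3. Attach minimal metadata and publish; the published record carries the DOI.
metadata = {"metadata": {
    "title": "Artifact for <paper title>",
    "upload_type": "software",
    "description": "Code and data accompanying the paper.",
    "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
}}
requests.put(deposition["links"]["self"], params=params, data=json.dumps(metadata),
             headers={"Content-Type": "application/json"}).raise_for_status()
published = requests.post(deposition["links"]["publish"], params=params)
published.raise_for_status()
print("DOI:", published.json()["doi"])  # record this DOI in your AD/AE appendix
```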
Artifacts Evaluated-Functional
The criteria for the Artifacts Evaluated-Functional badge require an AD/AE Committee member to agree that the artifact provides enough detail to exercise the artifacts or components described in the paper. For example, is it possible to compile the artifact, use a Makefile, or perform a small run? If the artifact runs on a large cluster, can it be compiled on a single machine? Can the analysis be run at a small scale? Does the artifact describe its components well enough to nurture future use of the artifact?
The reviewer will assess the details of the research artifact based on the following criteria:
- Documentation: Are the artifacts sufficiently documented to enable them to be exercised by readers of the paper?
- Completeness: Do the submitted artifacts include all of the key components described in the paper?
- Exercisability: Do the submitted artifacts include the scripts and data needed to run the experiments described in the paper, and can the software be successfully executed?
We encourage authors to describe (i) the workflow underlying the paper; (ii) some of the black boxes, or a white box (e.g., source code, configuration files, build environment); (iii) the input data: either the process used to generate the input data should be made available or, when the data is not generated, the actual data or a link to it should be provided; (iv) the environment (system configuration and initialization, scripts, workload, measurement protocol) used to produce the raw experimental data; and (v) the scripts needed to transform the raw data into the graphs included in the paper (a minimal sketch of such a script follows).
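As an illustration of item (v), a plotting script of roughly the following shape lets evaluators regenerate a figure from the raw measurements shipped with the artifact. The file names, column names, and figure number are hypothetical placeholders.

```python
# Hypothetical example of item (v): turn raw measurements into a paper figure.
# File names, column names, and the figure number are placeholders.
import csv
from collections import defaultdict

import matplotlib.pyplot as plt

# Raw data produced by the experiment scripts (one row per run).
runtimes = defaultdict(list)
with open("results/raw_runtimes.csv", newline="") as f:
    for row in csv.DictReader(f):
        runtimes[row["algorithm"]].append(float(row["seconds"]))

# Aggregate and plot the mean runtime per algorithm, as in "Figure X" of the paper.
algorithms = sorted(runtimes)
means = [sum(runtimes[a]) / len(runtimes[a]) for a in algorithms]
plt.bar(algorithms, means)
plt.ylabel("Mean runtime (s)")
plt.savefig("figures/figure_x.pdf")
```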
Results Reproduced
This badge indicates that the evaluators successfully reproduced the key computational results using the author-created research objects, methods, code, and conditions of analysis. Note that we do not aim to recreate exact or identical results, especially hardware-dependent results. However, we do aim to:
- Reproduce Behavior: This is of particular importance where results are hardware-dependent. Bit-wise reproducibility is not our goal. If we get access to the same hardware used in the experiments, we will aim to reproduce the results on that hardware. If not, we will work with the authors to determine the equivalent or approximate behavior on available hardware. For example, if results concern response time, our objective will be to check whether a given algorithm is significantly faster than another, or whether a given parameter affects the behavior of a system positively or negatively (see the sketch after this list).
- Reproduce the Central Results and Claims of the Paper: We do not aim to reproduce all the results and claims of the paper. The AD/AE Committee will determine the central results of the accepted paper and will work with the authors to confirm them. Once confirmed, the badge will be assigned based on the committee being able to reproduce the behavior of these central results.
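For instance, a check of the "significantly faster" kind described above might look like the following sketch. The runtime files and the 10% margin are hypothetical, and a proper statistical test is preferable when enough samples are available.

```python
# Hypothetical sketch: compare the behavior of two algorithms rather than
# exact numbers. File names and the 10% margin are placeholders.
import json
from statistics import mean

with open("results/algorithm_a_runtimes.json") as f:
    a = json.load(f)  # list of runtimes in seconds
with open("results/algorithm_b_runtimes.json") as f:
    b = json.load(f)

# The paper's claim is behavioral ("A is significantly faster than B"),
# so we look for a clear margin instead of bit-wise identical numbers.
speedup = mean(b) / mean(a)
print(f"Observed speedup of A over B: {speedup:.2f}x")
print("Claim reproduced" if speedup > 1.10 else "Claim not reproduced on this hardware")
```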
Organizing Your Research Object
Overview
A paper consists of several computational artifacts that extend beyond the submitted article itself: software, datasets, environment configuration, mechanized proofs, benchmarks, test suites with scripts, etc. A complete artifact package must contain (1) the artifact itself, and (2) instructions/documentation describing the contents and how to use it.
Choose a version-controlled code or data repository such as Zenodo, FigShare, Dryad, Software Heritage, GitHub, or GitLab.
Your artifact package must include an obvious “README” that describes your artifact and provides a road map for evaluation. The README should contain or point to suitable instructions and documentation, to save committee members the burden of reverse-engineering the authors’ intentions. For example, a tool without a quick tutorial is generally very difficult to use. Similarly, a dataset is useless without some explanation on how to browse the data. For software artifacts, the README should—at a minimum—provide instructions for installing and running the software on relevant inputs. For other types of artifacts, describe your artifact and detail how to “use” it in a meaningful way.
Importantly, make your claims about your artifacts concrete. This is especially important if you think that these claims differ from the expectations set up by your paper. The AEC is still going to evaluate your artifacts relative to your paper, but your explanation can help to set expectations up front, especially in cases that might frustrate the evaluators without prior notice. For example, tell the AEC about difficulties they might encounter in using the artifact, or its maturity relative to the content of the paper.
Packaging Methods
Authors should consider one of the following methods to package the software components of their artifacts (although the AEC is open to other reasonable formats as well):
- Source Code: If your artifact has few dependencies and can be installed easily on several operating systems, you may submit source code and build scripts. However, if your artifact has a long list of dependencies, please use one of the other formats below.
- Virtual Machine/Container: A virtual machine or Docker image containing the software application already set up with the right toolchain and intended runtime environment. For example:
- For raw data, the VM would contain the data and the scripts used to analyze it.
- For a mobile phone application, the VM would have a phone emulator installed.
- For mechanized proofs, the VM would contain the right version of the relevant theorem prover.
We recommend using a format that is easy for AEC members to work with, such as OVF or Docker images. An AWS EC2 instance is also possible.
- Binary Installer: Indicate exactly which platform and other run-time dependencies your artifact requires.
- Live Instance on the Web: Ensure that it is available for the duration of the artifact evaluation process.
- Internet-accessible Hardware: If your artifact requires special hardware (e.g., GPUs or clusters), or if your artifact is actually a piece of hardware, please make sure that AEC members can somehow access the device. VPN-based access to the device might be an option.
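Whichever packaging method you choose, a small self-check script that the README can point evaluators to is often helpful before they attempt a run. The sketch below is a hypothetical example; the module names, Python version, and external tool are placeholders to adapt to your artifact.

```python
# Hypothetical self-check: verify the artifact's runtime dependencies before a small run.
# The module names, Python version, and external tool below are placeholders.
import importlib
import shutil
import sys

REQUIRED_MODULES = ["numpy", "matplotlib"]  # Python packages the artifact imports
REQUIRED_TOOLS = ["make"]                   # external commands used by the build

ok = True
if sys.version_info < (3, 8):
    print("Python >= 3.8 is required")
    ok = False
for mod in REQUIRED_MODULES:
    try:
        importlib.import_module(mod)
    except ImportError:
        print(f"Missing Python package: {mod}")
        ok = False
for tool in REQUIRED_TOOLS:
    if shutil.which(tool) is None:
        print(f"Missing command-line tool: {tool}")
        ok = False

print("Environment looks good" if ok else "Please install the missing dependencies")
sys.exit(0 if ok else 1)
```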
Preparation Sources
There are several sources of good advice about preparing artifacts for evaluation:
- HOWTO for AEC Submitters, by Dan Barowy, Charlie Curtsinger, Emma Tosch, John Vilk, and Emery Berger
- Artifact Evaluation: Tips for Authors, by Rohan Padhye
- SIGOPS articles on award winning artifacts [1] and [2]
- GitHub CSArtifacts Resources
Badging Process FAQ
Is participation in the badging process mandatory?
No. Participation in the badging process is voluntary. Please choose which badges you wish to apply for in the AD/AE Appendices Form.
Artifact Evaluation will only take place for accepted papers that have applied for a badge. The badge will be assigned after the artifact evaluation process is complete.
What is the set of badge labels that a paper can apply for?
- Artifacts Available
- Artifacts Evaluated-Functional
- Results Reproduced
- No badge
Will SC host the artifacts of reproduced papers?
Authors are responsible for hosting their artifacts, whether reproduced or not. We suggest using one of the following platforms to share artifacts with the AD/AE Committee: Zenodo, FigShare, Dryad, or Software Heritage. The SC Reproducibility Initiative does not have a place to permanently host what we reproduce and/or review, so any artifacts badged through this work will not be hosted by SC on a long-term basis.