Obtaining Task 2 Dataset
The dataset will be distributed through the Physionet website. The steps for accessing the ShARe dataset for this year's Task 2 can be found below.
1. Register for CLEF eHealth 2014: http://220.127.116.11:8888/clef2014labs/
2. Obtain a human subjects training certificate. If you do not have a certificate, you can take the CITI training course (https://www.citiprogram.org/Default.asp) or the NIH training course (http://phrp.nihtraining.com/users/login.php)
Note: First-time users need to create an account before they can take the courses. Expect a couple of hours of work to complete the certification. Please save an electronic copy of the certificate; it will be needed in the subsequent steps to obtain the data.
3. Go to the Physionet site: http://physionet.org/mimic2/mimic2_access.shtml
4. Click on the link for “creating a PhysioNetWorks account” (near middle of page) (https://physionet.org/pnw/login) and follow the instructions.
5. Go to this site and accept the terms of the DUA: https://physionet.org/works/MIMICIIClinicalDatabase/access.shtml
You will receive an email telling you to fill in your information on the DUA and email it back with your human subjects training certificate.
Important: Fill out the DUA using the word “ShARe/CLEF” in the description of the project and mail it back (pasted into the email) with your human subjects certificate attached.
General research area for which the data will be used: CLEF (plus perhaps something more descriptive)
6. Once you are approved, the organizers will add you to the PhysioNetWorks ShARe/CLEF eHealth 2014 account as a reviewer. We will send you an email informing you that you can go to the PhysioNetWorks website and click on the authorized users link to access the data (you will be asked to log in with your PhysioNetWorks account): https://physionet.org/works/ShAReCLEFeHealth2014Task2/
Note: If you participated in CLEF eHealth 2013 and obtained permissions, you will skip Steps 2-5 and will be provided access to the 2014 dataset following successful Step 1 registration.
Please note that all individuals working on the data need to individually obtain a human subjects training certificate, apply for a Physionet account, and sign their own DUA on the Physionet site.
To register for the task on the CLEF site, it is sufficient to register only one participant per participating group, but for access to the Task 2 data, each participating individual needs her/his own access permission from Physionet.
(Small) Example data set release: Dec 9 2013
(Full) Training data set release: Jan 10 2014
Test data set release: April 23 2014
Test data set submissions due: May 1 2014
Online working notes (internal review) due: June 3 2014
Online working notes (camera ready for CLEF) due: June 7 2014
Information and Discussion Forum
General information and discussions during the task will be organised through the following Google group:
Reviewing Task 2 Dataset and Annotations
We are providing a GUI tool, the Evaluation Workbench, for calculating outcome measures and for visualizing system annotations against reference standard annotations. Use of the Evaluation Workbench is completely optional. Because the Evaluation Workbench is still under development, we would appreciate your feedback and questions if you choose to use it.
A. Memory issues. When you run the Workbench with all the files, you need to allocate extra heap space, or you will get an "out of memory" error. To do so, open a terminal (or shell) program, go to the directory containing the startup.properties file, and type:
java -Xms512m -Xmx1024m -jar Eval*.jar
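The "Eval*.jar" wildcard in the command above is expanded by the shell. For environments where that expansion is not available, the following Python sketch resolves the jar name itself and builds the same command line. The function name build_workbench_command and its parameters are hypothetical and are not part of the Workbench distribution; the heap sizes default to the values given above.

```python
import glob
import os


def build_workbench_command(directory, min_heap="512m", max_heap="1024m"):
    """Return the java command line that starts the Workbench with extra heap.

    Hypothetical helper: it mirrors 'java -Xms512m -Xmx1024m -jar Eval*.jar'
    but resolves the wildcard in Python instead of relying on the shell.
    """
    # Look for the Workbench jar in the given directory.
    jars = sorted(glob.glob(os.path.join(directory, "Eval*.jar")))
    if len(jars) != 1:
        raise ValueError(
            "expected exactly one Eval*.jar in %s, found %d" % (directory, len(jars))
        )
    # -Xms sets the initial heap size, -Xmx the maximum heap size.
    return ["java", "-Xms" + min_heap, "-Xmx" + max_heap, "-jar", jars[0]]
```

The returned list can be passed to subprocess.call together with cwd set to the same directory, so that the Workbench starts from the directory that holds its startup file.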
B. Startup Properties file and GUI. The Evaluation Workbench relies on a parameter file called "startup.properties". Since the Workbench is a tool for comparing two sets of annotations, the properties refer to the first (or gold standard) and second (or system) annotators. The following properties will need to be set using the Startup properties GUI before selecting “Initialize” to start the Workbench:
WorkbenchDirectory: Full path of the directory where the executable (.jar) file is located. For example, WorkbenchDirectory=/Users/wendyc/Desktop/EvaluationWorkbenchFolderDistribution_2014ShARECLEF
TextInputDirectory: Directory containing the clinical reports (every document is a single text file in the directory).
AnnotationInputDirectoryFirstAnnotator / AnnotationInputDirectorySecondAnnotator: Directories containing the sets of annotations (gold standard annotations are first, system annotations are second). If you do not have system annotations but just want to view the gold standard annotations, point both input directories to the gold standard annotations.
Knowtator Schema File: File containing the Protégé ontology file representing the ShARe schema.
Knowtator Schema File=/Users/wendyc/Desktop/CLEFEvaluationWorkbenchFolderDistribution_2014ShARECLEF/SHARe_Jan18_2012_base.pont
Classification Labels: Labels for classes, attributes, and relations between classes for ShARe schema
Classification Labels= DefaultClassificationProperties
Classification Labels= associatedcode,associatedCode,distal_or_proximal_normalization,negation_indicator_normalization, negation_indicator_normalization,severity_normalization,course_normalization, subject_normalization_CU,Strength number,Strength unit,Strength,Dosage,Frequency number, Frequencyunit,Frequency,Duration,Route,Form,Attributes_medication,disease_disorder, Disease_Disorder,severity,negation_indicator,LABEL,degree_of,subject_class,TIMEX3, uncertainty_indicator_class,subject
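Before selecting “Initialize”, it can help to confirm that the path-valued properties listed above point to locations that actually exist. The following Python sketch does this for a dictionary of settings. The function name find_path_problems is hypothetical, and the check is an assumption for illustration only: it is not something the Workbench performs, and it does not read or write the startup.properties file.

```python
import os

# Property names as listed above. Which ones are directories and which
# are files follows the descriptions in the text.
DIRECTORY_PROPERTIES = [
    "WorkbenchDirectory",
    "TextInputDirectory",
    "AnnotationInputDirectoryFirstAnnotator",
    "AnnotationInputDirectorySecondAnnotator",
]
FILE_PROPERTIES = ["Knowtator Schema File"]


def find_path_problems(settings):
    """Return a list of human-readable problems found in the path settings."""
    problems = []
    for name in DIRECTORY_PROPERTIES:
        value = settings.get(name)
        if not value:
            problems.append(name + " is not set")
        elif not os.path.isdir(value):
            problems.append(name + " is not an existing directory: " + value)
    for name in FILE_PROPERTIES:
        value = settings.get(name)
        if not value:
            problems.append(name + " is not set")
        elif not os.path.isfile(value):
            problems.append(name + " is not an existing file: " + value)
    return problems
```

An empty list means every directory and file was found. To view only the gold standard annotations, both annotator entries can hold the same directory, as described above.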
**Please remember to set pathnames appropriate for your operating system. MacOS / Unix pathnames are in the form "/applications/EvaluationWorkbench/…", whereas Windows paths are in the form "c:\\Program Files\\Evaluation Workbench\\…" (escape characters included). After setting paths appropriately for your computer and operating system, you can activate the Workbench by going to the distribution directory and using the mouse to double-click the EvaluationWorkbench.jar icon.**
Select “Save” once you have set these parameters in the GUI, then “Initialize” to start the Evaluation Workbench.
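The doubled backslashes in the Windows pathname example above are escape characters. As a minimal sketch, assuming the value has to be entered in that escaped form, the following Python function converts an ordinary Windows path. The function name escape_backslashes is hypothetical; MacOS / Unix paths contain no backslashes and pass through unchanged.

```python
def escape_backslashes(path):
    """Double every backslash so that the path matches the escaped form."""
    return path.replace("\\", "\\\\")


# An ordinary Windows path, written as a raw string so that Python itself
# does not interpret the backslashes.
print(escape_backslashes(r"c:\Program Files\Evaluation Workbench"))
```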
C. Short tutorial on the Evaluation Workbench (5-minute video here: http://screencast.com/t/QzaMLwWwFe).
To participate in an electronic dialogue about use of the Workbench, please sign up for the Google group: https://groups.google.com/forum/?fromgroups#!forum/evaluation-workbench