Neural Architectures for Biological Inter-Sentence Relation Extraction
Overview
This website hosts the code and the corpus associated to the paper presented at SDU@AAAI 22
Corpus
The dataset used for this project is an extension of the dataset published by Noriega et al. 2018. The original corpus contains hand curated annotations with text spans for biochemical events and biological context.
We extend this corpus with full text tokenized aligned to the original annotations to be make it compatible with neural network encoder architectures.
The data files can be downlodaded here and the parsing code is locatede here
Code and Instructions
The implementation of the neural architectures to detect biocontext can be found here.
Instructions
To run the code, create the conda environment using the file named conda_environment.yml
Once the environment is created and configured, create a configuration file based on test_conf.conf
. Replace the paths with your local environment’s paths and configure the options and hyper parameters accordingly. The descriptions of the configuration fields can be found in config_README.txt
.
To run the cross validation experiments described in this paper, edit the configuration file and run the following command from within the code
directory.
This will execute a training loop and testing run on fold $FOLD
of cross validation, using the config file $CONF_FILE
, using $NUM_GPUS
, if available. Substitute the variables with your own values.
$ python transformer_methods/train.py --num-gpus $NUM_GPUS --fold $FOLD --conf $CONF_FILE
To retrieve the testing scores (precision, recall and F1), run the command:
$ python transformer_methods/retrieve_cv_test_scores.py -i $EXP_PREFIX
Substitute the variable $EXP_PREFIX
with the directory’a name prefix of the cross validation folds. For example, if you run the six-fold CV and have the following six directories with the tensorboard logs:
runs/experiment_cv_0/
runs/experiment_cv_1/
runs/experiment_cv_2/
runs/experiment_cv_3/
runs/experiment_cv_4/
runs/experiment_cv_5/
Substitute $CONF_FILE
for runs/experiment_cv_
to fetch the CV results.
Citing
If you use this work or the full-text corpus, plese cite us using the following bibtex.
TBD
If you use the annotations without relying on the full-text, please cite the authors using the following bibtex.
@ARTICLE{noriega-atala2020,
author={Noriega-Atala, Enrique and Hein, Paul D. and Thumsi, Shraddha S. and Wong, Zechy and Wang, Xia and Hendryx, Sean M. and Morrison, Clayton T.},
journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics},
title={Extracting Inter-Sentence Relations for Associating Biological Context with Events in Biomedical Texts},
year={2020},
volume={17},
number={6},
pages={1895-1906},
doi={10.1109/TCBB.2019.2904231}}
}