Knowledge Integration for Breast Cancer Characterization*

*This work is partially supported by IBM Research through the AI Horizons Network.

With the rapid advancements in cancer research, the information that is useful for characterizing disease, staging tumors, and creating treatment and survivorship plans has been changing at a pace that creates challenges when practicing oncologists and physicians try to remain current. One example of this involves increasing usage of biomarkers when characterizing the pathologic prognostic stage of a breast tumor. We present our semantic technology approach to support cancer characterization and demonstrate it in our end-to-end prototype system that collects the newest breast cancer staging criteria from authoritative oncology manuals to construct an ontology for breast cancer. Using a tool we developed that utilizes this ontology, physicians can quickly stage a new patient to support identifying risks, treatment options, and monitoring plans based on authoritative and best practice guidelines. Physicians can also re-stage an existing patient, allowing them to find patients whose stage has changed in a given patient cohort. As new guidelines emerge, using our proposed mechanism, which is grounded by semantic technologies for ingesting new data from staging manuals, we have created an enriched cancer staging ontology that integrates relevant data from several sources with very little human intervention.

Demo

Setup

VirtualBox Install Method: Bringing up a Whyis VM

To host our application on the Whyis framework, we need a machine with Whyis installed. You can install Whyis directly on your host machine if it runs Ubuntu, but the preferred method is to bring up a VM. To bring up the VM, you need VirtualBox and Vagrant installed (the commands below use vagrant to create the machine).
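Before continuing, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch, not part of Whyis; the function names have_cmd and missing_cmds are hypothetical helpers introduced here for illustration.

```shell
# Return 0 if the named command is found on PATH, non-zero otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print a "missing: NAME" line for every argument that is not on PATH.
# Returns 0 only when every command was found.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! have_cmd "$cmd"; then
            echo "missing: $cmd"
            rc=1
        fi
    done
    return "$rc"
}
```

For the VM method, a call such as `missing_cmds vagrant VBoxManage curl` would report which of the three tools still needs to be installed (VBoxManage is the command-line tool that ships with VirtualBox).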

With VirtualBox installed, we can proceed to installing Whyis. First, create a whyis-vm directory within the home directory of your system. On a Mac or Ubuntu machine, you can do that with:
mkdir whyis-vm
cd whyis-vm

Once we are inside the whyis-vm directory (it serves as a folder shared between the host machine and the VM), we need to download the Vagrantfile and the install script used to launch a Whyis machine. They can be downloaded with the commands below. Note that the raw file URLs are used; the github.com/.../blob/... URLs return an HTML page rather than the file itself.
curl -skL https://raw.githubusercontent.com/tetherless-world/whyis/release/Vagrantfile > Vagrantfile
curl -skL https://raw.githubusercontent.com/tetherless-world/whyis/release/install.sh > install.sh
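A common failure at this step is saving a GitHub web page instead of the file itself, which happens when a page URL rather than a raw file URL is fetched. The sketch below is a hedged sanity check under that assumption; looks_like_html is a hypothetical helper name, and the 20-line window is an arbitrary choice.

```shell
# Return 0 if an HTML doctype or <html> tag appears within the
# first 20 lines of the file, which suggests a web page was saved
# instead of the raw Vagrantfile or install script.
looks_like_html() {
    head -n 20 "$1" | grep -qiE '<!doctype html|<html'
}
```

After the downloads, running `looks_like_html Vagrantfile && echo "Vagrantfile is an HTML page, re-download it"` (and the same for install.sh) flags a bad download before vagrant up fails on it.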

To launch a machine:
vagrant up

Once the machine is launched, we can connect to it and activate the virtual environment venv:
vagrant ssh
sudo su - whyis
cd /apps/whyis
source venv/bin/activate

Once these steps are done, we can start the Whyis server by running:
python manage.py runserver -h 0.0.0.0

The Whyis landing page can be accessed at:
http://192.168.33.36:5000
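The server can take a moment to start answering requests. From a second terminal on the host, a small retry loop can wait for the landing page. This is a sketch only; wait_until is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
# Run a probe command repeatedly until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the probe command and its arguments
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_until 30 2 curl -sf http://192.168.33.36:5000/` polls the landing page every two seconds for up to 30 attempts.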

Let’s go ahead and register now, using the URL below:
http://192.168.33.36:5000/register

For further information on installation, please see: http://tetherless-world.github.io/whyis/install

Alternative method to install Whyis on a Linux box

To host our application on the Whyis framework, we need a machine with Whyis installed. Instead of using a VM, you can install Whyis directly on your host machine if it runs Ubuntu.

Once we have an Ubuntu box, we can proceed to installing Whyis. First, create a whyis-vm directory within the home directory of your system:
mkdir whyis-vm
cd whyis-vm

Once we are inside the whyis-vm directory, we need to download the install script. It can be downloaded with the command below. Note that the raw file URL is used; the github.com/.../blob/... URL returns an HTML page rather than the script itself.
curl -skL https://raw.githubusercontent.com/tetherless-world/whyis/release/install.sh > install.sh

Once the install script is downloaded, we can install Whyis and activate the virtual environment venv:
sh install.sh
sudo su - whyis
cd /apps/whyis
source venv/bin/activate

Once these steps are done, we can start the Whyis server by running:
python manage.py runserver -h 0.0.0.0

The Whyis landing page can be accessed at:
http://0.0.0.0:5000

Let’s go ahead and register now, using the URL below:
http://0.0.0.0:5000/register

For further information on installation, please see: http://tetherless-world.github.io/whyis/install

Please note: if you use the native installation method, the URL to access is http://0.0.0.0:5000 wherever the VM instructions use http://192.168.33.36:5000.
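Because the two install methods differ only in the server address, a small helper can keep later commands independent of the method. This is an illustrative sketch; the function name whyis_base_url and the labels vm and native are hypothetical, while the two addresses are the ones given on this page.

```shell
# Print the base URL of the Whyis server for a given install method.
#   vm     -> address of the Vagrant/VirtualBox machine
#   native -> address used when Whyis runs directly on the Ubuntu host
# Any other label prints an error on stderr and returns 1.
whyis_base_url() {
    case "$1" in
        vm)     echo "http://192.168.33.36:5000" ;;
        native) echo "http://0.0.0.0:5000" ;;
        *)      echo "unknown install method: $1" >&2; return 1 ;;
    esac
}
```

With this in place, `echo "$(whyis_base_url native)/register"` prints the registration URL for the native installation.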

Installing the heals2vis application

Now that we have a working Whyis instance, we can install the heals2vis application. It can be downloaded by running:
cd /apps
git clone https://github.com/cancer-staging-ontology/cancer-staging-ontology.github.io

Once the repository is downloaded, we need to install the application. We only need the heals2vis directory from the repository, so we move it up one level and hand ownership to the whyis user:
sudo mv cancer-staging-ontology.github.io/heals2vis .
sudo chown -R whyis:whyis heals2vis/

Become the whyis user again to be able to install the heals2vis app:
sudo su - whyis
cp heals2vis/autonomic.py whyis/

Change directory to the heals2vis directory and run the command below:
cd heals2vis && pip install -e .

Exit the whyis user mode to restart services. We need to restart the web server and the queueing scheduler for the installation to take effect:
sudo service apache2 restart
sudo service celeryd restart

Now we can restart the server by switching back to the whyis user and changing directory to /apps/whyis:
sudo su - whyis
cd /apps/whyis
python manage.py runserver -h 0.0.0.0

We should now see the heals2vis landing page at http://192.168.33.36:5000.

Alternative URL (for the Ubuntu installation): we should see the heals2vis landing page at http://0.0.0.0:5000.

Loading the data into Whyis’s Blazegraph instance

We need to load the SEER patient records, the CIViC drug dataset, and the Cancer Staging Ontology so that the Whyis physicianView can load and derive knowledge by inferencing. The commands below load this knowledge into the Blazegraph instance.

Continuing from where we left off, first stop the running server with Ctrl+C, then proceed with the data load:
python manage.py load -i /apps/heals2vis/data/viz.ttl -f turtle
python manage.py load -i /apps/heals2vis/data/cancer_staging_terms.owl.ttl -f turtle
python manage.py load -i /apps/heals2vis/data/civic-out.txt -f trig
python manage.py load -i /apps/heals2vis/data/seer-out-sample.txt -f trig
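The four load commands above can also be driven from a small manifest, which makes it easier to preview or repeat the load. This is a sketch under stated assumptions: data_manifest, load_all, and the DATA_DIR and DRY_RUN variables are hypothetical names introduced here, and only the file names, formats, and the manage.py load invocation come from this page. It is meant to be run from /apps/whyis with the venv active.

```shell
# Directory holding the heals2vis data files (path taken from the commands above).
DATA_DIR=${DATA_DIR:-/apps/heals2vis/data}

# Print one "file format" pair per line, in the order the page loads them.
data_manifest() {
    cat <<'EOF'
viz.ttl turtle
cancer_staging_terms.owl.ttl turtle
civic-out.txt trig
seer-out-sample.txt trig
EOF
}

# Load every file in the manifest, stopping at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed.
load_all() {
    data_manifest | while read -r file format; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python manage.py load -i $DATA_DIR/$file -f $format"
        else
            # stdin is redirected so the loader cannot consume the manifest.
            python manage.py load -i "$DATA_DIR/$file" -f "$format" </dev/null || exit 1
        fi
    done
}
```

Running `DRY_RUN=1 load_all` prints the four commands for review; running `load_all` on its own executes them in order.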

Now that the data is loaded, we need to pre-run the inferencer on these records. It can be run with the command below:
python manage.py test_agent -a heals2vis.inferencer.Infer

Accessing the view

Once these steps are done, we can restart the Whyis server by running:
python manage.py runserver -h 0.0.0.0

Now that the heals2vis application is installed and the data is loaded, we can explore the interactive physician view at http://192.168.33.36:5000/physicianView. Make sure to open this URL twice; the first request returns an error due to a known bug.

Alternative URL (Ubuntu method): we can explore the interactive physician view at http://0.0.0.0:5000/physicianView. Make sure to open this URL twice; the first request returns an error due to a known bug.

Publications

  • Knowledge Integration for Disease Characterization: A Breast Cancer Example, International Semantic Web Conference 2018 Resource Track Paper (pre-print)
  • Ontology-enabled Breast Cancer Characterization, International Semantic Web Conference 2018 Demo Paper. (pre-print)
  • Knowledge Representation and Reasoning for Breast Cancer, American Medical Informatics Association 2018 Knowledge Representation and Semantics Working Group Pre-Symposium Extended Abstract (submitted)

Contact

If you have any questions about this work, please contact Cancer Staging Ontology Developers.