Installation
Make sure you have Python 3.6 or later installed. Python version 3.7 or later is highly recommended!
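The version requirement above can be checked from within the interpreter itself. The following is a minimal sketch, not part of gatenlp: the function name `python_support_level` and the three labels it returns are made up for this illustration, and the two thresholds are taken directly from the text above.

```python
import sys

MINIMUM = (3, 6)      # oldest version the text above allows
RECOMMENDED = (3, 7)  # version the text above recommends

def python_support_level(version=None):
    """Classify a (major, minor) version as 'unsupported', 'supported' or 'recommended'.

    Hypothetical helper for illustration; uses the running interpreter if no version is given.
    """
    if version is None:
        version = sys.version_info[:2]
    version = tuple(version[:2])
    if version < MINIMUM:
        return "unsupported"
    if version < RECOMMENDED:
        return "supported"
    return "recommended"

if __name__ == "__main__":
    # Report the level for the interpreter that runs this script.
    print(sys.version.split()[0], "->", python_support_level())
```

Running this with the interpreter of the intended environment shows whether that interpreter meets the minimum or the recommended version.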
The recommended way to install Python is to use one of the Conda distributions.
Alternatively, Miniforge may help avoid Windows 10 issues.
Then create an environment for working with gatenlp. This example creates an environment with the name gatenlp and activates it:
conda create -n gatenlp python=3.8
conda activate gatenlp
The gatenlp package has a number of optional dependencies, which are only needed if certain special features of gatenlp are used.
To install gatenlp with the minimal set of dependencies run:
python -m pip install gatenlp
To upgrade an already installed gatenlp package to the latest version run:
python -m pip install -U gatenlp
To install gatenlp with all dependencies run:
python -m pip install gatenlp[all]
To upgrade to the latest version with all dependencies:
python -m pip install -U gatenlp[all]
NOTE: if this fails because of a problem installing torch (this may happen on Windows), first install PyTorch separately according to the PyTorch installation instructions at https://pytorch.org/get-started/locally/ and then run the gatenlp installation again.
The following specific dependencies can be chosen:
* formats: to support the various serialization formats
* java: to support the GATE slave
* stanza: to support the Stanza bridge
* spacy: to support the spaCy bridge
* stanfordnlp: to support the StanfordNLP bridge (may get removed in a later version)
* nltk: to support the NLTK tokenizer and NLTK bridge
* gazetteers: to support gazetteers
* dev: dependencies needed for developing gatenlp
To install gatenlp with support for stanza and spacy and serialization:
python -m pip install gatenlp[stanza,spacy,formats]
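Some shells (zsh, for example) interpret square brackets as glob patterns, so a requirement such as gatenlp[stanza,spacy,formats] may need quoting when typed interactively. One way to sidestep this is to pass the arguments to pip as a list from Python, so that no shell is involved. The sketch below only builds the command. The helper name `build_install_command` is hypothetical and not part of gatenlp, and the extras names are the ones listed above.

```python
import subprocess
import sys

def build_install_command(extras=(), upgrade=False):
    """Build a pip command line (as an argument list) for gatenlp.

    Hypothetical helper for illustration only.
    extras  -- iterable of extras names, e.g. ["stanza", "spacy", "formats"]
    upgrade -- add -U to upgrade an already installed package
    """
    requirement = "gatenlp"
    if extras:
        requirement += "[" + ",".join(extras) + "]"
    # sys.executable makes pip install into the currently active environment.
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("-U")
    command.append(requirement)
    return command

if __name__ == "__main__":
    command = build_install_command(["stanza", "spacy", "formats"])
    print(" ".join(command))
    # Uncomment to actually run the installation:
    # subprocess.run(command, check=True)
```

Because the command is an argument list rather than a single string, the brackets reach pip unchanged regardless of which shell is in use.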
To install the latest gatenlp code from GitHub with all dependencies:
* Clone the repository and change into it
* Run python -m pip install -e .[all]
To also install everything needed for development use the "dev" extras:
python -m pip install -e .[all,dev]
Requirements for using the GATE slave:
- Java 8
- py4j
- GATE 8.6.1 or later
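Two of the requirements above can be probed quickly from Python. The following is a rough sketch using only the standard library: it checks that a `java` executable is on the PATH and that the `py4j` module can be imported. It does not verify the Java version or the GATE installation, and the function name `gate_slave_prerequisites` is made up for this example.

```python
import importlib.util
import shutil

def gate_slave_prerequisites():
    """Return which of the checkable prerequisites appear to be present.

    Hypothetical helper for illustration; it checks presence only:
    - "java": a java executable is found on the PATH
    - "py4j": the py4j module can be located by the import system
    """
    return {
        "java": shutil.which("java") is not None,
        "py4j": importlib.util.find_spec("py4j") is not None,
    }

if __name__ == "__main__":
    for name, found in gate_slave_prerequisites().items():
        print(name, "found" if found else "MISSING")
```

If `java` is reported as found, running java -version on the command line shows whether it is Java 8.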
Requirements for running gatenlp in a Jupyter notebook:
- ipython
- jupyter
- ipykernel
To create a kernel for your conda environment run:
python -m ipykernel install --user --name gatenlp --display-name "Python gatenlp"
The available kernels can be listed with:
jupyter kernelspec list
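The same kernel list is also available from Python through the `jupyter_client` package, which is installed together with Jupyter. The sketch below assumes that package is present and returns `None` otherwise. The wrapper name `list_kernels` is hypothetical, and the check for a kernel named gatenlp assumes the --name value used in the command above.

```python
def list_kernels():
    """Return a dict mapping kernel names to their resource directories.

    Hypothetical wrapper for illustration; returns None if jupyter_client
    is not installed in the current environment.
    """
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return None
    return KernelSpecManager().find_kernel_specs()

if __name__ == "__main__":
    kernels = list_kernels()
    if kernels is None:
        print("jupyter_client is not installed")
    else:
        for name, path in sorted(kernels.items()):
            print(name, path)
        # "gatenlp" is the --name used when the kernel was created.
        print("gatenlp kernel registered:", "gatenlp" in kernels)
```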
To open a notebook, run the following command and use "Kernel - Change Kernel" in the notebook to choose the kernel specific to the gatenlp environment:
jupyter notebook notebookname.ipynb
If you prefer JupyterLab:
python -m pip install jupyterlab
and then start JupyterLab with:
jupyter lab
In JupyterLab, you can work on Jupyter notebooks and also use an interactive console, which can visualize documents interactively as well.
Requirements for development:
- Java SDK version 8
- Maven version 3.6