Stanza pipeline

If gatenlp has been installed with the stanza extra (pip install gatenlp[stanza] or pip install gatenlp[all]), you can run a Stanford Stanza pipeline on a document and get the results back as gatenlp annotations.
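
Before importing anything, it can help to confirm that both packages are actually importable in the current environment. The following is a minimal sketch using only the Python standard library; the helper name missing_packages is hypothetical and not part of gatenlp or stanza.

```python
import importlib.util


def missing_packages(names):
    """Return those top-level package names that cannot be found for import."""
    # find_spec returns None when no importable top-level module has that name
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["gatenlp", "stanza"])
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Try: pip install gatenlp[stanza]")
else:
    print("gatenlp and stanza are both importable")
```

This only checks that the packages can be located. It does not check that a Stanza language model has been downloaded.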

from gatenlp import Document
from gatenlp.lib_stanza import AnnStanza
import stanza


# The English model must be downloaded before the English Stanza pipeline can be used
stanza.download('en')
Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.2.2.json:   0%|   …
2021-09-12 20:25:46,231|INFO|stanza|Downloading default packages for language: en (English)...
Downloading http://nlp.stanford.edu/software/stanza/1.2.2/en/default.zip:   0%|          | 0.00/412M [00:00<?,…
2021-09-12 20:28:44,342|INFO|stanza|Finished downloading models and saved to /home/johann/stanza_resources.
doc = Document.load("https://gatenlp.github.io/python-gatenlp/testdocument2.txt")
doc

Annotating the document using Stanza

To annotate one or more documents using Stanza, first create an AnnStanza annotator object and then run the document(s) through this annotator:

stanza_annotator = AnnStanza(lang="en")
2021-09-12 20:28:44,689|INFO|stanza|Loading these models for language: en (English):
=========================
| Processor | Package   |
-------------------------
| tokenize  | combined  |
| pos       | combined  |
| lemma     | combined  |
| depparse  | combined  |
| sentiment | sstplus   |
| ner       | ontonotes |
=========================

2021-09-12 20:28:44,691|INFO|stanza|Use device: cpu
2021-09-12 20:28:44,692|INFO|stanza|Loading: tokenize
2021-09-12 20:28:44,697|INFO|stanza|Loading: pos
2021-09-12 20:28:44,954|INFO|stanza|Loading: lemma
2021-09-12 20:28:44,991|INFO|stanza|Loading: depparse
2021-09-12 20:28:45,369|INFO|stanza|Loading: sentiment
2021-09-12 20:28:45,766|INFO|stanza|Loading: ner
2021-09-12 20:28:46,371|INFO|stanza|Done loading processors!
doc = stanza_annotator(doc)
doc
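
To get a quick overview of what the annotator produced, the annotations can be counted by type. The sketch below is one possible way to do this and makes two assumptions: that the Stanza annotations were written to the document's default annotation set (the one returned by doc.annset()), and that each annotation exposes its type name through a .type attribute. The helper count_annotation_types is hypothetical and not part of gatenlp.

```python
from collections import Counter


def count_annotation_types(annotations):
    """Count annotations per type name.

    Accepts any iterable of objects that have a `.type` attribute,
    which is assumed here to include a gatenlp annotation set.
    """
    return Counter(ann.type for ann in annotations)


# With the annotated document from above, one would call, for example:
#   counts = count_annotation_types(doc.annset())
#   for anntype, n in counts.most_common():
#       print(anntype, n)
```

Which type names appear depends on the processors loaded in the Stanza pipeline and on the gatenlp version.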