Package gatenlp

# NOTE: do not place a comment at the end of the version assignment
# line since we parse that line in a shell script!
# __version__ = "0.9.9"
from gatenlp.version import __version__

try:
    import sortedcontainers
except Exception:
    import sys
    print(
        "ERROR: required package sortedcontainers cannot be imported!", file=sys.stderr
    )
    print(
        "Please install it, using e.g. 'pip install -U sortedcontainers'",
        file=sys.stderr,
    )
    sys.exit(1)
# TODO: check version of sortedcontainers (we have 2.1.0)

from gatenlp.utils import init_logger

logger = init_logger("gatenlp")
from gatenlp.span import Span
from gatenlp.annotation import Annotation
from gatenlp.annotation_set import AnnotationSet
from gatenlp.changelog import ChangeLog
from gatenlp.document import Document
from gatenlp.gateworker import GateWorker, GateWorkerAnnotator
from gatenlp.gate_interaction import _pr_decorator as GateNlpPr
from gatenlp.gate_interaction import interact


def init_notebook():
    from gatenlp.serialization.default import HtmlAnnViewerSerializer
    from gatenlp.gatenlpconfig import gatenlpconfig

    HtmlAnnViewerSerializer.init_javscript()
    gatenlpconfig.notebook_js_initialized = True
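The TODO above mentions checking the version of sortedcontainers. The following is a minimal sketch of how a guarded import with a minimum-version check could look. It is not part of gatenlp: the helper names `version_tuple` and `require` are hypothetical, the version parsing is deliberately simplistic (`packaging.version` would be more robust), and it raises `ImportError` instead of calling `sys.exit(1)` as the package source does.

```python
import importlib
import re
from importlib import metadata


def version_tuple(version):
    """Turn a version string such as '2.1.0' into a tuple of ints (hypothetical helper)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            # Stop at the first component that does not start with a digit.
            break
        parts.append(int(match.group()))
    return tuple(parts)


def require(module_name, min_version=None, dist_name=None):
    """Import a module, optionally checking the installed distribution version."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"required package {module_name} cannot be imported; "
            f"try 'pip install -U {dist_name or module_name}'"
        ) from exc
    if min_version is not None:
        # Assumption: the distribution name equals the module name unless given.
        installed = metadata.version(dist_name or module_name)
        if version_tuple(installed) < version_tuple(min_version):
            raise ImportError(
                f"{module_name} {installed} is older than required {min_version}"
            )
    return module
```

With such a helper, the check could be written as `require("sortedcontainers", min_version="2.1.0")`, assuming 2.1.0 is meant as the minimum supported version.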

Sub-modules

gatenlp.annotation

Module for the Annotation class, which represents information about a span of text in a document.

gatenlp.annotation_set

Module for the AnnotationSet class, which represents a named collection of annotations that can overlap arbitrarily.

gatenlp.changelog

Module for the ChangeLog class, which represents a log of changes to any of the components of a Document: document features, annotations, and annotation features.

gatenlp.changelog_consts

Module defining the constants used in the changelog module.

gatenlp.corpora

Module that defines base and implementation classes for representing document collections …

gatenlp.document

Module that implements the Document class for representing gatenlp documents with features and annotation sets.

gatenlp.features

Module that implements class Feature for representing features.

gatenlp.gate_interaction

Support for interaction between a GATE (Java) process and a gatenlp (Python) process. This is used by the Java GATE Python plugin.

gatenlp.gatenlpconfig

Module that provides the class GatenlpConfig and the instance gatenlpconfig which stores various global configuration options.

gatenlp.gateworker

Module for interacting with a Java GATE process, running API commands on it and exchanging data with it.

gatenlp.impl

This subpackage contains modules for (temporary) implementations of data structures and algorithms needed. Some of these may get replaced by other …

gatenlp.lang

Subpackage for future language-specific resources and annotators.

gatenlp.lib_spacy

Support for using spaCy: convert from spaCy to gatenlp documents and annotations.

gatenlp.lib_stanza

Support for using Stanford Stanza (see https://stanfordnlp.github.io/stanza/): convert from Stanford Stanza output to gatenlp documents and annotations.

gatenlp.offsetmapper

Module that implements the OffsetMapper class for mapping between Java-style and Python-style string offsets. Java strings are represented as UTF-16 …

gatenlp.pam

Subpackage for modules related to pattern matching.

gatenlp.processing

Package for annotators and other things related to processing documents.

gatenlp.serialization

Subpackage for modules related to serialization.

gatenlp.span

Module for the Span class.

gatenlp.urlfileutils

Module for functions that help reading binary and textual data from either URLs or local files.

gatenlp.utils

Various utilities that could be useful in several modules.

gatenlp.version
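
As an illustration of the problem that gatenlp.offsetmapper addresses: Java offsets count UTF-16 code units, while Python offsets count code points, so the two differ after any character outside the Basic Multilingual Plane. The sketch below shows one way such a mapping can be computed with the standard library only. It is not the OffsetMapper API; the function names are hypothetical.

```python
from bisect import bisect_right
from typing import List


def build_utf16_offsets(text: str) -> List[int]:
    """For each Python index 0..len(text), return the matching UTF-16 offset."""
    offsets = [0]
    for ch in text:
        # Code points above U+FFFF need a surrogate pair, i.e. two UTF-16 code units.
        offsets.append(offsets[-1] + (2 if ord(ch) > 0xFFFF else 1))
    return offsets


def python_to_java(offsets: List[int], py_off: int) -> int:
    """Map a Python (code point) offset to a Java (UTF-16 code unit) offset."""
    return offsets[py_off]


def java_to_python(offsets: List[int], java_off: int) -> int:
    """Map a Java offset back to the largest Python offset that does not exceed it."""
    return bisect_right(offsets, java_off) - 1


offsets = build_utf16_offsets("a\U0001F600b")  # 'a', an emoji, 'b'
print(offsets)                      # [0, 1, 3, 4]
print(python_to_java(offsets, 2))   # 3: 'b' starts at UTF-16 offset 3
print(java_to_python(offsets, 3))   # 2
```

A Java offset that falls inside a surrogate pair (offset 2 in the example) maps to the Python offset of the character containing it. This is a design choice of the sketch, not a statement about how gatenlp handles that case.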

Functions

def init_notebook()