External Information

The eds.external_information_qualifier pipeline component qualifies spans in a document using external information attached to that document. A span is qualified according to a defined distance between the span and these contextual (external) elements, as in Distant Supervision (http://deepdive.stanford.edu/distant_supervision).

Parameters

PARAMETER DESCRIPTION
nlp

The spaCy pipeline object.

TYPE: PipelineProtocol DEFAULT: None

name

The name of the component.

TYPE: Optional[str] DEFAULT: "distant_qualifier"

span_getter

The function or callable to get spans from the document.

TYPE: SpanGetterArg

external_information

A dictionary where keys are the names of the attributes to set on spans, and values are ExternalInformation objects defining the context and comparison settings.

ExternalInformation
PARAMETER DESCRIPTION
doc_attr

The document attribute that holds the external elements. The value stored under this attribute should be a list of dicts, each with the keys value and class (List[Dict[str, Any]]).

Example:

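import datetime

# Document attribute under which the external elements are stored
doc_attr = "_.context_dates"
# Expected content of that attribute: one dict per element, with "value" and "class" keys
context_dates = [
    {"value": datetime.datetime(2024, 2, 15), "class": "irm"},
    {"value": datetime.datetime(2024, 2, 7), "class": "biopsy"},
]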

TYPE: str
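
Because the component expects every element to carry both a value and a class key, it can help to check the list before attaching it to a document. The helper below is a minimal sketch for that purpose. check_external_elements is a hypothetical name and is not part of the library.

```python
import datetime
from typing import Any, Dict, List


def check_external_elements(elements: List[Dict[str, Any]]) -> None:
    """Raise ValueError if an element lacks the "value" or "class" key.

    Hypothetical helper for illustration; not provided by the library.
    """
    required = {"value", "class"}
    for index, element in enumerate(elements):
        missing = required - element.keys()
        if missing:
            raise ValueError(f"element {index} is missing keys: {sorted(missing)}")


context_dates = [
    {"value": datetime.datetime(2024, 2, 15), "class": "irm"},
    {"value": datetime.datetime(2024, 2, 7), "class": "biopsy"},
]

# Passes silently: both elements have the expected keys
check_external_elements(context_dates)
```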

span_attribute

TYPE: str

threshold

TYPE: Union[float, timedelta]
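
To make the role of a timedelta threshold concrete, here is a small sketch of a date comparison. It assumes the threshold is the maximum allowed absolute gap between a span's date and an external element's date. That reading is an assumption, since this page does not spell out the comparison, and classes_within_threshold is a hypothetical helper, not the component's implementation.

```python
import datetime
from typing import Any, Dict, List


def classes_within_threshold(
    span_date: datetime.datetime,
    elements: List[Dict[str, Any]],
    threshold: datetime.timedelta,
) -> List[str]:
    """Return the classes of the elements dated within `threshold` of `span_date`.

    Assumption: the threshold bounds the absolute time difference.
    """
    return [
        element["class"]
        for element in elements
        if abs(element["value"] - span_date) <= threshold
    ]


context_dates = [
    {"value": datetime.datetime(2024, 2, 15), "class": "irm"},
    {"value": datetime.datetime(2024, 2, 7), "class": "biopsy"},
]

# Hypothetical date carried by a span in the document
span_date = datetime.datetime(2024, 2, 14)

# Only the "irm" element (1 day away) is within 2 days; "biopsy" is 7 days away
matches = classes_within_threshold(
    span_date, context_dates, datetime.timedelta(days=2)
)
```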

reduce

one of ["all", "one_only", "closest"]

TYPE: str DEFAULT: 'all'

comparison_type

TYPE: str DEFAULT: 'similarity'

One

TYPE: Dict[str, ExternalInformation]
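
The shape of the external_information argument can be sketched with a stand-in class that mirrors the fields and defaults listed above. ExternalInformationSketch is a hypothetical dataclass used only for illustration (it is not the library's own ExternalInformation object), and the key "biopsy_match" and the span attribute "_.date" are made-up names. comparison_type is left at its default because this page does not list its accepted values.

```python
import datetime
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class ExternalInformationSketch:
    """Stand-in mirroring the documented fields; not the library class."""

    doc_attr: str
    span_attribute: str
    threshold: Union[float, datetime.timedelta]
    reduce: str = "all"  # documented choices: "all", "one_only", "closest"
    comparison_type: str = "similarity"


# Keys are the names of the attributes to set on spans
external_information: Dict[str, ExternalInformationSketch] = {
    "biopsy_match": ExternalInformationSketch(  # hypothetical attribute name
        doc_attr="_.context_dates",
        span_attribute="_.date",  # hypothetical span attribute
        threshold=datetime.timedelta(days=2),
        reduce="closest",
    ),
}
```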