Work Order Processing Demo

Set up the path and load DACKAR modules

[ ]:
%reload_ext autoreload
%autoreload 2

# Import libraries
import os
import sys
import logging
import warnings
import spacy
from spacy import displacy
warnings.filterwarnings("ignore")

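# Add the DACKAR source directory (../src relative to the current working directory) to the import path
cwd = os.getcwd()
frameworkDir = os.path.abspath(os.path.join(cwd, os.pardir, 'src'))
sys.path.append(frameworkDir)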

from dackar.causal.CausalPhrase import CausalPhrase
from dackar.utils.nlp.nlp_utils import generatePatternList

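# Configure the log format, then raise the root logger threshold to ERROR to keep the notebook output quiet
logging.basicConfig(format='%(asctime)s %(name)-20s %(levelname)-8s %(message)s', datefmt='%d-%b-%y %H:%M:%S', level=logging.INFO)
logging.getLogger().setLevel(logging.ERROR)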

Generate entity patterns and process text using the CausalPhrase class

The following information will be identified:

  • Entities

  • Aliases associated with entities

  • Status associated with entities

[ ]:
# Specify the entity label and ID
entLabel = "cws_component"        # user-defined entity label
entId = "OPM"                     # entity ID assigned to the patterns
# Load language model
nlp = spacy.load("en_core_web_lg", exclude=[])
matcher = CausalPhrase(nlp, entID=entId)

entIDList = ['1-91120-P1', '1-91120-PM1', '91120']
patternsEnts = generatePatternList(entIDList, label=entLabel, id=entId, nlp=nlp, attr="LEMMA")
matcher.addEntityPattern('cws_entity_ruler', patternsEnts)

text = "1-91120-P1, CLEAN PUMP AND MOTOR. 1-91120-PM1 REQUIRES OIL. 91120, CLEAN TRASH SCREEN"

doc = nlp(text)
displacy.render(doc, style='ent', jupyter=True)
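
For orientation, the sketch below shows the general shape of the pattern dictionaries that spaCy's EntityRuler accepts: a list of dicts with a "label", a "pattern", and an optional "id" key. The helper makePhrasePatterns is hypothetical and uses plain string patterns. The patterns returned by generatePatternList, which is called with attr="LEMMA" above, may be structured differently.

```python
# Hypothetical helper, not part of DACKAR: build string patterns in the
# dictionary format accepted by spaCy's EntityRuler.
def makePhrasePatterns(ids, label, patternId):
    """Return one pattern dict per entity ID string."""
    return [{"label": label, "pattern": text, "id": patternId} for text in ids]

# Same entity IDs, label, and ID as in the cell above
examplePatterns = makePhrasePatterns(
    ['1-91120-P1', '1-91120-PM1', '91120'],
    label="cws_component",
    patternId="OPM",
)

for pattern in examplePatterns:
    print(pattern)
```

A list in this format could be handed to an EntityRuler through its add_patterns method. String patterns match the exact phrase, while token-based patterns (lists of attribute dicts) allow matching on attributes such as the lemma.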

Processing the work order cumulatively

[ ]:
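# Clear any previously collected results before processing
matcher.reset()
# Split the work order text into sentences on periods
sents = text.split('.')
# Feed the sentences one at a time so that results accumulate in the matcher
for sent in sents:
    matcher(sent)
# Inspect the accumulated entity status (private attribute, shown for demonstration)
matcher._entStatus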
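
The idea of accumulating results sentence by sentence can be sketched in plain Python. This is a simplified illustration only, assuming that the status is whatever text follows the entity ID in a sentence. It is not how CausalPhrase works internally, and the names buildIdRegex and accumulateStatus are hypothetical.

```python
import re
from collections import defaultdict

def buildIdRegex(ids):
    """Compile a regex matching any known ID, trying longer IDs first."""
    ordered = sorted(ids, key=len, reverse=True)
    alternatives = '|'.join(re.escape(i) for i in ordered)
    # The lookarounds stop '91120' from matching inside '1-91120-P1'
    return re.compile(r'(?<![\w-])(' + alternatives + r')(?![\w-])')

def accumulateStatus(sentences, ids):
    """Collect the text following each entity ID, one sentence at a time."""
    pattern = buildIdRegex(ids)
    status = defaultdict(list)
    for sent in sentences:
        sent = sent.strip()
        if not sent:
            continue  # skip empty fragments left over from splitting
        match = pattern.search(sent)
        if match is None:
            continue  # no known entity in this sentence
        remainder = sent[match.end():].strip(' ,')
        status[match.group(1)].append(remainder)
    return dict(status)

text = "1-91120-P1, CLEAN PUMP AND MOTOR. 1-91120-PM1 REQUIRES OIL. 91120, CLEAN TRASH SCREEN"
result = accumulateStatus(text.split('.'), ['1-91120-P1', '1-91120-PM1', '91120'])
for entity, statuses in result.items():
    print(entity, statuses)
```

Keeping one list of statuses per entity means that repeated mentions of the same ID across sentences add to the record rather than overwrite it.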

Accessing attributes of entities

[ ]:
# Print the text, alias, entity ID, and label of each recognized entity
for ent in doc.ents:
    print(ent.text, ent._.alias, ent.ent_id_, ent.label_)