Work Order Processing Demo
Set up the path and load DACKAR modules
[ ]:
%reload_ext autoreload
%autoreload 2
# Import libraries
import os, sys
import logging
import warnings
import spacy
from spacy import displacy
warnings.filterwarnings("ignore")
cwd = os.getcwd()
frameworkDir = os.path.abspath(os.path.join(cwd, os.pardir, 'src'))
sys.path.append(frameworkDir)
from dackar.causal.CausalPhrase import CausalPhrase
from dackar.utils.nlp.nlp_utils import generatePatternList
logging.basicConfig(format='%(asctime)s %(name)-20s %(levelname)-8s %(message)s', datefmt='%d-%b-%y %H:%M:%S', level=logging.INFO)
logging.getLogger().setLevel(logging.ERROR)
Generate entity patterns and process text using the CausalPhrase class
The following information will be identified:
Entities
Aliases associated with entities
Statuses associated with entities
[ ]:
# Specify Entities Labels and IDs
entLabel = "cws_component" # user defined entity label
entId = "OPM"
# Load language model
nlp = spacy.load("en_core_web_lg", exclude=[])
matcher = CausalPhrase(nlp, entID=entId)
entIDList = ['1-91120-P1', '1-91120-PM1', '91120']
patternsEnts = generatePatternList(entIDList, label=entLabel, id=entId, nlp=nlp, attr="LEMMA")
matcher.addEntityPattern('cws_entity_ruler', patternsEnts)
text="1-91120-P1, CLEAN PUMP AND MOTOR. 1-91120-PM1 REQUIRES OIL. 91120, CLEAN TRASH SCREEN"
doc = nlp(text)
displacy.render(doc, style='ent', jupyter=True)
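For readers unfamiliar with the pattern format, here is a minimal sketch of entity patterns built with spaCy's own `entity_ruler` on a blank English pipeline. It does not use DACKAR: the list comprehension is only an assumed stand-in for what `generatePatternList` produces, the snake_case variable names are illustrative, and because a blank pipeline has no lemmatizer the match is on the verbatim token text rather than on `LEMMA`.

```python
import spacy

# Blank English pipeline: tokenizer only, so no model download is needed.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

ent_label = "cws_component"  # same label as in the cell above
ent_id = "OPM"
ent_id_list = ["1-91120-P1", "1-91120-PM1", "91120"]

# Assumed pattern shape: one phrase pattern per entity name,
# all sharing the same label and ID.
patterns = [
    {"label": ent_label, "pattern": name, "id": ent_id}
    for name in ent_id_list
]
ruler.add_patterns(patterns)

text = "1-91120-P1, CLEAN PUMP AND MOTOR. 1-91120-PM1 REQUIRES OIL. 91120, CLEAN TRASH SCREEN"
doc = nlp(text)

# Collect what the ruler tagged: surface text, label, and pattern ID.
found = [(ent.text, ent.label_, ent.ent_id_) for ent in doc.ents]
for row in found:
    print(row)
```

String patterns are tokenized by the same pipeline as the input text, which is why identifiers containing hyphens and digits still line up with the tokens in the sentence.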
Processing the work order cumulatively
[ ]:
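# Clear any state left over from earlier calls
matcher.reset()
# Feed the work order one sentence at a time; results accumulate in the matcher
sents = [sent.strip() for sent in text.split('.') if sent.strip()]
for sent in sents:
    matcher(sent)
# Inspect the accumulated entity statuses (_entStatus is a private attribute)
matcher._entStatus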
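The reset, call, and inspect sequence above can be illustrated with a toy, standard-library-only accumulator. This is not DACKAR's implementation: `ToyStatusAccumulator`, its regex matching, and the rule that the rest of the sentence is the status are all assumptions made purely to show how state can build up across repeated calls.

```python
import re
from collections import defaultdict


class ToyStatusAccumulator:
    """Hypothetical stand-in for illustration; not DACKAR's CausalPhrase."""

    def __init__(self, entity_ids):
        # Longest IDs first so that, when two IDs could match at the same
        # position, the longer one is preferred.
        ordered = sorted(entity_ids, key=len, reverse=True)
        self._pattern = re.compile("|".join(re.escape(e) for e in ordered))
        self.status = defaultdict(list)

    def reset(self):
        # Drop everything gathered by earlier calls.
        self.status.clear()

    def __call__(self, sentence):
        match = self._pattern.search(sentence)
        if match is None:
            return
        # Assumption: whatever follows the entity ID is its status.
        remainder = sentence[match.end():].strip(" ,.")
        self.status[match.group()].append(remainder)


text = "1-91120-P1, CLEAN PUMP AND MOTOR. 1-91120-PM1 REQUIRES OIL. 91120, CLEAN TRASH SCREEN"
acc = ToyStatusAccumulator(["1-91120-P1", "1-91120-PM1", "91120"])
acc.reset()
for sent in text.split("."):
    if sent.strip():
        acc(sent)

result = dict(acc.status)
print(result)
```

Each call adds to the same dictionary, so the final result reflects the whole work order rather than only the last sentence.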
Accessing attributes of entities
[ ]:
for ent in doc.ents:
    print(ent.text, ent._.alias, ent.ent_id_, ent.label_)
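The `ent._.alias` access relies on spaCy's custom extension attributes. As a minimal sketch of that mechanism using spaCy alone, the example below registers an `alias` extension on `Span` and sets it by hand. How DACKAR registers and populates the attribute is not shown in this notebook, so the value `"pump"` and the manual assignment are assumptions for illustration only.

```python
import spacy
from spacy.tokens import Span

# Register a custom attribute on spans; force=True avoids an error
# if an extension with this name was already registered.
Span.set_extension("alias", default=None, force=True)

nlp = spacy.blank("en")
doc = nlp("1-91120-P1, CLEAN PUMP AND MOTOR")

# Build an entity span from character offsets (the first 10 characters).
span = doc.char_span(0, 10, label="cws_component")
doc.ents = [span]

# Hypothetical alias value, assigned manually for this sketch.
doc.ents[0]._.alias = "pump"

for ent in doc.ents:
    print(ent.text, ent._.alias, ent.label_)
```

Extension values are stored on the `Doc`, so the alias remains readable on later iterations over `doc.ents`.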