Work Order Processing Demo¶
Setup path and load DACKAR modules¶
[1]:
%reload_ext autoreload
%autoreload 2
# Import libraries
import os, sys
import logging
import warnings
import spacy
from spacy import displacy
warnings.filterwarnings("ignore")
cwd = os.getcwd()
frameworkDir = os.path.abspath(os.path.join(cwd, os.pardir, 'src'))
sys.path.append(frameworkDir)
from dackar.workflows.WorkOrderProcessing import WorkOrderProcessing
from dackar.utils.nlp.nlp_utils import generatePatternList
logging.basicConfig(format='%(asctime)s %(name)-20s %(levelname)-8s %(message)s', datefmt='%d-%b-%y %H:%M:%S', level=logging.INFO)
logging.getLogger().setLevel(logging.ERROR)
Generate entity patterns and process text using the WorkOrderProcessing class¶
The following information will be identified:
Entities
Aliases associated with entities
Statuses associated with entities
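Before the cell below runs, it helps to see what a pattern list looks like. The sketch here is a hypothetical re-implementation of the idea behind `generatePatternList` using only plain Python: one spaCy-EntityRuler-style pattern dict per entity ID, with a token-attribute constraint per whitespace-separated token (the exact format DACKAR produces is an assumption).

```python
def make_patterns(id_list, label, ent_id, attr="LEMMA"):
    """Build EntityRuler-style pattern dicts, one per entity ID.

    Hypothetical stand-in for DACKAR's generatePatternList: each pattern
    constrains the chosen token attribute (e.g. LEMMA) to match the ID.
    """
    patterns = []
    for ent in id_list:
        token_pattern = [{attr: tok} for tok in ent.split()]
        patterns.append({"label": label, "pattern": token_pattern, "id": ent_id})
    return patterns

pats = make_patterns(["1-91120-P1", "91120"], label="cws_component", ent_id="OPM")
print(pats[0])
# {'label': 'cws_component', 'pattern': [{'LEMMA': '1-91120-P1'}], 'id': 'OPM'}
```

A list in this shape can be handed to a spaCy `EntityRuler` via `add_patterns`, which is what `matcher.addEntityPattern` does internally with the DACKAR-generated list.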
[2]:
# Specify Entities Labels and IDs
entLabel = "cws_component" # user defined entity label
entId = "OPM"
# Load language model
nlp = spacy.load("en_core_web_lg", exclude=[])
matcher = WorkOrderProcessing(nlp, entID=entId)
entIDList = ['1-91120-P1', '1-91120-PM1', '91120']
patternsEnts = generatePatternList(entIDList, label=entLabel, id=entId, nlp=nlp, attr="LEMMA")
matcher.addEntityPattern('cws_entity_ruler', patternsEnts)
text="1-91120-P1, CLEAN PUMP AND MOTOR. 1-91120-PM1 REQUIRES OIL. 91120, CLEAN TRASH SCREEN"
doc = nlp(text)
displacy.render(doc, style='ent', jupyter=True)
28-May-25 09:57:58 dackar.workflows.WorkflowBase INFO Create instance of WorkOrderProcessing
28-May-25 09:58:00 dackar.utils.nlp.nlp_utils INFO Model: core_web_lg, Language: en
28-May-25 09:58:00 dackar.utils.nlp.nlp_utils INFO Available pipelines:pysbdSentenceBoundaries, tok2vec, tagger, parser, attribute_ruler, lemmatizer, mergePhrase, normEntities, initCoref, aliasResolver, anaphorCoref, anaphorEntCoref
[rendered entity view] 1-91120-P1 (cws_component), CLEAN PUMP AND MOTOR. 1-91120-PM1 (cws_component) REQUIRES OIL. 91120 (cws_component), CLEAN TRASH SCREEN
Processing work order accumulatively¶
[3]:
matcher.reset()
sents = list(text.split('.'))
for sent in sents:
    matcher(sent)
matcher._entStatus
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO Start to extract health status
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO End of health status extraction!
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO Start to extract causal relation using OPM model information
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO End of causal relation extraction!
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO Start to extract health status
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO End of health status extraction!
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO Start to extract causal relation using OPM model information
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO End of causal relation extraction!
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO Start to extract health status
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO End of health status extraction!
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO Start to extract causal relation using OPM model information
28-May-25 09:58:00 dackar.workflows.WorkOrderProcessing INFO End of causal relation extraction!
[3]:
|   | entity | alias | entity_text | status | conjecture | negation | negation_text |
|---|---|---|---|---|---|---|---|
| 0 | 1-91120-P1 | unit 1 pump | unit 1 pump | CLEAN PUMP AND MOTOR | False | False |  |
| 1 | 1-91120-PM1 | unit 1 pump motor | unit 1 pump motor | OIL | False | False |  |
| 2 | 91120 | pump | pump | CLEAN TRASH SCREEN | False | False |  |
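The accumulative pattern above can be sketched without DACKAR: each call appends newly extracted rows to internal state, and `reset()` clears it before a new batch. `StatusAccumulator` below is a hypothetical stand-in, not the real `WorkOrderProcessing` class, and its naive "entity, status" split is only for illustration.

```python
class StatusAccumulator:
    """Toy accumulator mimicking the call-then-inspect workflow above."""

    def __init__(self):
        self.rows = []

    def reset(self):
        # Clear accumulated results, as matcher.reset() does in the demo
        self.rows.clear()

    def __call__(self, sent):
        # Naive extraction: text before the first comma is the entity ID,
        # the remainder is treated as its status
        ent, _, status = sent.strip().partition(",")
        self.rows.append({"entity": ent.strip(), "status": status.strip()})

acc = StatusAccumulator()
text = "1-91120-P1, CLEAN PUMP AND MOTOR. 91120, CLEAN TRASH SCREEN"
for sent in text.split("."):
    if sent.strip():
        acc(sent)
print(len(acc.rows))  # 2
```

Calling the object once per sentence and reading the accumulated table at the end mirrors the `matcher(sent)` loop and the `matcher._entStatus` inspection in cell [3].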
Accessing attributes of entities¶
[4]:
for ent in doc.ents:
    print(ent.text, ent._.alias, ent.ent_id_, ent.label_)
1-91120-P1 unit 1 pump OPM cws_component
1-91120-PM1 unit 1 pump motor OPM cws_component
91120 pump OPM cws_component
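The `ent._.alias` values printed above are filled in by DACKAR's `aliasResolver` pipe. Conceptually, that pipe maps recognized entity IDs to human-readable names; a minimal sketch using a plain lookup table, populated with the mappings shown in this demo's output (the table itself is illustrative, not DACKAR's internal data structure):

```python
# Aliases as printed in the demo output above
ALIASES = {
    "1-91120-P1": "unit 1 pump",
    "1-91120-PM1": "unit 1 pump motor",
    "91120": "pump",
}

def alias_for(ent_text):
    """Return the human-readable alias for an entity ID, or the ID itself
    when no alias is known."""
    return ALIASES.get(ent_text, ent_text)

print(alias_for("1-91120-PM1"))  # unit 1 pump motor
```

In spaCy terms, `_.alias` is a custom extension attribute registered on `Span`, so any downstream code can read it directly off `doc.ents` as shown in cell [4].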