src.dackar.causal.CausalBase¶
Created on April, 2024
@author: wangc, mandd
Attributes¶
Classes¶
CausalBase | Base Class for Causal Analysis
Module Contents¶
- class src.dackar.causal.CausalBase.CausalBase(nlp, entID='SSC', causalKeywordID='causal', *args, **kwargs)[source]¶
 Bases: object
 Base Class for Causal Analysis
- _causalNames = ['cause', 'cause health status', 'causal keyword', 'effect', 'effect health status', 'sentence',...[source]¶
 
- getAttribute(name)[source]¶
 Get self attribute data
- Parameters:
 name (str) – name of protected variable
- Returns:
 attribute data
- Return type:
 pandas.DataFrame
- textProcess()[source]¶
 Function to clean text
- Parameters:
 None
- Returns:
 procObj, DACKAR.Preprocessing object
- getKeywords(filename, columnNames=None)[source]¶
 Get the keywords from given file
- Parameters:
 filename – str, name of the file to read the keywords from
- Returns:
 dict, dictionary containing the keywords
- Return type:
 kw
- extractLemma(varList)[source]¶
 Lemmatize the variable list
- Parameters:
 varList – list, list of variables
- Returns:
 list, list of lemmatized variables
- Return type:
 lemmaList
- addKeywords(keywords, ktype)[source]¶
 Method to update self._causalKeywords or self._statusKeywords
- Parameters:
 keywords – dict, keywords that will be added to self._causalKeywords or self._statusKeywords
ktype – string, either ‘status’ or ‘causal’
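The update described above can be sketched in plain Python. This is only an illustration under stated assumptions: the `store` dictionary stands in for `self._causalKeywords` and `self._statusKeywords`, the helper name `add_keywords` and the category keys (`VERB`, `NOUN`) are hypothetical, and the actual method may merge keywords differently.

```python
def add_keywords(store, keywords, ktype):
    """Merge `keywords` into store[ktype]; ktype must be 'status' or 'causal'."""
    if ktype not in ("status", "causal"):
        raise ValueError(f"ktype must be 'status' or 'causal', got {ktype!r}")
    target = store[ktype]
    for category, words in keywords.items():
        # Create the category on first use, then append only unseen words.
        existing = target.setdefault(category, [])
        for word in words:
            if word not in existing:
                existing.append(word)
    return store


# Hypothetical stand-in for the two protected keyword dictionaries.
store = {"status": {}, "causal": {"VERB": ["cause"]}}
add_keywords(store, {"VERB": ["cause", "lead to"], "NOUN": ["result"]}, "causal")
```

Rejecting an unknown `ktype` early is a design choice of this sketch, not a documented behavior of the class.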
- addEntityPattern(name, patternList)[source]¶
 Add an entity pattern to extend doc.ents; similar in function to self.extendEnt
- Parameters:
 name – str, the name for the entity pattern.
patternList – list, the pattern list, for example:
{"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]}
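To make the pattern format concrete, the following sketch shows what a `LOWER`-only pattern list means: each pattern token is compared against the lowercased text of a word. It is a plain-Python stand-in written for illustration; `find_matches` is a hypothetical helper, not part of DACKAR or spaCy, and real entity patterns support many more token attributes.

```python
pattern_list = [
    {"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]},
]


def find_matches(words, patterns):
    """Return (label, start, end) for each span matching a LOWER-only pattern."""
    matches = []
    for entry in patterns:
        wanted = [tok["LOWER"] for tok in entry["pattern"]]
        n = len(wanted)
        # Slide a window of the pattern's length across the word list.
        for start in range(len(words) - n + 1):
            window = [w.lower() for w in words[start:start + n]]
            if window == wanted:
                matches.append((entry["label"], start, start + n))
    return matches


words = "The pump was shipped from San Francisco".split()
matches = find_matches(words, pattern_list)
```

The `end` index is exclusive, mirroring the usual Python slice convention.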
- __call__(text, extract=True, screen=False)[source]¶
 Find all token sequences matching the supplied pattern
- Parameters:
 text – string, the text that needs to be processed
- Returns:
 None
- isPassive(token)[source]¶
 Check the passiveness of the token
- Parameters:
 token – spacy.tokens.Token, the token of the doc
- Returns:
 True, if the token is passive
- Return type:
 isPassive
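One common way to detect passive voice from a dependency parse is to look for passive-specific labels among a verb's children. The sketch below assumes the `auxpass` and `nsubjpass` labels used by spaCy's English models and uses a hypothetical `Tok` stand-in instead of a real `spacy.tokens.Token`; the actual `isPassive` logic may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Tok:
    """Hypothetical stand-in exposing only the attributes this sketch needs."""
    text: str
    dep_: str
    children: list = field(default_factory=list)


def is_passive(token):
    """Return True if any child carries a passive dependency label."""
    return any(child.dep_ in ("auxpass", "nsubjpass") for child in token.children)


# "The pump was damaged by debris."
passive_verb = Tok("damaged", "ROOT",
                   [Tok("pump", "nsubjpass"), Tok("was", "auxpass"), Tok("by", "agent")])
# "Debris damaged the pump."
active_verb = Tok("damaged", "ROOT", [Tok("Debris", "nsubj"), Tok("pump", "dobj")])
```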
- isConjecture(token)[source]¶
 Check the conjecture of the token
- Parameters:
 token – spacy.tokens.Token, the token of the doc, the token should be the root of the Doc
- Returns:
 True, if the token/sentence indicates conjecture
- Return type:
 isConjecture
- isNegation(token)[source]¶
 Check negation status of given token
- Parameters:
 token – spacy.tokens.Token, token from spacy.tokens.doc.Doc
- Returns:
 tuple, the negation status and the token text
- Return type:
 (neg, text)
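A negation check of this kind can be sketched by scanning a token's children for the `neg` dependency label. Everything here is an assumption for illustration: the `Tok` class is a hypothetical stand-in for a spaCy token, and returning the negation word as the text part of the tuple is a guess at what "the token text" refers to.

```python
from dataclasses import dataclass, field


@dataclass
class Tok:
    """Hypothetical stand-in exposing only the attributes this sketch needs."""
    text: str
    dep_: str
    children: list = field(default_factory=list)


def is_negation(token):
    """Return (neg, text): whether a 'neg' child exists, and that child's text."""
    for child in token.children:
        if child.dep_ == "neg":
            return True, child.text
    return False, ""


# "The pump was not damaged."
negated = Tok("damaged", "ROOT",
              [Tok("pump", "nsubjpass"), Tok("was", "auxpass"), Tok("not", "neg")])
# "The pump failed."
plain = Tok("failed", "ROOT", [Tok("pump", "nsubj")])
```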
- findVerb(doc)[source]¶
 Find the first verb in the doc
- Parameters:
 doc – spacy.tokens.doc.Doc, the processed document using nlp pipelines
- Returns:
 spacy.tokens.Token, the token that has VERB pos
- Return type:
 token
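Finding the first verb amounts to a linear scan over the tokens for the `VERB` part-of-speech tag. The sketch below uses a hypothetical `Tok` stand-in and a plain list in place of a spaCy `Doc`; returning `None` when no verb is present is an assumption, since the return value for that case is not documented.

```python
from dataclasses import dataclass


@dataclass
class Tok:
    """Hypothetical stand-in for a spaCy token with a coarse POS tag."""
    text: str
    pos_: str


def find_verb(doc):
    """Return the first token tagged VERB, or None when there is no verb."""
    for token in doc:
        if token.pos_ == "VERB":
            return token
    return None


doc = [Tok("Corrosion", "NOUN"), Tok("caused", "VERB"),
       Tok("the", "DET"), Tok("leak", "NOUN")]
verb = find_verb(doc)
```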
- getCustomEnts(ents, labels)[source]¶
 Get the custom entities
- Parameters:
 ents – list, all entities from the processed doc
labels – list, list of labels to be used to get the custom entities out of “ents”
- Returns:
 list, the custom entities associated with “labels”
- Return type:
 customEnts
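Selecting custom entities by label is essentially a filter on each entity's label. The sketch below illustrates the idea with a hypothetical `Ent` stand-in for a spaCy `Span`; the `SSC` label is taken from the default `entID` in the class signature, while the example entities are invented.

```python
from dataclasses import dataclass


@dataclass
class Ent:
    """Hypothetical stand-in for an entity span with a label."""
    text: str
    label_: str


def get_custom_ents(ents, labels):
    """Keep only the entities whose label appears in `labels`."""
    return [ent for ent in ents if ent.label_ in labels]


ents = [Ent("pump", "SSC"), Ent("San Francisco", "GPE"), Ent("valve", "SSC")]
custom = get_custom_ents(ents, ["SSC"])
```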
- getPhrase(ent, start, end, include=False)[source]¶
 Get the phrase for ent with all left children
- Parameters:
 ent – Span, the ent to amend with all left children
start – int, the start index of ent
end – int, the end index of ent
include – bool, include ent in the returned expression if True
- Returns:
 Span or Token, the identified status
- Return type:
 status
- getAmod(ent, start, end, include=False)[source]¶
 Get amod tokens for ent
- Parameters:
 ent – Span, the ent to amend with all left children
start – int, the start index of ent
end – int, the end index of ent
include – bool, include ent in the returned expression if True
- Returns:
 Span or Token, the identified status
- Return type:
 status
- getAmodOnly(ent)[source]¶
 Get amod tokens texts for ent
- Parameters:
 ent – Span, the ent to amend with all left children
- Returns:
 list, the list of amods for ent
- Return type:
 amod
- getCompoundOnly(headEnt, ent)[source]¶
 Get the compounds for headEnt except ent
- Parameters:
 headEnt – Span, the head entity to ent
- Returns:
 list, the list of compounds for head ent
- Return type:
 compDes
- getNbor(token)[source]¶
 Get the neighboring token (nbor) of the given token; return None if no neighbor exists
- Parameters:
 token – Token, the provided Token to request nbor
- Returns:
 Token, the requested nbor
- Return type:
 nbor
- validSent(sent)[source]¶
 Check whether the sentence has a valid structure, i.e. contains a subject or an object
- Parameters:
 sent – Span, sentence from user provided text
- Returns:
 bool, False if the sentence has neither a subject nor an object.
- Return type:
 valid
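The validity rule above can be sketched as a check on dependency labels. The object labels are copied from the default `deps` of `findRightObj`; the subject labels (`nsubj`, `nsubjpass`) and the `Tok` stand-in are assumptions made for this illustration, and the real method may test a different set.

```python
from dataclasses import dataclass

SUBJECT_DEPS = {"nsubj", "nsubjpass"}  # assumed subject labels
OBJECT_DEPS = {"dobj", "pobj", "iobj", "obj", "obl", "oprd"}


@dataclass
class Tok:
    """Hypothetical stand-in for a spaCy token with a dependency label."""
    text: str
    dep_: str


def valid_sent(sent):
    """Return True if any token is a subject or an object."""
    return any(tok.dep_ in SUBJECT_DEPS or tok.dep_ in OBJECT_DEPS for tok in sent)


sentence = [Tok("Pump", "nsubj"), Tok("failed", "ROOT")]
fragment = [Tok("Very", "advmod"), Tok("noisy", "ROOT")]
```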
- findLeftSubj(pred, passive)[source]¶
 Find the closest subject in the predicate’s left subtree or in the left subtree of the predicate’s parent (recursive). Has a filter on organizations.
- Parameters:
 pred – spacy.tokens.Token, the predicate token
passive – bool, True if passive
- Returns:
 spacy.tokens.Token, the token that represents the subject
- Return type:
 subj
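The recursive search described above can be sketched as follows. This is a simplified illustration, not the library's implementation: the `Tok` class is a hypothetical stand-in, the organization filter is omitted, and using `passive` to switch between `nsubjpass` and `nsubj` is an assumption about how the flag is applied.

```python
class Tok:
    """Hypothetical stand-in for a spaCy token with head/lefts links."""

    def __init__(self, text, dep_, lefts=(), rights=()):
        self.text = text
        self.dep_ = dep_
        self.lefts = list(lefts)
        self.rights = list(rights)
        self.head = self  # as in spaCy, the root token is its own head
        for child in self.lefts + self.rights:
            child.head = self


def find_left_subj(pred, passive):
    """Look for a subject left of `pred`, then climb to the parent and retry."""
    wanted = "nsubjpass" if passive else "nsubj"
    for left in pred.lefts:
        if left.dep_ == wanted:
            return left
    if pred.head is not pred:  # stop once the root has been searched
        return find_left_subj(pred.head, passive)
    return None


# "The valve failed and caused a leak."
valve = Tok("valve", "nsubj")
caused = Tok("caused", "conj")
failed = Tok("failed", "ROOT", lefts=[valve], rights=[caused])
subj = find_left_subj(caused, passive=False)
```

Here `caused` has no subject of its own, so the search climbs to its parent `failed` and finds `valve` there.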
- findRightObj(pred, deps=['dobj', 'pobj', 'iobj', 'obj', 'obl', 'oprd'], exclPrepos=[])[source]¶
 Find the closest object in the predicate’s right subtree. Skip prepositional objects if the preposition is in the exclude list. Has a filter on organizations.
- Parameters:
 pred – spacy.tokens.Token, the predicate token
exclPrepos – list, list of the excluded prepositions
- findRightKeyword(pred, exclPrepos=[])[source]¶
 Find the keyword to the right of the predicate. Skip prepositional objects if the preposition is in the exclude list. Has a filter on organizations.
- Parameters:
 pred – spacy.tokens.Token, the predicate token
exclPrepos – list, list of the excluded prepositions
- findHealthStatus(root, deps)[source]¶
 Return first child of root (included) that matches dependency list by breadth first search. Search stops after first dependency match if firstDepOnly (used for subject search - do not “jump” over subjects)
- Parameters:
 root – spacy.tokens.Token, the root token
deps – list, the dependency list
- Returns:
 token, the token that represents the health status
- Return type:
 child
- isValidCausalEnts(ent)[source]¶
 Check whether the entity belongs to the valid causal entities
- Parameters:
 ent – list, list of entities
- Returns:
 bool, valid causal ent if True
- Return type:
 valid
- getIndex(ent, entList)[source]¶
 Get index for ent in entList
- Parameters:
 ent – Span, ent that is used to get index
entList – list, list of entities
- Returns:
 int, the index for ent
- Return type:
 idx
- getConjuncts(entList)[source]¶
 Get a list of conjuncts from entity list
- Parameters:
 entList – list, list of entities
- Returns:
 list, list of conjuncts
- Return type:
 conjunctList
- collectSents(doc)[source]¶
 Collect data of matched sentences that can be used for visualization
- Parameters:
 doc – spacy.tokens.doc.Doc, the processed document using nlp pipelines
- extract(sents, predSynonyms=[], exclPrepos=[])[source]¶
 General extraction method
- Parameters:
 sents – list, the list of sentences
predSynonyms – list, the list of predicate synonyms
exclPrepos – list, the list of excluded prepositions
- Returns:
 generator, the extracted causal relation
- Return type:
 (subject tuple, predicate, object tuple)
- bfs(root, deps)[source]¶
 Return first child of root (included) that matches entType and dependency list by breadth first search. Search stops after first dependency match if firstDepOnly (used for subject search - do not “jump” over subjects)
- Parameters:
 root – spacy.tokens.Token, the root token
deps – list, list of dependencies
- Returns:
 spacy.tokens.Token, the matched token
- Return type:
 child
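A breadth-first search over a dependency tree can be sketched as below. This is an illustration under stated assumptions: the `Tok` class is a hypothetical stand-in for a spaCy token, only the dependency list is checked (the entity-type match and the `firstDepOnly` behavior mentioned in the description are left out), and returning `None` when nothing matches is assumed.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Tok:
    """Hypothetical stand-in exposing only the attributes this sketch needs."""
    text: str
    dep_: str
    children: list = field(default_factory=list)


def bfs(root, deps):
    """Return the first token (root included) whose dep_ is in `deps`."""
    queue = deque([root])
    while queue:
        token = queue.popleft()
        if token.dep_ in deps:
            return token
        # Visit nearer tokens first by appending children to the queue's end.
        queue.extend(token.children)
    return None


# "The pump failed due to corrosion." (simplified parse)
corrosion = Tok("corrosion", "pobj")
pump = Tok("pump", "nsubj", [Tok("The", "det")])
root = Tok("failed", "ROOT", [pump, Tok("due", "prep", [corrosion])])
```

Because the queue is first-in, first-out, a match closer to the root is always returned before a deeper one.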
- findSubj(pred, passive)[source]¶
 Find the closest subject in the predicate’s left subtree or in the left subtree of the predicate’s parent (recursive). Has a filter on organizations.
- Parameters:
 pred – spacy.tokens.Token, the predicate token
passive – bool, True if the predicate token is passive
- Returns:
 spacy.tokens.Token, the token that represents subject
- Return type:
 subj
- findObj(pred, deps=['dobj', 'pobj', 'iobj', 'obj', 'obl'], exclPrepos=[])[source]¶
 Find the closest object in the predicate’s right subtree. Skip prepositional objects if the preposition is in the exclude list. Has a filter on organizations.
- Parameters:
 pred – spacy.tokens.Token, the predicate token
exclPrepos – list, the list of prepositions that will be excluded
- Returns:
 spacy.tokens.Token, the token that represents the object
- Return type:
 obj
- isValidKeyword(var, keywords)[source]¶
 Check whether var is valid among the keywords
- Parameters:
 var – token
keywords – list/dict
- Returns:
 True if var is valid among the keywords
- getStatusForSubj(ent, include=False)[source]¶
 Get the status for nsubj/nsubjpass ent
- Parameters:
 ent – Span, the nsubj/nsubjpass ent that will be used to search status
include – bool, include ent in the returned expression if True
- Returns:
 Span or Token, the identified status
- Return type:
 status
- getStatusForObj(ent, include=False)[source]¶
 Get the status for pobj/dobj ent
- Parameters:
 ent – Span, the pobj/dobj ent that will be used to search status
include – bool, include ent in the returned expression if True
- Returns:
 Span or Token, the identified status
- Return type:
 status