src.dackar.similarity.synsetUtils
Functions
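synsetListSimilarity(synsetList1, synsetList2[, delta]) | Compute similarity for synsetList pair
wordOrderSimilaritySynsetList(synsetList1, synsetList2) | Compute word order similarity for synsetList pair
constructSynsetOrderVector(synsets, jointSynsets, index) | Construct synset order vector for word order similarity calculation
identifyBestSimilarSynsetFromSynsets(syn, synsets) | Identify best similar synset from synsets
semanticSimilaritySynsets(synsetA, synsetB[, disambiguation]) | Compute the similarity between two synsets using semantic analysis
pathLength(synsetA, synsetB[, alpha, disambiguation]) | Path length calculation using nonlinear transfer function between two WordNet synsets.
scalingDepthEffect(synsetA, synsetB[, beta, disambiguation]) | Words at upper layers of hierarchical semantic nets have more general concepts and less semantic similarity between words than words at lower layers.
semanticSimilaritySynsetList(synsetList1, synsetList2) | Compute the similarity between two synsetList using semantic analysis
constructSemanticVector(syns, jointSyns) | Construct semantic vector
synsetsSimilarity(synsetA, synsetB[, method, disambiguation]) | Compute synsets similarity
constructSemanticVectorUsingDisambiguatedSynsets(wordSynsets, jointWordSynsets[, simMethod]) | Construct semantic vector when disambiguation has already been performed
semanticSimilarityUsingDisambiguatedSynsets(synsetsA, synsetsB[, simMethod]) | Compute semantic similarity for given synsets when disambiguation has already been performed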
Module Contents
- src.dackar.similarity.synsetUtils.synsetListSimilarity(synsetList1, synsetList2, delta=0.85)
Compute similarity for synsetList pair
- Parameters:
synsetList1 – list, list of synsets
synsetList2 – list, list of synsets
delta – float, between 0 and 1, factor for semantic similarity contribution
- Returns:
float, the similarity score
- Return type:
similarity
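As a minimal sketch of how `delta` might enter the final score, the snippet below blends a semantic score with a word-order score as a convex combination. The combination rule and the helper name `combine_similarity` are assumptions for illustration; the description above only states that `delta` is the factor for the semantic similarity contribution.

```python
def combine_similarity(semantic: float, word_order: float, delta: float = 0.85) -> float:
    """Blend two scores: delta weights the semantic part, (1 - delta) the word-order part.

    Hypothetical helper, not part of synsetUtils.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return delta * semantic + (1.0 - delta) * word_order


# With the default delta, the semantic score dominates the result.
score = combine_similarity(semantic=0.9, word_order=0.4)
```

Under this assumption, `delta=1` ignores word order entirely and `delta=0` ignores the semantic score.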
- src.dackar.similarity.synsetUtils.wordOrderSimilaritySynsetList(synsetList1, synsetList2)
Compute word order similarity for synsetList pair
- Parameters:
synsetList1 – list, list of synsets
synsetList2 – list, list of synsets
- Returns:
float, word order similarity score
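One common way to score word order, shown here as a sketch and not confirmed as this function's implementation, compares two order vectors r1 and r2 through 1 - ||r1 - r2|| / ||r1 + r2||. The helper name `word_order_similarity` and the plain-list inputs are assumptions for illustration.

```python
import math


def word_order_similarity(r1, r2):
    """Return 1 - ||r1 - r2|| / ||r1 + r2|| for two equal-length order vectors.

    Hypothetical helper; returns 0.0 when both vectors are all zeros (an arbitrary choice).
    """
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 0.0


same_order = word_order_similarity([1, 2, 3], [1, 2, 3])      # identical order
reversed_order = word_order_similarity([1, 2, 3], [3, 2, 1])  # reversed order scores lower
```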
- src.dackar.similarity.synsetUtils.constructSynsetOrderVector(synsets, jointSynsets, index)
Construct synset order vector for word order similarity calculation
- Parameters:
synsets – list of synsets
jointSynsets – list of joint synsets
index – int, index for synsets
- Returns:
np.array, synset order vector
- Return type:
vector
- src.dackar.similarity.synsetUtils.identifyBestSimilarSynsetFromSynsets(syn, synsets)
Identify best similar synset from synsets
- Parameters:
syn – wn.synset, synset
synsets – list of synsets
- Returns:
bestSyn – the best similar synset in synsets
similarity – the best similarity score
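The search described here amounts to keeping the candidate with the highest score. The sketch below illustrates that pattern with plain strings and a toy character-overlap score standing in for synsets and a WordNet similarity; `best_match` and `char_jaccard` are hypothetical names, not part of the module.

```python
def char_jaccard(a: str, b: str) -> float:
    """Toy stand-in for a synset similarity: Jaccard overlap of character sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def best_match(target, candidates, similarity):
    """Return (best candidate, best score); (None, 0.0) if nothing scores above zero."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(target, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


print(best_match("cat", ["dog", "cart", "tree"], char_jaccard))  # ('cart', 0.75)
```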
- src.dackar.similarity.synsetUtils.semanticSimilaritySynsets(synsetA, synsetB, disambiguation=False)
Compute the similarity between two synsets using semantic analysis (i.e., using both path length and depth information in WordNet)
- Parameters:
synsetA – wordnet.synset, the first synset
synsetB – wordnet.synset, the second synset
disambiguation – bool, True if disambiguation has been performed for the given synsets
- Returns:
float, [0, 1], the similarity score
- Return type:
similarity
- src.dackar.similarity.synsetUtils.pathLength(synsetA, synsetB, alpha=0.2, disambiguation=False)
Path length calculation using a nonlinear transfer function between two WordNet synsets. The two synsets should be the best synset pair (i.e., disambiguation should be performed)
- Parameters:
synsetA – wordnet.synset, synset for first word
synsetB – wordnet.synset, synset for second word
alpha – float, a constant in the monotonically decreasing function exp(-alpha*wordnetpathLength), used to scale the shortest path length. For WordNet, the optimal value is 0.2
disambiguation – bool, True if disambiguation has been performed for the given synsets
- Returns:
float, [0, 1], the shortest distance between two synsets using an exponentially decreasing function.
- Return type:
shortDistance
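The transfer function named in the `alpha` description can be sketched directly. The snippet assumes the shortest path length between the two synsets is already known as a non-negative number; how that length is obtained from WordNet is not shown, and `path_length_score` is a hypothetical name.

```python
import math


def path_length_score(shortest_path: float, alpha: float = 0.2) -> float:
    """Map a shortest path length to (0, 1] via exp(-alpha * length)."""
    if shortest_path < 0:
        raise ValueError("path length must be non-negative")
    return math.exp(-alpha * shortest_path)


identical = path_length_score(0)   # zero distance gives the maximum score
five_hops = path_length_score(5)   # exp(-1) with the default alpha
```

A larger `alpha` makes the score fall off faster as the path grows.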
- src.dackar.similarity.synsetUtils.scalingDepthEffect(synsetA, synsetB, beta=0.45, disambiguation=False)
Words at upper layers of hierarchical semantic nets have more general concepts and less semantic similarity between words than words at lower layers. This method scales the similarity with respect to depth h, i.e., [exp(beta*h)-exp(-beta*h)]/[exp(beta*h)+exp(-beta*h)]. The two synsets should be the best synset pair (i.e., disambiguation should be performed)
- Parameters:
synsetA – wordnet.synset, synset for first word
synsetB – wordnet.synset, synset for second word
beta – float, parameter used to scale the shortest depth. For WordNet, the optimal value is 0.45
disambiguation – bool, True if disambiguation has been performed for the given synsets
- Returns:
float, [0, 1], similarity score based on depth effect in WordNet
- Return type:
treeDist
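A minimal sketch of the depth scaling, assuming both exponentials in the expression use the same depth h, in which case it equals tanh(beta*h). The depth value is taken as given; how it is derived from the two synsets is not shown, and `depth_score` is a hypothetical name.

```python
import math


def depth_score(depth: float, beta: float = 0.45) -> float:
    """Scale a non-negative depth h to [0, 1) with (e^{bh} - e^{-bh}) / (e^{bh} + e^{-bh})."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    pos = math.exp(beta * depth)
    neg = math.exp(-beta * depth)
    return (pos - neg) / (pos + neg)


shallow = depth_score(1)   # concepts near the top of the hierarchy score low
deep = depth_score(10)     # deeper, more specific concepts approach 1
```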
- src.dackar.similarity.synsetUtils.semanticSimilaritySynsetList(synsetList1, synsetList2)
Compute the similarity between two synsetList using semantic analysis (i.e., compute the similarity using both path length and depth information in WordNet)
- Parameters:
synsetList1 – list, the list of synsets
synsetList2 – list, the list of synsets
- Returns:
float, [0, 1], the similarity score
- Return type:
semSimilarity
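List-level semantic similarity is often computed as the cosine of two semantic vectors built over a shared joint set. The sketch below shows only that final step and assumes the vectors already exist; whether this function uses cosine similarity is an assumption, and `cosine_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


# Two made-up semantic vectors over the same three-element joint set.
sem_similarity = cosine_similarity([1.0, 1.0, 0.8], [0.8, 1.0, 1.0])
```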
- src.dackar.similarity.synsetUtils.constructSemanticVector(syns, jointSyns)
Construct semantic vector
- Parameters:
syns – list of synsets
jointSyns – list of joint synsets
- Returns:
numpy.array, the semantic vector
- Return type:
vector
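A semantic vector typically has one entry per element of the joint set: 1.0 when that element also appears in the input list, otherwise the best similarity to any input element. The sketch below illustrates this with strings and a made-up lookup table in place of synsets and a WordNet measure; the construction rule, `semantic_vector`, and `toy_scores` are all assumptions for illustration.

```python
def semantic_vector(items, joint_items, similarity):
    """Build one score per joint item: 1.0 for exact members, else the best similarity."""
    vector = []
    for joint in joint_items:
        if joint in items:
            vector.append(1.0)
        else:
            # max(..., default=0.0) covers the case of an empty input list
            vector.append(max((similarity(joint, item) for item in items), default=0.0))
    return vector


# Made-up pairwise scores, looked up symmetrically.
toy_scores = {("car", "truck"): 0.8}


def toy_similarity(a, b):
    return toy_scores.get((a, b), toy_scores.get((b, a), 0.0))


vec = semantic_vector(["car", "road"], ["car", "road", "truck"], toy_similarity)
print(vec)  # [1.0, 1.0, 0.8]
```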
- src.dackar.similarity.synsetUtils.synsetsSimilarity(synsetA, synsetB, method='semantic_similarity_synsets', disambiguation=True)
Compute synsets similarity
- Parameters:
synsetA – wordnet.synset, the first synset
synsetB – wordnet.synset, the second synset
method – str, the method used to compute synset similarity, one of ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
disambiguation – bool, True if disambiguation has already been performed
- Returns:
float, [0, 1], the similarity score
- Return type:
similarity
- src.dackar.similarity.synsetUtils.constructSemanticVectorUsingDisambiguatedSynsets(wordSynsets, jointWordSynsets, simMethod='semantic_similarity_synsets')
Construct semantic vector when disambiguation has already been performed
- Parameters:
wordSynsets – set/list, set of word synsets
jointWordSynsets – set, set of joint word synsets
simMethod – str, method for similarity analysis in the construction of semantic vectors, one of ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
- Returns:
numpy.array, semantic vector with disambiguation
- Return type:
vector
- src.dackar.similarity.synsetUtils.semanticSimilarityUsingDisambiguatedSynsets(synsetsA, synsetsB, simMethod='semantic_similarity_synsets')
Compute semantic similarity for the given synsets when disambiguation has already been performed
- Parameters:
synsetsA – set/list, list of synsets
synsetsB – set/list, list of synsets
simMethod – str, method for similarity analysis in the construction of semantic vectors, one of ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
- Returns:
float, [0, 1], the similarity score
- Return type:
semSimilarity