src.dackar.similarity.synsetUtils

Functions

`synsetListSimilarity`(synsetList1, synsetList2[, delta])	Compute similarity for synsetList pair
`wordOrderSimilaritySynsetList`(synsetList1, synsetList2)	Compute word order similarity for synsetList pair
`constructSynsetOrderVector`(synsets, jointSynsets)	Construct synset order vector for word order similarity calculation
`identifyBestSimilarSynsetFromSynsets`(syn, synsets)	Identify best similar synset from synsets
`semanticSimilaritySynsets`(synsetA, synsetB)	Compute the similarity between two synsets using semantic analysis
`pathLength`(synsetA, synsetB[, alpha, disambiguation])	Path length calculation using nonlinear transfer function between two Wordnet Synsets.
`scalingDepthEffect`(synsetA, synsetB[, beta, ...])	Words at upper layers of hierarchical semantic nets have more general concepts and less semantic similarity
`semanticSimilaritySynsetList`(synsetList1, synsetList2)	Compute the similarity between two synset lists using semantic analysis
`constructSemanticVector`(syns, jointSyns)	Construct semantic vector
`synsetsSimilarity`(synsetA, synsetB[, method, ...])	Compute synsets similarity
`constructSemanticVectorUsingDisambiguatedSynsets`(...)	Construct semantic vector while disambiguation has been already performed
`semanticSimilarityUsingDisambiguatedSynsets`(synsetsA, ...)	Compute semantic similarity for given synsets while disambiguation has been already performed for given synsets

Module Contents

src.dackar.similarity.synsetUtils.synsetListSimilarity(synsetList1, synsetList2, delta=0.85)

Compute the similarity between a pair of synset lists

Parameters:

synsetList1 – list, list of synsets
synsetList2 – list, list of synsets
delta – float, between 0 and 1, weighting factor for the semantic similarity contribution

Returns:

similarity – float, the similarity score
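As a rough illustration of how `delta` might weight the two components, the sketch below blends a semantic score and a word order score linearly. The linear form and the name `combineSimilarity` are assumptions made for illustration; the documentation above only states that `delta` is the factor for the semantic similarity contribution.

```python
def combineSimilarity(semanticScore, wordOrderScore, delta=0.85):
    """Blend two scores in [0, 1] (assumed linear weighting, not confirmed by the source)."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie between 0 and 1")
    # delta weights the semantic part; the remainder goes to word order
    return delta * semanticScore + (1.0 - delta) * wordOrderScore


if __name__ == "__main__":
    print(combineSimilarity(0.8, 0.4))
```

With the default `delta=0.85`, the semantic score dominates, so a change in word order alone moves the combined score only slightly.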

src.dackar.similarity.synsetUtils.wordOrderSimilaritySynsetList(synsetList1, synsetList2)

Compute the word order similarity between a pair of synset lists

Parameters:

synsetList1 – list, list of synsets
synsetList2 – list, list of synsets

Returns:

float, word order similarity score
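One common way to score word order from two order vectors r1 and r2 is 1 - ||r1 - r2|| / ||r1 + r2||. The sketch below implements that formula on plain lists; whether this function uses exactly this form is an assumption, and `wordOrderSimilarity` is a hypothetical name.

```python
import math


def wordOrderSimilarity(orderVector1, orderVector2):
    """Return 1 - ||r1 - r2|| / ||r1 + r2|| for two equal-length order vectors."""
    if len(orderVector1) != len(orderVector2):
        raise ValueError("order vectors must have the same length")
    pairs = list(zip(orderVector1, orderVector2))
    diffNorm = math.sqrt(sum((a - b) ** 2 for a, b in pairs))
    sumNorm = math.sqrt(sum((a + b) ** 2 for a, b in pairs))
    # Two all-zero vectors share no entries, so report no similarity
    if sumNorm == 0:
        return 0.0
    return 1.0 - diffNorm / sumNorm


if __name__ == "__main__":
    print(wordOrderSimilarity([1, 2, 3], [1, 3, 2]))
```

Identical orderings give a score of 1, and the score drops as more positions are swapped.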

src.dackar.similarity.synsetUtils.constructSynsetOrderVector(synsets, jointSynsets)

Construct synset order vector for word order similarity calculation

Parameters:

synsets – list of synsets
jointSynsets – list of joint synsets

Returns:

vector – np.array, synset order vector
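To make the idea of an order vector concrete, here is a minimal sketch that records, for each entry of the joint list, its 1-based position in the original list (0 when absent). It uses exact matching on plain strings standing in for synsets; the actual function may instead fall back to the most similar synset, so treat `buildOrderVector` as a hypothetical simplification.

```python
def buildOrderVector(items, jointItems):
    """Map each joint item to its 1-based position in items, or 0 if absent."""
    positions = {}
    for index, item in enumerate(items, start=1):
        positions.setdefault(item, index)  # keep the first occurrence only
    return [positions.get(item, 0) for item in jointItems]


if __name__ == "__main__":
    joint = ["man", "bites", "dog", "cat"]
    print(buildOrderVector(["dog", "bites", "man"], joint))
```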

src.dackar.similarity.synsetUtils.identifyBestSimilarSynsetFromSynsets(syn, synsets)

Identify the most similar synset among synsets

Parameters:

syn – wn.synset, synset
synsets – list of synsets

Returns:

bestSyn – the best similar synset in synsets
similarity – the best similarity score
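The selection described here amounts to an argmax over candidate scores. The sketch below shows that pattern with a caller-supplied similarity function; `findBestMatch` and the toy scorer are hypothetical, and the tie-breaking and empty-list behavior are assumptions rather than documented behavior.

```python
def findBestMatch(target, candidates, similarity):
    """Return (bestCandidate, bestScore); (None, 0.0) when candidates is empty."""
    best, bestScore = None, 0.0
    for candidate in candidates:
        score = similarity(target, candidate)
        # Some similarity measures return None for unrelated items; treat as 0
        score = 0.0 if score is None else score
        if best is None or score > bestScore:
            best, bestScore = candidate, score
    return best, bestScore


def toySimilarity(a, b):
    """Toy scorer on strings: 1.0 if equal, 0.5 if same first letter, else 0.0."""
    if a == b:
        return 1.0
    return 0.5 if a[:1] == b[:1] else 0.0


if __name__ == "__main__":
    print(findBestMatch("cat", ["dog", "car", "cat"], toySimilarity))
```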

src.dackar.similarity.synsetUtils.semanticSimilaritySynsets(synsetA, synsetB)

Compute the similarity between two synsets using semantic analysis (e.g., using both path length and depth information in WordNet)

Parameters:

synsetA – wordnet.synset, the first synset
synsetB – wordnet.synset, the second synset

Returns:

similarity – float, [0, 1], the similarity score

src.dackar.similarity.synsetUtils.pathLength(synsetA, synsetB, alpha=0.2, disambiguation=False)

Path length calculation between two WordNet synsets using a nonlinear transfer function. The two synsets should be the best synset pair (i.e., disambiguation should already have been performed)

Parameters:

synsetA – wordnet.synset, synset for first word
synsetB – wordnet.synset, synset for second word
alpha – float, a constant in the monotonically decreasing function exp(-alpha*wordnetpathLength); a parameter used to scale the shortest path length. For WordNet, the optimal value is 0.2
disambiguation – bool, True if disambiguation has been performed for the given synsets

Returns:

shortDistance – float, [0, 1], the shortest distance between two synsets, transformed by an exponentially decreasing function
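The transfer function named in the `alpha` description, exp(-alpha*wordnetpathLength), can be sketched on its own as follows. The helper `pathLengthScore` is hypothetical and takes the path length as a plain number; in practice that length could come from, for example, NLTK's `Synset.shortest_path_distance`, which is an assumption about usage rather than something this page confirms.

```python
import math


def pathLengthScore(shortestPathLength, alpha=0.2):
    """Map a non-negative path length to (0, 1] with exp(-alpha * length)."""
    if shortestPathLength < 0:
        raise ValueError("path length must be non-negative")
    return math.exp(-alpha * shortestPathLength)


if __name__ == "__main__":
    for length in (0, 1, 5, 10):
        print(length, round(pathLengthScore(length), 4))
```

A path length of 0 (identical synsets) maps to 1, and the score decays toward 0 as the synsets move further apart.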

src.dackar.similarity.synsetUtils.scalingDepthEffect(synsetA, synsetB, beta=0.45, disambiguation=False)

Words at upper layers of hierarchical semantic nets have more general concepts and less semantic similarity between words than words at lower layers. This method scales the similarity with respect to depth h, using [exp(beta*h)-exp(-beta*h)]/[exp(beta*h)+exp(-beta*h)]. The two synsets should be the best synset pair (i.e., disambiguation should already have been performed)

Parameters:

synsetA – wordnet.synset, synset for first word
synsetB – wordnet.synset, synset for second word
beta – float, parameter used to scale the shortest depth. For WordNet, the optimal value is 0.45
disambiguation – bool, True if disambiguation has been performed for the given synsets

Returns:

treeDist – float, [0, 1], similarity score based on the depth effect in WordNet
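The depth scaling expression in the description can be evaluated directly. The sketch below assumes both exponentials use the same depth h, in which case the ratio equals tanh(beta*h); `depthScalingScore` is a hypothetical helper that takes the depth as a plain number.

```python
import math


def depthScalingScore(depth, beta=0.45):
    """Evaluate [exp(beta*h) - exp(-beta*h)] / [exp(beta*h) + exp(-beta*h)]."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    growth = math.exp(beta * depth)
    decay = math.exp(-beta * depth)
    return (growth - decay) / (growth + decay)


if __name__ == "__main__":
    for depth in (0, 1, 3, 8):
        print(depth, round(depthScalingScore(depth), 4))
```

The score is 0 at the root (depth 0) and approaches 1 for deep, specific concepts, which matches the stated intent that upper layers contribute less similarity.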

src.dackar.similarity.synsetUtils.semanticSimilaritySynsetList(synsetList1, synsetList2)

Compute the similarity between two synset lists using semantic analysis (i.e., compute the similarity using both path length and depth information in WordNet)

Parameters:

synsetList1 – list, the list of synsets
synsetList2 – list, the list of synsets

Returns:

semSimilarity – float, [0, 1], the similarity score
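A list-level semantic score is often obtained by comparing the two semantic vectors with cosine similarity. The sketch below shows only that final step on plain lists; that this function uses cosine similarity is an assumption, and `cosineSimilarity` is a hypothetical name.

```python
import math


def cosineSimilarity(vector1, vector2):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(vector1) != len(vector2):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(vector1, vector2))
    norm1 = math.sqrt(sum(a * a for a in vector1))
    norm2 = math.sqrt(sum(b * b for b in vector2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


if __name__ == "__main__":
    print(cosineSimilarity([1.0, 0.5, 0.0], [1.0, 0.0, 0.5]))
```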

src.dackar.similarity.synsetUtils.constructSemanticVector(syns, jointSyns)

Construct semantic vector

Parameters:

syns – list of synsets
jointSyns – list of joint synsets

Returns:

vector – numpy.array, the semantic vector
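A semantic vector is typically built by scoring each entry of the joint list against the original list and keeping the best score. The sketch below follows that pattern with strings standing in for synsets and a caller-supplied scorer; the max-over-candidates rule, `buildSemanticVector`, and `toyScore` are assumptions for illustration, and it returns a plain list rather than a numpy array.

```python
def buildSemanticVector(items, jointItems, similarity):
    """For each joint item, keep its highest similarity to any element of items."""
    vector = []
    for jointItem in jointItems:
        # Treat a None score (no relation found) as 0.0
        scores = [similarity(jointItem, item) or 0.0 for item in items]
        vector.append(max(scores, default=0.0))
    return vector


def toyScore(a, b):
    """Toy scorer on strings: 1.0 if equal, 0.5 if same first letter, else 0.0."""
    if a == b:
        return 1.0
    return 0.5 if a[:1] == b[:1] else 0.0


if __name__ == "__main__":
    print(buildSemanticVector(["cat", "dog"], ["cat", "car", "sun"], toyScore))
```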

src.dackar.similarity.synsetUtils.synsetsSimilarity(synsetA, synsetB, method='semantic_similarity_synsets', disambiguation=True)

Compute synsets similarity

Parameters:

synsetA – wordnet.synset, the first synset
synsetB – wordnet.synset, the second synset
method – str, the method used to compute synset similarity, one of ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
disambiguation – bool, True if disambiguation has already been performed

Returns:

similarity – float, [0, 1], the similarity score
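The short method names 'path', 'wup', 'lch', 'res', 'jcn', and 'lin' match the similarity measures NLTK exposes on WordNet synsets. The sketch below shows one way such a name could be dispatched to the corresponding NLTK `Synset` method; how this function actually dispatches is not shown on this page, so `dispatchSimilarity` and its `ic` parameter (the information content dictionary that 'res', 'jcn', and 'lin' need in NLTK) are hypothetical. The 'semantic_similarity_synsets' option is left out because it refers to this module's own measure.

```python
# Short method name -> NLTK Synset method name
NLTK_METHODS = {
    "path": "path_similarity",
    "wup": "wup_similarity",
    "lch": "lch_similarity",
    "res": "res_similarity",
    "jcn": "jcn_similarity",
    "lin": "lin_similarity",
}
IC_METHODS = ("res", "jcn", "lin")


def dispatchSimilarity(synsetA, synsetB, method="path", ic=None):
    """Call the NLTK-style measure selected by method; None results become 0.0."""
    if method not in NLTK_METHODS:
        raise ValueError(f"unknown method: {method!r}")
    measure = getattr(synsetA, NLTK_METHODS[method])
    if method in IC_METHODS:
        if ic is None:
            raise ValueError(f"method {method!r} needs an information content dictionary")
        score = measure(synsetB, ic)
    else:
        score = measure(synsetB)
    return 0.0 if score is None else score
```

With NLTK and its WordNet corpus installed, a call might look like `dispatchSimilarity(wn.synset('dog.n.01'), wn.synset('cat.n.01'), 'wup')`. Note that in NLTK the 'lch' and 'res' measures are not bounded above by 1.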

src.dackar.similarity.synsetUtils.constructSemanticVectorUsingDisambiguatedSynsets(wordSynsets, jointWordSynsets, simMethod='semantic_similarity_synsets')

Construct the semantic vector when disambiguation has already been performed

Parameters:

wordSynsets – set/list, set of word synsets
jointWordSynsets – set, set of joint word synsets
simMethod – str, method for similarity analysis in the construction of semantic vectors, one of ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']

Returns:

vector – numpy.array, semantic vector with disambiguation

src.dackar.similarity.synsetUtils.semanticSimilarityUsingDisambiguatedSynsets(synsetsA, synsetsB, simMethod='semantic_similarity_synsets')

Compute the semantic similarity between two collections of synsets when disambiguation has already been performed on them

Parameters:

synsetsA – set/list, list of synsets
synsetsB – set/list, list of synsets
simMethod – str, method for similarity analysis in the construction of semantic vectors, one of ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']

Returns:

semSimilarity – float, [0, 1], the similarity score