src.dackar.similarity.synsetUtils
=================================

.. py:module:: src.dackar.similarity.synsetUtils

Functions
---------

.. autoapisummary::

   src.dackar.similarity.synsetUtils.synsetListSimilarity
   src.dackar.similarity.synsetUtils.wordOrderSimilaritySynsetList
   src.dackar.similarity.synsetUtils.constructSynsetOrderVector
   src.dackar.similarity.synsetUtils.identifyBestSimilarSynsetFromSynsets
   src.dackar.similarity.synsetUtils.semanticSimilaritySynsets
   src.dackar.similarity.synsetUtils.pathLength
   src.dackar.similarity.synsetUtils.scalingDepthEffect
   src.dackar.similarity.synsetUtils.semanticSimilaritySynsetList
   src.dackar.similarity.synsetUtils.constructSemanticVector
   src.dackar.similarity.synsetUtils.synsetsSimilarity
   src.dackar.similarity.synsetUtils.constructSemanticVectorUsingDisambiguatedSynsets
   src.dackar.similarity.synsetUtils.semanticSimilarityUsingDisambiguatedSynsets

Module Contents
---------------

.. py:function:: synsetListSimilarity(synsetList1, synsetList2, delta=0.85)

   Compute the similarity of a pair of synset lists.

   :param synsetList1: list, list of synsets
   :param synsetList2: list, list of synsets
   :param delta: float, between 0 and 1, factor for the semantic similarity contribution
   :returns: ``similarity``, the similarity score
   :rtype: float

.. py:function:: wordOrderSimilaritySynsetList(synsetList1, synsetList2)

   Compute the word order similarity of a pair of synset lists.

   :param synsetList1: list, list of synsets
   :param synsetList2: list, list of synsets
   :returns: the word order similarity score
   :rtype: float

.. py:function:: constructSynsetOrderVector(synsets, jointSynsets, index)

   Construct the synset order vector used in the word order similarity calculation.

   :param synsets: list of synsets
   :param jointSynsets: list of joint synsets
   :param index: int, index for synsets
   :returns: ``vector``, the synset order vector
   :rtype: np.array

.. py:function:: identifyBestSimilarSynsetFromSynsets(syn, synsets)

   Identify the synset in ``synsets`` that is most similar to ``syn``.

   :param syn: wn.synset, synset
   :param synsets: list of synsets
   :returns: ``bestSyn``, the most similar synset in ``synsets``, and ``similarity``, the best similarity score

.. py:function:: semanticSimilaritySynsets(synsetA, synsetB, disambiguation=False)

   Compute the similarity between two synsets using semantic analysis
   (i.e., using both path length and depth information in wordnet).

   :param synsetA: wordnet.synset, the first synset
   :param synsetB: wordnet.synset, the second synset
   :param disambiguation: bool, True if disambiguation has been performed for the given synsets
   :returns: ``similarity``, the similarity score in [0, 1]
   :rtype: float

.. py:function:: pathLength(synsetA, synsetB, alpha=0.2, disambiguation=False)

   Path length calculation between two Wordnet synsets using a nonlinear transfer function.
   The two synsets should be the best synset pair (i.e., disambiguation should be performed).

   :param synsetA: wordnet.synset, synset for first word
   :param synsetB: wordnet.synset, synset for second word
   :param alpha: float, a constant in the monotonically decreasing function
                 ``exp(-alpha*wordnetpathLength)``, used to scale the shortest path length.
                 For wordnet, the optimal value is 0.2
   :param disambiguation: bool, True if disambiguation has been performed for the given synsets
   :returns: ``shortDistance``, in [0, 1], the shortest distance between the two synsets
             mapped through the exponentially decreasing function
   :rtype: float

.. py:function:: scalingDepthEffect(synsetA, synsetB, beta=0.45, disambiguation=False)

   Words at upper layers of hierarchical semantic nets have more general concepts and less
   semantic similarity between words than words at lower layers. This method scales the
   similarity behavior with respect to depth ``h``, i.e.,
   ``[exp(beta*h)-exp(-beta*h)]/[exp(beta*h)+exp(-beta*h)]``.
   The two synsets should be the best synset pair (i.e., disambiguation should be performed).

   :param synsetA: wordnet.synset, synset for first word
   :param synsetB: wordnet.synset, synset for second word
   :param beta: float, parameter used to scale the shortest depth. For wordnet, the optimal value is 0.45
   :param disambiguation: bool, True if disambiguation has been performed for the given synsets
   :returns: ``treeDist``, in [0, 1], the similarity score based on the depth effect in wordnet
   :rtype: float

.. py:function:: semanticSimilaritySynsetList(synsetList1, synsetList2)

   Compute the similarity between two synset lists using semantic analysis
   (i.e., using both path length and depth information in wordnet).

   :param synsetList1: list, the list of synsets
   :param synsetList2: list, the list of synsets
   :returns: ``semSimilarity``, the similarity score in [0, 1]
   :rtype: float

.. py:function:: constructSemanticVector(syns, jointSyns)

   Construct the semantic vector.

   :param syns: list of synsets
   :param jointSyns: list of joint synsets
   :returns: ``vector``, the semantic vector
   :rtype: numpy.array

.. py:function:: synsetsSimilarity(synsetA, synsetB, method='semantic_similarity_synsets', disambiguation=True)

   Compute the similarity between two synsets.

   :param synsetA: wordnet.synset, the first synset
   :param synsetB: wordnet.synset, the second synset
   :param method: str, the method used to compute synset similarity, one of
                  ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
   :param disambiguation: bool, True if disambiguation has already been performed
   :returns: ``similarity``, the similarity score in [0, 1]
   :rtype: float

.. py:function:: constructSemanticVectorUsingDisambiguatedSynsets(wordSynsets, jointWordSynsets, simMethod='semantic_similarity_synsets')

   Construct the semantic vector when disambiguation has already been performed.

   :param wordSynsets: set/list, set of word synsets
   :param jointWordSynsets: set, set of joint word synsets
   :param simMethod: str, method for similarity analysis in the construction of semantic vectors, one of
                     ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
   :returns: ``vector``, the semantic vector with disambiguation
   :rtype: numpy.array

.. py:function:: semanticSimilarityUsingDisambiguatedSynsets(synsetsA, synsetsB, simMethod='semantic_similarity_synsets')

   Compute the semantic similarity for the given synsets when disambiguation has already
   been performed on them.

   :param synsetsA: set/list, list of synsets
   :param synsetsB: set/list, list of synsets
   :param simMethod: str, method for similarity analysis in the construction of semantic vectors, one of
                     ['semantic_similarity_synsets', 'path', 'wup', 'lch', 'res', 'jcn', 'lin']
   :returns: ``semSimilarity``, the similarity score in [0, 1]
   :rtype: float
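
The two transfer functions described for ``pathLength`` and ``scalingDepthEffect`` can be sketched in plain Python, without WordNet, to show how ``alpha``, ``beta`` and ``delta`` shape the scores. This is an illustrative sketch, not the module's implementation. The helper names below are hypothetical, the inputs are raw numbers rather than synsets, the depth formula is read as a hyperbolic tangent in ``h``, and two steps are assumptions not confirmed by this page: that the path and depth terms are multiplied, and that ``delta`` weights the semantic score against the word order score linearly.

```python
import math


def path_length_score(path_len, alpha=0.2):
    """Map a shortest path length to (0, 1] with exp(-alpha * path_len)."""
    return math.exp(-alpha * path_len)


def depth_score(depth, beta=0.45):
    """Scale by depth h: [exp(beta*h) - exp(-beta*h)] / [exp(beta*h) + exp(-beta*h)]."""
    pos = math.exp(beta * depth)
    neg = math.exp(-beta * depth)
    return (pos - neg) / (pos + neg)


def synset_pair_score(path_len, depth, alpha=0.2, beta=0.45):
    """Combine both terms. ASSUMPTION: the two factors are multiplied."""
    return path_length_score(path_len, alpha) * depth_score(depth, beta)


def combined_score(semantic, word_order, delta=0.85):
    """ASSUMPTION: delta linearly weights semantic against word order similarity."""
    return delta * semantic + (1.0 - delta) * word_order


if __name__ == "__main__":
    # Hypothetical inputs: a path of 2 edges and a common ancestor at depth 5.
    pair = synset_pair_score(path_len=2, depth=5)
    print(f"pair score: {pair:.4f}")
    print(f"combined:   {combined_score(pair, word_order=0.5):.4f}")
```

With these definitions, a shorter path and a deeper common ancestor both raise the pair score, which matches the behavior the docstrings describe for lower layers of the hierarchy.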