src.dackar.similarity.SentenceSimilarity
Attributes
Classes
Module Contents
- class src.dackar.similarity.SentenceSimilarity.SentenceSimilarity(disambiguationMethod='simple_lesk', similarityMethod='semantic_similarity_synsets', wordOrderContribution=0.0)
- validDisambiguation = ['simple_lesk', 'original_lesk', 'cosine_lesk', 'adapted_lesk', 'max_similarity']
- wordnetSimMethod = ['path_similarity', 'wup_similarity', 'lch_similarity', 'res_similarity', 'jcn_similarity',...
- constructSimilarityVectorPawarMagoMethod(arr1, arr2)
Construct the similarity vector from the synsets of two sentences.
- Parameters:
arr1 – set of wordnet.Synset for one sentence
arr2 – set of wordnet.Synset for the other sentence
- Returns:
vector – list, the similarity vector; count – int, the number of words with high similarity (>= 0.804)
- Return type:
list, int
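The vector construction described above can be illustrated with a minimal sketch. It is not the library's implementation: the function name `construct_similarity_vector`, the pluggable `sim` callable, and the rule "keep the best score of each item in `arr1` against `arr2`" are assumptions made for illustration. Only the 0.804 threshold comes from the description above. A toy string similarity stands in for a WordNet measure so the example runs without any corpus data.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")

# Threshold quoted in the Returns description above.
HIGH_SIMILARITY = 0.804


def construct_similarity_vector(
    arr1: Iterable[T],
    arr2: Iterable[T],
    sim: Callable[[T, T], Optional[float]],
) -> Tuple[List[float], int]:
    """Hypothetical helper: best similarity of each arr1 item against arr2.

    Returns the vector of best scores and the number of scores that
    reach HIGH_SIMILARITY.
    """
    others = list(arr2)
    vector: List[float] = []
    count = 0
    for item in arr1:
        # A missing score (None) is treated as 0.0 in this sketch.
        best = max((sim(item, other) or 0.0 for other in others), default=0.0)
        vector.append(best)
        if best >= HIGH_SIMILARITY:
            count += 1
    return vector, count


def toy_sim(a: str, b: str) -> float:
    """Toy stand-in for a WordNet measure (assumption, not a real metric)."""
    if a == b:
        return 1.0
    return 0.5 if a[0] == b[0] else 0.0


vector, count = construct_similarity_vector(["cat", "dog"], ["cat", "cow"], toy_sim)
print(vector, count)
```

With real data, `sim` could be replaced by one of the measures named in wordnetSimMethod, applied to wordnet.Synset objects.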
- sentenceSimilarity(sentence1, sentence2, method='pm_disambiguation', infoContentNorm=False)
Compute the similarity between two sentences using the selected method.
- sentenceSimilarityPawarMagoMethod(sentence1, sentence2)
Compute sentence similarity using the method proposed in https://arxiv.org/pdf/1802.05667.pdf
- Parameters:
sentence1 – str, first sentence used to compute sentence similarity
sentence2 – str, second sentence used to compute sentence similarity
- Returns:
similarity – float in [0, 1], the computed similarity of the two sentences
- Return type:
float
- sentenceSimialrityBestSense(sentence1, sentence2, infoContentNorm=False)
Proposed method from https://github.com/anishvarsha/Sentence-Similaritity-using-corpus-statistics. Compute sentence similarity using both semantic and word order similarity. The semantic similarity is based on the maximum word similarity between one word and the other sentence.
- Parameters:
sentence1 – str, first sentence used to compute sentence similarity
sentence2 – str, second sentence used to compute sentence similarity
infoContentNorm – bool, True if statistics corpus is used to weight similarity vectors
- Returns:
similarity – float in [0, 1], the computed similarity of the two sentences
- Return type:
float
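The combination of semantic and word order similarity described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: all function names are hypothetical, words are compared with a toy exact-match similarity instead of disambiguated WordNet synsets, corpus-statistics weighting (infoContentNorm) is omitted, and the blend `(1 - w) * semantic + w * order` is only one plausible reading of how a parameter such as wordOrderContribution could be applied.

```python
import math
from typing import Callable, List, Sequence

WordSim = Callable[[str, str], float]


def exact_match(a: str, b: str) -> float:
    """Toy word similarity (assumption): 1.0 for identical words, else 0.0."""
    return 1.0 if a == b else 0.0


def semantic_vector(joint: Sequence[str], words: Sequence[str], sim: WordSim) -> List[float]:
    # For each word of the joint word set, keep its maximum similarity
    # against the words of one sentence.
    return [max((sim(w, other) for other in words), default=0.0) for w in joint]


def order_vector(joint: Sequence[str], words: Sequence[str], sim: WordSim) -> List[int]:
    # 1-based position of the best-matching word, 0 when nothing matches.
    vec = []
    for w in joint:
        best_idx, best = 0, 0.0
        for i, other in enumerate(words, start=1):
            score = sim(w, other)
            if score > best:
                best_idx, best = i, score
        vec.append(best_idx)
    return vec


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def word_order_similarity(r1: Sequence[int], r2: Sequence[int]) -> float:
    # 1 - ||r1 - r2|| / ||r1 + r2||
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 0.0


def sentence_similarity(s1: str, s2: str, sim: WordSim = exact_match,
                        word_order_contribution: float = 0.0) -> float:
    w1, w2 = s1.lower().split(), s2.lower().split()
    joint = list(dict.fromkeys(w1 + w2))  # ordered joint word set
    semantic = cosine(semantic_vector(joint, w1, sim), semantic_vector(joint, w2, sim))
    order = word_order_similarity(order_vector(joint, w1, sim), order_vector(joint, w2, sim))
    # Assumed blend: the contribution weight shifts mass from semantic to order.
    return (1.0 - word_order_contribution) * semantic + word_order_contribution * order


same_words = sentence_similarity("dog bites man", "man bites dog")
with_order = sentence_similarity("dog bites man", "man bites dog", word_order_contribution=0.5)
print(same_words, with_order)
```

With the weight at 0.0 the two reordered sentences score as identical, since they share every word. A non-zero weight lowers the score because the word positions differ.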