baserec.base.similarity package¶

Submodules¶

baserec.base.similarity.compute_similarity module¶

@author: Maurizio Ferrari Dacrema & Ceshine Lee

class baserec.base.similarity.compute_similarity.ComputeSimilarity(dataMatrix, use_implementation='density', similarity=None, **args)¶

Bases: object

compute_similarity(**args)¶

class baserec.base.similarity.compute_similarity.SimilarityFunction(value)¶

Bases: enum.Enum

An enumeration.

ADJUSTED_COSINE = 'adjusted'¶

COSINE = 'cosine'¶

EUCLIDEAN = 'euclidean'¶

JACCARD = 'jaccard'¶

PEARSON = 'pearson'¶

TANIMOTO = 'tanimoto'¶
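Because every member carries a string value, a measure name can be resolved to a member by value. The sketch below re-declares the enumeration locally so that it runs without the package installed; it mirrors the members listed above rather than reproducing the package source, and `parse_similarity` is a hypothetical helper.

```python
from enum import Enum


class SimilarityFunction(Enum):
    # Local mirror of the documented members, for illustration only.
    COSINE = "cosine"
    PEARSON = "pearson"
    JACCARD = "jaccard"
    TANIMOTO = "tanimoto"
    ADJUSTED_COSINE = "adjusted"
    EUCLIDEAN = "euclidean"


def parse_similarity(name):
    """Hypothetical helper: map a string such as 'cosine' to an enum member."""
    try:
        return SimilarityFunction(name)  # lookup by value
    except ValueError:
        valid = ", ".join(member.value for member in SimilarityFunction)
        raise ValueError(
            f"Unknown similarity {name!r}; expected one of: {valid}"
        ) from None
```

Looking up by value keeps the user-facing strings ('adjusted') decoupled from the member names (ADJUSTED_COSINE).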

baserec.base.similarity.compute_similarity_cython module¶

@author: Maurizio Ferrari Dacrema

class baserec.base.similarity.compute_similarity_cython.ComputeSimilarityCython¶

Bases: object

compute_similarity()¶

Compute the similarity for the given dataset.

Parameters

start_col – Column to begin with
end_col – Column to stop before; end_col is excluded

baserec.base.similarity.compute_similarity_euclidean module¶

@author: Maurizio Ferrari Dacrema & Ceshine Lee

class baserec.base.similarity.compute_similarity_euclidean.ComputeSimilarityEuclidean(dataMatrix, topK=100, shrink=0, normalize=False, normalize_avg_row=False, similarity_from_distance_mode='lin', row_weights=None, **args)¶

Bases: object

compute_similarity(start_col=None, end_col=None, block_size=100)¶

Compute the similarity for the given dataset.

Parameters

start_col – Column to begin with
end_col – Column to stop before; end_col is excluded
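The reference above does not spell out how a distance becomes a similarity. As a rough sketch only, the code below assumes that the 'lin' mode means the inverse of the Euclidean distance, with shrink added to the denominator. The function name, the epsilon, and that interpretation are all assumptions, and the sketch works on a dense NumPy array rather than calling the class.

```python
import numpy as np


def euclidean_similarity(data, shrink=0.0, top_k=None):
    """Column-wise similarity as the inverse of Euclidean distance (assumed 'lin' mode)."""
    data = np.asarray(data, dtype=np.float64)
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = (data ** 2).sum(axis=0)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (data.T @ data)
    dist = np.sqrt(np.maximum(sq_dist, 0.0))  # clip tiny negatives from rounding
    sim = 1.0 / (dist + shrink + 1e-9)  # assumed epsilon avoids division by zero
    np.fill_diagonal(sim, 0.0)  # a column is not its own neighbour
    if top_k is not None and 0 < top_k < sim.shape[0]:
        # Zero everything except the top_k largest entries in each column.
        drop = np.argsort(sim, axis=0)[:-top_k, :]
        np.put_along_axis(sim, drop, 0.0, axis=0)
    return sim
```

Identical columns have distance zero, so without a positive shrink their similarity is bounded only by the epsilon.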

baserec.base.similarity.compute_similarity_euclidean_test module¶

@author: Maurizio Ferrari Dacrema & Ceshine Lee

class baserec.base.similarity.compute_similarity_euclidean_test.MyTestCase(methodName='runTest')¶

Bases: unittest.case.TestCase

test_euclidean_similarity_float()¶

test_euclidean_similarity_integer()¶

baserec.base.similarity.compute_similarity_euclidean_test.areSparseEquals(Sparse1, Sparse2)¶

baserec.base.similarity.compute_similarity_python module¶

@author: Maurizio Ferrari Dacrema & Ceshine Lee

class baserec.base.similarity.compute_similarity_python.ComputeSimilarityPython(dataMatrix, topK=100, shrink=0, normalize=True, asymmetric_alpha=0.5, tversky_alpha=1.0, tversky_beta=1.0, similarity='cosine', row_weights=None)¶

Bases: object

Computes the similarity between the columns of dataMatrix (cosine by default).

If it is computed on URM=|users|x|items|, pass the URM as is.
If it is computed on ICM=|items|x|features|, pass the ICM transposed.

Available similarity measures (the similarity parameter):

“cosine” computes Cosine similarity (this is the default)
“adjusted” computes Adjusted Cosine, removing the average of the users
“asymmetric” computes Asymmetric Cosine
“pearson” computes Pearson Correlation, removing the average of the items
“jaccard” computes Jaccard similarity for binary interactions using Tanimoto
“dice” computes Dice similarity for binary interactions
“tversky” computes Tversky similarity for binary interactions
“tanimoto” computes Tanimoto coefficient for binary interactions
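The set-based measures in the list above can be illustrated with their textbook definitions. The sketch below binarises a dense matrix and computes Jaccard (equivalently Tanimoto on binary data), Dice and Tversky between its columns. It is a standalone illustration with a hypothetical function name, not the class's implementation, and it omits shrink and topK.

```python
import numpy as np


def binary_set_similarities(data, tversky_alpha=1.0, tversky_beta=1.0):
    """Jaccard/Tanimoto, Dice and Tversky between columns of a binarised matrix."""
    binary = (np.asarray(data) != 0).astype(np.float64)  # keep presence/absence only
    inter = binary.T @ binary        # |A & B| for every pair of columns
    sizes = binary.sum(axis=0)       # |A| for every column
    only_a = sizes[:, None] - inter  # |A - B|
    only_b = sizes[None, :] - inter  # |B - A|

    def safe_div(num, den):
        # Return 0 where the denominator is 0 (for example, two empty columns).
        return np.divide(num, den, out=np.zeros_like(num), where=den != 0)

    total = sizes[:, None] + sizes[None, :]
    jaccard = safe_div(inter, total - inter)
    dice = safe_div(2.0 * inter, total)
    tversky = safe_div(inter, inter + tversky_alpha * only_a + tversky_beta * only_b)
    return jaccard, dice, tversky
```

With tversky_alpha = tversky_beta = 1.0 (the documented defaults) the Tversky index coincides with Jaccard.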

Asymmetric Cosine as described in:

Aiolli, F. (2013, October). Efficient top-n recommendation for very large scale binary rated datasets. In Proceedings of the 7th ACM conference on Recommender systems (pp. 273-280). ACM.

(Note from Ceshine: since the similarities are calculated between columns, the asymmetric cosine measure doesn’t seem to make sense here?)
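For reference, the asymmetric cosine from the cited paper divides the dot product by the two norms raised to different powers. The sketch below applies that formula to the columns of a dense matrix; the function name is hypothetical and the code is an illustration of the formula, not the class's own code path.

```python
import numpy as np


def asymmetric_cosine(data, alpha=0.5):
    """s(i, j) = x_i . x_j / (||x_i||^(2*alpha) * ||x_j||^(2*(1 - alpha))) over columns."""
    data = np.asarray(data, dtype=np.float64)
    dot = data.T @ data
    sq_norms = np.diag(dot)  # squared L2 norm of each column
    den = np.outer(sq_norms ** alpha, sq_norms ** (1.0 - alpha))
    # Leave the similarity at 0 wherever the denominator vanishes.
    return np.divide(dot, den, out=np.zeros_like(dot), where=den != 0)
```

At alpha = 0.5 the denominator is the plain product of the norms, so the result is the ordinary (symmetric) cosine; any other value makes s(i, j) differ from s(j, i).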

Parameters

dataMatrix – Numpy matrix
topK (int, optional) – Keep only the Top K entries, by default 100
shrink (int, optional) – The shrinkage parameter helps to avoid overfitting when only a few ratings are available, by default 0
normalize (bool, optional) – If True divide the dot product by the product of the norms, by default True
asymmetric_alpha (float, optional) – Coefficient alpha for the asymmetric cosine, by default 0.5
tversky_alpha (float, optional) – Coefficient alpha for the Tversky similarity, by default 1.0
tversky_beta (float, optional) – Coefficient beta for the Tversky similarity, by default 1.0
similarity (str, optional) – type of similarity measure to use, by default “cosine”
row_weights (Sequence, optional) – Multiply the values in each row by a specified value, by default None
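To make the interplay of normalize, shrink, topK and row_weights concrete, here is a minimal dense-NumPy sketch of a shrunk cosine similarity between columns. It is written under assumptions: the function name, the small epsilon, and the literal reading of row_weights as a per-row scaling are illustrative choices, not the class's confirmed behaviour.

```python
import numpy as np


def cosine_top_k(data, top_k=100, shrink=0.0, normalize=True, row_weights=None):
    """Cosine similarity between columns, shrunk and pruned to top_k per column."""
    data = np.asarray(data, dtype=np.float64)
    if row_weights is not None:
        # Literal reading of the parameter: scale each row by its weight.
        data = data * np.asarray(row_weights, dtype=np.float64)[:, None]
    sim = data.T @ data  # dot product of every pair of columns
    if normalize:
        norms = np.sqrt(np.diag(sim))
        # shrink enlarges the denominator, damping pairs with little support.
        sim = sim / (np.outer(norms, norms) + shrink + 1e-6)
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    if 0 < top_k < sim.shape[0]:
        # Zero everything except the top_k largest entries in each column.
        drop = np.argsort(sim, axis=0)[:-top_k, :]
        np.put_along_axis(sim, drop, 0.0, axis=0)
    return sim
```

A larger shrink lowers every similarity, but the relative effect is strongest for columns with small norms, which is where few interactions are available.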

applyAdjustedCosine()¶

Remove from every data point the average for the corresponding row.

applyPearsonCorrelation()¶

Remove from every data point the average for the corresponding column.
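The two centring steps differ only in the axis along which the average is taken. The sketch below shows this on a dense array, assuming that zeros denote missing interactions and are therefore excluded from the mean and left unchanged. That assumption and the function name are illustrative, not taken from the class.

```python
import numpy as np


def center_nonzero(data, axis):
    """Subtract the mean of the non-zero entries along `axis` from those entries.

    axis=1 centres each row (adjusted cosine); axis=0 centres each column (Pearson).
    """
    data = np.asarray(data, dtype=np.float64)
    mask = data != 0
    counts = mask.sum(axis=axis, keepdims=True)
    sums = data.sum(axis=axis, keepdims=True)
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts != 0)
    # Assumption: zeros mean "no interaction" and are left untouched.
    return np.where(mask, data - means, 0.0)
```

After centring, the usual cosine computation on the result yields the adjusted cosine (axis=1) or the Pearson correlation (axis=0) over the observed entries.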

compute_similarity(start_col=None, end_col=None, block_size=100)¶

Compute the similarity for the given dataset.

Parameters

start_col – Column to begin with
end_col – Column to stop before; end_col is excluded
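The start_col, end_col and block_size arguments suggest that columns are processed in chunks. The following sketch shows one way such a loop could look for a plain dot-product similarity on a dense array. It assumes that columns outside the half-open range [start_col, end_col) are simply left at zero, and the function is hypothetical rather than the method's actual body.

```python
import numpy as np


def similarity_in_blocks(data, start_col=None, end_col=None, block_size=100):
    """Dot-product similarity of columns [start_col, end_col) against all columns."""
    data = np.asarray(data, dtype=np.float64)
    n_cols = data.shape[1]
    start = 0 if start_col is None else max(start_col, 0)
    end = n_cols if end_col is None else min(end_col, n_cols)
    result = np.zeros((n_cols, n_cols))
    for block_start in range(start, end, block_size):
        block_end = min(block_start + block_size, end)  # end is excluded
        # One product per block keeps the temporary at n_cols x block_size.
        result[:, block_start:block_end] = data.T @ data[:, block_start:block_end]
    return result
```

Splitting the column range this way also allows disjoint ranges to be computed independently and combined afterwards.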

useOnlyBooleanInteractions()¶
