# How to improve collaborative filtering with dimensionality reduction?

Besides their large number of applications across fields, recommendation systems increasingly face huge volumes of data and data sparsity. This raises concerns about computational cost as well as the quality of the recommendations. To address this, dimensionality reduction techniques are used in recommender systems to reduce processing cost and improve prediction. In this article, we’ll discuss the collaborative filtering approach to recommendation and its limitations. We will also try to understand the sparsity problem encountered in this approach and how it can be addressed with dimensionality reduction techniques. We will cover the following main points in this article to understand this concept in detail.

**Contents**

- Collaborative filtering
- The collaborative filtering approach
- Limitations of collaborative filtering
- Reducing dimensionality in collaborative filtering
- Singular value decomposition
- Principal component analysis

Let’s start by understanding collaborative filtering.

**Collaborative filtering**

Collaborative filtering is a widely used technique in most recommendation systems. Generally, collaborative filtering is understood in two senses: a narrow sense and a more general sense. In the narrower sense, collaborative filtering is a method of making automatic predictions (filtering) about a user’s interests by collecting preference or taste information from a large number of users (collaborating).

The collaborative filtering strategy is based on the idea that if person A and person B share an opinion on one topic, A is more likely to share B’s opinion on a different topic than a randomly chosen person. For example, given a partial list of a user’s tastes, a collaborative filtering recommendation system for electronics-accessory shopping could predict which accessories the user would like to buy (likes versus dislikes). In the more general sense, collaborative filtering is the process of finding information or patterns using techniques that involve multiple agents, viewpoints, data sources, and so on.

Collaborative filtering applications are typically used with very large data sets. Collaborative filtering methods have been applied to many types of data, including detection and monitoring data, such as mineral exploration, environmental detection over large areas or multiple sensors; financial data, such as financial services institutions that integrate multiple financial sources; and e-commerce and web applications where the focus is on user data, among others.

**Collaborative filtering approaches**

Most recommendation systems based on collaborative filtering build a neighborhood of like-minded customers. As a proximity measure, the neighborhood formation step typically uses Pearson correlation or cosine similarity. After determining the close neighborhood, these algorithms generate two kinds of recommendations:

- A prediction of how much a customer C will like a product P. In a correlation-based algorithm, the prediction on product P for customer C is computed as a weighted sum over the products co-rated by C and its neighbors, to which the mean rating of C is added. This can be written with the formula below; the prediction is personalized to customer C.

*pred(C, P) = r̄_C + Σ_J r_{JC} · (J_P − r̄_J) / Σ_J |r_{JC}|*

In the above expression, *r_{JC}* denotes the correlation between customer *C* and neighbor *J*, *J_P* is *J*’s rating on product *P*, and *r̄_C*, *r̄_J* are the mean ratings of *C* and *J*.

- A list of recommended products for a customer C. This is often referred to as a top-N recommendation. After forming the neighborhood, the recommendation algorithm scans the items rated by the neighbors and selects a list of N products the customer is likely to appreciate.
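The neighborhood-based steps above can be sketched in a few lines of NumPy. The ratings matrix, the `pearson`, `predict`, and `top_n` helpers, and all values are illustrative toy assumptions, not taken from the article:

```python
import numpy as np

# Toy ratings matrix (rows = customers, cols = products); 0 marks "not rated".
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

def pearson(u, v):
    """Pearson correlation over the products both customers rated."""
    mask = (u > 0) & (v > 0)
    if mask.sum() < 2:
        return 0.0  # too few co-rated items -> no usable correlation
    a, b = u[mask], v[mask]
    a, b = a - a.mean(), b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float(a @ b / denom) if denom > 0 else 0.0

def predict(R, c, p):
    """Mean rating of C plus a correlation-weighted sum of neighbors'
    mean-centered ratings on product p."""
    mean_c = R[c][R[c] > 0].mean()
    num = den = 0.0
    for j in range(R.shape[0]):
        if j == c or R[j, p] == 0:
            continue
        r_jc = pearson(R[c], R[j])
        mean_j = R[j][R[j] > 0].mean()
        num += r_jc * (R[j, p] - mean_j)
        den += abs(r_jc)
    return mean_c if den == 0 else mean_c + num / den

def top_n(R, c, n=2):
    """Top-N: rank the products customer c has not yet rated by predicted score."""
    unrated = [p for p in range(R.shape[1]) if R[c, p] == 0]
    return sorted(unrated, key=lambda p: predict(R, c, p), reverse=True)[:n]

print(round(predict(R, 0, 2), 2))
print(top_n(R, 1))
```

Note that `predict` falls back to the customer’s own mean when no correlated neighbor rated the product, which is exactly the coverage gap discussed in the next section.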

**Limitations of collaborative filtering**

These systems have been effective in a variety of areas; however, they do not always successfully match a user’s preferences. Unless the platform achieves exceptionally high levels of diversity and independence of opinion, one point of view will tend to prevail over others in any given group. In such situations, the algorithm exhibits certain weaknesses, namely:

*Sparsity*

Because nearest-neighbor algorithms rely on exact matches, they sacrifice the coverage and accuracy of the recommender system. Since the correlation coefficient is only defined for pairs of customers who have rated two or more products in common, many customer pairs have no computable correlation.

Many commercial recommendation systems are used in practice to analyze large sets of products (e.g. Amazon.com suggests books). Even active customers may have rated far less than 1% of the products in these systems (1% of 2 million books is 20,000 books, a large body to form an opinion on). As a result, Pearson nearest-neighbor algorithms may be unable to offer many product recommendations for a given consumer. This is known as reduced coverage, and it is caused by sparse neighbor ratings. In addition, since only a limited amount of rating data is available, the accuracy of the suggestions may be poor.
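A quick simulation makes this reduced-coverage effect concrete. The user/item counts and the 1% rating density below are illustrative assumptions, not figures from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items = 200, 2000
density = 0.01  # each customer rates roughly 1% of the catalogue

# Illustrative sparse ratings matrix: 0 means "not rated", 1..5 are ratings.
R = np.where(rng.random((n_users, n_items)) < density,
             rng.integers(1, 6, (n_users, n_items)), 0)

rated = (R > 0).astype(int)
# For each pair of customers, count co-rated items; Pearson needs at least 2.
co_rated = rated @ rated.T
pairs = np.triu_indices(n_users, k=1)
usable = (co_rated[pairs] >= 2).mean()
print(f"customer pairs with a computable correlation: {usable:.1%}")
```

At 1% density, only a small fraction of customer pairs share two or more rated items, so most correlations (and hence most neighborhoods) simply cannot be formed.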

*Scalability*

Nearest-neighbor methods require computation that grows with both the number of customers and the number of items. With millions of customers and products, a conventional web recommendation system running these algorithms faces major scalability issues.

*Synonymy*

In the real world, different product names can refer to the same thing. Correlation-based recommendation systems cannot detect this hidden relationship and therefore treat such products as distinct. Consider one customer who rates ten different *recycled letter pad* items as “high” and another customer who rates ten different *recycled memo pad* products as “high.”

Correlation-based recommendation systems would find no correlation, because the two sets of products do not overlap, and would thus miss the latent relationship that both customers like recycled office supplies.

**Reducing dimensionality in collaborative filtering**

In general, dimensionality reduction is the process of mapping a large input space onto a lower-dimensional latent space. Matrix factorization is a subset of dimensionality reduction in which a data matrix D is decomposed into the product of several low-rank matrices.

*Singular Value Decomposition (SVD)*

SVD is a powerful dimensionality reduction technique and a specialization of the matrix factorization (MF) approach. The central task in SVD is to discover a lower-dimensional feature space. As a matrix factorization technique, SVD factors an m × n matrix R into three matrices as follows:

*R = USV ′*

Here, S is a diagonal matrix holding all the singular values of R as diagonal elements, while U and V are two orthogonal matrices. All the entries of S are non-negative and stored in decreasing order of magnitude.

In recommendation systems, SVD is used to accomplish two different tasks. First, it captures latent relationships between customers and products, which lets us estimate how likely a customer is to buy a specific product. Second, it produces a low-dimensional representation of the original customer-product space, in which the neighborhood is then computed. That neighborhood is then used to provide a list of the best product suggestions to customers.
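As a minimal sketch of both uses, here is SVD applied to a toy ratings matrix with NumPy; the matrix values and the choice of k = 2 retained dimensions are illustrative assumptions:

```python
import numpy as np

# Toy customer-product ratings matrix (illustrative values only).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

# Full decomposition R = U S V'; NumPy returns the singular values
# non-negative and in decreasing order, as described above.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
assert np.all(s >= 0) and np.all(np.diff(s) <= 0)

# Task 1: keep only the k largest singular values -> the best rank-k
# approximation of R; its entries act as estimated customer-product affinities.
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(round(R_k[1, 2], 2))  # estimated affinity of customer 1 for product 2

# Task 2: a k-dimensional representation of each customer, in which
# a (much cheaper) neighborhood can be formed.
customer_latent = U[:, :k] @ np.diag(s[:k])
```

Neighborhood formation in `customer_latent` works on k-dimensional vectors instead of the full product catalogue, which is where the scalability gain comes from.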

*Principal Component Analysis (PCA)*

PCA is a sophisticated dimensionality reduction technique and another specific application of the MF approach. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables called principal components (PCs).

The number of PCs is less than or equal to the initial number of variables. The transformation is defined so that the first PC captures as much variance as possible and each subsequent component captures the greatest possible remaining variance while being orthogonal to the preceding components. The principal components are orthogonal because they are eigenvectors of the covariance matrix. Note that PCA is sensitive to the relative scale of the original variables.
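The steps above can be sketched via the covariance eigendecomposition on synthetic correlated data; all sizes and values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative data: 100 customers described by 5 correlated features
# generated from a 2-dimensional latent signal plus small noise.
X = (rng.normal(size=(100, 2)) @ rng.normal(size=(2, 5))
     + rng.normal(scale=0.1, size=(100, 5)))

# 1. Center the data (and standardize if scales differ -- PCA is scale-sensitive).
Xc = X - X.mean(axis=0)

# 2. Eigendecomposition of the covariance matrix; eigh returns ascending
#    eigenvalues, so reorder to descending variance.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3. Project onto the first k (orthogonal) principal components.
k = 2
pcs = Xc @ eigvecs[:, :k]

explained = eigvals[:k].sum() / eigvals.sum()
print(f"variance explained by {k} PCs: {explained:.1%}")
```

Because the data were built from a 2-dimensional signal, the first two PCs recover almost all of the variance, which is the property a recommender exploits when replacing the raw customer-product space with a few components.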

**Conclusion**

In this article, we have looked at collaborative filtering. Like any ML algorithm, collaborative filtering returns poor predictions when exposed to high-dimensional, sparse data. To understand this problem, we first examined the CF approach and the circumstances under which it fails, and then saw how SVD and PCA can be used to address those flaws.
