

<aside> 🔑 This guide explains the concept of Image Similarity and how it can be used to automate repetitive tasks, making the data-preparation pipeline faster, more efficient, more accurate and more convenient.

</aside>




IMAGE SIMILARITY

Image similarity is a measure of how alike two images are. In other words, it quantifies the degree of similarity between the intensity patterns of two images.
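To make "similarity between intensity patterns" concrete, here is a minimal sketch of one possible score, normalized cross-correlation, computed on two equal-sized grayscale images. This is only an illustration under stated assumptions: the function name `normalized_cross_correlation` and the tiny sample images are hypothetical, images are represented as plain lists of pixel rows, and practical tools often compare learned feature embeddings rather than raw pixels.

```python
import math


def normalized_cross_correlation(img_a, img_b):
    """Return a similarity score in [-1, 1] for two grayscale images.

    Each image is a list of rows of pixel intensities (hypothetical
    representation chosen for this sketch). A score near 1 means the
    intensity patterns match, near -1 means they are inverted.
    """
    # Flatten both images into one list of intensities each.
    a = [p for row in img_a for p in row]
    b = [p for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and have the same size")

    # Subtract the mean so overall brightness does not affect the score.
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [p - mean_b for p in b]

    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denominator == 0:
        # At least one image is completely flat, so correlation is
        # undefined; this sketch reports 0.0 in that case.
        return 0.0
    return numerator / denominator


# Hypothetical 2x2 sample images for illustration.
original = [[0, 50], [100, 150]]
brighter = [[p + 10 for p in row] for row in original]   # same pattern, brighter
inverted = [[255 - p for p in row] for row in original]  # opposite pattern

print(normalized_cross_correlation(original, brighter))
print(normalized_cross_correlation(original, inverted))
```

The brighter copy scores close to 1 because only the pattern matters after the mean is subtracted, while the inverted copy scores close to -1.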

Similar Images

Source:- https://apple.github.io/turicreate/docs/userguide/image_similarity/images/similar_images.png


IMAGE SIMILARITY & DATA PREPARATION

Data preparation is the process of transforming raw data so that data scientists and analysts can run it through machine learning models to uncover insights or make predictions. It involves several steps, such as Data-Collection, Data-Curation and Data-Annotation. These steps are very time-consuming and require a lot of resources and cost to perform at scale, and this effort can be automated or reduced with IMAGE SIMILARITY. The following are some use cases where IMAGE SIMILARITY can automate tasks and make the process faster.

Source:- https://www.lightly.ai/post/data-preparation-tools-for-computer-vision-2021


  1. Data-Collection:- Collecting data for training the ML model is a fundamental step in the machine learning pipeline. The predictions made by ML systems can only be as good as the data on which they have been trained.

Source :- https://miro.medium.com/max/786/1*IY6m_GAnQvZJrWrEZTtdPQ.png


The following are some of the problems that can arise in data collection:-