Image Classification: 6 Applications & 4 Best Practices In 2023


Around 1.72 trillion1 photos are taken every year. Many are used to train digital solutions, such as self-driving systems powered by image recognition and computer vision (CV) technologies. However, all of these images would be useless for such digital solutions without image classification – a process that can identify and categorize images based on their visual content.

Image classification is one of the cornerstones of computer vision solutions. However, many executives are still not clear about its applications and best practices, limiting the effectiveness of image classification solutions.

To remedy that, we have curated this article to explore what image classification is and how business leaders can leverage it to develop sophisticated image recognition solutions.

What is image classification?

Image classification is a technique that involves analyzing the content of an image to attach tags or labels to it. This distributes the images into different classes. The purpose of this process is to digitally explain the contents of the image to a machine. Image classification is combined with image localization to enable an object detection system to find objects in images. 

Single-label vs. multi-label classification

Single-label image classification is a traditional image classification problem where each image is associated with only one label or class. For instance, an image of a cat can be labeled as “cat” and nothing else. The task of the classifier is to predict the label for a given image.

On the other hand, multi-label image classification is an extension of the single-label image classification problem, where an image can have multiple labels or classes associated with it. For example, an image of a cat playing in a park can be labeled as “cat” and “park”. The task of the classifier, in this case, is to predict all the relevant labels or classes for a given image.

Multi-label image classification is a more complex problem than single-label image classification since it requires the classifier to identify multiple objects or features in an image and attach them with their corresponding labels.
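To make the distinction concrete, here is a minimal, hypothetical PyTorch sketch (the class names and the 0.5 threshold are illustrative, not from the article): a single-label classifier applies a softmax over all classes and keeps exactly one, while a multi-label classifier applies an independent sigmoid per class and keeps every label above a threshold.

import torch

labels = ["cat", "dog", "park"]
logits = torch.tensor([2.1, -0.3, 0.8])  # raw scores from some classifier

# Single-label: softmax over all classes, keep exactly one
single = labels[torch.softmax(logits, dim=0).argmax().item()]
print(single)  # "cat"

# Multi-label: independent sigmoid per class, keep every label above a threshold
probs = torch.sigmoid(logits)
multi = [name for name, p in zip(labels, probs) if p > 0.5]
print(multi)  # ["cat", "park"]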

Image classification can be done manually or with automated tools.

Manual image classification

Manual image classification is done by viewing each image and applying a label or category based on the data processor’s judgment of the contents of the image. Like many other manual tasks, manual image classification can also be tedious and prone to errors.

Automated image classification

Automated image classification relies on machine learning models trained on labeled example images. Based on this training data, the model learns to recognize and categorize objects within new images and classify them automatically. Such tools are typically built with a supervised classification approach that keeps a human labeler in the loop to provide and verify training labels. Automated data labeling tools usually include image classification features.
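As a rough illustration of what such an automated tool does under the hood, the following hedged sketch classifies a single image with a pretrained torchvision ResNet; the file name "photo.jpg" is a placeholder assumption.

import torch
from PIL import Image
from torchvision import models, transforms

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("photo.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    predicted_class = model(image).argmax(dim=1).item()  # index into the ImageNet classes
print("Predicted class index:", predicted_class)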

Business applications of image classification

1. Autonomous driving systems

Image classification is an essential part of autonomous driving systems. It is used to detect and classify objects in the surroundings of the vehicle, such as other cars, pedestrians, road signs, traffic lights, etc. The autonomous driving system uses this information to guide the vehicle.

2. Manufacturing

In the manufacturing sector, image classification is used to implement computer vision-enabled automated solutions in the following ways:

2.1. Quality control and defect detection

Image classification algorithms can be used to ensure quality control of finished products or parts on a manufacturing line. For example, in vehicle manufacturing, image classification algorithms can inspect car parts for cracks, chips, or other imperfections.

2.2. Sorting and classification

Image classification algorithms can also be used to sort and categorize products based on their characteristics. For instance, in the food and beverage sector, image classification models can scan and sort fruits and vegetables based on their size, shape, and other attributes.

3. Defense

In the defense sector, image classification is commonly used in areas like target identification, surveillance, and threat assessment. The models are made to automatically recognize and categorize objects in pictures and videos taken by satellites or unmanned aerial vehicles (UAVs).

4. Retail/e-commerce

Figure 1. Reasons why people abandon their online shopping carts

5. Healthcare

Image classification is used to leverage computer vision technology in the healthcare and radiology sectors. Medical images like X-rays, MRIs, and CT scans can be automatically analyzed using image classification. This enables medical personnel to diagnose and treat patients more accurately and efficiently. Medical image annotation tools usually include image classification features.

For instance, pneumonia is a lung disease that is difficult to detect early by traditional diagnosis methods. An image classification algorithm trained on a dataset of chest X-ray images can be used to predict and accurately detect2 the presence of pneumonia in new X-rays.
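A hedged sketch of how such a classifier could be set up with transfer learning is shown below; the folder layout "chest_xrays/train" with NORMAL and PNEUMONIA subfolders is an assumed placeholder, not a reference to the study cited above.

import torch
from torch import nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_ds = datasets.ImageFolder("chest_xrays/train", transform=tfm)  # subfolders: NORMAL / PNEUMONIA
loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the head with a 2-class output

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
for images, targets in loader:  # a single pass shown for brevity
    optimizer.zero_grad()
    loss = loss_fn(model(images), targets)
    loss.backward()
    optimizer.step()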

X-ray recognition and classification through deep learning:

6. Security

Image classification can also be used to leverage computer vision in the security sector in the following ways:

Surveillance: It can help in automatically identifying and categorizing objects in surveillance footage. This can help security personnel detect and respond to potential threats more quickly.

Facial recognition: It can also help authorities identify people of interest in crowds. Image classification can help the security sector leverage computer vision technology in areas such as banks, airports, and other crowded places.

Example: Here is how an image classification system categorizes faces in an image:

Then, a computer vision system uses the classified data to identify people in a surveillance system:

Best practices in developing an image classification tool

1. Data collection

Data collection is one of the most important steps in developing an automated image classifier model. Ensuring the quality and diversity level of the training data is crucial since they directly impact the performance of the model. The dataset should be large-scale and representative of the sample of images that cover the complete range of categories to be classified.

For instance, a company developing an image classification tool for fashion retail will require different variations of clothing images to be trained.

2. Data preprocessing

Preprocessing the training data can greatly improve the classification model’s performance. This includes:

Resizing images

Removing noise

Augmenting it for more diversity

For instance, a healthcare organization pre-processes X-ray image data to ensure they are consistent in size and position. They might also apply filters to reduce noise and enhance the key features in the images. This will result in more accurate classifications by the automated tool.
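As a minimal sketch of these steps (assuming OpenCV and a placeholder file name "xray.png"), resizing, denoising, and a simple flip augmentation might look like this:

import cv2

img = cv2.imread("xray.png", cv2.IMREAD_GRAYSCALE)

resized = cv2.resize(img, (224, 224))            # consistent size and position
denoised = cv2.GaussianBlur(resized, (3, 3), 0)  # filter to reduce noise
flipped = cv2.flip(denoised, 1)                  # horizontal flip as a simple augmentation

cv2.imwrite("xray_processed.png", denoised)
cv2.imwrite("xray_augmented.png", flipped)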

3. Model selection

Image classification can be done using a wide variety of machine-learning algorithms. It is important to understand the problem and the pros and cons of each algorithm before choosing the best for the project.

For example, a surveillance company can select a convolutional neural network (CNN) as the model for their image classification tool since it is suitable for identifying objects in visual imagery. 
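For a rough idea of what a CNN classifier looks like in code, here is a minimal PyTorch sketch; the 3×224×224 input size and 10 output classes are illustrative assumptions rather than a prescription for the surveillance use case.

import torch
from torch import nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)

    def forward(self, x):
        x = self.features(x)                 # convolutions extract visual features
        return self.classifier(x.flatten(1))

logits = SmallCNN()(torch.randn(1, 3, 224, 224))  # -> shape [1, 10]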

4. Hyperparameter tuning

The performance of the model can be significantly impacted by tweaking its parameters. It’s important to tune the hyperparameters to find the best-performing model for the project.

For example, a retail company can improve the image classification model’s accuracy and reduce the amount of time required for training by tuning the:

learning rate,

batch size,

and the number of epochs (the number of complete passes over the training data).
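A minimal grid-search sketch over exactly these three hyperparameters might look like the following; train_and_evaluate() is a hypothetical helper that trains the model and returns validation accuracy.

from itertools import product

learning_rates = [1e-2, 1e-3, 1e-4]
batch_sizes = [16, 32, 64]
epoch_counts = [5, 10]

best_params, best_acc = None, -1.0
for lr, bs, epochs in product(learning_rates, batch_sizes, epoch_counts):
    acc = train_and_evaluate(lr=lr, batch_size=bs, epochs=epochs)  # hypothetical helper
    if acc > best_acc:
        best_params, best_acc = (lr, bs, epochs), acc

print("Best (learning rate, batch size, epochs):", best_params, "accuracy:", best_acc)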

To learn more about the different methods and tools of hyperparameter optimization, check out this article.


Shehmir Javaid

Shehmir Javaid is an industry analyst at AIMultiple. He has a background in logistics and supply chain management research and loves learning about innovative technology and sustainability. He completed his MSc in logistics and operations management from Cardiff University UK and Bachelor’s in international business administration From Cardiff Metropolitan University UK.



Learn Mobile Price Prediction Through Four Classification Algorithms

This article was published as a part of the Data Science Blogathon.

Introduction

Mobile phones come in a wide range of prices, features, and specifications. Price estimation and prediction are an important part of consumer strategy: deciding on the correct price is critical to a product's market success. A new product that is about to be launched must be priced so that consumers find it appropriate to buy.

The Problem

The data contains information regarding mobile phone features, specifications, etc., and their price range. These features can be used to predict the price range of a mobile phone.

The data features are as follows:

Battery Power in mAh

Has BlueTooth or not

Microprocessor clock speed

The phone has dual sim support or not

Front Camera Megapixels

Has 4G support or not

Internal Memory in GigaBytes

Mobile Depth in Cm

Weight of Mobile Phone

Number of cores in the processor

Primary Camera Megapixels

Pixel Resolution height

Pixel resolution width

RAM in MB

Mobile screen height in cm

Mobile screen width in cm

Longest time after a single charge

3g or not

Has touch screen or not

Has wifi or not

Methodology

We will proceed with reading the data, and then perform data analysis. The practice of examining data using analytical or statistical methods in order to identify meaningful information is known as data analysis. After data analysis, we will find out the data distribution and data types. We will train 4 classification algorithms to predict the output. We will also compare the outputs. Let us get started with the project implementation.

First, we import the libraries.

import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pylab as plt
%matplotlib inline

Now, we read the data and view an overview of the data.

train_data = pd.read_csv('/kaggle/input/mobile-price-classification/train.csv')
train_data.head()

Output:

Now, we will use the info function to see the type of data in the dataset.

train_data.info()

Output:

RangeIndex: 2000 entries, 0 to 1999
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   battery_power  2000 non-null   int64
 1   blue           2000 non-null   int64
 2   clock_speed    2000 non-null   float64
 3   dual_sim       2000 non-null   int64
 4   fc             2000 non-null   int64
 5   four_g         2000 non-null   int64
 6   int_memory     2000 non-null   int64
 7   m_dep          2000 non-null   float64
 8   mobile_wt      2000 non-null   int64
 9   n_cores        2000 non-null   int64
 10  pc             2000 non-null   int64
 11  px_height      2000 non-null   int64
 12  px_width       2000 non-null   int64
 13  ram            2000 non-null   int64
 14  sc_h           2000 non-null   int64
 15  sc_w           2000 non-null   int64
 16  talk_time      2000 non-null   int64
 17  three_g        2000 non-null   int64
 18  touch_screen   2000 non-null   int64
 19  wifi           2000 non-null   int64
 20  price_range    2000 non-null   int64
dtypes: float64(2), int64(19)
memory usage: 328.2 KB

Now, we remove the data points with an invalid screen width (sc_w equal to 0).

train_data_f = train_data[train_data['sc_w'] != 0]
train_data_f.shape

Output:

(1820, 21)

Let us visualize the number of elements in each class of mobile phones.

# classes
sns.set()
price_plot = train_data_f['price_range'].value_counts().plot(kind='bar')
plt.xlabel('price_range')
plt.ylabel('Count')
plt.show()

Output:

So, there are mobile phones in 4 price ranges. The number of elements is almost similar.

Data Distribution

Let us analyse some data features and see their distribution.

First, we see how the battery mAh is spread.

sns.set(rc={'figure.figsize': (5, 5)})
ax = sns.displot(data=train_data_f["battery_power"])
plt.show()

Output:

Now, we see the count of how many devices have Bluetooth and how many don’t.

sns.set(rc={'figure.figsize': (5, 5)})
ax = sns.displot(data=train_data_f["blue"])
plt.show()

Output:

So, we can see that half the devices have Bluetooth, and half don’t.

Next, we analyse the mobile depth ( in cm).

sns.set(rc={'figure.figsize': (5, 5)})
ax = sns.displot(data=train_data_f["m_dep"])
plt.show()

Output:

A few mobiles are very thin and a few ones are almost a cm thick.

In a similar way, the data distribution can be analysed for all the data features. Implementing that will be very simple.

Let us see if there are any missing values or missing data.

X = train_data_f.drop(['price_range'], axis=1)
y = train_data_f['price_range']

# missing values
X.isna().any()

Output:

battery_power    False
blue             False
clock_speed      False
dual_sim         False
fc               False
four_g           False
int_memory       False
m_dep            False
mobile_wt        False
n_cores          False
pc               False
px_height        False
px_width         False
ram              False
sc_h             False
sc_w             False
talk_time        False
three_g          False
touch_screen     False
wifi             False
dtype: bool

Let us split the data.

# train test split of data
from sklearn.model_selection import train_test_split

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=7)

Now, we define a function for creating a confusion matrix.

# confusion matrix
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

def my_confusion_matrix(y_test, y_pred, plt_title):
    cm = confusion_matrix(y_test, y_pred)
    print(classification_report(y_test, y_pred))
    sns.heatmap(cm, annot=True, fmt='g', cbar=False, cmap='BuPu')
    plt.xlabel('Predicted Values')
    plt.ylabel('Actual Values')
    plt.title(plt_title)
    plt.show()
    return cm

Now, as the function is defined, we can proceed with implementing the classification algorithms.

Random Forest Classifier

A random forest is a supervised machine learning method built from decision tree techniques. This algorithm is used to anticipate behaviour and results in a variety of sectors, including banking and e-commerce.

A random forest is a machine learning approach for solving regression and classification issues. It makes use of ensemble learning, which is a technique that combines multiple classifiers to solve complicated problems.

A random forest method is made up of a large number of decision trees. The random forest algorithm’s ‘forest’ is trained via bagging or bootstrap aggregation. Bagging is a meta-algorithm ensemble that increases the accuracy of machine learning algorithms.

The random forest algorithm determines the outcome based on the predictions of its decision trees: for regression it averages the outputs of the individual trees, and for classification it takes a majority vote. The precision of the outcome generally improves as the number of trees grows.

A random forest system is built on a variety of decision trees. Every decision tree is made up of nodes that represent decisions, leaf nodes, and a root node. The leaf node of each tree represents the decision tree’s final result. The final product is chosen using a majority-voting procedure. In this situation, the output picked by the majority of the decision trees becomes the random forest system’s ultimate output. Let us now implement the random forest algorithm.

First, we build the model.

# building the model
from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier(bootstrap=True, max_depth=7, max_features=15, min_samples_leaf=3,
                             min_samples_split=10, n_estimators=200, random_state=7)

Now, we do the training and prediction.

rfc.fit(X_train, y_train)
y_pred_rfc = rfc.predict(X_valid)

Let us apply the function for the accuracy metrics.

print('Random Forest Classifier Accuracy Score: ', accuracy_score(y_valid, y_pred_rfc))
cm_rfc = my_confusion_matrix(y_valid, y_pred_rfc, 'Random Forest Confusion Matrix')

Output:

Random Forest Classifier Accuracy Score:  0.9093406593406593

              precision    recall  f1-score   support
           0       0.98      0.97      0.97        95
           1       0.90      0.92      0.91        92
           2       0.82      0.86      0.84        86
           3       0.93      0.88      0.90        91
    accuracy                           0.91       364
   macro avg       0.91      0.91      0.91       364
weighted avg       0.91      0.91      0.91       364

So, we can see that the random forest algorithm has good accuracy in prediction.

Naive Bayes

Conditional probability is the foundation of Bayes’ theorem. The conditional probability aids us in assessing the likelihood of something occurring if something else has previously occurred.

Image:  Illustration of how a Gaussian Naive Bayes (GNB) classifier works

Gaussian Naive Bayes is a Naive Bayes variant for continuous data that assumes the features follow a Gaussian (normal) distribution. Naive Bayes itself is a family of supervised machine learning classification algorithms built on Bayes' theorem. It is a simple but powerful classification approach that is particularly useful when the dimensionality of the inputs is high, and it can also be used to solve complex classification problems.

Let us implement the Gaussian NB classifier.

from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()

Now, we perform the training and prediction.

gnb.fit(X_train, y_train)
y_pred_gnb = gnb.predict(X_valid)

Now, we can check the accuracy.

print('Gaussian NB Classifier Accuracy Score: ', accuracy_score(y_valid, y_pred_gnb))
cm_rfc = my_confusion_matrix(y_valid, y_pred_gnb, 'Gaussian NB Confusion Matrix')

Output:

Gaussian NB Classifier Accuracy Score:  0.8461538461538461

              precision    recall  f1-score   support
           0       0.93      0.92      0.92        95
           1       0.79      0.73      0.76        92
           2       0.74      0.80      0.77        86
           3       0.92      0.93      0.93        91
    accuracy                           0.85       364
   macro avg       0.84      0.85      0.84       364
weighted avg       0.85      0.85      0.85       364

We can see that the model is performing well.

KNN Classifier

The K Nearest Neighbor method is a type of supervised learning technique that is used for classification and regression. It’s a flexible approach that may also be used to fill in missing values and resample datasets. K Nearest Neighbor examines K Nearest Neighbors (Data points) to forecast the class or continuous value for a new Datapoint, as the name indicates.

The K-NN method saves all available data and classifies a new data point based on its similarity to the existing data. This implies that fresh data may be quickly sorted into a well-defined category using the K-NN method. The K-NN algorithm is a non-parametric algorithm, which means it makes no assumptions about the underlying data. It’s also known as a lazy learner algorithm since it doesn’t learn from the training set right away; instead, it saves the dataset and performs an action on it when it comes time to classify it.

Let us perform the implementation of the classifier.

from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=3, leaf_size=25)

Now, we train the data and make our predictions.

knn.fit(X_train, y_train)
y_pred_knn = knn.predict(X_valid)

Now, we check the accuracy.

print('KNN Classifier Accuracy Score: ', accuracy_score(y_valid, y_pred_knn))
cm_rfc = my_confusion_matrix(y_valid, y_pred_knn, 'KNN Confusion Matrix')

Output:

KNN Classifier Accuracy Score:  0.9340659340659341

              precision    recall  f1-score   support
           0       0.99      0.98      0.98        95
           1       0.93      0.97      0.95        92
           2       0.87      0.88      0.88        86
           3       0.94      0.90      0.92        91
    accuracy                           0.93       364
   macro avg       0.93      0.93      0.93       364
weighted avg       0.93      0.93      0.93       364

The KNN classifier is quite adept at its task.

SVM Classifier

Support Vector Machine, or SVM, is a prominent Supervised Learning technique that is used for both classification and regression issues. However, it is mostly utilised in Machine Learning for Classification purposes.

The SVM algorithm's purpose is to find the optimal line or decision boundary that divides n-dimensional space into classes, so that new data points can easily be placed in the correct category in the future. This optimal decision boundary is called a hyperplane.

Check this article for more information on SVM.

Let us do the implementation of SVM.

from sklearn import svm

svm_clf = svm.SVC(decision_function_shape='ovo')
svm_clf.fit(X_train, y_train)
y_pred_svm = svm_clf.predict(X_valid)

Now, we check the accuracy.

print('SVM Classifier Accuracy Score: ', accuracy_score(y_valid, y_pred_svm))
cm_rfc = my_confusion_matrix(y_valid, y_pred_svm, 'SVM Confusion Matrix')

Output:

SVM Classifier Accuracy Score:  0.9587912087912088

              precision    recall  f1-score   support
           0       0.98      0.98      0.98        95
           1       0.93      0.97      0.95        92
           2       0.94      0.93      0.94        86
           3       0.99      0.96      0.97        91
    accuracy                           0.96       364
   macro avg       0.96      0.96      0.96       364
weighted avg       0.96      0.96      0.96       364

We can see that the SVM classifier is giving the best accuracy.

Conclusion

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion. 


Remotely Invoke Applications With Powershell

Have you ever been given an application and instructed to run it on various computers and systems, only to realize that it wasn’t built for multiple hosts? After all, some apps are designed to be executed only locally.

While this problem can be perplexing, many IT professionals understand that they can run an app like this with various deployment tools. These tools copy the app to the necessary computers and execute it. Although this method can solve the problem, it can be overkill. 

Fortunately, Microsoft PowerShell provides several ways to invoke applications on remote computers, giving you a quick way to copy the app to a few machines and get it running. 

How to remotely invoke applications with PowerShell

PowerShell offers various ways to execute applications on remote computers. Two methods use Windows Management Instrumentation (WMI), and a third uses PowerShell remoting, the preferred method.

The Create() method

The Win32_Process WMI class is one way to run a process on a remote computer. Similar to how you might use PowerShell to manage IIS (a way to host websites on remote computers), you can use PowerShell to run programs on other computers.

Win32_Process is a WMI class with a static method called Create(), which allows you to pass an EXE to it and run it on the remote computer using the WMICLASS-type accelerator.

WMICLASS is a shortcut to enable access to all of the class’s static properties and methods. Because Create() is a static method, you don’t actually have to initiate a Win32_Process object at all. You can simply call the method with the EXE as the first argument, and it will run.

([WMICLASS]"\\MEMBERSRV1\Root\CIMV2:Win32_Process").create("notepad.exe")

In this instance, you're running the notepad.exe process on the computer MEMBERSRV1. However, this process is not interactive, so you won't see notepad.exe pop up on a logged-in console. Remote process execution is best for applications that don't require interactive input.

Did You Know?

The WMICLASS-type accelerator makes running EXEs on remote computers more streamlined and less prone to error.

Invoke-WmiMethod cmdlet

You can also use the Invoke-WmiMethod cmdlet, which is a less-convoluted process. Invoke-WmiMethod is a more user-friendly way to call static methods such as Create().

Invoke-WmiMethod –ComputerName MEMBERSRV1 -Class win32_process -Name create -ArgumentList “notepad”

This accomplishes the same goal. However, you’re expressing the Win32_Process class and the parameter to Create() — the EXE itself — slightly differently.

FYI

The Invoke-WmiMethod cmdlet makes remote application invocation more manageable, which is helpful for more complex operations.

PowerShell remoting

PowerShell remoting will likely be your preferred option for remotely invoking applications. The previous two methods using WMI depended on remote DCOM being enabled on the computer. This may or may not be a problem, but it can sometimes pose a security risk. 

You can use PowerShell remoting through the Invoke-Command cmdlet to kick off a process on a remote computer. PowerShell remoting communicates over WSMan (WinRM), a newer and more secure protocol than remote DCOM.

To do this, use a combination of two cmdlets: Invoke-Command to enable you to run a command on the remote computer, and Start-Process to execute the process.

Invoke-Command –ComputerName MEMBERSRV1 –ScriptBlock {Start-Process notepad.exe}

Invoke-Command is a PowerShell cmdlet that allows you to execute code on a remote computer as if it were local. This process has a script block parameter to insert any code to run locally on that remote computer. In this instance, you’re using Start-Process, which runs a specific application.

Did You Know?

The Invoke-Command cmdlet does more than remotely launch applications. This versatile scripting tool can also enable you to remotely sync commands with PowerShell.

PowerShell provides numerous ways to invoke processes on remote computers. Start with Invoke-Command/Start-Process to see if that method gives you the results you need. If not, you might need to look into the older methods of using WMI. At least one of these methods will get that process running remotely.

Tip

If the Invoke-Command/Start-Process doesn’t produce the desired results, revert to the WMI method.

The benefits of using PowerShell to access applications

Using PowerShell to invoke applications on remote computers has three primary benefits:

PowerShell remoting saves administrators time and resources. Administrators can execute operations on remote computers without being there in person. This saves time and reduces the need for on-site support.

PowerShell remoting helps you solve problems. Invoking applications remotely means you can resolve problems wherever you are and make any necessary changes to specific applications.

PowerShell remoting makes updates easier. Using PowerShell to access applications makes distributing software and updates simpler, faster and more efficient. Many larger companies use this approach to save time.

Mark Fairlie contributed to this article.

Training An Adapter For Roberta Model For Sequence Classification Task

Introduction

The current trend in NLP includes downloading and fine-tuning pre-trained models with millions or even billions of parameters. However, storing and sharing such large trained models is time-consuming, slow, and expensive. These constraints hinder the development of more multi-purpose and adaptable NLP systems that can learn from, and for, multiple tasks; in this article, we focus on sequence classification with the RoBERTa model. To address this, adapters were proposed: small, lightweight, and parameter-efficient alternatives to full fine-tuning. They are basically small bottleneck layers that can be dynamically added to a pre-trained model for different tasks and languages.

In this article, we will train an adapter for ROBERTa model on the Amazon polarity dataset for sequence classification tasks with the help of adapter-transformers, the AdapterHub adaptation of Hugging Face’s transformers library. Additionally, we will compare the performance of the adapter module to a fully fine-tuned RoBERTa model trained on the same dataset.

By the end of this article, you will have learned the following:

How to train an adapter for the RoBERTa model on the Amazon Polarity dataset for the Sequence Classification task?

How can a trained adapter be used with the Hugging Face pipeline to make quick predictions?

How to extract the adapter from the trained model and save it for later use?

How can the base model’s weights be restored to their original form by deactivating and deleting the adapter?

Push the trained model to the Hugging Face hub for later use. Additionally, we will see the comparison between the adapters and full fine-tuning.

This article was published as a part of the Data Science Blogathon.

Project Description

This project includes training a task adapter for the RoBERTa model on the Amazon polarity dataset for sequence classification tasks, specifically sentiment analysis. To train, we will use the RoBERTa base model from the Hugging Face hub and the AdapterHub adaptation of Hugging Face’s transformers library. Additionally, we will compare the performance of the adapter module to a fully fine-tuned RoBERTa model trained on the same dataset.

What are Adapters?

Adapters are lightweight alternatives to fully fine-tuned pre-trained models. Currently, adapters are implemented as small feedforward neural networks that are inserted between layers of a pre-trained model. They provide a parameter-efficient, computationally efficient, and modular approach to transfer learning. The following image shows an added adapter.

Source: Adapterhub
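To give a feel for the idea, here is a rough PyTorch sketch of a bottleneck adapter with a residual connection; the hidden and bottleneck sizes are illustrative, and this is not the exact AdapterHub implementation.

import torch
from torch import nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size=768, bottleneck_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # down-projection
        self.up = nn.Linear(bottleneck_size, hidden_size)    # up-projection back to hidden size

    def forward(self, hidden_states):
        # residual connection: frozen transformer output plus the small adapter delta
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

out = BottleneckAdapter()(torch.randn(1, 128, 768))  # same shape in and out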

Significance of Adapters in NLP Transfer Learning

The following are some important points regarding the significance of adapters in NLP transfer learning:

Efficient Use of Pretrained Models: Pretrained language models such as BERT, GPT-2, and RoBERTa have been proven effective in various NLP tasks. However, fine-tuning the entire model can be computationally expensive and time-consuming. Adapters allow for more efficient use of these pretrained models by enabling the insertion of task-specific functionality without modifying the original architecture.

Improved Adaptability: Adapters allow for greater flexibility in adapting pretrained models to new tasks. Rather than fine-tuning the entire model, adapters enable selective modification of specific layers, improving model adaptation to new tasks and leading to better performance.

Cost-Effective: Adapters can be trained with fewer data than required for training a full model, reducing the cost of training and improving the model’s scalability.

Reduced Memory Requirements: Since adapters require fewer parameters than a full model, they can be easily added to a pre-existing model without requiring significant additional memory.

Transfer Learning Across Languages: Adapters can also enable knowledge transfer across languages, allowing models to be trained on a source language and then adapted to a target language with minimal additional training. And hence they can also prove to be very effective in low-resource settings.

Overview of the RoBERTa Model

RoBERTa is a large pre-trained language model developed by Facebook AI and released in 2019. It shares the same architecture as the BERT model and is a revised version of BERT with minor adjustments to the key hyperparameters and embeddings.

Except for the output layers, BERT’s pre-training and fine-tuning procedures use the same architecture. The pre-trained model parameters are utilized to initialize models for various downstream tasks, and during fine-tuning, all parameters are adjusted. The following diagram illustrates BERT’s pre-training and fine-tuning procedures. The following figure shows the BERT Architecture.

Source: Arxiv

In contrast, RoBERTa does not employ the next-sentence pretraining objective and utilizes much larger mini-batches and learning rates during training. RoBERTa also adopts a different tokenization scheme, replacing BERT's character-level BPE vocabulary with a byte-level BPE tokenizer (similar to GPT-2). Moreover, RoBERTa uses "dynamic masking," which helps the model learn more robust representations of the input text by forcing it to predict a diverse set of tokens rather than a fixed subset of tokens.

In this article, we will train an adapter for RoBERTa base model for the sequence classification task (more precisely, sentiment analysis). Simply put, a sequence classification task is a task that involves assigning a label or category to a sequence of words or tokens, such as a sentence or document.

Overview of the Dataset

We will use the Amazon Reviews Polarity dataset constructed by Xiang Zhang. This dataset was created by classifying reviews with scores of 1 and 2 as negative and reviews with scores of 4 and 5 as positive. Moreover, the samples with a score of 3 were ignored. Each class has 1,800,000 training samples and 200,000 testing samples.

Training the Adapter for RoBERTa Model on Amazon Polarity Dataset

To start we will begin with installing the libraries:

!pip install -U adapter-transformers datasets

And now, we will load the Amazon Reviews Polarity dataset using the HuggingFace dataset:

from datasets import load_dataset

# Loading the dataset
dataset = load_dataset("amazon_polarity")

Now let’s see what our dataset consists of:

dataset

DatasetDict({
    train: Dataset({
        features: ['label', 'title', 'content'],
        num_rows: 3600000
    })
    test: Dataset({
        features: ['label', 'title', 'content'],
        num_rows: 400000
    })
})

So from the above output, we can see that the Amazon Reviews Polarity dataset consists of 3,600,000 training samples and 400,000 testing samples. Now let’s take a look at what a sample from the train set and test set looks like.

dataset["train"][0]

Output: {‘label’: 1, ‘title’: ‘Stunning even for the ‘non-gamer’, ‘content’: ‘This soundtrack was beautiful! It paints the scenery in your mind so good I would recommend it even to people who hate video game music! I have played the game Chrono Cross, but out of all of the games I have ever played, it has the best music! It backs away and takes a fresher step with great guitars and soulful orchestras. It would impress anyone who cares to listen! ^_^’}

dataset["test"][0]

Output: {‘label’: 1, ‘title’: ‘Great CD’, ‘content’: ‘My lovely Pat has one of the GREAT voices of her generation. I have listened to this CD for YEARS and still LOVE IT. When I’m in a good mood, it makes me feel better. A bad mood just evaporates like sugar in the rain. This CD just oozes LIFE. The vocals are just STUNNING, and the lyrics just kill. One of life’s hidden gems. This is a desert island CD in my book. Why she never made it big is just beyond me. Every time I play this, no matter male or female, EVERYBODY says one thing “Who was that singing ?”‘}

From the output of print(dataset), dataset[“train”][0], and dataset[“test”][0], we can see that the dataset consists of three columns, i.e., “label”, “title”, and “content”. Considering this, we need to drop the column named title since we won’t require this to train the adapter.

# Removing the column "title" from the dataset
dataset = dataset.remove_columns("title")

Let’s check whether the column “title” has been dropped!

dataset

Below is a Screenshot showing the composition of the dataset after dropping the column “title”.

Output:

So clearly, the column “title” has been successfully dropped and no longer exists.

Now we will encode all the dataset samples. For this, we will use RobertaTokenizer and dataset.map() function for encoding the input data. Moreover, we will rename the target column class as “labels” since that is what a transformer model takes. Furthermore, we will use set_format() function to set the dataset format to be compatible with PyTorch.

from transformers import AutoTokenizer, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# Encoding a batch of input data with the help of the tokenizer
def encode_batch(batch):
    return tokenizer(batch["content"], max_length=100, truncation=True, padding="max_length")

dataset = dataset.map(encode_batch, batched=True)

# Renaming the column "label" to "labels"
dataset = dataset.rename_column("label", "labels")

# Setting the dataset format to torch and mentioning the columns we want to format
dataset.set_format(type="torch", columns=["input_ids", "attention_mask", "labels"])

from transformers import RobertaConfig, RobertaModelWithHeads

# Defining the configuration for the model
config = RobertaConfig.from_pretrained("roberta-base", num_labels=2)

# Setting up the model
model = RobertaModelWithHeads.from_pretrained("roberta-base", config=config)

We will now add an adapter with the help of the add_adapter() method. For this, we will pass an adapter name; we passed “amazon_polarity”. Following this, we will also add a matching classification head. Lastly, we will activate the adapter and prediction head using train_adapter().

Basically, train_adapter() method performs two functions majorly:

It freezes all the weights of the pre-trained model such that only the adapter weights are updated during the training.

It also activates the adapter and prediction head to use both in every forward pass.

# Adding an adapter to the RoBERTa model
model.add_adapter("amazon_polarity")

# Adding a matching classification head
model.add_classification_head(
    "amazon_polarity",
    num_labels=2,
    id2label={0: "negative", 1: "positive"}
)

# Activating the adapter
model.train_adapter("amazon_polarity")

We will configure the training process with the help of the TrainingArguments class. Following this, we will also write a function to calculate evaluation accuracy. Lastly, we will pass the arguments to the AdapterTrainer, a class optimized for only training adapters.

import numpy as np
from transformers import TrainingArguments, AdapterTrainer, EvalPrediction

training_args = TrainingArguments(
    learning_rate=3e-4,
    max_steps=80000,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    logging_steps=1000,
    output_dir="adapter-roberta-base-amazon-polarity",
    overwrite_output_dir=True,
    remove_unused_columns=False,
)

def compute_accuracy(eval_pred):
    preds = np.argmax(eval_pred.predictions, axis=1)
    return {"acc": (preds == eval_pred.label_ids).mean()}

trainer = AdapterTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    compute_metrics=compute_accuracy,
)

Let’s start training now!

trainer.train()

TrainOutput(global_step=80000, training_loss=0.13133217878341674, metrics={‘train_runtime’: 7884.1676, ‘train_samples_per_second’: 324.701, ‘train_steps_per_second’: 10.147, ‘total_flos’: 1.33836672e+17, ‘train_loss’: 0.13133217878341674, ‘epoch’: 0.71})

Evaluating the Trained Model

Now let’s evaluate the adapter’s performance on the dataset’s test split.

trainer.evaluate()

We can use the trained model with the help of the Hugging Face pipeline to make quick predictions.

from transformers import TextClassificationPipeline

classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer, device=training_args.device.index)
classifier("I came across a lot of reviews stating that it is the best book out there.")

Output: [{‘label’: ‘positive’, ‘score’: 0.5589291453361511}]

Extracting and Saving the Adapter

Ultimately, we can also extract the adapter from the trained model and save it for later use. save_adapter() creates a file for saving adapter weights and adapter configuration.

model.save_adapter("./final_adapter", "amazon_polarity")

Fig. 6 Image showing the saved adapter weights and configuration

!ls -lh final_adapter

Fig. 7 The files present in the final_adapter folder

Deactivating and Deleting the Adapter

Once we are done working with the adapters, and they are no longer needed, we can restore the weights of the base model in its original form by deactivating and deleting the adapter.

# Deactivating the adapter
model.set_active_adapters(None)

# Deleting the added adapter
model.delete_adapter("amazon_polarity")

Pushing the Trained Model to the Hub

We can also push the trained model to the Hugging Face hub for later use. For this, we will import the libraries and install git, and then we will push the model to the hub.

from huggingface_hub import notebook_login

notebook_login()

!apt install git-lfs
!git config --global credential.helper store

trainer.push_to_hub()

Comparison of Adapter with Full Fine-tuning

Since fine-tuning an adapter updates only the adapter parameters while the parameters of the pre-trained model remain frozen, it greatly reduces the training time, computational cost, and memory footprint compared to full fine-tuning.

The adapter module can be easily integrated with the pre-trained models to adapt them to new tasks without the need to retrain the whole model. Notably, the size of the file, which contains adapter weights, is just 3.5 MB. Both of these aspects highlight its potential for ease of reusability for multiple tasks.

To draw further conclusions, I trained the adapter and RoBERTa model on a smaller dataset, i.e., “Rotten Tomatoes”. I was pleasantly surprised that adapters scored better than the full fine-tuned model. Notably, after training the adapter for around 113 epochs, the eval_acc was 88.93%, and the model had started to overfit. On the other hand, when the RoBERTa model was trained for the same number of epochs, the eval_acc was 50%, and the train_loss and eval_loss were around 0.693, and these were still going down. Regardless, to draw a more fair and concrete conclusion, a lot more experiments need to be conducted.

Applications of the Trained Adapter

Following are some of the potential applications of an Adapter trained on the Amazon Polarity dataset for sequence classification tasks:

Customer Service: The trained adapter can be used to automatically classify the raised customer support tickets into positive or negative, allowing the support team to address and prioritize customer complaints more effectively and timely.

Product/Service Reviews: The trained adapter can automatically classify product/service reviews as positive or negative, helping businesses quickly gauge customer satisfaction with their offerings.

Market Research: The trained adapter can also be used for analyzing sentiment in customer feedback surveys, market research forms, etc., which can be further utilized to draw insights about customer sentiment toward their product/service/brand.

Brand Monitoring: The trained model can be used to monitor online mentions of a brand or product and classify them by sentiment, allowing businesses to track their online reputation and respond to negative feedback or complaints.

Pros of the Adapters

Efficient Fine-tuning: Adapters can be fine-tuned on new tasks with fewer parameters than training an entire model from scratch.

Modular: Adapters are modular/interchangeable; they can be easily swapped or added to a pre-trained model.

Domain-specific Adaptations: Adapters can be fine-tuned on domain-specific tasks, resulting in better performance at those tasks.

Incremental Learning: Adapters can be used for incremental learning, allowing for efficient continuous learning and adapting the pre-trained model to new data.

Faster Training: Adapters can be trained faster than training the entire model from scratch, which helps in faster experimentation and prototyping.

Smaller Size: Adapters are significantly smaller than a fine-tuned model, allowing for faster inference and less memory consumption.

Cons of the Adapters

Reduced Performance: Since an additional adapter layer is added on top of a pre-trained model, this can add computational overhead to the model and affect the model’s performance regarding inference speed and accuracy.

Increased Complexity: Again, as the adapters are added to a pre-trained model, the model must be modified to accept inputs and outputs from the adapter layer. This can, in turn, make the overall architecture of the model more complex.

Limited Expressiveness: Adapters are task-specific and may not be as expressive as a fully-trained model fine-tuned for certain tasks, especially for complex tasks or those requiring domain-specific knowledge.

Limited Transferability: Adapters are trained on limited task-specific data, which may not enable them to generalize well to new tasks or domains, reducing their usefulness when the task or domain differs from the one the adapter was trained on.

Potential for Overfitting: The experiments we performed in this article itself showed that the adapter started to overfit after certain steps, which can lead to poor performance on a downstream task.

Future Research Directions

Exploring Different Adapter Architectures: Adapters are currently implemented as small feedforward neural networks inserted between layers of a pre-trained model. There is huge potential for exploring different architectures for adapters that may offer better performance for specific tasks. This could include investigating new methods for parameter sharing, designing adapters with multiple layers, exploring different activation functions, incorporating attention, etc.

Studying the Impact of Adapter Size: Larger adapters have been shown to work better than smaller ones, but there is a caveat: the "largeness" of the adapter affects inference speed and computational cost. Hence, further research could explore the optimal adapter size for specific tasks.

Investigating Multi-Layer Adapters: Currently, adapters are added to a single layer of a pre-trained model. There is a scope for exploring multi-layer adapters that can adapt multiple layers of a model for a given task.

Adapting to Other Modalities: Although adapters have been developed, studied, and tested primarily in the context of NLP, there is a scope for studying their use for other modalities like image, audio processing, etc.

Improving Efficiency and Scalability: The efficiency and scalability of adapter training could be improved much more than it currently is.

Multi-domain Adaptation and Multi-task Learning: Adapters have been shown to adapt to new domains and tasks quickly. Future research can help develop adapters that can simultaneously adapt to multiple domains.

Compression and Pruning with Adapters: The efficiency of the adapters can be further increased by developing methods for compressing or pruning adapters while maintaining their effectiveness.

Adapters for Reinforcement Learning: Investigating the use of adapters for reinforcement learning can enable agents to learn more quickly and effectively in complex environments.

Conclusion

This article presents how we can train an adapter model to alter the weights of a given pre-trained model based on the task at hand. And we also saw that once the task is complete, we can easily restore the weights of the base model in its original form by deactivating and deleting the adapter.

To summarize, the key takeaways from this article are:

Adapters are small bottleneck layers that can be dynamically added to a pre-trained model based on different tasks and languages.

We trained an adapter for the RoBERTa model on the Amazon polarity dataset for the sentiment classification task with the help of adapter-transformers, the AdapterHub adaptation of HuggingFace’s transformers library.

train_adapter() method freezes all the weights of the pre-trained model such that only the adapter weights are updated during the training. It also activates the adapter and prediction head to use both in every forward pass.

The adapter from the trained model can be extracted and saved for later use. save_adapter() creates a file for saving adapter weights and adapter configuration.

When the adapter is not needed, we can restore the weights of the base model in its original form by deactivating and deleting the adapter.

Adapters seemed to perform better than the fully fine-tuned RoBERTa model, but, to have a concrete conclusion, more experiments must be conducted.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion. 


Novel Ai Image Generation Guide


In today’s digital age, visual content plays a crucial role in capturing attention and conveying messages effectively. Whether you are an artist, designer, marketer, or content creator, having the ability to generate compelling images is a valuable skill. This is where NovelAI steps in, offering a user-friendly platform that harnesses the power of artificial intelligence to generate stunning visuals that match your creative vision.

To embark on your image generation journey with NovelAI, the first step is to subscribe to their platform. By becoming a NovelAI member, you gain access to their extensive suite of tools and features, including the highly acclaimed image generation feature. Subscribe today and unlock the full potential of your creative prowess.

Once you have subscribed to NovelAI, it’s time to dive into the world of image generation. Start by figuring out your prompt—a concise description of the image you envision. Whether it’s a majestic landscape, a mythical creature, or a futuristic cityscape, clearly define your prompt to guide the AI in generating the desired visuals.

NovelAI offers the flexibility to pick your favorite prompts and experiment with various editing options. Tweak, refine, and enhance your prompts to achieve the perfect balance between imagination and reality. Let your creativity run wild as you explore the possibilities that NovelAI has to offer.

To fine-tune your image generation process, NovelAI provides you with the option to adjust your text prompt and generation settings. The Edit Image canvas allows you to make precise modifications, ensuring that the generated images align with your creative vision.

Experiment with different parameters such as strength, noise, and resolution aspects to refine your results further. Refocus your text prompt to emphasize specific visual characteristics or let the AI interpret your words and create stunning compositions that surpass your expectations.

In the realm of image generation, NovelAI empowers users to define the visual characteristics of their creations in two distinct ways. You can either use tags to specify the desired attributes of your character or composition, or you can let the AI interpret your words and generate visuals based on its understanding.

By using tags such as {detailed}, {ornate}, {realistic}, or {photorealistic}, you can add intricate details to objects or clothing and steer the image away from a flat 2D anime style. These tags provide you with granular control over the final output, allowing you to bring your creative vision to life.

NovelAI’s cutting-edge technology is driven by their custom NovelAI Diffusion Models, built on the foundation of Stable Diffusion. This unique approach ensures the generation of high-quality images with remarkable realism and artistic appeal.


After generating the initial set of images, NovelAI offers a range of powerful image editing and refinement tools to further enhance and customize your visuals. These tools allow you to fine-tune various aspects of the image, such as color, lighting, composition, and more.

With NovelAI’s intuitive interface, you can easily navigate through the editing tools and make adjustments in real-time. Experiment with different filters, effects, and adjustments to add your personal touch and make the images truly unique.

Through this collaborative process, NovelAI becomes a creative partner that evolves alongside your artistic vision, continually improving its ability to generate images that align with your unique style and preferences.

Once you are satisfied with the generated and refined images, it’s time to save and export your masterpieces. NovelAI allows you to download the images in high-resolution formats, ensuring that you can showcase your artwork in all its glory.

Whether you plan to use the images for personal projects, commercial endeavors, or sharing on social media, NovelAI provides the flexibility to export your creations in various file formats, including JPEG, PNG, and TIFF.

Q: Can I use the images generated by NovelAI for commercial purposes?

A: Yes, you can use the images generated by NovelAI for both personal and commercial purposes. However, it’s always a good practice to review and comply with NovelAI’s terms of service to ensure you are utilizing the images within the allowed guidelines.

Q: How long does it take to generate images with NovelAI?

A: The generation time for images with NovelAI may vary depending on factors such as complexity, resolution, and the number of iterations. Simple images can be generated within seconds, while more complex and high-resolution images may take a few minutes. The platform provides estimated generation times for each image, allowing you to plan your creative process accordingly.

Q: Can I collaborate with other artists or creators on NovelAI?

A: Currently, NovelAI focuses on providing individual creative experiences. While you cannot directly collaborate with other users within the platform, you can certainly share your generated images with fellow artists and collaborate outside of NovelAI using the exported files.

Q: Can I adjust the style or artistic elements of the generated images?

A: Yes, NovelAI offers a range of editing and refinement tools that allow you to adjust the style and artistic elements of the generated images. You can experiment with various filters, effects, and adjustments to achieve your desired look and make the images truly unique.


Beginner’s Guide To Image Gradient

An image gradient is the directional change in the intensity of an image. The gradient helps us measure how the image changes, and, based on sharp changes in the intensity levels, it detects the presence of an edge. We will dive deeper into this by manually computing the gradient in a moment.

Why do we need an image gradient?

Image gradient is used to extract information from an image. It is one of the fundamental building blocks in image processing and edge detection. The main application of image gradient is in edge detection. Many algorithms, such as Canny Edge Detection, use image gradients for detecting edges.

Mathematical Calculation of Image gradients

Enough talking about gradients; let's now look at how we compute gradients manually. Let's take a 3*3 image and try to find an edge using an image gradient. We will start by taking a center pixel P(x,y) around which we want to detect the edge. We have 4 main neighbors of the center pixel, which are:

(i) P(x-1,y) left pixel

(ii) P(x+1,y) right pixel

(iii) P(x,y-1) top pixel

(iv) P(x,y+1) bottom pixel

We will subtract the pixels opposite to each other, i.e., Pbottom – Ptop and Pright – Pleft, which gives us the change in intensity, or the contrast, between the opposite pixels.

Change of intensity in the X direction is given by:

Gradient in X direction = PR - PL

Change of intensity in the Y direction is given by:

Gradient in Y direction = PB - PT

Gradient for the image function is given by:

𝛥I = [𝛿I/𝛿x, 𝛿I/𝛿y]

Let us find out the gradient for the given image function:

We can see from the image above that there is a change in intensity levels only in the horizontal direction and no change in the y direction. Let’s try to replicate the above image in a 3*3 image, creating it manually-

Let us now find out the change in intensity level for the image above

GX = PR - PL
Gy = PB - PT

GX = 0 - 255 = -255
Gy = 255 - 255 = 0

𝛥I = [ -255, 0]

Let us take another image to understand the process clearly.

Let us now try to replicate this image using a grid system and create a similar 3 * 3 image.

Now we can see that there is no change in the horizontal direction of the image

GX = PR - PL , Gy = PB - PT

GX = 255 - 255 = 0
Gy = 0 - 255 = -255

𝛥I = [0, -255]
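As a quick sanity check, here is a small NumPy sketch that reproduces these neighbour differences for the horizontal-edge patch above (the 3*3 array simply mirrors that illustration, bright rows on top and a dark row at the bottom):

import numpy as np

patch = np.array([[255, 255, 255],
                  [255, 255, 255],
                  [  0,   0,   0]])

cy, cx = 1, 1  # centre pixel
gx = int(patch[cy, cx + 1]) - int(patch[cy, cx - 1])  # right - left
gy = int(patch[cy + 1, cx]) - int(patch[cy - 1, cx])  # bottom - top

print([gx, gy])  # -> [0, -255]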

But what if there is a change in the intensity level in both directions of the image? Let us take an example in which the image intensity changes in both directions.

Let us now try replicating this image using a grid system and create a similar 3 * 3 image.

GX = PR - PL , Gy = PB - PT

GX = 0 - 255 = -255
Gy = 0 - 255 = -255

𝛥I = [ -255, -255]

Now that we have found the gradient values, let me introduce you to two new terms:

Gradient magnitude

Gradient orientation

Gradient magnitude represents the strength of the change in the intensity level of the image. It is calculated by the given formula:

Gradient Magnitude: √((change in x)² +(change in Y)²)

The higher the Gradient magnitude, the stronger the change in the image intensity

Gradient Orientation represents the direction of the change of intensity levels in the image. We can find out gradient orientation by the formula given below:

Gradient Orientation: tan⁻¹( (𝛿I/𝛿y) / (𝛿I/𝛿x)) * (180/𝝅)

Overview of Filters

We have learned to calculate gradients manually, but we can’t do that manually each time, especially with large images. We can find out the gradient of any image by convoluting a filter over the image. To find the orientation of the edge, we have to find the gradient in both X and Y directions and then find the resultant of both to get the very edge.

Different filters or kernels can be used for finding gradients, i.e., detecting edges.

3 filters that we will be working on in this article are

Roberts filter

Prewitt filter

Sobel filter

All of these filters can be used for edge detection; they are similar to each other but differ in some properties, and each provides separate kernels for detecting horizontal and vertical edges.

They differ in terms of their kernel values, orientation, and size.

Roberts Filter

Suppose we have this 4*4 image

Let us look at the computation

The gradient in x-direction =

Gx = 100*1 + 200*0 + 150*0 - 35*1
Gx = 65

The gradient in y direction =

Gy = 100*0 + 200*1 - 150*1 + 35*0
Gy = 50

Now that we have found out both these values, let us calculate gradient strength and gradient orientation.

Gradient magnitude = √((Gx)² + (Gy)²) = √((65)² + (50)²) = √6725 ≅ 82

We can use the arctan2 function of NumPy to find the tan-1 in order to find the gradient orientation

Gradient Orientation = np.arctan2(Gy, Gx) * (180/𝝅) = 37.5685

Prewitt Filter

Prewitt filter is a 3 * 3 filter and it is more sensitive to vertical and horizontal edges as compared to the Sobel filter. It detects two types of edges – vertical and horizontal. Edges are calculated by using the difference between corresponding pixel intensities of an image.

A working example of Prewitt filter

Suppose we have the same 4*4 image as earlier

Let us look at the computation

The gradient in x direction =

Gx = 100*(-1) + 200*0 + 100*1 + 150*(-1) + 35*0 + 100*1 + 50*(-1) + 100*0 + 200*1
Gx = 100

The gradient in y direction =

Gy = 100*1 + 200*1 + 200*1 + 150*0 + 35*0 + 100*0 + 50*(-1) + 100*(-1) + 200*(-1)
Gy = 150

Now that we have found both these values let us calculate gradient strength and gradient orientation.

Gradient magnitude = √((Gx)² + (Gy)²) = √((100)² + (150)²) = √32500 ≅ 180

We will use the arctan2 function of NumPy to find the gradient orientation

Gradient Orientation = np.arctan2(Gy, Gx) * (180/𝝅) = 56.3099

Sobel Filter

Sobel filter is the same as the Prewitt filter, and just the center 2 values are changed from 1 to 2 and -1 to -2 in both the filters used for horizontal and vertical edge detection.

A working example of Sobel filter

Suppose we have the same 4*4 image as earlier

Let us look at the computation

The gradient in x direction =

Gx = 100*(-1) + 200*0 + 100*1 + 150*(-2) + 35*0 + 100*2 + 50*(-1) + 100*0 + 200*1
Gx = 50

The gradient in y direction =

Gy = 100*1 + 200*2 + 100*1 + 150*0 + 35*0 + 100*0 + 50*(-1) + 100*(-2) + 200*(-1)
Gy = 150

Now that we have found out both these values, let us calculate gradient strength and gradient orientation.

Gradient magnitude = √((Gx)² + (Gy)²) = √((50)² + (150)²) = √25000 ≅ 158

Using the arctan2 function of NumPy to find the gradient orientation

Gradient Orientation = np.arctan2(Gy, Gx) * (180/𝝅) = 71.5650

Implementation using OpenCV

We will perform the program on a very famous image known as Lenna.

Let us start by installing the OpenCV package

# installing opencv
!pip install opencv-python

After we have installed the package, let us import the package and other libraries

Using Roberts filter

Python Code:
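The original snippet did not survive here, so below is a minimal sketch consistent with the Prewitt and Sobel code that follows; the image file name "lenna.png" and the exact Roberts cross kernels are assumptions.

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread("lenna.png")
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Roberts cross kernels for the two diagonal directions
kernelx = np.array([[1, 0], [0, -1]])
kernely = np.array([[0, 1], [-1, 0]])

img_robertsx = cv2.filter2D(gray_img, -1, kernelx)
img_robertsy = cv2.filter2D(gray_img, -1, kernely)

# Root of the squared sum of both directions, then display
roberts = np.hypot(img_robertsx, img_robertsy)
plt.imshow(roberts.astype('int'), cmap='gray')
plt.show()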



Using Prewitt Filter

# Converting image to grayscale
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Creating Prewitt filter
kernelx = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]])
kernely = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])

# Applying the filter to the image in both x and y directions
img_prewittx = cv2.filter2D(img, -1, kernelx)
img_prewitty = cv2.filter2D(img, -1, kernely)

# Taking the root of the squared sum (np.hypot) of both directions and displaying the result
prewitt = np.hypot(img_prewitty, img_prewittx)
prewitt = prewitt[:, :, 0]
prewitt = prewitt.astype('int')
plt.imshow(prewitt, cmap='gray')

OUTPUT:

Using Sobel Filter

gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
kernelx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
kernely = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])
img_x = cv2.filter2D(gray_img, -1, kernelx)
img_y = cv2.filter2D(gray_img, -1, kernely)

# Taking the root of the squared sum and displaying the result
new = np.hypot(img_x, img_y)
plt.imshow(new.astype('int'), cmap='gray')

OUTPUT:

Conclusion

This article taught us the basics of the image gradient and its application in edge detection. The image gradient is one of the fundamental building blocks of image processing: it is the directional change in the intensity of the image, and its main application is edge detection. A sharp change in intensity can indicate the boundary of an object. We can compute the gradient, its magnitude, and its orientation manually, but in practice we usually use filters, which come in many kinds for different results and purposes. The filters discussed in this article are the Roberts filter, the Prewitt filter, and the Sobel filter. We implemented the code in OpenCV, using all three filters to compute gradients and, eventually, find the edges.

Some key points to be noted:

Image gradient is the building block of any edge detection algorithm.

We can manually find out the image gradient and the strength and orientation of the gradient.

We learned how to find the gradient and detect edges using different filters coded using OpenCV.
