Anomaly Detection in Wine Quality using Machine Learning, Slides of Computer Science

This study aims to develop a method for detecting anomalies in wine quality using machine learning techniques. The researchers preprocessed a dataset containing physicochemical properties of different types of wine and their quality ratings, then applied the isolation forest algorithm to identify outliers. The results showed that the proposed method is effective in identifying anomalous instances.

Typology: Slides

2023/2024

Uploaded on 03/29/2024

gandikota-venkata-naga-hema-kiranma

WINE QUALITY ANOMALY DETECTION
Presentation By:
G.V.N Hema Kiranmai
S. Varshini
P. Lavanya
J. Yaswanth

ABSTRACT

This study aims to develop a method for detecting anomalies in wine quality using machine learning techniques. The dataset consists of physicochemical properties of different types of wine, along with quality ratings provided by experts, and is preprocessed to make it suitable for anomaly detection.

The proposed method involves several steps. First, the dataset is preprocessed by handling missing values and scaling the features so that they share the same range. Feature selection is then performed to identify the features that are most informative for detecting anomalies. Next, the isolation forest algorithm is applied to the preprocessed dataset to detect outliers. Isolation forest is well suited to anomaly detection because it uses a decision tree-based approach to isolate anomalies in the dataset. The results of the experiment show that the proposed method is effective in identifying anomalous instances in the dataset.
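The tree-based isolation idea described in the abstract can be illustrated with a simplified, from-scratch sketch. This is not the implementation used in the study (the slides do not show it); in practice a library class such as scikit-learn's `IsolationForest` would normally be used. The function names, the two-feature toy data, and the parameter defaults below are assumptions chosen for illustration. Points that are separated from the rest after only a few random splits receive scores closer to 1.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # a constant feature cannot be split
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Number of splits needed to reach a leaf, plus a correction for unsplit leaves."""
    if "size" in node:
        return depth + average_path_length(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score in (0, 1) per row; higher means easier to isolate."""
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(rows, size), 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(size)
    return [
        2.0 ** (-sum(path_length(row, tree) for tree in trees) / (n_trees * norm))
        for row in rows
    ]


# Synthetic toy data (NOT from the wine dataset): 30 clustered points and one
# deliberately distant point, as if two features had already been scaled to [0, 1].
data_rng = random.Random(42)
wines = [(data_rng.uniform(0.0, 0.1), data_rng.uniform(0.0, 0.1)) for _ in range(30)]
wines.append((1.0, 1.0))

scores = anomaly_scores(wines)
most_anomalous = max(range(len(wines)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 3))
```

The distant point is usually cut off by the very first random split, so its average path is much shorter than that of the clustered points, which is what drives its higher score.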

OBJECTIVES

The objective of wine quality detection is to develop a model or system that can accurately assess the quality of wine from measurable factors such as chemical composition, sensory characteristics, and consumer preferences. Detecting wine quality helps winemakers optimize their production processes and ensure that their products meet the desired quality standards. It also helps consumers make informed decisions about which wines to purchase. Furthermore, it can support the development of new winemaking techniques and technologies that improve wine quality and benefit the wine industry as a whole.

PROPOSED WORK

The proposed work for wine quality detection and prediction involves developing and applying analytical models that predict wine quality from measurable factors. The main steps are:

  • Data collection: collecting data on the measurable factors that affect wine quality, such as chemical composition, sensory attributes, and consumer preferences.
  • Data preprocessing: cleaning, transforming, and integrating the collected data so that it is ready for analysis.
  • Feature selection: identifying the features or variables that have the most significant impact on wine quality.
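The preprocessing and feature selection steps above can be sketched as follows. The slides do not say which imputation, scaling, or selection method was used, so this sketch assumes median imputation, min-max scaling, and ranking features by absolute Pearson correlation with the quality rating. The column names and values are made-up toy data, not records from the actual dataset.

```python
from statistics import mean, median


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def min_max_scale(column):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


# Illustrative toy columns (assumed names and values).
columns = {
    "alcohol": [9.4, 9.8, None, 11.2, 12.5, 10.1],
    "ph": [3.51, 3.20, 3.26, 3.16, 3.30, 3.40],
}
quality = [5, 5, 5, 6, 7, 5]

# Preprocess every column, then rank features by strength of association with quality.
prepared = {name: min_max_scale(impute_median(col)) for name, col in columns.items()}
ranking = sorted(
    prepared,
    key=lambda name: abs(pearson(prepared[name], quality)),
    reverse=True,
)
print(ranking)
```

Correlation with the quality rating is only one possible selection criterion; it suits quality prediction, whereas an unsupervised anomaly detector might instead rank features by variance or domain relevance.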

METHODOLOGY

  1. IQR (Interquartile Range):

The interquartile range (IQR) is a statistical measure of the spread of a dataset. It is defined as the range between the first quartile (Q1) and the third quartile (Q3), which contains the middle 50% of the data points, and is calculated by subtracting Q1 from Q3: IQR = Q3 - Q1. The IQR is a robust measure of spread because it is largely unaffected by extreme values or outliers in the dataset.
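As a minimal sketch of how the IQR can be turned into an outlier test, the example below flags values that fall outside the fences Q1 - 1.5 × IQR and Q3 + 1.5 × IQR. The 1.5 multiplier is a common convention and an assumption here, since the slides do not state the threshold that was used. The function name and the pH-like sample values are illustrative.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Toy measurements (assumed, not taken from the dataset).
ph_values = [3.0, 3.1, 3.2, 3.2, 3.3, 3.3, 3.4, 3.5, 4.9]
outliers = iqr_outliers(ph_values)
print(outliers)
```

Because the quartiles ignore the tails of the data, a single extreme value barely moves the fences, which is why this test remains usable even when outliers are present.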

  2. Z-score:

The Z-score, also known as the standard score, indicates how many standard deviations a data point lies from the mean of the dataset. It is calculated by subtracting the mean from the data point and dividing the result by the standard deviation: Z = (x - mean) / standard deviation.
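The Z-score formula can be applied to outlier detection as in the sketch below, which flags values whose absolute Z-score exceeds a threshold. The threshold of 3 and the use of the population standard deviation are assumptions (both are common choices); the slides do not specify either. The sample values are illustrative.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Compute Z = (x - mean) / standard deviation for every value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def z_score_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Toy measurements (assumed, not taken from the dataset).
ph_values = [3.2, 3.3, 3.4, 3.3, 3.2, 3.4, 3.3, 3.3, 3.2, 3.4, 3.3, 3.3, 3.2, 3.4, 4.9]
flagged = z_score_outliers(ph_values)
print(flagged)
```

Unlike the IQR, the mean and standard deviation are themselves pulled by extreme values, so in very small samples a genuine outlier may not reach the threshold.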

OUTPUTS

CONCLUSION

Wine quality anomaly detection is an important task in the wine industry, as it helps to identify faulty or defective wine batches before they are released to consumers. In this report, we discussed various machine learning algorithms that can be used for wine quality anomaly detection, including SVM, Random Forest, Decision Tree, and KNN.

We also discussed the Wine Quality dataset, a popular benchmark dataset for wine quality prediction. It contains physicochemical properties of red and white wine variants, along with quality ratings based on sensory data from human tasters, and can be used to train machine learning models for wine quality anomaly detection.

Anomalies in wine quality can be detected with supervised learning algorithms, unsupervised learning algorithms, and statistical techniques. Unsupervised algorithms, such as DBSCAN and Isolation Forest, detect anomalies without the need for labeled data. Supervised algorithms, such as SVM and Random Forest, predict the quality of a wine from its physicochemical properties and identify outliers that deviate significantly from the predicted values.