







I performed exploratory data analysis (EDA) for loan status prediction and used machine learning to predict loan approval status. I applied several classification algorithms and compared their performance.
Typology: Study Guides, Projects, Research
1.1 Overview
1.2 Purpose
3 LITERATURE SURVEY
4 THEORETICAL ANALYSIS
3.1 Block diagram
3.2 Hardware / Software designing
5 FLOWCHART
6 RESULT
7 ADVANTAGES & DISADVANTAGES
8 APPLICATIONS
9 CONCLUSION
10 FUTURE SCOPE
11 BIBLIOGRAPHY
APPENDIX
A. Source code
In recent years, ML methods have become popular because they allow researchers to improve the accuracy of loan prediction, and they are used in various financial applications. ML methods have been used to increase the prediction accuracy of loan accounting systems, using data derived from literature sources. Classification models are commonly used to predict the outcome of a loan application. These models also show how loan status depends on other factors, such as income.
In this study, ML classification methods were compared to predict the effect of these factors on loan status. The samples were prepared by accounting for seven simultaneously controllable effect variables in the laboratory. The study aimed to determine the most
Block diagram: Loan Status Prediction
Data Collection
Data Preprocessing: Import libraries, Import data set, Visualization, Handling null values, Label encoding, One Hot encoding, Feature Scaling, Splitting Data
Model Building
Application Building: Create html file, Build python code
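The One Hot encoding, Feature Scaling, and Splitting Data steps named in the block diagram might look like the following minimal sketch. The toy values are made up for illustration, and splitting before scaling is a choice of this sketch (so the scaler is fitted on training rows only), not something the report confirms.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy data standing in for the loan data set (all values are made up).
df = pd.DataFrame({
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban",
                      "Rural", "Urban", "Semiurban", "Rural"],
    "LoanAmount": [120.0, 66.0, 141.0, 95.0, 158.0, 110.0, 70.0, 200.0],
    "Loan_Status": [1, 0, 1, 1, 0, 1, 0, 0],
})

# One-hot encoding: one 0/1 column per Property_Area category.
X = pd.get_dummies(df[["Property_Area", "LoanAmount"]],
                   columns=["Property_Area"], dtype=float)
y = df["Loan_Status"]

# Splitting: hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Feature scaling: fit on the training rows, then reuse the same
# mean and standard deviation for the test rows.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split alone keeps information about the test rows out of the preprocessing step.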
Python, Python web frameworks, Python for data analysis, Python for data visualization, data pre-processing techniques, machine learning, regression algorithms
Advantages: Using machine learning to predict loan status saves time and improves accuracy, so a close approximation of the outcome can be obtained easily. It is more trustworthy and cost-effective. It also reduces the manpower needed to assess loan applications in unfamiliar situations.
Disadvantages: There is about a 3% chance that the model will not predict the approximate value, which can be troublesome in that situation.
The model can predict loan approval status from the inputs provided, and it can be deployed on a website.
In this study, a prediction model for loan status was built with a Logistic Regression classifier. A total of 614 samples collected from the experimental test were used to develop the model. The following conclusions can be drawn: Compared with all the other machine learning models, the Logistic Regression classifier was the best fit for this data. It gave the highest accuracy when evaluated with accuracy_score and the confusion matrix. The maximum accuracy obtained is 87.09%.
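The evaluation described here, accuracy_score together with a confusion matrix, can be sketched as follows. The data is synthetic (generated with make_classification and sized at 614 rows only to mirror the sample count), so the printed accuracy says nothing about the 87.09% reported for the real data; the feature count, split ratio, and random seeds are assumptions of this sketch.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data: 614 rows, 8 random features.
X, y = make_classification(n_samples=614, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_test, y_pred)

# Accuracy is the sum of the diagonal (correct predictions)
# divided by the total number of test samples.
accuracy_from_cm = cm.trace() / cm.sum()

print(f"accuracy = {accuracy:.4f}")
print(cm)
```

The confusion matrix adds what a single accuracy figure hides: how the errors divide between wrongly approved and wrongly rejected applications.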
This model can predict the outcome for many different inputs within seconds. It will save the banking sector and its employees a lot of time, and it supports the growth of the banking sector.
Books
Hastie, Friedman, and Tibshirani, The Elements of Statistical Learning, 2001.
Bishop, Pattern Recognition and Machine Learning, 2006.
Ripley, Pattern Recognition and Neural Networks, 1996.
Duda, Hart, and Stork, Pattern Classification, 2nd Ed., 2002.
Tan, Steinbach, and Kumar, Introduction to Data Mining, Addison-Wesley, 2005.
Data repositories
Kaggle.com
Algorithms
Thesmartbridgeteachable.com
Importing libraries:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

Importing data:
train_data = pd.read_csv('train.csv')

Visualization:
# Correlate numeric columns only; text columns raise an error in recent pandas versions.
sns.heatmap(train_data.corr(numeric_only=True), annot=True)
sns.pairplot(train_data)
sns.boxplot(x="Property_Area", y="LoanAmount", data=train_data)

Handling null values:
train_data.isnull().any()
# Fill categorical columns with the mode and numeric columns with the mean.
# Assigning the result back avoids the chained inplace=True pattern, which recent pandas versions warn about.
train_data['Gender'] = train_data['Gender'].fillna(train_data['Gender'].mode()[0])
train_data['Married'] = train_data['Married'].fillna(train_data['Married'].mode()[0])
train_data['Dependents'] = train_data['Dependents'].fillna(train_data['Dependents'].mode()[0])
train_data['Self_Employed'] = train_data['Self_Employed'].fillna(train_data['Self_Employed'].mode()[0])
train_data['LoanAmount'] = train_data['LoanAmount'].fillna(train_data['LoanAmount'].mean())
train_data['Loan_Amount_Term'] = train_data['Loan_Amount_Term'].fillna(train_data['Loan_Amount_Term'].mean())
train_data['Credit_History'] = train_data['Credit_History'].fillna(train_data['Credit_History'].mode()[0])

Label Encoding:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
# Each fit_transform call refits the encoder, so le only remembers the last column's classes.
train_data['Gender'] = le.fit_transform(train_data['Gender'])
train_data['Married'] = le.fit_transform(train_data['Married'])
train_data['Education'] = le.fit_transform(train_data['Education'])
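To show what the label-encoding step produces, here is a small sketch on hand-written values. The category strings ("Male", "Female", "Yes", "No") are illustrative assumptions, not values taken from the report's data file. LabelEncoder assigns integers in sorted order of the class labels, and refitting the same encoder replaces its stored classes.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()

# Classes are sorted alphabetically: "Female" -> 0, "Male" -> 1.
gender_codes = le.fit_transform(["Male", "Female", "Male", "Female"])
gender_classes = list(le.classes_)

# Refitting on another column overwrites the stored classes,
# so the encoder now only knows "No" -> 0 and "Yes" -> 1.
married_codes = le.fit_transform(["Yes", "No", "Yes"])
married_classes = list(le.classes_)

print(gender_codes, gender_classes)
print(married_codes, married_classes)
```

If the original labels have to be recovered later with inverse_transform, keeping one encoder per column is the safer design.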
ROC curve:
import sklearn.metrics as metrics
fpr, tpr, threshold = metrics.roc_curve(y_test, y_pred)
roc_auc = metrics.auc(fpr, tpr)
plt.title("roc")
plt.plot(fpr, tpr, color='blue', label='Auc = %0.2f' % roc_auc)
plt.legend(loc='lower right')
plt.plot([0, 1], [0, 1], 'r')
plt.xlim([0, 1])
plt.ylim([0, 1])
# The curve plots fpr on the x-axis and tpr on the y-axis, so the labels follow that order.
plt.xlabel('fpr')
plt.ylabel('tpr')
plt.show()
Auc=0.
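The ROC code above passes hard class predictions (y_pred) to roc_curve, which yields a curve with a single interior point. A common alternative, shown here as a sketch and not as the report's method, is to pass predicted probabilities so that the curve is traced over many thresholds. The data is synthetic, and the names y_score and model are introduced only for this illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data (random features, binary target).
X, y = make_classification(n_samples=614, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Probability of the positive class for each test row.
y_score = model.predict_proba(X_test)[:, 1]

# One (fpr, tpr) point per distinct probability threshold.
fpr, tpr, thresholds = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)

print(f"AUC = {roc_auc:.2f} over {len(fpr)} curve points")
```

The resulting fpr and tpr arrays can be passed to the same plotting calls as before.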