International Journal of Computer Engineering and Applications, ICCSTAR-2016, Special Issue, May.16

A REVIEW OF DATA MINING TECHNIQUES FOR AGRICULTURAL CROP YIELD PREDICTION

Akshitha K 1, Dr. Rajashree Shettar 2
1 M.Tech Student, Dept. of CSE, RV College of Engineering, Bengaluru, India
2 Professor, Dept. of Computer Science and Engineering, RVCE, Bengaluru, India

ABSTRACT: Data mining has become one of the most prominent research areas and has gained a lot of attention in the field of agricultural crop yield analysis. Predicting agricultural crop yield is essential because it has a great impact on the yearly crop production of a particular region, and a farmer needs to estimate the aggregate yield expected for the year. Data mining techniques such as K-Means, K-Nearest Neighbors, Artificial Neural Networks (ANN) and Support Vector Machines (SVM) have recently been applied to yield prediction in agriculture. Yield prediction is considered an important agricultural problem that remains to be resolved with existing data mining techniques. This study considers the research issues associated with existing state-of-the-art studies on agricultural crop yield prediction. It also investigates existing models that can be used to solve yield prediction problems in any agricultural field, and highlights the data models that achieve high accuracy and generality in predicting agricultural crop yield. The significant contribution of this paper is highlighted in the research gap section, which describes the existing research issues and their impact on agricultural data mining applications.

Keywords: Data Mining Techniques, Agricultural Yield Prediction, Artificial Neural Network, K-Means, Support Vector Machine.
[1] INTRODUCTION

For the past two decades, farmers have estimated yield from their previous experience, both to gauge the agricultural growth of a country and to guide future investment in agricultural fields. This leads to situations where farmers fail to evaluate yield accurately, e.g. an inaccurate estimate of future agricultural production based on the past 5 years' data on rainfall and the corresponding crop production in a particular field. The main aim of agricultural production is to maximize crop yield at minimum cost [1] [2]. There is considerable evidence that early detection and management of crop yield problems can protect a farmer's investment in a particular field and also help to generate yield and profit year after year [3]. Recent analysis shows that various factors, such as regional climate change and agricultural soil data, have a significant impact on agricultural growth and production. Early prediction of crop yield can be used by managers to avoid losses under unfavorable conditions [4]. In earlier days most farmers relied on their long-term experience to predict crop yield, which sometimes led them in the wrong direction. Existing research trends show that two kinds of approaches have been introduced to achieve efficient crop yield prediction: the first is traditional mathematical approaches and the second is applications of artificial intelligence [5]. Various data mining models have also been proposed for crop yield prediction. Data mining is the process of extracting meaningful information from a data set whose records are correlated with one another in some way. This study reviews existing research trends associated with crop yield prediction in the field of agricultural data mining. It also highlights the significant contributions of the past 5 years' state-of-the-art data mining and classification techniques, such as K-means and KNN, to agricultural yield prediction.
The significant contribution of this study is highlighted in the research gap section, which discusses the research issues of data classification techniques and the wide adoption of the ANN concept in existing studies to achieve higher yield prediction accuracy. The paper is organized as follows: Section II summarizes the background of existing data mining techniques. Sections III and IV illustrate applications of data mining in agriculture and the existing techniques respectively, whereas Section V summarizes the whole paper.

[2] APPLICATIONS OF DATA MINING IN AGRICULTURE

Researchers have carried out various state-of-the-art experimental prototypes to identify an efficient data mining technique for agricultural yield prediction. A Naive Bayes data mining model has been designed for classifying soil samples and can be used for analyzing large experimental soil profile data sets [11]. The decision tree algorithm can also be used in data mining to predict soil fertility [12]. The overall objective of the research on yield prediction was to measure the accuracy of land utilization for agricultural and non-agricultural areas over the past five years. The authors in [12] [13] have used the k-means model for estimating crop yield. Some data mining methodologies designed for the agricultural domain are reviewed in [14]. Fig. 2 represents applications of different types of data mining models.

Fig 2: Application of Different types of Data Mining Models

The applications of the k-means method in agriculture are discussed below. The k-means algorithm has been applied to soil classification using GPS-based technologies in [15]. Other applications where the k-means approach is used include classifying plant, soil, and residue regions of interest in color images, grading apples before marketing, monitoring water quality changes, detecting weeds in precision agriculture, and forecasting wine yield. Knowing in advance that the wine fermentation process could get stuck or be slow can help farmers take measures to secure the yield [16].

The applications of k-nearest neighbor models in agriculture are as follows. The k-nearest neighbor algorithm has been used to simulate daily precipitation and other weather variables, to estimate soil water parameters, and for climate forecasting [8].
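To make the k-means step concrete, the following is a minimal sketch of the standard Lloyd iteration in plain Python. It is not the implementation used in any of the cited studies; the function names (`squared_distance`, `kmeans`), the soil attributes (pH and organic carbon percentage) and the sample values are all hypothetical and chosen only for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points (a list of float tuples) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k sampled points
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical soil samples: (pH, organic carbon %). Values are made up.
samples = [
    (5.1, 0.40), (5.3, 0.50), (5.0, 0.45),  # acidic, low organic carbon
    (7.8, 1.20), (8.0, 1.10), (7.9, 1.30),  # alkaline, higher organic carbon
]
centroids, labels = kmeans(samples, k=2)
for sample, label in zip(samples, labels):
    print(sample, "-> cluster", label)
```

In practice the attributes would normally be rescaled before clustering, since k-means relies on Euclidean distance and an attribute with a larger numeric range would otherwise dominate the grouping.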

The applications of neural networks in agriculture, for predicting the flowering and maturity dates of soybean and for forecasting water resource parameters, are discussed in [14]. The Support Vector Machine (SVM) approach is used in agriculture for crop classification and for analyzing climate change scenarios.

Fig. 3 shows the design of a crop prediction system. It includes an input module that is responsible for taking data from the farmer: the farmer provides the land area, district, financial status and city. Once the city is selected, climatic data based on altitude, longitude and latitude are automatically retrieved from the crop knowledge base. The feature selection module is responsible for selecting a subset of attributes from the crop data. The crop knowledge base consists of farm knowledge, e.g. area id, region name, soil type, water pH, rainfall, humidity, sunshine, land information, environmental parameters, city, pesticide information, and crop information such as crop type and seed type. The knowledge base also includes samples of crops with the corresponding farm knowledge, environmental parameters and pesticide information. After attribute subset selection, the data goes through classification and association rule mining to group similar content. Prediction rules are then applied to the output of clustering to obtain results in terms of crop, pesticide and cost [13].

The data mining techniques used by researchers in the past are highlighted in review studies, which show that a comparison is possible. Different data mining techniques are used to predict different weather parameters such as humidity, temperature and wind gust. The attributes used for the comparison are applications, authors, data mining techniques, algorithms, attributes, time period, dataset size, accuracy rate, advantages and disadvantages. The techniques yield different results, each with its pros and cons. The main outcome is captured by the no free lunch theorem, which states that there is no universally best data mining algorithm. This creates the need to select the appropriate learning algorithm for a given problem. For weather prediction, decision trees and k-means clustering perform well, with higher prediction accuracy than other data mining techniques. The regression technique could not produce an accurate prediction. It is also observed that as dataset size increases, accuracy first increases but then decreases beyond a certain point. One reason may be overfitting of the training data sets [15]. Table 1 tabulates the work done in the area of agriculture by different researchers and the application of various techniques to the available agricultural data.
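Because no single algorithm is best for every problem, candidate models are usually compared on held-out data rather than on the data they were fitted to, which also guards against the overfitting mentioned above. The sketch below illustrates this with k-fold cross-validation in plain Python. It is an illustration under stated assumptions, not a procedure taken from the reviewed studies: the rainfall and yield values are synthetic, and the helper names (`mean_baseline`, `knn_regressor`, `cross_val_mse`) are hypothetical.

```python
import random


def mean_baseline(train, x):
    """Predict the mean training target, ignoring the input."""
    return sum(y for _, y in train) / len(train)


def knn_regressor(k):
    """Return a predictor that averages the targets of the k nearest training inputs."""
    def predict(train, x):
        nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict


def cross_val_mse(data, predict, folds=5):
    """Mean squared error of `predict`, pooled over k-fold cross-validation."""
    errors = []
    for i in range(folds):
        test = data[i::folds]  # every folds-th record forms the held-out fold
        train = [row for j, row in enumerate(data) if j % folds != i]
        errors.extend((predict(train, x) - y) ** 2 for x, y in test)
    return sum(errors) / len(errors)


# Synthetic data (assumption): yield grows roughly linearly with rainfall, plus noise.
rng = random.Random(42)
data = []
for _ in range(60):
    rainfall = rng.uniform(400.0, 1200.0)                      # mm per season
    crop_yield = 2.0 + 0.01 * rainfall + rng.gauss(0.0, 0.3)   # tonnes per hectare
    data.append((rainfall, crop_yield))

candidates = {
    "mean baseline": mean_baseline,
    "1-NN": knn_regressor(1),
    "5-NN": knn_regressor(5),
}
scores = {name: cross_val_mse(data, model) for name, model in candidates.items()}
best = min(scores, key=scores.get)
for name, mse in scores.items():
    print(f"{name}: cross-validated MSE = {mse:.3f}")
print("selected model:", best)
```

The model with the lowest cross-validated error is selected for this particular data set; on a different data set the ranking could change, which is the practical consequence of the no free lunch theorem.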

Analysis of current research trends shows that most studies use K-means clustering with neural networks, where integration of a feature selection technique along with efficient data prediction capability in remote areas is needed. Most existing studies are also found to combine a decision tree mechanism with a neural network model, where data transformation, extra computation and handling of continuous data are required to build an efficient yield prediction model. However, after evaluating many existing studies, it has been found that the combination of decision tree and neural network models gives the best configuration for achieving higher prediction accuracy.

Author                      Problem                                               Techniques Applied
Sanjay et al. [29]          Classification and prediction                         Neural Networks
Somvanshi et al. [30]       Modeling and prediction                               Neural Networks
Jagielska et al. [31]       Automated knowledge acquisition                       K-means
Tellaeche [32]              A vision-based hybrid classifier                      K-means
Verheyen et al. [33]        High resolution continuous soil classification        Fuzzy set
Urtubia et al. [34]         Prediction of industrial wine problem fermentations   Fuzzy set
Veenadhari et al. [35]      Crop productivity mapping                             K-nearest Neighbor
Shalvi and Claris [36]      Medical data mining techniques                        K-nearest Neighbor
Altannar et al. [37]        Agricultural and Environmental Sciences               Support Vector Machine
Rajagopalan and Lal [38]    Daily precipitation and other weather variables       Support Vector Machine

Table 1: Application of Data Mining Techniques to agricultural data

[6] CONCLUSION

Agriculture is one of the most significant application areas, especially in developing countries like India. The use of information technology in agriculture can change the decision-making situation, and farmers can achieve better yields. Data mining plays an important role in decision making on several issues related to the agricultural field. This paper discusses the role of data mining in agriculture and the related work by several authors in the context of the agricultural domain. It also discusses various data mining applications for solving different agricultural problems. The paper integrates the work of different researchers in one place, so it is useful for researchers who want information on the current state of data mining techniques and applications in the agricultural field. The study highlights some of the significant contributions of neural network models in agricultural data mining and also suggests the flexibility of ANN for future research.

REFERENCES

[1] Han, J., Kamber, M., & Pei, J. (2006). Data mining: concepts and techniques. Morgan Kaufmann.
[2] http://www.publishyourarticles.net/knowledge-hub/essay/essay-on-the-importance-ofagriculture-in-the-indian-economy.html
[3] Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine, 17(3), 37.
[4] Mucherino, A., Papajorgji, P., & Pardalos, P. (2009). Data mining in agriculture (Vol. 34). Springer.
[5] Beniwal, S., & Arora, J. (2012). Classification and feature selection techniques in data mining. International Journal of Engineering Research & Technology (IJERT), 1(6).
[6] Rokach, L., & Maimon, O. Clustering Methods. Chap. 15.
[7] Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3), 645-678.
[8] Andritsos, P. Data Clustering Techniques. University of Toronto, Department of Computer Science. ftp://ftp.cs.toronto.edu/csrg-technical-reports/443/depth.pdf
[9] Srikant, R., Vu, Q., & Agrawal, R. (1997, August). Mining association rules with item constraints. In KDD (Vol. 97, pp. 67-73).
[10] Agrawal, R., Imieliński, T., & Swami, A. (1993, June). Mining association rules between sets of items in large databases. In ACM SIGMOD Record (Vol. 22, No. 2, pp. 207-216). ACM.
[11] Zaki, M. J. (1999). Parallel and distributed association mining: A survey. IEEE Concurrency, 7(4), 14-25.
[12] [13] Gholap, J. (2012). Performance tuning of J48 algorithm for prediction of soil fertility. Asian Journal of Computer Science and Information Technology, 2: 8 (2012) 251-252.
[14] Megala, S., & Hemalatha, M. (2011). A Novel Datamining Approach to Determine the Vanished Agricultural Land in Tamilnadu. International Journal of Computer Applications, 23.
[15] [16] Ramesh, D., & Vishnu Vardhan, B. (2013). Data Mining Techniques and Applications to Agricultural Yield Data. International Journal of Advanced Research in Computer and Communication Engineering, 2(9).
[17] Ramesh, V., & Ramar, K. (2011). Classification of Agricultural Land Soils: A Data Mining Approach. Agricultural Journal, 6: 82-86.

Author[s] brief Introduction

Akshitha K, M.Tech student, RVCE. Flat No. 017, DSMAXSWASTIK, 20th Cross, VEERNAJANEYA NAGAR, TURAHALLI, UTTARAHALLI HOBLI, BENGALURU 560061. Ph. No. 9880141284