1 Kazile

Latest Research Papers On Data Mining

data mining IEEE PAPER 2016

Analytical Implementation of Web Structure Mining Using Data Analysis in Educational Domain
Abstract The optimal web data mining analysis of web page structure acts as a key factor in educational domain which provides the systematic way of novel implementation towards real-time data with different level of implications. Our experimental setup initially focuses

ABSTRACT Achieving and maintaining good indoor environment quality (IEQ) while improving energy efficiency of housing is a globally relevant goal. Current developments of sensor networks are increasing the availability of high-resolution data from buildings,

Journal of Data Mining in Genomics & Proteomics
Abstract The vast majority of microbes form a healthy symbiotic 'superorganism' with the hosts. There are two types of symbiosis (Sym), exosymbiosis (e.g. microbiota) and endosymbiosis (e.g. mitochondria). It has been suggested that the exo-endo Sym balance (

Deploying nEmesis: Preventing Foodborne Illness by Data Mining Social Media
Abstract Foodborne illness afflicts 48 million people annually in the US alone. Over 128,000 are hospitalized and 3,000 die from the infection. While preventable with proper food safety practices, the traditional restaurant inspection process has limited impact given the

Distributed Data Mining: Implementing Data Mining Jobs on Grid Environments
ABSTRACT Data mining technology is not only composed by efficient and effective algorithms, executed as standalone kernels. Rather, it is constituted by complex applications articulated in the non-trivial interaction among hardware and software components,

Estimation of Pan-Evaporation using Spatiotemporal Data Mining Approach
Abstract: Weather forecasting has been one of the most scientifically and technologically challenging problems around the world. Hence, we aim to predict the event days, or even weeks, in advance, thus increasing the response time with which we can prevent huge

A Hybrid Approach for Detecting Suspicious Accounts in Money Laundering Using Data Mining Techniques
Abstract: Money laundering is a criminal activity to disguise black money as white money. It is a process by which illegal funds and assets are converted into legitimate funds and assets. Money laundering occurs in three stages: Placement, Layering, and Integration. It

Data Mining Methods for New Feature of Malicious Program
Abstract The rapid propagation of malicious programs has caused great harm to the security of user information. Traditional removal methods, which lag behind and are non-intelligent, have been unable to meet the demands of current detection. Studying the

Fraud Detection on Bulk Tax Data Using Business Intelligence Data Mining Tool: A Case of Zambia Revenue Authority
Abstract: Zambia Revenue Authority (ZRA) generates large volumes of data that need complex mechanisms in order to extract useful tax information. The purpose of the study was to develop a data mining model for detection of fraud on tax and taxpayer data for ZRA.

Recent advances in environmental data mining
Due to the large amount and complexity of data available nowadays in geo-and environmental sciences, we face the need to develop and incorporate more robust and efficient methods for their analysis, modelling and visualization. An important part of these

Abstract The main objective of data mining is to find new, unknown and unpredictable information in huge databases, which is useful and helps in decision making. There are a number of techniques used in data mining to identify frequent pattern

Boosted Apriori: an Effective Data Mining Association Rules for Heart Disease Prediction System
Abstract: In the health care business, data mining plays a significant role in predicting diseases. Numerous tests are required from the patient to detect a disease. However, using data mining techniques can reduce the number of tests that are required.

Research of Data-Aiming Mining Algorithm in Cloud Environment
Abstract Cloud computing contains a huge amount of data, which are featured as being widely distributed, heterogeneous, and dynamic. Thus, aiming at how to mine the useful parts of this information, this paper proposes an Apriori algorithm based on cloud computing

Research on Pattern Analysis and Data Classification Methodology for Data Mining and Knowledge Discovery
Abstract A plethora of big data applications are emerging and being researched in the computer science community which require online classification and pattern recognition of huge data pools collected from sensor networks, image and video systems, online forum

A New Feature Selection Method for Oral Cancer Using Data Mining Techniques
Abstract: The term cancer is used generically for more than 100 different diseases including malignant tumours of different sites (such as breast, cervix, prostate, stomach, colon/rectum, lung, mouth, leukaemia, sarcoma of bone, Hodgkin disease, and non-Hodgkin lymphoma)

International Journal of Biomedical Data Mining
Abstract The rapid advances in high-throughput technologies, such as microarrays, have revolutionized the knowledge and understanding of biological systems and genetic signatures of human diseases. This has led to the generation and accumulation of a large

A Data-Mining Approach for Wind Turbine Power Generation Performance Monitoring Based on Power Curve
Abstract A new data-mining approach based on power curve profiles is put forward to monitor the power generation performance of wind turbines in this paper. Through assessing the wind-speed power datasets, the weakened power generation performance

Abstract The recent applications of data mining such as biological, scientific, financial and others are changing data regularly, which is uncertain and incomplete. For finding tendency in these data up-to-date, we need to modify existing data mining algorithms with dynamic

A Proposal for Improving Project Coordination using Data Mining and Proximity Tracking
Abstract. Coordination is an important success factor for a development project. Communication gaps, e.g. between product owners shaping the requirements and testers verifying the developed software, can result in wasted effort and unsuccessful products.

Survey of Data Mining Techniques used in Healthcare Domain
ABSTRACT Health care industry produces enormous quantity of data that clutches complex information relating to patients and their medical conditions. Data mining is gaining popularity in different research arenas due to its infinite applications and methodologies to

A Survey: Privacy Preservation Data Mining Techniques and Geometric Transformation
ABSTRACT Privacy Preserving Data Mining is the process of hiding and protecting sensitive data of individuals. In the recent era, we use many applications which require personal sensitive data of individuals. Thus, people are more concerned about sharing their

Abstract: An organized and systematic solution is essential for all universities and organizations. In every institute or college, the records of students are maintained. It is a tedious job to maintain such huge data manually. Student management system provides

A Comparison And Prediction Analysis For The Diagnosis Of Parkinson Disease Using Data Mining Techniques On Voice Datasets
Abstract After Alzheimer's, the most dangerous neurological disorder is Parkinson Disease (PD). The problem with these disorders is that once they take hold of the body, a complete cure is not possible, but their progression can be controlled to some extent. Doctors may be able to stop the

Abstract Mining association rules in large databases is one of the most popular data mining techniques for business decision makers. Discovering frequent itemsets is the core process in association rule mining. Numerous algorithms are available in the literature to find

Abstract: Data mining methods are often implemented for analyzing available data and extracting information and knowledge to support decision-making. This review paper is aimed at revealing the high potential of data mining applications for university

Comparison of data mining approaches for estimating soil nutrient contents using diffuse reflectance spectroscopy
Diffuse reflectance spectroscopy (DRS) operating in the wavelength range of 350–2500 nm is emerging as a rapid and non-invasive approach for estimating soil nutrient content. The success of the DRS approach relies on the ability of the data mining algorithms to extract

Multi-Data Association Rule Mining Algorithm Based on Grey Relational Analysis
Abstract Mining association rules for data is not only an essential part of data mining but also a hot issue in knowledge engineering and researches on data mining technology. Since multi-data mining is characterized as being multi-type, multi-level, multi-implicational and

A Multi-Index Grey Relational Data Mining Model of Complex System Based on Grey System Theory
Abstract Data mining is a complex systems engineering task. For complex systems that are influenced by multiple indexes, the relevance, hierarchy and fuzziness among indexes pose great challenges to data mining of complex systems. Therefore, the paper proposes a multi-

Advanced Data Mining Approach for Handoff Procedures in LTE Technology
Abstract With advances in technology, user demand is increasing, as people are more interested in internet services with higher data rates. Although 3G technologies deliver significantly higher bit rates than 2G technologies, "LTE" (3GPP

Relative Study for Prediction of Coronary Artery Disease Using Various Data Mining Techniques
Abstract: Coronary Artery Diseases (CAD) are very common and are sometimes fatal. The development and progression of this disease are stimulated by environmental and/or genetic factors. CAD is clinically relevant in symptomatic patients, either acute or

Evaluation of Public Servant Execution Based on Data Mining Technique and Multiple Factors Joint Modeling Analysis
Abstract With the rapid development of computer science and technology, data mining modelling techniques have emerged and rapidly developed as an alternative powerful meta-learning tool to accurately and quickly analyze the massive volume of data generated by

A Study on Data Mining Horizons
Abstract: Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Among the rapid pace of data with the need of analysis as well as summarizing, the data mining techniques are applicable in insurance

Automation of Interventional Analysis in Physiological Variability Using Data Mining: A Review
Abstract: The changes or variations that occur in physiological parameters of the body when a person is in state of rest, which means he/she is still, lying down, not talking, not under any pressure not even thinking, if possible, is termed as physiological variability (PV). The

Design Decision Support System for Loans Based on Data Mining Techniques
Abstract: Decision Support Systems (DSS) are a particular type of computerized information system that supports business and organizational decision-making activities. On the other hand, Data Mining (DM) expands the potential for decision support by finding styles and

This volume contains 19 research papers belonging, roughly speaking, to the areas of computational statistics, data mining, and their applications. Those papers, all written specifically for this volume, are their authors' contributions to honour and celebrate

Integrating Data Mining Into Managerial Accounting System: Challenges and Opportunities
Data mining involves extracting information from large data sets, discovering the hidden relationships and unknown dependencies, and supporting strategic decision-making tasks. The alignment of data mining and business would bring benefits to the organization's

Data Mining Perspective: Prognosis of Life Style on Hypertension and Diabetes
Abstract: In the present era, the data mining techniques are widely and deeply useful as decision support systems in the fields of health care systems. The proposed research is an interdisciplinary work of informatics and health care, with the help of data mining

Abstract: Data mining is widely used to combat fraud because of its effectiveness. Credit card fraud can be detected using data mining techniques or models. Credit cards provide cashless shopping all over the world. Hence the risk of fraud using credit cards is

ABSTRACT: This article presents a procedure for using machine vision data to predict surface roughness parameter of the end milled components. Stylus based surface roughness measurements were used and compared to vision based prediction of surface

Edifice an Educational Framework using Educational Data Mining and Visual Analytics
Abstract Educational Data Mining and Visual analytics are two emerging trends in the industry that play a major role in bringing about changes in educational institutions. This paper discusses building an educational framework that suits the higher education

Data Mining from Smart Card Data using Data Clustering
Abstract: The aim of this paper is to develop an effective methodology for the better understanding of the travelling patterns and evaluating behavioral attributes of traveller's trip. Using smart card data, the data such as boarding location, boarding time, alighting

Research on the College Graduate Employment Education Based on Data Mining Technology
ABSTRACT With the weak global economic process, the unemployment rate of the college graduate has become a hot issue of all the countries. How to enhance the employment competitiveness of the college graduate is a main problem of universities. It is an effective

The Pixelization Method for Data Mining of Defect Prevention through Exterior Wall Design
Abstract. Defects act as depreciator over the mid to long term life cycle of buildings, and have a huge impact on the external walls of buildings, which consequently influences the first impression of the building. For this reason, Defect Prevention through Exterior Wall

Medical Data Mining Techniques for Health Care Systems
Abstract: Due to the sequence in the information technology, the prevalence of the healthcare organizations conserves their data electronically. Enormous progress in medical data leads to be scarce in the mining of well-informed in series from the mass data. There

Application of Data Mining Techniques for Web Personalization
Abstract: The web has expanded much more than expected in the past few years. Further, with the advent of new secure technologies, the online shopping trend has increased. People are tired of searching for products online page by page. So they prefer websites which provide

ABSTRACT: In recent times Information Technology has been playing a very important role in every aspect of human life. It is essential to gather data from different sources. This data can be stored and maintained to generate information and knowledge. Data mining has

Data Mining Based Store Layout Architecture for Supermarket
Abstract: The mentioned system is designed to find the most frequent combinations of items. It is based on developing an efficient algorithm that outperforms the best available frequent pattern algorithms on a number of typical data sets. This will help in marketing and sales.

Survey on Big Data and Mining Algorithm
ABSTRACT A data stream is an ordered sequence of instances that can be read only once or a small number of times using limited computing and storage capabilities. Many applications work on stream data, such as web search, network

Statistical Based Agricultural Data Analysis Using Mining
Abstract-This Paper is Basically applied to the Advancement in Farming by technological Evolution as the growth in Computing and information Assessment, Retrieval and Storage have provided vast amount of Data. Data mining Techniques have been extensively seed

Students' Employability Prediction Model through Data Mining
Abstract The students' employability is a major concern for the institutions offering higher education and a method for early prediction of employability of the students is always desirable to take timely action. The paper uses various classification techniques of data

Clustering Assisted Co-location Pattern Mining for Spatial Data
Abstract The importance of spatial data mining is growing with the increasing incidence and importance of large spatial datasets repositories of remote-sensing images, location based mobile app data, satellite imagery, medical data and crime data with location information,

A Technique of Data Privacy Preservation in Deploying Third Party Mining Tools over the Cloud Using SVD and LSA
Abstract: In these days, information sharing as a crucial part appears in our vision, bringing about a bulk of discussions about methods and techniques of privacy preserving for data mining which are regarded as strong guarantee to avoid information disclosure and

Research on Data Mining Algorithm based on Business Cloud Platform for Mobile Internet
Abstract Mobile Internet is a mainstream access and communication technology. Because the Internet can be accessed anytime and anywhere, services will vary and bring massive data, but the data processing has different characteristics; the delay and energy consumption are

Educational Data Mining Classifier For Semester One Performance to Improve Engineering Students Achievement
Abstract: In the higher education system, it is very important to predict academic performance for students, instructors and management during the course. If institutions are able to predict and assess student performance early in the course, students and

Data Mining Unblocking the Intelligence in Data
Abstract: This paper highlights the prospective applications of the data mining tool WEKA to enhance the performance of some of the core business processes in the banking sector. Data mining, or knowledge discovery, is the computer-assisted process of digging through and

Application of Forecasting Models on Indian Coal Mining Fatal Accident (Time Series) Data
Abstract Director General of Mines Safety under the Ministry of Labour, Government of India, published an annual report which includes safety statistics of Indian mines. A combined/unified database is created from these reports for detailed analysis. The time

Survey of Interestingness Measures for Association Rules Mining: Data Mining, Data Science for Business Perspective
Abstract: The field of data processing is changing as fast as the volume of data is increasing, and more intelligent and automated viewpoints for looking at data are the need of the time. This changing need comes from all dimensions of life like Business, Biology

ABSTRACT: Data mining technology is extensively used in managing relationships through a variety of approaches. There are many tools and methods for analyzing mortality data. Mining technology is one of these tools. The research aims to illustrate the concept of data

PatGen DB: a consolidated genetic patent database implementing standard data mining resources
ABSTRACT Compared to the wealth of online resources covering genomic, proteomic and derived data, the bioinformatics community is rather under served when it comes to genetic patent information. This paper describes how PatGen DB has been compiled. This is a

A Proficient Heart Disease Prediction Method Using Different Data Mining Tools
Abstract: Heart disease is a major health problem and it affects a large number of people. In medicine, prediction of heart disease is very important. In order to save a patient's life, a lot of effort is made by hospitals and medical practitioners. The health sector today

Not-So-Linked Solution to the Linked Data Mining Challenge 2016
Abstract. We present a solution for the Linked Data Mining Challenge 2016, that achieved 92.5% accuracy according to the submission system. The solution uses a hand-crafted dataset, that was created by scraping various websites for reviews. We use logistic

Discovering Relationships between Reservoir Properties and Production Data for CHOPS Using Data Mining Methods
Abstract Cold Heavy Oil Production with Sand (CHOPS) produces sand, and greatly contributes to primary oil recovery. It's generally believed that wormholes, resulting from sand flow, enhance oil recovery in this process. However, due to complexity and variability

Analysis of Various Data Mining Techniques to Predict Diabetes Mellitus
Abstract Data mining approach helps to diagnose patient's diseases. Diabetes Mellitus is a chronic disease to affect various organs of the human body. Early prediction can save human life and can take control over the diseases. This paper explores the early

Data Mining and Intrusion Detection Systems
Abstract: The rapid evolution of technology and the increased connectivity among its components impose new cyber-security challenges. To tackle this growing trend in computer attacks and respond to threats, industry professionals and academics are joining

Time Orient Flow Estimation Based Data Mining Approach for Intrusion Detection in Wireless Local Area Networks Using Delay Averaging Scheme
Abstract: The intrusion detection is the process of maintaining secure access of the services available and to provide more secured services in the wireless local area network. The generic nature of WLAN has no management of identity of users and the services

Hybridized Soft Computing Approaches Based Data Mining Techniques For Protein Dataset
Abstract Bioinformatics is one of the emerging technologies which has played an important role in the field of biology. Molecular biology and bioinformatics information is extracted from the protein dataset which is used for analyzing the different kinds of

Multi Attribute Data Availability Estimation Scheme for Multi Agent Data Mining in Parallel and Distributed System
Abstract Multi agent data mining in parallel and distributed systems has been studied in various situations, and existing approaches suffer from the problem of data availability, because they categorize the network nodes according to the type of data the nodes have and suffer from

Transactions on Machine Learning and Data Mining Vol. 9, No. 1 (2016) 1-2 2016, ibai-publishing
The amount of data currently generated by various activities of the society has never been so big, and is being generated at an ever increasing speed [1]. Managing and gaining insight from this large amount of data is an enormous challenge for researchers, but it is, at the

Statistical Comparisons of the Top 10 Algorithms in Data Mining for Classification Task
Abstract: This work builds on the study of the 10 top data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) community in December 2006. We address the same study, but with the application of statistical tests to establish a more

Novel Approaches for Privacy Preserving Data Mining in k-Anonymity Model
In privacy preserving data mining, anonymization based approaches have been used to preserve the privacy of an individual. Existing literature addresses various anonymization based approaches for preserving the sensitive private information of an individual. The k-

Estimation Based Location Selection Approach in Multi-Agent Disease Prediction Model for Decision Support System Using Diagnosis Pattern and Data Mining
Abstract: The presence of decision support systems plays a vital role in many situations like business intelligence and medical solutions. Many designs have been proposed earlier to support decision making, but they suffer from problems of accuracy and time

ABSTRACT Nowadays, diabetes is considered as one of the diseases which cause more deaths than any other disease in the world. To avoid the dangerous complications of the diabetes, patients should control a blood glucose level as the HbA1c (accumulative blood

Novel Outlier Detection In Diabetics Classification Using Data Mining Techniques
Abstract Insufficient production of insulin, or the body's inability to use the insulin produced, leads to diabetes. If there is a lack of evidence, then it is difficult to understand the types of diabetes. Normally several tests are done, which include classification or clustering of large scale

Abstract: In this paper, we present a critical review of the research currently under way in applications of data mining for management of the healthcare system. The goal of this study is to explore emerging and new areas of data mining techniques used in healthcare

Summarizing the Concept of Data Mining, Frequent Pattern Mining and Actionable Pattern Mining Techniques
Abstract: Data mining is an activity that offers business advantages, as well as solutions to some mounting problems associated with exploiting knowledge embedded within corporate databases such as growing disk space capabilities, Improvements over the relational

Analysis of Cardiovascular Heart Disease Prediction Using Data Mining Techniques
Abstract: Heart diseases are the number one cause of death. The health industry is generally information rich but knowledge poor, and this information is not possible to handle manually. Data mining is used to predict the disease from the datasets. Knowledge discovery in

A Review on Data Mining Algorithms for Internet of Things
ABSTRACT The fields of computer science and electronics have merged to result into one of the most notable technological advances in the form of realization of the Internet of Things (IoT). Internet of Things (IoT) is an innovative idea which will transform the real world

Privacy Preserving Data Mining in Bio Medical Databases
Abstract: Biomedical involves the applications of the natural sciences, especially the biological and physiological to clinical medicine. It is a discipline in biological, medicine to improve human health by integrating the medical usages to help the clinical practices.

High Speed Data Streams Using Data Mining Techniques
ABSTRACT Many organizations today have more than very large databases; they have databases that grow without limit at a rate of several million records per day. Mining these continuous data streams brings unique opportunities, but also new challenges. This paper

How well can we forecast future model error and uncertainty by mining past model performance data
Consider a hydrological model Y(t) = M(X(t), P), where X = vector of inputs; P = vector of parameters; Y = model output (typically flow); t = time. In cases when there is enough past data on the model M performance, it is possible to use this data to build a (data-driven)

Analysis of Girls Vocational High School Students' Academic Failure Causes with Data Mining Techniques
¹Afyon Kocatepe University, Vocational School of Afyon, Ali Cetinkaya Campus, Izmir Yolu 8 Km, 03200, Afyonkarahisar, Turkey; ²Ministry of Education, Vocational School of Ali Cetinkaya, Department of Computer Programming, Afyonkarahisar, Turkey. Telephone:+

Analysis of Draize eye irritation testing and its prediction by mining publicly available 2008-2014 REACH data
Summary Public data from ECHA online dossiers on 9,801 substances encompassing 326,749 experimental key studies and additional information on classification and labeling were made computable. Eye irritation hazard, for which the rabbit Draize eye test still

Data Mining for Various Internets of Things Applications
Abstract: Internet of Things is now an accelerating technology in the world of devices. It helps us connect all the devices which we use in our day to day chores via the internet. Starting from home, office, industry automation to health care and smart cities internet of

EbIDAM: Efficient Data Mining Java Library
Abstract: Pattern generation in transactional databases is an important Big Data mining problem. Developing efficient systems that are able to handle large volumes of data therefore becomes unique. Finding all frequent item sets and association rules in a given

Abstract: The aim of this study is to describe the experiences of some Young Women workers who have encountered exploitation in their workplaces in garment industries. It equally looks at the effects of worker exploitation on Young Women workers, and

Models and Issues in Data Stream Mining
Abstract: With great innovation in technology there is a huge data explosion. Real-time surveillance, internet traffic, sensor data, health monitoring systems, communication networks, online transactions in the financial market and so on contribute as a data

Various Methodologies in the discrimination of Data Mining
Abstract: Data mining is an increasingly important technology for extracting useful knowledge hidden in large collections of data. There are, however, negative social perceptions about data mining, among which potential privacy invasion and potential

A Survey on Data Mining Techniques for Analysis of Social Network
Abstract: Data mining, the extraction of predictive information from large data sets, is a powerful new technology which helps corporations focus on the most important information in their data warehouses. Data mining makes use of various statistical,

Amaranthus cruentus and A. hypochondriacus x hybridus based on data mining and sequence alignment. - Genetika, Vol 48 No. 1, 211-218. Bioinformatic tools have become an inevitable part of molecular genetic research in many applications. In the present study, an

A Survey on Privacy Preserving Data Mining
ABSTRACT Privacy-preserving data mining has been considered widely because of the wide propagation of sensitive information over internet. A number of algorithmic techniques have been designed for privacy-preserving data mining that includes the state-of-the-art

Analyse the Metrological Data Using Data Mining
ABSTRACT Data Mining is the process of discovering new patterns from large data sets. This technology is employed in inferring useful knowledge that can be put to use from a vast amount of data. Various data mining techniques such as Classification, Prediction,

Materialized view selection for data warehouse using frequent itemset mining
Abstract: Data warehouses are subject oriented, consolidated, integrated, and time variant repositories of possibly heterogeneous data. A data warehouse is used to respond to on-line analytical queries over millions of records of data in an acceptable time. Since a data

Data Mining & Data Warehousing: An Exhaustive Elucidation
Abstract Nowadays, bigger or even smaller organizations rely more on database applications to maintain their data. These databases are referred to as the operational data and are responsible for daily transactions. However, there is a need for storage of

A Survey on Feature Selection in Data Mining
free download
Abstract: Feature Selection is a fundamental problem in machine learning and data mining. Feature Selection is an effective way of reducing dimensionality, removing irrelevant data, and increasing learning accuracy. Feature Selection is the process of identifying a subset of

Survey on Mining Educational Data and Recommending Best Engineering College
free download
Abstract:In this present era, Engineering colleges are increasing day by day. Good education comes from good colleges and everyone is in search of the best to enhance their future and to live the best of it. Thus finding out one from a 1000 is a difficult task. In India

A Survey on Automatic Bug Triage Using Data Mining Concepts
free download
Abstract: In the bug triage process, assigning the correct developer to fix a new incoming bug is more tedious than fixing that bug. Since the number of new bugs arriving daily is high, manual triaging increases the development cost and time. In order to automate the bug triage

free download
ABSTRACT: Popular use of the World Wide Web as a global information system has flooded us with a tremendous amount of data and information. This explosive growth in stored data has generated an urgent need for new techniques and automated tools that can

A Review on Data Anonymization in Privacy Preserving Data Mining
free download
Abstract: People today are very reluctant to share their information as they are well aware of the privacy threats of their sensitive data. Data in its original form contains sensitive information about individuals, and publishing such data without revealing sensitive

A Survey Paper on Climate Changes Prediction Using Data mining
free download
Abstract: The purpose of data mining effort is generally to create a descriptive model or a predictive model. In this paper the concepts of regression was summarized to accomplish the task of prediction and various methodologies of regression and its significance was

A Survey on Data Mining Optimization Techniques
free download
Abstract Data mining is a field of research which is increasing day-by-day. Data mining consist of various steps which have been discussed in this paper. But, with data mining optimization have become important for the improvement of results. Optimization finds out

Data Mining Technique to Predict Missing Items and Find Optimal Customer for Beneficial Customer Relationship and Management
free download
Abstract: The aim of association rule mining is to find frequently co-occurring groups of items in transactional databases. The intention of this knowledge is for prediction purposes. This paper contributes a technique that uses the partial information about the contents of a

Short-Term Load Forecasting System Using Data Mining
free download
Abstract:In this paper, by means of data mining techniques, a platform of data warehouse is designed after preprocessing the huge amounts original data of power system, and a system for short term load forecasting (STLF) is developed, in which there is the synthetic

free download
ABSTRACT Extant data mining is based on data-driven methodologies. It either views data mining as an autonomous datadriven, trial-and-error process or only analyzes business issues in an isolated, case-by-case manner. As a result, very often the knowledge

I 2 mapreduce: Fine-Grain Incremental Processing In Big Data Mining
free download
Abstract: I 2 MapReduce: Fine-Grain Incremental Processing in big data mining is a novel incremental processing extension to MapReduce, the most widely used framework for mining big data. As compared with the high-tech work on Incoop, I 2 MapReduce has its

An Application of Weather Forecasting and Climate Changing Using Data Mining Techniques
free download
Abstract The Weather forecasting is a vital application in meteorology and has been one of the most scientifically and technologically challenging problems around the world in the last century. In this research paper, we investigate the use of data mining techniques in

free download
Abstract:Clustering is an important data mining technique which has gained a tremendous importance in recent times due to its inherent nature of capturing the hidden structure of the data. In Clustering, different objects that have some similarity based on their

free download
Abstract:The Internet of Things (IoT) is an emerging topic in today's era. It has a lot of significance in technology, business, social and engineering fields. This technology provides an easier way of communication of devices with the minimal interaction of human

A Data Mining Approach to Choosing Categorical Attributes for Ranked Lists
free download
ABSTRACT This work proposes and evaluates a novel approach to determine interesting category for ranked lists using -SVM. We identify three characteristics (features), entropy, unlikability, and peculiarity and show how to train a classifier on these features using a set

Collaborative Mining Method for Location Big Data of Outdoor Activities
free download
Abstract With the continued popularity of location services and the application of Internet of Vehicles, the location big data of outdoor activities which is composed of geographical data, vehicle track and application record and so on has used to obtain the perception of the

E-Business Service using Data mining Techniques
free download
Abstract:With the increasing competition and changing demands of customer it is difficult to satisfy the people with different backgrounds. E-business with its various advantages of low cost, high efficiency, time saving has assumed importance these days. But it is confronted

A Review on Data Extraction using Web Mining Techniques
free download
Abstract: Today internet is full of structured or unstructured information and this information influences people or society directly or indirectly. With the rapid growth of internet technologies, the web is considered as a world's largest repository of knowledge. Web

free download
ABSTRACT Web content mining is the mining, extraction and integration of useful data, information and knowledge from Web page content. Due to heterogeneity and the lack of structure that permits much of the ever-expanding information sources on the WWW as

Data Mining with Cloud Computing:-An Overview
free download
Abstract: Data mining is a process of extracting potentially useful information from data, so as to improve the quality of information service. The integration of data mining techniques with Cloud computing allows the users to extract useful information from a data

A Survey on Decision Tree Algorithms of Classification in Data Mining
free download
Abstract: As the computer technology and computer network technology are developing, the amount of data in information industry is getting higher and higher. It is necessary to analyze this large amount of data and extract useful knowledge from it. Process of extracting the

Context Awareness in Data Mining Applications: A Survey
free download
Abstract: With the advent of the digital explosion and big data, data mining and analysis play a key role in any business development. Tremendous amount of data is collected by organizations through their day to day activities and transactions, both internal and

Identifying Personality Trait using Social Media: A Data Mining Approach
free download
Abstract-The Social media is no more a new concept today. With increase in the penetration of internet and low cost smart phones access to social media has more become a trend and necessity to many. Having more number of likes and plethora of comments and further

A Review on Customer Churn Prediction in Telecommunication Using Data Mining Techniques
free download
Abstract: Customer churn is the term which indicates the customer who is in the stage to leave the company. Particularly it is happening recurrently in the telecommunication industry and the telecom industries are also in a position to retain their customer to avoid the

free download
Abstract: Data mining is a collection of techniques for efficient automated discovery of previously unknown, valid, novel, useful and understandable patterns in large databases. The patterns must be actionable so that they may be used in an enterprise's decision

Privacy Preserving through Data Perturbation using Random Rotation Based Technique in Data Mining
free download
Abstract: Data perturbation technique is a widely employed and accepted Privacy Preserving Data Mining (PPDM) approach, used to single level trust on data miners. Privacy Preserving Data Mining deals with the problem of developing accurate models about aggregated data without

Application of Educational Data mining Techniques in E-Learning Systems with its Security Issues: A Case Study
free download
Abstract: Recently, Educational Data Mining has become an emerging research field used to extract knowledge and discover patterns from E-Learning systems. This work is a survey of the specific application of data mining in learning management systems and a case study

Data Size versus Accuracy: Performance by different Data Mining Tools
free download
Abstract:While increasing the data size the improvement in accuracy becomes better. This is true only up to a fixed size. After this point, the performance usually becomes stable. In the context of small data sets expecting better performance usually leads to failure. Data

free download
Data Mining is an analytic process designed to explore data or big data in search of consistent patterns or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. In this study use

Effective Positive and Negative Association Rules Using Bit Vector Matrix in Data Mining
free download
Abstract: Association rule learning is a popular and well researched method for discovering interesting relations between variables in large databases. Most of the algorithms for mining quantitative association rule for finding frequent item set of positive item sets. Most of the

Data Mining Techniques for E-Business Service using
free download
Abstract: With the increasing competition and changing demands of customer it is difficult to satisfy the people with different backgrounds. E-business with its various advantages of low cost, high efficiency, time saving has assumed importance these days. But it is confronted

Building a Data Mining Model using Data Warehouse and OLAP Cubes
free download
A data warehouse is a centralized repository that stores data from multiple information sources and transforms them into a common, multidimensional data model for efficient querying and analysis. OLAP and Data Mining are two complementary technologies for

Review on Intrusion Detection System based on Data Mining Techniques
free download
Abstract:Enterprise networked systems are exposed to the increasing threats posed by malicious users as well as hackers which are internal to a network. Through monitoring unusual user activity, illegitimate use is detected. This can be achieved by an Intrusion

Improving Customer Relationship Management Using Data Mining
free download
Abstract:Customer Relationship Management (CRM) possesses Business Intelligence by incorporating information acquisition, decision support functions and information storage to provide customized customer service. CRM enables to analyze and classify data in order

A Novel Data Classification with Anonymity Method for Privacy Preserving in Medical Data Mining
free download
Abstract:Data mining is the process of extracting interesting patterns or knowledge from huge amount of data. The privacy preserving in data mining comes into picture for security. K- Anonymity is one of the easy and efficient techniques to achieve privacy preserving for

free download
Abstract: The term data mining refers to the finding of relevant and helpful data from a database. Data mining scours databases for hidden patterns, finding predictive information and trends that experts may miss, as it goes past their desires. When implementation on a high

Data Mining: Current Applications & Trends
free download
Abstract-Data mining is used in many areas. There are many data mining systems available and yet there are many challenges in this field. Data Mining helps us in future forecast of market trends and can itself do decision making. In order to analyze patterns and rules

Possibility of Preventing Cancer using Big Data Analytics and Data Mining Technology
free download
Abstract: This paper presents a case study of how Big Data analysis and Data Mining technology can provide an indicative solution to humanity for early detection and prevention

Study and Analysis of Rule Mining from Heterogeneous Applications in Data Mining
free download
Abstract-In this work we have identified the several issues and efficient approaches when we apply association rule mining is one of the data mining techniques for distributed data sets that are located on the distributed environments. In data mining, there are various

The taming of the data: Using text mining in building a corpus for diachronic analysis
free download
Social and historical linguistic studies benefit from corpora encoding contextual metadata (eg time, register, genre) and relevant structural information (eg document structure). While small, handcrafted corpora control over selected contextual variables (eg the Brown/LOB

Intelligent Traffic Signal System using Classification of Vehicular Traffic Using Lazy and Function Family Data Mining Classifiers
free download
Abstract:This paper witnesses a novel Vehicle Traffic signal switching methodology based on the estimated density of the traffic over a road. Depending on the density of the traffic the signal stretch would be altered dynamically. This paper takes yet one more progressive

An optimized distributed association rule mining based privacy prevention using frequent pattern mining of data
free download
Abstract: Reproached by developments such as cloud computing and data warehousing, there has been considerable interest in providing data mining as a service. The company lacking in experience or processing resources can outsource it s to the another

A Review on Mining Students' Data for Performance Prediction
free download
Abstract: A country's growth is strongly measured by the quality of its education system. Education sector has witnessed sea change in its functioning. Today it is recognized as an industry and as an industry it is facing challenges. The challenges of higher education

Bayesian Linear Regression in Data Mining
free download
Mining and methods substantially differ from the new trend of Data Mining From a Statistical perspective Data Mining can be viewed as computer automated exploratory data analysis of large complex data sets. Despite the obvious connections between data mining and

Impact of Climate Change in Agriculture with Data Mining Concepts
free download
Abstract-Crop management of a given agricultural region depends on the climatic conditions of that region, because climate can have a huge impact on crop productivity. Real time weather data can help to attain good crop management. This work surveys

Data Mining Applied to Customer Categorization Based on Load Profiling
free download
Abstract-Load Profiling, a method where load consumption patterns of different electricity consumers are identified using the daily/monthly load curves is used in Distribution System planning activities like peak load management and time of use tariff. The load profiling

Information Security in Large Amount of Data: Privacy and Data Mining
free download
ABSTRACT Now a days, In our day to day life, development in data mining becomes very much popular. But, growing popularity and development in data mining technologies brings serious threat to the security of individual's sensitive information. So, To avoid access to

A Survey on Data Mining Algorithms
free download
Abstract: Data mining is multidisciplinary field of computer science. It is the process of recognizing patterns from massive data sets, which is big data. It contains the approaches of Machine Learning, database systems, artificial intelligence and statistics. A data mining

free download
ABSTRACT Data leakage means sending confidential data to an unauthorized person. Nowadays, identifying confidential data is a big challenge for the organizations. We developed a system by using data mining techniques, which identifies confidential data of

Survey on web crime detection using data mining technique
free download
Abstract Crime analysis is a law enforcement function that involves systematic analysis for identifying and analyzing patterns and trends in crime and disorder. Information on patterns can help law enforcement agencies deploy resources in a more effective manner, and

Data Mining Historical Newspaper Metadata
free download
Abstract: In this age of Big Data this paper describes how the state-of-the-art OLR (optical layout recognition) technique in one of the largest heritage press digitization projects in Europe (www. europeananewspapers. eu, 2012-2015) was used in a data mining

free download
ABSTRACT Mobile phone customers are nowadays increased. People are not communicated only through voice. Recent user is using their handset to make purchases from retail stores, conduct personal Banking make reservations as well as to view sports,

A Review on Road Accident in Traffic System using Data Mining Techniques
free download
Abstract: Road traffic accidents are a major public health concern, resulting in an estimated 1.2 million deaths and 50 million injuries worldwide each year. In the developing world, road traffic accidents are among the leading cause of death and injury. The objective of this

Detection of Lung Cancer through Image Data Mining Techniques-A Survey
free download
ABSTRACT According to the World Health Organization, 7.6 million deaths globally each year are caused by cancer. Cancer represents 13% of all global deaths. The National Cancer Institute stated that, by the end of 2015 there will be 221,200 new lung cancer

free download
ABSTRACT: In deploying data mining into the real world business scenarios, organizational factors, user preferences and business needs. However, the current data mining algorithms and tools often stop at the delivery of patterns satisfying expected technical

Survey paper on Prediction of Heart Disease Using Data Mining Technique
free download
Abstract-Prediction of heart disease is most complicated and challenging task in the field of medical science. Heart disease is the most threatening one among various diseases as it can not be detected easily. Bad clinical decisions would cause death of a patient. In our

free download
ABSTRACT The health sector has witnessed a great evolution following the development of new computer technologies, and that pushed this area to produce more medical data, which gave birth to multiple fields of research. Many efforts are done to cope with the explosion

free download
Abstract: The data mining techniques and applications are marked as important field in the data mining which we used in this paper. These are very helpful to understand the concept of data mining. Even beginners can understand it very easily. The concept of data mining

A Survey on Data Mining in Big Data
free download
Abstract: Collection of large and complex data is termed as big data. Tons of data are collected in applications such as medical processing, weather reporting, digital libraries, etc. and these data should be managed. Also they contain large amount of varying data such

Efficient Statistical Techniques and Mathematical Models for Data Mining
free download
Abstract: This research paper is intended to serve as an overview of a rapidly emerging research and application area in data mining. Data Mining is one of the most important phases of the knowledge discovery in database activity. So, in order to extract the

Privacy-Preserving in Data Mining using Anonymity Algorithm for Relational Data
free download
Abstract: Data mining is the process of analyzing data from different perspectives. To summarize it into useful information, we can consider several algorithms. To protect data from unauthorized user in this case is a problem to solve. Access control mechanisms

Improved Approach For Infrequent Weighted Itemsets in Data Mining
free download
Abstract:Weighted item-set mining is used to find the profitable connection between the data. There are two types of items contained in dataset ie frequent and infrequent. Infrequent item-sets are nothing but items which are rarely found in database. Mining frequent items

Different Comparisons for Text Data Mining
free download
Abstract:Content information digging ought to be valuable for foreseeing new advances and new uses for existing advances, seeing that one can endeavor to associate reciprocal bits of data crosswise over two unique areas, or subsets, of the exploratory writing. The

Predicting Thyroid Disease using Linear Discriminant Analysis (LDA) Data Mining Technique
free download
ABSTRACT Thyroid disease is very common disease in human. Nowadays most of the women suffering from thyroid disease than male. There are two types in thyroid disease like hypothyroid and hyperthyroid disease. These diseases giving many side effects such as

Analytical Study for Pattern Mining in E-Commerce Data
free download
Abstract: If Analytical Study for Pattern Mining of E-Commerce did not exist then one would not be able to extract interesting patterns for business boost. DDoS is Detection System,[CBR] is Case Based Reasoning,[IDS] is Intrusion Detection System are few of the

Data Mining as a tool for Knowledge Discovery
free download
October 2015 Subject of research Application of datamining techniques to messages of users who intend to commit suicide. DataMining as a tool for Knowledge Discovery Fotini Kolokathi, Vasilis Stavrou {p3090088@dias.aueb.gr stavrouv@aueb.gr}

free download
ABSTRACT Data mining application in healthcare today for prediction and decision making is great, because healthcare sector is rich with information, and data mining is becoming a necessity. Healthcare organizations produce and collect large volumes of information on

Refinement and Coarsening of Online-Offline Data Mining Methods with Sparse Grids
free download
Abstract This thesis deals with an adaptive sparse grid density estimation method. Given a datastream, eg via Online Data Mining, an Offline/Online splitting corresponding to this density estimation approach is introduced. To obtain the density declaring coefficients a

CRSA Cryptosystem Based Secure Data Mining Model for Business Intelligence Applications
free download
Abstract:In the present competitive scenario, business intelligence (BI) applications play a very significant role for organizational decision support system (DSS). These BI applications need huge data transactions and information sets along with certain efficient data mining

Exploring Practical Data Mining Techniques at Undergraduate Level
free download
Abstract: Data mining is referred to as the process of analyzing and extracting patterns embedded in large amounts of data by using various methods from machine learning, pattern recognition, statistics and database management. With the rapid proliferation of

Machine Learning and Data Mining with Apache Spark
free download
Abstract. This paper deals with the concepts of machine learning and data mining social networks, which are increasingly useful for businesses to know the consumers' sentiment towards their brand. This project, intended for use by engineers at Orange France, focuses

Relative Analysis of Data Mining Tools with Performance Assessment of Database System
free download
ABSTRACT Data mining, the withdrawal of hidden predictive, analytical information from huge databases, is superior new technology with immense potential to help companies focus on the most vital raw material in their data warehouses. In these days, huge amount

Introduction to Databases and Data Mining
free download

free download
Abstract: The expeditious development of information technology and adaptability of service technologies created a regime in various fields appreciably. The growing interest in usage of data for analysis has brought an indispensable enhancement in data mining field. Data

A Survival Study on Density Based Clustering Algorithms for Large Spatial Databases in Data Mining
free download
Abstract Density based clustering algorithm is the primary methods for clustering in data mining. The clusters which are formed based on the density are easy to understand. It does not limit itself to the shapes of clusters. This paper gives a survey of the existing density

free download
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Feature selection is one of the important techniques in data mining. It is used for selecting the relevant features and removes the redundant features in dataset.

Data Mining With Predictive Analytics for Financial Applications
free download
Abstract This paper describes data mining with predictive analytics for financial applications and explores methodologies and techniques in data mining area combined with predictive analytics for application driven results for financial data. The basic idea is to apply patterns

IE 485-Introduction to Data Mining-R Tutorial
free download
The document starts with discussions of databases that we use in our course. Later chapters on visualization, descriptive modeling, fundamental statistics, prescriptive modeling (will) take place. In this course we will take an integrated approach to data mining applications.

Classifying Office Plug Load Appliance Events in the context of NILM using Time-series Data Mining
free download
Abstract Smart building energy management requires knowledge of individual appliance operation from reduced metering points. The key purpose of this study is to present a classification framework for offices that can help discover individual appliances and its

Review on State of Art Data Mining and Machine Learning Techniques for Intelligent Airport Systems
free download
ABSTRACT It is a generally accepted fact that the Airport is the focal point of the country which creates a lasting impression of its people. The challenge faced by airports today is the complexity of players and processes, and the inability of multiple systems to share and

A Study on Data Mining Approaches for Agricultural Intelligence
free download
Abstract: Agricultural intelligence is a specific and emerging field of intelligence dedicated to an enhanced understanding of cultivation, productivity of crop, and minimized risk associated agriculture. Crop prediction is an important agricultural problem. To address

free download
ABSTRACT Nowadays, There are many risks related to bank loans, for the bank and for those who get the loans. The analysis of risk in bank loans need understanding what is the meaning of risk. In addition, the number of transactions in banking sector is rapidly

Measuring community influence: a data mining approach to Eastern European blogosphere Mclean N.(Commonwealth of Australia) -
free download
Abstract: We use data mining techniques (in particular, social network analysis approach) to measure the influence and 'community spirit' coefficient of various online groups in the Eastern European blogosphere. The purpose of this paper is to demonstrate how we collect, clean

Data Mining Approach and Its Application to Dresses Sales Recommendation
free download
Abstract: The fundament of data mining (DM) is to analyse data from various points of view, classify the data and summarize it. DM has begun to be widespread in every and each application. Although we have huge magnitude of data, but we do not have helpful

Help the Society in Selecting Their Best Life Insurance Cover (LIC) Using Data Mining Technique
free download
ABSTRACT Investment is the key for making money. Investing small amount of money will give one large benefits and also help to improve an individual's lifestyle. Today in India, the life insurance sector is increasing so rapidly that most of the people are interested to

Work In-progress: Mining the Student Data for Fitness
free download
Abstract. Data mining-driven agents are often used in applications such as waiting times estimation or traffic flow prediction. Such approaches often require large amounts of data from multiple sources, which may be difficult to obtain and lead to incomplete or noisy

Data mining IEOR Fall 2016
free download
The goal of the course is to present some fundamental techniques used in data mining and machine learning nowadays. The course will show the students those techniques in a way that will enable them to quickly start working on many problems on the frontier of modern


Data Mining and Machine Learning Papers

Below are select papers on a variety of topics. The list is not meant to be exhaustive. The papers found on this page either relate to my research interests or are used when I teach courses on machine learning or data mining.

  • General (articles)
    • Data Mining and Statistics: What's the Connection?
    • Data Mining: Statistics and More?, D. Hand, American Statistician, 52(2):112-118.
    • Data Mining, G. Weiss and B. Davison, in Handbook of Technology Management, John Wiley and Sons, expected 2010.
    • From Data Mining to Knowledge Discovery in Databases, U. Fayyad, G. Piatetsky-Shapiro & P. Smyth, AI Magazine, 17(3):37-54, Fall 1996.
    • Mining Business Databases, Communications of the ACM, 39(11): 42-48.
    • 10 Challenging Problems in Data Mining Research, Q. Yang and X. Wu, International Journal of Information Technology & Decision Making, Vol. 5, No. 4, 2006, 597-604. (slides)
  • General (short news articles)
  • General Data Mining Methods and Algorithms
    • Top 10 Algorithms in Data Mining, X. Wu, V. Kumar, J.R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G.J. McLachlan, A. Ng, B. Liu, P.S. Yu, Z. Zhou, M. Steinbach, D. J. Hand, D. Steinberg, Knowl Inf Syst (2008) 14:1-37.
    • Induction of Decision Trees, R. Quinlan, Machine Learning, 1(1):81-106, 1986.
  • Web and Link Mining
    • The Pagerank Citation Ranking: Bringing Order to the Web, L. Page, S. Brin, R. Motwani, T. Winograd, Technical Report, Stanford University, 1999.
    • The Structure and Function of Complex Networks, M. E. J. Newman, SIAM Review, 2003, 45, 167-256.
    • Link Mining: A New Data Mining Challenge, L. Getoor, SIGKDD Explorations, 2003, 5(1), 84-89.
    • Link Mining: A Survey, L. Getoor, SIGKDD Explorations, 2005, 7(2), 3-12.
  • Semi-supervised Learning
    • Semi-Supervised Learning Literature Survey, X. Zhu, Computer Sciences TR 1530, University of Wisconsin -- Madison.
    • Introduction to Semi-Supervised Learning, in Semi-Supervised Learning (Chapter 1) O. Chapelle, B. Scholkopf, A. Zien (eds.), MIT Press, 2006. (Fordham's library has online access to the entire text)
    • Learning with Labeled and Unlabeled Data, M. Seeger, University of Edinburgh (unpublished), 2002.
    • Person Identification in Webcam Images: An Application of Semi-Supervised Learning, M. Balcan, A. Blum, P. Choi, J. Lafferty, B. Pantano, M. Rwebangira, X. Zhu, Proceedings of the 22nd ICML Workshop on Learning with Partially Classified Training Data, 2005.
    • Learning from Labeled and Unlabeled Data: An Empirical Study across Techniques and Domains, N. Chawla, G. Karakoulas, Journal of Artificial Intelligence Research, 23:331-366, 2005.
    • Text Classification from Labeled and Unlabeled Documents using EM, K. Nigam, A. McCallum, S. Thrun, T. Mitchell, Machine Learning, 39, 103-134, 2000.
    • Self-taught Learning: Transfer Learning from Unlabeled Data, R. Raina, A. Battle, H. Lee, B. Packer, A. Ng, in Proceedings of the 24th International Conference on Machine Learning, 2007.
    • An Iterative Algorithm for Extending Learners to a Semisupervised Setting, M. Culp, G. Michailidis, 2007 Joint Statistical Meetings (JSM), 2007.
  • Partially-Supervised Learning / Learning with Uncertain Class Labels
    • Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers, V. Sheng, F. Provost, P. Ipeirotis, in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008.
    • Logistic Regression for Partial Labels, in 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Volume III, pp. 1935-1941, 2002.
    • Classification with Partial labels, N. Nguyen, R. Caruana, in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008.
    • Imprecise and Uncertain Labelling: A Solution based on Mixture Model and Belief Functions, E. Come, 2008 (powerpoint slides).
    • Induction of Decision Trees from Partially Classified Data Using Belief Functions, M. Bjanger, Norwegian University of Science and Technology, 2000.
    • Knowledge Discovery in Large Image Databases: Dealing with Uncertainties in Ground Truth, P. Smyth, M. Burl, U. Fayyad, P. Perona, KDD Workshop 1994, AAAI Technical Report WS-94-03, pp. 109-120, 1994.
  • Recommender Systems
  • Rarity and Class Imbalance

    General resources available on this topic:


    • A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data, G. Batista, R. Prati, and M. Monard, SIGKDD Explorations, 6(1):20-29, 2004.
    • Class Imbalance versus Small Disjuncts, T. Jo and N. Japkowicz, SIGKDD Explorations, 6(1): 40-49, 2004.
    • Extreme Re-balancing for SVMs: a Case Study, B. Raskutti and A. Kowalczyk, SIGKDD Explorations, 6(1):60-69, 2004.
    • A Multiple Resampling Method for Learning from Imbalanced Data Sets, A. Estabrooks, T. Jo, and N. Japkowicz, in Computational Intelligence, 20(1), 2004.
    • SMOTE: Synthetic Minority Over-sampling Technique, N. Chawla, K. Boyer, L. Hall, and W. Kegelmeyer, Journal of Artificial Intelligence Research, 16:321-357, 2002.
    • Generative Oversampling for Mining Imbalanced Datasets, A. Liu, J. Ghosh, and C. Martin, Third International Conference on Data Mining (DMIN-07), 66-72.
    • Learning from Little: Comparison of Classifiers Given Little Training, G. Forman and I. Cohen, in 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, 161-172, 2004.
    • Issues in Mining Imbalanced Data Sets - A Review Paper, S. Visa and A. Ralescu, in Proceedings of the Sixteenth Midwest Artificial Intelligence and Cognitive Science Conference, pp. 67-73, 2005.
    • Wrapper-based Computation and Evaluation of Sampling Methods for Imbalanced Datasets, N. Chawla, L. Hall, and A. Joshi, in Proceedings of the 1st International Workshop on Utility-based Data Mining, 24-33, 2005.
    • C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling beats Over-Sampling, C. Drummond and R. Holte, in ICML Workshop on Learning from Imbalanced Datasets II, 2003.
    • C4.5 and Imbalanced Data sets: Investigating the effect of sampling method, probabilistic estimate, and decision tree structure, N. Chawla, in ICML Workshop on Learning from Imbalanced Datasets II, 2003.
    • Class Imbalances: Are we Focusing on the Right Issue?, N. Japkowicz, in ICML Workshop on Learning from Imbalanced Datasets II, 2003.
    • Learning when Data Sets are Imbalanced and When Costs are Unequal and Unknown, M. Maloof, in ICML Workshop on Learning from Imbalanced Datasets II, 2003.
    • Uncertainty Sampling Methods for One-class Classifiers, P. Juszczak and R. Duin, in ICML Workshop on Learning from Imbalanced Datasets II, 2003.
  • Active Learning
    • Improving Generalization with Active Learning, D. Cohn, L. Atlas, and R. Ladner, Machine Learning 15(2), 201-221, May 1994.
    • On Active Learning for Data Acquisition, Z. Zheng and B. Padmanabhan, In Proc. of IEEE Intl. Conf. on Data Mining, 2002.
    • Active Sampling for Class Probability Estimation and Ranking, M. Saar-Tsechansky and F. Provost, Machine Learning 54(2), 153-178, 2004.
    • The Learning-Curve Sampling Method Applied to Model-Based Clustering, C. Meek, B. Thiesson, and D. Heckerman, Journal of Machine Learning Research 2:397-418, 2002.
    • Active Sampling for Feature Selection, S. Veeramachaneni and P. Avesani, Third IEEE Conference on Data Mining, 2003.
    • Heterogeneous Uncertainty Sampling for Supervised Learning, D. Lewis and J. Catlett, In Proceedings of the 11th International Conference on Machine Learning, 148-156, 1994.
    • Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction, G. Weiss and F. Provost, Journal of Artificial Intelligence Research, 19:315-354, 2003.
    • Active Learning using Adaptive Resampling, KDD 2000, 91-98.
  • Cost-Sensitive Learning
