Data Mining

Publication date: May 2001 | Publisher: Higher Education Press | Author: Jiawei Han | Pages: 550 | Word count: 762,000

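Preface

At the end of the 20th century, information science and technology, represented by computing and communications technology, had a profound effect on the development of the world's economy, military affairs, science and technology, education, culture, and public health. The information industry that arose from it has become a pillar of world economic development. Entering the 21st century, countries have increased funding and policy support to accelerate their own information industries.

To speed up the development of China's information industry, the Outline of the Tenth Five-Year Plan for National Economic and Social Development explicitly calls for "using informatization to drive industrialization, making the most of late-mover advantages, and achieving leapfrog development of social productive forces." International competition in the information industry will grow increasingly fierce. After China's accession to the WTO, its information industry will face serious challenges from foreign competitors. Success or failure in that competition will ultimately depend on the number and quality of people trained in information science and technology.

At the end of the 20th century, China's information industry grew rapidly, but a large gap remains compared with internationally advanced countries. To catch up with and surpass the advanced international level, China must accelerate the training of information technology talent, and in particular train a large number of high-level professionals who can compete internationally, so as to raise the overall level of China's information industry and national informatization. To this end, the Department of Higher Education of the Ministry of Education, acting on the views of Vice Minister Lü Fuyuan and building on its long-standing emphasis on the teaching of information science and technology in higher education institutions, will implement a strategy of developing ahead of demand and take a number of important measures to accelerate teaching in information science and technology and related disciplines. While vigorously publicizing and recommending textbooks written by Chinese experts for information science and technology courses (those oriented toward the 21st century and the key textbooks of the Ninth Five-Year Plan), it will promote, in certain information science and technology courses at institutions that are in a position to do so, the use of reprint editions of outstanding foreign textbooks for teaching in English or bilingually. The aim is to narrow the gap between China's computer education and the advanced international level, and also to help strengthen Chinese university students' English.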

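Synopsis

This book presents the concepts, methods, and applications of data mining (often referred to as knowledge discovery in databases). Starting from an emphasis on data analysis, it introduces the concepts of databases and data mining, characterizes data mining as the automated or convenient extraction of patterns representing knowledge from large databases, data warehouses, and other large information repositories, and reviews currently available commercial products through a general framework. Data mining is an interdisciplinary field that draws on database technology, artificial intelligence, machine learning, neural networks, statistics, pattern recognition, knowledge-based systems, knowledge acquisition, information retrieval, high-performance computing, and data visualization. Written from a database perspective, the book describes the primitives, architectures, features, and methods of data mining systems, with a focus on the feasibility, usefulness, and effectiveness of data mining and on the scalability of pattern discovery in large databases. Chapter by chapter, it covers the concepts and techniques of classification, prediction, association, and clustering. Each topic is accompanied by examples; for each class of problem the best algorithms are presented, along with practice-tested rules of thumb on how to apply the techniques. This approach makes the book highly readable and lets readers learn the field of data mining and follow the latest industry trends. The book is suitable for computer science students, application developers, business professionals, and researchers in related fields.

Contents:
1. Introduction to Data Mining
2. Data Warehouse and OLAP Technology for Data Mining
3. Data Preprocessing
4. Data Mining Primitives, Languages, and System Architectures
5. Concept Description: Characterization and Comparison
6. Mining Association Rules in Large Databases
7. Classification and Prediction
8. Cluster Analysis
9. Mining Complex Types of Data
10. Applications and Trends in Data Mining
Appendix A. Introduction to Microsoft's OLE DB for Data Mining
Appendix B. An Introduction to DBMiner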

About the Author

Jiawei Han is director of the Intelligent Database Systems Research Laboratory and professor in the School of Computing Science at Simon Fraser University. Well known for his research in the areas of data mining and database systems, he has served on progr

Table of Contents

Foreword
Preface
Chapter 1 Introduction
  1.1 What Motivated Data Mining? Why Is It Important?
  1.2 So, What Is Data Mining?
  1.3 Data Mining - On What Kind of Data?
  1.4 Data Mining Functionalities - What Kinds of Patterns Can Be Mined?
  1.5 Are All of the Patterns Interesting?
  1.6 Classification of Data Mining Systems
  1.7 Major Issues in Data Mining
  1.8 Summary
  Exercises
  Bibliographic Notes
Chapter 2 Data Warehouse and OLAP Technology for Data Mining
  2.1 What Is a Data Warehouse?
  2.2 A Multidimensional Data Model
  2.3 Data Warehouse Architecture
  2.4 Data Warehouse Implementation
  2.5 Further Development of Data Cube Technology
  2.6 From Data Warehousing to Data Mining
  2.7 Summary
  Exercises
  Bibliographic Notes
Chapter 3 Data Preprocessing
  3.1 Why Preprocess the Data?
  3.2 Data Cleaning
  3.3 Data Integration and Transformation
  3.4 Data Reduction
  3.5 Discretization and Concept Hierarchy Generation
  3.6 Summary
  Exercises
  Bibliographic Notes
Chapter 4 Data Mining Primitives, Languages, and System Architectures
Chapter 5 Concept Description: Characterization and Comparison
Chapter 6 Mining Association Rules in Large Databases
Chapter 7 Classification and Prediction
Chapter 8 Cluster Analysis
Chapter 9 Mining Complex Types of Data
Chapter 10 Applications and Trends in Data Mining
Appendix A Introduction to Microsoft's OLE DB for Data Mining
Appendix B An Introduction to DBMiner
Bibliography
Index

Excerpt

In this section, we examine a number of different data stores on which mining can be performed. In principle, data mining should be applicable to any kind of information repository. This includes relational databases, data warehouses, transactional databases, advanced database systems, flat files, and the World Wide Web. Advanced database systems include object-oriented and object-relational databases, and specific application-oriented databases, such as spatial databases, time-series databases, text databases, and multimedia databases. The challenges and techniques of mining may differ for each of the repository systems.

Although this book assumes that readers have primitive knowledge of information systems, we provide a brief introduction to each of the major data repository systems listed above. In this section, we also introduce the fictitious AllElectronics store, which will be used to illustrate concepts throughout the text.

A database system, also called a database management system (DBMS), consists of a collection of interrelated data, known as a database, and a set of software programs to manage and access the data. The software programs involve mechanisms for the definition of database structures; for data storage; for concurrent, shared, or distributed data access; and for ensuring the consistency and security of the information stored, despite system crashes or attempts at unauthorized access.

A relational database is a collection of tables, each of which is assigned a unique name. Each table consists of a set of attributes (columns or fields) and usually stores a large set of tuples (records or rows). Each tuple in a relational table represents an object identified by a unique key and described by a set of attribute values. A semantic data model, such as an entity-relationship (ER) data model, which models the database as a set of entities and their relationships, is often constructed for relational databases.

Consider the following example.
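The example the book turns to at this point is not part of this excerpt. As a rough stand-in, the relational ideas described above (a named table, attributes as columns, tuples as rows, and a unique key identifying each tuple) can be sketched with Python's standard `sqlite3` module. The `customer` table, its columns, its rows, and the helper function names are all hypothetical and chosen only for illustration; they are not taken from the book.

```python
import sqlite3


def make_customer_db():
    """Build an in-memory database holding one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    # The table has a unique name ("customer") and three attributes (columns).
    # cust_id is the key: no two tuples may share the same value.
    conn.execute(
        "CREATE TABLE customer ("
        " cust_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " city TEXT NOT NULL)"
    )
    # Each inserted row is one tuple, described by a set of attribute values.
    conn.executemany(
        "INSERT INTO customer (cust_id, name, city) VALUES (?, ?, ?)",
        [(1, "Smith", "Vancouver"), (2, "Lee", "Burnaby")],
    )
    return conn


def lookup_by_key(conn, cust_id):
    """Return the single tuple identified by the key, or None if absent."""
    cursor = conn.execute(
        "SELECT cust_id, name, city FROM customer WHERE cust_id = ?",
        (cust_id,),
    )
    return cursor.fetchone()


def key_is_unique(conn):
    """Try to insert a second tuple with an existing key; report rejection."""
    try:
        conn.execute(
            "INSERT INTO customer (cust_id, name, city) VALUES (?, ?, ?)",
            (1, "Duplicate", "Nowhere"),
        )
    except sqlite3.IntegrityError:
        return True  # the DBMS enforced the key constraint
    return False


if __name__ == "__main__":
    db = make_customer_db()
    print(lookup_by_key(db, 2))
    print(key_is_unique(db))
    db.close()
```

An in-memory SQLite database is used here only because it needs no setup; any relational DBMS would enforce the key constraint in a comparable way.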


User Reviews (1 in total)

The cover is in Chinese, but the content is in English. Confusing.
