
Publication date: 2011-3  Publisher: China Machine Press (机械工业出版社)  Authors: Ricardo Baeza-Yates (Spain), Berthier Ribeiro-Neto (Brazil)  Pages: 913




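  Ricardo Baeza-Yates received his Ph.D. in computer science from the University of Waterloo, Canada. He is currently Vice President of Yahoo! Research for Europe and Latin America, where he leads the Yahoo! research centers in Barcelona (Spain) and Santiago (Chile) and oversees the research center in Haifa. He has served as president of the Chilean Computer Science Society, as director of the Center for Web Research in the Department of Computer Science at the University of Chile, and as an ICREA professor. He also founded the Web Research Group in the Department of Information and Communication Technologies at Pompeu Fabra University in Barcelona. He remains a part-time professor at both the University of Chile and Pompeu Fabra University. His main research interests are algorithms and data structures, information retrieval, user interfaces, and the application of visualization to databases.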


Preface to the Second Edition
Preface to the First Edition
Authors' Acknowledgements to the Second Edition
Authors' Acknowledgements to the First Edition
Publishers' Acknowledgements
1 Introduction
1.1 Information Retrieval
1.1.1 Early Developments
1.1.2 Information Retrieval in Libraries and Digital Libraries
1.1.3 IR at the Center of the Stage
1.2 The IR Problem
1.2.1 The User's Task
1.2.2 Information versus Data Retrieval
1.3 The IR System
1.3.1 Software Architecture of the IR System
1.3.2 The Retrieval and Ranking Processes
1.4 The Web
1.4.1 A Brief History
1.4.2 The e-Publishing Era
1.4.3 How the Web Changed Search
1.4.4 Practical Issues on the Web
1.5 Organization of the Book
1.5.1 Focus of the Book
1.5.2 Book Contents
1.6 The Book Web Site: A Teaching Resource
1.7 Bibliographic Discussion
2 User Interfaces for Search, by Marti Hearst
2.1 Introduction
2.2 How People Search
2.2.1 Information Lookup versus Exploratory Search
2.2.2 Classic versus Dynamic Model of Information Seeking
2.2.3 Navigation versus Search
2.2.4 Observations of the Search Process
2.3 Search Interfaces Today
2.3.1 Getting Started
2.3.2 Query Specification
2.3.3 Query Specification Interfaces
2.3.4 Retrieval Results Display
2.3.5 Query Reformulation
2.3.6 Organizing Search Results
2.4 Visualization in Search Interfaces
2.4.1 Visualizing Boolean Syntax
2.4.2 Visualizing Query Terms within Retrieval Results
2.4.3 Visualizing Relationships Among Words and Documents
2.4.4 Visualization for Text Mining
2.5 Design and Evaluation of Search Interfaces
2.6 Trends and Research Issues
2.7 Bibliographic Discussion
3 Modeling
3.1 IR Models
3.1.1 Modeling and Ranking
3.1.2 Characterization of an IR Model
3.1.3 A Taxonomy of IR Models
3.2 Classic Information Retrieval
3.2.1 Basic Concepts
3.2.2 The Boolean Model
3.2.3 Term Weighting
3.2.4 TF-IDF Weights
3.2.5 Document Length Normalization
3.2.6 The Vector Model
3.2.7 The Probabilistic Model
3.2.8 Brief Comparison of Classic Models
3.3 Alternative Set Theoretic Models
3.3.1 Set-Based Model
3.3.2 Extended Boolean Model
3.3.3 Fuzzy Set Model
3.4 Alternative Algebraic Models
3.4.1 Generalized Vector Space Model
3.4.2 Latent Semantic Indexing Model
3.4.3 Neural Network Model
3.5 Alternative Probabilistic Models
3.5.1 BM25
3.5.2 Language Models
3.5.3 Divergence from Randomness
3.5.4 Bayesian Network Models
3.6 Other Models
……
4 Retrieval Evaluation
5 Relevance Feedback and Query Expansion
6 Documents: Languages & Properties
7 Queries: Languages & Properties
8 Text Classification
9 Indexing and Searching
10 Parallel and Distributed IR
11 Web Retrieval
12 Web Crawling
13 Structured Text Retrieval
14 Multimedia Information Retrieval
15 Enterprise Search
16 Library Systems
17 Digital Libraries


  Libraries were among the first institutions to adopt IR systems for retrieving information. Usually, library systems were initially developed by academic institutions and later by commercial vendors. In the first generation, such systems consisted of an automation of existing processes such as card catalog searching, restricted to author names and titles. In the second generation, increased search functionality was added to include subject headings, keywords, and query operators. In the third generation, which is currently being deployed, the focus has been on improved graphical interfaces, electronic forms, hypertext features, and open system architectures.
  Traditional library management system vendors include Endeavor Information Systems Inc., Innovative Interfaces Inc., and EOS International. Among systems developed with a research focus, we distinguish MELVYL, developed by the California Digital Library at the University of California, and the Cheshire system, developed originally at UC Berkeley and lately in cooperation with the University of Liverpool. Further details on these library systems can be found in the chapter on library systems.

IR at the Center of the Stage

  Despite its maturity, until recently, IR was seen as a narrow area of interest restricted mainly to librarians and information experts. Such a tendentious vision prevailed for many years, despite the rapid dissemination, among users of modern personal computers, of IR tools for multimedia and hypertext applications. In the beginning of the 1990s, a single fact changed these perceptions once and for all: the introduction of the World Wide Web.
  The Web, invented in 1989 by Tim Berners-Lee, has become a universal repository of human knowledge and culture. Its success is based on the conception of a standard user interface which is always the same, no matter the computational environment used to run the interface, and which allows any user to create their own documents. As a result, millions of users have created billions of documents that compose the largest human repository of knowledge in history. An immediate consequence is that finding useful information on the Web is not always a simple task and usually requires posing a query to a search engine, i.e., running a search. And search is all about IR and its technologies. Thus, almost overnight, IR has gained a place with other technologies at the center of the stage.
  ……




