English translation: Web is a vast resource of information, but its representa

Source: 学生作业帮 | Editor: 作业帮 | Category: English homework | Date: 2024/07/15 00:58:55
English translation
The Web is a vast resource of information, but its representation limits its availability; the main information in a web page is always hidden among unimportant features such as unnecessary images and extraneous links, and this makes it difficult for users to acquire the topical information. Information extraction can help users locate the information of interest. A new extraction methodology based on DOM is proposed: DOM trees are transformed into STU-DOM trees, which are then processed with several algorithms. An STU-DOM tree can be viewed as a DOM tree with some semantic contextual attributes. The key algorithm filters and prunes the STU-DOM tree. It can automatically and accurately extract the useful and relevant content from HTML documents. This approach is a universal method, which is independent of document structures and domains. Unlike most approaches, it maintains the structure as well as the content. Hence the approach is significant and reliable. It can be widely applied to web browsing on handheld devices, such as PDAs and mobile phones, and to retrieval systems.
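The Internet is a huge information resource, but the way it is presented limits its usability; the main information in a web page is always hidden among unimportant elements, such as unnecessary images and extraneous links, which makes it hard for users to obtain the relevant material. Information extraction technology can help users find the information they are interested in. A new DOM-based extraction method is proposed, which converts DOM trees into STU-DOM trees and then processes them with several algorithms. An STU-DOM tree can be regarded as a DOM tree with some semantic contextual attributes. The key algorithm filters and prunes the STU-DOM tree. It can automatically and accurately extract useful and relevant content from HTML documents. The approach is a general method that is independent of document structure and domain. Unlike most approaches, it preserves both structure and content. The method is therefore significant and reliable. It can be widely used for web browsing on handheld devices, such as PDAs and mobile phones, and in retrieval systems.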
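
The passage describes annotating a DOM tree with extra attributes and then filtering and pruning it, but it does not say which attributes an STU-DOM tree carries or what the pruning rule is. The Python sketch below is therefore only an illustration of the general idea, not the authors' algorithm. The per-node attributes (`text_len`, `link_len`), the link-density threshold, the `NOISE_TAGS` list, and every function name are assumptions made for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
# Assumed list of elements treated as noise outright.
NOISE_TAGS = {"script", "style", "img", "iframe"}


class Node:
    def __init__(self, tag, parent=None, text=""):
        self.tag = tag          # element name, or "#text" for a text node
        self.parent = parent
        self.children = []
        self.text = text        # only used by "#text" nodes
        self.text_len = 0       # assumed attribute: characters of text in the subtree
        self.link_len = 0       # assumed attribute: characters of link text in the subtree


class TreeBuilder(HTMLParser):
    """Builds a simple DOM-like tree from HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(Node("#text", self.current, data))


def annotate(node, inside_link=False):
    """Attach the assumed contextual attributes to every node, bottom-up."""
    inside_link = inside_link or node.tag == "a"
    own = len(node.text.strip()) if node.tag == "#text" else 0
    node.text_len = own
    node.link_len = own if inside_link else 0
    for child in node.children:
        annotate(child, inside_link)
        node.text_len += child.text_len
        node.link_len += child.link_len


def prune(node, max_link_density=0.5, inside_link=False):
    """Drop noise tags, empty subtrees, and containers that are mostly link text."""
    kept = []
    for child in node.children:
        if child.tag in NOISE_TAGS or child.text_len == 0:
            continue
        is_link = inside_link or child.tag == "a"
        # Links inside kept content survive; link-heavy containers do not.
        if not is_link and child.link_len / child.text_len > max_link_density:
            continue
        prune(child, max_link_density, is_link)
        kept.append(child)
    node.children = kept


def extract_text(node):
    if node.tag == "#text":
        return node.text.strip()
    parts = (extract_text(child) for child in node.children)
    return " ".join(p for p in parts if p)


def extract_main_text(html, max_link_density=0.5):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    annotate(builder.root)
    prune(builder.root, max_link_density)
    return extract_text(builder.root)


SAMPLE = """<html><body>
<div id="nav"><ul><li><a href="/">Home</a></li><li><a href="/news">News</a></li></ul></div>
<img src="banner.png">
<div id="main"><h1>Title</h1><p>The main story with one <a href="/ref">reference</a> inside.</p></div>
<script>var x = 1;</script>
</body></html>"""

print(extract_main_text(SAMPLE))
```

On the sample page, the navigation block is removed because all of its text is link text, the image and script are removed as noise, and the main block is kept with its structure intact, including the inline link. Because pruning only deletes subtrees, the surviving nodes keep their original parent-child relationships, which is one way to read the passage's claim that structure is maintained along with content.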