Web Harvesting

Web harvesting automates the process of capturing data from web pages. It can be thought of as a focused, or directed, form of web crawling. Search engines do help in gathering information from web pages, but copying the content by hand and converting it to the required format takes a lot of labor. Web harvesting instead indexes the content on the web that is related to the audience's search terms. Searching is fast because the harvester indexes only the URLs it is directed to, which keeps the index small. The results are also more refined, because the indexed URLs are pre-filtered for a particular topic or area of interest.

The process is started by providing a list of URLs that map to a specific collection or source of information. The hyperlinks associated with these URLs can be ignored or used depending on the type of intended usage.
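
The start-up step described above can be sketched in Python. This is only an illustration under stated assumptions, not how any particular harvesting product works: the names `harvest`, `fetch_page`, `follow_links` and `max_pages` are hypothetical, and the sketch uses only the standard library. It takes a list of seed URLs and either ignores or follows the hyperlinks found on each page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def harvest(seed_urls, follow_links=False, max_pages=50, fetch=fetch_page):
    """Return {url: html} for the seeds and, optionally, the pages they link to."""
    queue = list(seed_urls)
    seen = set()
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        pages[url] = html
        if follow_links:
            collector = LinkCollector()
            collector.feed(html)
            # Resolve relative links against the page they were found on.
            queue.extend(urljoin(url, link) for link in collector.links)
    return pages
```

With `follow_links=False` only the seed pages are collected; with `follow_links=True` the harvest spreads outward from the seeds until `max_pages` is reached. A real harvester would also need error handling, politeness delays and a filter that keeps the crawl on topic.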

Web harvesting can refer to several processes, for example web structure harvesting, web content harvesting and web usage harvesting.

Web content harvesting focuses on a particular aspect of web documents, such as hypertext files, electronic messages, pictures or product pricing. Web usage harvesting collects its data from web servers, with the users' needs in mind, so that user behavior can be better anticipated.
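
As a rough illustration of web usage harvesting, the sketch below counts successful page requests in web server access logs. It assumes the logs follow the Common Log Format, which the article does not specify, and the names `LOG_LINE` and `count_page_requests` are hypothetical.

```python
import re
from collections import Counter

# Assumed format (Common Log Format), for example:
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def count_page_requests(log_lines):
    """Count successful (status 200) requests per requested path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(4) == "200":
            hits[match.group(3)] += 1
    return hits
```

Calling `count_page_requests(lines).most_common(10)` would list the ten most requested pages, which is one simple signal of what users are looking for.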

A robust, feature-rich web harvesting application is Visual Web Ripper. It can extract information from dynamic web pages, including pages built with AJAX, ASP.NET or other technologies rather than plain HTML. Web harvesting tools follow a step-by-step procedure to fetch the desired information. First, the user sets up the harvest using one page from the site from which the data is to be extracted. Visual Web Ripper then repeats this extraction on the rest of the site's specified pages.
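
The "set up on one page, repeat on the rest" idea can be sketched as follows. This is not Visual Web Ripper's interface, which is a separate product; it is a minimal Python illustration in which the template, the field names and the HTML class names are all hypothetical.

```python
import re

# Hypothetical template, written after inspecting one sample page.
# Each field maps to a pattern that captures its value.
TEMPLATE = {
    "name": re.compile(r'<h1 class="product">(.*?)</h1>', re.S),
    "price": re.compile(r'<span class="price">(.*?)</span>', re.S),
}


def extract(html, template=TEMPLATE):
    """Apply the template to one page and return a record of field values."""
    record = {}
    for field, pattern in template.items():
        match = pattern.search(html)
        record[field] = match.group(1).strip() if match else None
    return record


def extract_all(pages, template=TEMPLATE):
    """Repeat the same extraction on every remaining page."""
    return [extract(html, template) for html in pages]
```

Regular expressions are used here only to keep the sketch short. They break easily when page markup changes, so a real tool would more likely work on the parsed page structure.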

The harvested data can be converted to various formats, such as Word, Excel, CSV, XML, plain text or database formats. Business professionals use web harvesting to analyze their competitors and the marketing techniques those competitors use. They can also use it to gather information such as selling prices, competitor information, customer data and financial information of various types. Government agencies also use web harvesting to enforce policies.
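
Two of the output formats mentioned above, CSV and XML, can be produced with Python's standard library. The sketch assumes each harvested record is a dictionary whose keys are valid XML tag names; the function names `to_csv` and `to_xml` are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records, root_tag="records", item_tag="record"):
    """Serialize a list of dict records to an XML string."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for field, value in record.items():
            # Each field becomes a child element holding the value as text.
            ET.SubElement(item, field).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

For example, `to_csv([{"name": "Widget", "price": "9.99"}], ["name", "price"])` returns a header line followed by one data line.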

Author: Tracy Morgan


For more information about web harvesting, please visit www.visualwebripper.com

    Comment from Contact, 2012/11/30 21:13:57:

    Web harvesting is the process of collecting, manipulating and analyzing various sets of information that organizations everywhere use to keep their businesses ahead of their competitors. The collected data is also used for other work, such as financial analysis, blogging and competitive intelligence. Web harvesting techniques have made it much easier to pull out data on competitors, including press releases, prices and, of course, financial data, which is the most vital resource for any corporate house.

This article was published on 2011/02/14