WebHarvest

Free and open source web data extraction tool written in Java

WebHarvest Ranking & Summary

  • License: Freeware
  • Price: FREE
  • Publisher Name: Vladimir Nikic
  • Operating Systems: Mac OS X
  • File Size: 6.3 MB

WebHarvest Description

Web-Harvest offers a way to collect desired Web pages and extract useful data from them. To do that, it leverages well-established techniques and technologies for text/XML manipulation, such as regular expressions, XQuery and XSLT.

Web-Harvest mainly focuses on HTML/XML-based web sites, which still make up the vast majority of Web content. It can also be easily supplemented with custom Java libraries to augment its extraction capabilities.

NOTE: WebHarvest is licensed and distributed under the terms of the BSD License.

Requirements:

  • Java

What's New in This Release:

  • A GUI is introduced.
  • The html-to-xml processor exposes attributes for controlling the cleaner's behaviour.
  • More scripting languages and features are supported.
  • Access to HttpClient at runtime is supported.
  • A number of other improvements and fixes.
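
To give a feel for two of the techniques the description names (regular expressions and XSLT), here is a minimal Java sketch that uses only the standard JDK library. It does not use Web-Harvest's own configuration format or API; the class name ExtractionSketch, the sample PAGE fragment and both method names are hypothetical and exist only for illustration. It also assumes the input is already well-formed XML, which real HTML pages usually are not until they have been cleaned.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ExtractionSketch {

    // Hypothetical sample input: a small, well-formed XHTML-like fragment.
    static final String PAGE =
        "<html><body>"
        + "<a href=\"/a.html\">First</a>"
        + "<a href=\"/b.html\">Second</a>"
        + "</body></html>";

    // Regular-expression approach: capture the value of every href attribute.
    static List<String> hrefsByRegex(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = Pattern.compile("href=\"([^\"]*)\"").matcher(html);
        while (m.find()) {
            hrefs.add(m.group(1));
        }
        return hrefs;
    }

    // XSLT approach: emit the text of each <a> element, each followed by ';'.
    static String linkTextByXslt(String xml) throws Exception {
        String xslt =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "<xsl:for-each select=\"//a\">"
            + "<xsl:value-of select=\".\"/><xsl:text>;</xsl:text>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        // Compile the stylesheet with the JDK's built-in XSLT 1.0 processor.
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hrefsByRegex(PAGE));   // the two href values
        System.out.println(linkTextByXslt(PAGE)); // the two link texts
    }
}
```

The regular expression works on raw text and tolerates malformed markup, while the XSLT version needs well-formed XML but can address elements by structure. That trade-off is presumably why a tool of this kind would convert HTML to XML before applying XQuery or XSLT.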

