Lingua::ZH::WordSegmenter

Lingua::ZH::WordSegmenter is a Perl module for segmenting Simplified Chinese text into words.

Lingua::ZH::WordSegmenter Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Zhang Jun
  • Publisher web site: http://search.cpan.org/~jzhang/Lingua-ZH-WordSegmenter-0.01/lib/Lingua/ZH/WordSegmenter.pm

Lingua::ZH::WordSegmenter Description

Lingua::ZH::WordSegmenter is a Perl module for segmenting Simplified Chinese text into words.

SYNOPSIS

    use Encode;    # provides encode()
    use Lingua::ZH::WordSegmenter;

    my $segmenter = Lingua::ZH::WordSegmenter->new();
    print encode('gbk', $segmenter->seg($_) );

This is a Perl version of Simplified Chinese word segmentation. The algorithm searches for the longest word at each point from both the left and the right, and chooses the candidate with the higher frequency product.

The original program is the CPAN module Lingua::ZH::WordSegment (http://search.cpan.org/~chenyr/). I made the following changes:

1) made the interface object-oriented;
2) made the internal strings UTF-8;
3) used Sogou's dictionary (http://www.sogou.com/labs/dl/w.html) as the default dictionary.

Requirements:

  • Perl
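The segmentation algorithm described above (longest match from the left and from the right, then a choice by frequency product) can be illustrated with a small standalone sketch. This is not the module's source code. The dictionary, its frequency values, the `$UNK_FREQ` fallback, and the helper names `forward_match`, `backward_match`, `score`, and `segment` are all assumptions made for illustration, and the sketch compares the two candidates over the whole string, which may be coarser than what the module actually does.

```perl
use strict;
use warnings;
use utf8;

# Toy dictionary: word => relative frequency (made-up values, for illustration only).
my %freq = (
    '研究'   => 0.0020,
    '研究生' => 0.0005,
    '生命'   => 0.0010,
    '起源'   => 0.0004,
);
my $MAX_LEN  = 3;       # length of the longest dictionary entry, in characters
my $UNK_FREQ = 1e-6;    # assumed frequency for pieces missing from the dictionary

# Greedy longest match, scanning left to right.
sub forward_match {
    my ($text) = @_;
    my @words;
    my $pos = 0;
    my $n   = length $text;
    while ($pos < $n) {
        my $len = $MAX_LEN < $n - $pos ? $MAX_LEN : $n - $pos;
        # Shrink the window until it is a known word or a single character.
        $len-- while $len > 1 && !exists $freq{ substr($text, $pos, $len) };
        push @words, substr($text, $pos, $len);
        $pos += $len;
    }
    return @words;
}

# Greedy longest match, scanning right to left.
sub backward_match {
    my ($text) = @_;
    my @words;
    my $end = length $text;
    while ($end > 0) {
        my $len = $MAX_LEN < $end ? $MAX_LEN : $end;
        $len-- while $len > 1 && !exists $freq{ substr($text, $end - $len, $len) };
        unshift @words, substr($text, $end - $len, $len);
        $end -= $len;
    }
    return @words;
}

# Product of the word frequencies of one candidate segmentation.
sub score {
    my $product = 1;
    $product *= ($freq{$_} // $UNK_FREQ) for @_;
    return $product;
}

# Keep whichever candidate has the higher frequency product.
sub segment {
    my ($text) = @_;
    my @fwd = forward_match($text);
    my @bwd = backward_match($text);
    return score(@fwd) >= score(@bwd) ? @fwd : @bwd;
}

binmode STDOUT, ':encoding(UTF-8)';
print join(' ', segment('研究生命起源')), "\n";
```

With this toy dictionary, the left-to-right pass yields 研究生 / 命 / 起源 and the right-to-left pass yields 研究 / 生命 / 起源. The second candidate has the higher frequency product, so the script prints "研究 生命 起源".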

