Kylm: Kyoto Language Modeling toolkit
Kylm Ranking & Summary
- License: LGPL
- Price: FREE
- Publisher Name: Graham Neubig
- Publisher web site:
- Operating Systems: Mac OS X
- File Size: 64 KB
Kylm Description
Kyoto Language Modeling toolkit

Kylm is a free and open source language modeling toolkit written in Java for natural language processing applications. Kylm can handle character-by-character modeling of unknown words, language model combination, comparison, and evaluation, as well as a number of smoothing techniques.

Here are some key features of "Kylm":
· Two programs, CountNgrams and CrossEntropy
· Support for n-gram language models and several smoothing techniques (Maximum Likelihood, Good-Turing, Absolute Discounting, Witten-Bell, Kneser-Ney)
· Support for input/output in ARPA and binary format
· JUnit scripts to perform regression tests

Requirements:
· Java

What's New in This Release:
· Support for class-based language models
· Support for improved Kneser-Ney smoothing

Ver. 0.0.3 (6/22/2009)
· Fixed the creation of empty strings when multiple white spaces exist
· A variety of speed and memory improvements (removal of linked lists, indexing of the root node)
· Added support for character-based modeling of unknown words
· Fixed trimming so it works with Good-Turing smoothed models
· Fixed a problem when piping data in to CountNgrams
· Fixed a problem with WFST output that was killing beginning-of-sentence context

Ver. 0.0.2 (5/28/2009)
· New Features:
  It is now possible to trim n-grams by count
  A set vocabulary list can be used to limit the vocabulary
  Output in AT&T WFST format is possible
  Documentation has been improved
· Outstanding Issues:
  Trimming does not work in tandem with Good-Turing smoothing
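
As a rough illustration of the two tasks named in the feature list (counting n-grams and measuring cross-entropy), here is a minimal Java sketch of a bigram model with maximum likelihood estimates. It is not Kylm's code or API: the class `BigramSketch`, its methods, and the `<s>` and `</s>` boundary symbols are hypothetical names chosen for this example, and it assumes whitespace-separated tokens.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not part of Kylm.
public class BigramSketch {
    // How often each word was seen as a context (the first word of a bigram).
    private final Map<String, Integer> contextCounts = new HashMap<>();
    // How often each "context word" pair was seen.
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    // Splits on runs of whitespace (so repeated spaces do not create
    // empty tokens) and appends an end-of-sentence symbol.
    private static String[] tokenize(String sentence) {
        String trimmed = sentence.trim();
        String[] words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        String[] padded = Arrays.copyOf(words, words.length + 1);
        padded[words.length] = "</s>";
        return padded;
    }

    // Adds the bigrams of one sentence to the counts.
    public void addSentence(String sentence) {
        String prev = "<s>";
        for (String word : tokenize(sentence)) {
            contextCounts.merge(prev, 1, Integer::sum);
            bigramCounts.merge(prev + " " + word, 1, Integer::sum);
            prev = word;
        }
    }

    // Maximum likelihood estimate: count(prev, word) / count(prev).
    // Returns 0.0 for anything unseen in training.
    public double probability(String prev, String word) {
        int context = contextCounts.getOrDefault(prev, 0);
        if (context == 0) {
            return 0.0;
        }
        return bigramCounts.getOrDefault(prev + " " + word, 0) / (double) context;
    }

    // Cross-entropy in bits per token (end-of-sentence symbols included).
    // Infinite if any test bigram has zero probability; NaN for an empty list.
    public double crossEntropy(List<String> sentences) {
        double log2Sum = 0.0;
        long tokens = 0;
        for (String sentence : sentences) {
            String prev = "<s>";
            for (String word : tokenize(sentence)) {
                log2Sum += Math.log(probability(prev, word)) / Math.log(2.0);
                tokens++;
                prev = word;
            }
        }
        return -log2Sum / tokens;
    }

    public static void main(String[] args) {
        BigramSketch lm = new BigramSketch();
        lm.addSentence("a b a b");
        lm.addSentence("a b");
        System.out.println(lm.probability("a", "b"));
        System.out.println(lm.crossEntropy(Arrays.asList("a b")));
    }
}
```

With unsmoothed maximum likelihood estimates, any test bigram that never occurred in training gets probability zero and the cross-entropy becomes infinite. Smoothing techniques such as those listed above (Good-Turing, Absolute Discounting, Witten-Bell, Kneser-Ney) exist to reserve probability mass for such unseen events.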