XML/TA library for glib2 and pcre

xmlg 0.7.18


 

 "bow library"

http://www-2.cs.cmu.edu/~mccallum/bow/    - LGPL, plain C

From its front page:
    The library provides facilities for:
    * Recursively descending directories, finding text files.
    * Finding `document' boundaries when there are multiple documents per file.
    * Tokenizing a text file, according to several different methods.
    * Including N-grams among the tokens.
    * Mapping strings to integers and back again, very efficiently.
    * Building a sparse matrix of document/token counts.
    * Pruning vocabulary by word counts or by information gain.
    * Building and manipulating word vectors.
    * Setting word vector weights according to Naive Bayes, TFIDF, 
      and several other methods.
    * Smoothing word probabilities according to Laplace (Dirichlet 
      uniform), M-estimates, Witten-Bell, and Good-Turing.
    * Scoring queries for retrieval or classification.
    * Writing all data structures to disk in a compact format.
    * Reading the document/token matrix from disk in an efficient, 
      sparse fashion.
    * Performing test/train splits, and automatic classification tests.
    * Operating in server mode, receiving and answering queries over a socket.
The library does not:
    * Have English parsing or part-of-speech tagging facilities.
    * Do smoothing across N-gram models.
    * Claim to be finished.
    * Have good documentation.
    * Claim to be bug-free. 
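
A few of the facilities listed above (string/integer mapping, sparse document/token counts, TFIDF weights, Laplace smoothing) can be sketched in a handful of lines. This is a minimal illustration in plain Python, not the bow API: every name here (Vocabulary, count_matrix, tfidf, laplace_prob) is hypothetical, the tokenizer is a naive whitespace split, and the TFIDF variant shown (count times log(N/df)) is only one of several common definitions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately naive)."""
    return text.lower().split()


class Vocabulary:
    """Hypothetical bidirectional string <-> integer mapping."""

    def __init__(self):
        self.word_to_id = {}
        self.id_to_word = []

    def add(self, word):
        # Assign the next free integer to an unseen word.
        if word not in self.word_to_id:
            self.word_to_id[word] = len(self.id_to_word)
            self.id_to_word.append(word)
        return self.word_to_id[word]


def count_matrix(docs, vocab):
    """Sparse document/token counts: one {token_id: count} per document."""
    return [Counter(vocab.add(w) for w in tokenize(d)) for d in docs]


def tfidf(matrix):
    """Weight each count by log(N / document frequency)."""
    n_docs = len(matrix)
    df = Counter(tid for row in matrix for tid in row)
    return [
        {tid: c * math.log(n_docs / df[tid]) for tid, c in row.items()}
        for row in matrix
    ]


def laplace_prob(row, token_id, vocab_size):
    """Add-one (Laplace) smoothed probability of a token in one document."""
    return (row.get(token_id, 0) + 1) / (sum(row.values()) + vocab_size)


docs = ["the cat sat", "the dog sat down", "a bird flew"]
vocab = Vocabulary()
matrix = count_matrix(docs, vocab)
weights = tfidf(matrix)

cat = vocab.word_to_id["cat"]
the = vocab.word_to_id["the"]
# "cat" occurs in one document, "the" in two, so "cat" gets the larger weight.
print(weights[0][cat] > weights[0][the])
# Unseen words still get non-zero probability after smoothing.
print(laplace_prob(matrix[0], vocab.word_to_id["dog"], len(vocab.id_to_word)))
```

Storing each document as a dict keyed by token id keeps the matrix sparse: only tokens that actually occur take up space, which is the same motivation as the sparse document/token matrix mentioned in the list.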


 Isearch

http://www.etymon.com/Isearch/     - BSD, C/C++

 Amberfish

http://www.etymon.com/Amberfish/   - GPL, C/C++