Guile Mecab
Guile bindings for MeCab, yet another part-of-speech and morphological analyzer. The bindings link to libmecab and offer a small API for using MeCab.
Installation
With Guix, installing mecab is straightforward: this repository provides
package definitions for mecab, mecab-ipadic (one possible dictionary for Japanese)
and guile-mecab in the guix directory. For instance, to enter an environment
with all the tools needed to run guile-mecab, run the following from this repository:
guix environment -L guix --ad-hoc guile guile-mecab mecab-ipadic
Usage
Once installed, you can use the bindings by loading (mecab mecab) in a
guile program.
- Scheme Procedure: mecab-version
  Returns MeCab's version number.
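As a minimal sketch of loading the bindings, the snippet below imports the (mecab mecab) module and prints the version. It assumes guile-mecab is on Guile's load path, for instance inside the Guix environment described under Installation.

```scheme
;; Load the bindings; (mecab mecab) is the module name given above.
(use-modules (mecab mecab))

;; Print the version reported by MeCab.
(display (mecab-version))
(newline)
```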
The tagger
The tagger is a global object used to load a dictionary. You can create and remove a tagger by using the following procedures:
- Scheme Procedure: mecab-new-tagger [args '()]
  This procedure creates a new tagger that can be passed to other functions. args is a list of strings that represent the arguments passed to mecab. See mecab's help for a list of accepted arguments.
- Scheme Procedure: mecab-destroy tagger
  This procedure destroys a tagger. Reusing it afterwards will not work, and may lead to a segmentation fault.
- Scheme Procedure: mecab-error tagger
  Returns a string representing the last error message returned by the tagger.
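The life cycle of a tagger can be sketched as follows. This is only an illustration: it assumes a dictionary such as mecab-ipadic is installed where MeCab looks for it by default, and the sample sentence is arbitrary. What mecab-error returns when no error has occurred is not specified here, so the sketch simply prints whatever string comes back.

```scheme
(use-modules (mecab mecab))

;; Create a tagger with no extra arguments, so that MeCab falls back to
;; its default configuration (assumption: a default dictionary is installed).
(define tagger (mecab-new-tagger '()))

;; Use the tagger; mecab-parse-to-str is described in the next section.
(display (mecab-parse-to-str tagger "すもももももももものうち"))

;; Show the last error message recorded by the tagger, whatever it is.
(display (mecab-error tagger))
(newline)

;; Destroy the tagger once done; it must not be used after this point.
(mecab-destroy tagger)
```

Keeping creation and destruction in the same part of the program makes it easier to avoid reusing a destroyed tagger.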
The analysis
To analyse a sentence, you can use one of the following procedures.
- Scheme Procedure: mecab-parse tagger str
  Parses a sentence str with tagger and returns a node.
- Scheme Procedure: mecab-parse-to-str tagger str
  Parses a sentence str with tagger and returns a string containing the result. The resulting string can be affected by options such as -O or --node-format.
- Scheme Procedure: mecab-split tagger str
  Returns the list of words in str.
- Scheme Procedure: mecab-features tagger str
  Returns the list of features associated with each word in str, in the same order as mecab-split. Note that the strings are always in CSV format, and this cannot be affected by the -O, --node-format etc. options.
- Scheme Procedure: mecab-words tagger str
  Returns the list of words in dictionary form that are present in str, leaving out punctuation and declension.
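The list-returning procedures can be combined as in the sketch below, which prints each word next to its feature string and then the dictionary forms. It assumes a default dictionary is installed and that mecab-split and mecab-features return lists of equal length, as their descriptions suggest. The sentence is an arbitrary example, and the exact output depends on the dictionary in use.

```scheme
(use-modules (mecab mecab))

(define tagger (mecab-new-tagger '()))
(define sentence "今日は良い天気です。")

;; Print each surface word next to its CSV feature string. The two lists
;; are documented to be in the same order.
(for-each
 (lambda (word features)
   (format #t "~a\t~a~%" word features))
 (mecab-split tagger sentence)
 (mecab-features tagger sentence))

;; Print the dictionary forms, which leave out punctuation.
(format #t "~s~%" (mecab-words tagger sentence))

(mecab-destroy tagger)
```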
Nodes
Nodes can be manipulated with the following procedures.
- Scheme Procedure: node-feature node
  Return the feature string of the given node.
- Scheme Procedure: node-surface node
  Return the surface value of the given node, that is the word as it is written in the sentence (not necessarily its dictionary form). Note that this function is not very reliable.
- Scheme Procedure: node-stat node
  Return the type of the given node. The value is a number which is one of MECAB_NOR_NODE, MECAB_UNK_NODE, MECAB_BOS_NODE, MECAB_EOS_NODE or MECAB_EON_NODE.
- Scheme Procedure: node-next node
  Return the node following the given node.
- Scheme Procedure: node-prev node
  Return the node preceding the given node.
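A walk over the nodes of a parsed sentence might look like the sketch below. It rests on several assumptions not stated above: that mecab-parse returns the first node of the sentence, that node-stat returns the raw values from MeCab's C header (where MECAB_BOS_NODE is 2 and MECAB_EOS_NODE is 3), and that the last node has the end-of-sentence type. The names bos-node and eos-node are local helpers for this example, not part of the bindings.

```scheme
(use-modules (mecab mecab))

;; Node type values as defined in MeCab's C header (mecab.h). Defined
;; locally because this sketch does not assume the bindings export them.
(define bos-node 2) ; MECAB_BOS_NODE, beginning of sentence
(define eos-node 3) ; MECAB_EOS_NODE, end of sentence

(define tagger (mecab-new-tagger '()))

;; Follow node-next from the first node until the end-of-sentence node,
;; printing the surface and features of every node in between.
(let loop ((node (mecab-parse tagger "猫が好きです")))
  (let ((stat (node-stat node)))
    (unless (or (= stat bos-node) (= stat eos-node))
      (format #t "~a\t~a~%" (node-surface node) (node-feature node)))
    (unless (= stat eos-node)
      (loop (node-next node)))))

(mecab-destroy tagger)
```

Since node-surface is described as not very reliable, mecab-split may be a safer way to obtain the words themselves when only the surface forms are needed.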