proycon/colibri-core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
GitHub repository with 130 stars and 20 forks.
Language: C++
Topics: c-plus-plus, python, nlp, ngrams, skipgram, ngram, corpus, linguistics, library, text-processing
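
To make the terms in the description concrete, here is a minimal pure-Python sketch of what n-grams and fixed-size skipgrams are. It does not use Colibri Core or reflect its API or internal representation; the function names `ngrams` and `skipgrams` and the gap marker `GAP` are hypothetical, chosen only for illustration, and dynamic-size gaps are not covered.

```python
from itertools import combinations

# Placeholder for one skipped token; an arbitrary choice for this sketch.
GAP = "{*}"


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams(tokens, n):
    """Return the set of length-n patterns with one or more gaps.

    Each n-gram yields one pattern per non-empty subset of its interior
    positions; the first and last tokens are always kept, so every gap
    has a fixed size and is surrounded by real tokens.
    """
    patterns = set()
    for gram in ngrams(tokens, n):
        interior = range(1, n - 1)
        for k in range(1, len(interior) + 1):
            for skipped in combinations(interior, k):
                patterns.add(tuple(
                    GAP if i in skipped else token
                    for i, token in enumerate(gram)
                ))
    return patterns


tokens = "to be or not to be".split()
bigrams = ngrams(tokens, 2)
trigram_skips = skipgrams(tokens, 3)
```

For the sample sentence, `bigrams` starts with `("to", "be")`, and `trigram_skips` contains patterns such as `("to", "{*}", "or")`, where the middle token has been replaced by a gap. A naive enumeration like this grows quickly with `n`, which is the kind of cost a dedicated, memory-efficient tool is meant to avoid.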