Pāṇini
An LLM-powered linguistic feature extraction & analysis framework.
Pāṇini allows you to describe a language's morphology as Rust types, write extraction directives, and let the pipeline handle the rest: prompt assembly, JSON schema generation, LLM orchestration, response parsing, and validation.
- Linguistic Agnosticism
  No universal schema is imposed. Each language defines exactly the features it needs.
- Type-Safe Guarantee
  Morphology definitions are checked at compile time, and JSON schemas are generated from them automatically. Every LLM response is validated against the target schema, with automatic retry and self-correction when validation fails.
- Multi-Language Access
  High-performance Rust core, accessible via the CLI or the Python package (panini-lang).
- Corpus Analysis
  Transform raw extractions into statistical coverage reports and lexical inventories.
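The validate-and-retry behaviour described under Type-Safe Guarantee can be pictured with a small, self-contained sketch. This is not Pāṇini's implementation: `validate`, `extract_with_retry`, `fake_model`, and the truncated `ALLOWED_CASES` set are hypothetical names introduced only for illustration, and the model call is a stub rather than a real LLM client.

```python
import json
from typing import Callable

# Illustrative subset of case values; a real schema would list all of them.
ALLOWED_CASES = {"nominative", "accusative", "genitive"}


def validate(payload: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if payload.get("pos") != "noun":
        errors.append("pos must be 'noun'")
    if not isinstance(payload.get("lemma"), str):
        errors.append("lemma must be a string")
    if payload.get("case") not in ALLOWED_CASES:
        errors.append(f"case must be one of {sorted(ALLOWED_CASES)}")
    return errors


def extract_with_retry(call_model: Callable[[str], str], prompt: str,
                       max_attempts: int = 3) -> dict:
    """Call the model, validate the reply, and feed errors back on failure."""
    errors: list[str] = []
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
        else:
            if isinstance(payload, dict):
                errors = validate(payload)
            else:
                errors = ["top-level value must be an object"]
        if not errors:
            return payload
        # Self-correction: tell the model what was wrong with its last answer.
        current_prompt = (prompt + "\nYour previous answer was rejected:\n- "
                          + "\n- ".join(errors))
    raise ValueError(f"no valid response after {max_attempts} attempts: {errors}")


# Stub model: the first reply has an invalid case value, the second is valid.
_replies = iter([
    '{"pos": "noun", "lemma": "studentka", "case": "subject"}',
    '{"pos": "noun", "lemma": "studentka", "case": "nominative"}',
])
prompts_seen = []


def fake_model(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(_replies)


result = extract_with_retry(fake_model, "Analyze 'studentka'.")
print(result["case"])
```

The second prompt carries the validation errors from the first attempt, which is the essence of the self-correction step: the caller only ever sees a payload that passed validation, or an exception.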
🚀 Panini in 1 Minute
1. Describe Your Language
Model your grammar using standard Rust types. Your types become the "source of truth" for the AI.
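// Each variant describes one part of speech and the features it carries.
// Gender, Case, Aspect, and Tense are types defined elsewhere (not shown here).
pub enum PolishMorphology {
    Noun { lemma: String, gender: Gender, case: Case },
    Verb { lemma: String, aspect: Aspect, tense: Tense },
}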
Generated JSON Schema (Fragment):
{
  "anyOf": [
    {
      "type": "object",
      "properties": {
        "pos": { "const": "noun" },
        "lemma": { "type": "string" },
        "gender": { "$ref": "#/$defs/Gender" },
        "case": { "$ref": "#/$defs/Case" }
      }
    }
  ]
}
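To make the fragment concrete, here is a minimal sketch of how a candidate object could be checked against it. It handles only the keywords that appear in the fragment (`anyOf`, `type`, `properties`, `const`, `$ref`) plus `enum`, and it is not a full JSON Schema validator. The `$defs` contents are an assumption, since the fragment does not show them, and `resolve` and `matches` are hypothetical helper names.

```python
SCHEMA = {
    "anyOf": [
        {
            "type": "object",
            "properties": {
                "pos": {"const": "noun"},
                "lemma": {"type": "string"},
                "gender": {"$ref": "#/$defs/Gender"},
                "case": {"$ref": "#/$defs/Case"},
            },
        }
    ],
    # Assumed definitions: the fragment above does not show `$defs`.
    "$defs": {
        "Gender": {"enum": ["masculine", "feminine", "neuter"]},
        "Case": {"enum": ["nominative", "genitive", "dative", "accusative",
                          "instrumental", "locative", "vocative"]},
    },
}


def resolve(node: dict, root: dict) -> dict:
    """Follow a local '#/$defs/Name' reference, if the node has one."""
    ref = node.get("$ref")
    if ref is None:
        return node
    target = root
    for part in ref.removeprefix("#/").split("/"):
        target = target[part]
    return target


def matches(value, node: dict, root: dict) -> bool:
    """Check `value` against the small keyword subset used in SCHEMA."""
    node = resolve(node, root)
    if "anyOf" in node:
        return any(matches(value, option, root) for option in node["anyOf"])
    if "const" in node:
        return value == node["const"]
    if "enum" in node:
        return value in node["enum"]
    if node.get("type") == "string":
        return isinstance(value, str)
    if node.get("type") == "object":
        if not isinstance(value, dict):
            return False
        # Simplification: every listed property is treated as required.
        return all(key in value and matches(value[key], sub, root)
                   for key, sub in node.get("properties", {}).items())
    return True


good = {"pos": "noun", "lemma": "studentka",
        "gender": "feminine", "case": "nominative"}
bad = {"pos": "noun", "lemma": "studentka",
       "gender": "feminine", "case": "subject"}
print(matches(good, SCHEMA, SCHEMA), matches(bad, SCHEMA, SCHEMA))
```

The `const` on `pos` is what lets a validator pick the right variant of the `anyOf`: an object tagged `"verb"` would fail the noun branch immediately.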
2. Extract!
Use the CLI or Python package to analyze any text.
import panini

result = panini.extract(
    language="pol",
    text="Studentka czyta książkę.",
    targets=["studentka"]
)
Sample Extraction Result:
{
  "morphology": {
    "target_features": [
      {
        "word": "studentka",
        "morphology": {
          "pos": "noun",
          "lemma": "studentka",
          "gender": "feminine",
          "case": "nominative"
        }
      }
    ]
  }
}
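Once a result has this shape, reading it is plain dictionary traversal. The sketch below works on the JSON form of the sample above; whether `panini.extract` returns a dict, a typed object, or a string is not stated here, so the parsing step is an assumption, and `summarize` is a hypothetical helper.

```python
import json

# The sample extraction result from above, as a JSON string.
sample = """
{
  "morphology": {
    "target_features": [
      {
        "word": "studentka",
        "morphology": {
          "pos": "noun",
          "lemma": "studentka",
          "gender": "feminine",
          "case": "nominative"
        }
      }
    ]
  }
}
"""


def summarize(data: dict) -> list[str]:
    """Build one readable line per target word."""
    lines = []
    for feature in data["morphology"]["target_features"]:
        morph = feature["morphology"]
        # Everything except `pos` and `lemma` is a language-specific feature.
        extras = ", ".join(str(value) for key, value in morph.items()
                           if key not in ("pos", "lemma"))
        lines.append(f"{feature['word']}: {morph['pos']} ({extras}), "
                     f"lemma={morph['lemma']}")
    return lines


data = json.loads(sample)
for line in summarize(data):
    print(line)
```

Because the set of features differs per language and per part of speech, the helper iterates over whatever keys are present instead of hard-coding `gender` and `case`.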
3. Analyze the Lexicon
Generate a statistical analysis of the corpus to track coverage and frequency.
Extraction Output Sample (Polish):
{
  "word": "studentka",
  "morphology": {
    "pos": "noun",
    "lemma": "studentka",
    "gender": "feminine",
    "number": "singular",
    "case": "nominative"
  }
}
Aggregation Report Sample:
[NOUN] total: 15
|- case [3/7]: nominative(8), accusative(5), genitive(2)
|- gender [2/3]: feminine(10), masculine(5)
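A report of this shape can be derived from extraction records with simple counting. The following is a rough sketch of the idea, not the framework's aggregation code: `aggregate`, `render`, and the `INVENTORY` table of feature sizes are hypothetical, and the three records are made-up input in the format of the extraction sample.

```python
from collections import Counter, defaultdict

# Assumed inventory sizes, matching the denominators in the report above.
INVENTORY = {"case": 7, "gender": 3}


def aggregate(records: list[dict]) -> dict:
    """Count feature values per part of speech."""
    stats = defaultdict(lambda: {"total": 0, "features": defaultdict(Counter)})
    for record in records:
        morph = record["morphology"]
        entry = stats[morph["pos"]]
        entry["total"] += 1
        for name, value in morph.items():
            if name in INVENTORY:  # skip pos, lemma, and untracked features
                entry["features"][name][value] += 1
    return stats


def render(stats: dict) -> str:
    """Format the counts as '[POS] total: N' followed by one line per feature."""
    lines = []
    for pos, entry in stats.items():
        lines.append(f"[{pos.upper()}] total: {entry['total']}")
        for name in sorted(entry["features"]):
            counts = entry["features"][name]
            values = ", ".join(f"{v}({n})" for v, n in counts.most_common())
            # Coverage: distinct values observed out of the inventory size.
            lines.append(f"|- {name} [{len(counts)}/{INVENTORY[name]}]: {values}")
    return "\n".join(lines)


records = [
    {"word": "studentka", "morphology": {"pos": "noun", "lemma": "studentka",
                                         "gender": "feminine", "case": "nominative"}},
    {"word": "książkę", "morphology": {"pos": "noun", "lemma": "książka",
                                       "gender": "feminine", "case": "accusative"}},
    {"word": "student", "morphology": {"pos": "noun", "lemma": "student",
                                       "gender": "masculine", "case": "nominative"}},
]
print(render(aggregate(records)))
```

The bracketed fraction is the coverage figure: the number of distinct values seen in the corpus over the number of values the language defines for that feature.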
📚 Where to Start?
- Researchers & Linguists: Learn how to use the Python Package.
- Backend Developers: Integrate Panini into your Rust Project.
- Framework Architects: Discover how to Add a Language or create a Custom Component.
About the Name
The framework is named after Pāṇini, the ancient Indian grammarian and author of the Aṣṭādhyāyī, the earliest surviving systematic and formal description of the Sanskrit language.
Note: The Pāṇini framework is a technical tool for feature extraction. It does not follow the "Paninian Framework" notation used in traditional NLP (e.g., Pāṇinian Syntactico-Semantic Relation Labels).