
Code Examples

Explore how Pāṇini handles real-world linguistic data across different languages and analysis modes.

🇹🇷 Turkish: Agglutinative Segmentation

Key Concept: Morpheme Inventory

Turkish is a highly agglutinative language in which a single word can carry several grammatical functions, each marked by its own suffix. Pāṇini uses an Inventory to match surface forms to grammatical meanings.

# Example of Turkish segmentation
result = panini.extract(
    language="tur",
    text="gelmeyecekler",
    components=["morpheme_segmentation"]
)

Output Example:

  • Word: gelmeyecekler
      • gel (Verb Root)
      • me (Negative)
      • y (Buffer)
      • ecek (Future)
      • ler (3rd Person Plural)
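To make the inventory idea concrete, here is a minimal sketch of inventory-based segmentation written without Pāṇini's API. The `ROOTS` and `SUFFIXES` tables and the `segment` function are hypothetical names invented for this illustration; Pāṇini's actual inventory format and matching strategy may differ. The sketch tries longer suffixes first and backtracks when a choice leads to a dead end.

```python
# Toy inventory: surface form -> gloss.
# Hypothetical data for illustration only, not Pāṇini's inventory format.
ROOTS = {"gel": "Verb Root"}
SUFFIXES = {
    "me": "Negative",
    "ma": "Negative",
    "y": "Buffer",
    "ecek": "Future",
    "acak": "Future",
    "ler": "3rd Person Plural",
    "lar": "3rd Person Plural",
}


def _parse_suffixes(rest):
    """Split `rest` into known suffixes, or return None if that is impossible."""
    if not rest:
        return []
    # Try longer surface forms first, backtracking when a choice fails.
    for form in sorted(SUFFIXES, key=len, reverse=True):
        if rest.startswith(form):
            tail = _parse_suffixes(rest[len(form):])
            if tail is not None:
                return [(form, SUFFIXES[form])] + tail
    return None


def segment(word):
    """Return a list of (surface, gloss) pairs, or None if no full parse exists."""
    for root, gloss in ROOTS.items():
        if word.startswith(root):
            suffixes = _parse_suffixes(word[len(root):])
            if suffixes is not None:
                return [(root, gloss)] + suffixes
    return None


for surface, gloss in segment("gelmeyecekler"):
    print(f"{surface} ({gloss})")
```

This toy version checks only that every piece is in the inventory. It knows nothing about suffix order or vowel harmony, so it would also accept ill-formed words; a real analyzer needs those constraints as well.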


🇦🇪 Arabic: Morphological Aggregation

The Arabic example showcases Pāṇini's ability to perform statistical analysis over a corpus, focusing specifically on the triconsonantal root system.

Pivoted Aggregation

In Semitic languages, verbs, nouns, and adjectives can be built from a three-consonant root. For example, the root k-t-b (ك ت ب), related to "writing", generates:

  • kataba (كَتَبَ): "he wrote" (verb)
  • kitāb (كِتَاب): "book" (noun)
  • kātib (كَاتِب): "writer" (noun)
  • maktaba (مَكْتَبَة): "library" (noun)

For these languages, it is therefore often useful to aggregate data by root rather than by word. Pāṇini's record_pivoted lets you define a "pivot" callback to group results dynamically.

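# Aggregate by root: the pivot callback receives a feature mapping
# and returns the key under which the result is grouped
aggregator.record_pivoted(
    lang_code="ara",
    result=extraction_result,
    # Use the "root" trait of the first entry; fall back to "no-root" when it is absent
    pivot_callback=lambda feat: next(iter(feat.values())).get("root", "no-root")
)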
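The grouping that a pivot callback performs can be sketched in plain Python. This is not the implementation of record_pivoted: the `pivot_counts` function and the record shape (a one-entry mapping from surface form to a trait dictionary) are assumptions inferred from the callback above, and the sample records are made up.

```python
from collections import Counter, defaultdict


def pivot_counts(records, pivot_callback):
    """Group records by the key the callback returns and count surface forms per group."""
    groups = defaultdict(Counter)
    for feat in records:
        key = pivot_callback(feat)
        for surface in feat:
            groups[key][surface] += 1
    return {key: dict(counter) for key, counter in groups.items()}


# Made-up records; the shape is an assumption, not Pāṇini's documented output.
records = [
    {"kataba": {"root": "k-t-b", "pos": "VERB"}},
    {"kitāb": {"root": "k-t-b", "pos": "NOUN"}},
    {"kitāb": {"root": "k-t-b", "pos": "NOUN"}},
    {"fī": {"pos": "ADP"}},  # no root trait, so it lands in "no-root"
]

by_root = pivot_counts(
    records, lambda feat: next(iter(feat.values())).get("root", "no-root")
)
by_pos = pivot_counts(
    records, lambda feat: next(iter(feat.values())).get("pos", "no-pos")
)
print(by_root)
print(by_pos)
```

Because the grouping key comes from a callback, switching from a root view to a part-of-speech view only requires a different lambda; the aggregation loop stays the same.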

📊 Visualizing Results

Pāṇini's aggregation data can be easily exported for visualization. Below are samples generated using the helper.py utility included in the examples.

PoS Category Distribution

Understand the balance of nouns, verbs, and adjectives in your corpus.

[Figure: PoS Distribution]

Case Distribution (Arabic)

Explore how case markers are used across the analyzed text.

[Figure: Arabic Case Example]

Root Frequency Analysis

Identify the most productive roots in your linguistic data.

[Figure: Root Distribution]
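Distribution charts like these are typically drawn from a flat table of labels and counts. The sketch below shows one way to turn a count dictionary into CSV using only the standard library. It is not the helper.py utility, whose interface is not described here; `counts_to_csv` is a hypothetical name and the numbers are made up.

```python
import csv
import io


def counts_to_csv(counts):
    """Serialize {label: count} as CSV, sorted by descending count, with a share column."""
    total = sum(counts.values())
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "count", "share"])
    # Sort by count (largest first), then by label to keep the order stable.
    for label, count in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        share = count / total if total else 0.0
        writer.writerow([label, count, f"{share:.3f}"])
    return buffer.getvalue()


# Made-up counts for illustration.
pos_counts = {"NOUN": 5, "VERB": 3, "ADJ": 2}
csv_text = counts_to_csv(pos_counts)
print(csv_text)
```

The resulting text can be saved to a file and loaded by most plotting tools or spreadsheets.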


🚀 Lexicon Explorer

For interactive exploration, the Python examples include a Lexicon Explorer dashboard built with D3.js.

  • Interactive Network: View the relationships between roots and their lexical realizations.
  • Dynamic Pivoting: Switch between views (Root, PoS, Case, Aspect) in real time.
  • Deep Insight: Hover over nodes to see the full morphological trait set.

[!TIP] Run the Arabic aggregation example in the examples/python directory to generate your own lexicon_data.json and launch the dashboard!
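For readers curious how root-to-word relationships can be fed to a network view, the sketch below builds a nodes-and-links structure of the kind commonly used with D3 force layouts. The actual schema of lexicon_data.json is not documented here, so the field names (`id`, `kind`, `source`, `target`) and the `build_graph` function are assumptions made for illustration.

```python
import json


def build_graph(lexicon):
    """Convert {root: [words]} into a nodes/links dictionary without duplicate nodes."""
    nodes, links, seen = [], [], set()

    def add_node(node_id, kind):
        # Register each node once, even if it is reachable from several roots.
        if node_id not in seen:
            seen.add(node_id)
            nodes.append({"id": node_id, "kind": kind})

    for root, words in lexicon.items():
        add_node(root, "root")
        for word in words:
            add_node(word, "word")
            links.append({"source": root, "target": word})
    return {"nodes": nodes, "links": links}


# The k-t-b family from the Arabic example, in transliteration.
lexicon = {"k-t-b": ["kataba", "kitāb", "kātib", "maktaba"]}
graph = build_graph(lexicon)
payload = json.dumps(graph, ensure_ascii=False, indent=2)
print(payload)
```

Passing `ensure_ascii=False` keeps transliterated or Arabic-script forms readable in the JSON output instead of escaping them.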