Decoding India’s Historic Scripts with AI

From Modi Script to Modern Marathi:
Building New Datasets for IndicScript AI Models

Data Sources

Historic Modi-script manuscripts and letters
Modi → Devanagari transliterated works from “Dafters”
Specialized history dictionaries and encyclopedias:
- Aitihasik Shabdakosh (Y. N. Kelkar, 1962)
- Marathi Vishwakosh
- Dictionary of Old Marathi
- Other Marathi–Marathi, English–Marathi, and terminology dictionaries
4,000+ Marathi book titles related to Maratha history (planned corpus)

Why Word-Level Pairing

In our experience, sentence‑level transliterations of old Marathi in Devanagari are often hard to understand, and they cannot be reliably reused outside the sentence they came from.

By focusing on word‑level pairings, we:

Enable flexible recombination across multiple texts
Make dictionary verification and correction easier
Provide stronger supervision for AI models, which should improve accuracy and generalization
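
To make the recombination point concrete, here is a minimal Python sketch of a word-level pair table reused across texts. Everything in it is an assumption made for illustration: the `WordPair` fields, the `build_index` and `transliterate` helpers, and the three toy entries are hypothetical, and the Modi forms are ASCII placeholder tags rather than real Modi text. The project's actual schema and tooling are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordPair:
    modi: str        # Modi-script form (placeholder tag here, not real Modi text)
    devanagari: str  # transliterated Devanagari form
    gloss: str       # short dictionary-style meaning (illustrative)


# Toy entries for illustration only.
PAIRS = [
    WordPair("<modi:raja>", "राजा", "king"),
    WordPair("<modi:gad>", "गड", "fort"),
    WordPair("<modi:patra>", "पत्र", "letter"),
]


def build_index(pairs):
    """Index pairs by their Modi form so any text can reuse them."""
    return {p.modi: p for p in pairs}


def transliterate(tokens, index):
    """Map each Modi token to Devanagari; unknown tokens are kept and reported."""
    output, unknown = [], []
    for token in tokens:
        pair = index.get(token)
        if pair is None:
            output.append(token)   # leave the token untouched
            unknown.append(token)  # flag it for dictionary review
        else:
            output.append(pair.devanagari)
    return output, unknown


index = build_index(PAIRS)
words, missing = transliterate(["<modi:raja>", "<modi:x>", "<modi:gad>"], index)
```

Because each entry stands alone, the same table serves any document that contains those words, and the `missing` list shows exactly which words still need a verified dictionary entry.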

A High-Level Technical Approach

Digitization – Scan Modi manuscripts and printed transliterated Marathi works.
Segmentation – Use custom JS tools to split Modi and Devanagari text into word-level units.
Pairing & Annotation – Pair Modi script words with their transliterated and modern Marathi equivalents, then add dictionary-based meanings and historical tags (people, places, events).
Vectorization – Convert dictionary and word entries into vectors and store them in a vector database.
Model Training & RAG – Use the datasets for:
- Transliteration and translation models
- Retrieval-augmented generation for historical Q&A
- Future fine-tuning of a Maratha history LLM
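
The segmentation, pairing, and vectorization steps above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the project's implementation: the source mentions custom JS tools and a vector database, whereas this sketch uses Python, a naive punctuation-based splitter, position-based pairing, hashed character bigrams as a stand-in for a learned embedding model, and an in-memory list as a stand-in for a vector database. All function and class names are hypothetical.

```python
import math
import re
import zlib

# Whitespace, the danda marks (U+0964, U+0965), and common ASCII punctuation.
_SPLIT = re.compile(r"[\s\u0964\u0965,.;:!?]+")


def segment(text):
    """Split a line into word-level units (naive; real manuscripts need more care)."""
    return [tok for tok in _SPLIT.split(text) if tok]


def pair_words(modi_tokens, deva_tokens):
    """Pair tokens by position; mismatched lengths are rejected for manual review."""
    if len(modi_tokens) != len(deva_tokens):
        raise ValueError("token counts differ; needs manual alignment")
    return list(zip(modi_tokens, deva_tokens))


def embed(word, dim=64):
    """Hash character bigrams into a fixed-size, L2-normalised vector."""
    padded = f"^{word}$"  # boundary markers guarantee at least one bigram
    vec = [0.0] * dim
    for i in range(len(padded) - 1):
        bucket = zlib.crc32(padded[i:i + 2].encode("utf-8")) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two already-normalised vectors (a dot product)."""
    return sum(x * y for x, y in zip(a, b))


class TinyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self._rows = []

    def add(self, word, payload):
        self._rows.append((embed(word), word, payload))

    def search(self, query, k=1):
        """Return the k stored (word, payload) entries most similar to the query."""
        q = embed(query)
        ranked = sorted(self._rows, key=lambda r: cosine(q, r[0]), reverse=True)
        return [(word, payload) for _, word, payload in ranked[:k]]


store = TinyVectorStore()
for word, gloss in [("राजा", "king"), ("गड", "fort"), ("पत्र", "letter")]:
    store.add(word, gloss)
nearest = store.search("राजा", k=1)
```

In a retrieval-augmented setup, the entries returned by `search` would be handed to a language model as context. A production system would likely replace `embed` with a trained multilingual embedding model and `TinyVectorStore` with a real vector database.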

Our Vision

Our long-term vision is to create reliable AI tools that can read, understand, and explain historic Indian documents — not just at the surface level of text, but with awareness of period-specific language, idioms, and historical context. We want historians, archivists, students, and citizens to be able to access centuries of material that is today locked away in scripts very few people can read.

Join us in building the foundation for AI that understands India’s past.

Together, we can preserve heritage and empower the future.

Collaborate with Us →