Decoding India’s Historic Scripts with AI

From Modi Script to Modern Marathi: Building New Datasets for IndicScript AI Models

Building AI on Authentic Historical Foundations

Indic-Scripts Research Forum is a Pune-based AI research company building word-level datasets and models from the ground up to transliterate and translate texts in historic Indian scripts, starting with Modi-script Marathi, into modern languages such as Marathi, Hindi, and English.

We believe that accurate AI for historic scripts is only possible when it is built on carefully curated, high-quality datasets. Our team works with verified Modi manuscripts, transliterated documents, and historic Marathi dictionaries to build large paired word datasets. These datasets power robust transliteration and translation AI models.

Our long-term goal is to reach 80% accuracy in machine translation of these texts. By focusing on Modi-script manuscripts from the last three hundred years, we are laying the foundation for a future Maratha history LLM.

Data First

We are constructing a large dataset that pairs Modi-script Marathi with Modern Marathi, drawn from authentic sources such as historic Peshve-period letters, Fermans, pothi manuscripts, biographies, and other Modi documents.
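
As an illustration only, one entry in such a paired word dataset could be sketched as below. The record layout, the field names (modi, marathi, source, verified) and the helper in_block are assumptions made for this example and do not describe our actual schema. The only outside facts relied on are the Unicode block ranges for Modi (U+11600 to U+1165F) and Devanagari (U+0900 to U+097F).

```python
from dataclasses import dataclass

# Unicode block ranges (inclusive) for the two scripts.
MODI_BLOCK = (0x11600, 0x1165F)
DEVANAGARI_BLOCK = (0x0900, 0x097F)


def in_block(text: str, block: tuple[int, int]) -> bool:
    """Return True if text is non-empty and every code point lies in block."""
    lo, hi = block
    return bool(text) and all(lo <= ord(ch) <= hi for ch in text)


@dataclass(frozen=True)
class WordPair:
    """Hypothetical record pairing one Modi-script word with its Marathi form."""
    modi: str               # word as written in Modi script
    marathi: str            # the same word in Devanagari
    source: str             # free-text note on the source document
    verified: bool = False  # set once an expert has checked the pairing

    def is_well_formed(self) -> bool:
        # Cheap script sanity check before a pair goes to expert review.
        return (in_block(self.modi, MODI_BLOCK)
                and in_block(self.marathi, DEVANAGARI_BLOCK))


# Two code points from the Modi block, written as escapes so the example
# does not depend on a Modi font being installed.
pair = WordPair(modi="\U0001160E\U00011630", marathi="का",
                source="hypothetical example letter")
print(pair.is_well_formed())
```

A check like this only catches entries typed in the wrong script. It would also reject words that contain joiner characters outside these blocks, so a real pipeline would need a more permissive rule, and it says nothing about whether a pairing is historically correct.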

Expert-Validated

We collaborate with Modi script experts and historians to verify transliterations and word pairings, ensuring outputs are legible, historically accurate, and context-aware.

Purpose-Built Models

Our goal is to build AI systems that go beyond basic OCR — models that understand context, relationships, and historical terminology to support serious research and heritage preservation.