We believe that accurate AI for historic scripts is only possible when it is built on carefully curated, high‑quality datasets. Our team works with verified Modi manuscripts, transliterated documents, and historic Marathi dictionaries to build large, paired word datasets. These datasets power robust transliteration and translation models, with the eventual goal of reaching 80% accuracy in machine translation. Because we focus on the last three hundred years of Modi-script manuscripts, this work also lays the foundation for a future Maratha history LLM.
Data First
We are constructing a large dataset that pairs Modi-script Marathi with Modern Marathi, drawn from authentic sources such as historic Peshve-period letters, Fermans, pothi, biographies, and other Modi documents.
Expert-Validated
We collaborate with Modi script experts and historians to verify transliterations and word pairings, ensuring that outputs are legible, historically accurate, and context-aware.
Purpose-Built Models
Our goal is to build AI systems that go beyond basic OCR: models that understand context, relationships, and historical terminology to support serious research and heritage preservation.