About Us

Who We Are

Indic-Scripts Research Forum is a newly formed research group based in Pune, India, focused on building machine-readable systems for ancient and historic Indian scripts.

Our work currently centers on Marathi written in the Modi script: transliterating it into Devanagari and translating it into modern Marathi. Our roadmap includes other historic Indic scripts and languages, such as Brahmi and Pali.

Why We Exist

Existing efforts to transliterate Modi script into Devanagari have been limited in scope and accuracy. Our own tests on publicly announced models show that sentence-level approaches trained on small datasets often produce output that is unintelligible and unusable for historical work.

We are addressing this gap by investing the time and effort required to construct large, high-quality character- and word-level datasets and robust AI pipelines grounded in dictionaries and expert knowledge.

What We Do

  • Build on past experiments in Modi character-level research
  • Test how individual characters combine to form words
  • Revise and improve our word-construction algorithms
  • Test algorithms that transliterate Modi-script words into Marathi (Devanagari)
  • Build word-level paired datasets:
    • Modi script → transliterated Marathi (Devanagari)
    • Transliteration → modern Marathi equivalents
  • Digitize, code, and build datasets using:
    • Historic dictionaries (e.g., Aitihasik Shabdakosh and other Marathi/old Marathi dictionaries)
    • Marathi Vishwakosh and other reference works
    • Thousands of Maratha history books and documents
  • Use vector databases and retrieval-augmented generation (RAG) to:
    • Link words, meanings, places, people, events
    • Provide context-aware answers and translations
  • Design and fine-tune specialized models and a future Maratha history LLM
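
For readers curious what a word-level paired dataset might look like in practice, here is a minimal sketch. It is an illustration only, not our actual schema or pipeline: the names WordPair, PairedLexicon, and in_block are hypothetical, and the single entry is a one-letter placeholder (the Modi letter A paired with the Devanagari letter A), not real dataset content. The only external facts it relies on are the Unicode block ranges for Modi (U+11600 to U+1165F) and Devanagari (U+0900 to U+097F).

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Unicode block ranges, as defined in the Unicode Standard.
MODI_BLOCK = range(0x11600, 0x11660)       # U+11600..U+1165F
DEVANAGARI_BLOCK = range(0x0900, 0x0980)   # U+0900..U+097F


def in_block(text: str, block: range) -> bool:
    """Return True if text is non-empty and every character lies in the block."""
    return bool(text) and all(ord(ch) in block for ch in text)


@dataclass(frozen=True)
class WordPair:
    """One hypothetical dataset record linking the three forms of a word."""
    modi: str            # word as written in Modi script
    devanagari: str      # transliteration into Devanagari
    modern_marathi: str  # modern Marathi equivalent

    def __post_init__(self) -> None:
        # Reject records whose fields are in the wrong script.
        if not in_block(self.modi, MODI_BLOCK):
            raise ValueError("modi must contain only Modi code points")
        if not in_block(self.devanagari, DEVANAGARI_BLOCK):
            raise ValueError("devanagari must contain only Devanagari code points")


class PairedLexicon:
    """Exact-match lookup from a Modi word to its paired record."""

    def __init__(self, pairs: Iterable[WordPair]) -> None:
        self._by_modi = {pair.modi: pair for pair in pairs}

    def lookup(self, modi_word: str) -> Optional[WordPair]:
        return self._by_modi.get(modi_word)


# Placeholder entry for illustration only: MODI LETTER A -> DEVANAGARI LETTER A.
PLACEHOLDER = WordPair(
    modi="\U00011600",
    devanagari="\u0905",
    modern_marathi="\u0905",
)
lexicon = PairedLexicon([PLACEHOLDER])

match = lexicon.lookup("\U00011600")
if match is not None:
    print(match.devanagari, match.modern_marathi)
```

Validating the script of each field when a record is created is one simple way to catch data-entry mistakes early. A real system would also need to handle spelling variants, punctuation, and words that have no modern equivalent, none of which this sketch attempts.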

Vision

Our long-term vision is to create reliable AI tools that can read, understand, and explain historic Indian documents — not just at the surface level of text, but with awareness of period-specific language, idioms, and historical context.

We want historians, archivists, students, and citizens to be able to access centuries of material that is today locked away in scripts very few people can read.