Terma Heritage Foundation

Preserving Tibetan and Himalayan cultural heritage through technology, arts, education, and community programs.

© 2026 Terma Heritage Foundation, Inc. | New York Not-for-Profit Corporation

TermaVision

Web Prototype

Making centuries of Buddhist iconographic knowledge accessible through purpose-built computer vision

Overview

Buddhist art is one of the richest visual traditions on earth — thousands of deities, bodhisattvas, protectors, and teachers, each depicted with precise iconographic rules developed over centuries. A single thangka painting can contain dozens of figures, each identifiable by their body color, hand gestures (mudras), sacred objects, posture, and companion figures. This knowledge lives in the minds of trained scholars and monks, but is inaccessible to most people who encounter Buddhist art.

TermaVision is not a generic AI or a fine-tuned large language model. It is a small, purpose-built vision model with a unique architecture designed from the ground up for one task: identifying sacred figures in Buddhist artwork. It knows nothing else — it only speaks Buddhist art.

What makes it different is its architecture. The model detects individual figures in a complex composition, then classifies each one against 93 trained classes. But it doesn't stop there — it cross-references every identification against a hand-built iconography database of 557 deities, checking body color, hand gestures, sacred objects, and posture. Then it applies compositional reasoning: knowledge of traditional Buddhist groupings like the Rigsum Gonpo (Avalokiteshvara, Manjushri, Vajrapani) or the Five Dhyani Buddhas, using the presence of one figure to confirm or correct the identification of others.

General-purpose AI systems are not designed for this. The task requires a domain-specific architecture that encodes centuries of iconographic knowledge in the system itself. TermaVision serves scholars, museums, practitioners, and anyone who encounters Buddhist art and wants to understand what they are seeing.

Trained Classes: 93
Top-1 Accuracy: 94.4%
Deity Database: 557
Strong Classes (>80%): 70
Inference Speed: 2–4s
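
As a rough illustration of how figures like these are typically computed, the sketch below derives top-1 accuracy and a count of "strong" classes from a toy set of labels. It assumes that a strong class is one whose per-class accuracy exceeds 80%; the page does not define the term, and the function names and sample labels are hypothetical.

```python
from collections import defaultdict

def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose top prediction matches the true label."""
    if not y_true:
        raise ValueError("no samples to evaluate")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for each true class."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return {c: hits[c] / totals[c] for c in totals}

def strong_classes(y_true, y_pred, threshold=0.80):
    """Classes whose accuracy is strictly above the threshold (assumed definition)."""
    accs = per_class_accuracy(y_true, y_pred)
    return sorted(c for c, acc in accs.items() if acc > threshold)

# Toy labels for illustration only; not TermaVision data.
y_true = ["tara", "tara", "manjushri", "manjushri", "vajrapani"]
y_pred = ["tara", "tara", "manjushri", "tara", "vajrapani"]

print(top1_accuracy(y_true, y_pred))
print(strong_classes(y_true, y_pred))
```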

How It Works

1. Not a large language model: a small, specialized vision model with a custom architecture built from the ground up for Buddhist art. It knows nothing else.

2. Multi-figure detection: finds all individual figures in a complex thangka containing dozens of deities, even in crowded compositions with overlapping figures.

3. 93-class deity classification: identifies deities, bodhisattvas, dharma kings, teachers, arhats, and protectors across Tibetan, Himalayan, and broader Buddhist traditions.

4. Iconography database: a hand-built knowledge base of 557 deities with body colors, mudras, sacred objects, postures, and lineage information used to verify every identification.

5. Compositional reasoning: encodes knowledge of 8 traditional Buddhist groupings (Rigsum Gonpo, Tse Lha Nam Sum, Five Dhyani Buddhas, Eight Great Bodhisattvas, and others) to use the presence of one figure to confirm or correct identification of others.

6. Iconographic output: returns Tibetan name, Sanskrit name, lineage, category, known aliases, and associated symbolism for every identified figure.
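
The iconographic output described above could be represented as a simple record. The sketch below is one possible shape, assuming one record per identified figure; the class name, field names, and example values are hypothetical and are not taken from TermaVision itself.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class FigureResult:
    """One identified figure, mirroring the output fields listed above."""
    tibetan_name: str
    sanskrit_name: str
    lineage: str
    category: str
    aliases: list = field(default_factory=list)
    symbolism: str = ""

# Illustrative values only; the lineage entry is a placeholder.
result = FigureResult(
    tibetan_name="Chenrezig",
    sanskrit_name="Avalokiteshvara",
    lineage="(placeholder)",
    category="bodhisattva",
    aliases=["Lokeshvara"],
    symbolism="compassion",
)

# asdict() gives a plain dictionary that is easy to serialize as JSON.
print(asdict(result))
```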

How It’s Built

Stage 1 — Figure Detection

Locates every individual figure in the artwork, even in complex multi-figure thangkas. Adapts automatically — no retraining needed when new figure types are added.

Stage 2 — Figure Classification

Each detected figure is isolated and classified independently against 93 trained classes. The model is purpose-built and lightweight — not a general AI repurposed for this task.
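
The detect-then-classify flow of Stages 1 and 2 can be sketched as a loop over bounding boxes. This is a minimal sketch under assumptions: the image is a plain list of pixel rows, boxes are (left, top, right, bottom) tuples, and `toy_classifier` is a stand-in with made-up labels, not the actual model.

```python
def crop(image, box):
    """Cut a rectangular patch out of an image.

    image: list of pixel rows; box: (left, top, right, bottom),
    with right and bottom exclusive.
    """
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

def classify_figures(image, boxes, classifier):
    """Isolate each detected figure and classify it independently."""
    results = []
    for box in boxes:
        label, score = classifier(crop(image, box))
        results.append({"box": box, "label": label, "score": score})
    return results

def toy_classifier(patch):
    """Hypothetical stand-in: labels a patch by its mean brightness."""
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    if mean > 0.5:
        return "bright figure", mean
    return "dark figure", 1.0 - mean

# A 4x4 toy image: left half bright (1.0), right half dark (0.0).
image = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
boxes = [(0, 0, 2, 4), (2, 0, 4, 4)]  # as if returned by Stage 1

for r in classify_figures(image, boxes, toy_classifier):
    print(r)
```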

Stage 3 — Iconographic Verification

Every identification is cross-checked against a hand-built database of 557 deities — verifying body color, hand gestures, sacred objects, and posture. This is where domain knowledge is encoded directly into the system.
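
One way to picture this cross-check is as a comparison of observed attributes against an expected database entry. The sketch below assumes a dictionary-based database with four attributes per deity; the two entries are simplified illustrations, and the structure, names, and scoring rule are hypothetical rather than the system's actual design.

```python
# Simplified, illustrative entries; the real database covers 557 deities.
ICONOGRAPHY_DB = {
    "Green Tara": {
        "body_color": "green",
        "mudra": "varada",
        "object": "utpala lotus",
        "posture": "lalitasana",
    },
    "White Tara": {
        "body_color": "white",
        "mudra": "varada",
        "object": "lotus",
        "posture": "vajra posture",
    },
}

def verify(label, observed, db=ICONOGRAPHY_DB):
    """Compare observed attributes with the expected ones for a label.

    Returns None for an unknown label; otherwise the fraction of checked
    attributes that agree, plus the names of those that do not.
    """
    expected = db.get(label)
    if expected is None:
        return None
    checked = {k: observed[k] == v for k, v in expected.items() if k in observed}
    mismatches = [k for k, ok in checked.items() if not ok]
    score = (len(checked) - len(mismatches)) / len(checked) if checked else 0.0
    return {"score": score, "mismatches": mismatches}

# A figure classified as Green Tara whose body color was observed as white.
observed = {
    "body_color": "white",
    "mudra": "varada",
    "object": "utpala lotus",
    "posture": "lalitasana",
}
print(verify("Green Tara", observed))
```

A body color mismatch like this one is the kind of signal that a later validation step could use to lower confidence or suggest an alternative identification.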

Stage 4 — Compositional Reasoning

The system understands how Buddhist figures appear together. Knowledge of 8 traditional groupings allows it to use context — if Avalokiteshvara is present, it knows to look for Manjushri and Vajrapani nearby.
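
The idea of using one figure to confirm or correct another can be sketched as a score adjustment. This is a minimal sketch, assuming each detected figure comes with candidate scores and that missing group members receive a fixed boost; the `boost` value, the `min_present` parameter, and the example scores are assumptions, not the published method.

```python
GROUPS = {
    "Rigsum Gonpo": {"Avalokiteshvara", "Manjushri", "Vajrapani"},
}

def apply_group_context(figures, groups=GROUPS, boost=0.15, min_present=1):
    """Re-rank candidates using knowledge of traditional groupings.

    figures: one dict per detected figure, mapping candidate label -> score.
    Returns the top label per figure after boosting missing group members.
    """
    top = [max(scores, key=scores.get) for scores in figures]
    adjusted = [dict(scores) for scores in figures]  # leave the input untouched
    for members in groups.values():
        present = members & set(top)
        if len(present) < min_present:
            continue
        missing = members - present
        for i, scores in enumerate(adjusted):
            if top[i] in present:
                continue  # this figure already fills a slot in the group
            for label in missing:
                if label in scores:
                    scores[label] += boost
    return [max(scores, key=scores.get) for scores in adjusted]

# The third figure is a close call between two dark-blue wrathful figures.
figures = [
    {"Avalokiteshvara": 0.90, "Tara": 0.05},
    {"Manjushri": 0.80, "Maitreya": 0.10},
    {"Mahakala": 0.45, "Vajrapani": 0.40},
]
print(apply_group_context(figures))
```

In this toy case the presence of Avalokiteshvara and Manjushri tips the third figure toward Vajrapani, completing the Rigsum Gonpo.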

Stage 5 — Validation

Color mismatch detection, duplicate detection, and confidence calibration. Each identification receives a confidence level: confident, likely, ambiguous, or uncertain.
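
The four confidence levels named above could be assigned from a score and its margin over the runner-up. The thresholds below are invented for illustration, since the page does not state how the levels are calibrated; the duplicate check is likewise a simple assumed version.

```python
from collections import Counter

def confidence_level(score, margin):
    """Map a top score and its lead over the runner-up to a named level.

    Thresholds are hypothetical; a real system would calibrate them on
    held-out data.
    """
    if margin < 0.10:
        return "ambiguous"   # two candidates are nearly tied
    if score >= 0.85:
        return "confident"
    if score >= 0.60:
        return "likely"
    return "uncertain"

def find_duplicates(labels):
    """Labels assigned to more than one figure in the same artwork."""
    counts = Counter(labels)
    return sorted(label for label, n in counts.items() if n > 1)

print(confidence_level(0.92, 0.50))
print(confidence_level(0.50, 0.05))
print(find_duplicates(["Tara", "Manjushri", "Tara"]))
```

Duplicates are best treated as a flag for review rather than an error, since some compositions do repeat a figure.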

Research & Publications

TermaVision: A Multi-Stage Deep Learning Pipeline for Automated Buddhist Iconography Identification

Thupten N. Chakrishar · 2025

Abstract

This paper presents TermaVision, an automated multi-stage pipeline that combines frozen vision-language features, a lightweight classifier, zero-shot attribute verification, and a structured iconography knowledge graph to identify Buddhist figures in thangka paintings, statues, and murals. The system achieves 94.4% top-1 accuracy across 93 classes, processing images in 1.3–4 seconds on a consumer GPU. Unlike generic large language models, TermaVision employs a purpose-built architecture with a zero-shot subject filter, compositional reasoning encoding canonical Buddhist groupings, and a 557-deity knowledge graph with structured iconographic attributes. The pipeline also provides zero-shot provenance classification, iconometric proportion assessment based on the traditional Tibetan Angula system, and regional artistic style classification.

Key Findings

94.4% top-1 accuracy across 93 deity classes using a purpose-built vision model

557-deity iconography knowledge graph with 9 attribute categories for explainable verification

Compositional reasoning module encoding 8 canonical Buddhist figure groupings

Zero-shot provenance, iconometric, and regional style analysis without additional training data

Processes images in 1.3–4 seconds vs. 8–35 seconds for the prior system

Keywords: Buddhist art identification, deep learning, zero-shot classification, knowledge graph, iconography, cultural heritage, vision-language models, thangka

Related Programs

Digital Preservation

Tibetan Library

A free digital archive of Tibetan literature, scripture, and historical texts for communities worldwide

Language & Technology

TermaFoundry

A complete digital infrastructure for reading, writing, and preserving the Tibetan language