Database Introduction
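Humanity has an ancient dream: to hold all knowledge, past and present, in one place, meaning every book written in every language. Long ago, people came close to building such a library. The Library of Alexandria, built in 300 BC, was designed to house every scroll in the known world.
For a considerable time it held about 500,000 scrolls, roughly 30% to 70% of all books then in existence. Yet even before the Library of Alexandria was destroyed, the era in which all knowledge could fit in a single building had already passed.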
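The emergence of network technology and the development of search engines have allowed humanity to revisit the ancient dream of putting all knowledge in one place. So far, humanity has finished collecting and managing the first 10% of its knowledge: non-authoritative knowledge in the form of web pages.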
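The Unicode 5.0 standard, released in July 2006, raised the total number of characters in the world's unified character set to 98,884 (the earlier Chinese national standard, Guobiao, had about 7,200 and Taiwan's Big5 about 13,000; we lived in that environment for twenty years),
and it also makes a super knowledge base feasible, so that people can begin organizing the second 10% of the world's knowledge: authoritative knowledge printed on paper, with text matched to images. One notable phenomenon is that, for historical reasons, East Asian documents hold an overwhelming advantage in both quantity and the number of characters used, so East Asia should, and surely will, become the main force in building the super knowledge base. Looking at the use of written characters across Chinese history,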
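《瀚堂典藏》(Hytung Books) is currently the only database of ancient texts processed and collated using an internationally adopted very large character set. Its chief distinction is accurate text with no missing characters, and it uses high-speed search technology to integrate its sub-databases. Building on a broad collection of editions and meticulous collation, Hytung is committed to constructing a massive document platform. In the high-technology form of a digital library with text and images side by side, and with a folder-style bibliographic tree, it preserves the classics in full and offers readers faithful reproduction,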
Hytung Ancient Book Database
The Hytung Ancient Book Database (HYTUNG BOOKS), built on the Hytung UTF-16 Search
Engine and the Unicode 5.0 standard, covers a wide range of ancient Chinese
dictionaries, etymological works, literature, and archaeological writings. As a
research platform for sinologists around the world, HYTUNG BOOKS is compatible with
Microsoft operating systems and common web browsers, and it ends the reliance of
sinological publishing on handwriting, image substitution, or newly created symbols.
The Hytung search engine handles more than 70,000 distinct characters on a single
platform. Its natural-language search can find a character string within milliseconds
in 30 gigabytes of UTF-16 full text. Hytung Books currently holds 30,000 ancient
Chinese books and archaeological materials, totaling over 1 billion characters and
5 million book images. All data are managed as 10 million records, or metadata
entries, and the text and image in each record are linked to each other, so the text
is easy to check against the original image.
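As a rough illustration of the two ideas in this paragraph, the Python sketch below
shows how a character outside the Basic Multilingual Plane is stored as two UTF-16
code units (a surrogate pair) and how a search hit can lead back to the image linked
to the same record. It is not Hytung's implementation: the Record class, its fields,
the image identifiers, and the sample data are all hypothetical, and the linear scan
stands in for whatever indexing the real engine uses to reach millisecond response
times.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record: transcribed text linked to its source image."""
    record_id: int
    text: str
    image_ref: str  # made-up identifier, not a real Hytung path


def utf16_units(text: str) -> list[int]:
    """Return the 16-bit code units that encode `text` in UTF-16."""
    data = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def search(records: list[Record], query: str) -> list[tuple[Record, list[int]]]:
    """Linear scan returning each matching record with its match offsets.

    Offsets count whole characters (code points), so a character stored as
    a surrogate pair in UTF-16 still occupies a single position.
    """
    if not query:
        return []
    results = []
    for rec in records:
        offsets = []
        start = rec.text.find(query)
        while start != -1:
            offsets.append(start)
            start = rec.text.find(query, start + 1)
        if offsets:
            results.append((rec, offsets))
    return results


# U+20000 lies in CJK Unified Ideographs Extension B, outside the BMP.
RARE = "\U00020000"
COMMON = "\u5b57"  # the character 字, inside the BMP

# Sample data invented for this sketch.
RECORDS = [
    Record(1, "說文解" + COMMON, "image-0001"),
    Record(2, COMMON + RARE + COMMON + RARE, "image-0002"),
]

if __name__ == "__main__":
    print([hex(unit) for unit in utf16_units(RARE)])
    for rec, offsets in search(RECORDS, RARE):
        print(rec.record_id, rec.image_ref, offsets)
```

Counting positions in code points rather than code units keeps a rare character from
being split in the middle of its surrogate pair, which matters for a corpus that is
full of uncommon characters.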
The collection focuses primarily on about 200 ancient Chinese etymological and
classical dictionaries, historical literature, and excavated writings such as oracle
bone inscriptions, bronze inscriptions, seals, bamboo slips, and silk manuscripts,
all organized together with their related images and photographs. Some ancient
Japanese dictionaries and Western classics of sinology will be added to the database
as well. These materials are full of uncommon characters and have been critically
edited and carefully collated so that the database can be cited as an authority.
Hytung published revised editions of two famous Chinese dictionaries, Shuo Wen Jie
Zi《說文解字》and Kangxi Dictionary《康熙字典》, in 2005 and 2008 respectively.