2 Set Up
2.1 Installation and configuration
devtools::install_gitlab("enpchina/histtext-r-client", auth_token = "replace with your gitlab token")
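To keep the token out of scripts, it can be read from an environment variable. The sketch below is one possible approach, not part of the package: `get_gitlab_token` is a hypothetical helper, and the variable name `GITLAB_PAT` is an assumption (set it, for example, in `~/.Renviron`).

```r
# Hypothetical helper: read the GitLab token from an environment variable
# (assumed here to be GITLAB_PAT) and fail with a clear message if it is unset.
get_gitlab_token <- function(var = "GITLAB_PAT") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  token
}

# Not run here: requires devtools and network access.
# devtools::install_gitlab("enpchina/histtext-r-client",
#                          auth_token = get_gitlab_token())
```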
Configuration of the package (replace fields with actual server information)
histtext::set_config_file(domain = "https://rapi.enpchina.eu",
user = "user_info", password = "user_info_password")
If successfully configured, the following command will return “OK”
histtext::get_server_status()
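In a script, the status check can be turned into a guard that stops before any query is sent. This is a minimal sketch, assuming the status comes back as the character string "OK" as described above. `check_server` is a hypothetical helper, not a function of the package.

```r
# Hypothetical guard: stop early unless the server reports "OK".
# The default `status_fun` is only evaluated when no replacement is supplied,
# so the helper can be tried offline with a stub function.
check_server <- function(status_fun = histtext::get_server_status) {
  status <- status_fun()
  if (!identical(as.character(status), "OK")) {
    stop("HistText server not ready, got: ",
         paste(status, collapse = " "), call. = FALSE)
  }
  invisible(TRUE)
}

# Offline trial with a stub standing in for the real status call
check_server(function() "OK")
```

In a configured session, calling `check_server()` with no argument would use the real status function.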
Now you can load the library:
library(histtext)
2.2 Available Corpora
The function list_corpora lists all the corpora available in the Modern China Text Base created by the ENP-China Project. The corpora are stored on a Solr server. Each corpus is labeled with the name to use in the search functions (see below):
histtext::list_corpora()
## [1] "archives" "chinajournal-pages" "csmo-pages"
## [4] "dongfangzz" "elder_workers" "elder_workers_format"
## [7] "imh-en" "imh-zh" "kmt9k"
## [10] "ncbras" "proquest" "reports-en"
## [13] "reports-fr" "scmp-recent" "shimingru-diary"
## [16] "shunpao" "shunpao-revised" "shunpao-tok"
## [19] "waiguozaihua" "wikibio-en" "wikibio-zh"
## [22] "zhanggangdiary"
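Because the search functions expect these exact names, it can help to validate a name before using it. The sketch below works on a copy of the listing above so that it runs without a server connection. `require_corpus` is a hypothetical helper, and in a live session the vector would come from `histtext::list_corpora()` instead.

```r
# Corpus names copied from the listing above; in a live session use
# corpora <- histtext::list_corpora()
corpora <- c(
  "archives", "chinajournal-pages", "csmo-pages",
  "dongfangzz", "elder_workers", "elder_workers_format",
  "imh-en", "imh-zh", "kmt9k",
  "ncbras", "proquest", "reports-en",
  "reports-fr", "scmp-recent", "shimingru-diary",
  "shunpao", "shunpao-revised", "shunpao-tok",
  "waiguozaihua", "wikibio-en", "wikibio-zh",
  "zhanggangdiary"
)

# Hypothetical helper: return the name unchanged if it is available,
# otherwise stop with a message naming the missing corpus.
require_corpus <- function(name, available) {
  if (!(name %in% available)) {
    stop("Unknown corpus: ", name, call. = FALSE)
  }
  name
}

corpus <- require_corpus("shunpao-revised", corpora)

# List every variant of a corpus family by prefix
grep("^shunpao", corpora, value = TRUE)
# [1] "shunpao"         "shunpao-revised" "shunpao-tok"
```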
### Brief description
Periodicals:
- shunpao: Chinese newspaper Shenbao 申報 (1872-1949): original version from the provider (GetHong)
- shunpao-revised: Chinese newspaper Shenbao 申報 (1872-1949): corrected version by the ENP-China project (date formatting, correction of titles mixed up with text, segmentation of extra-long articles)
- proquest: English-language periodicals from the ProQuest Chinese Newspapers Collection (CNC)
- dongfangzz: Dongfang zazhi 東方雜誌 (1904-1948)
- ncbras: Journal of the North China Branch of the Royal Asiatic Society (1858-1948)
- chinajournal-pages: The China Journal (1904-1949) (access at page level)
- csmo-pages: Chinese Students’ Monthly (1906-1931) (access at page level)
- scmp-recent: South China Morning Post (1954-2000) (subset from the ProQuest collection)
- cmj: China Medical Journal (1887-1949)
- elder_workers_format: corpus of interviews of Shanghai workers (1953-1958) [not public]
Other printed sources:
- imh-en: Collection of English-language who’s whos, directories and other biographical data from the Institute of Modern History (IMH), Academia Sinica 近現代人物資訊整合系統 The Integrated Information System on Modern and Contemporary Characters (IISMCC)
- imh-zh: Collection of Chinese-language who’s whos, directories and other biographical data from the Institute of Modern History (IMH), Academia Sinica 近現代人物資訊整合系統 The Integrated Information System on Modern and Contemporary Characters (IISMCC)
- kmt9k: Biographical dictionary of Zhongguo Guomindang jiuqian jiangling 中国国民党九千将领 (9,000 Generals of the Guomindang)
- waiguozaihua: The Universal Dictionary of Foreign Business in Modern China 外国在华工商企业辞典 China Waiguo zaihua gongshang qiye cidian, Chengdu 成都, Sichuan renmin chubanshe 四川人民出版社, 1995.
Archives:
- archives: Shanghai Municipal Police (SMP) Archives and Records of the Department of State Relating to Political Relations between China and Japan 1930-1944
- reports-fr: Annual Reports of the French Municipal Administration in Shanghai (“Compte rendu de la gestion pour l’exercice…”) (1893-1940)
- reports-en: Annual Reports of the Shanghai Municipal Council (SMC) (“Report for the year…”) (1859-1943)
Wikipedia:
- wikibio-en: corpus of biographies of individuals active in modern China extracted from Wikipedia (English)
- wikibio-zh: corpus of biographies of individuals active in modern China extracted from Wikipedia (Chinese)
Diaries and memoirs:
- zhanggangdiary: Zhang Gang Diary (1888–1942)
The content of the Modern China Text Base is expanding continuously. The presentation above may not reflect the most recent state of its collections.