creators_name: Situngkir, Hokky
type: techreport
datestamp: 2007-04-04
lastmod: 2011-03-11 08:56:49
metadata_visibility: show
title: An Observational Framework to the Zipfian Analysis among Different Languages: Studies to Indonesian Ethnic Biblical Texts
ispublished: pub
subjects: comp-sci-stat-model
subjects: comp-sci-lang
subjects: ling-comput
subjects: comp-sci-complex-theory
subjects: psy-ling
subjects: phil-mind
subjects: ling-compara
subjects: hist-ling
subjects: phil-lang
subjects: issinform
subjects: ling-syntax
full_text_status: public
keywords: statistical processing of natural language, Zipf’s law, Zipf-Mandelbrot fit, corpus, evolution of language
abstract: The paper introduces the use of Zipfian statistics to observe human languages through corpora that carry the same meaning but differ in grammatical and structural utterances. We use biblical texts because they are among the corpora most widely and carefully translated into many languages. The idea is to reduce the noise that could arise from differences in the meaning of the texts across languages. The result is that the robustness of the Zipfian law is observable, and some statistical differences are found between English and the widely used national language and several ethnic languages of Indonesia. The paper ends by modestly proposing a possible further framework for interdisciplinary approaches to human language evolution.
date: 2007-02
date_type: published
institution: Bandung Fe Institute
department: Computational Sociology
refereed: TRUE
referencetext:
American Bible Society. (1992). Bible Today's English Version 2nd Edition.
Chomsky, N. (1957). Syntactic Structures. The Hague.
Gaffeo, E., Gallegati, M., Giulioni, G., Palestrini, A. (2003). “Power Laws and Macroeconomic Fluctuations”. Physica A 324:408-416.
Gordon, R. G., Jr. (ed.). (2005). Ethnologue: Languages of the World, 15th edition. Dallas, Tex.: SIL International. Online version: http://www.ethnologue.com/.
Kennedy, J. (1971). “A History of Malaya”. The Journal of Asian Studies 30 (3):736-7.
Kosmidis, K., Kalampokis, A., Argyrakis, P. (2006). “Statistical Mechanical Approach to Human Language”. Physica A 366:495-502.
Lembaga Alkitab Indonesia. (1974). Alkitab Terjemahan Baru.
Lembaga Alkitab Indonesia. (1991). Alkitab Angkola.
Lembaga Alkitab Indonesia. (1991). Alkitab Sunda.
Lembaga Alkitab Indonesia. (1994). Alkitab Jawa.
Lembaga Alkitab Indonesia. (1998). Alkitab Pakpak Dairi.
Lembaga Alkitab Indonesia. (1998). Alkitab Toba Ejaan Baru.
Lembaga Alkitab Indonesia. (2000). Alkitab Karo Edisi III.
Lembaga Alkitab Indonesia. (2000). Alkitab Simalungun.
Li, W. (1992). “Random Texts Exhibit Zipf’s-Law-like Word Frequency Distribution”. IEEE Transactions on Information Theory 38 (6):1842-45.
Mandelbrot, B. B. (1983). The Fractal Geometry of Nature. Freeman.
Manning, C. D., Schütze, H. (1999). Foundations of Statistical Natural Language Processing. MIT Press.
Mantegna, R. N., Buldyrev, S., Goldberger, A. L., Havlin, S., Peng, C-K., Simons, M. (1994). “Linguistic Features of Noncoding DNA Sequences”. Physical Review Letters 73:3169-72.
Mulianta, I., Situngkir, H., Surya, Y. (2004). “Power Law Signature in Indonesian Population: Empirical Studies of Kabupaten and Kotamadya Population in Indonesia”. BFI Working Paper Series WPT2004. Bandung Fe Institute.
Newman, M. E. J. (2005). “Power laws, Pareto distributions and Zipf's law”. Contemporary Physics 46:323-351.
Powers, D. M. W. (1998). “Applications and Explanations of Zipf’s Law”. In D. M. W. Powers (ed.). New Methods in Language Processing and Computational Natural Language Processing. ACL.
Primo, C., Galvàn, A., Sordo, C., Gutiérrez, J. M. (2007). “Statistical Linguistic Characterization of Variability in Observed and Synthetic Daily Precipitation Series”. Physica A 374:389-402.
Simon, H. A. (1955). “On a Class of Skew Distribution Functions”. Biometrika 42:425-40.
Situngkir, H., Surya, Y. (2003). “Dari Transisi Fasa ke Sistem Keuangan: Distribusi Statistika pada Sistem Kompleks”. BFI Working Paper Series WPQ2003. Bandung Fe Institute.
Situngkir, H., Surya, Y. (2004). “Democracy: Order out of Chaos – Understanding Power-Law in Indonesian Elections”. BFI Working Paper Series WPO2004. Bandung Fe Institute.
Situngkir, H., Surya, Y. (2005). “What can We See from Investment Simulation based on Generalized (m,2)-Zipf Law”. BFI Working Paper Series WPO2005. Bandung Fe Institute.
Zanette, D. H., Montemurro, M. A. (2005). “Dynamics of Text Generation with Realistic Zipf Distribution”. Journal of Quantitative Linguistics 12(1):29-40. Routledge.
Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison-Wesley.
citation: Situngkir, Hokky (2007) An Observational Framework to the Zipfian Analysis among Different Languages: Studies to Indonesian Ethnic Biblical Texts. [Departmental Technical Report]
document_url: http://cogprints.org/5481/1/2007a.pdf