An Observational Framework to the Zipfian Analysis among Different Languages: Studies to Indonesian Ethnic Biblical Texts

Situngkir, Hokky (2007) An Observational Framework to the Zipfian Analysis among Different Languages: Studies to Indonesian Ethnic Biblical Texts. [Departmental Technical Report]




This paper introduces the use of Zipfian statistics to compare human languages using corpora that share the same meaning but differ in grammar and structure. We use biblical texts because they are among the corpora most widely and carefully translated into many languages. The idea is to reduce the noise that differences in the meaning of the texts would otherwise introduce across languages. The results show that Zipf's law holds robustly, and they reveal some statistical differences between English, the widely used national language of Indonesia, and several of its ethnic languages. The paper ends by modestly proposing a possible framework for further interdisciplinary approaches to the evolution of human language.
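To make the kind of measurement described in the abstract concrete, the following is a minimal sketch of a rank-frequency analysis. It is not the report's own procedure: the function names `rank_frequency` and `zipf_exponent`, the letter-only tokenization, and the ordinary least-squares fit in log-log space are all assumptions made for illustration. It fits the plain Zipf form, frequency proportional to rank^(-a), rather than the Zipf-Mandelbrot generalization named in the keywords.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent.

    Assumption: a word is any run of Unicode letters, lowercased.
    The report may tokenize its corpora differently.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    return sorted(counts.values(), reverse=True)


def zipf_exponent(frequencies):
    """Estimate a in f(r) ~ C * r**(-a) by least squares on log-log data.

    Assumption: a simple regression over all ranks; needs at least
    two ranks so that the slope is defined.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying distribution; negate it.
    return -sxy / sxx


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    freqs = rank_frequency(sample)
    print(freqs)
    print(zipf_exponent(freqs))
```

Running the same two functions over each translation of one text would yield one exponent per language, which is the sort of cross-language comparison the abstract describes. A real analysis would need a much longer text than the toy sentence above, and probably a more careful fit than a regression over all ranks.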

Item Type: Departmental Technical Report
Keywords: statistical processing of natural language, Zipf’s law, Zipf-Mandelbrot fit, corpus, evolution of language
Subjects: Computer Science > Statistical Models
Computer Science > Language
Linguistics > Computational Linguistics
Computer Science > Complexity Theory
Psychology > Psycholinguistics
Philosophy > Philosophy of Mind
Linguistics > Comparative Linguistics
Linguistics > Historical Linguistics
Philosophy > Philosophy of Language
JOURNALS > Issues in Informing Science and Information Technology
Linguistics > Syntax
ID Code: 5481
Deposited By: Situngkir, Mr Hokky
Deposited On: 04 Apr 2007
Last Modified: 11 Mar 2011 08:56

References in Article


American Bible Society. (1992). Bible Today's English Version 2nd Edition.

Chomsky, N. (1957). Syntactic Structures. The Hague.

Gaffeo, E., Gallegati, M., Giulioni, G., Palestrini, A. (2003). “Power Laws and Macroeconomic Fluctuations”. Physica A 324:408-416.

Gordon, Raymond G., Jr. (ed.), 2005. Ethnologue: Languages of the World, 15th edition. Dallas, Tex.: SIL International. Online version:

Kennedy, J. (1971). “A History of Malaya”. The Journal of Asian Studies 30 (3):736-7.

Kosmidis, K., Kalampokis, A., Argyrakis, P. (2006). “Statistical Mechanical Approach to Human Language”. Physica A 366:495-502.

Lembaga Alkitab Indonesia. (1974). Alkitab Terjemahan Baru.

Lembaga Alkitab Indonesia. (1991). Alkitab Angkola.

Lembaga Alkitab Indonesia. (1991). Alkitab Sunda.

Lembaga Alkitab Indonesia. (1994). Alkitab Jawa.

Lembaga Alkitab Indonesia. (1998). Alkitab Pakpak Dairi.

Lembaga Alkitab Indonesia. (1998). Alkitab Toba Ejaan Baru.

Lembaga Alkitab Indonesia. (2000). Alkitab Karo Edisi III.

Lembaga Alkitab Indonesia. (2000). Alkitab Simalungun.

Li, W. (1992). “Random Texts Exhibit Zipf’s-Law-like Word Frequency Distribution”. IEEE Transactions on Information Theory 38 (6):1842-45.

Mandelbrot, B. B. (1983). The Fractal Geometry of Nature. Freeman.

Manning, C. D., Schütze, H. (1999). Foundations of Statistical Natural Language Processing. MIT Press.

Mantegna, R. N., Buldyrev, S., Goldberger, A. L., Havlin, S., Peng, C-K., Simons, M. (1994). “Linguistic Features of Noncoding DNA Sequences”. Physical Review Letters 73:3169-72.

Mulianta, I., Situngkir, H., Surya, Y. (2004). “Power Law Signature in Indonesian Population: Empirical Studies of Kabupaten and Kotamadya Population in Indonesia”. BFI Working Paper Series WPT2004. Bandung Fe Institute.

Newman, M. E. J. (2005). "Power laws, Pareto distributions and Zipf's law". Contemporary Physics 46: 323–351.

Powers, D. M. W. (1998). “Applications and Explanations of Zipf’s Law”. In D. M. W. Powers (ed.). New Methods in Language Processing and Computational Natural Language Processing. ACL.

Primo, C., Galvàn, A., Sordo, C., Gutiérrez, J. M. (2007). “Statistical Linguistic Characterization of Variability in Observed and Synthetic Daily Precipitation Series”. Physica A 374:389-402.

Simon, H. A. (1955). "On a Class of Skew Distribution Functions". Biometrika 42: 425-40.

Situngkir, H., Surya, Y. (2003). “Dari Transisi Fasa ke Sistem Keuangan: Distribusi Statistika pada Sistem Kompleks”. BFI Working Paper Series WPQ2003. Bandung Fe Institute.

Situngkir, H., Surya, Y. (2004). “Democracy: Order out of Chaos – Understanding Power-Law in Indonesian Elections”. BFI Working Paper Series WPO2004. Bandung Fe Institute.

Situngkir, H., Surya, Y. (2005). “What can We See from Investment Simulation based on Generalized (m,2)-Zipf Law”. BFI Working Paper Series WPO2005. Bandung Fe Institute.

Zanette, D. H., Montemurro, M. A. (2005). “Dynamics of Text Generation with Realistic Zipf Distribution”. Journal of Quantitative Linguistics 12(1): 29-40. Routledge.

Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison-Wesley.

