This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors retain the copyright without restrictions for their published content in this journal. HSSR is a SHERPA ROMEO Green Journal.
INVESTIGATING ARABIC CORPUS (KorSA) OF INDONESIAN UNDERGRADUATE THESIS ABSTRACTS
Corresponding Author(s): Mohammad Ahsanuddin
Humanities & Social Sciences Reviews,
Vol. 8 No. 3 (2020): May
Purpose: This study set out to describe an Arabic corpus (KorSA) built from the abstracts of Indonesian undergraduate theses.
Methodology: Experimental and descriptive approaches were used to collect data in the form of isim (nouns), fi’il (verbs), concordances, and idioms from 59 thesis abstracts written by undergraduate students of Universitas Negeri Malang and UIN Maulana Malik Ibrahim Malang, both state-owned universities in East Java, Indonesia.
Principal Findings: The Arabic corpus for undergraduate theses was built in two steps: conducting a needs analysis and designing the corpus model. The main page of the corpus website presents concordances and word frequencies.
Implications/Applications: Even for translations of the Qur'an into English alone, for example, no parallel corpus model of the Qur'an exists yet. Similar models for other languages, such as Indonesian, are also possible.
Novelty/Originality of this study: This research is a first step towards the first model of a parallel corpus of the Qur'an and its Indonesian translation. The model can later serve as a reference for building comparable bilingual or multilingual corpora for translation research.