INVESTIGATING ARABIC CORPUS (KorSA) OF INDONESIAN UNDERGRADUATE THESIS ABSTRACTS

Mohammad Ahsanuddin; Ali Ma’sum; Nur Anisah Ridwan

doi:10.18510/hssr.2020.8396

Issue

Vol. 8 No. 3 (2020): May

Issue Published : May 8, 2020

Authors retain the copyright without restrictions for their published content in this journal. HSSR is a SHERPA ROMEO Green Journal.

Publishing License

This is an open-access article distributed under the terms of

https://doi.org/10.18510/hssr.2020.8396

Mohammad Ahsanuddin

Department of Arabic Literature, Faculty of Letters, Universitas Negeri Malang, Indonesia

Ali Ma’sum

Department of Arabic Literature, Faculty of Letters, Universitas Negeri Malang, Indonesia

Nur Anisah Ridwan

Department of Arabic Literature, Faculty of Letters, Universitas Negeri Malang, Indonesia

Corresponding Author(s) : Mohammad Ahsanuddin

Humanities & Social Sciences Reviews, Vol. 8 No. 3 (2020): May
Article Published : June 17, 2020

Abstract

Purpose: This study was designed to examine the Arabic corpus (KorSA) of Indonesian undergraduate thesis abstracts.

Methodology: Experimental and descriptive approaches were employed to elicit data in the form of isim (nouns), fi’il (verbs), concordances, and idioms collected from 59 thesis abstracts written by undergraduate students of Universitas Negeri Malang and UIN Maulana Malik Ibrahim Malang, both state universities in East Java, Indonesia.

Principal Findings: The results show that two steps were carried out to build an Arabic corpus for undergraduate theses: conducting a needs analysis and designing the model of the Arabic corpus. The study also found that the main page of the corpus website presents concordances and word frequency.
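As a rough illustration of the two views named above (word frequency and concordance), the following Python sketch computes both over a toy list of abstracts. It is not the KorSA implementation: the function names, the naive whitespace tokenizer, and the two sample sentences are assumptions made only for this example.

```python
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenization (an assumption for this sketch).
    # Real Arabic text would usually need normalization first,
    # e.g. removing diacritics and punctuation.
    return text.split()


def word_frequency(texts):
    # Count how often each token occurs across all texts.
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, window=2):
    # Keyword-in-context lines: (left context, keyword, right context),
    # with up to `window` tokens on each side.
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, token, right))
    return lines


# Hypothetical sample sentences, not taken from the study's data.
abstracts = [
    "يهدف هذا البحث إلى وصف المدونة",
    "استخدم هذا البحث المنهج الوصفي",
]

freq = word_frequency(abstracts)
kwic = concordance(abstracts, "البحث", window=2)

for left, key, right in kwic:
    print(left, "|", key, "|", right)
print(freq.most_common(3))
```

Exact string matching is used here for simplicity; a fuller tool would likely match on lemmas or roots so that inflected forms of the same isim or fi’il are grouped together.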

Implications/Applications: Even for translations of the Qur'an into English, for example, there is not yet a parallel corpus model of the Qur'an. Similar models for other languages, such as Indonesian, are possible.

Novelty/Originality of this study: This research is a first step towards the first model of a parallel corpus of the Qur'an and its Indonesian translation. The model can later serve as a reference for preparing similar corpus models for bilingual or multilingual corpora, for the benefit of translation research.

Keywords

Corpus; Arabic Language; Undergraduate Thesis; Concordance; Word Frequency; Indonesian Language

Ahsanuddin, M., Ma’sum, A., & Ridwan, N. A. (2020). INVESTIGATING ARABIC CORPUS (KorSA) OF INDONESIAN UNDERGRADUATE THESIS ABSTRACTS. Humanities & Social Sciences Reviews, 8(3), 920–927. https://doi.org/10.18510/hssr.2020.8396


