MONOPHTHONGISATION OF ENGLISH DIPHTHONGS /aɪ/, /eɪ/ AND /ɔɪ/ BY NATIVE SPEAKERS OF PASHTO

Purpose of the study: Phonological variation in diphthongs is a well-attested phenomenon when a language is spoken as a second or foreign language. When pronounced by native speakers of Pashto, the diphthongs of English undergo certain changes and are sometimes monophthongised. The purpose of the present study is to investigate this phonological variation, i.e., the monophthongisation of English diphthongs. Methodology: For this purpose, 20 Pashto speakers of both the soft and hard dialects were selected and asked to pronounce words containing the target diphthongs in initial, medial and final position, preceded and followed by voiced and voiceless sounds. PRAAT was used to analyze the data and measure possible variations in the sounds. Main Findings: The findings show that the monophthongisation of English diphthongs is common in the speech of native speakers of Pashto. The study further shows that some of the sounds are lengthened and that deletion also occurs in certain contexts. These features mark the English spoken by native speakers of Pashto as a distinct variety. Applications of this study: This study has applications for learners and teachers of English, who can use it to work on the problematic diphthongs that are usually monophthongised. In this way, these sounds can be practised and the problems rectified. Novelty/Originality of this study: This is an original study that examines how the problematic English diphthongs are monophthongised by native speakers of Pashto, making this a distinctive feature of the English of native Pashto speakers.


INTRODUCTION
English has attained the status of a global language, holding the world together and bridging apparent differences of color and creed. Gone are the days when English was considered a language confined to colonizers and a symbol of their supremacy. English, as described by Azad (2013), has been denationalized and is no longer the property of the British and Americans alone. It has developed into a code that links people of diverse cultural groups throughout the world. It is the only language that has more non-native speakers than native speakers. Only 3% of the total English-speaking population of the world are native speakers, while 97% are non-native speakers of English (Weeks, 1996). Kachru and Nelson (2006) estimate that the total number of English users in South Asia alone is up to 600 million, whereas the number of native English speakers (the inner circle) is estimated at up to 375 million and the number of English users in the outer circle at up to 1,000 million. The rapid development and spread of English to various parts of the globe have given birth to the phenomenon of World Englishes (WEs). In other words, World Englishes (WEs) refers to the emerging varieties of English, spoken in different parts of the world and influenced by indigenous languages. Kachru (1986), the founder and pioneer of the concept and term WEs, explains that English has become an international language and that, owing to its exposure to various cultures and social setups, considerable variation can be observed in the varieties spoken in different parts of the world. Philipson (2008) makes the same point: World Englishes may vary according to the culture or community in which they are spoken, resulting in convergence with the native languages of the people concerned. The importance of these emerging varieties can be understood from their many functions in diverse cultural groups and communities.
Kachru and Smith (2008) explain that the term World Englishes represents the formal and functional variation in the language and its worldwide acculturation, for example in West, South and East Africa, South Asia, South East Asia, the West Indies, and the Philippines, as well as in the traditional English-using countries such as the USA, UK, Australia, Canada, and New Zealand. The language belongs not only to those who use it as a first language but also to those who use it as an additional language, whether in its standard or indigenous forms.
In the present day, the spread of English throughout the world is beyond expectation. Besides standard varieties of English, e.g., British English, American English and Australian English, many local varieties such as Polish English, Singaporean English, Malaysian English and Thai English are also rapidly emerging, leading towards the indigenization of the English language. Besides, the broad term South Asian English is also in use, which refers to all … For this purpose, one such exploration concerns change in the sound system. One such change in the sound system is the monophthongisation of English diphthongs. Therefore, the present study explores the monophthongisation of the English diphthongs /aɪ/, /eɪ/ and /ɔɪ/ by native speakers of Pashto.

Research Objectives
The present study aims:
1. To identify commonly used variant English diphthongs (used as monophthongs) by native Pashto speakers.
2. To explore the common features of the process of monophthongisation of English diphthongs by Pashto speakers.

LITERATURE REVIEW
In exploring the learning of English as a second language, Mousa (2015) worked on the acquisition of the closing diphthongs /ǝʊ/ and /eɪ/. The target learners were Arab participants and Jamaican informants, who were assumed to replace diphthongs with monophthongs in their use of English. 'They opt for /e:/ and /o:/ instead of /ǝʊ/ and /ei/' (Mousa, 2015, p.5). They also tend to lengthen the sound: "Instead of coming up with the glide associated with the production of diphthongs, they chose to lengthen the equivalent of the second element of the diphthong concerned" (Mousa, 2015, p.7). The reason for this monophthongisation of diphthongs by Arabs and Jamaicans is the absence of centralized vowels in their vocalic systems. Monophthongisation is not confined to speakers whose native language lacks the relevant diphthongs; even adult native users of English apply the same process, because monophthongs require less articulatory movement than diphthongs. Mousa highlights the same point when he says that "diphthongs are more complex than monophthongs" (Mousa, 2015, p.12) and other long vowel sounds from the perspective of articulation. An important aspect of the diphthong is the glide, and its absence results in "hindrance in the production of diphthongs" (Mousa, 2015, p.12). Speakers tend to reduce diphthongs to simple monophthongs by eliminating the glide during production. This elimination of the glide minimizes the difference between monophthong and diphthong, and hence speakers prefer to produce a monophthong instead of the diphthong in their use of English. The same point is made by Moosmüller (1997) in exploring diphthongs in Austrian German. Moosmüller (1997) traces the history of this process (the monophthongisation of diphthongs) and places its initiation around 1900.
Since then, diphthongs have been undergoing this process and losing their status as sounds in the language, as Moosmüller asserts: "consequently diphthongs are excluded from the phoneme inventory of this variety" (Moosmüller, 1997, p.1). In this way, the pronunciation of words also changes and speakers no longer produce diphthongs in their use of the language. This study shows that it is not only English that tends to be monophthongised; other languages and varieties are also affected by the process. Another study of monophthongisation by native speakers is presented by Watt (2003). While exploring the internal and external forces causing sound change in language, Watt (2003) examines the emergence of new forms and the decline in the use of local forms. His analysis shows that the diphthongs /ǝʊ/ and /ei/, as in goat and gate, are localized by most speakers aged between 16 and 25. They localize these diphthongs by pronouncing them as monophthongs, so that the diphthongs are replaced by /o:/ and /e:/. This tendency is most noticeable in the speech of females, whose frequencies of /o:/ and /e:/ were recorded as 176 and 132, compared with 127 and 112, respectively, for males. Watt (2003) asserts that changes in Tyneside FACE and GOAT are fairly typical of current changes reported to be taking place in urban British English more widely and point to a general process of leveling whereby localized speech forms lose out to variants coming into a variety from elsewhere (Watt, 2003, p.1623).
Conversely, the process of monophthongisation is not confined to the native varieties of English; African and Asian Englishes are also influenced by it.
While working on the indigenization of English by native speakers of Shona (Zimbabwe), Kadenge (2009) explored the process of monophthongisation through the simplification of diphthongs. He further observes that this process is applied by native Shona speakers who unconsciously developed Afrikaan in English. Kadenge (2009) also notes that Shona uses only five vowels (monophthongs). The use of monophthongs instead of diphthongs is therefore due to the absence of complex vowel sounds, i.e., diphthongs and triphthongs, in Shona. Kadenge views monophthongisation favorably, regarding it as a distinctive quality of this variety and asserting that it underpins Zimbabweans' legitimate ownership of the variety (Kadenge, 2009). Words like cake, say, stair and chair, etc. are pronounced as /keki/, /sei/, /tʃɪ:/. One of the unique characteristics of Shona speakers of English is the use of the /j/ sound, which is inserted to break up the diphthong. The diphthong /oi/ as in oil is pronounced as /ojil/. The reason for the use of this epenthetic /j/ instead of a glide is that /j/, /e/, and /ɪ/ share the same place of articulation (involving the hard palate). Epenthesis is not confined to /j/; /w/ is also used to break up diphthongs, and hence the diphthong /au/ is replaced by /aw/ in the pronunciation of English diphthongs. The word town /taun/ is pronounced as "/tawun/, poor as /pʊwǝ/". Furthermore, such variation in diphthongs also results in the division of syllables, as Kadenge (2009) asserts: "the two components of diphthongs separated by the glide form two different syllable nuclei and hence the word is disyllabified" (Kadenge, 2009, p.163). In the process of monophthongisation, only those sounds that have an equivalent in Shona are retained, and "L1 Shona speakers prefer their native vowels to English diphthongs in their spoken English" (Kadenge, 2009, p. 165).
Furthermore, the absence of long vowels in Shona means that they are also not used in English. Words containing long vowel sounds are pronounced with short vowel sounds, e.g., /flɪ:t/ as /fliti/ and /pu:l/ as /pul/. A close study of such words suggests that confusion may arise with minimal pairs like pool and pull, full and fool, etc. The overall variation in this variety of English can be summarized in three terms: monophthongisation, length reduction and glide epenthesis. These characteristics also contribute to the nativization of English in this context.
In view of the reviewed literature, the importance of the monophthongisation of English diphthongs in a global context can be clearly deduced. The current study does not contradict the existing literature on the monophthongisation of English diphthongs; rather, it confirms that literature and adds something important and unique to the pool of World Englishes in general and Pakistani English in particular. Phonological variation in English, specifically with regard to diphthongs in a Pashto language context, has not been explored before. Furthermore, most attempts to describe Pakistani English are from the perspective of Urdu (Talat, 2002; Baumgardner, 1993; Mahboob, 2009; Rehman et al., 2012) or Punjabi (Riaz, 2015; Hussain and Mahmood, 2012), whereas the present research addresses English from the perspective of Pashto in order to highlight the diphthongal variation in English that exists in a Pashto-speaking context. This study aims to describe the features of English used by Pashto speakers in their currently existing forms in a Pashto-speaking context rather than to treat them as deviations from SE. Besides, researchers have generally taken up either the linguistic or the sociolinguistic aspect, and very few have touched on both. It is deemed appropriate to investigate English diphthongs in the context of Pashto because this will not only approach the issue from both the linguistic and sociolinguistic dimensions but will also study English diphthongs in relation to a language that has not yet been studied, and will highlight the specific features that have become localized and are no longer considered errors. Thus, the present study is the first of its kind, because English diphthongs have never been explored from the perspective of Pashto speakers.
Identifying variation in the use of English diphthongs is the responsibility of local linguists, and the present study is a step in that direction.

METHODOLOGY
The study is essentially qualitative and used PRAAT for the analysis of the data. It follows the procedure of Pillai (2014), using the text grid function in PRAAT to label the words built on the target diphthongs. Pillai claims that it is the presence or absence of a glide that determines the status of a sound as a monophthong or a diphthong. In order to measure the glide from one element of a diphthong to the other, the Rate of Change (ROC) formula was used, as adopted by Pillai in the analysis of the diphthongs of Malaysian English. PRAAT 5.3 was used as the instrument for analyzing variations in the English diphthongs produced by Pashto speakers.
PRAAT is a freeware system for speech analysis; it not only records data but also analyzes it in terms of waveform, spectrogram and formants. It can record speech sounds, display the duration of sounds, show formants and the waveform, allow the words in an audio file to be transcribed, and print and save files. Diphthongs show a great deal of formant movement on the spectrogram of a sound file. Usually, in the measurement of monophthongs, the very beginning and end of the vowel are avoided and the focus is on the portion where the formants appear flat and stable on the spectrogram for a while. Diphthongs are measured from their starting point: a spot close to the beginning of the diphthong is chosen for measurement. The measurement procedure is given in detail in the data analysis section. The analysis of diphthongs is based on the spectrogram and formants of each individual sound at different levels. This analysis makes it possible to highlight the glide between the elements of a diphthong. As mentioned above, it is the presence or absence of a glide that distinguishes diphthongs from monophthongs. The absence of a glide in the production of English diphthongs leads to their monophthongisation. Such variations in the use of English by native Pashto speakers are the main focus of the present study.
To develop the word list for the study, words were selected according to the position of the target diphthongs. The target diphthongs occur in three positions: word-initial, word-medial, and word-final. In word-initial position, the target diphthong is at the beginning of the word, e.g., /aɪ/ as in 'item' and /eɪ/ as in 'aim'. In word-medial position, the target diphthong is in the middle of the word, e.g., /ɔɪ/ as in 'soil'. In word-final position, the diphthong is at the end of the word. Structuring the word list around initial, medial, and final positions provides a suitable environment for the phonological investigation of these sounds and helps avoid ambiguity in their pronunciation. Such a structure also supports objectivity in the investigation of a language or its elements.
In order to avoid bias and subjectivity and to ensure the authenticity of the data and results, the word list is structured not only by the position of the target diphthongs but also so that the diphthongs are preceded and followed by voiced and voiceless sounds. On the one hand, the target diphthongs are followed by a voiced sound in word-initial position, e.g., /aɪ/ in 'idol' and /eɪ/ in 'aid', and preceded by a voiced sound in word-final and word-medial positions. On the other hand, the target diphthongs are followed by … The pronunciation of each word by each subject was recorded separately using PRAAT. As mentioned above, the target diphthong in each word was placed in three positions (initial, medial, and final), preceded and/or followed by a voiceless consonant, and the same structure was then repeated with voiced consonants preceding or following the target diphthong, in order to remove ambiguity and mispronunciation in the production of each diphthong. Accordingly, 20 Pashto speakers of both the soft and hard dialects were selected and asked to pronounce the words containing the target diphthongs in initial, medial, and final position, preceded and followed by voiced and voiceless sounds.
The following tables (1 and 2) present the word list (deliberately and conditionally structured) containing the target diphthongs in initial, medial, and final positions, preceded or followed by a voiceless consonant in Table 1 and by a voiced consonant in Table 2, with a few exceptions regarding their structure.

RESULTS AND DISCUSSION
In the following section, following Pillai (2014), the data are analyzed in terms of the Rate of Change (ROC).
Screenshots showing the waveform, spectrogram, and first formant (F1), together with tables, are given for each of the selected diphthongs. The ROC formula helps to show the glide from one vowel to another. A high ROC value (in absolute terms) means that there is more glide in the diphthong, and a low value means that there is less. The formula for calculating ROC is: ROC (Hz/second) = (F1end - F1start) / Duration (seconds).
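The ROC calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually run in PRAAT: the function names and the example formant values (700 Hz, 400 Hz, 0.25 s) are hypothetical and are not taken from the recordings. The reading of the sign follows the interpretation used in this study, where a negative F1 ROC is treated as a rising trajectory and a positive one as a downward trajectory.

```python
def f1_rate_of_change(f1_start_hz: float, f1_end_hz: float, duration_s: float) -> float:
    """Return the F1 rate of change (ROC) in Hz per second.

    ROC = (F1 at end - F1 at start) / duration of the vowel in seconds.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return (f1_end_hz - f1_start_hz) / duration_s


def describe_glide(roc_hz_per_s: float) -> str:
    """Label the direction of the glide from the sign of the F1 ROC.

    Assumption (following the reading adopted in this study): a negative
    F1 ROC is a rising trajectory, a positive one a downward trajectory.
    """
    if roc_hz_per_s < 0:
        return "rising trajectory"
    if roc_hz_per_s > 0:
        return "downward trajectory"
    return "no F1 movement"


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    roc = f1_rate_of_change(f1_start_hz=700.0, f1_end_hz=400.0, duration_s=0.25)
    print(roc, describe_glide(roc))
```

The larger the absolute ROC value, the more glide there is; a value close to zero corresponds to little or no diphthongal movement. No cut-off between "monophthong" and "diphthong" is encoded here, because no such threshold is stated in the procedure.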

Source: Authors
The variations in the pronunciation of the selected diphthongs by native speakers of Pashto, as revealed by ROC, are given here. The ROC value of the diphthong /aɪ/ is -1560, which suggests the highest diphthongal movement or glide from /a/ to /ɪ/. Negative F1 ROC values indicate a rising trajectory or glide and show a movement from a lower target /a/ to a higher one /ɪ/. The pronunciation of /aɪ/ shows no variation in word-initial and word-final position (e.g., item, idol, sky, and guy show no significant variation) in the speech of native Pashto speakers. However, in word-medial position the same diphthong is interrupted by the insertion of /j/. Words like /dait/, /nait/ and /bait/ are pronounced as /dajit/, /najit/ and /bajit/.
The spectrogram clearly indicates a rising trajectory interrupted by /j/, as there is a wide gap between the glide and the last sound /t/ of the word bite in the above spectrogram. The interruption of the diphthong by /j/ between its elements thus indicates variation in the form of monophthongisation of the diphthong. This variation does not, however, always occur in word-medial position. In the case of /taim/, for example, Pashto speakers do not insert /j/ in the medial position.
The word time is pronounced with normal diphthongal movement, and the glide from /a/ to /ɪ/ is clearly observable on the spectrogram of /taim/. This analysis of /aɪ/ in different words in medial position shows that the insertion of /j/ between the elements of /aɪ/ depends on the following sound. Observation of most of the words built on the target diphthong in medial position (not only from the word list but other words as well) shows that if the target diphthong is immediately followed by /t/ or /d/, it is pronounced with variation in the form of the insertion of /j/. Words with medial /aɪ/ such as bite, kite, diet, fight, guide, night, height, light, might, right, sight, tight, glide, hide, bride and white, etc. are produced with the insertion of /j/ between the elements of the diphthong, whereas words with the same diphthong in medial position immediately followed by sounds other than /t/ and /d/, such as file, dial, giant, lime, mine, pile, time and vine, etc., are pronounced without the insertion of /j/. However, the result for this diphthong in medial position cannot be confined to the investigation and analysis of these words alone. There may be words with medial /aɪ/ followed by sounds other than /t/ and /d/, such as pipe and price, that show the same variation (insertion of /j/), and some words even with /d/ are pronounced without the insertion of /j/, as in kind, bind and blind, etc. In the light of the above examples, the variation in medial /aɪ/ needs further investigation before an accurate result can be reached. The ROC value of the diphthong /eɪ/ is the lowest in the above table, which shows no or very little diphthongal movement in the pronunciation of Pashto speakers.
The negative F1 ROC of /eɪ/ in the table below indicates a rising trajectory, but the value is so small that no glide can be observed in the spectrogram of this diphthong. Instead of producing a glide, native Pashto speakers tend to pronounce it as a monophthong with somewhat greater length. As mentioned earlier, it is the presence or absence of a glide that labels a sound as a monophthong or a diphthong. In the case of /eɪ/, the glide is eliminated entirely by Pashto speakers in the production of this English diphthong. As a result, a monophthong is produced instead of the English diphthong.

Source: Authors
The diphthong /ɔɪ/ has a positive F1 ROC, which indicates a downward glide from /ɔ/ to /ɪ/. The ROC value is also higher, which represents a prominent and noticeable glide. The trajectory from /ɔ/ to /ɪ/ is maintained in the production of words built on /ɔɪ/, as in oil, coin, and boil, etc. The spectrogram of the word "boil", built on the diphthong /ɔɪ/, with its formants and waveform is given below:
Figure 4: The spectrogram, waveform, text grid, and F1 of /ɔɪ/ in "boil"

Source: Authors
In the above spectrogram, the glide between the two elements /ɔ/ and /ɪ/ of the diphthong /ɔɪ/ is clearly visible, indicating that the diphthong is produced in its proper form. Pashto speakers tend to produce this target English diphthong with a glide when using English. However, the maintenance of the glide in /ɔɪ/ is not fully guaranteed in the English speech of native Pashto speakers. In this regard, the word 'point' is produced with unique variations and once again represents the process of monophthongisation of an English diphthong.
The analysis of words built on this target sound shows that the diphthong is pronounced with a downward trajectory by Pashto speakers. However, there is also an exception (in the form of variation) in the pronunciation of words based on this diphthong. The word 'point' is pronounced with the insertion of /w/ before the target diphthong, which eliminates the direct glide between the elements of the diphthong, and /ɔɪ/ is then changed into /aɪ/, so that instead of /pɔint/ Pashto speakers tend to pronounce it as /pwaint/. This is not the case with other words built on the same diphthong, e.g., boil, joy or oil, etc. These words are pronounced with a normal glide from the first element of the diphthong to the second, without the interruption of any other sound. Therefore, variation in this diphthong may be controversial and needs further investigation.
The spectrogram, waveform, text grid, and F1 of /ɔɪ/ in "point"

Source: Authors
The process of monophthongisation is also demonstrated in the form of words and their analysis given in the following table:

Source: Authors
Based on the examples obtained from the analysis of the above word list, the most salient features (specifically the monophthongisation of English diphthongs) of the pronunciation of these participants (representatives of Pashtuns) that differentiate it from other varieties of English are discussed as follows.
The analysis of spectrograms and formants, together with the comparison of standard and variant transcriptions, reflects the process of monophthongisation of English diphthongs by native Pashto speakers in KP (Khyber Pakhtunkhwa). Pashto speakers tend to pronounce all three English diphthongs with variations at different levels (initial, medial and final). One of the most common variant diphthongs is /eɪ/, which native Pashto speakers pronounce as /e:/ in English words built on this diphthong. In this way, the English diphthong /eɪ/ is converted into a monophthong by the omission of its second element, in accordance with the findings of Utulu, who states that in the process of monophthongization of English diphthongs, "one of the two elements, usually the second element, is deleted" (Utulu, 2014, p.1). The findings are also in accordance with Lorenson (1991), who much earlier noted a shift in some English sounds. This shift is more prominent in the other varieties of English.
/w/ is sometimes inserted before the elements of /ɔɪ/, as in point, which is pronounced as /pwaint/ by Pashto speakers.
In this word, Pashto speakers not only insert /w/ but also change /ɔ/ into /a/. Thus, the pronunciation of point becomes /pwaint/.