INVESTIGATING EXISTENTIAL INTERPRETATION OF WAITING FOR GODOT: A CORPUS-BASED STUDY OF LEXICAL FEATURES

Purpose of the study: This study examines the major aspects and linguistic qualities of Samuel Beckett's dramatic writing, focusing on Waiting for Godot, and assesses the significance of the play's linguistic characteristics. It also asks whether a linguistic interpretation of Waiting for Godot is compatible with Existential readings of the play. Methodology: The data were analyzed with the computational tools UAMCT, MAT, USAS, and AntConc. UAMCT was the primary tool; the others were used only to verify the validity of the results and to cover areas of analysis that UAMCT lacked. Main Findings: The linguistic analysis suggests that Samuel Beckett's dramatic works are a linguistic paradox: lexically simple but structurally complicated. The linguistic elements of Waiting for Godot develop themes such as "Pessimism," "Directionlessness," "Skepticism," "Nothingness," "Existence," "Ambivalence," "Boredom," and "Alienation." These themes are consistent with Existential interpretations of the play. Applications of this study: The study has far-reaching pedagogical implications for literature and language. It is relevant to students and teachers of English Language and Literature and to syllabus designers who deal with literature. Originality/Novelty: The play gives spontaneous linguistic expression to Existential themes. In a nutshell, in Waiting for Godot Beckett has not told the predicament of Existence but made it happen linguistically.


INTRODUCTION
Samuel Beckett is regarded as a representative modern dramatist. Because of his use of experimentation and inventiveness in his expression, he is known as "the last modernist." The famous phrase "making it new" by Ezra Pound is regarded as a key aspect of modernism (Alexander, 1979). Modernism is a broad term that incorporates a wide range of multi-faceted twentieth-century trends such as "Expressionism," "Dadaism," "Futurism," "Symbolism," and all other movements centered on the concept of "unpredictability" and "experimentation" (McDonald, 2006).
Gathering such disparate movements under the single heading of modernism is fairly problematic, but what they share is that their cultural and social realizations of the contemporary period are characterized as "modern." Despite a variety of inspirations, Samuel Beckett's work has a distinct voice (Afridi, 2020). To borrow Eliot's terms, his aesthetic genius and "unique talent" transcended all expectations and established him as a literary institution. His groundbreaking works questioned and permanently altered literary and theatrical paradigms. His writings are taught in introductory and advanced literature classes all across the world, and a study of literature is incomplete without an understanding of Beckett. It is surprising that, in this age of objectivity, so few efforts have been made to apply objective methods of analysis to his writings, even though text analysis, stylistic analysis, and discourse analysis have all made considerable advances (Can, 2017). Existing analyses of Samuel Beckett's works lack linguistic depth. Linguistic examination of Beckett's writings may open up new vistas and a spectrum of objective interpretations, leading to a greater understanding of his works based on more solid empirical findings; otherwise, his works will remain the puzzle they have always been (Cronin, 1999). Cronin (1999) has portrayed Beckett's enigmatic personality, his uneventful personal life, and his highly imaginative creative art. Weisberg (2000) has shown how shifting modernist views of the social function of literature were a formative factor in shaping Beckett as a writer. Cohn (2001) has dealt comprehensively with all of Beckett's genres and with the difficulties that may arise in approaching any of his works. Beckett's choice of words shows that language has materiality and, consequently, a history (Salisbury, 2008).
Pattie (2009) has researched the fundamental nature of light and darkness in Beckett's world and how this ambivalence awakens the quest for a credible reality that might be approached after stripping away all veils of illusion. His TV plays have been found to show an enhanced textual awareness (Hulle, 2011). William and Taylor (2011) have used Beckett's marginalia and proposed that their study can serve as a milestone for further research in this area. Bell (2011) has viewed Beckett's works as constituting a dynamic dialogue with his aesthetic of exclusion. Woycicki (2012) has shown that a wealth of mathematical patterns and structures dominates both the content and the form of Beckett's plays, and has argued that Beckett shares a similar approach towards a logographic representation of a record. Paraskeva (2013) has suggested alternative cross-links between Beckett's works for stage and screen, despite his often-cited inflexibility on the matter of intermodal adaptation. Finally, Tubridy (2014) has examined Beckett in terms of Performance Art, which leads to a reconsideration of elements vital to his theatre, such as the experience of the body in space in relation to duration and endurance, the role of repetition, reiteration, and rehearsal, and the visceral interplay between language and body.
Several researchers, on the other hand, have looked into the linguistic aspects of his writings. Carriere (2005) investigated Beckett's bilingualism and its impact on his later plays. She examined his writing style in All That Fall and Happy Days and found that it was influenced by his native language. Sikorska (1994) conducted a pragmatic analysis of the text of Endgame and found that the speech patterns of Beckett's characters enabled them to express the force of their existence. Laws (1996) investigated the relationship between music and language in Beckett's works and showed the patterns of phonological elements used to compose music. Computational techniques have been used to analyze linguistic aspects of text since the beginning of the twenty-first century, and they have yielded highly useful objective knowledge. UAMCT is among the most useful and productive tools for analyzing the layers of a text, and researchers have used it to investigate a number of topics. Authentic learner data annotated in UAMCT have been used to document word order errors with adverbs, while Fryer (2013) annotates 100 research papers totaling 700,000 words from the most prestigious medical journals; with the help of UAMCT, a sample of 164,000 words from that dataset was annotated for heteroglossia features. Subjects' infinite statements have also been tagged using UAMCT for features including syntax, animacy, information status, number, and paradigmatically.
USAS is another very useful computational tool to study the semantic features of a text. Many studies in different areas have been conducted using USAS, including metaphor in discourse, multi-word extraction, translation studies, to name but a few in the area of literary texts (Krennmayr, 2011).
The Multidimensional Analysis Tagger (MAT) is a tagger for exploring the linguistic features of a text along multiple dimensions. It has been applied to many kinds of text, though seldom to literary texts such as Beckett's dramatic texts.
The approaches discussed above have produced substantial knowledge about different types of text, but they have rarely been applied to the study of literary texts, particularly texts as distinctive as Beckett's plays, such as Waiting for Godot. The current study intends to fill this gap.

Statement of the Problem
Samuel Beckett produced a large and very popular body of dramatic work, and he established his own ways of conveying his genius. Owing to their linguistic composition, Beckett's works have remained a mystery. In the last few decades, several critical treatments of his works have been published, but only a few significant research studies have focused on the linguistic features of his dramatic writings. This study describes the lexical features employed in Waiting for Godot and uses them to draw an overall picture of the nature of the play. Most of Beckett's critics have judged his plays to be predominantly steeped in the Existentialist spirit. The present research explores whether a linguistic interpretation of the play conforms to the interpretation grounded in Existential philosophy. For this purpose, computational tools are employed to highlight lexical features and to assess their conformity with the Existential interpretation of the text.

Aim of the Study
This research aims to explore the lexical features of the play and to examine whether a linguistic interpretation of Waiting for Godot conforms to its Existential interpretation. The study thus set out to answer the following research questions:
• What are the dominant lexical features of the text of the play?
• To what extent do the results of the linguistic interpretation conform to the main tenets of Existentialism?

Significance
Our study broadens the scope of work on Samuel Beckett's literary writings and emphasizes the innovative aspect of his work. It lays firm foundations for a more impartial interpretation of Beckett's works than Beckett studies have so far offered. Because the present research studies how Beckett created Experiential meanings in his play, it may also be important for professional writers. It has far-reaching pedagogical implications for literature and language and is therefore relevant to students and teachers of English Language and Literature, as well as to syllabus designers who deal with literature.

METHODOLOGY
The data for the present study are the text of Samuel Beckett's Waiting for Godot. As indicated above, the analysis of the text relies mainly on computational tools, so the data were obtained in electronic form. Corpus software typically works with text files in the .txt format, so the data had to be prepared in that format. The text of the play was first exported to PDF format and then converted to .txt format.
In the next stage, the text was edited into the necessary format. Stage directions, meta-comments, and other items written by the author were removed from the play's text. The character names that precede each line of dialogue were also omitted, whereas names that appear within the dialogue were retained because they are important parts of Transitivity as Participants. The remaining text, which was ready for software processing, consisted of the dialogue between the play's characters. Two separate texts, edited and unedited, were nevertheless saved and analyzed separately. During data analysis, it was observed that two words, Silence and Pause, occurred very frequently as part of the stage directions.
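The editing step described above can be sketched in Python. This is only an illustration: the study does not state how the script was laid out or how it was edited, so the layout assumed here (speaker labels in capitals followed by a colon, stage directions in parentheses) and the names clean_line and clean_script are hypothetical.

```python
import re

# Assumed layout (not specified in the study): a speaker label such as
# "ESTRAGON:" opens each speech, and stage directions sit in parentheses.
SPEAKER_LABEL = re.compile(r"^[A-Z][A-Z ]*:\s*")
STAGE_DIRECTION = re.compile(r"\([^)]*\)")


def clean_line(line: str) -> str:
    """Return only the spoken words of one line of the script."""
    line = SPEAKER_LABEL.sub("", line.strip())
    line = STAGE_DIRECTION.sub("", line)
    return re.sub(r"\s+", " ", line).strip()


def clean_script(text: str) -> str:
    """Drop speaker labels and stage directions, keeping dialogue only."""
    cleaned = (clean_line(line) for line in text.splitlines())
    return "\n".join(line for line in cleaned if line)


# Illustrative input only; the layout is an assumption.
sample = (
    "ESTRAGON: (giving up again) Nothing to be done.\n"
    "(Pause.)\n"
    "VLADIMIR: I'm beginning to come round to that opinion."
)
print(clean_script(sample))
```

Names spoken inside the dialogue are left untouched because only a label at the start of a line is removed, which mirrors the decision to keep names that function as Participants.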
For the present study, the following software was used to analyze the texts of Waiting for Godot: UAM Corpus Tool (version 3), USAS (the current version, available online), MAT (version 1.1), and AntConc (version 3.2.4w).

Sampling
A random sampling technique was adopted to study the text of the play. A sample proportionate to the size of the text of the play was selected randomly from three portions (beginning, middle, and end) of this play.
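One way to read this sampling design is as a random draw of equal proportion from each third of the text. The sketch below works under that reading; the sampling unit (lines), the fraction, the fixed seed, and the function name are all assumptions, since the study does not report them.

```python
import random


def sample_three_portions(lines, fraction, seed=0):
    """Draw the same fraction of lines at random from the beginning,
    middle, and end thirds of a text, preserving the original order."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    third = len(lines) // 3
    portions = [lines[:third], lines[third:2 * third], lines[2 * third:]]
    sample = []
    for portion in portions:
        k = round(len(portion) * fraction)
        chosen = sorted(rng.sample(range(len(portion)), k))
        sample.extend(portion[i] for i in chosen)
    return sample


# Hypothetical 30-line text, sampling 20% of each third.
lines = [f"line {i}" for i in range(30)]
print(sample_three_portions(lines, 0.2))
```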

Piloting
Before the formal data analysis, a pilot project was designed and carried out to identify any problems or deficiencies in the analysis. During piloting, it was found that UAMCT did not analyze subjectless passive constructions such as 'nothing to be done.' To solve this issue, MAT was used to identify subjectless passive constructions. Moreover, UAMCT provided lexical information about the texts under analysis; this information was sufficient to draw some themes from the text but not to draw conclusive themes. Some technical problems were also identified during the pilot project. The most important was retrieving data in printable form: UAMCT generated a lot of data, but only descriptive statistical readings and Ideational coding could be printed, and the actual tables of analysis could not be printed directly. These difficulties were resolved by devising alternative ways of extracting authentic data from the tool for the final analysis. Piloting thus helped considerably in refining the final data analysis.

DATA ANALYSIS TOOLS AND PROCEDURE
The corpus-based analysis of the text of Waiting for Godot was conducted with four software tools: UAMCT, the USAS online English semantic tagger, MAT, and AntConc.

UAM CORPUS TOOL (UAMCT)
UAMCT (http://www.wagsoft.com/CorpusTool/download.html) is a platform for analyzing language data, designed by the prominent computational linguist Mick O'Donnell. Linguistic features are organized in a systemic network (an inheritance hierarchy) to reduce the amount of coding effort. A function called 'Explore' allows the user to search data that have already been analyzed in UAMCT. In addition, UAMCT provides frequencies and significance tests. In the Explore function, the researcher has two choices: to explore lexis (word counts, keywords, phrases of 2 to 6 words, etc.) or to explore features (e.g., frequencies of the desired features, i.e., Ideational, Interpersonal, and Textual).
The results can be obtained in this tool in four different forms. The software automatically provides the frequency of linguistic features per 100 words. However, in Biber's studies and those of other researchers, counts have been normalized to occurrences per 1000 words, and any comparison among varieties requires counts on this standardized basis. Following this tradition, this research has also used normalized counts per 1000 words.
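The rescaling described above is simple arithmetic. A minimal Python sketch follows; the function names and the sample figures are hypothetical illustrations, not values from the study.

```python
def per_thousand(raw_count: int, total_words: int) -> float:
    """Normalize a raw frequency to occurrences per 1000 words."""
    return raw_count * 1000 / total_words


def rescale_rate(rate: float, from_base: int = 100, to_base: int = 1000) -> float:
    """Convert a rate reported per `from_base` words to a rate per `to_base` words."""
    return rate * to_base / from_base


# Hypothetical figures: 30 occurrences in a 12,000-word text,
# and a tool that reports 1.5 occurrences per 100 words.
print(per_thousand(30, 12_000))
print(rescale_rate(1.5))
```

Expressing every count on the same per-1000 basis is what makes frequencies from texts of different lengths comparable.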
Operating MAT is relatively easier than operating UAMCT. First, the data are placed in a folder. When the tool is started, it asks for the input folder. Once the input folder (the one that contains the target text, in our case the .txt file of Waiting for Godot) has been selected, the software starts running. It automatically provides the following results: frequency counts of linguistic features (per 100 words), dimension scores, and plots placing the target text relative to other texts on different textual dimensions.

ANTCONC
AntConc, developed by Laurence Anthony of Waseda University, Japan, differs from UAMCT, USAS, and MAT in that it is corpus search software: it works on tagged and untagged data and helps researchers retrieve the desired information. The tool is widely used by corpus linguists, and its use keeps increasing. AntConc is used mainly to create concordances for words and phrases, word lists, clusters, and collocations. The current research draws on the first three of these. The analysis in AntConc was performed on the untagged text. Operating AntConc is very easy: after opening, it asks the user to browse to the file(s) to be processed, and once a file has been loaded, all of its functions are available.

ANTCONC SETTINGS FOR DIFFERENT SEARCH FUNCTIONS
As mentioned earlier, UAM Corpus Tool, USAS, and MAT help tag the data, while all searches through the tagged files are carried out with software that retrieves the desired chunks, in our case AntConc. To perform different search operations within the untagged text file and the USAS- and MAT-tagged files, the following settings were made in AntConc.
Finding the vocabulary items related to different tags involves the "Collocates" function. After clicking the "Collocates" button, the window span was set to "1L" on the left and to 0 on the right. After these settings, the tag in focus was entered in the search bar. For example, to generate instances of living creatures, the tag "L2" was entered in the search bar, which produced the results shown in Table 1:

Rank  Freq  Freq(L)  Freq(R)  Collocate
2     15    15       0        Tree
3     7     7        0        leaves
4     3     3        0        radish
5     3     3        0        bough
6     2     2        0        willow
7     2     2        0        Pines
8     2     2        0        Bush
9     1     1        0        thickets
10    1     1        0        Shrub
11    1     1        0        Reeds
12    1     1        0        radishes

To find the frequency of n-gram clusters of words, the untagged file was loaded in AntConc. The next step was to specify the size of the n-gram cluster, which was set to 3. A sample output from the cluster function is shown below.
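The 3-word cluster count can also be approximated outside AntConc. The sketch below is not AntConc's implementation: the tokenization rule (lowercased runs of letters with straight apostrophes) and the name ngram_counts are assumptions made for illustration, and the sample sentence is invented.

```python
import re
from collections import Counter


def ngram_counts(text: str, n: int = 3) -> Counter:
    """Count every run of n consecutive word tokens in the text.

    N-grams are allowed to span sentence boundaries, and a real tool
    may tokenize differently, so counts can differ from AntConc's.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


counts = ngram_counts("I don't know. I don't know why.")
print(counts.most_common(1))
```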

RESULTS AND DISCUSSION
The tools described above were applied to analyze the text. The analysis and interpretation of the data are presented in this section.

LEXICAL FEATURES (THROUGH USAS, MAT, AND ANTCONC)
The corpus tools were applied primarily to study the text's lexical and semantic features and to cross-validate the results obtained through UAMCT. This combination of computational tools and analytic methods was intended to make the results of the study valid and reliable. As mentioned earlier, UAMCT did not analyze some subjectless clauses; it was therefore important to apply analytical tools that could make up for this deficiency of the main tool.

N-GRAM ANALYSIS
The frequency of n-grams (3-word clusters), computed with AntConc, emphasizes the theme of lostness and directionlessness. The most frequent n-gram clusters are presented in Table 2 below.

Table 2: Most frequent n-gram clusters
N-gram Cluster     Frequency
I don't know       30
We're waiting for  6
Don't know sir     5
What'll we do      4
It's not certain   3

RELATIVE FREQUENCY OF TIME WORDS
In Waiting for Godot, the present time is referred to with greater frequency than either the past or the future. To compare time orientation in the play, the USAS tagger was used. In the output file of lexical items, references to time rank, from most to least frequent: present, past, future. The following figure captures this order along with the exact frequency of verbs related to aspects of time.

Source: Authors' calculation
As the above figure shows, the past is referred to about twice as often as the future, while the present is referred to nearly four times as often as the past (see the figure for frequencies). Analysis of time orientation is not covered in the analysis through UAMCT and is supplemented by MAT. This feature of the text is again in conformity with the essence of Existentialism.

FORMS OF BE
Multidimensional Analysis Tagger (MAT) tags a text for different forms of BE and reports the frequency of this feature. The frequency of the different forms of this lemma is 18.9 per 1000 words in Waiting for Godot. Different forms of the word 'be' have been used to explore different existential aspects, as is evident from the following examples:
So you are again…
What is it?
What are you insinuating?
Nothing is certain…
Is it not rather Sunday?
This is enough for you.
The road is free to all.
What age are you if it's not a rude question?
The tears of the world are a constant quantity.
You are not from these parts…
It is pale and luminous
We were respectable in those days
The Dead Sea was pale blue.
Were you not there?
I thought it was he.
Who am I to tell my private nightmares…
But am I heavier than you?
The results obtained through UAMCT show a very low frequency of Existential clauses, whereas MAT reports the forms of 'be' as a whole. MAT thus generates more consolidated results on this aspect of the text of Waiting for Godot. Existentialism is based on the idea of existence: we exist only because we are; nothing precedes or follows our existence.

SEMANTIC ANALYSIS
A search of the USAS-tagged file with the AntConc concordance function reveals that the words used for plants outnumber those used for living creatures: references to living creatures are less frequent than references to plants (14:42::1:3). Among the words associated with plants, the following are the most frequent: tree (15), leaves (7), radish (3), and bough (3). There are only two references to the animal world: 'dog' and 'pig.' Even these references are not frequent compared with the Reference Density of personal pronouns. Man is shown only in relation to man or, in other words, in relation to his own existence.
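A comparison of this kind can be sketched as a count of words per semantic tag. The example below assumes a tagged file in which every token has the form word_TAG; the study does not describe its file format, and PLANT, CREATURE, GRAM, VERB, and NUM are placeholder labels rather than real USAS codes. The function name and the sample string are likewise hypothetical.

```python
from collections import Counter


def words_by_tag(tagged_text: str, tag: str) -> Counter:
    """Count the word forms that carry a given tag in word_TAG tokens."""
    counts = Counter()
    for token in tagged_text.split():
        word, sep, token_tag = token.rpartition("_")
        if sep and token_tag == tag:
            counts[word.lower()] += 1
    return counts


# Invented sample with placeholder tags, not output from USAS.
sample = (
    "the_GRAM tree_PLANT has_VERB four_NUM leaves_PLANT and_GRAM "
    "a_GRAM dog_CREATURE sleeps_VERB under_GRAM the_GRAM Tree_PLANT"
)
plants = words_by_tag(sample, "PLANT")
creatures = words_by_tag(sample, "CREATURE")
print(plants.most_common())
print(sum(creatures.values()), ":", sum(plants.values()))
```

Dividing both totals by the smaller one gives a reduced ratio of the kind reported above, for example 14:42 reducing to 1:3.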

CONVERSATIONALITY OF WAITING FOR GODOT
To assess the extent of the conversationality of the text, a multidimensional analysis of the text of Waiting for Godot was conducted with MAT, which produced the following results. Because the scores on Dimensions 4 and 5 have no direct connection with conversational aspects, they are not discussed in the following lines.
The results for Dimensions 1 to 3 are represented graphically in Figure 2 below. Texts that achieve high positive scores on Dimension 1 (Involved vs. Informational Production) are typical spoken texts, while those achieving high negative scores are typical of the written mode. The score achieved by conversation on this dimension is 35.3, while for spontaneous speeches it is 18.2 (Biber, 1988). The text of Waiting for Godot, with a high positive score of 23.95, may in this respect be seen as resembling spoken discourse, especially 'spontaneous speeches.' The score achieved by the play on Dimension 2 (Narrative vs. Non-narrative Concerns), i.e., 0.55, sheds further light on the conversational character of the text. All texts that have high negative scores on this dimension and high negative scores on Dimension 1 are typical written texts. As is evident from the results, the text of the play does not obtain a negative score on either of the first two dimensions. On Dimension 3 (Explicit vs. Situation-Dependent Reference), typical written texts have high positive scores, while typical spoken texts have high negative scores. The score obtained by conversation on this dimension in Biber's (1988) study is -3.9. On this dimension, the text of the play achieves a fairly high negative score, i.e., -2.88. Based on the scores on Dimensions 1 to 3 of the multidimensional analysis, it is fair to assume that the text of Waiting for Godot resembles conversation/spoken communication.
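The Dimension 1 comparison can be checked with a short calculation that finds the register whose score lies closest to the play's. The sketch uses only the figures quoted above; treating the absolute difference as the measure of closeness is an assumption of this illustration, not a procedure stated in the study, and the names are hypothetical.

```python
# Dimension 1 scores quoted in the text (Biber, 1988) and the play's score.
DIMENSION_1 = {"conversation": 35.3, "spontaneous speeches": 18.2}
PLAY_DIMENSION_1 = 23.95


def closest_register(score: float, reference: dict) -> str:
    """Return the register whose score has the smallest absolute gap."""
    return min(reference, key=lambda name: abs(reference[name] - score))


print(closest_register(PLAY_DIMENSION_1, DIMENSION_1))
```

The play's score is 5.75 points from spontaneous speeches and 11.35 points from conversation, which fits the reading that it resembles spontaneous speeches in particular.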

Figure 2: Dimension Scores of Waiting for Godot
Source: Authors' calculation
The major portion of the results has been produced by applying UAMCT, but the results generated through USAS, MAT, and AntConc are no less significant. Most of the results from the corpus tools agree with one another, which supports the reliability and validity of the findings of this study. The analysis and interpretation of the data reveal that Waiting for Godot has many linguistic features that conform to its Existentialist interpretations. The results produced through the analysis of the data are discussed in this section.
The text contains many grammatical segments, mainly because the tool identifies each linguistic unit as a segment if it is potentially equivalent to a sentence, for example, No, Yes, Why, OK, etc. The average word length in Waiting for Godot is 3.85 letters, and the average length of a grammatical segment is 2.67 words. These counts signify the simplicity of the text and indicate that it is full of monosyllabic words. Both features are consistent with a simple text. But very short grammatical segments construct a theme of ambiguity and inconclusiveness. Waiting for Godot is a linguistic paradox: lexically it is very simple, but it has complicated grammatical constructions (Ilyas & Rafi, 2016).
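Both averages can be approximated with a few lines of Python. The sketch below does not reproduce UAMCT's segmentation: it assumes that a segment ends at sentence-final punctuation and that a word is a run of letters, optionally with an internal apostrophe. The function names and the sample text are hypothetical.

```python
import re

WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def average_word_length(text: str) -> float:
    """Mean number of letters per word (apostrophes are not counted)."""
    words = WORD.findall(text)
    return sum(len(w.replace("'", "")) for w in words) / len(words)


def average_segment_length(text: str) -> float:
    """Mean number of words per segment, splitting on . ! ? and ellipsis."""
    segments = [s for s in re.split(r"[.!?…]+", text) if WORD.search(s)]
    return sum(len(WORD.findall(s)) for s in segments) / len(segments)


sample = "Yes. Why? Nothing to be done."
print(average_word_length(sample), average_segment_length(sample))
```

One-word turns such as "Yes." and "Why?" each count as a full segment, which is how a text of this kind arrives at a very low average segment length.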
Waiting for Godot has a very low lexical density: content lexemes make up 42.08% of the text, and the average length of a lexical unit is 1.12. This shows that there is little inflection or affixation in the text. The short length of lexemes again corresponds with the simplicity of the text. But the number of content lexemes in the text is significantly low. The meaning of a text depends on the number of lexemes, too, because lexemes are content words and are laden with meaning. The remaining, structural words have a structural (grammatical) rather than semantic role. This high proportion of non-content words in Waiting for Godot makes the text difficult for readers to comprehend. Thus, the text's vocabulary is very simple, but its syntactic compositions are either very complicated or fragmentary (Hwang, 2019). These linguistic features construct a theme of complexity, and the text of Waiting for Godot may be termed 'simplicity complicated.' The analysis of Waiting for Godot reveals that the text has linguistic features which develop themes of isolation, directionlessness, obscurity, alienation, pessimism, and boredom. In other words, Waiting for Godot is closely akin to the themes of Existentialism (Afridi, 2020).
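Lexical density is the share of content words among all words. The sketch below illustrates the calculation only: the short function-word list is a hypothetical stand-in for the part-of-speech information a tool such as UAMCT would use, so it will not reproduce the reported percentage.

```python
import re

# Hypothetical, deliberately small list of function (structural) words.
FUNCTION_WORDS = {
    "a", "an", "the", "i", "you", "he", "she", "it", "we", "they",
    "am", "is", "are", "was", "were", "be", "to", "for", "of", "in",
    "on", "and", "but", "or", "not", "all",
}


def lexical_density(text: str) -> float:
    """Percentage of word tokens that are not in the function-word list."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    content = [w for w in words if w not in FUNCTION_WORDS]
    return 100 * len(content) / len(words)


print(lexical_density("We are waiting for Godot"))
```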
The text has a very high Reference Density (i.e., the frequency of 1st person, 2nd person, and 3rd person pronouns, referred to as 1p Reference, 2p Reference, and 3p Reference, respectively). The use of personal pronouns is very high compared with a normal English text, which highlights the fact that Beckett's plays are highly interactive. But there are only five characters in Waiting for Godot; why, then, is there such a high density of 3p Reference? Who are these HEs, SHEs, and THEYs? This feature of the text constructs the theme of obscurity (Can, 2017).
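Pronominal reference by person can be counted along the following lines. This is a sketch under assumptions: the pronoun lists (which include possessive and reflexive forms), the per-1000-word scaling, and the name reference_density are choices made for illustration and may not match how UAMCT defines Reference Density.

```python
import re

# Assumed pronoun inventories; a tool may define these sets differently.
PRONOUNS = {
    "1p": {"i", "me", "my", "mine", "myself",
           "we", "us", "our", "ours", "ourselves"},
    "2p": {"you", "your", "yours", "yourself", "yourselves"},
    "3p": {"he", "him", "his", "himself", "she", "her", "hers", "herself",
           "it", "its", "itself", "they", "them", "their", "theirs",
           "themselves"},
}


def reference_density(text: str) -> dict:
    """Return pronoun frequency per 1000 words for each grammatical person."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = dict.fromkeys(PRONOUNS, 0)
    for token in tokens:
        base = token.split("'")[0]  # "i'm" is counted under "i"
        for person, forms in PRONOUNS.items():
            if base in forms:
                counts[person] += 1
    return {person: 1000 * n / len(tokens) for person, n in counts.items()}


print(reference_density("He said they were waiting. I'm tired of you."))
```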
Two words, 'Silence' and 'Pause,' occurred very frequently (as part of the stage directions) in Waiting for Godot. So many pauses and silences in a text suggest a much-broken flow of communication. This feature constructs a theme of boredom and stagnancy in the world the text represents. The frequency of 'Silence' and 'Pause' also suggests the confusion and ambivalence of the characters in Waiting for Godot (Kouachi, 2018).
Waiting for Godot has a very high count of 'wh' words. The characters are shown to be skeptical and uncertain about the world around them. This lexical feature of Waiting for Godot also constructs the theme of skepticism (Dreyfus & Bennett,