https://omeka.ibu.edu.ba/files/original/677baf7f9d6988021e9bd4876247c178.docx
https://omeka.ibu.edu.ba/files/original/a5fed30f085884d6608fc112c9ba1cf1.pdf

Use of Literary Texts in Language Classrooms: A Fun Way of Teaching English
Hasan Serkan Kirca
Süleyman Demirel University, Isparta, Turkey

Dublin Core
The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information, see http://dublincore.org/documents/dces/.

Extent (the size or duration of the resource): 1808
Title (a name given to the resource): Use of Literary Texts in Language Classrooms: A Fun Way of Teaching English
Author: KIRCA, Hasan Serkan
Abstract (a summary of the resource): Key words: motivation, literature, language teaching. ABSTRACT: The use of literary texts in language classrooms has long been a concern for researchers. The underlying rationale for using different genres of literature is that they familiarize language learners with different uses of the target language through authentic materials. Furthermore, literary texts provide a student-friendly atmosphere that is conducive to meaningful and entertaining learning. Language learning is considered a demanding endeavour for learners, and affective variables are among its challenges. Literary texts, by exposing learners to the imaginary and calming world of literature, help them cope with the anxiety or stress that may arise in the process of language learning. Along with these advantages, literary texts promote higher-level thinking skills such as synthesizing, analyzing and critical thinking. The first part of the presentation will be devoted to the rationale for using literary texts in language classrooms, with an emphasis on their potential benefits. In the second part, the presenter will provide information on a number of literary genres that can be employed in language classrooms. The presenter will end the session with a demonstration of how short stories, as a literary genre, can be utilized in language classrooms. The last part of the presentation will be interactive, with the participation of the audience.
Publisher (an entity responsible for making the resource available): IBU Publishing
Date (a point or period of time associated with an event in the lifecycle of the resource): 2013-05-03
Keywords: Article, PeerReviewed

Extent: 3537
Title: USE OF LITERATURE IN ELT A SHORT STORY SAMPLE: VERSION OF THE ADVENTURES OF HUCKLEBERRY FINN
Author: Erten, Selcen
Abstract: The aim of this paper was to emphasize the use of literature in ESL/EFL contexts and to investigate how students regarded literature in general and in English classes. Specifically, the use of short stories was explained and investigated through The Adventures of Huckleberry Finn. At the beginning of the first lesson with the researcher, a questionnaire was given to 32 students, who were beginner-level preparation class students at Eskisehir Osmangazi University in the spring semester of the 2012-2013 academic year. The questionnaire was administered a second time in the last lesson. The results revealed that the students believed in the importance and effectiveness of short stories in EFL classes, and that their attitudes toward literature in English stemmed from limitations in the linguistic level needed to understand and appreciate literature. Keywords: Literature, EFL/ESL context, short story.
Date: 2014
Keywords:
Conference or Workshop Item, PeerReviewed, PE English

Extent: 995
Title: Using ‘Glocal News’ to Develop Students’ Reading and Speaking Skills.
Author: ÖZÇINAR-SIREL, Nazan
Abstract: With advances in technology, many young people regard themselves as non-readers because they would rather get information from other media such as the Internet, television, advertising, music, movies, video games and other digital realities. Teachers are therefore constantly looking for challenging tasks that students can perform with these digital gadgets. Teachers are also aware that students need to be exposed to reading materials as much as possible in order to improve their level of English. It is difficult to envisage a language-teaching programme without any reading tasks assigned to students. Whether teachers assign graded readers or newspaper articles makes no difference: it is a known fact that students improve their reading skills with any reading task assigned to them. Reading newspaper articles is therefore an effective way for teachers to improve the reading skills of their intermediate-level students. Unfortunately, we are not much of a reading society; we do not even read a newspaper regularly in our mother tongue, let alone in English. ‘Glocal News’ is one such challenging task, designed to encourage intermediate-level students to read online newspaper articles that interest them and to present them online as a summary activity on MOODLE, an open-source online Course Management System (CMS).
This workshop suggests an innovative approach to reading online newspaper articles in order to create online video journals.
Date: 2012-05-04
Keywords: Conference or Workshop Item, PeerReviewed, P Philology. Linguistics

https://omeka.ibu.edu.ba/files/original/100774075b54fcc2fdcd0941149f75dc.docx
https://omeka.ibu.edu.ba/files/original/b39cf054c7052f7a0273e2415d401947.pdf

Using a Case Study to Teach the (Non)Subtleties of Language: Logical Fallacies and Principles of Conversational Coherence
Artur Hadaj & Christina Standerfer
University of Tirana, Tirana, Albania

Extent: 1741
Title: Using a Case Study to Teach the (Non)Subtleties of Language: Logical Fallacies and Principles of Conversational Coherence
Author: HADAJ, Artur; STANDERFER, Christina
Abstract: Key words: logical fallacies, case study, conversational coherence. ABSTRACT: This paper centers on a practical and relevant way to teach logical fallacies, and how to avoid them, to learners of English as a second language in the Balkan region. The paper begins with a brief overview of the importance of teaching subtleties of language, such as logical fallacies and principles of conversational coherence, and then describes a rather heated written exchange between the editors of the Albanian daily newspaper Shekulli and representatives of the U.S. Embassy. In 2011, Shekulli published a long editorial without adding any statement that the views expressed in the article did not represent the stand of the newspaper. Immediately after this editorial, the U.S. Embassy issued a brief statement accusing the newspaper of using an ad hominem argument when it explicitly referred to the ambassador’s Asian looks and his short stature. In its statement, the Embassy conveyed information regarding money the U.S. government had donated to the Albanian Media Institute for the qualification of Albanian journalists, the implication being that the journalists of this newspaper either did not want to attend the qualification courses organized by the Institute or could not understand the modern principles of newspaper writing. A few days later the Dutch embassy in Tirana severed relations with Shekulli, accusing its editors of engaging in slander. The description of the case is followed by an analysis, with a focus on the logical fallacies evident in the discourse (e.g., ad hominem arguments, non sequiturs, and glittering generalities). The paper concludes with lesson plans showing how the case can be used to teach not only logical fallacies but also principles of conversational coherence (Grice, 1989) by leading students through a series of exercises in which they reimagine and reconstruct the exchange in ways that produce different and perhaps more favorable outcomes.
Publisher: IBU Publishing
Date: 2013-05-03
Keywords: Article, PeerReviewed

Extent: 770
Title: Using A Moodle Platform In An Online Exchange To Enhance Intercultural Sensitivity: A Practical Experience In Higher Education
Author: Raluy Alonso, Angel
Abstract: As the Council of Europe suggests, foreign language teaching needs to comprise not only linguistic performance but also intercultural consciousness and intercultural skills. Despite being grammatically and lexically competent, many university students have limited experience in handling cultural difference due to a lack of exposure to intercultural interaction (Belz, 2006). As O’Dowd (2007) states, online communication tools not only offer more opportunities than before to interact with peers from distant societies but also provide an authentic and effective way of preparing learners for intercultural enrichment through partnership. The aim of this talk is to present a summary of the experience and findings of a semester-long online exchange between specialist learners of English at the University of Vic (Barcelona, Spain) and at the University of Opole (Poland) during the 2011-2012 academic year. The immediate objective pursued by both institutions was to establish a closer relationship between third-year students, both physically and virtually, so as to foster a better understanding of their counterparts’ culture. The project rested on the principles of reciprocity and learner autonomy, so the communication was asynchronous and took place mainly outside the classroom. To test the impact of the online communication on the students’ intercultural sensitivity, a small-scale study was conducted. During the session, the structure, outcomes, challenges and future of the experience will be discussed, and some preliminary results of the research project will be presented.
Date: 2012-05
Keywords:
Conference or Workshop Item, PeerReviewed, P Philology. Linguistics

https://omeka.ibu.edu.ba/files/original/68a47120ae18ec74cd7d533e54a2be14.pdf

3rd International Symposium on Sustainable Development, May 31 - June 01 2012, Sarajevo

Using Artificial Neural Networks To Forecast GDP For Turkey
Karaatli Meltem, Göçmen Yağcilar Gamze, Karacadal Hüseyin, Sezer Fırat Suleyman
Suleyman Demirel University, Isparta, Turkey
E-mails: meltemkaraatli@sdu.edu.tr, gamzeyagcilar@sdu.edu.tr, huseyin_karacadal@hotmail.com, cihangir_07_@hotmail.com

Abstract
An Artificial Neural Network (ANN) is a system that resembles biological neural systems and takes the working principles of the human brain as its basis. ANNs can be applied in various fields for forecasting, classification, optimization, data binding and so on. ANNs have been used frequently in financial applications in recent years.
In this study, ANN is used to forecast the Gross Domestic Product of Turkey. Gross Domestic Product (GDP) refers to the market value of all final goods and services produced within a country in a given period. GDP can be thought of as the size of an economy, and it is the foremost measure of a country's macroeconomic performance, economic health and standard of living. Therefore, expectations about future GDP can be the primary determinant of investments, employment, wages, profits and even stock market activities. Given this economic significance, the purpose of this study is to forecast Gross Domestic Product (GDP) for Turkey and to test the ability of the ANN method to forecast GDP.

Keywords: Importance of Gross Domestic Product, Forecasting, Artificial Neural Networks.

1. INTRODUCTION

Gross Domestic Product (GDP) is the total market value of all the final goods and services produced within a country’s borders in a given year. This production is generated both by citizens of the country and by foreigners living within its borders. GDP is one of the most important indicators of economic growth, health and welfare. Therefore, it tells us a lot about real economic activity. GDP can be calculated in one of two basic ways: either by adding up what everyone earned in a year (income approach), or by adding up what everyone spent (expenditure method). Logically, both measures should arrive at roughly the same total (www.investopedia.com). In Turkey, GDP is measured quarterly by TUİK. To compute economic growth, each quarter is compared to the previous one. Because it affects almost everybody in an economy, forecasting GDP is of great importance both theoretically and practically. First of all, GDP represents economic production and growth, so it gives a signal about future employment and wages.
GDP also determines stock market return rates. If the GDP growth rate is positive, investors may expect to gain revenue (www.investopedia.com). GDP reports show which sectors of the economy are growing and which are declining. This helps investors decide whether to invest and in which sectors (http://useconomy.about.com/od/grossdomesticproduct/p/GDP.htm). GDP statistics can also help economists address the problems of inflation in the country. The national income figures show how much the general price level has increased or decreased, how much of their income people spend on consumption goods and how much they save. The government can devise measures for controlling inflation or deflation on the basis of these figures for consumption, saving and investment (http://www.economicsconcepts.com/gdp_as_a_measure_of_welfare.htm). In the existing literature, GDP forecasting has been widely studied with different methods. In this paper, we wish to determine whether the forecasting performance for this variable can be improved using neural network models. In this context, the purpose of this study is to forecast the GDP of Turkey using the Artificial Neural Networks (ANN) method. The rest of this paper is organized as follows: Section 2 reviews some of the literature on GDP forecasts. Section 3 describes the methodology, while Section 4 presents the results. Finally, Section 5 concludes the paper.

2. Literature review

Tkacz and Hu (1999) examined whether more accurate indicator models of output growth, based on monetary and financial variables, can be developed using neural networks. The authors used an ANN model to forecast GDP growth for Canada. The main finding of the study is that, at the 1-quarter forecasting horizon, neural networks yield no significant forecast improvements.
At the 4-quarter horizon, however, the improved forecast accuracy is statistically significant. The root mean squared forecast errors of the best neural network models are about 15 to 19 per cent lower than those of their linear model counterparts. Marcellino (2007) evaluated whether complicated time series models can outperform standard linear models in forecasting GDP growth and inflation for the United States. The study considers a large variety of models and evaluation criteria, using a bootstrap algorithm to evaluate the statistical significance of the results. The main conclusion is that, in general, linear time series models can hardly be beaten if they are carefully specified. Schumacher and Breitung (2008) employed factor models to forecast German GDP using mixed-frequency real-time data, where the time series are subject to different statistical publication lags. In the empirical application, the authors used a novel real-time dataset for the German economy. Employing a recursive forecast experiment, they evaluated the forecast accuracy of the factor model with respect to German GDP. Guegan and Rakotomarolahy (2010) conducted an empirical comparison of the forecast accuracy of a non-parametric method, the multivariate nearest neighbor method, with parametric VAR modeling on euro area GDP. Using both methods for nowcasting and forecasting GDP, through the estimation of economic indicators plugged into bridge equations, the authors obtained more accurate forecasts with the nearest neighbor method. They also prove the asymptotic normality of the multivariate k-nearest neighbor regression estimator for dependent time series, which provides confidence intervals for point forecasts in time series. Mirbagheri (2010) investigated the supply-side economic growth of Iran by estimating GDP growth. The study also compares the predictive results of the Fuzzy-logic and Neural-Fuzzy methods.
According to the findings of the study, forecasting by the Neural-Fuzzy method is recommended. Ge and Cui (2011) applied a process neural network (PNN) to GDP forecasting, building the forecast model by choosing the main factors influencing GDP and exploiting the PNN's capacity to extract cumulative effects over both time and space. Compared with a traditional neural network forecast model, the PNN-based GDP forecast model performed better. Liliana and Napitupulu (2010) also used the ANN method to forecast GDP. They forecast GDP for Indonesia and set out the advantages and disadvantages of the method. On the basis of their results, the authors concluded that the ANN model has a better ability to forecast macroeconomic indicators.

3. Methodology

Artificial neural networks (ANN) may be described as computing technologies that reproduce the behaviour and general features of biological neural networks (Deng et al., 2008:1118). ANN, developed by imitating the human brain's operating mechanism with the aim of carrying out the basic operations performed by the brain, is a logical computer programming technique. An algorithm that runs on a computer and attempts to operate as the brain does (making decisions, drawing conclusions from the existing data when data are missing, constantly accepting new input, learning and remembering) is called an "Artificial Neural Network" (Kaltakçı, 1997:411-420). Artificial neural networks consist of many simple processing elements called nodes or neurons. Each neuron is connected to other neurons through weights. These weights hold the information the network uses to solve a problem. Neurons are arranged in layers, and each layer is connected to the neurons in the adjacent layers.
A weight gives the numerical value of the relative strength of a connection that transfers information from one layer to another. The summation function calculates the sum of all the weighted inputs of a neuron. The activation function converts the output into an acceptable range (usually 0-1). The input layer corresponds to the independent variables, while the output layer corresponds to the dependent variables (Deng et al., 2008:1118). Networks with one layer are called single-layered neural networks, while networks with more than one layer are called multilayered neural networks. In a multilayered neural network, the number of neurons in each layer may vary (Hines, 1997:206). While a single-layered network consists of an input and an output layer, a multilayered network also contains one or more middle (hidden) layers. As the number of middle layers increases, the ability of the network to extract statistics from the input data also increases (Nygren, 2004). Solving a nonlinear problem requires a more sophisticated type of network. Multilayer perceptrons (MLP) are network architectures developed for this purpose. An MLP has a feed-forward architecture and is trained with a supervised learning method (Deng et al., 2008:1118). It consists of an input layer, one or more middle layers and an output layer. Each layer has one or more processing elements. Every processing element in a layer is connected to every processing element in the next layer. Information flows forwards and there is no feedback, which is why these networks are called feed-forward neural network models. No information processing takes place in the input layer. The number of processing elements in the input and output layers depends entirely on the problem at hand.
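The summation and activation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: NumPy is assumed, the hidden layer uses the hyperbolic tangent (the 'tansig' transfer function the paper reports selecting), the output layer is assumed to be linear, and the weights and inputs are random placeholders rather than trained values or real data.

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent sigmoid; squashes values into (-1, 1)."""
    return np.tanh(x)

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through an input -> hidden -> output network.

    x        : (n_samples, n_inputs)
    w_hidden : (n_inputs, n_hidden),  b_hidden : (n_hidden,)
    w_out    : (n_hidden, n_outputs), b_out    : (n_outputs,)
    """
    # Summation function: weighted sum of the inputs of each hidden element.
    hidden_sum = x @ w_hidden + b_hidden
    # Activation function: map the sums into a bounded range.
    hidden_act = tansig(hidden_sum)
    # Output layer: weighted sum of hidden activations (linear output assumed).
    return hidden_act @ w_out + b_out

# Hypothetical 7-3-1 network with random, untrained weights.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(4, 7))   # 4 made-up, already scaled samples
w_hidden = rng.normal(size=(7, 3))
b_hidden = np.zeros(3)
w_out = rng.normal(size=(3, 1))
b_out = np.zeros(1)

y_hat = forward(x, w_hidden, b_hidden, w_out, b_out)
print(y_hat.shape)
```

Note that the input layer does no processing here: the inputs are passed unchanged into the first weighted sum, and information only moves forward.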
The number of middle layers and the number of processing elements in them are found by trial and error (Lippmann, 1987:24-25). In each learning iteration, every output produced by the network is compared with the target output and the errors are calculated. This error is propagated backwards through the network and used to correct the weights. The process continues until the mean squared error between the target output and the output produced by the network is minimized (Deng et al., 2008:1118). For this reason, this type of network is also called the error propagation model or backpropagation network model (Öztemel, 2003:76). This type of network is illustrated in Figure 1.

Figure 1: A Multilayered Network Model (Input 1 to Input N feed the input layer, followed by the middle layer and the output layer; activation flows forward and error flows backward). Source: (Hamid and Iqbal, 2004:1118)

4. Forecasting Gross Domestic Product with ANN

In this study, gross domestic product has been estimated with artificial neural networks on the basis of quarterly data calculated by the expenditure method for the years 1998-2010. The data have been drawn from the website of the Turkish Statistical Institute. For each variable, 52 observations have been used, covering the quarters of 13 years. The data were divided at random into 20% for testing and 80% for training, which creates 4 different groups. Gross domestic product is a composite of macroeconomic variables such as resident household consumption, government final consumption expenditure, gross fixed capital formation, stock exchanges, and export and import of goods and services.
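The random 80/20 split and the backward error flow described above can be sketched as follows. This is a minimal illustration under several assumptions: NumPy is assumed, the 52 observations are synthetic (not the Turkish Statistical Institute data), the network is a hypothetical 1-3-1 structure with a tanh hidden layer, and plain gradient descent stands in for the Levenberg-Marquardt training ('trainlm') that the paper reports using in Matlab.

```python
import numpy as np

rng = np.random.default_rng(42)

# 52 synthetic quarterly observations (13 years x 4), NOT the paper's data:
# a scaled time index and a smooth made-up target.
t = np.linspace(0.0, 1.0, 52).reshape(-1, 1)
y = 0.2 + 0.6 * t**2

# Random 80% / 20% split into training and test sets.
idx = rng.permutation(52)
n_test = round(0.2 * 52)                      # 10 test observations
test_idx, train_idx = idx[:n_test], idx[n_test:]
x_tr, y_tr = t[train_idx], y[train_idx]
x_te, y_te = t[test_idx], y[test_idx]

# Hypothetical 1-3-1 network: tanh hidden layer, linear output.
w1 = rng.normal(scale=0.5, size=(1, 3))
b1 = np.zeros(3)
w2 = rng.normal(scale=0.5, size=(3, 1))
b2 = np.zeros(1)

def predict(x):
    return np.tanh(x @ w1 + b1) @ w2 + b2

def mse(x, target):
    return float(np.mean((predict(x) - target) ** 2))

initial_mse = mse(x_tr, y_tr)
lr = 0.1                                      # learning rate (assumed value)
for _ in range(2000):
    # Forward activation flow.
    h = np.tanh(x_tr @ w1 + b1)
    out = h @ w2 + b2
    err = out - y_tr
    # Backward error flow: gradients of the MSE with respect to each weight.
    grad_out = 2.0 * err / len(x_tr)
    grad_w2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ w2.T) * (1.0 - h**2)  # tanh'(z) = 1 - tanh(z)^2
    grad_w1 = x_tr.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    # Weight correction step (in place, so predict() sees the update).
    w2 -= lr * grad_w2
    b2 -= lr * grad_b2
    w1 -= lr * grad_w1
    b1 -= lr * grad_b1

final_mse = mse(x_tr, y_tr)
print(f"train MSE {initial_mse:.5f} -> {final_mse:.5f}, test MSE {mse(x_te, y_te):.5f}")
```

The held-out 20% is never used in the weight updates; it only serves to check how the trained network behaves on observations it has not seen.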
Gross domestic product is treated as the dependent variable, while household consumption, government final consumption expenditure, gross fixed capital formation, stock exchanges, export and import of goods and services, and time are treated as independent variables. The dependent and independent variables used in the study are listed below together with their symbols.

Gross Domestic Product: GDP
Time: T
Resident Household Consumption: RHC
Government Final Consumption Expenditure: GFCE
Gross Fixed Capital Formation: GFCF
Stock Exchanges: SE
Export of Goods and Services: GSE
Import of Goods and Services: IGS

Because the values of the independent variables, except the time variable, are unknown for the periods to be forecast, GDP, the dependent variable, has been predicted after each independent variable has been estimated separately as a function of time. In other words, each independent variable has first been treated as a dependent variable and predicted from the time variable. Different numbers of neurons and hidden layers have been tested to find the most appropriate network for predicting each variable. The estimation performance metrics have been evaluated to determine the most appropriate network: the network structure with the smallest forecast error measures is identified as the most suitable one. The most appropriate network structures used to predict all the variables are given in Table 1. However, because the stock exchanges variable shows so many sharp rises and falls, each of its quarters is estimated separately and the results are then combined. The estimation performance metrics MSE (mean squared error), RMSE (root mean squared error) and MAPE (mean absolute percentage error), which are commonly used in the literature, are shown in Formulas 1, 2 and 3 (Zhang and Hu, 1998:500; Cho, 2003:328; De Lurgio, 1998:53).
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{T}} \tag{1}$$

$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100 \tag{2}$$

$$\mathrm{MSE} = \frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{T} \tag{3}$$

Here, $y_t$ = the actual observation values, $\hat{y}_t$ = the estimated values, and $T$ = the number of estimates.

Table 1: The network structures used for estimation of variables

Variable   Input neurons   Hidden neurons   Output neurons   R      MAPE    Iterations (µ)
RHC        1               3                1                0.97   3.73    3
GFCE       1               4                1                0.96   3.94    5
GFCF       1               3                1                0.81   9.32    10
SE1        1               2, 3, 2          1                0.83   70.5    20
SE2        1               3                1                0.86   88.6    20
SE3        1               5                1                0.78   6.6     15
SE4        1               2, 3             1                0.97   19.5    20
GSE        1               2                1                0.94   5.20    15
IGS        1               3                1                0.95   6.6     12
GDP        7               3                1                0.99   2.77    2

RHC through IGS are the independent variables.

The estimation performance metrics obtained for gross domestic product (GDP) are MSE = 0.000042, RMSE = 0.006451 and MAPE = 2.775746%. On the basis of such measurements, Witt and Witt (2000) classified estimation models, describing those whose MAPE values are under 10% as models with "high accuracy" and those whose values are between 10% and 20% as "correct predictions". Similarly, Lewis classified models whose MAPE values are less than 10% as "very good", those between 10% and 20% as "good", those between 20% and 50% as "acceptable" and those over 50% as "false and erroneous" (as cited in Çuhadar and Kayacan, 2005:6).

Figure 2: The optimum network structure to estimate GDP (input layer: RHC, GFCE, GFCF, SE, GSE, IGS, T; hidden layer; output layer: GDP)

In this study, the Matlab 7.9 software package has been used. 'trainlm' was selected as the training function, 'learngdm' as the learning function, 'MSE' as the performance function and 'tansig' as the transfer function. The predicted and actual values are given in Table 2.
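The three error measures and the MAPE bands attributed to Lewis can be written out in plain Python as a quick check. This is a sketch, not the authors' Matlab code: the function names are illustrative, the last band is read as MAPE of 50% or more, and the sample values are the three 2011 quarters for which Table 2 reports both an actual and an estimated figure (in thousand TL). The result therefore describes those three quarters only and is not the MAPE reported for the fitted GDP network.

```python
import math

def mse(actual, estimated):
    """Mean squared error, Formula 3."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)

def rmse(actual, estimated):
    """Root mean squared error, Formula 1."""
    return math.sqrt(mse(actual, estimated))

def mape(actual, estimated):
    """Mean absolute percentage error in percent, Formula 2."""
    total = sum(abs((a - e) / a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)

def lewis_band(mape_percent):
    """Verbal label for a MAPE value, following the bands quoted in the text."""
    if mape_percent < 10:
        return "very good"
    if mape_percent < 20:
        return "good"
    if mape_percent < 50:
        return "acceptable"
    return "false and erroneous"

# 2011-Q1 to 2011-Q3 from Table 2, in thousand TL.
actual = [26_205_423, 27_904_922, 31_028_948]
estimated = [26_070_548, 27_911_332, 28_430_643]

m = mape(actual, estimated)
print(f"MAPE = {m:.2f}% ({lewis_band(m)}), RMSE = {rmse(actual, estimated):,.0f}")
```

MAPE is scale-free, which is why it can be compared across variables of very different magnitude, whereas MSE and RMSE depend on the units (and on any scaling applied before training).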
Table 2: Actual and Estimated Values of GDP
Period | Actual (1.000 TL) | Estimated (1.000 TL)
2011 GDP | 85.139.293 | 109.708.230
2011-Q1 | 26.205.423 | 26.070.548
2011-Q2 | 27.904.922 | 27.911.332
2011-Q3 | 31.028.948 | 28.430.643
2011-Q4 | --------- | 27.295.707
2012 GDP | --------- | 111.233.502
2012-Q1 | --------- | 26.813.588
2012-Q2 | --------- | 28.153.427
2012-Q3 | --------- | 28.500.755
2012-Q4 | --------- | 27.765.733

5. CONCLUSIONS
Gross Domestic Product is an important indicator for all economic units, including companies, investors and households, because it determines their future incomes, the returns on their investments, the cost of capital and so on. Economic units therefore make their decisions and set economic policies depending on the future economic conditions implied by the expected GDP. The question is which methods are more suitable and successful in forecasting GDP. In this paper we applied the Artificial Neural Network method as a prediction model. The results suggest that the forecasting performance for this variable can be improved by using neural network models.

REFERENCES
Cho, V. (2003), “A Comparison of Three Different Approaches to Tourist Arrival Forecasting”, Tourism Management, 24: 323-330.
Çuhadar, M. and Kayacan, C. (2005), “Yapay Sinir Ağları Kullanılarak Konaklama İşletmelerinde Doluluk Oranı Tahmini: Türkiye’deki Konaklama İşletmeleri Üzerine Bir Deneme”, Anatolia: Turizm Araştırmaları Dergisi, 16(1): 1990-2005.
De Lurgio, A. S. (1998), Forecasting Principles and Applications, Irwin McGraw-Hill: Singapore.
Deng, W.-J., Chen, W.-C., Pei, W. (2008), “Back-propagation neural network based importance–performance analysis for determining critical service attributes”, Expert Systems with Applications, 34: 1115–1125.
Ge, L., Cui, B. (2011), “Research on Forecasting GDP Based on Process Neural Network”, 7. International Conference on Natural Computation, IEEE 2011, 821-824.
Guegan, D., Rakotomarolahy, P. (2010), “Alternative Methods for Forecasting GDP”, University of Paris, CES Working Papers, 2010.65.
Hamid, S. A. and Iqbal, Z. (2004), “Using Neural Networks for Forecasting Volatility of S&P 500 Index Futures Prices”, Journal of Business Research, 57: 1116-1125.
Hines, J. W. (1997), MATLAB Supplement to Fuzzy and Neural Approaches in Engineering, John Wiley & Sons, Inc.
Kaltakci, M. Y., Dere, Y., Yapay Sinir Ağları Uygulamalarının İnşaat Mühendisliğinde Kullanımı, Prof. Dr. Rifat Yarar Sempozyumu, Editor: Semih S.
Liliana, Napitupulu, T. A. (2010), “Artificial Neural Network Application in Gross Domestic Product Forecasting - an Indonesian Case”, 2. International Conference on Advances in Computing, Control and Telecommunication Technologies, IEEE 2010, 89-93.
Lippmann, R. (1987), “An Introduction to Computing with Neural Nets”, Vol. 4.
Marcellino, M. (2007), “A Comparison of Time Series Models for Forecasting GDP Growth and Inflation”, http://www.eui.eu/Personal/Marcellino/1.pdf.
Mirbagheri, M. (2010), “Fuzzy Logic and Neural Network Fuzzy Forecasting of Iran GDP Growth”, African Journal of Business Management, Vol. 4, No. 6, 925-929.
Nygren, K. (2004), Stock Prediction: A Neural Network Approach, Master Thesis, Royal Institute of Technology, KTH.
Öztemel, E. (2003), Yapay Sinir Ağları, Papatya Yayıncılık, İstanbul.
Schumacher, C., Breitung, J. (2008), “Real-time Forecasting of German GDP based on Large Factor Model with Monthly and Quarterly Data”, International Journal of Forecasting, Vol. 24, 386-398.
Tkacz, G., Hu, S. (1999), “Forecasting GDP Growth Using Artificial Neural Networks”, Bank of Canada Working Papers, 99-3.
Zhang, G., Hu, M. Y. (1998), “Neural Network Forecasting of the British Pound/US Dollar Exchange Rate”, Omega Int. J. Mgmt. Sci., 26(4): 495-506.
(http://useconomy.about.com/od/grossdomesticproduct/p/GDP.htm).
(http://www.economicsconcepts.com/gdp_as_a_measure_of_welfare.htm).
(www.investopedia.com).

The Importance And The Place Of Ombudsman In Law State
Feyzullah Ünal
Dumlupinar University, Faculty of Economics and Administrative Sciences
E-mail: feyz_unal@mynet.com
Abstract
When the ombudsman is analyzed with respect to its historical roots, it is understood that this institution was inspired by the Islamic state system and the Ottoman state system. Today the ombudsman institution has been implemented in more than 100 countries. It has taken on the mission of protecting citizens against maladministration and securing fundamental rights and liberties, and it provides security for both the governing and the governed. This study argues that fundamental rights and freedoms should be secured, that all activities of the government should be subject to judicial control, and that the significance of this institution in realizing legal governance should be recognized.
Keywords: Ombudsman, law state, fundamental rights and freedoms, justice, control and judicial control.

Dublin Core
The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see http://dublincore.org/documents/dces/.
Extent (the size or duration of the resource): 1129
Title (a name given to the resource): Using Artificial Neural Networks To Forecast Gdp For Turkey
Author: Karaatli, Meltem
Abstract (a summary of the resource): An Artificial Neural Network (ANN) is a system that resembles biological neural systems and takes the working principles of the human brain as its basis. ANNs can be applied in various fields for forecasting, classification, optimization, data binding and so on. ANNs have been used frequently in financial applications in recent years. In this study, an ANN is used to forecast the Gross Domestic Product of Turkey.
Gross Domestic Product (GDP) refers to the market value of all final goods and services produced within a country in a given period. GDP can be thought of as the size of an economy, and it is the foremost measure of a country's macroeconomic performance, economic health and standard of living. Therefore, expectations about future GDP can be the primary determinant of investment, employment, wages, profits and even stock market activity. Given this economic significance, the purpose of this study is to forecast Gross Domestic Product (GDP) for Turkey and to test the ability of the ANN method to forecast GDP. Keywords: Importance of Gross Domestic Product, Forecasting, Artificial Neural Networks.
Date (a point or period of time associated with an event in the lifecycle of the resource): 2012-05-31
Keywords: Conference or Workshop Item; PeerReviewed; H Social Sciences (General)

Dublin Core
Extent: 904
Title: Using Current out-of-class Materials in Teaching Reading Comprehension
Author: Asgari, Majid
Abstract: Studies on integrating out-of-class materials with class materials mostly show the crucial role of this task for teachers and its benefits for students. This study investigates the effect of integrating current issues of interest into class materials on students' reading comprehension. The following question is proposed: is relating current issues of interest to class materials useful for students' reading comprehension? A true and a null hypothesis are given. The true hypothesis is that integrating current issues of interest with class materials in teaching reading has a positive effect on reading comprehension.
The study is performed at Islamic Azad University in Hidaj with 60 participants, male and female, who are majoring in mechanical and electrical engineering. The subjects are randomly divided into two groups of 30 students each. One group is used as the experimental group (G1) and the other as the control group (G2). The subjects are taught for two weeks and then take an achievement test. After the test results are analyzed and the mean scores are compared using a t-test, the null hypothesis will be tested to show whether integrating current issues of interest with class materials improves the reading comprehension of students in English classes at university.
Date: 2012-05
Keywords: Conference or Workshop Item; PeerReviewed; P Philology. Linguistics

https://omeka.ibu.edu.ba/files/original/37e978cf4ff09117c1045deec0a58515.pdf f756b0d603e556316fcddca5d288d4cd

USING DATABASE AUDIT FOR ANALYZING ON HISTORICAL DATA
Adnan Hodžić
International Burch University, Bosnia and Herzegovina
adnan.hodzic@ibu.edu.ba
Adem Karadag
Turkey
nuhadem@gmail.com
Abstract: Database auditing is one of the biggest issues in data security. Without data auditing, business applications lose track of business processes. To support auditing, and to track operations and the actors of those operations over time, we need historical data or a temporal database. Valid time and transaction time are two important timestamps in a temporal database. In this paper, we show methods for handling database auditing of business transaction operations, their accurate times, and the performers of the operations. These strategies are separated into two sets: those utilizing relational databases and those utilizing semi-structured information.
Keywords: Database Audit, Historical Data
Introduction
It is crucial that a company, no matter how big it is, maintains the security of its information. Because valuable data such as customers' credit card data, designs and even source code are frequently stolen, data should be protected at all times. Keeping data safe means protecting its confidentiality, integrity, and availability. To ensure data security, there should be a security plan. Authentication and administration can provide security up to a point (Mullins & Craig, 2002). However, there is also a need to keep log files and to check them separately from the database. Database auditing was therefore introduced to maintain and inspect the audit trail. Database servers help to create a database audit policy that keeps the database safe, so that user entries can be controlled. This paper presents techniques for database auditing that depend on historical data. The paper is divided into 4 parts. Parts 2 and 3 review the literature on historical data and auditing. Several ways of auditing a database are then outlined. We used a relational database (Grad, 2013) to present the row-based, column-based and log-file auditing strategies.
Database Auditing
Database auditing involves inspecting a database to control and view the actions of database users. In this way, the auditor can see manipulations, corruptions or glitches in the data. Database audit also refers to a professional database auditing solution that makes it possible to track and inspect any database activity, including access, logins, security breaches, user activities, and the insertion, deletion and modification of data. Recently, a framework has been introduced to provide accurate data auditing with respect to data retention strategies (Lu & Miklau, 2009). Under retention restrictions, a formula is applied to audit data in the protected history. In this way database audit would be more accurate.
ICESoS 2016 - Proceedings Book
International Conference on Economic and Social Studies (ICESoS’16)
It is important to detect changes that deviate from the standard. To distinguish normal behavior on the data and obtain better audit results, data mining techniques are generally applied. This method can only detect the static actions of the user. This disadvantage can be addressed by tracking all activities of the user in a data audit system. As a result, an anomaly detection method was introduced to model the normal behavior of the user (Park & Lee, 2008). In this way normal behavior can be easily differentiated from suspicious behavior. To teach database security and auditing and give students a better understanding of it, hands-on lab studies have been set up (Luebbers, Grimmer, & Jarke, 2003). In these studies, various database scenarios are used to put theories of database protection into practice.
Historical Data
Historical data is the information outlining activity, conditions and trends in a company’s past database. Historical data is often archived, and may be held in non-volatile, secondary storage. Historical data can be useful in helping to predict the future of a company and a market, as when conducting predictive analyses.

Table 1: Operational Student Table Referenced By Student-History Table For Row-Based Auditing
Student Number | Name | Birth | Address | Registration Date | Fee
445 | Zeynep | 10.10.1988 | Ankara | 15.01.2008 | 2400
822 | Mahmut | 12.09.1990 | Istanbul | 01.09.2010 | 2600
544 | Ayşe | 15.05.1991 | Istanbul | 01.09.2011 | 2600

It is very important to detect who made changes such as the insertion of new data, data manipulation or deletion in the database. In this way, a good data audit can be obtained. The time and the user are the key elements in analyzing a modification of data. The question of when an action happened can be answered by valid and transaction times. One study notes that valid and transaction times should ensure no data loss.
(Bhargava & Gadia, 1993)
Arranging Historical Data For Auditing On Relational Database
There are several ways to organize historical data in a relational database (Margaret Rouse, 2015), such as separate tables for recording past data and transaction log files. Arranging a separate table for each relational database table is an easy way to track the changes to each item. With both strategies there is no change to the original data tables. We present 3 ways to supply historical data for database auditing: auditing on a row level, auditing on a column level, and auditing with a log table.
Regional Economic Development: Entrepreneurship and Innovation
Database Audit on a Row Level
Our original relational tables stay the same, but we create a separate table for each table in order to apply data audit. The operational “Student” table shown in Table 1 supplies the current data of each student for operations. There are 2 kinds of data in this table: static and operational data. Static data, such as Student Number, Registration Date or Name, stays the same or rarely changes. Historical or operational data, such as the address of the student, can be updated continuously. The static query that is always used to retrieve the data from “Student” stays the same. Table 2 is an auditing table that includes all the students’ data in the operational table. Two timestamps are needed for the valid time: we need to know the beginning and the ending time to follow the life cycle of the data. Besides the valid time, we need the operation type, to reduce the complexity of comparing the histories of the same data, and the user, to hold that user accountable for the action. The history of the “Student” table is shown in Table 2. It can be seen from the history table that Ali Oz has been a student since 01.09.2005. The user Mustafa updated his fee 2 times by increasing by $100 each and updated address by changing it from Istanbul to Adana.
Ali has finished the school and his record was deleted from the Student table by Semih. Ahmet moved from Hatay to Ankara on 23.09.2008 and his record was terminated in January 2009 by Mustafa. Zeynep’s fee was increased by $100 by Mustafa. Finally, Semih added two new students, Mahmut and Ayşe, to the Student table.

Table 2: Operational Student Table Referenced By Student-History Table For Row-Based Auditing
Student Number | Name | Birth | Address | Regist. Date | Fee | Begin | End | Op | User
966 | Ali | 21.04.1986 | Istanbul | 01.09.2005 | 2300 | 01.09.2005 | 01.09.2007 | I | Mustafa
966 | Ali | 21.04.1987 | Adana | 01.09.2005 | 2300 | 01.09.2007 | | U | Mustafa
966 | Ali | 21.04.1988 | Adana | 01.09.2005 | 2450 | 01.09.2007 | 23.06.2008 | D | Semih
855 | Ahmet | 11.05.1986 | Hatay | 01.09.2006 | 2350 | 21.09.2007 | 01.09.2008 | I | Semih
855 | Ahmet | 11.05.1986 | Ankara | 01.09.2006 | 2350 | 23.09.2008 | 15.01.2009 | D | Mustafa
445 | Zeynep | 10.10.1988 | Ankara | 15.01.2008 | 2300 | 15.01.2008 | 15.06.2010 | I | Mustafa
445 | Zeynep | 10.10.1988 | Ankara | 15.01.2008 | 2400 | 15.01.2008 | | U | Mustafa
822 | Mahmut | 12.09.1990 | Istanbul | 01.09.2010 | 2600 | 01.09.2010 | | I | Semih
544 | Ayşe | 15.05.1991 | Istanbul | 01.09.2011 | 2600 | 01.09.2011 | | I | Semih

Operational table and audit table records have the same structure. Data is repeated in different rows, but this is kept for the sake of historical queries. Database audit on a row level has both advantages and drawbacks. The advantage is that auditing is easy to apply: when the user inserts, updates or deletes something in the operational table, the program can simply copy all the values in the record into the historical table. In addition, the End column should be updated with the operation. This can be done by the database itself, as in Yang (2009). The drawback is that redundancy makes the system complicated. In addition, retrieving historical data requires a comparison between the operational table and the auditing table by means of a nested query.
In the queries below, identifiers that contain a space or clash with SQL keywords are written in double quotes.
    -- Fee history of student 966 from the row-based history table
    SELECT S1.fee, S3.MINS, S3.MAXS, S1."USER", S1.OPERATION
    FROM Student_HISTORY_R S1,
         (SELECT S2.fee, MIN(S2."begin") AS MINS, MAX(S2."end") AS MAXS
          FROM Student_HISTORY_R S2
          WHERE S2."Student Number" = 966
          GROUP BY S2.fee) S3
    WHERE S1.fee = S3.fee
      AND S1."Student Number" = 966  -- keep other students with the same fee out of the result
Database Audit on Column Level
Column-level audit does not include the redundant data seen in the row-level audit. The historical table does not contain static data such as birth date and registration date. Apart from the primary key (the student number), the auditing table holds only the changed data. The primary key is required to relate the saved data to the record in the operational table. The student history in Table 3 keeps only the changed data and is less redundant than Table 2. Student number 966, Ali, moved from Istanbul to Adana on 01.09.2007 and had his fee raised from 2300 to 2450 on 01.09.2007. Selecting the non-null values of a particular auditing column in a SELECT statement displays only the actual changes. For example,
    -- Fee history of student 966 from the column-based history table
    SELECT fee, "begin", "end", "USER", OPERATION
    FROM Student_HISTORY_C
    WHERE "Student Number" = 966
      AND fee IS NOT NULL
The query displays the audit of Ali’s fee. Compared with the same query under row-based auditing, the SELECT statement is much less complex. Each record in a column-based auditing table cannot contain more than one historical value, because the end time of each audited value is uncertain.
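The column-level idea can be tried out with Python's built-in `sqlite3` module. The following is a minimal sketch under several assumptions rather than the authors' implementation: the helper names (`open_db`, `insert_student`, `update_student`, `fee_history`) are hypothetical, the columns are renamed to `begin_date`, `end_date`, `op` and `user_name` to avoid SQL keywords, dates are written in ISO form, and the sample values are only loosely modelled on student 966.

```python
import sqlite3

AUDITED_COLUMNS = ("address", "fee")  # assumption: the audited attributes


def open_db():
    """Create an in-memory operational table and a column-based history table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (
            student_number INTEGER PRIMARY KEY,
            name TEXT, address TEXT, fee INTEGER
        );
        CREATE TABLE student_history_c (
            student_number INTEGER, address TEXT, fee INTEGER,
            begin_date TEXT, end_date TEXT, op TEXT, user_name TEXT
        );
    """)
    return conn


def _log(conn, student_number, column, value, when, op, user_name):
    # Close the open history record of this column, then add the new value.
    # Each history row carries exactly one audited value; the other stays NULL.
    conn.execute(
        f"UPDATE student_history_c SET end_date = ? "
        f"WHERE student_number = ? AND {column} IS NOT NULL AND end_date IS NULL",
        (when, student_number))
    conn.execute(
        f"INSERT INTO student_history_c "
        f"(student_number, {column}, begin_date, op, user_name) VALUES (?, ?, ?, ?, ?)",
        (student_number, value, when, op, user_name))


def insert_student(conn, student_number, name, address, fee, when, user_name):
    conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)",
                 (student_number, name, address, fee))
    for column, value in (("address", address), ("fee", fee)):
        _log(conn, student_number, column, value, when, "I", user_name)


def update_student(conn, student_number, column, value, when, user_name):
    if column not in AUDITED_COLUMNS:  # whitelist guards the f-string SQL
        raise ValueError(f"not an audited column: {column}")
    conn.execute(f"UPDATE student SET {column} = ? WHERE student_number = ?",
                 (value, student_number))
    _log(conn, student_number, column, value, when, "U", user_name)


def fee_history(conn, student_number):
    """Same shape as the column-level SELECT: only rows where fee IS NOT NULL."""
    return conn.execute(
        "SELECT fee, begin_date, end_date, op, user_name FROM student_history_c "
        "WHERE student_number = ? AND fee IS NOT NULL ORDER BY rowid",
        (student_number,)).fetchall()


conn = open_db()
insert_student(conn, 966, "Ali", "Istanbul", 2300, "2005-09-01", "Mustafa")
update_student(conn, 966, "address", "Adana", "2007-09-01", "Mustafa")
update_student(conn, 966, "fee", 2450, "2007-09-01", "Mustafa")
for row in fee_history(conn, 966):
    print(row)
```

Writing one history row per audited value is what lets each row carry its own end date, which is the constraint the paragraph above describes.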
Table 3: Student_History_C Table Using Column-Based Auditing
Student Number | Address | Fee | Begin | End | Operation | User
966 | Istanbul | | 01.09.2005 | 01.09.2007 | I | Mustafa
966 | Adana | | 01.09.2007 | | U | Mustafa
966 | | 2450 | 01.09.2007 | 23.06.2008 | D | Semih
855 | Hatay | | 21.09.2007 | 01.09.2008 | I | Semih
855 | Ankara | | 23.09.2008 | 15.01.2009 | D | Mustafa
445 | Ankara | 2300 | 15.01.2008 | 15.06.2010 | I | Mustafa
445 | | 2400 | 15.01.2008 | | U | Mustafa
822 | Istanbul | 2600 | 01.09.2010 | | I | Semih
544 | Istanbul | 2600 | 01.09.2011 | | I | Semih

Because it is less complicated, column-level audit is faster. It also uses less disk space. However, the many NULL values can cause other issues when writing queries.
Auditing on Log Table
A log table that tracks changes to a system is also referred to as an audit, because it gives information such as the user, the data and the time of execution that can be used to audit a system. Relational Database Management Systems (RDBMSs) such as DB2 (IBM Knowledge Center, 2015), SQL Server (Stankovic, 2016) and Oracle (Stackowiak, Bales, & Greenwald, 2004) offer an audit option that helps database administrators maintain an audit trail (Logging, Auditing, and Monitoring the Directory) and save it in a log file. However, log tables do not keep the end time. There are two ways to address this.
Column Based Log Audit Tables for Operation Logs
We need to isolate the auditing log data from the operational data. To do this, we create an additional table for each auditing column. For instance, if the ADDRESS and FEE columns in the STUDENT table are auditing columns, we create ADDRESS and FEE tables for auditing purposes, as shown in Table 4 and Table 5. The advantage of this approach is that it decreases the amount of auditing data and makes the tables easier to analyze. However, the number of independent tables may increase.
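The one-table-per-audited-column layout can be sketched with `sqlite3` as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the table names `address_audit` and `fee_audit`, the helper names, the column names and the ISO dates are all hypothetical choices made for the example.

```python
import sqlite3

# Assumption: the audited columns of the STUDENT table and their SQL types.
AUDITED = {"address": "TEXT", "fee": "INTEGER"}


def create_audit_tables(conn):
    """Create one audit log table per audited column (address_audit, fee_audit)."""
    for column, sql_type in AUDITED.items():
        conn.execute(f"""
            CREATE TABLE {column}_audit (
                pk INTEGER PRIMARY KEY AUTOINCREMENT,
                student_number INTEGER NOT NULL,
                {column} {sql_type},
                begin_date TEXT NOT NULL,
                end_date TEXT,
                op TEXT NOT NULL,
                user_name TEXT NOT NULL
            )""")


def log_change(conn, column, student_number, value, when, op, user_name):
    """Close the student's open record in the column's table and add a new one."""
    if column not in AUDITED:  # whitelist guards the f-string SQL below
        raise ValueError(f"not an audited column: {column}")
    table = f"{column}_audit"
    conn.execute(
        f"UPDATE {table} SET end_date = ? "
        f"WHERE student_number = ? AND end_date IS NULL",
        (when, student_number))
    conn.execute(
        f"INSERT INTO {table} (student_number, {column}, begin_date, op, user_name) "
        f"VALUES (?, ?, ?, ?, ?)",
        (student_number, value, when, op, user_name))


conn = sqlite3.connect(":memory:")
create_audit_tables(conn)
log_change(conn, "fee", 445, 2300, "2008-01-15", "I", "Mustafa")
log_change(conn, "fee", 445, 2400, "2010-06-15", "U", "Mustafa")

fee_rows = conn.execute(
    "SELECT student_number, fee, begin_date, end_date, op "
    "FROM fee_audit ORDER BY pk").fetchall()
address_count = conn.execute("SELECT COUNT(*) FROM address_audit").fetchone()[0]
print(fee_rows, address_count)
```

A fee change touches only `fee_audit`, which keeps each audit table small; the cost, as noted above, is that every additional audited column adds another table to maintain.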
Table 4: Audit Log Table For Address
PK | Student Number | Address | Begin | End | Op | User
1 | 966 | Istanbul | 01.09.2005 | 01.09.2007 | I | Mustafa
2 | 966 | Adana | 01.09.2007 | | U | Mustafa
3 | 855 | Hatay | 21.09.2007 | 01.09.2008 | I | Semih
4 | 855 | | 23.09.2008 | 15.01.2009 | D | Mustafa
5 | 822 | Istanbul | 01.09.2010 | | I | Semih
6 | 544 | Istanbul | 01.09.2011 | | I | Semih

Table 5: Audit Log Table For Fee
PK | Student Number | Fee | Begin | End | Op | User
1 | 966 | 2300 | 01.09.2007 | | U | Mustafa
2 | 966 | 2450 | 01.09.2007 | 23.06.2008 | D | Semih
3 | 445 | 2300 | 15.01.2008 | 15.06.2010 | I | Mustafa
4 | 445 | 2400 | 15.01.2008 | | U | Mustafa
5 | 822 | 2600 | 01.09.2010 | | I | Semih
6 | 544 | 2600 | 01.09.2011 | | I | Semih

One Log Audit Table for Operation Logs
To gather the audit data in one place, we combine each auditing column from all operational tables into one single audit log table. The audit log table consists of the name of the table and column, the ID of the record in the operational table (here the Student Number), the changed value, the begin time, the operation that caused the change and the name of the user responsible for it. An example of a single audit log table for a database containing Student and Faculty tables is shown in Table 6. All changes made to the tables are recorded in the single audit log table. A single insertion of Student number 966 into the Student table produces two insertions into the audit log table if the Student table has two auditing columns: one log record for Address and another for Fee. Updating an auditing attribute inserts one auditing record into the log table. Deletion behaves like insertion: with two auditing columns, the deletion of a record is logged twice in the audit log table, for example the deletion of Student 966 in Table 6.
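The single audit log table, with one log record per audited column for each operation, can be sketched as follows. This is a hedged illustration only: `audit_log`, `log_operation` and the column names are hypothetical, values are stored as text because a single column has to hold values of different types (an assumption of this sketch), and the sample rows are loosely modelled on Table 6.

```python
import sqlite3


def open_log():
    """Create an in-memory database holding one audit log table for all tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE audit_log (
            pk INTEGER PRIMARY KEY AUTOINCREMENT,
            record_id INTEGER NOT NULL,   -- key of the row in the operational table
            table_name TEXT NOT NULL,
            column_name TEXT NOT NULL,
            value TEXT,                   -- stored as text: columns differ in type
            begin_date TEXT NOT NULL,
            op TEXT NOT NULL,             -- I, U or D
            user_name TEXT NOT NULL
        )""")
    return conn


def log_operation(conn, table_name, record_id, audited_values, when, op, user_name):
    """Write one log record per audited column and return how many were written."""
    rows = [
        (record_id, table_name, column,
         None if value is None else str(value), when, op, user_name)
        for column, value in audited_values.items()
    ]
    conn.executemany(
        "INSERT INTO audit_log "
        "(record_id, table_name, column_name, value, begin_date, op, user_name) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return len(rows)


conn = open_log()
# Inserting a student with two audited columns yields two log records.
n_insert = log_operation(conn, "Student", 966, {"address": "Adana", "fee": 2450},
                         "2007-09-01", "I", "Mustafa")
# Deleting the same student is logged twice as well.
n_delete = log_operation(conn, "Student", 966, {"address": "Adana", "fee": 2450},
                         "2008-06-23", "D", "Mustafa")
# A table with a single audited column yields a single record.
n_faculty = log_operation(conn, "Faculty", 221, {"manager": 108},
                          "2012-01-01", "I", "Semih")

fee_trail = conn.execute(
    "SELECT value, begin_date, op FROM audit_log "
    "WHERE table_name = 'Student' AND column_name = 'fee' AND record_id = 966 "
    "ORDER BY pk").fetchall()
print(n_insert, n_delete, n_faculty, fee_trail)
```

Filtering on `table_name`, `column_name` and `record_id` recovers the trail of a single attribute, at the cost of scanning a log that grows with every audited column of every table.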
Table 6: One Audit Log Table For Every Table; Student And Faculty In Database
PK | Student Number | Table | Column | Value | Begin | End | Op | User
1 | 966 | Student | Address | Adana | 01.09.2007 | | I | Mustafa
2 | 966 | Student | Fee | 2450 | 23.06.2008 | | D | Mustafa
4 | 855 | Student | Address | Hatay | 21.09.2007 | 01.09.2008 | I | Semih
5 | 855 | Student | Fee | 2350 | 23.09.2008 | 15.01.2009 | D | Semih
6 | 445 | Student | Address | Ankara | 15.01.2008 | 15.06.2010 | I | Mustafa
7 | 445 | Student | Fee | 2300 | 15.01.2008 | | I | Mustafa
8 | 445 | Student | Fee | 2400 | 01.09.2010 | | U | Semih
9 | 822 | Student | Address | Istanbul | 01.09.2010 | | I | Semih
10 | 822 | Student | Fee | 2600 | 01.09.2010 | | I | Mustafa
11 | 544 | Student | Address | Istanbul | 01.09.2011 | | I | Mustafa
12 | 544 | Student | Fee | 2600 | 01.09.2011 | | I | Semih
13 | 221 | Faculty | Manager | 108 | 01.01.2012 | | I | Semih
14 | 103 | Faculty | Manager | 120 | 21.06.2013 | | U | Mustafa
[The source table also shows an End value of 13.04.2011 whose row cannot be determined from the extracted text.]

The audit log table becomes large if there are many auditing columns from various tables. Splitting the data by column, or keeping a single audit log table for each subsystem, is recommended. Both approaches require additional processing for each operation on the database, in particular on the auditing data. Database engines already maintain their own log tables, so this additional processing slows the overall system down.
Conclusion
Operational tables and auditing tables should be kept apart from each other. In this way the database engine can run an auditing query much faster than against a table that includes both operational and auditing data, where the overhead of checking which partition the query should use is added to the execution time. The database administrator can also manage the database management system more easily. There are many options for auditing a database. Some solutions are appropriate for relational databases. On the other hand, marketing databases mostly use semi-structured databases.
Database auditing is one of the crucial issues for a company, not only for its security-related concerns but also for performance and reliability. Monitoring and recording selected user database actions shapes the future of the company’s business. Overall, the security and reliability of data can be sustained by a good database auditing method.
References
• Bhargava, G., & Gadia, S. K. (1993). Relational Database Systems with Zero Information Loss. 5 (1), 76-87.
• Grad, B. (2013). Relational Database Management Systems: The Business Explosion. IEEE Annals of the History of Computing, 35 (2), 8-9.
• IBM Knowledge Center. (2015, January 15). Retrieved April 30, 2016, from ibm.com: http://www.ibm.com/support/knowledgecenter/#!/SSEPGG_8.2.0/welcome.html
• Lu, W., & Miklau, G. (2009). Auditing a Database Under Retention Restrictions. IEEE Inter. Conf. on Data Eng. (pp. 42-53). ICDE.
• Luebbers, D., Grimmer, U., & Jarke, M. (2003). Systematic Development of Data Mining-Based Data Quality Tools. Proc. of the 29th VLDB Conference, (pp. 548-559). Berlin.
• Margaret Rouse. (2015). Relational database management systems (RDBMS). Retrieved April 5, 2016, from TechTarget: http://searchsqlserver.techtarget.com/definition/relational-database-management-system
• Mullins, & Craig. (2002). Database administration: the complete guide to practices and procedures. Addison-Wesley.
• Park, N. H., & Lee, W. S. (2008). Anomaly Detection over Clustering Multidimensional Transactional Audit Streams. IEEE International Workshop on Semantic Computing and Applications (pp. 78-80). IWSCE.
• Stackowiak, R., Bales, D., & Greenwald, R. (2004, August 26). Oracle Docs. Retrieved April 30, 2016, from docs.oracle.com: http://download.oracle.com/docs/cd/B14099_19/idmanage.1012/b14082/logging.htm#i126963
• Stankovic, I. (2016, April 5). SQL Server Audit (Database Engine).
Retrieved April 30, 2016, from msdn.microsoft.com: https://msdn.microsoft.com/en-us/en%20us/library/cc280386.aspx
• Yang, L. (2009). Teaching Database Security and Auditing. SIGCSE, (pp. 241-245).

Dublin Core
Extent: 3307
Title: USING DATABASE AUDIT FOR ANALYZING ON HISTORICAL DATA
Author: Hodzic, Adnan; Karadag, Adem
Date: 2016
Keywords: Conference or Workshop Item; PeerReviewed; H Social Sciences (General)

https://omeka.ibu.edu.ba/files/original/41cddaabb1e237ad0c086dbca13071d0.pdf 369ba7ca22ee871fc9b74adde3cf1d69

Journal of Natural Sciences and Engineering, Vol. 3, (2020)
DOI number: 12.34567/JONSAE2020123

Using Exploratory Data Analysis and Big Data Analytics for Detecting Anomalies in Cloud Computing
Ibrahim Muzaferija¹, Zerina Mašetić¹
¹ International Burch University, Sarajevo, Bosnia and Herzegovina
ibrahim.muzaferija@stu.ibu.edu.ba
zerina.masetic@ibu.edu.ba

Abstract – While leveraging cloud computing for large-scale distributed applications allows seamless scaling, many companies struggle to keep up with the amount of data generated in terms of efficient processing and anomaly detection, which is a necessary part of the management of modern applications. As the record of user behavior, weblogs have naturally become a subject of research on anomaly detection. Many anomaly detection methods based on automated log analysis have been proposed, but not in the context of big data applications, where anomalous behavior needs to be detected in the understanding phases, prior to modeling a system for such use. Big Data Analytics often ignores anomalous points because of the high volume of data. To address this problem, we propose a complementary methodology for Big Data Analytics, Exploratory Data Analysis, which helps to gain insight into data relationships without classical hypothesis modeling. In that way, we can gain a better understanding of the patterns and spot anomalies. Results show that Exploratory Data Analysis facilitates anomaly detection and the CRISP-DM Business Understanding phase, making it one of the key steps in the Data Understanding phase.
Keywords - Cloud Computing, Big Data, Data Mining, Anomaly Detection

1. Introduction
With the constant growth and advancement of the Internet, more and more systems are connected to other connected systems, constantly generating and exchanging data. That data is referred to as Big Data and is constantly targeted by cyber-attacks, as it contains sensitive and valuable information.
The term “big data” refers to data that is so large, complex, or rapid that it cannot be processed using traditional computing and data management tools. Big Data provides opportunities to improve research, operational efficiency, and decision-support applications with increased value for digital applications [1]. At the same time, Big Data poses challenges in storing, transporting, processing, mining, and serving the data. Data that is high in volume, velocity, variety, and veracity must be processed with advanced analytical tools and algorithms to reveal meaningful information and provide value. Cloud computing represents the use of distributed and shared resources such as computing, storage, networking, and analytical software, and provides fundamental support to address the challenges of Big Data. Cloud computing serves both as a technological enabler and as a producer of big data [1]. Anomalies represent unusual behaviors that deviate from the normal. In efforts to increase cloud computing reliability, anomaly detection poses a frequent problem in threat detection and identification, as reported by the Cloud Security Alliance (CSA) [2], the world’s leading organization dedicated to securing cloud computing environments, which conducts annual research with the aim of raising awareness of threats, risks, and vulnerabilities in the cloud environment. In their latest (2019) report [3], CSA re-examined the risks to cloud security and took a new approach, examining problems in configuration and authentication rather than keeping the traditional focus on vulnerabilities and malware, and highlighted the following threats:
1. Data Breaches
2. Misconfiguration and inadequate change control
3. Lack of cloud security architecture and strategy
4. Insufficient identity, credential, access, and key management
5. Account hijacking
6. Insider threat
7. Insecure interfaces and APIs
8. Weak control plane
9. Metastructure and applistructure failures
10. Limited cloud usage visibility
11. Abuse and nefarious use of cloud services
In this research, we aim to address the threats that can be traced in user logs (numbered 1, 4, 5, 6, 8, 9 and 11) by utilizing Big Data Analytics and Exploratory Data Analysis in order to discover anomalies and contribute to increased security in Cloud Computing applications.

2. Literature Review
Anomaly detection in cloud infrastructure and big data environments has been the topic of many research studies in the literature. Since the first introduction of cloud infrastructure in 2006 [4], cloud computing has greatly impacted industry. The rapid development of Internet and Big Data technologies has resulted in increased development of services on cloud computing, such as online banking services, electronic news services, government information systems, mobile services, etc. These systems handle sensitive and confidential data, making anomaly detection mechanisms one of their core security requirements. The review paper by Arif Sari [4], [5] describes different techniques and mechanisms used to detect anomalous activities within the cloud environment: threshold detection, statistical analysis, rule-based measures, data mining, and machine learning. We aim to apply statistical techniques and EDA (Exploratory Data Analysis) in order to discover anomalies. In the “Big Data processing for Anomaly Detection” survey [6], Ariyaluran et al. present a comparative analysis of three different domains, anomaly detection, machine-learning algorithms, and real-time big data processing, and the relationships between them. This paper aims to contribute complementary techniques for anomaly detection.
Once anomalies are detected, we can utilize Machine Learning and real-time anomaly detection for future improvements. In their research, Dalal and Rele [6], [7] emphasize the steps in creating effective and reliable mechanisms for threat detection. They highlight the importance of the first CRISP-DM (Cross Industry Standardized Process for Data Mining) phase named “Develop Business Understanding”, where reasons for defects and answers for maintenance are taken into consideration. They discuss the phase “Analyze Data and Data Dependencies”, where the aim is to analyze, combine, and compare the data with the present situation, without proposing EDA as a baseline for data understanding. Our work aims to employ EDA in order to complement the methodology. They also highlight the step named “Engage with Subject Matter Experts (SMEs)” for better dataset examination and analysis of the anomaly situation, along with a grouping of the threat factors. By employing these methods, we aim to set transparent expectations and bring clarity to our results. In further research, we work closely with the application development technical lead, who serves as SME and helps clarify the log data, as well as threats, anomalies, and our results.

3. Methodology

The research is implemented using a portion of the CRISP-DM (Cross Industry Standardized Process for Data Mining) methodology [8], which represents the common standards used by data scientists and data mining experts in order to build analytical and machine learning models. Prior to analytical and machine learning model creation, we need to construct a clean dataset of user behavior with anomalies labeled for future modeling. To do so, in this research we focus on the first three phases: Business Understanding, Data Understanding, and Data Preparation, as highlighted in red in the figure below. Modeling and subsequent phases are researched in our extended study of anomaly detection in cloud computing.
Figure 1. CRISP-DM workflow

In the Business Understanding phase, the goal is to determine business objectives, assess the situation from a business perspective, discuss with subject matter experts, determine data mining goals, and produce a project plan. In the Data Understanding phase, we collect and select raw data, describe and explore the data, consult with subject matter experts, and verify data quality. In the Data Preparation phase, which is often the most time-consuming phase, we select and clean the data, format data, and construct a clean dataset. We approach the mentioned phases using Big Data Analytics and Exploratory Data Analysis (EDA). Big Data Analytics examines large amounts of data in a non-traditional manner, that is, using distributed and shared resources to support the data quantity and complexity [8], [9]. Exploratory Data Analysis [10] is an approach to analyzing data in order to summarize its main characteristics and uncover the underlying structure using statistical and visual methods.

3.1. Data Collection and Selection

Cloud-based enterprise web application logs are produced by multiple servers and services and are streamed to the Elasticsearch [11] service, an open-source search and analytics engine for all types of data. Elasticsearch is distributed, fast, and scalable, which makes it an ideal environment for big data ingestion, enrichment, storage, analysis, and visualization.

Figure 2. Raw data access from Kibana

Raw data is accessed by locally restoring the Elasticsearch cluster snapshot taken for a period of three months. The cluster contains around 20 GB of semi-structured data collected from different application services and levels, indexed by a timestamp.
Application logs are mapped to 175 attributes and accessed using Kibana [12], the Elastic Stack service for data analysis and visualization. Attribute selection is a part of the “Business understanding” and “Data understanding” phases, implemented together in consultations with the application development technical lead, i.e., the subject matter expert (which we’ll refer to as SME). The attributes describing the user’s application usage that were the most relevant for anomaly detection are selected for further analysis. The following table displays statistical information for selected attributes.

Table 1. Selected data statistical information

Attribute name | Description | Data type | Range | Missing
timestamp | Timestamp | Date Time | [2020-01-05 21:17, 2020-03-26 21:06] | 0.0 %
account_id | Account ID, unique company account identifier | Nominal | f6afd09c-****-****-****-c30a935ccc37, ... | 8.87 %
client_country | User country | Nominal | BA, US, ... | 9.53 %
company_name | Company Name | Nominal | Company A, Company B, ... | 10.17 %
platform | Application platform | Nominal | BrowserMNC, BackendMNC, ... | 0.0 %
principal_id | User email | Nominal | developer@**.com, ... | 9.64 %
remote_address | User IP address | Nominal | [0.0.0.0 - 255.255.255.255] | 9.12 %
user_agent | User-agent | Nominal | Mozilla/5.0 (Windows NT 10.0; Win64; x64) …, ... | 0.0 %
error_message | Error message | Nominal | validation error, auth error, ... | 99.96 %
message | Log message | Nominal | Profiling, FrontTimings, ... | 0.18 %
level | Log level | Nominal | Info, error | 0.0 %
path | Parameterized resource request | Nominal | PUT /customer/***/ticket/***, ... | 99.78 %
resource | Request | Nominal | (GET) /invoices, ... | 0.0 %
status_code | Response code | Nominal | 200, 404, ... | 10.17 %

Once the relevant data is selected, we utilize the Elastic Stack service named Logstash [13] for collecting the data, that is, obtaining the initial dataset in CSV format for further work.
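The “Missing” column of Table 1 could be reproduced from the exported CSV along the following lines. This is only a minimal sketch: the three-row sample, the function name `missing_percentages`, and the convention that an empty cell means a missing value are assumptions made for illustration, not details taken from the study.

```python
import csv
import io

# Hypothetical three-row excerpt of the exported CSV; the real export has
# far more rows and columns. Empty cells are assumed to mean "missing".
SAMPLE_CSV = """timestamp,account_id,status_code,error_message
2020-01-06 06:18,acc-1,200,
2020-01-06 06:19,,404,validation error
2020-01-06 06:20,acc-2,,
"""


def missing_percentages(csv_text):
    """Return {attribute: percentage of rows in which the value is empty}."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    result = {}
    for name in rows[0].keys():
        missing = sum(1 for row in rows if not (row[name] or "").strip())
        result[name] = round(100.0 * missing / len(rows), 2)
    return result


percentages = missing_percentages(SAMPLE_CSV)
for name, pct in percentages.items():
    print(f"{name}: {pct} %")
```

On a full export, the same per-attribute percentages would indicate which attributes are too sparse to be informative.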
3.2. Data Cleansing and Engineering

In order to get an insight into data quality, graphical and statistical methods were used to detect anomalies, faults, outliers, missing values, etc. Moreover, we engineer new attributes in order to increase the interpretability or decrease data complexity. Exploratory Data Analysis assists in understanding the relations between attributes and allows us to spot tendencies, as well as to identify the necessary cleaning steps we have to take.

First, we apply filters to remove log data from automated services, such as health-checks and other application services that don't reflect the user’s interactions. Next, we remove attributes that contain a high fraction of missing values, because such attributes carry negligible information. Values of the “status_code” attribute are mapped to the corresponding descriptions for better interpretability.

We engineer new attributes: “resource_method”, “resource_base” and “user_os”. The “resource_method” and “resource_base” attributes are created from the values of the “resource” attribute by using regular expressions to extract the relevant information. The “user_os” attribute is created in a similar manner, using regular expressions to extract the relevant information from the “user_agent” attribute. Creating these attributes allows us to focus on the most relevant information and decrease the cardinality of the original attributes.

3.3. Dataset Creation

The clean dataset contains 16 attributes describing the application usage, and 522,763 rows with a timestamp attribute range from 6th January to 26th March (81 days). Data is imported to RapidMiner [14], a data science software platform that provides an integrated environment for data preparation, visualization, machine learning, text mining, and predictive analytics.
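The regular-expression attribute engineering described in Section 3.2 can be illustrated with a short sketch. The patterns below are assumptions inferred from the sample values shown in Table 1 (for example “(GET) /invoices”); the helper names `split_resource` and `user_os` and the list of operating-system patterns are hypothetical and are not the authors' actual expressions.

```python
import re

# Assumed layout of the "resource" attribute, e.g. "(GET) /invoices/42".
RESOURCE_RE = re.compile(r"^\((?P<method>[A-Z]+)\)\s+/(?P<base>[^/?\s]+)")

# Illustrative operating-system patterns, checked in order (an assumption).
OS_PATTERNS = [
    ("Windows", re.compile(r"Windows NT")),
    ("OS X", re.compile(r"Mac OS X")),
    ("Linux", re.compile(r"Linux")),
]


def split_resource(resource):
    """Return (resource_method, resource_base), or (None, None) if no match."""
    match = RESOURCE_RE.match(resource)
    if match is None:
        return None, None
    return match.group("method"), match.group("base")


def user_os(user_agent):
    """Map a raw user-agent string to a coarse operating-system label."""
    for name, pattern in OS_PATTERNS:
        if pattern.search(user_agent):
            return name
    return "Unknown"


print(split_resource("(GET) /invoices"))
print(user_os("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Collapsing each request to a method and a base path, and each user agent to an operating-system label, is what reduces the cardinality of the original attributes.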
RapidMiner is open source and used for commercial applications, as well as for research, education, training, and rapid prototyping. In this phase, we continue with Exploratory Data Analysis in order to discover patterns beyond formal modeling or hypothesis testing tasks. Our aim is to utilize the business understanding to increase the understanding of data and relationships between attributes in order to spot anomalous trends. As the application is B2B based, we analyze the company data first: company account histogram, statistics and distribution. Next, we analyze the behaviors of users in company and general context. By analyzing the “user” and “user domain” attributes, we spot trends in company context usage and behavior. Analysis of application resource requests allows us to understand the usage in general context.

Figure 3. Counts of application resource requests

From the figure above, we can spot trends and further analyze the resource usage. A resource request represents a user action and is thus highly valuable in the context of anomaly detection. Moreover, granular analysis facilitates the business understanding as we gain deeper insight into user-generated data. Next, we analyze the application errors, which are often among the most informative attributes for anomaly detection. Anomalies and cyber-attacks often cause application errors, allowing us to quickly analyze error data and make distinctions between application anomalies, user anomalies and possible threats.

Figure 4. Application error logs histogram

Figure 5. Application logs status codes histogram

Application status codes are highly correlated with application resource usage. By analyzing status codes, we gain insight into the application’s performance and usage trends.
Anomalies are most visible when analyzing the status codes. Dataset creation is concluded with the creation of an “anomaly” attribute, which represents whether a specific application log instance is anomalous. The criteria for creating this attribute are drawn from the discoveries of EDA and confirmed through consultations with the SME. By addressing the CRISP-DM phases for Business Understanding, Data Understanding, and Data Preparation with the application of Exploratory Data Analysis, we are able to discover anomalies in application usage and user behavior.

4. Results and Discussion

As the web application has a business-to-business context, we approach the analysis of log data from a company perspective. We find that companies using the application can have their application usage segmented into three categories: heavy, medium, and light users, as shown below in Figure 6. Heavy users are the companies responsible for application development and support. Medium users reflect the companies with frequent application usage, while light users represent the companies that are onboarding to the application or are in the initial phases of application usage. Distinguishing company users by their level of usage helps us create a better business understanding. Because of the unbalanced level of application usage per company, we can expect an increased number of anomalies for heavy users, while companies with medium and light usage may show fewer anomalies. The percentage of anomalies varies between companies with no specific pattern.

Figure 6. Application usage per company

When analyzing the histogram of application resource methods through the “resource_method” attribute, we find an anomalous request pattern, as shown below in Figure 7.
Consultations with the SME revealed that the resource request method anomaly corresponds to a service whose use has ceased, so the service behavior can be identified as an anomaly.

Figure 7. Application resource methods histogram anomaly

When analyzing individual users, we perform segmentation per company using the domain name in the user email address. The histogram of user domains contributes to business understanding as we can spot user trends for each company. In the figure below, we present the user domain histogram focused on anomalous application usage of unknown domains. We discover that usage from unknown domains tends to increase in the monthly peaks of application usage.

Figure 8. User domain histogram focused on unknown domains

Consultations with the SME clarified that unknown domains such as “gmail.com”, “hotmail.com”, and “outlook.com” are used by quality assurance developers and were marked as such. This has further decreased the number of visits from unknown domains. Moreover, consultations showed that users from unknown domains are companies in the trial phase, that is, the application demonstration phase, and are still eligible for anomaly detection. Application usage from other user domains is distributed as expected: two development companies take up the most traffic while others are medium and light users.

Figure 9. Log message histogram anomalies

In the figure above, we present an analysis result of the log message histogram with revealed anomalies. We find that anomalies are caused by application development or, more specifically, integration attempts with other companies using the application. In the figure below, we present results from correlation analysis of the dataset.
The correlation matrix shows increased correlation between attributes such as “platform” and “message”. These results help us to identify and discard highly correlated attributes and decrease the dataset complexity.

Figure 10. Correlation matrix

The correlation matrix also shows that the attributes “status_code” and “level” are correlated to some degree. This indicates that application errors can be traced to application status codes. The figure below depicts the status code histogram focused on error status codes. We can spot the error trends and identify the error sources.

Figure 11. Status code histogram focused on error status codes

With the application of EDA, the resulting anomalies are used in the creation of a labeled dataset for anomaly detection purposes. The dataset can serve as a baseline for creating various analytical and machine learning anomaly detection models such as frequency threshold detection, supervised anomaly prediction, unsupervised anomaly detection, etc. In Table 2, we present the final dataset statistical information.

Table 2. Dataset statistical information

Attribute name | Type | Missing | Least / Min | Most / Max | Range
timestamp | Date and time | 0 | Jan 6, 2020 6:18 AM | Mar 26, 2020 9:06 PM | 80d 14h 48min
account_id | Nominal | 3 | 58710 (3) | 12345 (131,132) | 12345, c84c286[...]ffea5, [52 more]
company_name | Nominal | 3 | Company XYZ (3) | Company A (131,132) | Company A, Company B, [52 more]
country | Nominal | 3 | XX (29) | US (399,465) | US, BA, IN, [12 more]
platform | Nominal | 0 | Backend (45%) | Browser (55%) | Browser, Backend
user | Nominal | 6 | fk***@*.com (4) | fs***@*.com (48,738) | fs***@*.com, de***@*.com, [209 more]
remote_address | Nominal | 3 | 184.*.*.22 (3) | 77.*.*.171 (41,561) | 77.*.*.171, 144.*.*.229, [302 more]
user_agent | Nominal | 0 | Mozilla/[...]4.1 (3) | Mozilla/[...]ri/537.36 (77,449) | Mozilla/[...]36, Mozilla/[...].0, [114 more]
error_msg | Nominal | 467,225 | Getaddr[...].com (1) | ESOCKET[...]UT (89) | ESOCKET[...]UT, 502, [3 more]
level | Nominal | 0 | error (159) | info (467,225) | Info, Error
message | Nominal | 0 | Integ[...]led (159) | Profiling (264,851) | Profiling, frontTimigs, [1 more]
status_code | Nominal | 93 | 405 Method [...]ed (1) | 200 OK (453,461) | 200 OK, 204 No Content, [8 more]
resource_method | Nominal | 0 | PUT (97) | GET (373,123) | GET, POST, [3 more]
resource_base | Nominal | 0 | produ[...]ile (8) | endpoints (98,191) | endpoints, customers, [17 more]
user_domain | Nominal | 6 | C*** (272) | A*** (351,885) | A***, M***, [9 more]
user_agent_os | Nominal | 0 | Unknown (3) | Windows (411,762) | Windows, OS X, [2 more]
anomaly | Binominal | 0 | True (882) | False (466,502) | False, True

5. Conclusion

This study has shown that the use of Exploratory Data Analysis contributes to and complements the implementation of CRISP-DM methodology phases: business understanding, data understanding, and data preparation. Moreover, we demonstrate that Exploratory Data Analysis is an efficient method for detecting anomalies in big data.
Summarizing data characteristics and discovering underlying patterns in the data and its distribution brings value to both the data understanding and data preparation phases. We confirm the benefits of a method proven in previous studies: consultations with the SME play a crucial role in the business understanding phase and make a valuable contribution to the data understanding phase. Next, consultations in the data understanding and data preparation phases facilitate the workflow and can help us increase the data value. Future efforts can be placed in the implementation of subsequent CRISP-DM phases, that is, modeling, evaluation and deployment. Modeling data using Machine Learning techniques enables complex pattern discovery, as suitable for big data datasets, and further improves anomaly detection as underlying mathematical relationships can be leveraged. While this has been shown in the majority of studies conducted in the field of anomaly detection and supervised machine learning, we propose the use of unsupervised machine learning for finding new anomalies, which will enable the creation of an extended labeled dataset that can then be used to create a supervised machine learning model for anomaly detection and prediction.

6. References

[1] “Big Data and cloud computing: innovation opportunities and challenges” [Online]. Available: https://www.tandfonline.com/doi/full/10.1080/17538947.2016.1239771. [Accessed: 04-Sep-2020]
[2] “Cloud Security Alliance (CSA)” [Online]. Available: https://cloudsecurityalliance.org/. [Accessed: 04-Sep-2020]
[3] “Top Threats to Cloud Computing: Egregious.” [Online]. Available: https://cloudsecurityalliance.org/artifacts/top-threats-to-cloud-computing-egregious-eleven/. [Accessed: 04-Sep-2020]
[4] “About AWS.” [Online]. Available: https://aws.amazon.com/about-aws/. [Accessed: 04-Sep-2020]
[5] A. Sari, “A Review of Anomaly Detection Systems in Cloud Networks and Survey of Cloud Security Measures in Cloud Storage Applications,” Journal of Information Security, vol. 6, no. 2, pp. 142–154, Mar. 2015.
[6] “Real-time big data processing for anomaly detection: A Survey,” Int. J. Inf. Manage., vol. 45, pp. 289–307, Apr. 2019.
[7] “Cyber Security: Threat Detection Model based on Machine learning Algorithm - IEEE Conference Publication.” [Online]. Available: https://ieeexplore.ieee.org/document/8724096. [Accessed: 04-Sep-2020]
[8] “DMME: Data mining methodology for engineering applications – a holistic extension to the CRISP-DM model,” Procedia CIRP, vol. 79, pp. 403–408, Jan. 2019.
[9] “A Reference Model for Big Data Analytics” [Online]. Available: https://www.researchgate.net/publication/327728739_A_Reference_Model_for_Big_Data_Analytics. [Accessed: 04-Sep-2020]
[10] “Exploratory data analysis” [Online]. Available: https://psycnet.apa.org/record/2011-23865-003. [Accessed: 04-Sep-2020]
[11] “Open Source Search: The Creators of Elasticsearch, ELK Stack & Kibana.” [Online]. Available: https://www.elastic.co/. [Accessed: 04-Sep-2020]
[12] “Kibana.” [Online]. Available: https://www.elastic.co/kibana. [Accessed: 04-Sep-2020]
[13] “Logstash.” [Online]. Available: https://www.elastic.co/logstash. [Accessed: 04-Sep-2020]
[14] “RapidMiner.” [Online]. Available: https://rapidminer.com/. [Accessed: 04-Sep-2020]
Title: Journal of Natural Sciences and Engineering
Identifier: 2637-2835
DOI: 10.14706
Publisher: International Burch University
Description: Journal of Natural Sciences and Engineering (JONSAE) is a peer-reviewed, biannually published international journal focusing on empirical and theoretical research in all branches of Engineering and Natural Sciences. It is published on behalf of the Faculty of Engineering and Natural Sciences of International Burch University and aims to provide the best content by publishing original research papers, review articles, special issues, feature articles, and book reviews. All manuscript submissions are subject to initial appraisal by the Editor, and, if found suitable for further consideration, to peer review by independent, anonymous referees. All peer review is double-blind and submission is online. The journal welcomes theoretical, applied, interdisciplinary and methodological work, with preference for empirical research, a critical approach and problem-solving methods in manuscripts.
Language: English

Title: Using Exploratory Data Analysis and Big Data Analytics for Detecting Anomalies in Cloud Computing
Author: Ibrahim Muzaferija, Zerina Mašetić
Abstract: While leveraging cloud computing for large-scale distributed applications allows seamless scaling, many companies struggle to keep up with the amount of data generated in terms of efficient processing and anomaly detection, which is a necessary part of the management of modern applications. As a record of user behavior, web logs are a natural object of research in anomaly detection. Many anomaly detection methods based on automated log analysis have been proposed. However, they have not been proposed in the context of big data applications, where anomalous behavior needs to be detected in the understanding phases prior to modeling a system for such use. Big Data Analytics often ignores anomalous points due to the high volume of data. To address this problem, we propose complementing Big Data Analytics with Exploratory Data Analysis, which assists in gaining insight into data relationships without the classical hypothesis modeling. In that way, we can gain a better understanding of the patterns and spot anomalies. Results show that Exploratory Data Analysis facilitates anomaly detection and the CRISP-DM Business Understanding phase, making it one of the key steps in the Data Understanding phase.
Keywords: Cloud Computing, Big Data, Data Mining, Anomaly Detection
Identifier: 2637-2835
DOI: 10.14706/JONSAE2021320

Journal of Foreign Language Teaching and Applied Linguistics

Using Film Subtitles in FLT in Croatia

Magdalena Nigoević & Koraljka Pejić & Trišnja Pejić
University of Split, Croatia
Submitted: 16.04.2014. Accepted: 19.11.2014.

Abstract

It is a general belief that students need to receive substantial input of authentic materials in FLT. The combination of verbal information with full visual experiences, such as films, has been found most appealing. Not only a large amount of natural language, but also a rich variety of cultural forms and expressions are mediated by this kind of “comprehensible input” (Krashen 1985).
Various studies have demonstrated the ways in which intralingual subtitled audio-visual material can improve the effectiveness of general foreign language comprehension (Caimi 2002, Vanderplank 1988) and how it can be a useful tool in foreign language teaching and foreign language acquisition (Neuman & Koskinen 1992). Most foreign television and cinema programs distributed in Croatia have always been accompanied by interlingual subtitles; therefore the viewers are accustomed to them. Consequently, such a habit can be efficiently exploited in foreign language learning among Croatian students who will certainly more easily develop strategies to derive benefits from subtitled films. The main aim of this study was to examine whether and to what extent film subtitles (captions) increase learners’ ability to process languages. Our hypothesis was that subtitles facilitate general comprehension of a film, provided that the linguistic difficulty of the authentic film material has been carefully selected in order to match the students’ overall competency in L2. Our research was conducted among students of B1/B2 level of English L2. Students were divided into two groups: one group watched a sequence of a feature film without subtitles, while the other was shown the same material with subtitles. Both groups were given a specially designed test to assess their general comprehension of the viewed material. The findings revealed that the group of students viewing the subtitled film showed better results than the other group.

Keywords: FLT, authentic audio-visual material, intralingual film subtitles, Croatian learners

Introduction

Learners of a foreign language do not always have an opportunity to communicate with ‘native speakers’. Therefore, it is exceptionally important that they are continually exposed to interactional and speech patterns of L2. This can easily be achieved by using audio-visual materials.
The role of audio-visual materials as a stimulating and facilitating tool in the process of teaching and learning a foreign language has been widely acknowledged. “They can provide (a) the motivation achieved by basing lessons on attractively informative content material; (b) the exposure to a varied range of authentic speech, with different registers, and (c) language used in the context of real situations, which adds relevance and interest to the learning process” (Carrasquillo 1994:140). Through such materials students become acquainted with various sorts of verbal and non-verbal behaviour in L2, conversational strategies (opening and closing, turn taking) and various cultural patterns. Among other audio-visual materials, film is probably the most authentic, that is, “authentic, in the sense that the language is not artificially constrained, and is, at the same time, amenable to exploitation for language teaching purposes” (MacWilliam 1986: 134). It is an excellent medium for introducing various aspects of the foreign language in the classroom. Furthermore, films allow teachers and learners to explore the nonverbal and cultural aspects of language as well as the verbal ones. Film can also be highly motivating since it shows real-life situations and characters, thus giving an authentic and often amusing way to get acquainted with the (extra)linguistic and cultural aspects of the target reality.

Subtitles in foreign language learning

Various studies have been carried out on the ways in which intralingual¹ subtitled audio-visual material can improve the effectiveness of general foreign language comprehension (Caimi 2002, Markham 1993 and 1999, Vanderplank 1988) and how it can be a useful tool in foreign language teaching and foreign language acquisition.
Among others, Garza (1991) studied the way in which subtitles (captions) affect the study of vocabulary among higher-level learners and concluded that the use of subtitles increases the comprehension and acquisition of vocabulary. Neuman & Koskinen (1992) obtained similar results in their study with advanced EFL students and came to the conclusion that students who watched subtitled (captioned) videos demonstrated better comprehension and vocabulary acquisition results. Baltova (1999) conducted an experiment with French students in Canada whose native language was English. The purpose of her study was to find out how the learning and retention of content and vocabulary in French were affected by different authentic video formats. She also showed that the retention of the video content was superior under the subtitled conditions. The special edition of R.I.L.A. (Rassegna Italiana di Linguistica Applicata), edited by Annamaria Caimi in 2002, contains the proceedings of a scientific conference on subtitled films, and several papers focus on the role of subtitles in foreign language teaching and learning. Most of the studies have focused on short-term effects of text aids, although some authors advocate the systematic collection of long-term data (Danan 2004: 75-76). Insight into both short- and long-term effects of subtitling is offered by the experiment conducted by Bianchi and Ciabattoni (2008) in a broad-range investigation among Italian adult learners of English. There were also past experiences and projects which encouraged the use of foreign language learning methods based on the creation of subtitles by students and pupils.² All the findings agree that subtitling can contribute to language learning and that in formal learning contexts, subtitling can reduce the anxiety experienced by foreign language learners.
The use of subtitled audio-visual material has the advantage of providing simultaneous exposure to spoken language, printed text and visual information, all conveying the same message (see: Baltova 1999: 33). Moreover, subtitles can function as an important element that bridges the gap between reading and listening skills (see: Borrás & Lafayette 1994). Most foreign programs distributed in Croatia, as in other so-called “subtitling countries”³, have always been accompanied by interlingual subtitles; therefore the viewers are exposed to subtitled foreign television and cinema programs from a very young age. As the viewers are accustomed to the logic of subtitling, they can easily switch to the use of intralingual or same-language subtitles. Consequently, such a habit can be efficiently exploited in foreign language learning among Croatian students who will certainly more easily develop strategies to derive benefits from subtitled films.⁴ However, the integration of film subtitles into language learning and teaching practice in Croatia has so far been unsatisfactory and few studies (Strmečki Marković 2003) have investigated the use of film subtitles.

Method of the Study

The main objective of this study was to examine whether and to what extent film subtitles increase the language-processing ability of the learners. We wanted to determine whether watching a subtitled film facilitates general comprehension among Croatian learners. For the purpose of this study the opening sequence (7’50’’) of the feature film About a Boy (2002, directed by Paul Weitz) was chosen. The actors in the sequence are native speakers and use a contemporary, standard variety of the English language. The topics of their conversations and monologues are common and deal with everyday situations, well known to the learners. The vocabulary and structures used in the sequence are already familiar to upper-intermediate level students.
Our research was conducted among Croatian secondary school students of English L2 at B1/B2 level of the Common European Framework. The students were divided into two groups. The groups were homogeneous in terms of the number of hours of studying English in secondary school (380), in terms of age (17-18) and, accordingly, in terms of general culture and cineliteracy. The Treatment group viewed the selected sequence with subtitles, while the Control group watched the same sequence without subtitles. The general comprehension of the viewed material was tested by a specially designed test. The test consisted of fifteen (15) open questions that the participants had to answer, based on the information they heard in the sequence. Some questions required several elements in the answer, so the maximum total score was 19. For each correct answer the participants scored one point. Each test was corrected by two independent, experienced English language teachers. Synonyms were also accepted as correct answers, provided that the participant’s comprehension was confirmed. The experiment was conducted among secondary school students in Split (Croatia) in March 2014. The total number of students was one hundred (100), divided into two groups of fifty (50) participants each. They were given precise instructions for the activity: first they had to read the comprehension test questions, then carefully watch the sequence and afterwards answer the questions. They were not allowed to look at the questions while watching the sequence. Immediately after watching it, they were asked to complete the previously designed test and were given ten minutes (10’) for the task. The collected data were processed using a t-test (SPSS programme) in order to determine the statistical difference between the Treatment group and the Control group.
Our hypothesis was that the group that watched the film sequence with subtitles (Treatment group) would have a higher score in the comprehension test than the Control group that had watched the same sequence without subtitles.
Discussion and findings
Journal of Foreign Language Teaching and Applied Linguistics
After the answer sheets were collected and corrected, the score for each group was calculated. We ran these data through a t-test to assess whether the means of the two groups were statistically different from each other. This analysis is appropriate for comparing the means of two independent groups. As can be seen in Figure 1, the Treatment group had a mean score of 13.06, while the Control group had 6.58. The mean of the Control group minus that of the Treatment group equals -6.48. The 95% confidence interval of this difference is from -7.94 to -5.02. The standard error of the difference was 0.736 (see Table 1).
Table 1. Results of the comprehension test
                              Control Group    Treatment Group
Mean                          6.58             13.06
Standard deviation            3.85             3.45
N (number of participants)    50               50
By conventional criteria, the difference is considered to be extremely statistically significant. All the participants watched the same film sequence and their comprehension was tested by the same test. All the participants were equal in terms of all relevant criteria (age, number of hours of studying English, general culture and cineliteracy). The only difference between the groups was the intervention with subtitles, in that the Treatment group had the opportunity to listen to the speech and simultaneously read the uttered words in the form of subtitles, while the participants of the Control group based their understanding only on the spoken utterances. Since all participants were equal and tested in equal conditions, the difference in the scores can be attributed exclusively to the presence or absence of subtitles.
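The reported figures can be roughly cross-checked from the summary statistics in Table 1 alone. The following is a minimal sketch, not the SPSS procedure used in the study: it assumes a pooled-variance (Student's) independent-samples t-test, and the function name `pooled_t_from_summary` and the hard-coded critical value are illustrative assumptions rather than anything taken from the original analysis.

```python
import math

# Summary statistics as reported in Table 1.
control = {"mean": 6.58, "sd": 3.85, "n": 50}
treatment = {"mean": 13.06, "sd": 3.45, "n": 50}


def pooled_t_from_summary(a, b):
    """Student's t-test (equal variances assumed) from means, SDs and Ns.

    Returns (mean difference a - b, standard error, t statistic, df).
    Hypothetical helper written for this illustration only.
    """
    df = a["n"] + b["n"] - 2
    pooled_var = ((a["n"] - 1) * a["sd"] ** 2 + (b["n"] - 1) * b["sd"] ** 2) / df
    se = math.sqrt(pooled_var * (1 / a["n"] + 1 / b["n"]))
    diff = a["mean"] - b["mean"]
    return diff, se, diff / se, df


# Control minus Treatment, matching the sign of the reported difference.
diff, se, t_stat, df = pooled_t_from_summary(control, treatment)

# Assumption: two-sided 95% critical value of t for df = 98, taken from
# standard t tables (about 1.984) and hard-coded to avoid extra libraries.
T_CRIT_95_DF98 = 1.984
ci_low = diff - T_CRIT_95_DF98 * se
ci_high = diff + T_CRIT_95_DF98 * se

print(f"difference = {diff:.2f}, SE = {se:.3f}, t({df}) = {t_stat:.2f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

Because the inputs are rounded to two decimals, the standard error computed this way (about 0.731) differs slightly from the reported 0.736, which was presumably obtained from the raw scores; the resulting interval nevertheless agrees with the reported one to within about 0.01.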
Conclusion
The findings are in accordance with previously conducted studies, and these results lead us to the conclusion that subtitled film strategies have a positive impact on students’ overall comprehension skills. Because of its realistic use of language, its accessibility and its attractiveness, watching a foreign language film as an activity has an encouraging effect. Not only is film an important source of different themes and topics, it also offers audio-visual stimulation for developing listening, speaking, reading and general comprehension skills in foreign language learning. It is important, however, to take into account that a film may be an assisting medium in covering a topic and that it has to be appropriate to the students’ level of language competence. If used appropriately, such exposure to film subtitles among Croatian students should strengthen their foreign language comprehension and acquisition of language functions and structures. Nevertheless, the authors are aware of the fact that this study was conducted on a relatively small sample, homogeneous in age and education level. These data were collected exclusively from learners of English as L2 in a country where foreign TV and cinema programmes are usually subtitled and rarely dubbed, so viewers are accustomed to subtitles. Therefore, these data should be applied with caution when making inferences about other types of L2 learners.
Notes
1 This refers to audio-visual material subtitled in the same language as the original. Same-language subtitles are also labelled captions or bimodal, unilingual, or intralingual subtitles in scholarly literature (Danan 2004: 68). Captioning was initially intended for individuals who are hearing impaired, but was later used in all spheres of life, both as didactic material and as an assisting tool in everyday viewing of video programmes and films.
On the other hand, interlingual (or interlinguistic) subtitling refers to audio-visual material in a foreign language subtitled in the learner's language, and it is the most common way of translating a medium into another language so that speakers of other languages can follow it. For the purpose of this study we use the term ‘subtitles’, which has become a common term in Europe, to refer only to intralingual subtitles.
2 Such as the LeViS (Learning via Subtitling) project, coordinated by the Hellenic Open University in Greece within the framework of the Socrates Programme, LINGUA 2 (2006-2008), which developed educational material for active foreign language learning based on film subtitling (see: http://levis.cti.gr/).
3 Subtitling is the language transfer practice used most widely in Europe. It concerns 28 countries (26 countries plus two regions in two countries): Belgium (Flemish-speaking), Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, Greece, Hungary, Iceland, Ireland, Latvia, Liechtenstein, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Romania, Slovakia, Slovenia, Sweden, Switzerland (German-speaking), Turkey and United Kingdom. (Retrieved 13 April 2014 from: http://eacea.ec.europa.eu/llp/studies/documents/study_on_the_use_of_subtitling/rapport_final-en.pdf)
4 Some American authors even emphasise “the incidental language learning occurring in Europe with spectators of American films” (Danan 2004: 68).
References
Baltova, I. (1999). Multisensory language teaching in a multidimensional curriculum: The use of authentic bimodal video in core French. The Canadian Modern Language Review, 56 (1), 32-48.
Bianchi, F. & Ciabattoni, T. (2008). Captions and Subtitles in EFL Learning: an investigative study in a comprehensive computer environment.
In: Baldry, A., M. Pavesi, C. Taylor Torsello & C. Taylor (eds) From Didactas to Ecolingua: an ongoing research project on translation, 69-90. EUT, Edizioni Università di Trieste. Retrieved from www.openstarts.units.it/dspace/bitstream/10077/2848/1/bianchi_ciabattoni.pdf
Borrás, I. & Lafayette, R. (1994). Effects of multimedia courseware subtitling on the speaking performance of college students of French. The Modern Language Journal, 78 (1), 61-75.
Caimi, A. (ed.) (2002). Cinema: Paradiso delle lingue. I sottotitoli nell’apprendimento linguistico. Special issue of RILA – Rassegna Italiana di Linguistica Applicata, 34 (1-2).
Carrasquillo, A. L. (1994). Teaching English as a second language: A resource guide. New York: Garland Publishing.
Danan, M. (2004). Captioning and subtitling: Undervalued language learning strategies. Meta, 49 (1), 67-77.
Garza, T. (1991). Evaluating the use of captioned video materials in advanced foreign language learning. Foreign Language Annals, 24 (3), 239-258.
Krashen, S. (1985). The Input Hypothesis: Issues and Implications. London: Longman.
MacWilliam, I. (1986). Video and language comprehension. ELT Journal, 40 (2), 131-135.
Markham, P. (1993). Captioned TV videotapes: Effects of visual support on second language comprehension. Journal of Educational Technology Systems, 21 (3), 183-191.
Markham, P. (1999). Captioned videotapes and second-language listening word recognition. Foreign Language Annals, 32 (3), 321-328.
Neuman, S.B. & Koskinen, P. (1992). Captioned TV as comprehensible input: Effects of incidental word learning from context for language minority students. Reading Research Quarterly, 27 (1), 94-106.
Strmečki Marković, S. (2003). Igrani film u nastavi jezičnih vježbi u sklopu studija germanistike u Zagrebu. Strani jezici, 32 (3), 59-68.
Vanderplank, R. (1988). The value of teletext subtitles in language learning. English Language Teaching (ELT) Journal, 42 (4), 272-281.
Title: Using Film Subtitles in FLT in Croatia
Author: Nigoević, Magdalena; Pejić, Koraljka; Pejić, Trišnja
Abstract: It is a general belief that students need to receive substantial input of authentic materials in FLT. The combination of verbal information with full visual experiences, such as films, has been found most appealing. Not only a large amount of natural language, but also a rich variety of cultural forms and expressions are mediated by this kind of “comprehensible input” (Krashen 1985). Various studies have demonstrated the ways in which intralingual subtitled audio-visual material can improve the effectiveness of general foreign language comprehension (Caimi 2002, Vanderplank 1988) and how it can be a useful tool in foreign language teaching and foreign language acquisition (Neuman & Koskinen 1992). Most foreign television and cinema programs distributed in Croatia have always been accompanied by interlingual subtitles; therefore the viewers are accustomed to them. Consequently, such a habit can be efficiently exploited in foreign language learning among Croatian students who will certainly more easily develop strategies to derive benefits from subtitled films. The main aim of this study was to examine whether and to what extent film subtitles (captions) increase learners’ ability to process languages. Our hypothesis was that subtitles facilitate general comprehension of a film, provided that the linguistic difficulty of the authentic film material has been carefully selected in order to match the students’ overall competency in L2. Our research was conducted among students of B1/B2 level of English L2.
Students were divided into two groups: one group watched a sequence of a feature film without subtitles, while the other was shown the same material with subtitles. Both groups were given a specially designed test to assess their general comprehension of the viewed material. The findings revealed that the group of students viewing the subtitled film showed better results than the other group.
Keywords: FLT, authentic audio-visual material, intralingual film subtitles, Croatian learners
Date: 2015-04-16
Keywords: Article; PeerReviewed; P Philology. Linguistics; PE English; PG Slavic, Baltic, Albanian languages and literature