<?xml version="1.0" encoding="UTF-8"?>
<itemContainer xmlns="http://omeka.org/schemas/omeka-xml/v5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://omeka.org/schemas/omeka-xml/v5 http://omeka.org/schemas/omeka-xml/v5/omeka-xml-5-0.xsd" uri="https://omeka.ibu.edu.ba/items/browse?output=omeka-xml&amp;page=332&amp;sort_field=Dublin+Core%2CTitle" accessDate="2026-06-29T07:06:17+01:00">
  <miscellaneousContainer>
    <pagination>
      <pageNumber>332</pageNumber>
      <perPage>10</perPage>
      <totalResults>3494</totalResults>
    </pagination>
  </miscellaneousContainer>
  <item itemId="1916" public="1" featured="0">
    <fileContainer>
      <file fileId="2809">
        <src>https://omeka.ibu.edu.ba/files/original/677baf7f9d6988021e9bd4876247c178.docx</src>
        <authentication>f050eaf6ddbaa3af0f965389373856ab</authentication>
      </file>
      <file fileId="2810">
        <src>https://omeka.ibu.edu.ba/files/original/a5fed30f085884d6608fc112c9ba1cf1.pdf</src>
        <authentication>72164ffc68b8cc2154db7c5d26f82643</authentication>
        <elementSetContainer>
          <elementSet elementSetId="4">
            <name>PDF Text</name>
            <description/>
            <elementContainer>
              <element elementId="52">
                <name>Text</name>
                <description/>
                <elementTextContainer>
                  <elementText elementTextId="15730">
                    <text>Use of Literary Texts in Language Classrooms: A Fun Way of Teaching English
Hasan Serkan Kirca
Süleyman Demirel University/ Isparta, Turkey
Key words: motivation, literature, language teaching
ABSTRACT
Use of literary texts in language classrooms has long been a concern for researchers. Underlying rationale for the use
of different genres of literature lies in the fact that they familiarize language learners with different uses of the target
language through authentic materials. Furthermore, literary texts provide a student-friendly atmosphere which is
conducive to meaningful and entertaining learning.
Language learning is considered to be a demanding endeavour for language learners. Included in the challenges
associated with language learning are affective variables. However, literary texts, while exposing the learners to the
imaginary and calming world of literature, help learners cope with anxiety or stress which might be exerted and
witnessed in the process of language learning. Along with the aforementioned advantages, literary texts promote
higher-level thinking skills such as synthesizing, analyzing and critical thinking among language learners.
The first part of the presentation will be devoted to the rationale for using literary texts in the language classrooms
with an emphasis on their potential benefits. In the second part, the presenter will provide information on a number
of literary genres which can be employed in language classrooms.
The presenter will end the session with an exemplary demonstration of how short stories, as a literary genre,
can be utilized in language classrooms. The last part of the presentation will be interactive through the participation
of the audience.

</text>
                  </elementText>
                </elementTextContainer>
              </element>
            </elementContainer>
          </elementSet>
        </elementSetContainer>
      </file>
    </fileContainer>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="15723">
                <text>1808</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="15724">
                <text>Use of Literary Texts in Language Classrooms: A Fun Way of Teaching English</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="15725">
                <text>KIRCA, Hasan Serkan</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="15726">
                <text>Key words: motivation, literature, language teaching  ABSTRACT  Use of literary texts in language classrooms has long been a concern for researchers. Underlying rationale for the use of different genres of literature lies in the fact that they familiarize language learners with different uses of the target language through authentic materials. Furthermore, literary texts provide a student-friendly atmosphere which is conducive to meaningful and entertaining learning.  Language learning is considered to be a demanding endeavour for language learners. Included in the challenges associated with language learning are affective variables. However, literary texts, while exposing the learners to the imaginary and calming world of literature, help learners cope with anxiety or stress which might be exerted and witnessed in the process of language learning. Along with the aforementioned advantages, literary texts promote higher-level thinking skills such as synthesizing, analyzing and critical thinking among language learners.  The first part of the presentation will be devoted to the rationale for using literary texts in the language classrooms with an emphasis on their potential benefits. In the second part, the presenter will provide information on a number of literary genres which can be employed in language classrooms.  The presenter will end the session with an exemplary demonstration of how short stories, as a literary genre, can be utilized in language classrooms. The last part of the presentation will be interactive through the participation of the audience.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="45">
            <name>Publisher</name>
            <description>An entity responsible for making the resource available</description>
            <elementTextContainer>
              <elementText elementTextId="15727">
                <text>IBU Publishing</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="15728">
                <text>2013-05-03</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="15729">
                <text>Article
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
  </item>
  <item itemId="932" public="1" featured="0">
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="7479">
                <text>3537</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="7480">
                <text>USE OF LITERATURE IN ELT A SHORT STORY SAMPLE: VERSION OF THE ADVENTURES OF HUCKLEBERRY FINN</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="7481">
                <text>Erten, Selcen</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="7482">
                <text>The aim of this paper was to emphasize the use of literature in ESL/EFL contexts and to investigate how the students regarded literature in general and in English classes. To be specific, the use of short stories was explained and investigated through the Adventures of Huckleberry Finn. At the beginning of the first lesson with the researcher, a questionnaire was given to 32 beginner-level preparatory class students at Eskisehir Osmangazi University in the spring semester of the 2012-2013 academic year. The questionnaire was administered a second time in the last lesson. The results revealed that the students believed in the importance and effectiveness of short stories in EFL classes, and that their attitudes toward literature in English actually stemmed from limitations in the linguistic level needed to understand and appreciate literature. Keywords: Literature, EFL/ESL context, short story.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="7483">
                <text>2014</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="7484">
                <text>Conference or Workshop Item
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="18">
        <name>PE English</name>
      </tag>
    </tagContainer>
  </item>
  <item itemId="2437" public="1" featured="0">
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="19446">
                <text>995</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="19447">
                <text>Using ‘Glocal News’ to Develop Students’ Reading and Speaking Skills.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="19448">
                <text>ÖZÇINAR-SIREL, Nazan</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="19449">
                <text>With the improvement of technology, many young people regard themselves as non-readers because they would rather get information from other forms of media such as the Internet, television, advertising, music, movies, video games and other digital realities. Therefore, teachers are constantly thinking of challenging ways to assign tasks that students can perform with these digital gadgets. Teachers are also aware of the fact that students need to be exposed to reading materials as much as possible so that they can improve their level of English.  It is difficult to envisage a language-teaching programme without any reading tasks assigned to students. Whether teachers assign their students graded reader tasks or newspaper articles does not make any difference. It is a known fact that students will improve their reading skills with any reading tasks assigned to them. Therefore, reading newspaper articles is an effective way for teachers to improve the reading skills of their intermediate-level students. Unfortunately, we are not much of a reading society and we don’t even read a newspaper regularly in our mother tongue, let alone in English. ‘Glocal News’ is one of these challenging tasks, designed to encourage students at the intermediate level to read online newspaper articles that they are interested in and to present them online as a summary activity on MOODLE, an open-source online platform of the kind known as a Course Management System (CMS). This workshop attempts to suggest an innovative approach to reading online newspaper articles to create online video journals.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="19450">
                <text>2012-05-04</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="19451">
                <text>Conference or Workshop Item
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="32">
        <name>P Philology. Linguistics</name>
      </tag>
    </tagContainer>
  </item>
  <item itemId="1879" public="1" featured="0">
    <fileContainer>
      <file fileId="2733">
        <src>https://omeka.ibu.edu.ba/files/original/100774075b54fcc2fdcd0941149f75dc.docx</src>
        <authentication>25ebec4ed420c209ed706f6b0255c8b1</authentication>
      </file>
      <file fileId="2734">
        <src>https://omeka.ibu.edu.ba/files/original/b39cf054c7052f7a0273e2415d401947.pdf</src>
        <authentication>692c7b2d0018f57e73b4696d03d0002a</authentication>
        <elementSetContainer>
          <elementSet elementSetId="4">
            <name>PDF Text</name>
            <description/>
            <elementContainer>
              <element elementId="52">
                <name>Text</name>
                <description/>
                <elementTextContainer>
                  <elementText elementTextId="15433">
                    <text>Using a Case Study to Teach the (Non)Subtleties of Language: Logical Fallacies and
Principles of Conversational Coherence
Artur Hadaj &amp; Christina Standerfer
University of Tirana/ Tirana, Albania
Key words: logical fallacies, case study, conversational coherence
ABSTRACT
This paper centers on a practical and relevant way to teach logical fallacies and how to avoid them to English as a
second language learners in the Balkan region. The paper begins with a brief overview of the importance of teaching
subtleties of language, such as logical fallacies and principles of conversational coherence and then proceeds to
describe a rather heated written exchange between the editors of the Albanian daily newspaper Shekulli and
representatives of the U.S. Embassy. In 2011, Shekulli published a long editorial without adding any statement
saying that the views expressed in the article did not represent the stand of the newspaper. Immediately after this
editorial, the US Embassy issued a brief statement accusing this newspaper of using an ad hominem argument when
they explicitly referred to the ambassador’s Asian looks and his short stature. In their statement, the Embassy
conveyed information regarding money the U.S. government had donated to the Albanian Media Institute for the
qualification of Albanian journalists. The implication was that the journalists of this newspaper either did not want
to attend the qualification courses organized by the Institute or they could not understand the modern principles of
newspaper writing. A few days later the Dutch embassy in Tirana severed relations with Shekulli, accusing its
editors of engaging in slander. Description of the case is followed by an analysis, with a focus on the logical
fallacies evident in the discourse (e.g., ad hominem arguments, non sequiturs, and glittering generalities). The paper
concludes with lesson plans for how the case can be used to teach not only logical fallacies but also principles of
conversational coherence (Grice, 1989) by leading students through a series of exercises in which they reimagine
and reconstruct the exchange in ways that produce different and perhaps more favorable outcomes.

</text>
                  </elementText>
                </elementTextContainer>
              </element>
            </elementContainer>
          </elementSet>
        </elementSetContainer>
      </file>
    </fileContainer>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="15426">
                <text>1741</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="15427">
                <text>Using a Case Study to Teach the (Non)Subtleties of Language: Logical Fallacies and Principles of Conversational Coherence</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="15428">
                <text>HADAJ, Artur
STANDERFER, Christina</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="15429">
                <text>Key words: logical fallacies, case study, conversational coherence  ABSTRACT  This paper centers on a practical and relevant way to teach logical fallacies and how to avoid them to English as a second language learners in the Balkan region. The paper begins with a brief overview of the importance of teaching subtleties of language, such as logical fallacies and principles of conversational coherence and then proceeds to describe a rather heated written exchange between the editors of the Albanian daily newspaper Shekulli and representatives of the U.S. Embassy. In 2011, Shekulli published a long editorial without adding any statement saying that the views expressed in the article did not represent the stand of the newspaper. Immediately after this editorial, the US Embassy issued a brief statement accusing this newspaper of using an ad hominem argument when they explicitly referred to the ambassador’s Asian looks and his short stature. In their statement, the Embassy conveyed information regarding money the U.S. government had donated to the Albanian Media Institute for the qualification of Albanian journalists. The implication was that the journalists of this newspaper either did not want to attend the qualification courses organized by the Institute or they could not understand the modern principles of newspaper writing. A few days later the Dutch embassy in Tirana severed relations with Shekulli, accusing its editors of engaging in slander. Description of the case is followed by an analysis, with a focus on the logical fallacies evident in the discourse (e.g., ad hominem arguments, non sequiturs, and glittering generalities).
The paper concludes with lesson plans for how the case can be used to teach not only logical fallacies but also principles of conversational coherence (Grice, 1989) by leading students through a series of exercises in which they reimagine and reconstruct the exchange in ways that produce different and perhaps more favorable outcomes.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="45">
            <name>Publisher</name>
            <description>An entity responsible for making the resource available</description>
            <elementTextContainer>
              <elementText elementTextId="15430">
                <text>IBU Publishing</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="15431">
                <text>2013-05-03</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="15432">
                <text>Article
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
  </item>
  <item itemId="2586" public="1" featured="0">
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="20341">
                <text>770</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="20342">
                <text>Using A Moodle Platform In An Online Exchange To Enhance Intercultural Sensitivity: A Practical Experience In Higher Education</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="20343">
                <text>Raluy Alonso, Angel</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="20344">
                <text>As the Council of Europe suggests, foreign language teaching needs to comprise not only linguistic performance but also intercultural consciousness and intercultural skills. Despite being grammatically and lexically competent, many university students have limited experience in handling cultural difference due to a lack of exposure to intercultural interaction (Belz, 2006). As O’Dowd (2007) states, online communication tools not only offer more opportunities than before to interact with peers from distant societies but they also provide an authentic and effective way of preparing learners for intercultural enrichment through partnership.  The aim of this talk is to present a summary of the experience and the findings of a semester-long online exchange between specialist learners of English at the University of Vic (Barcelona, Spain) and at the University of Opole (Poland) during the 2011-2012 academic year. The immediate objective pursued by both institutions was to establish a closer relationship between third-year students both physically and virtually so as to foster a better understanding of their counterparts’ culture. The project rested on the principles of reciprocity and learner autonomy, so the communication was asynchronous and fundamentally developed outside the classroom. In order to test the impact of the online communication on the students’ intercultural sensitivity, a small-scale study was conducted. During the session, the structure, outcomes, challenges and future of the experience will be discussed and some preliminary results of the research project will be presented.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="20345">
                <text>2012-05</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="20346">
                <text>Conference or Workshop Item
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="32">
        <name>P Philology. Linguistics</name>
      </tag>
    </tagContainer>
  </item>
  <item itemId="2249" public="1" featured="0">
    <fileContainer>
      <file fileId="3303">
        <src>https://omeka.ibu.edu.ba/files/original/68a47120ae18ec74cd7d533e54a2be14.pdf</src>
        <authentication>f7ba5d25d9c3b40841699af4cf75665e</authentication>
        <elementSetContainer>
          <elementSet elementSetId="4">
            <name>PDF Text</name>
            <description/>
            <elementContainer>
              <element elementId="52">
                <name>Text</name>
                <description/>
                <elementTextContainer>
                  <elementText elementTextId="18187">
                    <text>3rd International Symposium on Sustainable Development, May 31 - June 01 2012, Sarajevo

Using Artificial Neural Networks To Forecast GDP For Turkey

Karaatli Meltem, Göçmen Yağcilar Gamze, Karacadal Hüseyin, Sezer Fırat Suleyman
Suleyman Demirel University, Isparta, Turkey
E-mails: meltemkaraatli@sdu.edu.tr,gamzeyagcilar@sdu.edu.tr,
huseyin_karacadal@hotmail.com,cihangir_07_@hotmail.com

Abstract
Artificial Neural Networks (ANN) are systems that resemble biological neural systems and take
the working principles of the human brain as a base. ANN can be applied in various fields for the
purposes of forecasting, classification, optimization, data binding and so on. ANN has been
frequently used in financial applications in recent years. In this study, ANN is used in
forecasting the Gross Domestic Product of Turkey. Gross Domestic Product (GDP) refers to the
market value of all final goods and services produced within a country in a given period. GDP
can be thought of as the size of an economy, and it is the foremost measure of the
macroeconomic performance of a country, its health and its standard of living.
Therefore, expectations about future GDP can be the primary determinant of investments,
employment, wages, profits and even stock market activities. With respect to its economic
significance mentioned above, the purpose of this study is to forecast Gross Domestic Product
(GDP) for Turkey and to test the ability of ANN Method in forecasting GDP.

Keywords: Importance of Gross Domestic Product, Forecasting, Artificial Neural Networks.

1. Introduction
Gross Domestic Product (GDP) is the total market value of all final goods and services
produced within a country’s borders in a given year. This production is generated by both
citizens of the country and foreigners living within its borders. GDP is one of the most
important indicators of an economy’s growth, health and welfare, and it therefore tells us a
great deal about real economic activity.
GDP can be calculated in one of two basic ways: either by adding up what everyone earned
(income approach) or by adding up what everyone spent (expenditure approach) in a year.
Logically, both measures should arrive at roughly the same total (www.investopedia.com). In
Turkey, GDP is measured quarterly by the Turkish Statistical Institute (TÜİK). To compute
economic growth, each quarter is compared to the previous one.
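The quarter-on-quarter comparison described above can be sketched as follows. This is only a minimal illustration: the function name is hypothetical, and the sample figures are the 2011 actual values that the paper reports later in Table 2 (in thousand TL).

```python
def quarterly_growth(values):
    """Return the percentage growth of each quarter relative to the previous one."""
    return [
        (current - previous) / previous * 100.0
        for previous, current in zip(values, values[1:])
    ]


# 2011-Q1 to 2011-Q3 actual GDP from Table 2 (thousand TL).
gdp_2011 = [26205423, 27904922, 31028948]
growth = quarterly_growth(gdp_2011)
for quarter, rate in zip(["Q2", "Q3"], growth):
    print(quarter, round(rate, 2), "%")
```

The sketch compares raw quarterly levels only; whether the official growth figures are seasonally adjusted or expressed in constant prices is not stated in the text and is not modelled here.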
Given its large impact on almost everybody in an economy, forecasting GDP is of great
importance both theoretically and practically. First of all, GDP represents economic
production and growth, so it gives a signal about future employment and wages. GDP also
influences stock market returns: if the GDP growth rate is positive, investors may expect to
earn a return (www.investopedia.com). GDP reports show which sectors of the economy are
growing and which are declining, which helps investors decide whether to invest and in which
sectors (http://useconomy.about.com/od/grossdomesticproduct/p/GDP.htm).
GDP statistics can also help economists address the problem of inflation in a country. The
national income figures show how much the general price level has increased or decreased, how
much of their income people spend on consumption goods and how much they save. The
government can devise measures for controlling inflation or deflation on the basis of these
figures for consumption, saving and investment
(http://www.economicsconcepts.com/gdp_as_a_measure_of_welfare.htm).
In the existing literature, GDP forecasting has been widely studied with different methods. In
this paper, we examine whether the forecasting performance for this variable can be improved
using neural network models. In this context, the purpose of this study is to forecast the GDP
of Turkey using the Artificial Neural Networks (ANN) method. The rest of this paper is
organized as follows: Section 2 reviews some of the literature on GDP forecasts. Section 3
describes the methodology, while Section 4 presents the results. Finally, Section 5 concludes
the paper.

2. Literature review
Tkacz and Hu (1999) have examined whether more accurate indicator models of output growth,
based on monetary and financial variables, can be developed using neural networks. The
authors have used an ANN model to forecast GDP growth for Canada. The main findings of this
study are that, at the 1-quarter forecasting horizon, neural networks yield no significant
forecast improvements. At the 4-quarter horizon, however, the improved forecast accuracy is
statistically significant: the root mean squared forecast errors of the best neural network
models are about 15 to 19 per cent lower than those of their linear model counterparts.
Marcellino (2007) has evaluated whether complicated time series models can outperform
standard linear models in forecasting GDP growth and inflation for the United States. The
study considers a large variety of models and evaluation criteria, using a bootstrap
algorithm to evaluate the statistical significance of the results. The main conclusion is
that, in general, linear time series models can hardly be beaten if they are carefully
specified.
Schumacher and Breitung (2008) have employed factor models to forecast German GDP
using mixed-frequency real-time data, where the time series are subject to different statistical
publication lags. In the empirical application, the authors have used a novel real-time dataset
for the German economy. Employing a recursive forecast experiment, they have evaluated the
forecast accuracy of the factor model with respect to German GDP.
Guegan and Rakotomarolahy (2010) have conducted an empirical forecast accuracy comparison of
a non-parametric method, the multivariate Nearest Neighbor method, with parametric VAR
modeling on euro area GDP. Using both methods for nowcasting and forecasting GDP, through the
estimation of economic indicators plugged into bridge equations, the authors have obtained
more accurate forecasts with the nearest neighbor method. They have also proven the
asymptotic normality of the multivariate k-nearest neighbor regression estimator for
dependent time series, which provides confidence intervals for point forecasts in time
series.
Mirbagheri (2010) has investigated the supply side economic growth of Iran by estimating
GDP growth. In this study, the predictive results of Fuzzy-logic and Neural-Fuzzy methods
are also compared. According to the findings of the study, forecasting by the Neural-Fuzzy
method is recommended.
Ge and Cui (2011) have applied a process neural network (PNN) to GDP forecasting. They have
established a forecast model by choosing the main factors that influence GDP and by using the
PNN's capacity to extract cumulative effects in both time and space. Compared with a
traditional neural network forecast model, the PNN-based GDP forecast model shows better
performance.
Liliana and Napitupulu (2010) have also used the ANN method to forecast GDP. In this study,
the authors have forecasted GDP for Indonesia and set out a number of advantages and
disadvantages of the method. According to the results, the authors have concluded that the
ANN model has a better ability to forecast macroeconomic indicators.

3. Methodology
Artificial neural networks (ANN) may be described as computing technologies that reproduce
the behavior and general features of biological neural networks (Deng et al., 2008:1118). An
ANN is a computer programming technique developed by imitating the operating mechanism of the
human brain, with the aim of performing the basic operations that the brain performs. A
computer algorithm that attempts to operate as the brain does, that is, one that makes
decisions, draws conclusions, reaches a conclusion from the existing data when some data are
missing, constantly accepts new data input, learns and remembers, is called an "Artificial
Neural Network" (Kaltakçı, 1997:411-420).
Artificial neural networks consist of many simple processing elements called nodes or
neurons. Each neuron is connected to other neurons through weights. These weights hold the
information used by the network to solve a problem. Neurons are arranged in layers, and the
neurons in each layer are connected to the neurons in adjacent layers. A weight gives the
mathematical value of the relative strength of a connection that transfers information from
one layer to another. The summation function calculates the sum of all the weighted inputs of
a neuron. The activation function converts this sum into an output within an acceptable range
(usually 0-1). The input layer corresponds to the independent variables, while the output
layer corresponds to the dependent variables (Deng et al., 2008:1118).
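As a minimal sketch of the summation and activation functions just described, the snippet below computes the output of a single neuron. The logistic (sigmoid) activation is chosen here because it maps into the 0-1 range mentioned above; the bias term, the function names and the sample numbers are assumptions added for illustration.

```python
import math


def sigmoid(x):
    """Logistic activation: squashes any real input into the 0-1 range."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias=0.0):
    """Apply the summation function, then the activation function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


# Hypothetical inputs and weights, chosen only for illustration.
print(neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1]))
```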
Networks with one layer are called single-layered neural networks, while networks with more
than one layer are called multilayered neural networks. In a multilayered neural network, the
number of neurons in each layer may vary (Hines, 1997:206). While a single-layered network
consists of an input and an output layer, a multilayered network also contains one or more
middle (hidden) layers. As the number of middle layers increases, the ability of the
artificial neural network to extract statistics from the input data also increases (Nygren,
2004).
If an artificial neural network is required to solve a nonlinear problem, a more
sophisticated type of network is needed. Multilayer perceptrons (MLP) are network
architectures developed for this purpose. An MLP has a feed-forward architecture and uses a
supervised learning method (Deng et al., 2008:1118). It consists of an input layer, one or
more middle layers and an output layer. Each layer has one or more processing elements. Every
processing element in a layer is connected to every processing element in the next layer. The
flow of information is forwards and there is no feedback; therefore, these networks are
called feed-forward neural network models. No information processing takes place in the input
layer. The number of processing elements in the input and output layers depends entirely on
the problem at hand. The number of middle layers and the number of processing elements in
each middle layer are found by trial and error (Lippmann, 1987:24-25).
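The forward flow of information through such a network can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the 7-3-1 shape mirrors the GDP network reported later in Table 1, while the tanh activation, the random initial weights and all names are assumptions.

```python
import math
import random


def init_layer(n_inputs, n_neurons, rng):
    """Create one fully connected layer as a list of (weights, bias) pairs."""
    return [
        ([rng.uniform(-0.5, 0.5) for _ in range(n_inputs)], rng.uniform(-0.5, 0.5))
        for _ in range(n_neurons)
    ]


def forward(layers, inputs):
    """Propagate inputs layer by layer; information only flows forward."""
    activations = list(inputs)  # the input layer does no processing
    for layer in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations


rng = random.Random(0)
# 7 inputs, 3 middle-layer neurons, 1 output neuron.
network = [init_layer(7, 3, rng), init_layer(3, 1, rng)]
output = forward(network, [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
print(output)
```

Because the weights are random and untrained, the printed value has no economic meaning; the point is only that each layer's outputs become the next layer's inputs.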
Each output produced by this type of network is compared with the target output in each
learning iteration, and the error is calculated. This error is propagated backwards through
the network and used to correct the weights. The process continues until the mean squared
error between the target output and the output produced by the network is minimized (Deng et
al., 2008:1118). For this reason, this type of network is also called an error propagation
model or backpropagation network model (Öztemel, 2003:76). A network of this type is
illustrated in Figure 1.
Figure 1: A Multilayered Network Model
[Figure: inputs (Input 1, Input 2, ..., Input N) enter the input layer and pass through the
middle layer to the output layer, which produces the output. Activation flows forward; error
flows backward.]
Source: (Hamid and Iqbal, 2004:1118)
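The forward activation flow and backward error flow can be sketched with a very small network trained by plain gradient descent. This is a hedged illustration only: the toy data, the learning rate, the network size and all names are assumptions, and plain gradient descent is a simpler technique than the Levenberg-Marquardt training ('trainlm') that the study reports using in Matlab.

```python
import math
import random


def train(samples, hidden=3, epochs=2000, rate=0.05, seed=0):
    """Train a one-hidden-layer network by gradient-descent back-propagation.

    Returns the mean squared error before and after training.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input to hidden
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden to output
    b2 = 0.0

    def predict(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def mse():
        return sum((t - predict(x)[1]) ** 2 for x, t in samples) / len(samples)

    before = mse()
    for _ in range(epochs):
        for x, target in samples:
            h, y = predict(x)        # forward activation flow
            error = y - target       # error at the output
            for j in range(hidden):  # backward error flow
                grad_hidden = error * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= rate * error * h[j]
                w1[j] -= rate * grad_hidden * x
                b1[j] -= rate * grad_hidden
            b2 -= rate * error
    return before, mse()


# Toy data (hypothetical): learn y = x squared on a few points in [0, 1].
data = [(i / 10.0, (i / 10.0) ** 2) for i in range(11)]
mse_before, mse_after = train(data)
print(mse_before, mse_after)
```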

4. Forecasting Gross Domestic Product with ANN
In this study, gross domestic product has been estimated with artificial neural networks
using quarterly data calculated by the expenditure method for the years 1998-2010. The data
have been drawn from the website of the Turkish Statistical Institute. For each variable, 52
observations have been used, covering the quarterly periods of 13 years. The data have been
divided randomly into 4 different groups, with 80% of the data used for training and 20% for
testing.
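A random 80/20 division of the 52 quarterly observations can be sketched as below. The seed, the rounding rule and the function name are assumptions; the paper does not say how its four groups were formed, so that step is not modelled.

```python
import random


def split_train_test(n_observations, test_share=0.2, seed=0):
    """Randomly assign observation indices to training and test sets."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)
    n_test = round(n_observations * test_share)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# 52 quarterly observations (13 years x 4 quarters), as in the study.
train_idx, test_idx = split_train_test(52)
print(len(train_idx), len(test_idx))
```

Since 20% of 52 is not a whole number, the sketch rounds the test set to 10 observations and leaves 42 for training.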
Gross domestic product is a composite of macroeconomic variables: resident household
consumption, government final consumption expenditure, gross fixed capital formation, stock
exchanges, and exports and imports of goods and services. Gross domestic product is taken as
the dependent variable, while household consumption, government final consumption
expenditure, gross fixed capital formation, stock exchanges, exports and imports of goods and
services, and time are taken as the independent variables. The dependent and independent
variables used in the study are listed below together with their symbols.
Gross Domestic Product: GDP
Time: T
Resident Household Consumption: RHC
Government Final Consumption Expenditure: GFCE
Gross Fixed Capital Formation: GFCF
Stock Exchanges: SE
Export of Goods and Services: GSE
Import of Goods and Services: IGS
Because the values of the independent variables, except for the time variable, are unknown
for the forecast periods, each independent variable has first been estimated separately as a
function of time, and GDP, the dependent variable, has then been predicted from these
estimates. In other words, each independent variable has in turn been treated as a dependent
variable and predicted from the time variable. Different numbers of neurons and hidden layers
have been tested to find the most appropriate network for predicting each variable. The
estimation performance metrics have been used to determine the most appropriate network: the
network structure with the smallest forecast errors is identified as the most suitable one.
The most appropriate network structures used to predict all the variables are shown in Table
1. Because the stock exchanges variable has many sharp rises and falls, each of its quarters
has been estimated separately and the results have then been combined. The estimation
performance metrics commonly used in the literature, MSE (Mean Squared Error), RMSE (Root
Mean Squared Error) and MAPE (Mean Absolute Percentage Error), are shown in Formulas 1, 2 and
3 (Zhang and Hu, 1998:500; Cho, 2003:328; De Lurgio, 1998:53).

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{T}} \qquad (1)$$

$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \frac{|y_t - \hat{y}_t|}{y_t} \times 100 \qquad (2)$$

$$\mathrm{MSE} = \frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{T} \qquad (3)$$

Here;
$y_t$ = The actual observation values,
$\hat{y}_t$ = Estimated values,
$T$ = Number of estimates.
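The three performance metrics in Formulas 1 to 3 can be sketched in plain Python as follows. The function names and the sample values are hypothetical; the absolute value in the MAPE denominator is an added safeguard that changes nothing for positive series such as GDP.

```python
import math


def mse(actual, predicted):
    """Mean squared error, Formula (3)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, Formula (1): the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, Formula (2), in percent."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return total / len(actual) * 100.0


# Hypothetical values, used only to exercise the functions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mse(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Because RMSE is the square root of MSE, the two always rank candidate networks in the same order; MAPE can rank them differently because it weights each error by the size of the actual value.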

Table 1: The network structures used for estimation of variables

Variable   Input-layer   Intermediate-layer   Output-layer   R      MAPE (%)   µ (number of
           neurons       neurons              neurons                          iterations)
RHC        1             3                    1              0.97   3.73       3
GFCE       1             4                    1              0.96   3.94       5
GFCF       1             3                    1              0.81   9.32       10
SE1        1             2, 3, 2              1              0.83   70.5       20
SE2        1             3                    1              0.86   88.6       20
SE3        1             5                    1              0.78   6.6        15
SE4        1             2, 3                 1              0.97   19.5       20
GSE        1             2                    1              0.94   5.20       15
IGS        1             3                    1              0.95   6.6        12
GDP        7             3                    1              0.99   2.77       2

Note: SE1 to SE4 are the separately estimated quarters of the SE variable.

The estimation performance metrics for gross domestic product (GDP) are MSE = 0.000042,
RMSE = 0.006451 and MAPE = 2.775746%. Witt and Witt (2000) classify forecasting models by
their MAPE values: models with a MAPE under 10% have "high accuracy", and those with a MAPE
between 10% and 20% give "correct predictions". Similarly, Lewis calls models whose MAPE
values are less than 10% "very good", those between 10% and 20% "good", those between 20% and
50% "acceptable" and those over 50% "false and erroneous" (cited in Çuhadar and Kayacan,
2005:6). By both classifications, the GDP model, with a MAPE of about 2.78%, falls into the
most accurate category.
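The Lewis classification can be expressed as a small lookup, sketched below. It assumes that the last class covers MAPE values above 50% and that a value exactly on a boundary belongs to the higher band; the names are hypothetical.

```python
import bisect

# Upper bounds (in percent) of the first three Lewis classes.
LEWIS_BOUNDS = [10.0, 20.0, 50.0]
LEWIS_LABELS = ["very good", "good", "acceptable", "false and erroneous"]


def classify_mape(mape_percent):
    """Return the Lewis class for a MAPE value given in percent."""
    return LEWIS_LABELS[bisect.bisect_right(LEWIS_BOUNDS, mape_percent)]


# MAPE reported for the GDP network in the study.
print(classify_mape(2.775746))
```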

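Figure 2: The optimum network structure to estimate the GDP
[Figure: an input layer with the seven inputs RHC, GFCE, GFCF, SE, GSE, IGS and T, connected
to a hidden layer, which is connected to an output layer with the single output GDP.]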

In this study, the Matlab 7.9 software package has been used. 'trainlm' has been selected as
the training function, 'learngdm' as the learning function, 'MSE' as the performance function
and 'tansig' as the transfer function. The predicted and actual values are given in Table 2.

Table 2: Actual and Estimated Values of GDP

Period      Actual (1,000 TL)   Estimated (1,000 TL)
2011 GDP    85,139,293          109,708,230
2011-Q1     26,205,423          26,070,548
2011-Q2     27,904,922          27,911,332
2011-Q3     31,028,948          28,430,643
2011-Q4     ---------           27,295,707
2012 GDP    ---------           111,233,502
2012-Q1     ---------           26,813,588
2012-Q2     ---------           28,153,427
2012-Q3     ---------           28,500,755
2012-Q4     ---------           27,765,733

Note: the 2011 actual total is the sum of Q1-Q3 only, since no actual value is reported for
2011-Q4.
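As a worked check on Table 2, the sketch below computes the absolute percentage error for the three 2011 quarters that have actual values and sums the four estimated quarters. The figures are copied from the table (thousand TL); the variable names are hypothetical, and the calculation is a reader's check rather than a result reported by the authors.

```python
# Actual and estimated quarterly GDP for 2011 from Table 2 (thousand TL).
actual = {"2011-Q1": 26205423, "2011-Q2": 27904922, "2011-Q3": 31028948}
estimated = {
    "2011-Q1": 26070548,
    "2011-Q2": 27911332,
    "2011-Q3": 28430643,
    "2011-Q4": 27295707,
}

# Absolute percentage error for each quarter that has an actual value.
ape = {
    quarter: abs(actual[quarter] - estimated[quarter]) / actual[quarter] * 100.0
    for quarter in actual
}
mape_2011 = sum(ape.values()) / len(ape)

for quarter, value in ape.items():
    print(quarter, round(value, 2))
print("MAPE over Q1-Q3:", round(mape_2011, 2))
print("Sum of estimated quarters:", sum(estimated.values()))
```

The sum of the four estimated quarters reproduces the estimated 2011 total in the table, while the actual total covers only three quarters, so the two annual figures are not directly comparable.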

5. Conclusions
Gross Domestic Product is an important indicator for all economic units, including companies,
investors and households, because it determines their future incomes, the returns on their
investments, the cost of capital and so on. Economic units therefore make their decisions and
set economic policies according to the future economic conditions implied by expected GDP.
The question is which methods are more suitable and successful in forecasting GDP. In this
paper, we applied the Artificial Neural Networks method as a prediction model. The results
suggest that the forecasting performance for this variable can be improved using neural
network models.

REFERENCES
Cho, V. (2003). “A Comparison of Three Different Approaches to Tourist Arrival
Forecasting”, Tourism Management, 24: 323-330.
Çuhadar, M. and Kayacan C. (2005), “Yapay Sinir Ağları Kullanılarak Konaklama
İşletmelerinde Doluluk Oranı Tahmini: Türkiye’deki Konaklama İşletmeleri Üzerine Bir
Deneme”, Anatolia: Turizm Araştırmaları Dergisi, 16(1): 1990-2005.
De Lurgio, A. S. (1998), Forecasting Principles and Applications, Irwin McGraw-Hill: Singapore.
Deng, Wei-Jaw, Wen-Chin Chen, Wen Pei “Back-propagation neural network based
importance–performance analysis for determining critical service attributes”, Expert Systems
with Applications 34 (2008) 1115–1125.
Ge, L., Cui, B., (2011), “Research on Forecasting GDP Based on Process Neural Network”,
IEEE 2011, 7. International Conference on Natural Computation, 821-824.
Guegan, D., Rakotomarolahy, P., (2010), “Alternative Methods for Forecasting GDP”,
University of Paris, CES Working Papers, 2010.65.
Hamid, Shaikh, A. and Zahid Iqbal (2004), “Using Neural Networks for Forecasting Volatility
of S&amp;P 500 Index Futures Prices”, Journal of Business Research, 57: 1116-1125.
Hines, J, W., MATLAB, Supplement to Fuzzy and Neural Approaches in Engineering, John
Wiley&amp;Sons, Inc., 1997.
Kaltakci, M, Y., Dere, Y., Yapay Sinir Ağları Uygulamalarının İnşaat Mühendisliğinde
Kullanımı, Prof. Dr. Rifat Yarar Sempozyumu, Editör: Semih S.
Liliana, Napitupulu, T.A., (2010), “Artificial Neural Network Application in Gross Domestic
Product Forecasting- an Indonesian Case”, 2. International Conference on Advances in
Computing, Control and Telecommunication Technologies, IEEE 2010, 89-93.
Lippmann, R., “An Introduction to Computing with Neural Nets”, Vol.4, 1987.
Marcellino, M., (2007), “A Comparison of Time Series Models for Forecasting GDP Growth
and Inflation”, http://www.eui.eu/Personal/Marcellino/1.pdf.
Mirbagheri, M., (2010), “Fuzzy Logic and Neural Network Fuzzy Forecasting of Iran GDP
Growth”, African Journal of Business Management, Vol.4, No.6, 925-929.
Nygren, K., Stock Prediction: A Neural Network Approach, Master Thesis, Royal Institute Of
Technology, KTH, 2004.
Öztemel, E., Yapay Sinir Ağları, Papatya Yayıncılık, İstanbul, 2003.
Schumacher, C., Breitung, J., (2008), “Real-time Forecasting of German GDP based on Large
Factor Model with Monthly and Quarterly Data”, International Journal of Forecasting, Vol.
24, 386-398.
Tkacz, Greg, Hu, Sarah, (1999), “Forecasting GDP Growth Using Artificial Neural
Networks”, Bank of Canada Working Papers, 99-3.
Zhang, G., Hu, M.Y. (1998) “Neural Network Forecasting of the British Pound/US Dollar
Exchange Rate”, Omega Int. J. Mgmt. Sci, 26(4): 495-506.
(http://useconomy.about.com/od/grossdomesticproduct/p/GDP.htm).
(http://www.economicsconcepts.com/gdp_as_a_measure_of_welfare.htm).
(www.investopedia.com).

The Importance And The Place Of Ombudsman In Law State

Feyzullah Ünal
Dumlupinar University, Faculty of Economics and Administrative Sciences
E-mail: feyz_unal@mynet.com

Abstract
When the ombudsman is analyzed with respect to its historical roots, it is understood that
this institution was inspired by the Islamic and Ottoman state systems. Today the institution
of the ombudsman has been implemented in more than 100 countries and has undertaken the
mission of protecting citizens against maladministration and securing fundamental rights and
liberties, providing security for both the governing and the governed. This study proposes
that fundamental rights and freedoms should be secured, that all activities of the government
should be subject to judicial control, and that the significance of this institution in
realizing legal governance should be recognized.

Keywords: Ombudsman, law state, fundamental rights and freedoms, justice, control and
judicial control.

</text>
                  </elementText>
                </elementTextContainer>
              </element>
            </elementContainer>
          </elementSet>
        </elementSetContainer>
      </file>
    </fileContainer>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="18181">
                <text>1129</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="18182">
                <text>Using Artificial Neural Networks To Forecast Gdp For Turkey</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="18183">
                <text>Karaatli, Meltem</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="18184">
                <text>Artificial Neural Networks (ANN) is a system resembling biological neural systems and uses  working principles of human brain as a base. ANN can be applied in various fields for the  purposes of forecasting, classification, optimization, data binding and so on. ANN has been  frequently used in financial applications in recent years. In this study, ANN is used in  forecasting Gross Domestic Product of Turkey. Gross Domestic Product (GDP) refers to the  market value of all final goods and services produced within a country in a given period. GDP  can be thought as the size of an economy and it is the foremost important measure of  macroeconomic performance of a country, a country’s health and standard of living.  Therefore, expectations about future GDP can be the primary determinant of investments,  employment, wages, profits and even stock market activities. With respect to its economic significance mentioned above, the purpose of this study is to forecast Gross Domestic Product  (GDP) for Turkey and to test the ability of ANN Method in forecasting GDP.  Keywords: Importance of Gross Domestic Product, Forecasting, Artificial Neural Networks.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="18185">
                <text>2012-05-31</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="18186">
                <text>Conference or Workshop Item
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="6">
        <name>H Social Sciences (General)</name>
      </tag>
    </tagContainer>
  </item>
  <item itemId="2450" public="1" featured="0">
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="19525">
                <text>904</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="19526">
                <text>Using Current out-of-class Materials in Teaching Reading Comprehension</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="19527">
                <text>Asgari, Majid </text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="19528">
                <text>The studies on the integrating out of class materials with class materials mostly show the crucial role of this task for teachers and its benefits for students. This study investigates the effect of the integrating currents issues of interest into class materials on the students’ reading comprehension. The following question is proposed. Is relating current issues of interest to class materials useful on students reading comprehension? A true and a null hypothesis are given. The true hypothesis is integrating current issues of interest with class materials in teaching reading has a positive effect on reading comprehension. The study is performed at Islamic Azad University in Hidaj with 60 participants--male and female-- who are majoring in ‘mechanical’ and ‘electrical’ engineering. The subjects are randomly divided into two groups, each with 30 students. One of the groups is used as the experimental group (G1) and the other one as the control group (G2). The subjects are taught for two weeks and finally will take an achievement test. After analyzing the results of the test, and by comparing the means of the scores using t-test, the null hypothesis will be testified to show whether integrating current issues of interest with class materials improves reading comprehension of students in English class at university.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="19529">
                <text>2012-05</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="19530">
                <text>Conference or Workshop Item
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="32">
        <name>P Philology. Linguistics</name>
      </tag>
    </tagContainer>
  </item>
  <item itemId="246" public="1" featured="0">
    <fileContainer>
      <file fileId="244">
        <src>https://omeka.ibu.edu.ba/files/original/37e978cf4ff09117c1045deec0a58515.pdf</src>
        <authentication>f756b0d603e556316fcddca5d288d4cd</authentication>
        <elementSetContainer>
          <elementSet elementSetId="4">
            <name>PDF Text</name>
            <description/>
            <elementContainer>
              <element elementId="52">
                <name>Text</name>
                <description/>
                <elementTextContainer>
                  <elementText elementTextId="1861">
                    <text>USING DATABASE AUDIT FOR ANALYZING ON HISTORICAL DATA
Adnan Hodžić
International Burch University
Bosnia and Herzegovina
adnan.hodzic@ibu.edu.ba
Adem Karadag
Turkey
nuhadem@gmail.com
Abstract: Database auditing is one of the biggest issues in data security. Without
data auditing, business applications lose the trail of their business
processes. To cope with auditing, and to track operations and the actors of
those operations over time, we need historical data or a temporal database. Valid
and transaction times are the two important timestamps in a temporal database. In this
paper, we show methods for handling database auditing of business transaction
operations, their exact times and the performers of the operations. These strategies
are separated into two sets: those using relational databases and those using
semi-structured data.
Keywords: Database Audit, Historical Data
Introduction
It is crucial that a company, no matter how big it is, maintains the security of its
information. Since valuable data such as customers’ credit card data, designs and even
source code are frequently stolen, data should be protected at all
times. Keeping data safe means protecting its confidentiality, integrity and availability.
To ensure data security, there should be a security plan. Authentication and
administration can facilitate security up to a point (Mullins &amp; Craig, 2002). However,
there is also a need to keep log files and check them separately from the database.
Database auditing was therefore introduced to inspect and maintain the audit trail. Data
servers help to create a database audit policy that keeps the database safe. In this way,
user entries can be controlled. This paper presents some techniques showing how to base
database auditing on historical data. The paper is divided into 4 parts. Parts 2
and 3 review the literature on historical data and auditing and outline several
approaches to database auditing. We used a relational database (Grad, 2013)
to represent the row-based, column-based and log-file auditing strategies.
Database Auditing
Database auditing involves inspecting a database to monitor and review the actions of
database users. In this way, an auditor can see manipulations, corruption or glitches
in the data. Database audit also refers to a professional database auditing solution
that makes it possible to track and inspect any database activity, including access,
logins, security breaches, user activities, and insertion, deletion or modification of
data. Recently, a framework for accurate data auditing has been introduced with respect
to data retention strategies (Lu &amp; Miklau, 2009). Under retention restrictions, a
formula is applied to audit data in the protected history. In this way, database audits
become more accurate.
ICESoS 2016 - Proceedings Book 283

International Conference on Economic and Social Studies (ICESoS’16)
It is important to detect changes that deviate from the standard. To distinguish
normal behavior on the data and obtain better audit results, data mining
techniques are generally applied. This method can only detect the static actions of
the user. This disadvantage can be addressed by tracking all activities of the user in a
data audit system. As a result, an anomaly detection method was introduced to model
the normal behavior of the user (Park &amp; Lee, 2008). In this way, normal behavior can
be easily differentiated from suspicious behavior.
To teach database security and auditing and give students a better
understanding of it, hands-on lab studies have been set up (Luebbers, Grimmer, &amp;
Jarke, 2003). In these studies, various database scenarios are used to put the theory
of database protection into practice.
Historical Data
Historical data is the information outlining activity, conditions and trends in a company’s
past database. Historical data is often archived, and may be held in non-volatile,
secondary storage. Historical data can be useful in helping to predict the future of a
company and a market, as when conducting predictive analyses.
Table 1: Operational Student Table Referenced By Student-History Table For Row-Based Auditing
Student Number | Name | Birth | Address | Registration Date | Fee
445 | Zeynep | 10.10.1988 | Ankara | 15.01.2008 | 2400
822 | Mahmut | 12.09.1990 | Istanbul | 01.09.2010 | 2600
544 | Ayşe | 15.05.1991 | Istanbul | 01.09.2011 | 2600


It is very important to detect who made changes such as inserting new data,
manipulating data or deleting data in the database. In this way, a sound data audit can
be obtained. The time and the user are both important for analyzing the modification of
data. The question of when an action happened can be answered by valid and transaction
times. One study states that valid and transaction times should ensure that no
data is lost (Bhargava &amp; Gadia, 1993).
Arranging Historical Data For Auditing On Relational Database
There are several ways to design historical data in a relational database (Margaret
Rouse, 2015), such as separate tables for recording past data and transaction log files.
Arranging a separate table for each relational database table is an easy
way to track the changes to each item. With both strategies, the
original data tables are not changed. We present three ways of supplying historical
data for database auditing: auditing at row level, at column level and with a log table.
Database Audit on a Row Level
Our original relational tables stay the same, but we create a separate table for each table
in order to apply data auditing. The operational “Student” table shown in Table 1 supplies the
current data of each student for operations. There are two kinds of data in this table:
�Regional Economic Development: Entrepreneurship and Innovation
static and operational data. Static data, such as Student
Number, Registration Date or Name, stays the same or rarely changes. Historical or operational data,
such as the student’s address, can be updated continuously. A static query, which is always used,
stays the same when reading the data from “Student”. Table 2 is an auditing table that includes
all the students’ data in the operational table. Two timestamps are needed for the valid time:
we need to know the beginning and ending times to follow the life cycle of the data.
Besides the valid time, we need to store the operation type, to reduce the complexity
of comparing histories of the same data, and the user, to hold him or her responsible
for the action.
The history of the “Student” table is shown in Table 2. It can be seen from the history table that Ali
Oz has been a student since 01.09.2005. The user Mustafa updated his fee twice,
increasing it by $100 each time, and updated his address from Istanbul to Adana.
Ali finished school and his record was deleted from the Student table by Semih.
Ahmet moved from Hatay to Ankara on 23.09.2008 and his record was terminated in
January 2009 by Mustafa. Zeynep’s fee was increased by $100 by Mustafa. Finally,
Semih added two new students, Mahmut and Ayşe, to the Student table.
Table 2: Operational Student Table Referenced By Student-History Table For Row-Based Auditing
Student Number | Name | Birth | Address | Regist. Date | Fee | Begin | End | Op | User
966 | Ali | 21.04.1986 | Istanbul | 01.09.2005 | 2300 | 01.09.2005 | 01.09.2007 | I | Mustafa
966 | Ali | 21.04.1987 | Adana | 01.09.2005 | 2300 | 01.09.2007 | | U | Mustafa
966 | Ali | 21.04.1988 | Adana | 01.09.2005 | 2450 | 01.09.2007 | 23.06.2008 | D | Semih
855 | Ahmet | 11.05.1986 | Hatay | 01.09.2006 | 2350 | 21.09.2007 | 01.09.2008 | I | Semih
855 | Ahmet | 11.05.1986 | Ankara | 01.09.2006 | 2350 | 23.09.2008 | 15.01.2009 | D | Mustafa
445 | Zeynep | 10.10.1988 | Ankara | 15.01.2008 | 2300 | 15.01.2008 | 15.06.2010 | I | Mustafa
445 | Zeynep | 10.10.1988 | Ankara | 15.01.2008 | 2400 | 15.01.2008 | | U | Mustafa
822 | Mahmut | 12.09.1990 | Istanbul | 01.09.2010 | 2600 | 01.09.2010 | | I | Semih
544 | Ayşe | 15.05.1991 | Istanbul | 01.09.2011 | 2600 | 01.09.2011 | | I | Semih

The records of the operational table and the audit table are identical. Data is repeated in different
rows, but this is kept for the sake of historical queries.
Database auditing at row level has advantages and drawbacks. It is easy
to apply. When the user wants to insert, update or delete something in
the operational table, the program can simply copy all the values in the record into
the historical table. In addition, the End column should be updated with the operation.
This can be done by the database itself, as described by Yang (2009).
The drawback is that redundancy makes the system complicated. Also,
retrieving historical data requires a comparison between the operational table and
the auditing table by means of a nested query that joins the history table with itself.
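-- Fee history of student 966: each fee with its earliest begin and latest end.
SELECT S1.fee, S3.MINS, S3.MAXS, S1."user", S1.operation
FROM Student_HISTORY_R S1
JOIN ( SELECT S2.fee, MIN(S2."begin") AS MINS, MAX(S2."end") AS MAXS
       FROM Student_HISTORY_R S2
       WHERE S2."Student Number" = 966 GROUP BY S2.fee ) S3 ON S1.fee = S3.fee
-- Restrict the outer rows too, so other students with the same fee are excluded.
WHERE S1."Student Number" = 966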
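The copy-on-change step described above for row-level auditing can be sketched as follows. This is a minimal illustration in Python with SQLite, not the implementation used in this paper: the table and column names (student_history_r, begin_date, end_date, op, user_name), the audit_row helper and the ISO date strings are assumptions made for the example. Each change closes the currently open history row and then copies the whole current record into the history table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, address TEXT, fee INTEGER);
CREATE TABLE student_history_r (
    student_number INTEGER, name TEXT, address TEXT, fee INTEGER,
    begin_date TEXT, end_date TEXT, op TEXT, user_name TEXT);
""")

def audit_row(conn, student_number, op, user_name, when):
    """Close the open history row, then copy the current row into the history."""
    conn.execute(
        "UPDATE student_history_r SET end_date = ? "
        "WHERE student_number = ? AND end_date IS NULL",
        (when, student_number))
    conn.execute(
        "INSERT INTO student_history_r "
        "SELECT student_number, name, address, fee, ?, NULL, ?, ? "
        "FROM student WHERE student_number = ?",
        (when, op, user_name, student_number))

# Insert, then update, mirroring the first two history rows of student 966.
conn.execute("INSERT INTO student VALUES (966, 'Ali', 'Istanbul', 2300)")
audit_row(conn, 966, "I", "Mustafa", "2005-09-01")
conn.execute("UPDATE student SET address = 'Adana' WHERE student_number = 966")
audit_row(conn, 966, "U", "Mustafa", "2007-09-01")

history = conn.execute(
    "SELECT address, fee, begin_date, end_date, op, user_name "
    "FROM student_history_r ORDER BY begin_date").fetchall()
for row in history:
    print(row)
```

In this sketch the whole record is copied on every change, which is what makes the row-level history redundant. A deletion would have to be audited before the operational row is removed, since the helper reads the current record from the operational table.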
Database Audit on Column Level
A column-level audit does not include the redundant data seen in the row-level audit. This
historical table does not contain static data such as birth date and registration date. The
auditing table stores only the changed data, plus the primary key, such as the student number,
which is required to link the data to the operational table. The student history in Table 3
keeps just the changed data and is less redundant than Table 2. Student
number 966, Ali, moved from Istanbul to Adana on 01.09.2007 and had his fee raised from
2300 to 2450 on 01.09.2007. Selecting the non-null values of a particular auditing column in a SELECT
statement displays only the actual changes. For example,
SELECT fee, "begin", "end", "user", operation FROM Student_HISTORY_C
WHERE "Student Number" = 966 AND fee IS NOT NULL
The query displays the audit trail of Ali’s fee. Compared with row-based auditing for
the same query, the SELECT statement is much less complex.
Each record in the column-based auditing table cannot contain more than one
value of historical data, because the end time of each audited value is uncertain.
Table 3: Student_History_C Table Using Column-Based Auditing
Student Number | Address | Fee | Begin | End | Operation | User
966 | Istanbul | | 01.09.2005 | 01.09.2007 | I | Mustafa
966 | Adana | | 01.09.2007 | | U | Mustafa
966 | | 2450 | 01.09.2007 | 23.06.2008 | D | Semih
855 | Hatay | | 21.09.2007 | 01.09.2008 | I | Semih
855 | Ankara | | 23.09.2008 | 15.01.2009 | D | Mustafa
445 | Ankara | 2300 | 15.01.2008 | 15.06.2010 | I | Mustafa
445 | | 2400 | 15.01.2008 | | U | Mustafa
822 | Istanbul | 2600 | 01.09.2010 | | I | Semih
544 | Istanbul | 2600 | 01.09.2011 | | I | Semih

Because it is less complicated, the column-level audit is faster, and it also uses less disk space.
However, the many NULL values can cause other issues when writing queries.
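The column-level idea can be sketched in the same way. The example below is an illustration under assumptions, not the paper's implementation: the column names (begin_date, end_date, op, user_name), the audit_change helper and the sample dates are hypothetical and only loosely follow Table 3. Only the audited columns that actually changed are stored, the others stay NULL, and a history query filters on IS NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student_history_c (
    student_number INTEGER, address TEXT, fee INTEGER,
    begin_date TEXT, end_date TEXT, op TEXT, user_name TEXT)
""")

def audit_change(conn, student_number, changes, op, user_name, when):
    """Record only the changed audited columns; untouched columns stay NULL."""
    conn.execute(
        "INSERT INTO student_history_c VALUES (?, ?, ?, ?, NULL, ?, ?)",
        (student_number, changes.get("address"), changes.get("fee"),
         when, op, user_name))

# Illustrative values: an insert sets both audited columns, an update only the fee.
audit_change(conn, 445, {"address": "Ankara", "fee": 2300}, "I", "Mustafa", "2008-01-15")
audit_change(conn, 445, {"fee": 2400}, "U", "Mustafa", "2010-06-15")

fee_history = conn.execute(
    "SELECT fee, begin_date, op, user_name FROM student_history_c "
    "WHERE student_number = ? AND fee IS NOT NULL ORDER BY begin_date",
    (445,)).fetchall()
address_history = conn.execute(
    "SELECT address, begin_date FROM student_history_c "
    "WHERE student_number = ? AND address IS NOT NULL", (445,)).fetchall()
print(fee_history)
print(address_history)
```

The address query returns a single row because the fee update left the address column NULL, which is the saving in redundancy, and also the source of the NULL handling that every query on this table has to repeat.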
Auditing on Log Table
A log table that tracks changes to a system is also referred to as an audit, as it gives a range
of information, such as the user, the data and the time of execution, that can be used to audit a
system. Relational Database Management Systems (RDBMSs) offer an audit option,
as in DB2 (IBM Knowledge Center, 2015), SQL Server (Stankovic, 2016) and Oracle servers
(Stackowiak, Bales, &amp; Greenwald, 2004), which helps database administrators to
maintain an audit trail (Logging, Auditing, and Monitoring the Directory) and save it in
a log file. However, log tables do not keep the end time of an operation. To address
this, there are two options.
Column Based Log Audit Tables for Operation Logs
We need to isolate the auditing log data from the operational data. To do this, we
create an additional table for each auditing column. For instance, if ADDRESS and FEE
in the STUDENT table are auditing columns, we create ADDRESS and FEE tables
for auditing purposes, as shown in Table 4 and Table 5. This approach has advantages:
it decreases the amount of auditing data and makes the tables easier to
analyze. However, the number of independent tables may increase.
Table 4: Audit Log Table For Address
PK | Student Number | Address | Begin | End | Op | User
1 | 966 | Istanbul | 01.09.2005 | 01.09.2007 | I | Mustafa
2 | 966 | Adana | 01.09.2007 | | U | Mustafa
3 | 855 | Hatay | 21.09.2007 | 01.09.2008 | I | Semih
4 | 855 | | 23.09.2008 | 15.01.2009 | D | Mustafa
5 | 822 | Istanbul | 01.09.2010 | | I | Semih
6 | 544 | Istanbul | 01.09.2011 | | I | Semih

Table 5: Audit Log Table For Fee
PK | Student Number | Fee | Begin | End | Op | User
1 | 966 | 2300 | 01.09.2007 | | U | Mustafa
2 | 966 | 2450 | 01.09.2007 | 23.06.2008 | D | Semih
3 | 445 | 2300 | 15.01.2008 | 15.06.2010 | I | Mustafa
4 | 445 | 2400 | 15.01.2008 | | U | Mustafa
5 | 822 | 2600 | 01.09.2010 | | I | Semih
6 | 544 | 2600 | 01.09.2011 | | I | Semih

One Log Audit Table for Operation Logs
To bring the audit data together in one place, we collect every auditing column from all
operational tables into a single auditing log table. The audit log table consists
of the name of the table and column, the Student ID of the record in the operational table,
the changed value, the begin time, the operation that caused the change and the name of the user
responsible for it.
An example of a single audit log table for a database containing Student and Faculty
tables is shown in Table 6. All changes made to the tables are recorded in the
single audit log table. A single insertion of Student number 966 into the Student table
results in two insertions into the audit log table, one log record for Address
and another for Fee, if the Student table has two auditing columns. An update of an
auditing attribute inserts one auditing record into the log table. Deletion behaves
like insertion: the deletion of a record is logged twice in the audit log table
if there are two auditing columns, for example the deletion of
Student 966 in Table 6.
Table 6: One Audit Log Table For Every Table; Student And Faculty In Database
PK | Student Number | Table | Column | Value | Begin | End | Op | User
1 | 966 | Student | Address | Adana | 01.09.2007 | | I | Mustafa
2 | 966 | Student | Fee | 2450 | | 23.06.2008 | D | Mustafa
4 | 855 | Student | Address | Hatay | 21.09.2007 | 01.09.2008 | I | Semih
5 | 855 | Student | Fee | 2350 | 23.09.2008 | 15.01.2009 | D | Semih
6 | 445 | Student | Address | Ankara | 15.01.2008 | 15.06.2010 | I | Mustafa
7 | 445 | Student | Fee | 2300 | 15.01.2008 | | I | Mustafa
8 | 445 | Student | Fee | 2400 | 01.09.2010 | | U | Semih
9 | 822 | Student | Address | Istanbul | 01.09.2010 | | I | Semih

End

13.04.2011
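PK | Student Number | Table | Column | Value | Begin | End | Op | User
10 | 822 | Student | Fee | 2600 | 01.09.2010 | | I | Mustafa
11 | 544 | Student | Address | Istanbul | 01.09.2011 | | I | Mustafa
12 | 544 | Student | Fee | 2600 | 01.09.2011 | | I | Semih
13 | 221 | Faculty | Manager | 108 | 01.01.2012 | | I | Semih
14 | 103 | Faculty | Manager | 120 | 21.06.2013 | | U | Mustafa
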

The audit log table grows large if there are numerous auditing columns from various
tables. Separating the data by column, or having a single audit log table for each
subsystem, is therefore suggested. Both approaches require additional processing of the
auditing data for each operation on the database. Database
engines already maintain their own log tables, so with this additional processing the overall
system will be slowed down.
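The single audit log table can be sketched as follows. This is a minimal illustration, not the paper's implementation: the audit_log schema, the AUDITED_COLUMNS registry and the log_operation helper are assumptions made for the example, and dates are written as ISO strings. One log record is written for each audited column that the operation touches, so inserting a student with two audited columns produces two log rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE audit_log (
    pk INTEGER PRIMARY KEY AUTOINCREMENT,
    record_id INTEGER, table_name TEXT, column_name TEXT, value TEXT,
    begin_date TEXT, op TEXT, user_name TEXT)
""")

# Hypothetical registry of which columns are audited in each operational table.
AUDITED_COLUMNS = {"Student": ("Address", "Fee"), "Faculty": ("Manager",)}

def log_operation(conn, table_name, record_id, row, op, user_name, when):
    """Write one log record per audited column present in the affected row."""
    for column in AUDITED_COLUMNS[table_name]:
        if column in row:
            conn.execute(
                "INSERT INTO audit_log "
                "(record_id, table_name, column_name, value, begin_date, op, user_name) "
                "VALUES (?, ?, ?, ?, ?, ?, ?)",
                (record_id, table_name, column, str(row[column]), when, op, user_name))

log_operation(conn, "Student", 822, {"Address": "Istanbul", "Fee": 2600},
              "I", "Semih", "2010-09-01")
log_operation(conn, "Faculty", 221, {"Manager": 108}, "I", "Semih", "2012-01-01")

entries = conn.execute(
    "SELECT pk, record_id, table_name, column_name, value FROM audit_log ORDER BY pk"
).fetchall()
for entry in entries:
    print(entry)
```

Storing every value as text in one generic column keeps the log table uniform across operational tables, at the cost of losing the original column types.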
Conclusion
Operational tables and auditing tables should be kept apart from each other. In this way,
the database engine can run an auditing query much faster than on
a table that includes both operational and auditing data, where the overhead of checking which
partition to use for the query is added to the execution time. The database
administrator can also manage the database management system more easily.
There are many options for auditing a database. Some solutions are appropriate for
relational databases. Marketing databases, on the other hand, mostly use semi-structured databases.
Database auditing is a crucial issue for a company, not
only for security-related concerns but also for performance and reliability. Monitoring and
recording selected user actions on the database helps determine the future of the company’s
business. Overall, the security and reliability of the data can be sustained by a good
database auditing method.
References
• Bhargava, G., &amp; Gadia, S. K. (1993). Relational Database Systems with Zero
Information Loss. 5 (1), 76-87.
• Grad, B. (2013). Relational Database Management Systems: The Business
Explosion. IEEE Annals of the History of Computing, 35 (2), 8-9.
• IBM Knowledge Center. (2015, January 15). Retrieved April 30, 2016, from
ibm.com: http://www.ibm.com/support/knowledgecenter/#!/SSEPGG_8.2.0/welcome.html
• Lu, W., &amp; Miklau, G. (2009). Auditing a Database Under Retention Restrictions.
IEEE Inter. Conf. on Data Eng. (pp. 42-53). ICDE.
• Luebbers, D., Grimmer, U., &amp; Jarke, M. (2003). Systematic Development of Data
Mining-Based Data Quality Tools. Proc. of the 29th VLDB Conference, (pp. 548-559). Berlin.
• Margaret Rouse. (2015). Relational database management systems (RDBMS).
Retrieved April 5, 2016, from TechTarget:
http://searchsqlserver.techtarget.com/definition/relational-database-management-system
• Mullins, &amp; Craig. (2002). Database administration: the complete guide to
practices and procedures. Addison-Wesley.
• Park, N. H., &amp; Lee, W. S. (2008). Anomaly Detection over Clustering Multidimensional Transactional Audit Streams. IEEE International Workshop on
Semantic Computing and Applications (pp. 78-80). IWSCE.
• Stackowiak, R., Bales, D., &amp; Greenwald, R. (2004, August 26). Oracle Docs.
Retrieved April 30, 2016, from docs.oracle.com:
http://download.oracle.com/docs/cd/B14099_19/idmanage.1012/b14082/logging.htm#i126963
• Stankovic, I. (2016, April 5). SQL Server Audit (Database Engine). Retrieved April
30, 2016, from msdn.microsoft.com: https://msdn.microsoft.com/en- us/en%20us/library/cc280386.aspx
• Yang, L. (2009). Teaching Database Security and Auditing. SIGCSE, (pp. 241-245).

</text>
                  </elementText>
                </elementTextContainer>
              </element>
            </elementContainer>
          </elementSet>
        </elementSetContainer>
      </file>
    </fileContainer>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="1855">
                <text>3307</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="1856">
                <text>USING DATABASE AUDIT FOR ANALYZING ON HISTORICAL DATA</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="1857">
                <text>Hodzic, Adnan
Karadag, Adem</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="1858">
                <text>Abstract: Database auditing is one of the biggest issues in data security. Without data auditing, business applications lose the trail of their business procedures. To support auditing and to track operations and the actors of those operations over time, we need historical data or a temporal database. Valid time and transaction time are the two important timestamps in a temporal database. In this paper, we show methods for handling database auditing of business transaction operations, their accurate times, and the performers of the operations. These strategies are separated into two sets: those using relational databases and those using semi-structured data.</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="1859">
                <text>2016</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="1860">
                <text>Conference or Workshop Item
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="6">
        <name>H Social Sciences (General)</name>
      </tag>
    </tagContainer>
  </item>
  <item itemId="3504" public="1" featured="0">
    <fileContainer>
      <file fileId="4320">
        <src>https://omeka.ibu.edu.ba/files/original/41cddaabb1e237ad0c086dbca13071d0.pdf</src>
        <authentication>369ba7ca22ee871fc9b74adde3cf1d69</authentication>
        <elementSetContainer>
          <elementSet elementSetId="4">
            <name>PDF Text</name>
            <description/>
            <elementContainer>
              <element elementId="52">
                <name>Text</name>
                <description/>
                <elementTextContainer>
                  <elementText elementTextId="26578">
                    <text>Journal of Natural Sciences and Engineering, Vol. 3, (2020)
DOI number: 12.34567/JONSAE2020123

Using Exploratory Data Analysis and Big Data Analytics for Detecting Anomalies
in Cloud Computing
Ibrahim Muzaferija1, Zerina Mašetić1
1

International Burch University, Sarajevo, Bosnia and Herzegovina
ibrahim.muzaferija@stu.ibu.edu.ba
zerina.masetic@ibu.edu.ba

Abstract – While leveraging cloud computing for large-scale distributed applications allows
seamless scaling, many companies struggle to keep up with the amount of data generated in terms
of efficient processing and anomaly detection, which is a necessary part of the management of
modern applications. As the record of user behavior, weblogs naturally become an object of research
related to anomaly detection. Many anomaly detection methods based on automated log analysis
have been proposed. However, they have not been proposed in the context of big data applications, where anomalous behavior
needs to be detected in the understanding phases prior to modeling a system for such use. Big Data
Analytics often ignores anomalous points due to the high volume of data. To address this problem, we
propose a complementary methodology for Big Data Analytics – the Exploratory Data Analysis,
which assists in gaining insight into data relationships without the classical hypothesis modeling. In
that way, we can gain a better understanding of the patterns and spot anomalies. Results show that
Exploratory Data Analysis facilitates anomaly detection and the CRISP-DM Business
Understanding phase, making it one of the key steps in the Data Understanding phase.
Keywords - Cloud Computing, Big Data, Data Mining, Anomaly Detection

1. Introduction

With constant growth and advancements of the Internet, there are more systems connected to other
connected systems, constantly generating and exchanging data. That data is referred to as Big Data and is
constantly targeted by cyber-attacks as it contains sensitive and valuable information. The term “big data”
refers to data that is so large, complex, or rapid that it’s not possible to process using traditional
computing and data management tools. Big Data provides opportunities to improve research, operational
efficiency, and decision-support applications with increased value for digital applications [1]. At the same
time, Big Data poses challenges in storing, transporting, processing, mining, and serving the data. Data that is
high in volume, velocity, variety, and veracity must be processed with advanced analytical tools and
algorithms to reveal meaningful information and provide value.
Cloud computing represents the use of distributed and shared resources such as computing, storage,
networking, and analytical software, and provides fundamental support to address the challenges of Big
Data. Cloud computing serves both as a technological enabler and producer of big data [1].
Anomalies represent unusual behaviors that deviate from the normal. In efforts to increase cloud
computing reliability, anomaly detection poses a frequent problem in threat detection and identification,
as reported by the Cloud Security Alliance (CSA) [2]. CSA, the world’s leading organization
dedicated to securing cloud computing environments, conducts annual research with the aim of raising
awareness of threats, risks, and vulnerabilities in the cloud environment. In their latest (2019) report [3],
CSA re-examined the risks with cloud security and took a new approach, examining the problems in
configuration and authentication, rather than the traditional focus on vulnerabilities and malware,
highlighting the following threats:
1. Data Breaches
2. Misconfiguration and inadequate change control
3. Lack of cloud security architecture and strategy
4. Insufficient identity, credential, access, and key management
5. Account hijacking
6. Insider threat
7. Insecure interfaces and APIs
8. Weak control plane
9. Metastructure and applistructure failures
10. Limited cloud usage visibility
11. Abuse and nefarious use of cloud services

In this research, we aim to address the threats that can be traced in user logs (numbered 1, 4, 5, 6, 8, 9
and 11) by utilizing Big Data Analytics and Exploratory Data Analysis in order to discover anomalies and
contribute to increasing security in Cloud Computing applications.
2. Literature Review

Anomaly detection in the cloud infrastructure and big data environment has been the topic of many
research studies in the literature. Since the first introduction of cloud infrastructure in 2006 [4], cloud
computing has greatly impacted the industries. The rapid development of Internet and Big Data
technologies has resulted in increased service development on cloud computing, such as online banking
services, electronic news services, government information systems, mobile services, etc. These systems
handle sensitive and confidential data, making anomaly detection mechanisms one of their core security
requirements.
In the review paper by Arif Sari [4], [5], different techniques and mechanisms used in the detection of
anomalous activities within the cloud environment are described: threshold detection, statistical analysis,
rule-based measures, data mining, and machine learning. We aim to apply statistical techniques and EDA
(Exploratory Data Analysis) in order to discover anomalies.
In the “Big Data processing for Anomaly Detection” survey [6], Ariyaluran et al. present the details of the
comparative analysis and the relationship of three different domains, which are anomaly detection,
machine-learning algorithms, and real-time big data processing. This paper aims to contribute to
complemented techniques for anomaly detection. Once anomalies are detected, we can utilize Machine
Learning and real-time anomaly detection for future improvements.
In their research, Dalal and Rele [6], [7] emphasize the steps in creating effective and reliable
mechanisms for threat detection. They highlight the importance of the first CRISP-DM (Cross Industry
Standardized Process for Data Mining) phase named “Develop Business Understanding”, where reasons
for defects and answers for maintenance are taken into consideration. They discuss the phase “Analyze
Data and Data Dependencies” where the aim is to analyze, combine, and compare the data with the
present situation, without proposing EDA as a baseline for data understanding. Our work aims to employ
EDA in order to complement the methodology.
Also, they highlight the step named “Engage with Subject Matter Experts (SME’s)” for better dataset
examination and analysis of the anomaly situation, along with a grouping of the threat factors. By
employing these methods, we aim to set transparent expectations and bring clarity to our results. In
further research, we work closely with the application development technical lead, who serves as the SME and
helps clarify the log data, as well as the threats, anomalies and our results.
3. Methodology

The research is implemented using a portion of the CRISP-DM (Cross Industry Standardized Process for
Data Mining) methodology [8], which represents the common standards used by data scientists and data
mining experts in order to build analytical and machine learning models. Prior to analytical and machine
learning model creation, we need to construct a clean dataset of user behavior with anomalies labeled for
future modeling. To do so, in this research we focus on the first three phases: Business Understanding,
Data Understanding, and Data Preparation, as highlighted with red color in the figure below. Modeling
and subsequent phases are researched in our extended study of anomaly detection in cloud computing.


Figure 1. CRISP-DM workflow
In the Business Understanding phase, the goal is to determine business objectives, assess the situation
from a business perspective, discuss with subject matter experts, determine data mining goals, and
produce a project plan. In the Data Understanding phase, we collect and select raw data, describe and explore
the data, consult with subject matter experts, and verify data quality. In the Data Preparation phase, which
is often the most time-consuming phase, we select and clean the data, format data, and construct a clean
dataset.
We approach the mentioned phases using Big Data Analytics and Exploratory Data Analysis (EDA). Big
Data Analytics examines large amounts of data in a non-traditional manner, that is using distributed and
shared resources to support the data quantity and complexity [8], [9]. Exploratory Data Analysis [10] is
an approach to analyzing data in order to summarize their main characteristics and uncover the underlying
structure using statistical and visual methods.
3.1. Data Collection and Selection
Cloud-based enterprise web application logs are produced by multiple servers and services and are
streamed to the Elasticsearch [11] service, an open-source search and analytics engine for all types of data.
Elasticsearch is distributed, fast, and scalable, which makes it an ideal environment for big data ingestion,
enrichment, storage, analysis, and visualization.


Figure 2. Raw data access from Kibana
Raw data is accessed by locally restoring the Elasticsearch cluster snapshot taken for a period of three
months. The cluster contains around 20 GB of semi-structured data collected from different application
services and levels, indexed by a timestamp. Application logs are mapped to 175 attributes and accessed
using Kibana [12], the Elastic Stack service for data analysis and visualization.
Attribute selection is a part of the “Business understanding” and “Data understanding” phases,
implemented together in consultation with the application development technical lead, i.e., the subject matter
expert (whom we refer to as the SME). The attributes describing the user’s application usage that were the
most relevant for anomaly detection are selected for further analysis. The following table displays
statistical information for selected attributes.


Table 1. Selected data statistical information

Attribute name | Description | Data type | Range | Missing
timestamp | Timestamp | Date Time | [2020-01-05 21:17, 2020-03-26 21:06] | 0.0 %
account_id | Account ID, unique company account identifier | Nominal | f6afd09c-****-****-****-c30a935ccc37, ... | 8.87 %
client_country | User country | Nominal | BA, US, ... | 9.53 %
company_name | Company Name | Nominal | Company A, Company B, ... | 10.17 %
platform | Application platform | Nominal | BrowserMNC, BackendMNC, ... | 0.0 %
principal_id | User email | Nominal | developer@**.com, ... | 9.64 %
remote_address | User IP address | Nominal | [0.0.0.0 - 255.255.255.255] | 9.12 %
user_agent | User-agent | Nominal | Mozilla/5.0 (Windows NT 10.0; Win64; x64) … , ... | 0.0 %
error_message | Error message | Nominal | validation error, auth error, ... | 99.96 %
message | Log message | Nominal | Profiling, FrontTimings, ... | 0.18 %
level | Log level | Nominal | Info, error | 0.0 %
path | Parameterized resource request | Nominal | PUT /customer/***/ticket/***, ... | 99.78 %
resource | Request | Nominal | (GET) /invoices, ... | 0.0 %
status_code | Response code | Nominal | 200, 404, ... | 10.17 %

Once the relevant data is selected, we utilize the Elastic Stack service Logstash [13] to collect the
data, that is, to obtain the initial dataset in CSV format for further work.

3.2. Data Cleansing and Engineering
In order to gain insight into data quality, graphical and statistical methods were used to detect
anomalies, faults, outliers, missing values, etc. Moreover, we engineer new attributes in order to increase
interpretability or decrease data complexity. Exploratory Data Analysis assists in understanding the
relations between attributes and allows us to spot tendencies, as well as to identify the necessary cleaning
steps we have to take.
First, we apply filters to remove log data from automated services, such as health checks and other
application services that do not reflect the user’s interactions. Next, we remove attributes that contain a
high fraction of missing values, because such attributes carry negligible information.
Values of the “status_code” attribute are mapped to the corresponding descriptions for better interpretability.
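The two cleaning steps above, dropping sparsely populated attributes and mapping status codes to descriptions, can be sketched as follows. This is an illustration under assumptions rather than the processing used in this study: the sample records, the missing_fraction helper and the 0.5 cut-off are hypothetical (the threshold actually applied is not stated), and the status codes are assumed to be standard HTTP codes, which are mapped with Python's standard http.HTTPStatus.

```python
from http import HTTPStatus

# Hypothetical log records; None marks a missing value.
records = [
    {"level": "Info", "status_code": 200, "error_message": None},
    {"level": "Info", "status_code": 404, "error_message": None},
    {"level": "error", "status_code": 500, "error_message": "auth error"},
    {"level": "Info", "status_code": 200, "error_message": None},
]

def missing_fraction(rows, attribute):
    """Share of rows in which the attribute is absent or None."""
    missing = sum(1 for row in rows if row.get(attribute) is None)
    return missing / len(rows)

MAX_MISSING = 0.5  # assumed cut-off; the threshold used in the study is not stated
attributes = list(records[0])
kept = [name for name in attributes if MAX_MISSING >= missing_fraction(records, name)]

cleaned = []
for row in records:
    new_row = {name: row[name] for name in kept}
    # Map the numeric code to its standard reason phrase, e.g. 404 to "Not Found".
    new_row["status_code"] = HTTPStatus(row["status_code"]).phrase
    cleaned.append(new_row)

print(kept)
print(cleaned[1])
```

In this sample, error_message is missing in three of four rows and is therefore dropped, in the same spirit as the attributes with more than 99 % missing values in Table 1.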
We engineer new attributes: “resource_method”, “resource_base” and “user_os”. The “resource_method”
and “resource_base” attributes are created from the values of the “resource” attribute by using regular
expressions to extract the relevant information. The “user_os” attribute is created in a similar manner,
extracting the relevant information using regular expressions from the “user agent” attribute. Creation of
these attributes allows us to focus on the most relevant information and decrease the cardinality of
original attributes.
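The attribute engineering above can be sketched as follows. This is an illustration under stated assumptions, not the exact expressions used in the study: the pattern assumes “resource” values of the form “(GET) /invoices”, the operating system rules are a hypothetical short list, and status codes are described with the phrases from the Python standard library.

```python
import re
from http import HTTPStatus

# Assumed shape of a "resource" value: method in parentheses, then the path.
RESOURCE_RE = re.compile(r"^\((\w+)\)\s+/([^/?\s]*)")

# Hypothetical ordered rules for mapping a user-agent string to an OS label.
# Android is checked before Linux because Android user agents also mention Linux.
OS_RULES = [
    (re.compile(r"Windows NT"), "Windows"),
    (re.compile(r"Mac OS X"), "OS X"),
    (re.compile(r"Android"), "Android"),
    (re.compile(r"Linux"), "Linux"),
]

def split_resource(resource):
    """Return (resource_method, resource_base), or (None, None) if the value does not match."""
    match = RESOURCE_RE.match(resource or "")
    if match is None:
        return None, None
    return match.group(1).upper(), match.group(2)

def user_os(user_agent):
    """Return the first OS label whose pattern occurs in the user-agent string."""
    for pattern, label in OS_RULES:
        if pattern.search(user_agent or ""):
            return label
    return "Unknown"

def describe_status(code):
    """Map a numeric status code to a label such as "404 Not Found"; None if unknown."""
    try:
        status = HTTPStatus(int(code))
    except (TypeError, ValueError):
        return None
    return f"{status.value} {status.phrase}"

print(split_resource("(GET) /invoices"))
print(split_resource("(PUT) /customer/42/ticket/7"))
print(user_os("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
print(describe_status("404"))
```

Keeping only the method and the first path segment is what reduces the cardinality: every parameterized path under the same base collapses to one value.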
3.3. Dataset Creation
The clean dataset contains 16 attributes describing the application usage and 522,763 rows, with
timestamps ranging from 6th January to 26th March (81 days).
The data is imported into RapidMiner [14], a data science software platform that provides an integrated
environment for data preparation, visualization, machine learning, text mining, and predictive analytics. It
is open source and is used for commercial applications as well as for research, education, training, and rapid
prototyping.
In this phase, we continue with Exploratory Data Analysis in order to discover patterns beyond formal
modeling or hypothesis testing tasks. Our aim is to use the business understanding to improve our
understanding of the data and of the relationships between attributes in order to spot anomalous trends.
As the application is B2B based, we analyze the company data first: company account histogram,
statistics, and distribution. Next, we analyze the behavior of users in both the company and the general context. By
analyzing the “user” and “user_domain” attributes, we spot trends in usage and behavior within each company.
Analysis of application resource requests allows us to understand usage in the general context.
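Outside RapidMiner, the histograms used in this phase amount to counting the values of a nominal attribute. The sketch below assumes hypothetical log rows held as dictionaries; the helper names are illustrative.

```python
from collections import Counter

# Hypothetical log rows; only the attributes needed for the histograms are shown.
rows = [
    {"company_name": "Company A", "resource_base": "invoices"},
    {"company_name": "Company A", "resource_base": "customers"},
    {"company_name": "Company B", "resource_base": "invoices"},
    {"company_name": "Company A", "resource_base": "invoices"},
]

def value_counts(rows, attribute):
    """Count how often each value of a nominal attribute occurs."""
    return Counter(row[attribute] for row in rows)

def print_histogram(counts):
    """Print one bar of '#' characters per value, most frequent first."""
    for value, n in counts.most_common():
        print(f"{value:12s} {'#' * n} ({n})")

company_counts = value_counts(rows, "company_name")
print_histogram(company_counts)
print_histogram(value_counts(rows, "resource_base"))
```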


Figure 3. Counts of application resource requests
From the figure above, we can spot trends and further analyze the resource usage. Each resource request
represents a user action and is therefore highly valuable in the context of anomaly detection. Moreover, granular
analysis facilitates business understanding, as we gain deeper insight into user-generated data.
Next, we analyze the application errors, which are often among the most informative attributes for
anomaly detection. Anomalies and cyber-attacks often cause application errors, so analyzing error data allows us to
quickly distinguish between application anomalies, user anomalies, and
possible threats.

Figure 4. Application error logs histogram


Figure 5. Application logs status codes histogram
Application status codes are highly correlated with application resource usage. By analyzing status codes,
we gain insight into application performance and usage trends. Anomalies are most visible when
analyzing the status codes.
Dataset creation concludes with the creation of an “anomaly” attribute, which indicates whether a
specific application log instance is anomalous. The criteria for creating this attribute are drawn from
the findings of EDA and confirmed through consultations with the SME. By addressing the
CRISP-DM phases of Business Understanding, Data Understanding, and Data Preparation with the
application of Exploratory Data Analysis, we are able to discover anomalies in application usage and user
behavior.
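The labeling step can be pictured as a small set of rules applied to each log row. The rules below are hypothetical placeholders chosen only for illustration; the actual criteria were derived from EDA and confirmed with the SME, and are not reproduced here.

```python
# Hypothetical placeholder: request methods belonging to a service that is no longer in use.
RETIRED_METHODS = {"PATCH"}

# Each rule is a (name, predicate) pair evaluated on a single log row.
RULES = [
    ("error_level", lambda row: row.get("level", "").lower() == "error"),
    ("retired_method", lambda row: row.get("resource_method") in RETIRED_METHODS),
]

def label_row(row):
    """Return a copy of the row with a boolean "anomaly" attribute and the rules that fired."""
    reasons = [name for name, rule in RULES if rule(row)]
    labelled = dict(row)
    labelled["anomaly"] = bool(reasons)
    labelled["anomaly_reasons"] = reasons
    return labelled

normal = label_row({"level": "info", "resource_method": "GET"})
flagged = label_row({"level": "error", "resource_method": "PATCH"})
print(normal["anomaly"], normal["anomaly_reasons"])
print(flagged["anomaly"], flagged["anomaly_reasons"])
```

Recording which rule fired keeps the label traceable, which helps when the criteria are reviewed with a domain expert.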
4. Results and Discussion

As the web application has a business-to-business context, we approach the analysis of log data from a company
perspective. We find that the companies using the application can be segmented by their application usage into
three categories: heavy, medium, and light users, as shown in Figure 6. Heavy users are the
companies responsible for application development and support. Medium users are the companies
with frequent application usage, while light users are the companies that are onboarding to the
application or are in the initial phases of application usage. Distinguishing companies by their level of usage
helps us build a better business understanding. Because the level of application usage per
company is unbalanced, we can expect a higher number of anomalies for heavy users, while companies with
medium and light usage may show fewer anomalies. The percentage of
anomalies varies between companies with no specific pattern.

Figure 6. Application usage per company
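A possible way to derive the three usage segments is to bin each company by its share of all requests. The cut-off values below are hypothetical; the study does not state numeric thresholds.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut-offs on a company's share of all requests.
CUTS = [0.05, 0.25]
LABELS = ["light", "medium", "heavy"]

def segment_companies(company_per_row):
    """Map each company to a usage segment based on its share of the log rows."""
    counts = Counter(company_per_row)
    total = sum(counts.values())
    return {
        company: LABELS[bisect_right(CUTS, n / total)]
        for company, n in counts.items()
    }

# Hypothetical traffic: one company name per log row.
traffic = (
    ["Company A"] * 60 + ["Company B"] * 30 + ["Company C"] * 8 + ["Company D"] * 2
)
segments = segment_companies(traffic)
print(segments)
```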
When analyzing the histogram of application resource methods through the “resource_method” attribute,
we find an anomalous request pattern, as shown in Figure 7. Consultations with the SME revealed
that the anomalous resource request method corresponds to a service whose use has ceased, so the service
behavior can be identified as an anomaly.


Figure 7. Application resource methods histogram anomaly
When analyzing individual users, we segment them per company using the domain name in the user
email address. The histogram of user domains contributes to business understanding, as we can spot user
trends for each company. In the figure below, we present the user domain histogram focused on
anomalous application usage from unknown domains. We discover that usage from unknown domains tends
to increase during the monthly peaks of application usage.
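The segmentation by email domain can be sketched as below. The allow-list of known customer domains is a hypothetical assumption; in practice it would come from the company account data.

```python
from collections import Counter

# Hypothetical allow-list of customer domains; anything else counts as unknown.
KNOWN_DOMAINS = {"company-a.example", "company-b.example"}

def user_domain(email):
    """Return the lower-cased part after the last '@', or None if there is none."""
    _, sep, domain = (email or "").rpartition("@")
    return domain.lower() if sep and domain else None

def unknown_domain_counts(emails):
    """Count requests per domain for domains that are not on the allow-list."""
    domains = (user_domain(email) for email in emails)
    return Counter(d for d in domains if d and d not in KNOWN_DOMAINS)

emails = ["qa@gmail.com", "dev@Company-A.example", "x@GMAIL.com", "broken"]
unknown = unknown_domain_counts(emails)
print(unknown)
```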

Figure 8. User domain histogram focused on unknown domains
Consultations with the SME clarified that unknown domains such as “gmail.com”, “hotmail.com”, and
“outlook.com” are used by quality assurance developers, and these were marked as such. This
further decreased the number of visits from unknown domains. Moreover, the consultations showed that users from
unknown domains belong to companies in the trial phase, that is, the application demonstration phase, and are still
eligible for anomaly detection. Application usage from other user domains is distributed as expected: two
development companies account for most of the traffic, while the others are medium and light users.

Figure 9. Log message histogram anomalies
In the figure above, we present the log message histogram with the revealed anomalies. We
find that the anomalies are caused by application development or, more specifically, by integration attempts with
other companies using the application.
In the figure below, we present the results of the correlation analysis of the dataset. The correlation matrix
shows increased correlation between attributes such as “platform” and “message”. These results help us
identify and discard highly correlated attributes and thus decrease the dataset complexity.
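The text does not state which coefficient underlies the correlation matrix. For two nominal attributes, one common choice is Cramér's V, shown here purely as an assumed stand-in for the measure computed in RapidMiner.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V for two equally long sequences of nominal values (0 = none, 1 = perfect)."""
    n = len(xs)
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    k = min(len(x_counts), len(y_counts)) - 1
    if not n or not k:
        return 0.0
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical values in which "message" is fully determined by "platform".
platform = ["Browser", "Browser", "Backend", "Backend"]
message = ["FrontTimings", "FrontTimings", "Profiling", "Profiling"]
print(cramers_v(platform, message))
```

A value near 1 for a pair of attributes suggests that one of them adds little information and is a candidate for removal.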


Figure 10. Correlation matrix
The correlation matrix also shows that the attributes “status_code” and “level” are correlated to some degree. This
indicates that application errors can be traced to application status codes. The figure below shows the status
code histogram focused on error status codes, in which we can spot the error trends and
identify the error sources.

Figure 11. Status code histogram focused on error status codes
The anomalies found through EDA are used to create a labeled dataset for anomaly
detection purposes. The dataset can serve as a baseline for creating various analytical and machine
learning anomaly detection models, such as frequency threshold detection, supervised anomaly prediction,
and unsupervised anomaly detection. Table 2 presents the statistical information of the final dataset.
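As an example of the simplest model type mentioned above, frequency threshold detection can be sketched as flagging keys (for instance remote addresses) whose request count lies far above the average. The threshold rule (mean plus z population standard deviations) and the sample addresses are assumptions made for illustration only.

```python
import statistics
from collections import Counter

def frequency_outliers(keys, z=3.0):
    """Return keys whose event count exceeds mean + z * population std of all counts."""
    counts = Counter(keys)
    values = list(counts.values())
    if not values:
        return {}
    limit = statistics.mean(values) + z * statistics.pstdev(values)
    return {key: n for key, n in counts.items() if n > limit}

# Hypothetical traffic: nine addresses with two requests each and one with fifty.
addresses = [f"192.0.2.{i}" for i in range(1, 10) for _ in range(2)]
addresses += ["192.0.2.99"] * 50
outliers = frequency_outliers(addresses, z=2.0)
print(outliers)
```

With only a handful of distinct keys, a large z can never be exceeded, so the multiplier has to be chosen with the number of keys in mind.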


Table 2. Dataset statistical information

Attribute name | Type | Missing | Least / Min | Most / Max | Range
timestamp | Date and time | 0 | Jan 6, 2020 6:18 AM | Mar 26, 2020 9:06 PM | 80d 14h 48min
account_id | Nominal | 3 | 58710 (3) | 12345 (131,132) | 12345, c84c286[...]ffea5, [52 more]
company_name | Nominal | 3 | Company XYZ (3) | Company A (131,132) | Company A, Company B, [52 more]
country | Nominal | 3 | XX (29) | US (399,465) | US, BA, IN, [12 more]
platform | Nominal | 0 | Backend (45%) | Browser (55%) | Browser, Backend
user | Nominal | 6 | fk***@*.com (4) | fs***@*.com (48,738) | fs***@*.com, de***@*.com, [209 more]
remote_address | Nominal | 3 | 184.*.*.22 (3) | 77.*.*.171 (41,561) | 77.*.*.171, 144.*.*.229, [302 more]
user_agent | Nominal | 0 | Mozilla/[...]4.1 (3) | Mozilla/[...]ri/537.36 (77,449) | Mozilla/[...]36, Mozilla/[...].0, [114 more]
error_msg | Nominal | 467,225 | Getaddr[...].com (1) | ESOCKET[...]UT (89) | ESOCKET[...]UT, 502, [3 more]
level | Nominal | 0 | error (159) | info (467,225) | Info, Error
message | Nominal | 0 | Integ[...]led (159) | Profiling (264,851) | Profiling, frontTimigs, [1 more]
status_code | Nominal | 93 | 405 Method [...]ed (1) | 200 OK (453,461) | 200 OK, 204 No Content, [8 more]
resource_method | Nominal | 0 | PUT (97) | GET (373,123) | GET, POST, [3 more]
resource_base | Nominal | 0 | produ[...]ile (8) | endpoints (98,191) | endpoints, customers, [17 more]
user_domain | Nominal | 6 | C*** (272) | A*** (351,885) | A***, M***, [9 more]
user_agent_os | Nominal | 0 | Unknown (3) | Windows (411,762) | Windows, OS X, [2 more]
anomaly | Binominal | 0 | True (882) | False (466,502) | False, True

5. Conclusion

This study has shown that the use of Exploratory Data Analysis contributes to and complements the
implementation of three CRISP-DM methodology phases: business understanding, data understanding, and
data preparation. Moreover, we demonstrate that Exploratory Data Analysis is an efficient method for
detecting anomalies in big data. Summarizing data characteristics and discovering the underlying patterns of the
data and its distribution brings value to both the data understanding and data preparation phases. We confirm
the benefits of a method proven in previous studies: consultations with the SME play a crucial role in the
business understanding phase and make a valuable contribution to the data understanding phase. In addition,
consultations in the data understanding and data preparation phases facilitate the workflow and can help
us increase the value of the data.
Future efforts can be directed at implementing the subsequent CRISP-DM phases, that is, modeling,
evaluation, and deployment. Modeling data using Machine Learning techniques enables complex pattern
discovery, which suits big data datasets, and further improves anomaly detection, as the underlying
mathematical relationships can be leveraged. While this has been shown in the majority of studies conducted
in the field of anomaly detection and supervised machine learning, we propose using unsupervised
machine learning to find new anomalies. This would enable the creation of an extended labeled dataset,
which can then be used to build a supervised machine learning model for anomaly detection and
prediction.

6. References

[1] “Big Data and cloud computing: innovation opportunities and challenges.” [Online]. Available:
https://www.tandfonline.com/doi/full/10.1080/17538947.2016.1239771. [Accessed: 04-Sep-2020]
[2] “Cloud Security Alliance (CSA).” [Online]. Available: https://cloudsecurityalliance.org/. [Accessed:
04-Sep-2020]
[3] “Top Threats to Cloud Computing: Egregious.” [Online]. Available:
https://cloudsecurityalliance.org/artifacts/top-threats-to-cloud-computing-egregious-eleven/.
[Accessed: 04-Sep-2020]
[4] “About AWS.” [Online]. Available: https://aws.amazon.com/about-aws/. [Accessed: 04-Sep-2020]
[5] A. Sari, “A Review of Anomaly Detection Systems in Cloud Networks and Survey of Cloud
Security Measures in Cloud Storage Applications,” Journal of Information Security, vol. 6, no. 2,
pp. 142–154, Mar. 2015.
[6] “Real-time big data processing for anomaly detection: A Survey,” Int. J. Inf. Manage., vol. 45, pp.
289–307, Apr. 2019.
[7] “Cyber Security: Threat Detection Model based on Machine learning Algorithm - IEEE Conference
Publication.” [Online]. Available: https://ieeexplore.ieee.org/document/8724096. [Accessed:
04-Sep-2020]
[8] “DMME: Data mining methodology for engineering applications – a holistic extension to the
CRISP-DM model,” Procedia CIRP, vol. 79, pp. 403–408, Jan. 2019.
[9] “A Reference Model for Big Data Analytics.” [Online]. Available:
https://www.researchgate.net/publication/327728739_A_Reference_Model_for_Big_Data_Analytics.
[Accessed: 04-Sep-2020]
[10] “Exploratory data analysis.” [Online]. Available: https://psycnet.apa.org/record/2011-23865-003.
[Accessed: 04-Sep-2020]
[11] “Open Source Search: The Creators of Elasticsearch, ELK Stack &amp; Kibana.” [Online]. Available:
https://www.elastic.co/. [Accessed: 04-Sep-2020]
[12] “Kibana.” [Online]. Available: https://www.elastic.co/kibana. [Accessed: 04-Sep-2020]
[13] “Logstash.” [Online]. Available: https://www.elastic.co/logstash. [Accessed: 04-Sep-2020]
[14] “RapidMiner.” [Online]. Available: https://rapidminer.com/. [Accessed: 04-Sep-2020]

</text>
                  </elementText>
                </elementTextContainer>
              </element>
            </elementContainer>
          </elementSet>
        </elementSetContainer>
      </file>
    </fileContainer>
    <collection collectionId="3">
      <elementSetContainer>
        <elementSet elementSetId="1">
          <name>Dublin Core</name>
          <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
          <elementContainer>
            <element elementId="50">
              <name>Title</name>
              <description>A name given to the resource</description>
              <elementTextContainer>
                <elementText elementTextId="26245">
                  <text>Journal of Natural Sciences and Engineering</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="43">
              <name>Identifier</name>
              <description>An unambiguous reference to the resource within a given context</description>
              <elementTextContainer>
                <elementText elementTextId="26605">
                  <text>2637-2835</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="98">
              <name>DOI</name>
              <description>Digital object identifier</description>
              <elementTextContainer>
                <elementText elementTextId="26606">
                  <text>10.14706</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="45">
              <name>Publisher</name>
              <description>An entity responsible for making the resource available</description>
              <elementTextContainer>
                <elementText elementTextId="26607">
                  <text>International Burch University</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="41">
              <name>Description</name>
              <description>An account of the resource</description>
              <elementTextContainer>
                <elementText elementTextId="26608">
                  <text>Journal of Natural Sciences and Engineering (JONSAE) is a peer-reviewed, biannually published international journal focusing on empirical and theoretical research in all branches of Engineering and Natural Sciences. It is published on behalf of the Faculty of Engineering and Natural Sciences of International Burch University and aims to provide the best content by publishing original research papers, review articles, special issues, feature articles, and book reviews. All manuscript submissions are subject to initial appraisal by the Editor and, if found suitable for further consideration, to peer review by independent, anonymous referees. All peer review is double-blind, and submission is online. The journal welcomes theoretical, applied, interdisciplinary, and methodological work, with a preference for empirical research, a critical approach, and problem-solving methods in manuscripts.</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="44">
              <name>Language</name>
              <description>A language of the resource</description>
              <elementTextContainer>
                <elementText elementTextId="26609">
                  <text>English</text>
                </elementText>
              </elementTextContainer>
            </element>
          </elementContainer>
        </elementSet>
      </elementSetContainer>
    </collection>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="26579">
                <text>Using Exploratory Data Analysis and Big Data Analytics for Detecting Anomalies&#13;
in Cloud Computing</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="26580">
                <text>Ibrahim Muzaferija, Zerina Mašetić &#13;
</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="26581">
                <text>While leveraging cloud computing for large-scale distributed applications allows&#13;
seamless scaling, many companies struggle to keep up with the amount of data generated, both in terms&#13;
of efficient processing and of anomaly detection, which is a necessary part of managing&#13;
modern applications. As a record of user behavior, weblogs are a natural subject of research&#13;
on anomaly detection. Many anomaly detection methods based on automated log analysis&#13;
have been proposed, but not in the context of big data applications, where anomalous behavior&#13;
needs to be detected in the understanding phases, prior to modeling a system for such use. Big Data&#13;
Analytics often ignores anomalous points because of the high volume of data. To address this problem, we&#13;
propose complementing the Big Data Analytics methodology with Exploratory Data Analysis,&#13;
which assists in gaining insight into data relationships without classical hypothesis modeling. In&#13;
that way, we can gain a better understanding of the patterns and spot anomalies. Results show that&#13;
Exploratory Data Analysis facilitates anomaly detection and the CRISP-DM Business&#13;
Understanding phase, making it one of the key steps in the Data Understanding phase.&#13;
</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="26582">
                <text> Cloud Computing, Big Data, Data Mining, Anomaly Detection</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="43">
            <name>Identifier</name>
            <description>An unambiguous reference to the resource within a given context</description>
            <elementTextContainer>
              <elementText elementTextId="26583">
                <text>2637-2835</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="98">
            <name>DOI</name>
            <description>Digital object identifier</description>
            <elementTextContainer>
              <elementText elementTextId="26584">
                <text>10.14706/JONSAE2021320&#13;
</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
  </item>
  <item itemId="403" public="1" featured="0">
    <fileContainer>
      <file fileId="412">
        <src>https://omeka.ibu.edu.ba/files/original/193f2509e547effd21bd12f5fca8923a.pdf</src>
        <authentication>d189bc6e4d6d8ae6e22e71f9fe0a08d7</authentication>
        <elementSetContainer>
          <elementSet elementSetId="4">
            <name>PDF Text</name>
            <description/>
            <elementContainer>
              <element elementId="52">
                <name>Text</name>
                <description/>
                <elementTextContainer>
                  <elementText elementTextId="3088">
                    <text>Journal of Foreign Language Teaching and Applied Linguistics

Using Film Subtitles in FLT in Croatia
Magdalena Nigoević &amp; Koraljka Pejić &amp; Trišnja Pejić
University of Split, Croatia
Submitted: 16.04.2014.
Accepted: 19.11.2014.

Abstract
It is a general belief that students need to receive substantial input of authentic
materials in FLT. The combination of verbal information with full visual
experiences, such as films, has been found most appealing. Not only a large amount
of natural language, but also a rich variety of cultural forms and expressions are
mediated by this kind of “comprehensible input” (Krashen 1985). Various studies
have demonstrated the ways in which intralingual subtitled audio-visual material can
improve the effectiveness of general foreign language comprehension (Caimi 2002,
Vanderplank 1988) and how it can be a useful tool in foreign language teaching and
foreign language acquisition (Neuman &amp; Koskinen 1992).
Most foreign television and cinema programs distributed in Croatia have always been
accompanied by interlingual subtitles; therefore the viewers are accustomed to them.
Consequently, such a habit can be efficiently exploited in foreign language learning
among Croatian students who will certainly more easily develop strategies to derive
benefits from subtitled films.
The main aim of this study was to examine whether and to what extent film subtitles
(captions) increase learners’ ability to process languages. Our hypothesis was that
subtitles facilitate general comprehension of a film, provided that the linguistic
difficulty of the authentic film material has been carefully selected in order to match
the students’ overall competency in L2. Our research was conducted among students
of B1/B2 level of English L2. Students were divided into two groups: one group
watched a sequence of a feature film without subtitles, while the other was shown the
same material with subtitles. Both groups were given a specially designed test to
assess their general comprehension of the viewed material. The findings revealed that
the group of students viewing the subtitled film showed better results than the other
group.
Keywords: FLT, authentic audio-visual material, intralingual film subtitles, Croatian
learners

Introduction
Learners of a foreign language do not always have an opportunity to communicate
with ‘native speakers’. Therefore, it is exceptionally important that they are
continually exposed to interactional and speech patterns of L2. This can easily be
achieved by using audio-visual materials. The role of audio-visual materials as a
stimulating and facilitating tool in the process of teaching and learning a foreign
language has been widely acknowledged. “They can provide (a) the motivation
achieved by basing lessons on attractively informative content material; (b) the
exposure to a varied range of authentic speech, with different registers, and (c)
language used in the context of real situations, which adds relevance and interest to
the learning process” (Carrasquillo 1994:140). Through such materials students
become acquainted with various sorts of verbal and non-verbal behaviour in L2,
conversational strategies (opening and closing, turn taking) and various cultural
patterns.
Among other audio-visual materials, film is probably the most authentic, that is,
“authentic, in the sense that the language is not artificially constrained, and is, at the
same time, amenable to exploitation for language teaching purposes” (MacWilliam
1986: 134). It is an excellent medium for introducing various aspects of the foreign
language in the classroom. Furthermore, films allow teachers and learners to explore
the nonverbal and cultural aspects of language as well as the verbal ones. Film can also be highly
motivating since it shows real-life situations and characters, thus offering an authentic
and often amusing way to get acquainted with the (extra)linguistic and cultural
aspects of the target reality.

Subtitles in foreign language learning
Various studies have examined the ways in which intralingual¹ subtitled
audio-visual material can improve the effectiveness of general foreign language
comprehension (Caimi 2002, Markham 1993 and 1999, Vanderplank 1988) and how
it can be a useful tool in foreign language teaching and foreign language acquisition.
Among others, Garza (1991) studied the way in which subtitles (captions) affect
vocabulary learning among higher-level learners and concluded that the use of subtitles
increases the comprehension and acquisition of vocabulary. Neuman &amp; Koskinen
(1992) obtained similar results in their study with advanced EFL students and
concluded that students who watched subtitled (captioned) videos demonstrated
better comprehension and vocabulary acquisition results. Baltova (1999) conducted
an experiment with students of French in Canada whose native language was English.
The purpose of her study was to find out how the learning and retention of content
and vocabulary in French were affected by different authentic video formats. She
also showed that the retention of the video content was superior under the subtitled
conditions. The special edition of R.I.L.A. (Rassegna Italiana di Linguistica
Applicata), edited by Annamaria Caimi in 2002, contains the proceedings of a
scientific conference on subtitled films, and several of its papers focus on the role of
subtitles in foreign language teaching and learning.
Most of the studies have focused on the short-term effects of text aids, although some
authors advocate the systematic collection of long-term data (Danan 2004: 75-76).
Insight into both the short- and long-term effects of subtitling is provided by the
experiment conducted by Bianchi and Ciabattoni (2008) in a broad-range investigation
among Italian adult learners of English. There have also been experiences and
projects that encouraged the use of foreign language learning methods based on the
creation of subtitles by students and pupils.²
All the findings agree that subtitling can contribute to language learning and that, in
formal learning contexts, subtitling can reduce the anxiety experienced by foreign
language learners. The use of subtitled audio-visual material has the advantage of
providing simultaneous exposure to spoken language, printed text and visual
information, all conveying the same message (see: Baltova 1999: 33). Moreover,
subtitles can function as an important element that bridges the gap between reading
and listening skills (see: Borrás &amp; Lafayette 1994).
Most foreign programs distributed in Croatia, as in other so-called “subtitling
countries”³, have always been accompanied by interlingual subtitles; therefore the
viewers are exposed to subtitled foreign television and cinema programs from a very
young age. As the viewers are accustomed to the logic of subtitling, they can easily
switch to the use of intralingual or same-language subtitles. Consequently, such a
habit can be efficiently exploited in foreign language learning among Croatian
students, who will certainly find it easier to develop strategies to derive benefits from
subtitled films.⁴ However, the integration of film subtitles into language learning and
teaching practice in Croatia has so far been unsatisfactory, and few studies (Strmečki
Marković 2003) have investigated the use of film subtitles.

Method of the Study
The main objective of this study was to examine whether and to what extent film
subtitles increase the language-processing ability of the learners. We wanted to
determine whether watching a subtitled film facilitates general comprehension
among Croatian learners. For the purpose of this study, the opening sequence (7’50’’)
of the feature film About a Boy (2002, directed by Paul Weitz) was chosen. The
actors in the sequence are native speakers and use a contemporary, standard variety of
the English language. The topics of their conversations and monologues are common
and deal with everyday situations well known to the learners. The vocabulary and
structures used in the sequence are already familiar to upper-intermediate level
students.
Our research was conducted among Croatian secondary school students of English
L2 at B1/B2 level of the Common European Framework. The students were divided
into two groups. The groups were homogeneous in terms of the number of hours of
studying English in secondary school (380), in terms of age (17-18) and, accordingly,
in terms of general culture and cineliteracy. The Treatment group viewed the selected
sequence with subtitles, while the Control group watched the same sequence without
subtitles.
The general comprehension of the viewed material was tested by a specially
designed test. The test consisted of fifteen (15) open questions that the participants
had to answer based on the information they heard in the sequence. Some questions
required several elements in the answer, so the maximum score was 19. For each correct
answer the participants scored one point. Each test was corrected by two
independent, experienced English language teachers. Synonyms were also accepted
as correct answers, provided that the participant’s comprehension was confirmed.
The experiment was conducted among secondary school students in Split (Croatia) in
March 2014. The total number of students was one hundred (100), divided into two
groups of fifty (50) participants each. They were given precise instructions for the
activity: first they had to read the comprehension test questions, then carefully watch
the sequence and afterwards answer the questions. They were not allowed to look at
the questions while watching the sequence. Immediately after watching it, they were
asked to complete the previously designed test and were given ten minutes (10’) for
the task.
The collected data were processed using a t-test (SPSS) in order to
determine whether there was a statistically significant difference between the Treatment group and the Control
group.
Our hypothesis was that the group that watched the film sequence with subtitles
(Treatment group) would have a higher score in the comprehension test than the
Control group, which watched the same sequence without subtitles.
Discussion and findings

After the answer sheets were collected and corrected, the score for each group was
calculated. We ran these data through a t-test to assess whether the means of the two
groups were statistically different from each other. This analysis is appropriate
whenever the means of two independent groups are to be compared. As can be seen in
Figure 1, the Treatment group had a mean score of 13.06, while the Control group had 6.58.

The mean of the Control group minus that of the Treatment group equals -6.48.
The 95% confidence interval of this difference runs from -7.94 to -5.02. The
standard error of the difference was 0.736 (see Table 1).
Table 1. Results of the comprehension test

                             Control Group    Treatment Group
Mean                         6.58             13.06
Standard deviation           3.85             3.45
N (number of participants)   50               50
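
As a rough check on the figures in Table 1, the following Python sketch recomputes
the standard error, the t statistic and the 95% confidence interval from the summary
statistics alone. It assumes a classic independent-samples t-test with pooled variance;
the function name and the critical value 1.9845 (a standard two-tailed table value for
98 degrees of freedom) are assumptions made for illustration and are not taken from
the SPSS output.

```python
import math


def t_test_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b, t_crit):
    """Independent-samples t-test computed from summary statistics only."""
    diff = mean_a - mean_b
    df = n_a + n_b - 2
    # Pooled variance: equal population variances are assumed (Student's t-test).
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = diff / se
    # t_crit is supplied by the caller (assumed table value, not computed here).
    margin = t_crit * se
    return {"diff": diff, "se": se, "t": t_stat, "df": df,
            "ci": (diff - margin, diff + margin)}


# Values from Table 1, taken as Control minus Treatment.
result = t_test_from_summary(6.58, 3.85, 50, 13.06, 3.45, 50, t_crit=1.9845)
print(result)
```

Under these assumptions the sketch yields a standard error of about 0.731 and a t
value of about -8.86 with 98 degrees of freedom. The standard error is close to,
though not identical with, the reported 0.736, which may reflect rounding in the
published summary values.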

By conventional criteria, the t-test showed the difference to be extremely
statistically significant. All the participants watched the same film
sequence and their comprehension was tested by the same test. All the participants
were comparable in terms of all relevant criteria (age, number of hours of studying
English, general culture and cineliteracy). The only difference between the groups
was the intervention with subtitles, in that the Treatment group had the opportunity to
listen to the speech and simultaneously read the uttered words in the form of
subtitles, while the participants of the Control group based their understanding only
on the spoken utterances. Since all participants were comparable and tested under
equal conditions, the difference in the scores can be attributed exclusively to the
presence or absence of subtitles.

Conclusion
The findings are in accordance with previously conducted studies, and these results
lead us to the conclusion that subtitled film strategies have a positive impact on
students’ overall comprehension skills. Because of its realistic use of language, its
ease of comprehension and its attractiveness, watching a foreign language film as an
activity has an encouraging effect. Not only is film an important source of different
themes and topics, it also offers audio-visual stimulation for developing listening,
speaking, reading and general comprehension skills in foreign language learning. It is
important, however, to take into account that a film may serve as a supporting medium
in covering a topic and that it has to be appropriate to the level of students’ language
competences.
If used appropriately, such exposure to film subtitles should definitely strengthen
Croatian students’ foreign language comprehension and their acquisition of
language functions and structures.
Nevertheless, the authors are aware that this study was conducted on a
relatively small sample, homogeneous in age and education level. The data
were collected exclusively from learners of English as L2 in a country where foreign
TV and cinema programmes are usually subtitled and rarely dubbed, so viewers are
accustomed to subtitles. Therefore, these findings should be applied with caution when
making inferences about other types of L2 learners.

Notes
1
This refers to audio-visual material subtitled in the same language as the original.
Same-language subtitles are also labelled captions or bimodal, unilingual, or
intralingual subtitles in scholarly literature (Danan 2004: 68). Captioning was
initially intended for individuals who are hearing impaired, but was later used in all
spheres of life, both as didactic material and as an aid in the everyday viewing of
video programmes and films. On the other hand, interlingual (or interlinguistic)
subtitling refers to audio-visual material in a foreign language subtitled in the
learner's language, and it is the most common way of translating a medium into
another language so that speakers of other languages can follow it. For the purpose of
this study we will use the term ‘subtitles’, which has become a common term in
Europe, to refer only to intralingual subtitles.
2
Such as the LeViS (Learning via Subtitling) project, coordinated by the Hellenic
Open University in Greece within the framework of the Socrates Programme, LINGUA
2 (2006-2008), which developed educational material for active foreign language
learning based on film subtitling (see: http://levis.cti.gr/).
3
Subtitling is the language transfer practice used most widely in Europe. It concerns
28 countries (26 countries plus two regions in two countries): Belgium
(Flemish-speaking), Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland,
Greece, Hungary, Iceland, Ireland, Latvia, Liechtenstein, Lithuania, Luxembourg,
Malta, Netherlands, Norway, Poland, Portugal, Romania, Slovakia, Slovenia,
Sweden, Switzerland (German-speaking), Turkey and United Kingdom. (Retrieved
13 April 2014 from:
http://eacea.ec.europa.eu/llp/studies/documents/study_on_the_use_of_subtitling/rapport_final-en.pdf)
4
Some American authors even emphasise “the incidental language learning
occurring in Europe with spectators of American films” (Danan 2004: 68).

References
Baltova, I. (1999). Multisensory language teaching in a multidimensional curriculum:
The use of authentic bimodal video in core French. The Canadian Modern
Language Review, 56 (1), 32-48.
Bianchi, F. &amp; Ciabattoni, T. (2008). Captions and Subtitles in EFL Learning: an
investigative study in a comprehensive computer environment. In: A. Baldry,
M. Pavesi, C. Taylor Torsello &amp; C. Taylor (eds), From Didactas to Ecolingua: an
ongoing research project on translation, 69-90. EUT, Edizioni Università di
Trieste. Retrieved from
www.openstarts.units.it/dspace/bitstream/10077/2848/1/bianchi_ciabattoni.pdf
Borrás, I. &amp; Lafayette, R. (1994). Effects of multimedia courseware subtitling on the
speaking performance of college students of French. The Modern Language
Journal, 78 (1), 61-75.
Caimi, A. (ed.) (2002). Cinema: Paradiso delle lingue. I sottotitoli
nell’apprendimento linguistico. Special issue of RILA – Rassegna Italiana di
Linguistica Applicata, 34 (1-2).

Carrasquillo, A. L. (1994). Teaching English as a second language: A resource
guide. New York: Garland Publishing.
Danan, M. (2004). Captioning and subtitling: Undervalued language learning
strategies. Meta, 49 (1), 67-77.
Garza, T. (1991). Evaluating the use of captioned video materials in advanced
foreign language learning. Foreign Language Annals, 24 (3), 239-258.
Krashen, S. (1985). The Input Hypothesis: Issues and Implications. London:
Longman.
MacWilliam, I. (1986). Video and language comprehension. ELT Journal, 40 (2),
131-135.
Markham, P. (1993). Captioned TV videotapes: Effects of visual support on second
language comprehension. Journal of Educational Technology Systems, 21 (3),
183-191.
Markham, P. (1999). Captioned videotapes and second-language listening word
recognition. Foreign Language Annals, 32 (3), 321-328.
Neuman, S.B. &amp; Koskinen, P. (1992). Captioned TV as comprehensible input:
Effects of incidental word learning from context for language minority students.
Reading Research Quarterly, 27 (1), 94-106.
Strmečki Marković, S. (2003). Igrani film u nastavi jezičnih vježbi u sklopu studija
germanistike u Zagrebu. Strani jezici, 32 (3), 59-68.
Vanderplank, R. (1988). The value of teletext subtitles in language learning. English
Language Teaching (ELT) Journal, 42 (4), 272-281.

</text>
                  </elementText>
                </elementTextContainer>
              </element>
            </elementContainer>
          </elementSet>
        </elementSetContainer>
      </file>
    </fileContainer>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="79">
            <name>Extent</name>
            <description>The size or duration of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="3082">
                <text>2818</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="3083">
                <text>Using Film Subtitles in FLT in Croatia</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="96">
            <name>Author</name>
            <description>Author</description>
            <elementTextContainer>
              <elementText elementTextId="3084">
                <text>Nigoević, Magdalena
Pejić, Koraljka
Pejić, Trišnja</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="94">
            <name>Abstract</name>
            <description>A summary of the resource.</description>
            <elementTextContainer>
              <elementText elementTextId="3085">
                <text>It is a general belief that students need to receive substantial input of authentic materials in FLT. The combination of verbal information with full visual experiences, such as films, has been found most appealing. Not only a large amount of natural language, but also a rich variety of cultural forms and expressions are mediated by this kind of “comprehensible input” (Krashen 1985). Various studies have demonstrated the ways in which intralingual subtitled audio-visual material can improve the effectiveness of general foreign language comprehension (Caimi 2002, Vanderplank 1988) and how it can be a useful tool in foreign language teaching and foreign language acquisition (Neuman &amp; Koskinen 1992).     Most foreign television and cinema programs distributed in Croatia have always been accompanied by interlingual subtitles; therefore the viewers are accustomed to them. Consequently, such a habit can be efficiently exploited in foreign language learning among Croatian students who will certainly more easily develop strategies to derive benefits from subtitled films.     The main aim of this study was to examine whether and to what extent film subtitles (captions) increase learners’ ability to process languages. Our hypothesis was that subtitles facilitate general comprehension of a film, provided that the linguistic difficulty of the authentic film material has been carefully selected in order to match the students’ overall competency in L2. Our research was conducted among students of B1/B2 level of English L2. Students were divided into two groups: one group watched a sequence of a feature film without subtitles, while the other was shown the same material with subtitles. Both groups were given a specially designed test to assess their general comprehension of the viewed material. The findings revealed that the group of students viewing the subtitled film showed better results than the other group.    
Keywords: FLT, authentic audio-visual material, intralingual film subtitles, Croatian learners</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="40">
            <name>Date</name>
            <description>A point or period of time associated with an event in the lifecycle of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="3086">
                <text>2015-04-16</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="97">
            <name>Keywords</name>
            <description>Keywords.</description>
            <elementTextContainer>
              <elementText elementTextId="3087">
                <text>Article
PeerReviewed</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="47">
        <name>P Philology. Linguistics,PE English,PG Slavic, Baltic, Albanian languages and literature</name>
      </tag>
    </tagContainer>
  </item>
</itemContainer>
