Subject and Scope

The common text processing and corpus software described in the literature was developed for languages other than Turkish. Some of these tools support Turkish characters. However, Turkish is an agglutinative language, and it is observed that its specific syntax cannot be analysed with them. They are known to handle the Turkish character problem at the stage of reading text files, but they are inadequate for tagging the syntactic structure of Turkish and for reporting the tagged data.

A further drawback of the current software is that it has no standard reporting tools. In fact, every study has specific features that follow from its research questions, and these features require different reports and different tagging/marking schemes according to the logic of the research.

Therefore, the subject of the proposed project is to develop a flexible, easily accessible, database-supported corpus platform for researchers working on language and linguistics. The platform is to be specific to the researcher and to meet the user's demands at the highest level, in line with the methodological approaches of corpus linguistics. It should allow the distinctive structure of Turkish to be formatted directly according to the research questions, and the research results to be reported properly. In this context, the scope of the project is to develop a corpus platform, tailored to Turkish, that can be used efficiently in small and medium-scale corpus projects.

This project is supported within the scope of the TUBITAK 1005 – National New Ideas and Products Research Support Program.
Project No: 114E791