Reading Gokturkish text with the YOLO object detection algorithm

JOURNAL OF MECHATRONICS AND ARTIFICIAL INTELLIGENCE IN ENGINEERING


Introduction
Natural language processing (NLP) is a branch of artificial intelligence (AI) that enables computers to comprehend [1], produce, and manipulate human language [2]. It allows data to be queried with natural-language text or voice [3]. It provides services in many areas, especially health [4], law [5], finance [6], security [7], arts [8], and education [9]. Establishing a language processing pipeline [13] depends on pre-processing techniques [10], programming languages [11], and library development [12]. It is very important to pass languages that are disappearing on to new generations and to carry historical knowledge into the present day [14]. Through various awareness projects within the UN [16], especially UNESCO [15], efforts are being made to promote the teaching of native languages in schools in order to prevent language loss.
Ancient Anatolian languages [17][18][19][20][21] and their written forms constitute important historical data. Like today's languages, the ancient languages of Anatolia had alphabets. Their scripts span a variety of early writing systems, such as the Cuneiform, Aramaic, Phoenician, Phrygian, Carian, Lycian, Side, Assyrian, and Greek alphabets. The data obtained from reading these writings will make it possible to read and understand history better [22], [23]. Conversely, as fewer people are able to read these scripts, the connection between the past and the future risks being severed. This study presents work on reading Gokturkish, a language at risk of being lost, with artificial intelligence so that it can be used in different fields. To date, there is no established way to read and record it with natural language processing techniques and artificial intelligence.
This work belongs to a broader interdisciplinary effort on experiencing and reading ancient Anatolian languages, in which philology, history, and engineering approaches are applied together. The part presented here is the reading of Gokturkish texts.

Method
This research focuses on the YOLO algorithm, a widely used deep learning method for object detection. YOLO (You Only Look Once) detects all objects present in a single image simultaneously, which makes it efficient in real-time applications.
The study uses pre-processed data tailored to the linguistic characteristics of Gokturkish texts. Linguists reviewed the texts to ensure that they are linguistically appropriate for machine learning. The research also benefited from consultations with subject matter experts on the historical periods covered by the Gokturkish texts, who contributed to the contextual understanding of the data.
Object detection methods are now applied across diverse fields, owing to continuous technological advances and the introduction of new architectures. These advances have produced faster and more accurate object detection models, and the YOLO algorithm has emerged as a common choice for real-time object detection. Built on convolutional neural networks (CNNs), YOLO detects objects within images with a high degree of efficiency and precision. This study therefore considers both the technical details of the YOLO algorithm and its application in the wider context of technological progress and interdisciplinary collaboration [26].
The YOLO algorithm operates by dividing the entire image into a grid of A×A cells. Each grid cell is then processed through a neural network to determine whether an object is present in it. If an object is detected, the algorithm determines whether the midpoint of the object lies within that cell. It then predicts parameters such as the object's width, height, class, and a confidence score.
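The assignment of an object to a grid cell can be illustrated with a minimal sketch. The function name `responsible_cell`, the pixel-coordinate box format, the 1-based row-major cell numbering, and the 3×3 grid used in the example are all assumptions made for illustration; they are not taken from the study's implementation.

```python
def responsible_cell(box, image_size, grid_size):
    """Return (row, col, index) of the grid cell that contains the box midpoint.

    box        -- (x_min, y_min, x_max, y_max) in pixels (assumed format)
    image_size -- (width, height) in pixels
    grid_size  -- A, the number of cells along each side of the A x A grid
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    mid_x = (x_min + x_max) / 2.0
    mid_y = (y_min + y_max) / 2.0
    # Clamp so that a midpoint lying exactly on the right or bottom edge
    # still falls into the last cell.
    col = min(int(mid_x / width * grid_size), grid_size - 1)
    row = min(int(mid_y / height * grid_size), grid_size - 1)
    # 1-based, row-major cell numbering (an assumed convention).
    index = row * grid_size + col + 1
    return row, col, index


# Hypothetical example: a 300 x 300 image, a 3 x 3 grid, and one object.
row, col, index = responsible_cell((30, 210, 90, 270), (300, 300), 3)
print(row, col, index)
```

In this hypothetical example the midpoint (60, 240) falls in the bottom-left cell, so only that cell would be responsible for predicting the object.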

Fig. 2. YOLO working algorithm
For instance, in Fig. 2, if the midpoint of a car falls in the 7th grid cell, that cell is responsible for detecting the car and drawing a bounding box around it. YOLO generates a separate prediction vector for each grid cell, and each vector contains the following information:
Confidence Score: This score indicates the model's confidence that an object exists in the current grid cell. A score of 0 signifies that the object is definitely not present, while a score of 1 indicates a high certainty of presence. The score reflects the model's confidence not only in the existence of an object but also in correctly identifying the object and determining the coordinates of the bounding box around it.
Bx: x coordinate of the midpoint of the object.
The YOLO architecture draws inspiration from GoogLeNet while introducing new features to address specific challenges in object detection. It consists of a series of 24 convolutional layers for feature extraction, followed by two fully connected layers that estimate the bounding box coordinates and the corresponding probabilities of the detected objects.
One notable challenge in object detection with convolutional neural networks (CNNs) is that input images are downsampled, which makes small objects difficult to recognize accurately. YOLO addresses this within its architecture. For instance, a layer with dimensions 28×28×512 is reduced to 14×14×2048, and this reduced layer is then appended to the output layer with dimensions 14×14×1024.
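The reduction from 28×28×512 to 14×14×2048 keeps the number of values unchanged (28 × 28 × 512 = 14 × 14 × 2048 = 401,408), which is consistent with a space-to-depth rearrangement in which each 2×2 spatial neighbourhood is stacked along the channel axis. The sketch below assumes that interpretation, and also assumes that "appended" means channel-wise concatenation. It uses small nested lists as stand-ins for the real tensors, and the helper names are hypothetical.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W x C nested list into (H/block) x (W/block) x (C*block*block)."""
    height = len(feature_map)
    width = len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for r in range(0, height, block):
        out_row = []
        for c in range(0, width, block):
            stacked = []
            # Gather the block x block neighbourhood and stack its channels.
            for dr in range(block):
                for dc in range(block):
                    stacked.extend(feature_map[r + dr][c + dc])
            out_row.append(stacked)
        out.append(out_row)
    return out


def concat_channels(a, b):
    """Concatenate two feature maps of equal height and width along the channel axis."""
    return [[pa + pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def shape(feature_map):
    """Return (height, width, channels) of a nested-list feature map."""
    return (len(feature_map), len(feature_map[0]), len(feature_map[0][0]))


# Toy stand-ins: 4 x 4 x 2 plays the role of the 28 x 28 x 512 layer,
# and 2 x 2 x 4 plays the role of the 14 x 14 x 1024 layer.
fine = [[[r * 4 + c, -(r * 4 + c)] for c in range(4)] for r in range(4)]
coarse = [[[0, 0, 0, 0] for _ in range(2)] for _ in range(2)]

reduced = space_to_depth(fine)
merged = concat_channels(reduced, coarse)
print(shape(fine), shape(reduced), shape(merged))
```

Under the same assumptions, the full-size layers would yield a merged feature map of 14×14×3072, in which the finer-resolution features sit alongside the coarser ones at every spatial position.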
This layer-reduction strategy mitigates the difficulty of recognizing small objects and helps the network capture fine details and spatial relationships in the input data, balancing the preservation of fine-grained information against computational efficiency. In summary, YOLO's architecture, inspired by GoogLeNet, incorporates specialized layers and techniques to address challenges inherent in object detection, with attention to both accuracy and efficiency.
Fig. 4 compares YOLOv3 with other algorithms on the COCO dataset in terms of mean Average Precision at an Intersection over Union (IoU) threshold of 0.5 (mAP-50). The graph shows YOLO outperforming its competitors in both time efficiency and accuracy. To understand why other object detection algorithms are slower, it is instructive to look at their underlying mechanisms.
Region-based object detection algorithms, exemplified by R-CNN, work sequentially. They first identify potential object regions and then apply a convolutional neural network (CNN) to each of these regions independently. While this approach yields good results, it significantly increases the number of computational operations, because each image undergoes two distinct processes.
Despite attempts to improve speed with subsequent versions such as Fast R-CNN and Faster R-CNN, the number of frames per second (FPS) remains low during both training and visual inspection. Even with these advancements, the Faster R-CNN algorithm achieves an average of only 7 FPS in real-time scenarios. YOLO's speed stems from its ability to predict all objects and their coordinates in a single pass through the neural network, treating object detection as a single regression problem.
This prediction process lets YOLO process images quickly, making it both a fast and an accurate object detection algorithm that is particularly well suited to real-time applications in computer vision.
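To make the single-pass idea concrete, the following sketch decodes a complete A×A grid of prediction vectors in one sweep, with no per-region network calls. The vector layout `[box confidence, Bx, By, Bw, Bh, class probabilities...]`, the convention that Bx and By are offsets inside the cell while Bw and Bh are fractions of the image size, and the function name `decode_grid` are assumptions chosen for illustration; the network output itself is replaced by hand-written numbers.

```python
def decode_grid(predictions, image_size, threshold=0.5):
    """Turn an A x A grid of prediction vectors into a list of detections.

    Assumed cell layout: [box_confidence, bx, by, bw, bh, class_prob_0, ...]
    bx, by -- midpoint offsets inside the cell, in the range 0..1 (assumed)
    bw, bh -- box size relative to the whole image, in the range 0..1 (assumed)
    """
    img_w, img_h = image_size
    grid_size = len(predictions)
    detections = []
    for row, cells in enumerate(predictions):
        for col, vec in enumerate(cells):
            box_confidence, bx, by, bw, bh = vec[:5]
            class_probs = vec[5:]
            best_class = max(range(len(class_probs)), key=class_probs.__getitem__)
            # Confidence score = box confidence score x class probability.
            score = box_confidence * class_probs[best_class]
            if score < threshold:
                continue
            # Convert the cell-relative midpoint to pixel coordinates.
            mid_x = (col + bx) / grid_size * img_w
            mid_y = (row + by) / grid_size * img_h
            detections.append({
                "class": best_class,
                "score": score,
                "midpoint": (mid_x, mid_y),
                "size": (bw * img_w, bh * img_h),
            })
    return detections


# Hypothetical 2 x 2 grid with two classes; only one cell contains an object.
empty = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5]
grid = [
    [empty, empty],
    [[1.0, 0.5, 0.5, 0.25, 0.5, 0.25, 0.75], empty],
]
for detection in decode_grid(grid, (400, 400)):
    print(detection)
```

The cost of this decoding step depends only on the grid size and the number of classes, not on how many objects the image contains. Region-based methods, by contrast, evaluate each proposed region separately.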
The research hypotheses are that the results of this study will demonstrate the effectiveness of the YOLO algorithm in reading Gokturkish texts and that this approach will serve as an example for other similar studies.
The YOLO algorithm used in this study is faster than traditional object detection methods and provides high accuracy rates. YOLO can simultaneously detect and classify objects in a single image. These features are used in, for example, traffic flow monitoring, facial recognition, object tracking, and security systems. However, the YOLO algorithm has not previously been applied to Turkic languages such as Turkish or Gokturkish. Therefore, this study examines the use of the YOLO algorithm on Gokturkish texts.

Conclusions
This study is expected to deliver benefits in the scientific, cultural, and economic domains. Scientifically, it aims to deepen our understanding of the processes involved in reading and decoding Gokturkish texts and to encourage further research in this field. Culturally, it can enrich our understanding of Turkish culture and history and help preserve information that might otherwise be lost. Economically, demonstrating that the YOLO algorithm enables faster and more efficient reading of Gokturkish texts can streamline the work of historians, archaeologists, and other researchers, improving their productivity and effectiveness.
In essence, this study evaluates the efficacy of the YOLO algorithm for interpreting Gokturkish texts. Through interdisciplinary collaboration, it aims to underscore the importance of computer-aided technology in decoding ancient languages. The potential applications of this research are far-reaching and encourage diverse disciplines to work together on Gokturkish texts. Through this synergy, the study stands to advance scientific understanding and to contribute to the cultural and economic landscape.
ISSN ONLINE 2669-1116
The use of the YOLO object detection algorithm for computer-assisted reading of Gokturkish texts is particularly significant for history, language, and cultural research. By speeding up the deciphering of historical documents, this study can give researchers quicker and more effective access to the information these texts contain. Its original contribution lies in exploring how well the YOLO algorithm handles the nuances of Gokturkish texts, with the aim of making a meaningful impact on the broader field of natural language processing, especially for ancient or lost languages.
Drawing on expert analyses and textual readings, this study establishes a fundamental letter-word relationship and imparts this knowledge to the system. In doing so, it bridges traditional disciplines and applies current technology to the study of Gokturkish texts. By bringing together history, language, culture, and computer science, the study serves as a catalyst for collaboration and contributes to a more holistic and nuanced understanding of Gokturkish texts.

Fig. 1. Gokturkish letters [24]

The remaining entries of the YOLO prediction vector are as follows:
By: y coordinate of the midpoint of the object.
Bw: Width of the object.
Bh: Height of the object.
Conditional Class Probability: One predicted probability for each class in the model.
Confidence Score = Box Confidence Score × Conditional Class Probability.
Box Confidence Score = P(object) × IoU.
P(object): Probability that the box contains an object.
IoU: Intersection over Union, the area of overlap between the box where the object is actually located and the predicted box, divided by the area of their union.
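The two formulas above can be checked numerically with a short sketch. The corner-based box format `(x_min, y_min, x_max, y_max)`, the function names, and the example numbers are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def confidence_score(p_object, iou_value, class_probability):
    """Confidence Score = (P(object) x IoU) x class probability."""
    box_confidence = p_object * iou_value
    return box_confidence * class_probability


# Hypothetical boxes: the prediction is shifted right by half the box width.
truth = (0, 0, 100, 100)
predicted = (50, 0, 150, 100)
overlap = iou(truth, predicted)
print(overlap)
print(confidence_score(1.0, overlap, 0.75))
```

Here the overlap covers one third of the union, so even a box that certainly contains an object receives a reduced confidence score. The same IoU value would also fall below the 0.5 threshold used for mAP-50.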