Large Language Models: Training, Challenges, Applications, and Development

Kazem Taghandiki; Mohammad Mohammadi

Paper title: Large Language Models: Training, Challenges, Applications, and Development
National paper ID: BECE02_063
Published in the 2nd National Conference on New Business in Electrical and Computer Engineering, 1403 (Solar Hijri calendar)

Author details:

Kazem Taghandiki - Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran
Mohammad Mohammadi

Abstract:

This paper examines large language models (LLMs), powerful artificial intelligence tools that enable computers to understand and generate human language. LLMs such as GPT-3 perform well across a wide range of linguistic tasks, from writing to translation and customer service. Despite their effectiveness, these models have drawbacks: they often exhibit bias, their inner workings are opaque, and they consume substantial energy. Tracing the evolution of LLMs from rudimentary models to sophisticated systems capable of intricate language processing, the paper explores their learning mechanisms, diverse applications, and associated concerns, including fairness and privacy preservation. Looking ahead, it advocates research aimed at making LLMs more environmentally sustainable, interpretable, and impartial. The overarching objective is to use these models effectively and ethically, mindful of their transformative potential for technology and society. The author reviewed more than 40 articles over a period of 2 months to prepare this presentation of LLMs for interested readers.

Keywords:

large language models, natural language processing, artificial intelligence, GPT

Paper page and full-text download: https://civilica.com/doc/2032450/