Large Language Models: Training, Challenges, Applications, and Development

Kazem Taghandiki; Mohammad Mohammadi

Published in: The 2nd National Conference of New Business on Electrical and Computer Engineering

Publication year: 1403 (Solar Hijri calendar)

Document type: Conference paper

Language: English

Length: 13 pages (PDF)

Permanent link to this paper:

https://civilica.com/doc/2032450

National scientific document ID:

BECE02_063

Indexing date: 2 Mordad 1403

Abstract:

This paper examines large language models (LLMs), powerful artificial intelligence tools that enable computers to understand and use human language. LLMs, exemplified by GPT-3, show remarkable proficiency across a range of linguistic tasks, from writing to language translation and customer service. Despite their efficacy, these models have drawbacks: they often exhibit biases, their inner workings are opaque, and they consume substantial energy. Tracing the evolution of LLMs from rudimentary models to sophisticated systems capable of intricate language processing, this paper explores their learning mechanisms, diverse applications, and associated concerns, including the need for fairness and privacy preservation. Looking to the future, it advocates research aimed at making LLMs more environmentally sustainable, more comprehensible, and more impartial. The overarching objective is to use these models effectively and ethically, mindful of their transformative potential for technology and society. In addition, the author reviewed and studied over 40 articles within a span of 2 months to become thoroughly acquainted with the subject matter and to present large language models (LLMs) as effectively as possible to the interested audience.

Keywords:

large language models, natural language processing, artificial intelligence, GPT

Authors

Kazem Taghandiki

Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran

Mohammad Mohammadi