Inferring organizational duties from Persian administrative and employment laws using Large Language Models (LLMs) and few-shot learning

Hojjat Hajizadeh Nowkhandan; Mohsen Kahani

Published in: Journal of Innovations in Computer Science and Engineering, Vol. 2, Issue 4

Publication year: 1404 (Solar Hijri)

Document type: Journal article

Language: English

Length: 10 pages (PDF)

Permanent link to this paper:

https://civilica.com/doc/2255059

National scientific document ID:

JR_JICSE-2-4_007

Indexing date: 24 Ordibehesht 1404

Abstract:

Extracting organizational duties from legal documents is a critical yet challenging task, particularly in low-resource languages such as Persian. This paper presents an approach that combines Named Entity Recognition (NER), document segmentation, and Large Language Models (LLMs) to identify and extract the duties assigned to organizations in Persian legal texts. A BERT-based NER model recognizes the relevant entities and links them to the target organizations. Documents are segmented into sentences with an enhanced POS-based tokenizer, and the contextually relevant segments are then retrieved based on the detected entities. We evaluate several LLM configurations, including a hierarchical approach that uses both a small and a large model. In our experiments, the hierarchical approach, combining 'Llama-3.1-8B' and 'gpt-4o', achieves an F1-score of 0.7901, significantly outperforming single-model approaches. These results underscore the potential of LLMs for legal text analysis. Future work will include testing on a broader range of organizations, refining prompt engineering techniques, and improving model interpretability.
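The pipeline outlined in the abstract (sentence segmentation, entity-based segment retrieval, then a small and a large model used together) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, a naive regex splitter stands in for the enhanced POS-based tokenizer, plain substring matching stands in for the BERT-based NER model, and the two models are passed in as ordinary callables. In particular, treating the small model as a filter and the large model as the extractor is only one plausible reading of "hierarchical"; the abstract does not specify how the two models are combined.

```python
import re
from typing import Callable, List, Optional


def split_sentences(text: str) -> List[str]:
    """Naive splitter; a stand-in for the paper's POS-based tokenizer."""
    parts = re.split(r"(?<=[.!?؟])\s+", text.strip())
    return [p for p in parts if p]


def retrieve_segments(sentences: List[str], org_names: List[str],
                      window: int = 1) -> List[str]:
    """Keep sentences that mention a target organization, plus `window`
    neighbouring sentences on each side for context.
    Substring matching is a stand-in for NER-based entity detection."""
    keep = set()
    for i, sentence in enumerate(sentences):
        if any(name in sentence for name in org_names):
            lo = max(0, i - window)
            hi = min(len(sentences), i + window + 1)
            keep.update(range(lo, hi))
    return [sentences[i] for i in sorted(keep)]


def hierarchical_extract(segments: List[str],
                         small_model: Callable[[str], bool],
                         large_model: Callable[[str], Optional[str]]) -> List[str]:
    """Assumed two-stage cascade: the small model decides whether a segment
    may contain a duty; only those segments are sent to the large model."""
    duties = []
    for segment in segments:
        if small_model(segment):
            duty = large_model(segment)
            if duty:
                duties.append(duty)
    return duties


if __name__ == "__main__":
    # Toy English text and toy "models"; real use would call actual LLMs.
    text = "OrgA shall file reports. OrgB is unrelated. OrgA is large."
    sentences = split_sentences(text)
    segments = retrieve_segments(sentences, ["OrgA"], window=0)
    duties = hierarchical_extract(
        segments,
        small_model=lambda s: "shall" in s,   # hypothetical cheap filter
        large_model=lambda s: s.rstrip("."),  # hypothetical extractor
    )
    print(duties)
```

Passing the models as callables keeps the sketch independent of any particular LLM client; in practice each callable would wrap a few-shot prompt sent to the corresponding model.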

Keywords:

NLP, Large Language Models, Few-shot Learning, Duty Extraction, Document Segmentation, Legal Informatics

Authors

Hojjat Hajizadeh Nowkhandan

Department of Computer Engineering, Ferdowsi University of Mashhad

Mohsen Kahani

Department of Computer Engineering, Ferdowsi University of Mashhad