Inferring organizational duties from Persian administrative and employment laws using Large Language Models (LLMs) and few-shot learning
Publish Year: 1404
Document type: Journal article
Language: English
This paper is 10 pages long and is available for download in PDF format.
National scientific document ID:
JR_JICSE-2-4_007
Indexing date: 24 Ordibehesht 1404
Abstract:
Extracting organizational duties from legal documents is a critical yet challenging task, particularly in low-resource languages like Persian. This paper presents an innovative approach that integrates state-of-the-art Named Entity Recognition (NER) with advanced segmentation techniques and Large Language Models (LLMs) to accurately identify and extract duties assigned to organizations from Persian legal texts. Using a BERT-based model for NER, we enhance the recognition of relevant entities and ensure precise linkage to target organizations. Our method segments documents into sentences with an enhanced POS-based tokenizer, then retrieves contextually relevant segments based on the detected entities. We then explore the effectiveness of different LLM configurations, including a hierarchical approach that leverages both small and large models. Our experiments demonstrate that the hierarchical approach, combining 'Llama-3.1-8B' and 'gpt-4o', achieves an F1-score of 0.7901, significantly outperforming single-model approaches. This research underscores the potential of LLMs in legal text analysis, paving the way for more advanced tools in Natural Language Processing. Future work will include testing on a broader range of organizations, refining prompt engineering techniques, and enhancing model interpretability.
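The pipeline described in the abstract (sentence segmentation, entity-based segment retrieval, then a two-level model hierarchy) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The punctuation-based splitter stands in for the POS-based tokenizer, plain substring matching stands in for BERT-based NER, and the division of labor between the two models (the small model screens segments, the large model extracts duties) is an assumption, since the abstract does not say how the hierarchy is wired. All function names, including `toy_small` and `toy_large`, are hypothetical.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive split on sentence-final punctuation.

    Assumption: a stand-in for the paper's POS-based tokenizer.
    """
    parts = re.split(r"(?<=[.!?؟])\s+", text.strip())
    return [p for p in parts if p]


def retrieve_segments(sentences: List[str], org: str, window: int = 1) -> List[str]:
    """Keep every sentence mentioning `org`, plus `window` neighbours per side.

    Assumption: substring matching replaces the BERT-based NER step.
    """
    keep = set()
    for i, sentence in enumerate(sentences):
        if org in sentence:
            lo = max(0, i - window)
            hi = min(len(sentences), i + window + 1)
            keep.update(range(lo, hi))
    return [sentences[i] for i in sorted(keep)]


def extract_duties(
    segments: List[str],
    org: str,
    small_model: Callable[[str, str], bool],
    large_model: Callable[[str, str], List[str]],
) -> List[str]:
    """Hypothetical cascade: the small model screens, the large model extracts."""
    duties: List[str] = []
    for segment in segments:
        if small_model(segment, org):
            duties.extend(large_model(segment, org))
    return duties


# Toy stand-ins for the two LLM calls (hypothetical, rule-based).
def toy_small(segment: str, org: str) -> bool:
    return org in segment and "must" in segment


def toy_large(segment: str, org: str) -> List[str]:
    return [segment.split("must", 1)[1].strip(" .")]


text = (
    "The Ministry must publish an annual report. "
    "Citizens may file complaints. "
    "The Agency must audit budgets."
)
sents = split_sentences(text)
segs = retrieve_segments(sents, "Ministry", window=0)
duties = extract_duties(segs, "Ministry", toy_small, toy_large)
print(duties)
```

In a real system the two callables would wrap calls to a small and a large language model. Passing them in as parameters keeps the retrieval and cascade logic testable without any model access.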
Authors
Hojjat Hajizadeh Nowkhandan
Department of Computer Engineering, Ferdowsi University of Mashhad
Mohsen Kahani
Department of Computer Engineering, Ferdowsi University of Mashhad