Legal Text Formatting Assistant
Converts legal text into well-formed Markdown
Workflow Graph
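The topology of this workflow can be read directly from the `edges` entries in the YAML source. The following is a minimal sketch of that reading, not part of the export itself: the tuple layout and the helper names `execution_order` and `node_types` are illustrative assumptions, while the ids and node types are copied from the edge list.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (source id, source type, target id, target type), copied from the
# `edges` entries of the YAML source.
TOP_LEVEL_EDGES = [
    ("1735626727052", "start", "1735632132331", "llm"),
    ("1735632132331", "llm", "1735629665760", "code"),
    ("1735629665760", "code", "1735632653899", "iteration"),
    ("1735632653899", "iteration", "1735633258714", "code"),
    ("1735633258714", "code", "1735628174959", "end"),
    ("1735633258714", "code", "1735802619533", "tool"),
]

# The single edge marked isInIteration: true. It lives inside the
# iteration node and runs once per segment, so it is kept separate.
ITERATION_EDGES = [
    ("1735632653899start", "iteration-start", "1735632981287", "llm"),
]


def execution_order(edges):
    """Return node ids in an order that respects every edge (hypothetical helper)."""
    sorter = TopologicalSorter()
    for source, _source_type, target, _target_type in edges:
        sorter.add(target, source)  # the target depends on the source
    return list(sorter.static_order())


def node_types(edges):
    """Map each node id to the node type recorded on its edges (hypothetical helper)."""
    types = {}
    for source, source_type, target, target_type in edges:
        types[source] = source_type
        types[target] = target_type
    return types


if __name__ == "__main__":
    types = node_types(TOP_LEVEL_EDGES)
    for node_id in execution_order(TOP_LEVEL_EDGES):
        print(f"{node_id}  {types[node_id]}")
```

Read this way, the first five nodes form a single chain (start, llm, code, iteration, code). The last code node then feeds both the end node and a tool node, and the edges do not constrain the relative order of those two.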
YAML Source
app:
  description: Converts legal text into well-formed Markdown
  icon: 🤖
  icon_background: '#FFEAD5'
  mode: workflow
  name: Legal Text Formatting Assistant
  use_icon_as_answer_icon: false
kind: app
version: 0.1.5
workflow:
  conversation_variables: []
  environment_variables: []
  features:
    file_upload:
      allowed_file_extensions: []
      allowed_file_types: []
      allowed_file_upload_methods: []
      enabled: false
      fileUploadConfig:
        audio_file_size_limit: 50
        batch_count_limit: 5
        file_size_limit: 15
        image_file_size_limit: 10
        video_file_size_limit: 100
        workflow_file_upload_limit: 10
      image:
        enabled: false
        number_limits: 3
        transfer_methods: []
      number_limits: 3
    opening_statement: ''
    retriever_resource:
      enabled: false
    sensitive_word_avoidance:
      enabled: false
    speech_to_text:
      enabled: false
    suggested_questions: []
    suggested_questions_after_answer:
      enabled: false
    text_to_speech:
      enabled: false
      language: ''
      voice: ''
  graph:
    edges:
    - data:
        isInIteration: false
        sourceType: start
        targetType: llm
      id: 1735626727052-source-1735632132331-target
      source: '1735626727052'
      sourceHandle: source
      target: '1735632132331'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: llm
        targetType: code
      id: 1735632132331-source-1735629665760-target
      source: '1735632132331'
      sourceHandle: source
      target: '1735629665760'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: true
        iteration_id: '1735632653899'
        sourceType: iteration-start
        targetType: llm
      id: 1735632653899start-source-1735632981287-target
      source: 1735632653899start
      sourceHandle: source
      target: '1735632981287'
      targetHandle: target
      type: custom
      zIndex: 1002
    - data:
        isInIteration: false
        sourceType: code
        targetType: end
      id: 1735633258714-source-1735628174959-target
      source: '1735633258714'
      sourceHandle: source
      target: '1735628174959'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: code
        targetType: iteration
      id: 1735629665760-source-1735632653899-target
      source: '1735629665760'
      sourceHandle: source
      target: '1735632653899'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: iteration
        targetType: code
      id: 1735632653899-source-1735633258714-target
      source: '1735632653899'
      sourceHandle: source
      target: '1735633258714'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: code
        targetType: tool
      id: 1735633258714-source-1735802619533-target
      source: '1735633258714'
      sourceHandle: source
      target: '1735802619533'
      targetHandle: target
      type: custom
      zIndex: 0
    nodes:
    - data:
        desc: ''
        selected: false
        title: Start
        type: start
        variables:
        - label: legal_text
          max_length: 80000
          options: []
          required: true
          type: paragraph
          variable: legal_text
      height: 88
      id: '1735626727052'
      position:
        x: 27.937620977678364
        y: 292
      positionAbsolute:
        x: 27.937620977678364
        y: 292
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 243
    - data:
        desc: ''
        outputs:
        - value_selector:
          - '1735633258714'
          - final_text
          variable: text
        selected: false
        title: End
        type: end
      height: 88
      id: '1735628174959'
      position:
        x: 1198.1485031082693
        y: 292
      positionAbsolute:
        x: 1198.1485031082693
        y: 292
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 243
    - data:
        code: "import json\n\
          import re\n\n\
          def main(text, analysis_result):\n\
          \    # Clean analysis_result: strip any markdown code-fence markers\n\
          \    analysis_text = analysis_result.strip()\n\
          \    if analysis_text.startswith('```json'):\n\
          \        analysis_text = analysis_text[7:]\n\
          \    if analysis_text.endswith('```'):\n\
          \        analysis_text = analysis_text[:-3]\n\n\
          \    # Parse with json.loads rather than eval\n\
          \    analysis = json.loads(analysis_text.strip())\n\
          \    text_type = analysis.get('text_type')\n\
          \    structure = analysis.get('structure', {})\n\
          \    target_length = 2000  # target of 2000 characters per segment\n\
          \    max_length = target_length * 1.1  # allow up to 10% overshoot\n\n\
          \    def get_split_point_checker(text_type, structure):\n\
          \        if text_type == 'law_text':\n\
          \            def is_split_point(line, current_length):\n\
          \                # Split strictly at part/chapter headings\n\
          \                return line.strip().startswith('第') and ('章' in line or '编' in line)\n\
          \        else:\n\
          \            def is_split_point(line, current_length):\n\
          \                # Split strictly at case boundaries\n\
          \                return (line.strip().startswith('案例') or\n\
          \                        line.strip().endswith('案') or\n\
          \                        line.strip().endswith('判决书') or\n\
          \                        line.strip().endswith('裁定书'))\n\n\
          \        return is_split_point\n\n\
          \    is_split_point = get_split_point_checker(text_type, structure)\n\n\
          \    # Preprocess: collapse runs of blank lines into a single blank line\n\
          \    raw_lines = text.split('\\n')\n\
          \    lines = []\n\
          \    prev_line_empty = True\n\n\
          \    for line in raw_lines:\n\
          \        line = line.rstrip()\n\
          \        is_empty = not line.strip()\n\n\
          \        # Keep meaningful blank lines but avoid consecutive ones\n\
          \        if not is_empty or not prev_line_empty:\n\
          \            lines.append(line)\n\
          \        prev_line_empty = is_empty\n\n\
          \    raw_segments = []  # segments before cleanup\n\
          \    current_segment = []\n\
          \    current_length = 0\n\n\
          \    for i, line in enumerate(lines):\n\
          \        line_length = len(line.strip())\n\n\
          \        # If the current segment is empty, add the line directly\n\
          \        if not current_segment:\n\
          \            current_segment.append(line)\n\
          \            current_length = line_length\n\
          \            continue\n\n\
          \        # Check whether this line is a split point\n\
          \        if is_split_point(line, current_length):\n\
          \            # Close the current segment and start a new one here\n\
          \            if current_segment:\n\
          \                raw_segments.append('\\n'.join(current_segment))\n\
          \            current_segment = [line]\n\
          \            current_length = line_length\n\
          \            continue\n\n\
          \        # Force a split if the maximum length would be exceeded\n\
          \        if current_length + line_length > max_length:\n\
          \            raw_segments.append('\\n'.join(current_segment))\n\
          \            current_segment = [line]\n\
          \            current_length = line_length\n\
          \        else:\n\
          \            current_segment.append(line)\n\
          \            current_length += line_length\n\n\
          \    # Append the final segment\n\
          \    if current_segment:\n\
          \        raw_segments.append('\\n'.join(current_segment))\n\n\
          \    # Ensure at least one segment is returned\n\
          \    if not raw_segments:\n\
          \        raw_segments = [text]\n\n\
          \    # Normalize into an array of strings\n\
          \    segments = []\n\
          \    for segment in raw_segments:\n\
          \        if segment.strip():  # skip empty segments\n\
          \            segments.append(segment.strip())\n\n\
          \    # If no valid segment remains, return the whole text as a single segment\n\
          \    if not segments:\n\
          \        segments = [text.strip()]\n\n\
          \    # Return a dict containing the segments array\n\
          \    return {\n\
          \        \"segments\": segments\n\
          \    }"
        code_language: python3
        desc: 'Code node.
          Splits the text to fit the length an LLM can output in one response.
          Also emits an array for the iteration node to process in parallel.'
        outputs:
          segments:
            children: null
            type: array[string]
        selected: false
        title: Text Splitting
        type: code
        variables:
        - value_selector:
          - '1735626727052'
          - legal_text
          variable: text
        - value_selector:
          - '1735632132331'
          - text
          variable: analysis_result
      height: 144
      id: '1735629665760'
      position:
        x: 730.6999387678718
        y: 58.97775019592794
      positionAbsolute:
        x: 730.6999387678718
        y: 58.97775019592794
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 243
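    - data:
        context:
          enabled: true
          variable_selector:
          - '1735626727052'
          - legal_text
        desc: Uses an LLM to classify the text as a statute or a case and returns JSON
        model:
          completion_params:
            temperature: 0.3
          mode: chat
          name: deepseek-chat
          provider: deepseek
        prompt_template:
        - edition_type: basic
          id: 8a14522d-d436-4e39-96cc-770cd9daee9e
          role: system
          text: "You are a professional legal text analysis assistant.\
            \ Analyze the input text, determine whether it is a statute or a legal case,\
            \ and analyze its structural features.\n\n\
            Return JSON directly (do not include any markdown markup). It must contain the following fields:\n\
            {\n\
            \  \"text_type\": \"law_text/case_text\",  // Text type: law_text for statutes, case_text for legal cases\n\
            \  \"confidence\": number,  // Confidence of the classification (0-1)\n\
            \  \"reason\": \"string\",  // Basis for the classification\n\
            \  \"structure\": {  // Structural features\n\
            \    \"has_chapters\": boolean,  // Whether the text is divided into chapters\n\
            \    \"has_articles\": boolean,  // Whether the text has numbered articles\n\
            \    \"has_case_number\": boolean,  // Whether the text has a case number\n\
            \    \"has_court_parts\": boolean,  // Whether the text has a court hearing section\n\
            \    \"main_sections\": [\"string\"]  // Main section types\n\
            \  },\n\
            \  \"split_strategy\": ...(过长已截断)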