Eedi LLM Distillation 02: Filling Missing Misconception Explanations with GPT-4o
Table of Contents
- 0. Competition Roundup for This Column
- 1. Purpose of This Article
- 2. AI Engineering Architecture
- 3. Data Preprocessing Module
- 3.1 Configure the Data Path and Processing Parameters
- 3.2 Configure the API Parameters
- 3.3 Configure the Output Paths
- 4. AI Parallel Processing Module
- 4.1 Define the LLM Client Class
- 4.2 Define the Row Processing Function
- 4.3 Define the JSON Saving Function
- 4.4 Define the Range Slicing Function
- 4.5 Define the Slice Pairing Function
- 4.6 Define the Filename Sorting Function
- 5. Data Integration Module
- 5.1 Load the Data and Generate Slices
- 5.2 Initialize the LLM Client and Test It
- 5.3 Generate Data with Parallel Processing
- 5.4 Merge the Processed Results
- 5.5 Save the Final Results
0. Competition Roundup for This Column
Kaggle Competition Roundup
1. Purpose of This Article
- In plain terms: the data exploration in the previous article showed that some training rows are missing their incorrect-answer explanations (misconceptions). Here we use GPT-4o with a persona system prompt to generate those missing explanations directly for the training set.
- Skills you can pick up from this article: calling an AI model through its API, a persona prompt-engineering case study, and non-trivial data processing with result caching.
- Previous article: Eedi LLM Distillation 01 – Competition Overview and Data Understanding
2. AI Engineering Architecture
3. Data Preprocessing Module
3.1 Configure the Data Path and Processing Parameters
```python
data_path = "~/work/eedi_synthetic_data/MalAlgoQA_format.csv"

index_start = 0       # first row to process
index_end = len(df)   # last row to process; df is loaded in section 5.1
step = 100            # rows per batch (one cache file per batch)
max_workers = 2       # number of parallel worker processes
```
3.2 Configure the API Parameters

```python
model_config = dict(
    openai_api_base="https://testshellapi.kimi.asia/v1",
    api_key="****",  # redacted
    model="gpt-4o",
    default_system_prompt="""## Task
You are a Mathematics teacher. Your task is to reason and identify the ConstructName and SubjectName and then the misconception behind the user input Incorrect Answers with the Question.
ConstructName is the most granular level of knowledge related to the question; it describes the specific mathematical method or procedure used to solve it, i.e. the technique or approach needed to reach the answer.
SubjectName is more general context than the construct; it represents the broader mathematical topic or category that the question belongs to.
Misconceptions are mistakes in conceptual understanding, and they relate to all the applications of those concepts. For example, a single misconception about the connections among proportional relationships (part/whole, part/part, whole/part) can cause problems in identifying those patterns in drawings, and can be the cause of failing to realize all parts must be of equal size, therefore associating the denominator of the fraction with the total number of parts regardless of their size.
Answer concisely what misconception leads to the incorrect answer.
Do not use "The misconception is" to start your answers.
Do not mention the concrete details of the question or answers.

## User input
Question: The question text
A: multiple choice answer A text
B: multiple choice answer B text
C: multiple choice answer C text
D: multiple choice answer D text
Correct Answer: The correct answer text

## You should answer in the following JSON format
{"ConstructName": "here writes the ConstructName",
 "SubjectName": "here writes the SubjectName",
 "MisconceptionAName": "here writes the answer A's misconception.",
 "MisconceptionBName": "here writes the answer B's misconception.",
 "MisconceptionCName": "here writes the answer C's misconception.",
 "MisconceptionDName": "here writes the answer D's misconception."}""",
    default_temperature=0.5,
    max_tokens=256,
)
```
3.3 Configure the Output Paths

```python
import os

# Per-batch results are cached here so an interrupted run can resume.
cache_folder = f"./cache_{model_config['model']}_model_misconceptions_result"
os.makedirs(cache_folder, exist_ok=True)

# The final CSV name combines the input file name and the model name.
output_data_path = (
    f"misconception_data_{os.path.splitext(os.path.basename(data_path))[0]}"
    f"_{model_config['model']}.csv"
)
```
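As a quick sanity check of this naming scheme, the standalone snippet below shows what `os.path.splitext(os.path.basename(...))` produces for the configured path (the model name is hard-coded here for illustration):

```python
import os

data_path = "~/work/eedi_synthetic_data/MalAlgoQA_format.csv"
model = "gpt-4o"

# basename strips the directories, splitext separates the extension
stem = os.path.splitext(os.path.basename(data_path))[0]
output_data_path = f"misconception_data_{stem}_{model}.csv"

print(stem)              # MalAlgoQA_format
print(output_data_path)  # misconception_data_MalAlgoQA_format_gpt-4o.csv
```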
4. AI Parallel Processing Module
4.1 Define the LLM Client Class

```python
from openai import OpenAI

class LLMChat:
    def __init__(self, openai_api_base, api_key, model,
                 default_temperature, default_system_prompt, max_tokens=512):
        self.client = OpenAI(api_key=api_key, base_url=openai_api_base)
        self.model = model
        self.default_temperature = default_temperature
        self.default_system_prompt = default_system_prompt
        self.max_tokens = max_tokens

    def chat(self, user_prompt, system_prompt=None, temperature=None):
        if not system_prompt:
            system_prompt = self.default_system_prompt
        if temperature is None:  # `is None`, so temperature=0 is respected
            temperature = self.default_temperature
        chat_response = self.client.chat.completions.create(
            model=self.model,
            temperature=temperature,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            max_tokens=self.max_tokens,
            response_format={"type": "json_object"},  # request strict JSON output
        )
        return chat_response.choices[0].message.content
```
4.2 Define the Row Processing Function

```python
def process_row(args, debug=False):
    """Build the user prompt for one DataFrame row and query the model.

    Relies on the global LLM client `vc` (created in section 5.2) and on
    `model_config`. Returns the model's JSON string on success, or a dict
    with an 'error' key if the request failed.
    """
    user_prompt = """Question: {question}
A: {answer_a}
B: {answer_b}
C: {answer_c}
D: {answer_d}
Correct Answer: {correct_answer}"""
    index, row = args
    ca = row["CorrectAnswer"]
    correct_answer = row[f"Answer{ca}Text"]
    input_user_prompt = user_prompt.format(
        question=row["QuestionText"],
        answer_a=row["AnswerAText"],
        answer_b=row["AnswerBText"],
        answer_c=row["AnswerCText"],
        answer_d=row["AnswerDText"],
        correct_answer=correct_answer,
    )
    ret_data = {}
    try:
        ret_data = vc.chat(input_user_prompt)  # JSON string on success
        if debug:
            print(ret_data + "\n")
    except Exception as e:
        print(f"An exception occurred: {e}")
        ret_data["error"] = str(e)
    if debug:
        print("system: ", model_config["default_system_prompt"])
        print(">" * 50)
        print("user_input: ", input_user_prompt)
        print(">" * 50)
        print("assistant: ", ret_data)
    return ret_data
```
4.3 Define the JSON Saving Function

```python
import json

def save_json(fn, obj):
    with open(fn, "w") as f:
        json.dump(obj, f, ensure_ascii=False, indent=4)
    print(f"saved file to {fn}")
```
4.4 Define the Range Slicing Function

```python
def slice_range(start, end, step):
    """Return boundary points [start, start+step, ...], always ending at `end`."""
    if step <= 0:
        raise ValueError("step must be greater than 0")
    result = []
    while start <= end:
        result.append(start)
        start += step
    if result[-1] < end:
        result.append(end)  # make sure the final partial batch is kept
    return result
```
4.5 Define the Slice Pairing Function

```python
def process_pairs(sliced_range):
    """Turn boundary points into [start, end) index pairs for df.iloc slicing."""
    slices = []
    for first, second in zip(sliced_range, sliced_range[1:]):
        slices.append([first, second])
    return slices
```
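A quick standalone check of how these two helpers split, say, 250 rows into batches of 100 (the functions are re-declared here so the snippet runs on its own):

```python
def slice_range(start, end, step):
    if step <= 0:
        raise ValueError("step must be greater than 0")
    result = []
    while start <= end:
        result.append(start)
        start += step
    if result[-1] < end:
        result.append(end)
    return result

def process_pairs(sliced_range):
    return [[first, second] for first, second in zip(sliced_range, sliced_range[1:])]

boundaries = slice_range(0, 250, 100)
pairs = process_pairs(boundaries)
print(boundaries)  # [0, 100, 200, 250]
print(pairs)       # [[0, 100], [100, 200], [200, 250]]
```

Note the final pair `[200, 250]` is shorter than `step`, so the last partial batch is not dropped.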
4.6 Define the Filename Sorting Function

```python
import re

def natural_sort_key(filename):
    """Sort cache files by the integers embedded in their names."""
    parts = re.findall(r"\d+", filename)
    return tuple(map(int, parts))
```
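Plain string sorting would put `cache_res_1000.json` before `cache_res_200.json`, which would scramble the row order when the batches are concatenated; the numeric key fixes that. A standalone check with illustrative filenames:

```python
import re

def natural_sort_key(filename):
    return tuple(map(int, re.findall(r"\d+", filename)))

names = ["cache_res_1000.json", "cache_res_0.json", "cache_res_200.json"]

lexicographic = sorted(names)
numeric = sorted(names, key=natural_sort_key)
print(lexicographic)  # ['cache_res_0.json', 'cache_res_1000.json', 'cache_res_200.json']
print(numeric)        # ['cache_res_0.json', 'cache_res_200.json', 'cache_res_1000.json']
```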
5. Data Integration Module
5.1 Load the Data and Generate Slices

```python
import pandas as pd

df = pd.read_csv(data_path)
df.head()

sliced_range = process_pairs(slice_range(index_start, index_end, step))
```
A quick look at `df`:
5.2 Initialize the LLM Client and Test It

```python
vc = LLMChat(**model_config)

# Smoke-test on a single row before launching the full run.
r = process_row((7, df.iloc[7]), debug=True)
```
5.3 Generate Data with Parallel Processing

```python
from concurrent.futures import ProcessPoolExecutor
from tqdm import tqdm

for slices in tqdm(sliced_range, total=len(sliced_range)):
    output_filepath = f"{cache_folder}/cache_res_{slices[0]}.json"
    # Resume support: skip any batch that already has a cache file.
    if os.path.exists(output_filepath):
        print(f"cache file exists, skip {output_filepath}")
        continue
    df_tasks = df.iloc[slices[0]:slices[1]]
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        results = list(tqdm(executor.map(process_row, df_tasks.iterrows()),
                            total=len(df_tasks)))
    save_json(output_filepath, results)
```
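The batch-plus-cache pattern above is worth isolating: each batch writes one JSON file named after its start index, and a rerun skips finished batches instead of re-paying for API calls. Below is a minimal standalone sketch of that pattern, with a toy `work` function and a `ThreadPoolExecutor` standing in for the real LLM calls and `ProcessPoolExecutor`:

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def work(x):
    return x * x  # stand-in for the per-row LLM call

data = list(range(10))
batches = [[0, 4], [4, 8], [8, 10]]
cache_dir = tempfile.mkdtemp()

for _ in range(2):  # the second pass hits the cache and does no work
    for start, end in batches:
        path = os.path.join(cache_dir, f"cache_res_{start}.json")
        if os.path.exists(path):
            print(f"cache file exists, skip {path}")
            continue
        with ThreadPoolExecutor(max_workers=2) as ex:
            results = list(ex.map(work, data[start:end]))
        with open(path, "w") as f:
            json.dump(results, f)

# Merge the cache files, ordered by start index as section 5.4 does.
merged = []
for start, _ in batches:
    with open(os.path.join(cache_dir, f"cache_res_{start}.json")) as f:
        merged.extend(json.load(f))
print(merged)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```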
5.4 Merge the Processed Results

```python
import glob

f_names = glob.glob(f"{cache_folder}/*.json")
f_names = sorted(f_names, key=natural_sort_key)

results = []
for fn in f_names:
    with open(fn, "r") as f:
        batch_results = json.load(f)
    results.extend(batch_results)
l = len(results)

# Successful entries are JSON strings from the model; failed ones are
# already dicts with an 'error' key, so only parse the strings.
results = [json.loads(r) if isinstance(r, str) else r for r in results]
```
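A standalone illustration of that parsing step, with one simulated failure entry (the sample strings are made up):

```python
import json

results = [
    '{"ConstructName": "Adding fractions", "SubjectName": "Fractions"}',
    {"error": "timeout"},  # a row whose API call raised an exception
]

# Strings came back from the model and need json.loads;
# error dicts are passed through untouched.
parsed = [json.loads(r) if isinstance(r, str) else r for r in results]
print(parsed[0]["ConstructName"])  # Adding fractions
print(parsed[1])                   # {'error': 'timeout'}
```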
5.5 Save the Final Results

```python
# Trim df to the number of generated rows, then attach the new
# misconception columns side by side and write the final CSV.
df = df.iloc[:l]
gen_df = pd.DataFrame(results)
df = pd.concat([df, gen_df], axis=1)
df.to_csv(output_data_path, index=False)
```
(To be continued)