A Python script that calls DeepSeek to rewrite articles in an imitated style

Published February 14, 2025

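Background

A reader recently messaged me to ask whether I could build a tool that automatically rewrites articles in an imitated style, and walked me through his current workflow in detail. At the moment he does it by hand: he feeds DeepSeek 1 to N articles so it can analyze their writing style, then gives it a target article and asks for a new article written in that style. He then adds a few images, tidies the formatting, and publishes the result on his platform.

Automating this with a Python script that calls the DeepSeek API is entirely doable.

The advantage of a Python script is that it can be customized and extended, for example to generate articles in batches. The implementation is described below.

The official DeepSeek site often reports that the system is busy, which makes for a poor experience. Fortunately DeepSeek is open source, and there are plenty of free deployment guides online, so I won't go into them here. What I recommend today is the DeepSeek API offered through Alibaba Cloud Bailian (阿里云百炼).

If you are interested, see the Alibaba Cloud Bailian platform.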

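Implementation logic

Accept multiple article URLs as style references
Fetch the content of each article automatically
Call DeepSeek to analyze the writing style of those articles
Accept a target article URL
Fetch the target article's content
Call DeepSeek to rewrite the target article in the analyzed style
Review the result in the role of chief editor and revise it as needed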
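
The steps above amount to three chained model calls. A minimal sketch of that data flow, assuming the model call is passed in as a plain function (`ask` and `run_pipeline` are hypothetical names used only for illustration) so the chain can be dry-run without an API key or network access:

```python
from typing import Callable, List


def run_pipeline(style_texts: List[str], target_text: str,
                 ask: Callable[[str], str]) -> str:
    """Chain the three model calls: analyze style, rewrite, review.

    `ask` is a hypothetical stand-in for the real model call: it takes a
    prompt string and returns the model's reply.
    """
    # Steps 1-3: describe the writing style of the reference articles.
    style = ask("Analyze the writing style of these articles:\n\n"
                + "\n\n".join(style_texts))
    # Steps 4-6: rewrite the target article in that style.
    draft = ask(f"Writing style:\n{style}\n\n"
                f"Rewrite this article in that style:\n{target_text}")
    # Step 7: a final editorial pass over the draft.
    return ask(f"As chief editor, review this draft against the style.\n"
               f"Style:\n{style}\n\nDraft:\n{draft}")


# Dry run with a fake model that only counts the calls it receives.
calls = []


def fake_ask(prompt: str) -> str:
    calls.append(prompt)
    return f"reply-{len(calls)}"


result = run_pipeline(["article A", "article B"], "target article", fake_ask)
print(result)
```

Each stage's output becomes part of the next stage's prompt, which is why the style analysis is passed to both the rewrite and the review.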

Code

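import os
from openai import OpenAI  # OpenAI SDK, used here against an OpenAI-compatible endpoint
from dotenv import load_dotenv
import requests
import html2text

# Load environment variables from a .env file, if present
load_dotenv()
# Read the API key from the environment (obtain it from Alibaba Cloud Bailian)
api_key = os.getenv("api_key", "your-own-key")
openai_client = OpenAI(
    api_key=api_key,
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"  # Alibaba Cloud Bailian (DashScope) OpenAI-compatible endpoint
)
model = "deepseek-r1"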


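def get_article_content(url):
    """Download a page and return its content as plain text."""
    # A timeout keeps one slow page from hanging the whole run
    response = requests.get(url, timeout=30)
    # Raise on HTTP errors so the caller's try/except can skip this URL
    response.raise_for_status()
    html_content = response.text

    # Use html2text to convert the HTML to plain text
    h = html2text.HTML2Text()
    h.ignore_links = True
    h.ignore_images = True
    h.ignore_emphasis = True
    h.ignore_strong = True
    h.ignore_br = True
    return h.handle(html_content)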

class ContentProcessor:
    @staticmethod
    def extract_articles(urls):
        """Extract article text from multiple URLs."""
        articles = []
        for url in urls:
            try:
                text = get_article_content(url)
                articles.append(text)
            except Exception as e:
                print(f"Error processing {url}: {str(e)}")
        return articles

class WritingAnalyst:
    @staticmethod
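    def analyze_style(texts):
        """Analyze the writing techniques used in the articles."""
        combined_text = "\n\n".join(texts)[:5000]  # cap the prompt length at 5,000 characters

        prompt = f"""Analyze the writing techniques of the following articles, including but not limited to:
        1. Overall structure and layout
        2. Characteristics of the language style
        3. Use of rhetorical devices
        4. How paragraphs are linked
        5. Other notable features

        Article content:
        {combined_text}

        List the main writing techniques in a clear, well-organized way:"""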

        response = openai_client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

class ContentCreator:
    @staticmethod
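    def rewrite_content(source_content, writing_style):
        """Rewrite the source content following the analyzed writing techniques."""
        prompt = f"""Based on the following writing techniques:
        {writing_style}

        Rewrite the following content:
        {source_content}

        Requirements:
        1. Keep the core information unchanged
        2. Strictly follow the analyzed writing techniques
        3. The output should read fluently and naturally
        4. Keep the length comparable to the original"""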

        response = openai_client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

class ChiefEditor:
    @staticmethod
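    def review_content(original_style, new_content):
        """Chief editor's review and correction."""
        prompt = f"""As chief editor, review the following content:
        {new_content}

        Required writing techniques:
        {original_style}

        Check:
        1. Whether it follows the specified writing techniques
        2. Whether it contains factual errors
        3. Whether the language reads smoothly
        4. Whether the logic is sound

        If you find problems, give the revised version directly, with no extra explanation:"""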

        response = openai_client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

class AIAgent:
    def __init__(self):
        self.processor = ContentProcessor()
        self.analyst = WritingAnalyst()
        self.creator = ContentCreator()
        self.editor = ChiefEditor()

    def process(self, original_urls, new_source):
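        # Fetch the reference articles
        original_texts = self.processor.extract_articles(original_urls)
        if not original_texts:
            return "Could not fetch the reference articles"

        # Analyze the writing techniques
        writing_style = self.analyst.analyze_style(original_texts)
        print(f"\nWriting techniques found:\n{writing_style}\n{'-'*50}")

        # Load the new source: a URL is fetched, anything else is used as raw text
        if new_source.startswith('http'):
            new_content = self.processor.extract_articles([new_source])
            new_content = new_content[0] if new_content else ""
        else:
            new_content = new_source
        if not new_content.strip():
            # Without source text the model would have nothing to rewrite
            return "Could not fetch the target article"

        # Rewrite in the analyzed style
        draft = self.creator.rewrite_content(new_content, writing_style)
        print(f"\nFirst draft:\n{draft}\n{'-'*50}")

        # Chief editor's review
        final_content = self.editor.review_content(writing_style, draft)
        return final_content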

if __name__ == "__main__":
    agent = AIAgent()

    # Example usage
    original_urls = [
        "https://mp.weixin.qq.com/s/mPv2xd40ZIGYhFLEWw7aYA",
        "https://mp.weixin.qq.com/s/9KhKZNjpqfDWG5j_CvmsbA"
    ]

    new_source = "https://mp.weixin.qq.com/s/Hb_Eg5ERSmljzfofKsyz5Q"

    result = agent.process(original_urls, new_source)
    print("\nFinal reviewed content:\n" + result)
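
One thing to watch in `analyze_style`: joining all reference articles and then cutting at 5,000 characters means a long first article can crowd out the later ones entirely. A possible alternative, sketched here with a hypothetical helper `build_style_sample` (not part of the script above), gives each article an equal share of the character budget:

```python
from typing import List


def build_style_sample(texts: List[str], limit: int = 5000,
                       sep: str = "\n\n") -> str:
    """Join articles so that each gets an equal share of `limit` characters."""
    # Drop empty or whitespace-only articles.
    texts = [t.strip() for t in texts if t and t.strip()]
    if not texts:
        return ""
    # Reserve room for the separators, then split the rest evenly.
    budget = limit - len(sep) * (len(texts) - 1)
    share = max(budget // len(texts), 0)
    return sep.join(t[:share] for t in texts)


sample = build_style_sample(["a" * 8000, "b" * 8000], limit=100)
print(len(sample), sample.count("a"), sample.count("b"))
```

This sketch does not redistribute the unused share of short articles, so the result can be shorter than `limit`. It would replace only the `"\n\n".join(texts)[:5000]` expression; the rest of `analyze_style` stays the same.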

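Possible improvements

1. Add RAG (retrieval-augmented generation) to improve the quality of the output.
2. Add batch processing, for example reading the style references and target articles from an Excel file to generate articles in bulk.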
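
The batch-processing idea above could look roughly like this. The sketch reads a CSV file (which Excel can export) rather than an .xlsx workbook so that it needs only the standard library. The column names `style_urls` and `target_url`, the `|` separator, and the helpers `load_jobs` and `run_batch` are all assumptions made for illustration; `process` stands for something like `AIAgent().process` from the script above.

```python
import csv
import io
from typing import Callable, Iterable, List, Tuple

Job = Tuple[List[str], str]


def load_jobs(csv_file: Iterable[str]) -> List[Job]:
    """Read (style URLs, target URL) pairs from CSV rows.

    Assumed columns: `style_urls` (URLs separated by "|") and `target_url`.
    """
    jobs = []
    for row in csv.DictReader(csv_file):
        raw_urls = (row.get("style_urls") or "").split("|")
        urls = [u.strip() for u in raw_urls if u.strip()]
        target = (row.get("target_url") or "").strip()
        if urls and target:  # skip incomplete rows
            jobs.append((urls, target))
    return jobs


def run_batch(jobs: List[Job],
              process: Callable[[List[str], str], str]) -> List[str]:
    """Run `process` on every job; one failure does not stop the batch."""
    results = []
    for urls, target in jobs:
        try:
            results.append(process(urls, target))
        except Exception as exc:  # keep going, record the error instead
            results.append(f"ERROR: {exc}")
    return results


# Demo with an in-memory CSV and a fake `process` function.
demo_csv = io.StringIO(
    "style_urls,target_url\n"
    "https://example.com/a|https://example.com/b,https://example.com/t1\n"
    ",https://example.com/t2\n"
)
jobs = load_jobs(demo_csv)
results = run_batch(jobs, lambda urls, target: f"{len(urls)} refs -> {target}")
print(results)
```

With a real file, the same function can be fed an open file handle, for example `with open("jobs.csv", newline="", encoding="utf-8") as f: jobs = load_jobs(f)`, where `jobs.csv` is a placeholder name.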

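Disclaimer

Scraping article content directly with a Python crawler may violate the relevant terms of use and may even break the law. The author accepts no responsibility for any legal consequences; you proceed at your own risk.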

Finally

All software has bugs.

Permalink: https://www.h3blog.com/article/562/