Building an Intelligent Q&A System with SpringBoot, SpringAI, and Ollama

2025-06-30 11:26 | Category: Development | Author: 码农阿豪@新空间

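Introduction

With AI technology advancing rapidly, large language models (LLMs) have become an indispensable part of the developer's toolbox. Relying on cloud API services, however, raises data-privacy concerns and can incur high costs. This article shows how to combine SpringBoot and the SpringAI framework with Ollama's local model service to build an intelligent Q&A system that runs entirely on a local Windows machine.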

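Technology Stack Overview

SpringBoot and SpringAI

SpringBoot, the most popular application framework in the Java ecosystem, makes it quick to build production-grade applications. SpringAI is a newer member of the Spring ecosystem designed specifically for AI integration; it simplifies interaction with various large language models behind a unified API.

Ollama Local Model Service

Ollama is an open-source project that lets developers run and manage large language models locally. It supports a range of open models, including Llama and Mistral, and exposes a simple API. With Ollama, we can use capable language models without depending on an internet connection.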

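Environment Setup

Hardware requirements

Windows 10/11

At least 16 GB RAM (32 GB or more recommended)

NVIDIA GPU (optional; accelerates inference)

Software installation

1. Install Ollama:

Download the Windows installer from the Ollama website (https://ollama.ai) and run it

2. Verify the Ollama installation: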

ollama list

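Project Setup

Creating the SpringBoot project

Create the project with Spring Initializr (https://start.spring.io) and select the following dependencies:

Spring Web
Lombok
Spring AI (add manually if not listed)

Configuring pom.xml

Make sure the SpringAI Ollama dependency is included: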

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>0.8.1</version>
</dependency>

Application configuration

application.yml:

spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        model: deepseek
        options:
          temperature: 0.7
          top-p: 0.9

Core Feature Implementation

Q&A service layer

Create the QAService class:

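import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

@Service
public class QAService {
    
    private final OllamaChatClient chatClient;
    
    public QAService(OllamaChatClient chatClient) {
        this.chatClient = chatClient;
    }
    
    // Blocking call: returns the complete answer once generation finishes
    public String generateAnswer(String prompt) {
        return chatClient.call(prompt);
    }
    
    // Streaming call: emits the answer piece by piece as it is generated
    public Flux<String> generateStreamAnswer(String prompt) {
        return chatClient.stream(prompt);
    }
}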
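
For orientation, it can help to see roughly what a client such as OllamaChatClient sends over the wire. The following is a minimal sketch, not SpringAI's actual implementation: it assumes Ollama's default port 11434 and its /api/generate endpoint, and the class OllamaRawCall and its helper methods are hypothetical names introduced only for this illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class OllamaRawCall {

    // Escapes the characters that must not appear raw inside a JSON string.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.toString();
    }

    // Builds the JSON body for a non-streaming generate call.
    static String buildGenerateBody(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model)
                + "\",\"prompt\":\"" + jsonEscape(prompt)
                + "\",\"stream\":false}";
    }

    // Builds (but does not send) the POST request against the assumed endpoint.
    static HttpRequest buildRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildGenerateBody(model, prompt)))
                .build();
    }
}
```

With Ollama running, the request could be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. In the actual project, SpringAI handles this plumbing.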

Controller implementation

QAController.java:

@RestController
@RequestMapping("/api/qa")
public class QAController {
    
    private final QAService qaService;
    
    public QAController(QAService qaService) {
        this.qaService = qaService;
    }
    
    @PostMapping("/ask")
    public ResponseEntity<String> askQuestion(@RequestBody String question) {
        String answer = qaService.generateAnswer(question);
        return ResponseEntity.ok(answer);
    }
    
    @GetMapping(value = "/ask-stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> askQuestionStream(@RequestParam String question) {
        return qaService.generateStreamAnswer(question);
    }
}

Prompt engineering

To improve answer quality, we can introduce a prompt template:

PromptTemplateService.java:

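@Service
public class PromptTemplateService {
    
    private static final String QA_TEMPLATE = """
            You are a professional AI assistant. Answer the question according to these rules:
            1. Be professional and accurate
            2. If the question involves uncertain information, say so explicitly
            3. Keep the answer concise and clear
            
            Question: {question}
            """;
    
    public String buildPrompt(String question) {
        return QA_TEMPLATE.replace("{question}", question);
    }
}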
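
One thing to watch with chained String.replace calls is that text substituted for one placeholder can itself contain another placeholder and be rewritten by a later call. A minimal sketch of a single-pass renderer that avoids this follows; the class name PromptRenderer and its render method are hypothetical and not part of SpringAI.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PromptRenderer {

    // Matches placeholders of the form {name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {name} in one pass; unknown names are left untouched,
    // and substituted values are never scanned again for placeholders.
    static String render(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in user input literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A call such as `PromptRenderer.render(template, Map.of("question", userInput))` could stand in for the replace call in buildPrompt.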

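Update QAService to use the prompt template (this assumes a PromptTemplateService field named promptTemplateService, injected through the constructor):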

public String generateAnswer(String prompt) {
    String formattedPrompt = promptTemplateService.buildPrompt(prompt);
    return chatClient.call(formattedPrompt);
}

Advanced Features

Conversation history management

Implement a simple conversation memory:

ConversationManager.java:

@Service
@Scope(value = WebApplicationContext.SCOPE_SESSION, proxyMode = ScopedProxyMode.TARGET_CLASS)
public class ConversationManager {
    
    private final List<String> conversationHistory = new ArrayList<>();
    
    public void addExchange(String userInput, String aiResponse) {
        conversationHistory.add("User: " + userInput);
        conversationHistory.add("AI: " + aiResponse);
    }
    
    public String getConversationContext() {
        return String.join("\n", conversationHistory);
    }
    
    public void clear() {
        conversationHistory.clear();
    }
}
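
Because the whole history is prepended to every prompt, an unbounded list will eventually exceed the model's context window. One possible way to cap it is sketched below; BoundedHistory is a hypothetical stand-alone class, and the choice to count whole exchanges rather than tokens is an assumption made to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BoundedHistory {

    private final int maxExchanges;
    private final Deque<String> exchanges = new ArrayDeque<>();

    BoundedHistory(int maxExchanges) {
        if (maxExchanges < 1) {
            throw new IllegalArgumentException("maxExchanges must be >= 1");
        }
        this.maxExchanges = maxExchanges;
    }

    // Stores one question/answer pair and evicts the oldest pairs beyond the limit.
    void addExchange(String userInput, String aiResponse) {
        exchanges.addLast("User: " + userInput + "\nAI: " + aiResponse);
        while (exchanges.size() > maxExchanges) {
            exchanges.removeFirst();
        }
    }

    // Returns the retained exchanges, oldest first, joined by newlines.
    String context() {
        return String.join("\n", exchanges);
    }
}
```

The same eviction loop could be folded into ConversationManager.addExchange if a separate class is not wanted.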

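Update the prompt template to include the history. QA_TEMPLATE also needs a {history} placeholder for this to work; without one, the first replace call is a no-op: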

public String buildPrompt(String question, String history) {
    return QA_TEMPLATE.replace("{history}", history)
                     .replace("{question}", question);
}

Document-based Q&A

Implement Q&A over an uploaded document:

DocumentService.java:

@Service
public class DocumentService {
    
    private final ResourceLoader resourceLoader;
    private final TextSplitter textSplitter;
    
    public DocumentService(ResourceLoader resourceLoader) {
        this.resourceLoader = resourceLoader;
        this.textSplitter = new TokenTextSplitter();
    }
    
    public List<String> processDocument(MultipartFile file) throws IOException {
        String content = new String(file.getBytes(), StandardCharsets.UTF_8);
        return textSplitter.split(content);
    }
    
    public String extractRelevantParts(List<String> chunks, String question) {
        // Simplified relevance matching; a real project should use embedding vectors
        return chunks.stream()
                .filter(chunk -> chunk.toLowerCase().contains(question.toLowerCase()))
                .findFirst()
                .orElse("");
    }
}
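
The substring filter above only matches when the entire question appears verbatim in a chunk, which is rare in practice. As a small step up that still avoids embeddings, chunks can be ranked by how many question words they share. The sketch below is one way to do that; KeywordChunkRanker is a hypothetical helper, and its tokenizer assumes text whose words are separated by spaces or punctuation.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class KeywordChunkRanker {

    // Lowercases the text and splits on anything that is not a letter or digit.
    static Set<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase().split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toSet());
    }

    // Counts the distinct question words that also occur in the chunk.
    static long score(String chunk, String question) {
        Set<String> chunkTokens = tokens(chunk);
        return tokens(question).stream().filter(chunkTokens::contains).count();
    }

    // Returns the chunk with the highest overlap, or "" when nothing overlaps.
    static String bestChunk(List<String> chunks, String question) {
        return chunks.stream()
                .filter(c -> score(c, question) > 0)
                .max(Comparator.comparingLong((String c) -> score(c, question)))
                .orElse("");
    }
}
```

This keeps the empty-string fallback of extractRelevantParts, so it could replace the filter there without changing the caller.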

Add the document Q&A endpoint:

@PostMapping(value = "/ask-with-doc", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
public ResponseEntity<String> askWithDocument(
        @RequestParam String question,
        @RequestParam MultipartFile document) throws IOException {
    
    List<String> chunks = documentService.processDocument(document);
    String context = documentService.extractRelevantParts(chunks, question);
    
    String prompt = """
            Answer the question based on the following document content:
            
            Relevant part of the document:
            {context}
            
            Question: {question}
            """.replace("{context}", context)
              .replace("{question}", question);
    
    String answer = qaService.generateAnswer(prompt);
    return ResponseEntity.ok(answer);
}

Front-End Implementation

Simple HTML page

resources/static/index.html:

<!DOCTYPE html>
<html>
<head>
    <title>Local AI Q&amp;A System</title>
    <script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>
</head>
<body>
    <h1>Local Q&amp;A System</h1>
    <div>
        <textarea id="question" rows="4" cols="50"></textarea>
    </div>
    <button onclick="askQuestion()">Ask</button>
    <div id="answer"></div>
    
    <script>
        function askQuestion() {
            const question = document.getElementById('question').value;
            document.getElementById('answer').innerText = "Thinking...";
            
            axios.post('/api/qa/ask', question, {
                headers: { 'Content-Type': 'text/plain' }
            })
            .then(response => {
                document.getElementById('answer').innerText = response.data;
            })
            .catch(error => {
                document.getElementById('answer').innerText = "Error: " + error.message;
            });
        }
    </script>
</body>
</html>

Streaming response page

Add the streaming Q&A HTML:

<div>
    <h2>Streaming Q&amp;A</h2>
    <textarea id="streamQuestion" rows="4" cols="50"></textarea>
    <button onclick="askStreamQuestion()">Ask (streaming)</button>
    <div id="streamAnswer"></div>
</div>

<script>
    function askStreamQuestion() {
        const question = document.getElementById('streamQuestion').value;
        const answerDiv = document.getElementById('streamAnswer');
        answerDiv.innerText = "";
        
        const eventSource = new EventSource(`/api/qa/ask-stream?question=${encodeURIComponent(question)}`);
        
        eventSource.onmessage = function(event) {
            answerDiv.innerText += event.data;
        };
        
        eventSource.onerror = function() {
            eventSource.close();
        };
    }
</script>
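
When testing the streaming endpoint from Java rather than from a browser, the raw response arrives as server-sent events, where each payload sits on a line starting with "data:". A minimal sketch of extracting those payloads follows; SseLines is a hypothetical helper that handles only data lines and ignores event names, ids, and multi-line events.

```java
import java.util.ArrayList;
import java.util.List;

final class SseLines {

    // Collects the payload of every "data:" line; a single space after the colon is dropped.
    static List<String> dataPayloads(String rawStream) {
        List<String> payloads = new ArrayList<>();
        for (String line : rawStream.split("\r?\n")) {
            if (line.startsWith("data:")) {
                String value = line.substring(5);
                payloads.add(value.startsWith(" ") ? value.substring(1) : value);
            }
        }
        return payloads;
    }
}
```

Joining the returned list reproduces what the EventSource handler above appends to the page.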

Performance Tuning and Debugging

Model parameter tuning

Adjust the model parameters in application.yml:

spring:
  ai:
    ollama:
      chat:
        options:
          temperature: 0.5  # controls creativity (0-1)
          top-p: 0.9        # nucleus sampling threshold
          num-predict: 512  # maximum number of tokens to generate
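
To build some intuition for the temperature setting, the sketch below applies the textbook definition of temperature-scaled softmax to a few made-up logits. It illustrates the general idea only and is not Ollama's internal sampling code; TemperatureDemo and the sample values are hypothetical.

```java
final class TemperatureDemo {

    // Softmax over logits divided by the temperature.
    // A lower temperature concentrates probability on the highest logit.
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be > 0");
        }
        double max = Double.NEGATIVE_INFINITY;
        for (double z : logits) {
            max = Math.max(max, z / temperature);
        }
        double sum = 0;
        double[] probs = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            // Subtracting the maximum keeps exp() numerically stable.
            probs[i] = Math.exp(logits[i] / temperature - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }
}
```

Comparing `softmax(logits, 0.5)` with `softmax(logits, 1.0)` for the same logits shows the top candidate gaining probability at the lower temperature, which is why lower values give more deterministic answers.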

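Logging

Configure logging to monitor AI interactions. Treat the snippet as a sketch: it models OllamaApi as an interface and borrows a Feign-style logger level, which may not match the SpringAI version in use, so adapt it to the actual API: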

@Configuration
public class LoggingConfig {
    
    @Bean
    public Logger.Level feignLoggerLevel() {
        return Logger.Level.FULL;
    }
    
    @Bean
    public OllamaApi ollamaApi(Client client, ObjectProvider<HttpMessageConverterCustomizer> customizers) {
        return new OllamaApiInterceptor(new OllamaApi(client, customizers));
    }
}

class OllamaApiInterceptor implements OllamaApi {
    
    private static final Logger log = LoggerFactory.getLogger(OllamaApiInterceptor.class);
    private final OllamaApi delegate;
    
    public OllamaApiInterceptor(OllamaApi delegate) {
        this.delegate = delegate;
    }
    
    @Override
    public GenerateResponse generate(GenerateRequest request) {
        log.info("Ollama request: {}", request);
        GenerateResponse response = delegate.generate(request);
        log.debug("Ollama response: {}", response);
        return response;
    }
}

Timeout settings

Configure the connection and read timeouts:

spring:
  ai:
    ollama:
      client:
        connect-timeout: 30s
        read-timeout: 5m
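
The two values play different roles: the connect timeout bounds how long establishing the connection may take, while the read timeout bounds the wait for the model's response, which is why it is much longer. As a neutral illustration that does not depend on SpringAI property names, the sketch below sets the equivalent limits on the JDK HttpClient; TimeoutSketch is a hypothetical class, and /api/tags is Ollama's model-listing endpoint.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

final class TimeoutSketch {

    // Connect timeout: how long to wait for the connection to be established.
    static HttpClient client() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(30))
                .build();
    }

    // Request timeout: how long to wait for the response; generous because generation is slow.
    static HttpRequest request(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/tags"))
                .timeout(Duration.ofMinutes(5))
                .GET()
                .build();
    }
}
```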

Security Hardening

API authentication

Add simple API-key authentication:

SecurityConfig.java:

@Configuration
@EnableWebSecurity
public class SecurityConfig {
    
    @Value("${app.api-key}")
    private String apiKey;
    
    @Bean
    public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(auth -> auth
                .requestMatchers("/api/**").authenticated()
                .anyRequest().permitAll()
            )
            .addFilterBefore(new ApiKeyFilter(apiKey), UsernamePasswordAuthenticationFilter.class)
            .csrf().disable();
        return http.build();
    }
}

class ApiKeyFilter extends OncePerRequestFilter {
    
    private final String expectedApiKey;
    
    public ApiKeyFilter(String expectedApiKey) {
        this.expectedApiKey = expectedApiKey;
    }
    
    @Override
    protected void doFilterInternal(HttpServletRequest request, 
                                   HttpServletResponse response, 
                                   FilterChain filterChain) throws ServletException, IOException {
        String apiKey = request.getHeader("X-API-KEY");
        
        if (!expectedApiKey.equals(apiKey)) {
            response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "Invalid API key");
            return;
        }
        
        filterChain.doFilter(request, response);
    }
}
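
String.equals can return as soon as it finds a mismatch, which in principle leaks timing information about the key. A small hardening step is to compare in constant time with the JDK's MessageDigest.isEqual. The following is a minimal sketch; ApiKeys is a hypothetical helper, not part of Spring Security.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

final class ApiKeys {

    // Returns true only when both keys are non-null and byte-for-byte equal.
    static boolean matches(String expected, String provided) {
        if (expected == null || provided == null) {
            return false;
        }
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                provided.getBytes(StandardCharsets.UTF_8));
    }
}
```

In the filter, the condition would then read `if (!ApiKeys.matches(expectedApiKey, apiKey))`, which also covers a missing header.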

Deployment and Running

Start the Ollama service

In a Windows command prompt:

ollama serve

Run the SpringBoot application

Run the main class directly from the IDE, or use the Maven command:

mvn spring-boot:run

System testing

Open http://localhost:8080 to try the Q&A page, or test the API endpoints with Postman.

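Extension Ideas

Vector database integration

Consider integrating a vector database such as Chroma or Milvus for more precise document retrieval. The configuration here uses SpringAI's in-memory SimpleVectorStore as a stand-in: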

@Configuration
public class VectorStoreConfig {
    
    @Bean
    public VectorStore vectorStore(EmbeddingClient embeddingClient) {
        return new SimpleVectorStore(embeddingClient);
    }
    
    @Bean
    public EmbeddingClient embeddingClient(OllamaApi ollamaApi) {
        return new OllamaEmbeddingClient(ollamaApi);
    }
}

Multi-model switching

Implement dynamic model selection:

@Service
public class ModelSelectorService {
    
    private final Map<String, ChatClient> clients;
    
    public ModelSelectorService(
            OllamaChatClient deepseekClient,
            OllamaChatClient llamaClient) {
        this.clients = Map.of(
            "deepseek", deepseekClient,
            "llama", llamaClient
        );
    }
    
    public ChatClient getClient(String modelName) {
        return clients.getOrDefault(modelName, clients.get("deepseek"));
    }
}

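Summary

This article walked through building a full-featured LLM Q&A system on a local Windows machine with SpringBoot, SpringAI, and Ollama. With this approach, developers can:

Run the AI service entirely locally, keeping data private
Use the Spring ecosystem to build production-grade applications quickly
Choose freely among different open models
Implement everything from basic Q&A to more complex document analysis

As local AI technology matures, this architecture can give more enterprise applications a secure, controllable AI solution. Readers can extend the examples here to fit their needs, for example by supporting more models, refining the prompts, or integrating more complex business logic.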
