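A Complete Walkthrough of Filtering Excel Image URLs and Checking Image Sizes in Java

2025-08-09 10:51 Development | Author: Nicky.Ma

Introduction

Data-processing work often requires filtering the image URLs stored in an Excel file. This article shares a complete Java solution that covers the whole process: reading the image URLs, checking whether each URL is reachable, filtering by image size, and generating a new Excel file. It also walks through how the code was developed and optimized, to help with data filtering and cleansing needs in real business scenarios.

I. Problem Background

The customer's on-site image data had to be processed according to the following requirements:

Read the image URLs from the Excel file.
Check whether each URL is valid and obtain the image size.
Select the records whose image is larger than 1 MB or cannot be accessed (404).
Preserve the original data formats, especially date values.
Generate a new Excel file containing the selected records.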
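
The selection rule in the requirements above can be sketched as a small pure function. This is only an illustration: the class name FilterRule, the method shouldKeep, and its parameters are hypothetical names, and 1 MB is assumed to mean 1024 * 1024 bytes.

```java
// Minimal sketch of the selection rule; FilterRule and shouldKeep are illustrative names.
final class FilterRule {
    // Assumption: 1 MB is taken as 1024 * 1024 bytes.
    static final double BYTES_PER_MB = 1024.0 * 1024.0;

    private FilterRule() {
    }

    // Returns true when a record should appear in the output file:
    // either the URL answered 404 or the image is larger than the threshold.
    static boolean shouldKeep(long sizeInBytes, int httpStatus, double thresholdMb) {
        boolean notFound = httpStatus == 404;
        boolean tooLarge = sizeInBytes / BYTES_PER_MB > thresholdMb;
        return notFound || tooLarge;
    }

    public static void main(String[] args) {
        System.out.println(shouldKeep(2_000_000L, 200, 1.0)); // about 1.9 MB, above the threshold
        System.out.println(shouldKeep(500_000L, 200, 1.0));   // below the threshold
        System.out.println(shouldKeep(0L, 404, 1.0));         // missing image
    }
}
```

Keeping the rule separate from the network and Excel code makes it easy to check the boundary case: an image of exactly 1 MB is not selected, because the comparison is strictly greater than.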

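II. Core Implementation

(1) Technology Choices

To meet these goals, we mainly rely on the following:

Apache POI: reads and writes Excel files, supports working with cell data and cell formatting, and handles XLSX files conveniently.
HttpURLConnection: checks whether an image URL is reachable and obtains the image size. It sends a HEAD request to fetch only the resource metadata, which avoids wasting bandwidth on downloading the whole image.

(2) Key Code

1. Reading and Writing the Excel File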

package cn.api.server;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ImageSizeFilter {
    private static final Logger LOGGER = Logger.getLogger(ImageSizeFilter.class.getName());
    private static final int CONNECT_TIMEOUT = 5000;
    private static final int READ_TIMEOUT = 5000;
    private static final double BYTES_TO_MEGABYTES = 1024.0 * 1024.0;
    private static final double SIZE_THRESHOLD = 1.0;
    private static final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat("yyyy-MM-dd");

    public static void main(String[] args) {
        String inputFilePath = "C:/Users/admin/Desktop/图片数据.xlsx";
        String outputFilePath = "C:/Users/admin/Desktop/图片数据_筛选后.xlsx";

        System.out.println("开始处理Excel文件...");
        System.out.println("输入文件: " + inputFilePath);

        long startTime = System.currentTimeMillis();
        int processedCount = 0;
        int filteredCount = 0;

        try (FileInputStream inputStream = new FileInputStream(new File(inputFilePath));
             Workbook workbook = new XSSFWorkbook(inputStream)) {

            Sheet sheet = workbook.getSheetAt(0);
            int totalRows = sheet.getLastRowNum();
            System.out.println("发现 " + totalRows + " 条数据记录");

            try (Workbook newWorkbook = new XSSFWorkbook()) {
                Sheet newSheet = newWorkbook.createSheet();
                Row headerRow = sheet.getRow(0);
                Row newHeaderRow = newSheet.createRow(0);
                
                // Copy the header row and append the new columns
                copyRow(headerRow, newHeaderRow);
                createHeaderCell(newHeaderRow, "图片大小（M）");
                createHeaderCell(newHeaderRow, "状态");

                int newRowIndex = 1;
                for (int i = 1; i <= totalRows; i++) {
                    if (i % 100 == 0) {
                        System.out.println("已处理 " + i + "/" + totalRows + " 行");
                    }

                    Row row = sheet.getRow(i);
                    if (row != null) {
                        Cell urlCell = row.getCell(7);
                        if (urlCell != null) {
                            String imageUrl = getCellValue(urlCell);
                            if (isValidUrl(imageUrl)) {
                                processedCount++;
                                long sizeInBytes = getImageSize(imageUrl);
                                double sizeInMegabytes = sizeInBytes / BYTES_TO_MEGABYTES;
                                boolean is404 = sizeInBytes == 0 && isUrl404(imageUrl);
                                
                                if (sizeInMegabytes > SIZE_THRESHOLD || is404) {
                                    filteredCount++;
                                    Row newRow = newSheet.createRow(newRowIndex++);
                                    copyRowWithDateHandling(row, newRow, workbook, newWorkbook);
                                    newRow.createCell(headerRow.getLastCellNum()).setCellValue(sizeInMegabytes);
                                    newRow.createCell(headerRow.getLastCellNum() + 1).setCellValue(is404 ? "404" : "图片过大");
                                }
                            }
                        }
                    }
                }

                try (FileOutputStream outputStream = new FileOutputStream(new File(outputFilePath))) {
                    newWorkbook.write(outputStream);
                }

                long endTime = System.currentTimeMillis();
                System.out.println("筛选完成！耗时：" + (endTime - startTime) / 1000 + " 秒");
                System.out.println("处理记录数：" + processedCount);
                System.out.println("筛选出的记录数：" + filteredCount);
                System.out.println("结果保存至：" + outputFilePath);
            }
        } catch (IOException e) {
            LOGGER.log(Level.SEVERE, "处理文件时出错", e);
        }
    }
}

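In the code above, a FileInputStream reads the original Excel file and XSSFWorkbook loads it as a Workbook object. We take the first sheet and iterate over its rows. A second Workbook (newWorkbook) with a new sheet (newSheet) holds the rows that match the filter; the original header row is copied into it and two columns are appended, "图片大小（M）" (image size in MB) and "状态" (status), to store the image size and the reason each row was selected. The helper methods called here are covered in the following subsections; isValidUrl() and createHeaderCell() appear only in the complete listing in the appendix.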

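2. Checking URLs and Getting the Image Size

// Get the image size in bytes from the Content-Length header; returns 0 on a non-200 response or an I/O error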
private static long getImageSize(String imageUrl) {
    HttpURLConnection connection = null;
    try {
        URL url = new URL(imageUrl);
        connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("HEAD");
        connection.setConnectTimeout(CONNECT_TIMEOUT);
        connection.setReadTimeout(READ_TIMEOUT);
        connection.connect();
        return connection.getResponseCode() == HttpURLConnection.HTTP_OK 
               ? connection.getContentLength() : 0;
    } catch (IOException e) {
        LOGGER.log(Level.SEVERE, "获取图片大小异常", e);
        return 0;
    } finally {
        if (connection != null) {
            connection.disconnect();
        }
    }
}

// Check whether the URL returns 404
private static boolean isUrl404(String imageUrl) {
    HttpURLConnection connection = null;
    try {
        URL url = new URL(imageUrl);
        connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("HEAD");
        connection.setConnectTimeout(CONNECT_TIMEOUT);
        connection.setReadTimeout(READ_TIMEOUT);
        connection.connect();
        return connection.getResponseCode() == HttpURLConnection.HTTP_NOT_FOUND;
    } catch (IOException e) {
        LOGGER.log(Level.SEVERE, "检测404异常", e);
        return false;
    } finally {
        if (connection != null) {
            connection.disconnect();
        }
    }
}

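Here, HttpURLConnection sends a HEAD request to the given image URL. A HEAD request asks only for the response headers and does not download the resource body, so information such as the image size can be obtained quickly. connection.getContentLength() returns the value of the Content-Length header in bytes, or -1 when the server does not report a length. We also define isUrl404() to check whether the URL returns a 404 status code, which identifies images that cannot be accessed.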
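
As a possible refinement, the status code and the length can be captured from a single HEAD request instead of issuing a second request for the 404 check. The following is a minimal sketch, not the article's implementation: HeadProbe, Result, and probe are hypothetical names, and the timeout is passed in as a parameter.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch only: HeadProbe, Result, and probe are illustrative names.
final class HeadProbe {
    // Status code and reported length from one HEAD response.
    static final class Result {
        final int status;          // HTTP status code, or -1 if the request failed
        final long contentLength;  // Content-Length in bytes, or -1 if unknown

        Result(int status, long contentLength) {
            this.status = status;
            this.contentLength = contentLength;
        }

        boolean isNotFound() {
            return status == HttpURLConnection.HTTP_NOT_FOUND;
        }

        // True only when the server answered 200 and reported a length above the limit.
        boolean isLargerThan(long limitInBytes) {
            return status == HttpURLConnection.HTTP_OK && contentLength > limitInBytes;
        }
    }

    private HeadProbe() {
    }

    // Sends a single HEAD request and records both the status and the length.
    static Result probe(String imageUrl, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(imageUrl).openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            int status = connection.getResponseCode();
            return new Result(status, connection.getContentLengthLong());
        } catch (IOException | ClassCastException e) {
            // I/O failure, malformed URL, or a URL whose scheme is not http/https
            return new Result(-1, -1L);
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
```

A caller could then write `Result r = HeadProbe.probe(url, 5000)` and test `r.isNotFound() || r.isLargerThan(1024L * 1024L)`. getContentLengthLong() is used because it returns a long and so also covers lengths beyond the int range.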

3. Reading and Handling Cell Data

// Read a cell as text (date-formatted numeric cells become yyyy-MM-dd strings)
private static String getCellValue(Cell cell) {
    if (cell == null) {
        return "";
    }
    // In POI 4.x getCellType() returns the CellType enum; a classic switch statement works on Java 8
    switch (cell.getCellType()) {
        case STRING:
            return cell.getStringCellValue();
        case NUMERIC:
            return DateUtil.isCellDateFormatted(cell)
                   ? DATE_FORMAT.format(cell.getDateCellValue())
                   : String.valueOf(cell.getNumericCellValue());
        case BOOLEAN:
            return String.valueOf(cell.getBooleanCellValue());
        case FORMULA:
            return cell.getCellFormula();
        default:
            return "";
    }
}

Different cell types need different handling when reading Excel data. For a string cell, the string value is returned directly. For a numeric cell, DateUtil.isCellDateFormatted(cell) checks whether it is date-formatted; if so, the value is read as a Date and formatted as a string with the pattern yyyy-MM-dd, otherwise it is returned as a plain number. For a boolean cell, the string form of the boolean is returned. For a formula cell, the formula text is returned.
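
One detail of the numeric branch is worth illustrating: String.valueOf on a double renders a whole number such as 123 as "123.0". If plain text without the trailing ".0" or an exponent is preferred, something like the following could be used. This is a sketch under that assumption; NumberText and toPlainText are hypothetical names, not part of the article's code.

```java
import java.math.BigDecimal;

// Sketch only: renders a double without a trailing ".0" or scientific notation.
final class NumberText {
    private NumberText() {
    }

    static String toPlainText(double value) {
        // BigDecimal cannot represent NaN or infinity, so fall back to the default text
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            return String.valueOf(value);
        }
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(String.valueOf(123.0)); // prints 123.0
        System.out.println(toPlainText(123.0));    // prints 123
        System.out.println(toPlainText(0.5));      // prints 0.5
    }
}
```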

4. Copying Row Data

// Copy a row (header row only; no date handling)
private static void copyRow(Row sourceRow, Row targetRow) {
    for (int i = 0; i < sourceRow.getLastCellNum(); i++) {
        Cell sourceCell = sourceRow.getCell(i);
        Cell targetCell = targetRow.createCell(i);
        if (sourceCell != null) {
            switch (sourceCell.getCellType()) {
                case STRING:
                    targetCell.setCellValue(sourceCell.getStringCellValue());
                    break;
                case NUMERIC:
                    targetCell.setCellValue(sourceCell.getNumericCellValue());
                    break;
                case BOOLEAN:
                    targetCell.setCellValue(sourceCell.getBooleanCellValue());
                    break;
                case FORMULA:
                    targetCell.setCellFormula(sourceCell.getCellFormula());
                    break;
                default:
                    break;
            }
        }
    }
}

// Copy a row (data rows; keeps date cells as dates and copies their style)
private static void copyRowWithDateHandling(Row sourceRow, Row targetRow, 
                                           Workbook sourceWorkbook, Workbook targetWorkbook) {
    for (int i = 0; i < sourceRow.getLastCellNum(); i++) {
        Cell sourceCell = sourceRow.getCell(i);
        Cell targetCell = targetRow.createCell(i);
        if (sourceCell != null) {
            switch (sourceCell.getCellType()) {
                case STRING:
                    targetCell.setCellValue(sourceCell.getStringCellValue());
                    break;
                case NUMERIC:
                    if (DateUtil.isCellDateFormatted(sourceCell)) {
                        // Copy the value as a date and clone the style so the date format survives
                        targetCell.setCellValue(sourceCell.getDateCellValue());
                        CellStyle newCellStyle = targetWorkbook.createCellStyle();
                        newCellStyle.cloneStyleFrom(sourceCell.getCellStyle());
                        targetCell.setCellStyle(newCellStyle);
                    } else {
                        targetCell.setCellValue(sourceCell.getNumericCellValue());
                    }
                    break;
                case BOOLEAN:
                    targetCell.setCellValue(sourceCell.getBooleanCellValue());
                    break;
                case FORMULA:
                    targetCell.setCellFormula(sourceCell.getCellFormula());
                    break;
                default:
                    break;
            }
        }
    }
}

Two methods handle row copying. copyRow() copies the header row and simply sets each target cell's value according to the cell type. copyRowWithDateHandling() copies data rows: for a numeric cell it checks whether the cell is date-formatted and, if so, copies it as a date together with the cell style, so that dates display correctly in the new Excel file.

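III. The Optimization Process in Detail

(1) Problems in the Initial Implementation

The first implementation ran into the following problems:

Incorrect date handling: date values stored as numbers were converted to plain numbers, so the data was not displayed as expected. For example, a date shown as "2024-05-20" in Excel turned into a bare number after being read.
Java 8 compatibility: the original code used switch expressions (previewed in Java 12 and standard since Java 14), but the deployment environment had to run on Java 8, so the code could not be used as written.
No complete row-copy method: copying a row did not handle every cell type and style, which left the data formats in the generated Excel file in disarray.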
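
The first problem follows from how Excel stores dates: a date cell holds a serial day number, and only the cell format makes it look like a date. The sketch below shows the conversion with the JDK alone. It assumes the workbook uses the 1900 date system and that the serial is 61 or greater (dates from 1900-03-01 onward); ExcelSerialDate and toLocalDate are hypothetical names.

```java
import java.time.LocalDate;

// Sketch only: converts an Excel serial day number to a calendar date (1900 date system).
final class ExcelSerialDate {
    // Serial 0 as seen from modern dates; valid for serials of 61 and above
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    private ExcelSerialDate() {
    }

    static LocalDate toLocalDate(double serial) {
        // The fractional part encodes the time of day and is dropped here
        long wholeDays = (long) Math.floor(serial);
        return EPOCH.plusDays(wholeDays);
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(45432)); // prints 2024-05-20
    }
}
```

So a cell displayed as 2024-05-20 is read back as 45432.0 unless the code checks for a date format first, which is what DateUtil.isCellDateFormatted() is used for.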

(2) Fixing Date Handling

Using DateUtil.isCellDateFormatted() to detect date-formatted cells and SimpleDateFormat to format the resulting Date as a string in the required pattern resolved the date problem when reading cell values. The optimized code is as follows:

private static String getCellValue(Cell cell) {
    if (cell == null) {
        return "";
    }
    // In POI 4.x getCellType() returns the CellType enum
    switch (cell.getCellType()) {
        case STRING:
            return cell.getStringCellValue();
        case NUMERIC:
            // Date-formatted numeric cells are rendered as yyyy-MM-dd text
            return DateUtil.isCellDateFormatted(cell)
                   ? DATE_FORMAT.format(cell.getDateCellValue())
                   : String.valueOf(cell.getNumericCellValue());
        case BOOLEAN:
            return String.valueOf(cell.getBooleanCellValue());
        case FORMULA:
            return cell.getCellFormula();
        default:
            return "";
    }
}

(3) Java 8 Compatibility Changes

The switch expressions were rewritten as traditional switch statements, and the row-copy method was refactored so that it runs reliably on Java 8. Note that with the POI 4.1.2 dependency listed in the usage section, getCellType() returns the CellType enum and the old int constants Cell.CELL_TYPE_* are no longer available. A traditional switch statement over the enum compiles on Java 8, so that form is used here. The reworked row-copy method is as follows:

private static void copyRowWithDateHandling(Row sourceRow, Row targetRow, 
                                           Workbook sourceWorkbook, Workbook targetWorkbook) {
    for (int i = 0; i < sourceRow.getLastCellNum(); i++) {
        Cell sourceCell = sourceRow.getCell(i);
        Cell targetCell = targetRow.createCell(i);
        if (sourceCell != null) {
            switch (sourceCell.getCellType()) {
                case STRING:
                    targetCell.setCellValue(sourceCell.getStringCellValue());
                    break;
                case NUMERIC:
                    if (DateUtil.isCellDateFormatted(sourceCell)) {
                        // Copy the value as a date and clone the style so the date format survives
                        targetCell.setCellValue(sourceCell.getDateCellValue());
                        CellStyle newCellStyle = targetWorkbook.createCellStyle();
                        newCellStyle.cloneStyleFrom(sourceCell.getCellStyle());
                        targetCell.setCellStyle(newCellStyle);
                    } else {
                        targetCell.setCellValue(sourceCell.getNumericCellValue());
                    }
                    break;
                case BOOLEAN:
                    targetCell.setCellValue(sourceCell.getBooleanCellValue());
                    break;
                case FORMULA:
                    targetCell.setCellFormula(sourceCell.getCellFormula());
                    break;
                default:
                    break;
            }
        }
    }
}
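
To make the compatibility point concrete, the sketch below shows the two switch forms side by side. The enum Kind is a hypothetical stand-in for an enum such as POI's CellType, so the example compiles with the JDK alone; SwitchStyles and describe are likewise illustrative names.

```java
// Sketch only: a classic switch statement over an enum, which compiles on Java 8.
final class SwitchStyles {
    // Hypothetical stand-in for an enum such as POI's CellType
    enum Kind { STRING, NUMERIC, BOOLEAN, FORMULA, BLANK }

    private SwitchStyles() {
    }

    // Java 8 compatible: case labels followed by return (or break)
    static String describe(Kind kind) {
        switch (kind) {
            case STRING:
                return "text";
            case NUMERIC:
                return "number or date";
            case BOOLEAN:
                return "true/false";
            case FORMULA:
                return "formula";
            default:
                return "";
        }
    }

    // For comparison, the switch-expression form that needs Java 14 or later:
    //   return switch (kind) {
    //       case STRING -> "text";
    //       case NUMERIC -> "number or date";
    //       case BOOLEAN -> "true/false";
    //       case FORMULA -> "formula";
    //       default -> "";
    //   };
}
```

Only the arrow-style expression form is unavailable on Java 8; switching over enum values themselves has been supported since Java 5.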

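(4) The Complete Feature Set

After these optimizations, the code provides the following:

Reads the data in the Excel file accurately across cell types, and in particular handles date-formatted values correctly so they are not distorted.
Uses HttpURLConnection to check image URLs and obtain image sizes, selecting the records whose image is larger than 1 MB or cannot be accessed (404).
Keeps the original cell values and date formats when generating the new Excel file, and adds columns for the image size and the selection status so the results are easy to review.
Runs on Java 8, so the code works on JDK 8 and later and fits more deployment environments.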
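
On the URL check: the isValidUrl() method in the complete listing accepts anything that new URL(...) can parse, which includes schemes such as ftp or file; casting their connections to HttpURLConnection would then fail with a ClassCastException. A stricter check could look like the sketch below. It is an alternative written for illustration, not the article's method; UrlCheck and isHttpUrl are hypothetical names, and it is deliberately stricter than new URL (for example, it rejects unencoded spaces).

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch only: accepts http/https URLs that have a host, rejects everything else.
final class UrlCheck {
    private UrlCheck() {
    }

    static boolean isHttpUrl(String text) {
        if (text == null || text.trim().isEmpty()) {
            return false;
        }
        try {
            URI uri = new URI(text.trim());
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return http && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Not parseable as a URI at all
            return false;
        }
    }
}
```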

IV. Usage

(1) Environment Setup

JDK version: JDK 8 or later is required.
Maven dependencies: add the following Apache POI dependencies to the project for working with Excel files.

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>4.1.2</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>4.1.2</version>
</dependency>

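(2) How to Run

Edit the input path (inputFilePath) and output path (outputFilePath) in the main method to point to the Excel file to process and the location where the result should be saved. If the URLs are not in the eighth column, also adjust the column index passed to row.getCell(7).
Compile the project and run the main method of the ImageSizeFilter class. The program processes the Excel file and prints progress and result information to the console.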
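
Instead of editing the paths in the source each time, they could be taken from the command line. The following is a minimal sketch under that assumption; PathArgs, resolve, defaultOutputPath, and the "_filtered" suffix are hypothetical and not part of the article's code.

```java
// Sketch only: reads the input path (and optionally the output path) from program arguments.
final class PathArgs {
    // Hypothetical suffix appended to the input file name when no output path is given
    static final String SUFFIX = "_filtered";

    private PathArgs() {
    }

    // Inserts the suffix before the file extension, e.g. data.xlsx -> data_filtered.xlsx
    static String defaultOutputPath(String inputPath) {
        int dot = inputPath.lastIndexOf('.');
        int separator = Math.max(inputPath.lastIndexOf('/'), inputPath.lastIndexOf('\\'));
        if (dot > separator) {
            return inputPath.substring(0, dot) + SUFFIX + inputPath.substring(dot);
        }
        // No extension in the file name: append the suffix at the end
        return inputPath + SUFFIX;
    }

    // Returns {inputPath, outputPath}
    static String[] resolve(String[] args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("usage: <input.xlsx> [output.xlsx]");
        }
        String input = args[0];
        String output = args.length > 1 ? args[1] : defaultOutputPath(input);
        return new String[] {input, output};
    }
}
```

In main, `String[] paths = PathArgs.resolve(args);` would then replace the two hard-coded path variables.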

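(3) Custom Configuration

Image size threshold: change the SIZE_THRESHOLD constant to adjust the size filter. Setting it to 2.0, for example, selects images larger than 2 MB.
Network timeouts: adjust the CONNECT_TIMEOUT and READ_TIMEOUT constants (in milliseconds) to suit network conditions when checking image URLs.
Date format: the DATE_FORMAT constant sets the SimpleDateFormat pattern that getCellValue() uses when it turns a date cell into text; changing it to "yyyy/MM/dd HH:mm:ss", for example, includes the time. Date cells copied into the new Excel file keep the cell style cloned from the source, so their on-sheet display format is controlled by that style rather than by DATE_FORMAT.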
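
As a rough illustration of how the SimpleDateFormat pattern changes the text produced from a Date, consider the sketch below. DatePatternDemo and format are hypothetical names, and the time zone is fixed to UTC only so that the output is reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Sketch only: formats the same instant with two different patterns.
final class DatePatternDemo {
    private DatePatternDemo() {
    }

    static String format(Date date, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        // Fixed zone so the result does not depend on the machine's settings
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(format(epoch, "yyyy-MM-dd"));          // prints 1970-01-01
        System.out.println(format(epoch, "yyyy/MM/dd HH:mm:ss")); // prints 1970/01/01 00:00:00
    }
}
```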

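V. Technical Summary and Possible Extensions

(1) Key Technical Points

Working with Excel through POI: reading and writing Excel files with POI, including creating and accessing workbooks, sheets, rows, and cells, is the foundation of this feature. Classes such as CellStyle control how cell data is displayed.
Network requests with HttpURLConnection: sending a HEAD request to check an image URL and obtain the image size is efficient and saves bandwidth. Setting the request method and timeouts correctly keeps the requests stable and the results accurate.
Java 8 compatible code: real projects have to account for the environments they run in. Avoiding language features from newer Java versions and sticking to traditional switch statements makes the code more portable.
Data format handling: special Excel data types such as dates require an understanding of how they are stored and displayed. Sound detection and conversion logic keeps the data accurate both during processing and in the final output.

(2) Possible Extensions

Multithreading: for large data sets, the rows could be distributed across multiple threads that check and filter image URLs concurrently, which can shorten the running time considerably.
Image preview: after the non-compliant images have been selected, an image preview would make them easier to review and analyze. For example, an image-processing library could embed thumbnails in the generated Excel file, or a simple UI program could display the previews.
Support for more file formats: in addition to Excel (.xlsx), the program could be extended to support CSV and other common data file formats. This requires adapting the reading, writing, and format-handling code to each format.
Database storage and querying: the filtered records and related information could be stored in a database for persistence. Database queries would then make follow-up statistics and lookups on the results straightforward, giving users a more flexible way to manage the data.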
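
The multithreading idea could be sketched with a fixed thread pool, as below. This is an illustration under stated assumptions rather than a finished design: ParallelSizeCheck and checkAll are hypothetical names, and the size lookup is passed in as a function so that a method such as getImageSize could be plugged in. Only the network lookups run in parallel; reading and writing the workbook stays on the calling thread, since SimpleDateFormat is not thread-safe and workbook objects are generally not meant to be modified concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch only: looks up sizes for many URLs in parallel and returns them in input order.
final class ParallelSizeCheck {
    private ParallelSizeCheck() {
    }

    static List<Long> checkAll(List<String> urls, Function<String, Long> sizeLookup, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per URL; the futures keep the original order
            List<Future<Long>> pending = new ArrayList<>();
            for (String url : urls) {
                Callable<Long> task = () -> sizeLookup.apply(url);
                pending.add(pool.submit(task));
            }
            // Collect the results on the calling thread
            List<Long> sizes = new ArrayList<>();
            for (Future<Long> future : pending) {
                sizes.add(future.get());
            }
            return sizes;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for size lookups", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("size lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the results come back in the same order as the input list, the caller can match each size to its row index and then write the filtered rows sequentially.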

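VI. Conclusion

This case illustrates how capable Java is in data processing and how widely it applies in business development. With Apache POI and HttpURLConnection, filtering Excel image URLs and checking image sizes can be implemented efficiently, solving a real data-cleansing problem. Attention to details during development, such as preserving data formats and staying compatible across Java versions, is key to program quality and user experience. As business needs grow and the technology evolves, the program can be further optimized and extended to cover more data-processing scenarios.

I hope this article is useful. If you run into problems during implementation or have suggestions for improvement, feel free to leave a comment.

Appendix: complete code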

package cn.api.server;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ImageSizeFilter {
    private static final Logger LOGGER = Logger.getLogger(ImageSizeFilter.class.getName());
    // Connection timeout (milliseconds)
    private static final int CONNECT_TIMEOUT = 5000;
    // Read timeout (milliseconds)
    private static final int READ_TIMEOUT = 5000;
    // Number of bytes in one megabyte (divisor for converting bytes to MB)
    private static final double BYTES_TO_MEGABYTES = 1024.0 * 1024.0;
    // Image size threshold (MB)
    private static final double SIZE_THRESHOLD = 1.0;
    // Date format
    private static final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat("yyyy-MM-dd");

    public static void main(String[] args) {
        String inputFilePath = "C:/Users/admin/Desktop/图片数据.xlsx";
        String outputFilePath = "C:/Users/admin/Desktop/图片数据_筛选后.xlsx";

        System.out.println("开始处理Excel文件...");
        System.out.println("输入文件: " + inputFilePath);

        long startTime = System.currentTimeMillis();
        int processedCount = 0;
        int filteredCount = 0;

        try (FileInputStream inputStream = new FileInputStream(new File(inputFilePath));
             Workbook workbook = new XSSFWorkbook(inputStream)) {

            Sheet sheet = workbook.getSheetAt(0);
            int totalRows = sheet.getLastRowNum();
            System.out.println("发现 " + totalRows + " 条数据记录");

            try (Workbook newWorkbook = new XSSFWorkbook()) {
                Sheet newSheet = newWorkbook.createSheet();

                // Copy the header row and append the image-size and status columns
                Row headerRow = sheet.getRow(0);
                Row newHeaderRow = newSheet.createRow(0);
                copyRow(headerRow, newHeaderRow);
                createHeaderCell(newHeaderRow, "图片大小（M）");
                createHeaderCell(newHeaderRow, "状态");

                int newRowIndex = 1;
                for (int i = 1; i <= totalRows; i++) {
                    if (i % 100 == 0) {
                        System.out.println("已处理 " + i + "/" + totalRows + " 行");
                    }

                    Row row = sheet.getRow(i);
                    if (row != null) {
                        // Set this index to match the actual sheet; here the URL is in the 8th column (index 7)
                        Cell urlCell = row.getCell(7);
                        if (urlCell != null) {
                            String imageUrl = getCellValue(urlCell);
                            if (isValidUrl(imageUrl)) {
                                processedCount++;
                                long sizeInBytes = getImageSize(imageUrl);
                                double sizeInMegabytes = sizeInBytes / BYTES_TO_MEGABYTES;
                                boolean is404 = false;
                                if (sizeInBytes == 0) {
                                    is404 = isUrl404(imageUrl);
                                }
                                if (sizeInMegabytes > SIZE_THRESHOLD || is404) {
                                    filteredCount++;
                                    Row newRow = newSheet.createRow(newRowIndex++);
                                    copyRowWithDateHandling(row, newRow, workbook, newWorkbook);
                                    // Write the image size (MB) in the second-to-last column of the new row
                                    newRow.createCell(headerRow.getLastCellNum()).setCellValue(sizeInMegabytes);
                                    // Write the status in the last column of the new row
                                    newRow.createCell(headerRow.getLastCellNum() + 1).setCellValue(is404 ? "404" : "图片过大");
                                }
                            } else {
                                LOGGER.log(Level.WARNING, "发现不合法的URL (行 {0}): {1}", new Object[]{i, imageUrl});
                            }
                        }
                    }
                }

                try (FileOutputStream outputStream = new FileOutputStream(new File(outputFilePath))) {
                    newWorkbook.write(outputStream);
                }

                long endTime = System.currentTimeMillis();
                System.out.println("筛选完成！");
                System.out.println("处理时间: " + (endTime - startTime) / 1000 + " 秒");
                System.out.println("处理记录数: " + processedCount);
                System.out.println("筛选出的记录数: " + filteredCount);
                System.out.println("结果保存到: " + outputFilePath);

            } catch (IOException e) {
                LOGGER.log(Level.SEVERE, "写入输出文件时出错", e);
                System.err.println("错误: 无法写入输出文件: " + outputFilePath);
            }

        } catch (IOException e) {
            LOGGER.log(Level.SEVERE, "读取输入文件时出错", e);
            System.err.println("错误: 无法读取输入文件: " + inputFilePath);
        }
    }

    private static long getImageSize(String imageUrl) {
        HttpURLConnection connection = null;
        try {
            URL url = new URL(imageUrl);
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(CONNECT_TIMEOUT);
            connection.setReadTimeout(READ_TIMEOUT);
            connection.connect();
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                return connection.getContentLength();
            } else {
                LOGGER.log(Level.WARNING, "获取图片大小失败，URL: {0}，响应码: {1}", new Object[]{imageUrl, responseCode});
                return 0;
            }
        } catch (IOException e) {
            LOGGER.log(Level.SEVERE, "获取图片大小IO异常，URL: " + imageUrl, e);
            return 0;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    private static boolean isUrl404(String imageUrl) {
        HttpURLConnection connection = null;
        try {
            URL url = new URL(imageUrl);
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(CONNECT_TIMEOUT);
            connection.setReadTimeout(READ_TIMEOUT);
            connection.connect();
            return connection.getResponseCode() == HttpURLConnection.HTTP_NOT_FOUND;
        } catch (IOException e) {
            LOGGER.log(Level.SEVERE, "判断 URL 是否 404 时发生 IO 异常，URL: " + imageUrl, e);
            return false;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    private static String getCellValue(Cell cell) {
        if (cell == null) {
            return "";
        }

        // POI 4.x returns the CellType enum; a classic switch statement keeps this Java 8 compatible
        switch (cell.getCellType()) {
            case STRING:
                return cell.getStringCellValue();
            case NUMERIC:
                if (DateUtil.isCellDateFormatted(cell)) {
                    Date date = cell.getDateCellValue();
                    return DATE_FORMAT.format(date);
                } else {
                    return String.valueOf(cell.getNumericCellValue());
                }
            case BOOLEAN:
                return String.valueOf(cell.getBooleanCellValue());
            case FORMULA:
                return cell.getCellFormula();
            default:
                return "";
        }
    }

    private static boolean isValidUrl(String url) {
        if (url == null || url.trim().isEmpty()) {
            return false;
        }
        try {
            new URL(url);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    /**
     * Copies a row (used for the header row; no date handling).
     */
    private static void copyRow(Row sourceRow, Row targetRow) {
        for (int i = 0; i < sourceRow.getLastCellNum(); i++) {
            Cell sourceCell = sourceRow.getCell(i);
            Cell targetCell = targetRow.createCell(i);
            if (sourceCell != null) {
                switch (sourceCell.getCellType()) {
                    case STRING:
                        targetCell.setCellValue(sourceCell.getStringCellValue());
                        break;
                    case NUMERIC:
                        targetCell.setCellValue(sourceCell.getNumericCellValue());
                        break;
                    case BOOLEAN:
                        targetCell.setCellValue(sourceCell.getBooleanCellValue());
                        break;
                    case FORMULA:
                        targetCell.setCellFormula(sourceCell.getCellFormula());
                        break;
                    default:
                        break;
                }
            }
        }
    }

    /**
     * Copies a row and preserves date formatting (used for data rows).
     */
    private static void copyRowWithDateHandling(Row sourceRow, Row targetRow, Workbook sourceWorkbook, Workbook targetWorkbook) {
        for (int i = 0; i < sourceRow.getLastCellNum(); i++) {
            Cell sourceCell = sourceRow.getCell(i);
            Cell targetCell = targetRow.createCell(i);
            if (sourceCell != null) {
                switch (sourceCell.getCellType()) {
                    case STRING:
                        targetCell.setCellValue(sourceCell.getStringCellValue());
                        break;
                    case NUMERIC:
                        if (DateUtil.isCellDateFormatted(sourceCell)) {
                            // Copy the value as a date and clone the style so the date format survives
                            targetCell.setCellValue(sourceCell.getDateCellValue());
                            CellStyle newCellStyle = targetWorkbook.createCellStyle();
                            newCellStyle.cloneStyleFrom(sourceCell.getCellStyle());
                            targetCell.setCellStyle(newCellStyle);
                        } else {
                            targetCell.setCellValue(sourceCell.getNumericCellValue());
                        }
                        break;
                    case BOOLEAN:
                        targetCell.setCellValue(sourceCell.getBooleanCellValue());
                        break;
                    case FORMULA:
                        targetCell.setCellFormula(sourceCell.getCellFormula());
                        break;
                    default:
                        break;
                }
            }
        }
    }

    private static void createHeaderCell(Row headerRow, String headerValue) {
        Cell headerCell = headerRow.createCell(headerRow.getLastCellNum());
        headerCell.setCellValue(headerValue);
        CellStyle style = headerCell.getSheet().getWorkbook().createCellStyle();
        Font font = headerCell.getSheet().getWorkbook().createFont();
        font.setBold(true);
        style.setFont(font);
        headerCell.setCellStyle(style);
    }
}
