How to Extract Specific Data from a txt File in Python

2023-11-19 10:16 Development Author: 六和七

Code:

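def get_data(txt_path: str = '', epoch: int = 100, target: str = '', target_data_len: int = 5):
    num_list = []  # extracted values are collected here and returned at the end
    with open(txt_path, encoding="utf-8") as data:  # the file is closed automatically
        str1 = data.read()  # read the whole file into a single string
    for i in range(0, epoch):
        index = str1.find(target)  # position of the next occurrence of target in str1
        if index == -1:  # fewer than epoch occurrences: stop instead of slicing at a wrong offset
            break
        start = index + len(target)
        num_list.append(float(str1[start:start + target_data_len]))  # fixed-width value right after the prefix
        str1 = str1.replace(target, 'xxxx', 1)  # overwrite the occurrence just read; 'xxxx' is the replacement text and 1 limits it to one replacement
    return num_list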

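Function parameters:

txt_path: path to the text file
epoch: number of values to extract from the text file, default 100
target: the prefix that precedes the target data
target_data_len: length of the target data in characters, default 5
Return value: a list of the extracted values, converted to float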
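
The fixed-width slice above requires knowing the number of characters in each value ahead of time. As an alternative, a regular expression can read the digits up to the next non-numeric character. The following is a minimal sketch and not the author's method; the function name `extract_values` and its `limit` parameter are assumptions made for illustration.

```python
import re


def extract_values(text: str, target: str, limit=None):
    """Return the numbers that directly follow each occurrence of `target`.

    Hypothetical helper, not part of the original article.
    """
    # re.escape keeps characters such as ':' or '.' in the prefix literal.
    pattern = re.compile(re.escape(target) + r"\s*(-?\d+(?:\.\d+)?)")
    values = [float(m.group(1)) for m in pattern.finditer(text)]
    return values if limit is None else values[:limit]


sample = (
    "x1:273 test3:477 y4:38489 y1:149 x2:423\n"
    "x1:274 test3:475 y4:37956 y1:152 x2:422"
)
print(extract_values(sample, "x1:"))           # [273.0, 274.0]
print(extract_values(sample, "y4:", limit=1))  # [38489.0]
```

With this approach the value length no longer has to be passed in, and a prefix that never occurs simply yields an empty list.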

Usage example:

Contents of the txt file:

x1:273 test3:477 y4:38489 y1:149 x2:423
x1:274 test3:475 y4:37956 y1:152 x2:422
x1:269 test3:473 y4:38156 y1:152 x2:421
x1:271 test3:471 y4:38156 y1:155 x2:418
x1:272 test3:467 y4:38056 y1:158 x2:416
x1:275 test3:466 y4:37956 y1:161 x2:415

Usage:

data_path = "D:/program/test/double_vLuKfcamera_data/x_data.txt"
# Extract the x1 values
list_x1 = get_data(data_path, 6, target="x1:", target_data_len=3)
# Extract the test3 values
list_test3 = get_data(data_path, 6, target="test3:", target_data_len=3)
# Extract the y4 values
list_y4 = get_data(data_path, 6, target="y4:", target_data_len=6)
print(list_x1)
print(list_test3)
print(list_y4)

Output:

[273.0, 274.0, 269.0, 271.0, 272.0, 275.0]
[477.0, 475.0, 473.0, 471.0, 467.0, 466.0]
[38489.0, 37956.0, 38156.0, 38156.0, 38056.0, 37956.0]
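
When several fields are needed, the text can also be parsed in a single pass instead of calling get_data once per prefix. The sketch below assumes every field has the form name:number and that fields are separated by whitespace, as in the sample file; `parse_fields` is a hypothetical helper and not part of the original code.

```python
import re
from collections import defaultdict


def parse_fields(text: str) -> dict:
    """Collect every `name:number` pair in `text` into {name: [values...]}.

    Hypothetical helper; assumes names consist of letters, digits and underscores.
    """
    columns = defaultdict(list)
    for name, value in re.findall(r"(\w+):(-?\d+(?:\.\d+)?)", text):
        columns[name].append(float(value))
    return dict(columns)


sample = """x1:273 test3:477 y4:38489 y1:149 x2:423
x1:274 test3:475 y4:37956 y1:152 x2:422"""

fields = parse_fields(sample)
print(fields["x1"])     # [273.0, 274.0]
print(fields["test3"])  # [477.0, 475.0]
```

Each key in the returned dictionary corresponds to one prefix, so the per-field lists come out of the same scan.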

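Appendix: Extracting useful information from irregular text in Python

Background: extract the useful information held in the tables of a Word document that mixes running text with several tables.

Code: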

from docx import Document
import numpy as np
import pandas as pd

# Labels whose following cell holds the value to keep, written as they appear in the document
# (property owner, proof of property ownership, project name, project address)
labels = ["不动产所有权人", "不动产权属证明", "项目名称", "项目地址"]

# Open the document
doc = Document("文件名.docx")
# Get all tables in the document
tables = doc.tables
rlt = []
flag = 0  # 1 means the previous cell was a label, so the current cell is its value
for t in tables:  # each table
    for r in t.rows:  # each row
        for c in r.cells:  # each cell
            if flag != 0:
                rlt.append(c.text)
                flag = 0
                continue
            if c.text in labels:
                flag = 1
# Reshape the flat list into rows of four values, one row per record
nums = len(rlt)
rlt = np.array(rlt).reshape((nums // 4, 4))
df = pd.DataFrame(rlt, columns=labels)
df.to_excel('rlt.xlsx')
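
The core of the script is the flag: when a cell holds one of the known labels, the next cell is taken as its value. The sketch below isolates that logic so it can be tried without a Word document. It assumes the tables have already been read into nested lists of cell strings and that the labels appear in the same order in every record; `collect_labeled_values`, the English labels, and the sample data are all hypothetical.

```python
def collect_labeled_values(tables, labels):
    """Return one dict per record, built from the cell after each label cell.

    `tables` is a list of tables, each a list of rows, each a list of cell
    strings (a stand-in for python-docx tables). Hypothetical helper.
    """
    values = []
    take_next = False
    for table in tables:
        for row in table:
            for cell in row:
                if take_next:
                    values.append(cell)
                    take_next = False
                elif cell in labels:
                    take_next = True
    # Group the flat list into records of len(labels) values each,
    # dropping an incomplete trailing record if there is one.
    width = len(labels)
    usable = len(values) - len(values) % width
    return [dict(zip(labels, values[i:i + width])) for i in range(0, usable, width)]


labels = ["Owner", "Certificate", "Project name", "Project address"]
tables = [
    [["Owner", "Alice", "Certificate", "No. 001"],
     ["Project name", "Plant A", "Project address", "1 Main St"]],
    [["Owner", "Bob", "Certificate", "No. 002"],
     ["Project name", "Plant B", "Project address", "2 Side Rd"]],
]
records = collect_labeled_values(tables, labels)
print(records[0]["Owner"])            # Alice
print(records[1]["Project address"])  # 2 Side Rd
```

Unlike the NumPy reshape, this grouping does not raise an error when the number of collected values is not a multiple of four; whether silently dropping the remainder is acceptable depends on the data.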

Summary

This concludes the walkthrough of extracting specific data from a txt file in Python.
