Automatic string length in recarray

2022-12-10 13:30 问答作者：

If开发者_JAVA技巧 I create a recarray in this way:

In [29]: np.rec.fromrecords([(1,'hello'),(2,'world')],names=['a','b'])

The result looks fine:

Out[29]: 
rec.array([(1, 'hello'), (2, 'world')], 
      dtype=[('a', '<i8'), ('b', '|S5')])

But if I want to specify the data types:

In [32]: np.rec.fromrecords([(1,'hello'),(2,'world')],dtype=[('a',np.int8),('b',np.str)])

The string is set to a length of zero:

Out[32]: 
rec.array([(1, ''), (2, '')], 
      dtype=[('a', '|i1'), ('b', '|S0')])

I need to specify datatypes for all numerical types since I care about int8/16/32, etc, but I would like to benefit from the auto string length detection that works if I don't specify datatypes. I tried replacing np.str by None but no luck. I know I can specify '|S5' for example, but I don't know in advance what the string length should be set to.

If you don't need to manipulate the strings as bytes, you may use the object data-type to represent them. This essentially stores a pointer instead of the actual bytes:

In [38]: np.array(data, dtype=[('a', np.uint8), ('b', np.object)])
Out[38]: 
array([(1, 'hello'), (2, 'world')], 
      dtype=[('a', '|u1'), ('b', '|O8')])

Alternatively, Alex's idea would work well:

new_dt = []

# For each field of a given type and alignment, determine
# whether the field is an integer.  If so, represent it as a byte.

for f, (T, align) in dt.fields.iteritems():
    if np.issubdtype(T, int):
        new_dt.append((f, np.uint8))
    else:
        new_dt.append((f, T))

new_dt = np.dtype(new_dt)
np.array(data, dtype=new_dt)

which should yield

array([(1, 'hello'), (2, 'world')], 
      dtype=[('f0', '|u1'), ('f1', '|S5')])

I don't know how to ask numpy to determine for you some aspects of a dtype but not others, but couldn't you have, e.g.:

data = [(1,'hello'),(2,'world')]
dlen = max(len(s) for i, s in data)
st = '|S%d' % dlen
np.rec.fromrecords(data, dtype=[('a',np.int8), ('b',st)])

继续阅读：numpy python

Automatic string length in recarray

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？