Question

Convert '3' to numpy.dtypes.Int64DType

I have strings containing the str representations of NumPy values, along with a list of their original dtypes, such as numpy.dtypes.Int64DType. I do not know all possible dtypes in advance. I cannot figure out how to convert each string back to a scalar of the correct dtype.

import numpy as np

dt = np.dtypes.Int64DType
dt('3')  # I was hoping this would work, but the dtype cannot be called like this
np.fromstring('3', dtype=dt)  # ValueError: string size must be a multiple of element size
np.fromstring('3', dtype=dt, count=1)  # ValueError: string is smaller than requested size
np.fromstring('[3]', dtype=dt, count=1)  # ValueError: Cannot create an object array from a string

The last three calls also emit this warning:

DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead

I don't understand this, because '3' is not a binary string.


Solution


I still don't fully understand why the attempts above failed (or how to tell fromstring that the input string is not binary), and I don't find the error messages particularly helpful, but the following does what I want:

np.array('3', dtype=dt).item()

Edit: I overlooked the following note on the doc page, because I was not using the optional sep argument (it seemed irrelevant to the scalar case):

Deprecated since version 1.14: Passing sep='', the default, is deprecated since it will trigger the deprecated binary mode of this function. This mode interprets string as binary bytes, rather than ASCII text with decimal numbers, an operation which is better spelt frombuffer(string, dtype, count). If string contains unicode text, the binary mode of fromstring will first encode it into bytes using utf-8, which will not produce sane results.
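This also seems to explain the size-related errors: in binary mode the string is treated as raw bytes, and a one-character string cannot fill an 8-byte int64. A small sketch contrasting the two interpretations (the little-endian '<i8' dtype and the to_bytes call are my own choices for illustration):

```python
import numpy as np

# Text mode: a non-empty sep makes fromstring parse decimal numbers.
text_parsed = np.fromstring('3', dtype=np.int64, sep=' ')

# Binary interpretation: an int64 occupies 8 bytes, so the buffer must
# supply a multiple of 8 bytes. The string '3' is a single byte, which
# is consistent with the size errors raised by the binary-mode calls.
raw = (3).to_bytes(8, byteorder='little')
binary_parsed = np.frombuffer(raw, dtype='<i8')

print(text_parsed)    # parsed from the text '3'
print(binary_parsed)  # decoded from 8 raw bytes
```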

Indeed, np.fromstring('3', dtype=int, sep=' ') works. However, np.fromstring('3', dtype=np.dtypes.Int64DType, sep=' ') still does not work and gives a ValueError: Cannot create an object array from a string.

Edit #2: Apparently, unlike standard dtype specifiers such as int, np.dtypes.Int64DType is a class and must be instantiated before being passed to fromstring. So: np.fromstring('3', dtype=np.dtypes.Int64DType(), sep=' ') works (note the added parentheses).

2024-07-25
Eike P.