Python Examples using `h5py`¶

One way to gain a quick familiarity with NeXus is to start working with some data. For at least the first few examples in this section, we have a simple two-column set of 1-D data, collected as part of a series of alignment scans by the APS USAXS instrument during the time it was stationed at beam line 32ID. We will show how to write this data using the Python language and the h5py package 1 (using h5py calls directly rather than using the NeXus NAPI). The actual data to be written was extracted (elsewhere) from a spec 2 data file and read as a text block from a file by the Python source code. Our examples will start with the simplest case and add only mild complexity with each new case since these examples are meant for those who are unfamiliar with NeXus.

1: h5py: https://www.h5py.org/
2: SPEC: http://certif.com/spec.html

The data shown plotted in the next figure will be written to the NeXus HDF5 file using only two NeXus base classes, NXentry and NXdata, in the first example and then minor variations on this structure in the next two examples. The data model is identical to the one in the Introduction chapter except that the names will be different, as shown below:

simple data structure — data structure, (from Introduction)¶

our h5py example

/entry:NXentry
    /mr_scan:NXdata
       /mr : float64[31]
       /I00 : int32[31]

Example-H5py-Plot — plot of our *mr_scan*¶

two-column data for our mr_scan

17.92608    1037
17.92591    1318
17.92575    1704
17.92558    2857
17.92541    4516
17.92525    9998
17.92508    23819
17.92491    31662
17.92475    40458
17.92458    49087
17.92441    56514
17.92425    63499
17.92408    66802
17.92391    66863
17.92375    66599
17.92358    66206
17.92341    65747
17.92325    65250
17.92308    64129
17.92291    63044
17.92275    60796
17.92258    56795
17.92241    51550
17.92225    43710
17.92208    29315
17.92191    19782
17.92175    12992
17.92158    6622
17.92141    4198
17.92125    2248
17.92108    1321

Writing the simplest data using `h5py`¶

These two examples show how to write the simplest data (above). One example writes the data directly to the NXdata group while the other example writes the data to NXinstrument/NXdetector/data and then creates a soft link to that data in NXdata.

Complete `h5py` example writing and reading a NeXus data file¶

Writing the HDF5 file using h5py¶

In the main code section of BasicWriter.py, a current time stamp is written in the format of ISO 8601 (yyyy-mm-ddTHH:MM:SS). For simplicity of this code example, we use a text string for the time, rather than computing it directly from Python support library calls. It is easier this way to see the exact type of string formatting for the time. When using the Python datetime package, one way to write the time stamp is:

timestamp = "T".join( str( datetime.datetime.now() ).split() )
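As an alternative sketch, `datetime.isoformat()` can build the same kind of string and include the local UTC offset. The helper name `iso8601_now` is ours for illustration and does not appear in BasicWriter.py. Note that `isoformat()` writes the offset with a colon (`-05:00`) rather than the `-0500` form used in the hard-coded example.

```python
import datetime

def iso8601_now():
    """Return the current local time as an ISO 8601 string with UTC offset.

    Hypothetical helper, not part of BasicWriter.py.
    """
    now = datetime.datetime.now().astimezone()    # attach the local time zone
    return now.replace(microsecond=0).isoformat()  # drop fractional seconds

timestamp = iso8601_now()
print(timestamp)   # for example: 2010-10-18T17:17:04-05:00
```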

The data (mr is similar to “two_theta” and I00 is similar to “counts”) is collated into two numpy arrays, one per column. We use the numpy package to read the file and parse the two-column format.
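The parsing step can be tried in isolation. The sketch below assumes three rows of the two-column text held in a string (standing in for input.dat) and uses the same `numpy.loadtxt` call and transpose as BasicWriter.py.

```python
import io
import numpy

# first three rows of the two-column data, standing in for input.dat
text = """\
17.92608    1037
17.92591    1318
17.92575    1704
"""

data = numpy.loadtxt(io.StringIO(text)).T     # transpose so each column becomes a row
mr_arr = data[0]                              # angles, float64
i00_arr = numpy.asarray(data[1], "int32")     # counts, converted to 32-bit integers

print(mr_arr.dtype, i00_arr.dtype)   # float64 int32
print(i00_arr.tolist())              # [1037, 1318, 1704]
```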

The new HDF5 file is opened (and created if not already existing) for writing, setting common NeXus attributes in the same command from our support library. Proper HDF5+NeXus groups are created for /entry:NXentry/mr_scan:NXdata. Since we are not using the NAPI, our support library must create and set the NX_class attribute on each group.

Note

We want to create the desired structure of /entry:NXentry/mr_scan:NXdata/.

First, our support library calls f = h5py.File() to create the file and root level NeXus structure.
Then, it calls nxentry = f.create_group("entry") to create the NXentry group called entry at the root level.
Then, it calls nxdata = nxentry.create_group("mr_scan") to create the NXdata group called mr_scan as a child of the NXentry group.
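The three steps above can be sketched as follows. To avoid touching the disk, this sketch assumes an in-memory file (h5py's `core` driver with `backing_store=False`); the file name `sketch.hdf5` is a placeholder.

```python
import h5py

# driver="core" with backing_store=False keeps the file in memory only
with h5py.File("sketch.hdf5", "w", driver="core", backing_store=False) as f:
    nxentry = f.create_group("entry")
    nxentry.attrs["NX_class"] = "NXentry"    # h5py does not set NX_class for us

    nxdata = nxentry.create_group("mr_scan")
    nxdata.attrs["NX_class"] = "NXdata"

    group_path = nxdata.name
    nx_class = nxdata.attrs["NX_class"]
    print(group_path)   # /entry/mr_scan
    print(nx_class)     # NXdata
```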

Next, we create a dataset called title to hold a title string that can appear on the default plot.

Next, we create datasets for mr and I00 using our support library. The data type of each, as represented in numpy, will be recognized by h5py and automatically converted to the proper HDF5 type in the file. A Python dictionary of attributes is given, specifying the engineering units and other values needed by NeXus to provide a default plot of this data. By setting signal="I00" as an attribute on the group, NeXus recognizes I00 as the default y axis for the plot. The axes="mr" attribute on the NXdata group connects the dataset to be used as the x axis.
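A minimal sketch of this step is shown here. The helper `make_dataset` is hypothetical (it stands in for the kind of support routine mentioned in the text), and the file is again kept in memory under the placeholder name `sketch.hdf5`.

```python
import h5py
import numpy

def make_dataset(parent, name, data, **attrs):
    """Hypothetical helper: create a dataset and attach its attributes."""
    ds = parent.create_dataset(name, data=data)
    for key, value in attrs.items():
        ds.attrs[key] = value
    return ds

with h5py.File("sketch.hdf5", "w", driver="core", backing_store=False) as f:
    nxdata = f.create_group("entry").create_group("mr_scan")
    nxdata.attrs["NX_class"] = "NXdata"
    nxdata.attrs["signal"] = "I00"    # default y axis
    nxdata.attrs["axes"] = "mr"       # default x axis

    mr = make_dataset(nxdata, "mr",
                      numpy.array([17.92608, 17.92591]), units="degrees")
    i00 = make_dataset(nxdata, "I00",
                       numpy.array([1037, 1318], dtype="int32"), units="counts")

    # h5py maps the numpy dtypes onto HDF5 types without further instruction
    dtypes = (str(mr.dtype), str(i00.dtype))
    units = (mr.attrs["units"], i00.attrs["units"])
    print(dtypes)   # ('float64', 'int32')
    print(units)    # ('degrees', 'counts')
```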

Finally, we must remember to call f.close() or we might corrupt the file when the program quits.
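One way to make the close automatic is to open the file in a `with` block, as the attribute-following reader later in this section does. The sketch below assumes a scratch file in a temporary directory so that it does not overwrite prj_test.nexus.hdf5.

```python
import os
import tempfile
import h5py

# a scratch path, so the sketch does not overwrite prj_test.nexus.hdf5
path = os.path.join(tempfile.mkdtemp(), "with_sketch.hdf5")

with h5py.File(path, "w") as f:
    f.attrs["default"] = "entry"
    f.create_group("entry").attrs["NX_class"] = "NXentry"
# the file is closed here, even if the block had raised an exception

with h5py.File(path, "r") as f:
    default = f.attrs["default"]
    print(default)   # entry
```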

BasicWriter.py: Write a NeXus HDF5 file using Python with h5py

#!/usr/bin/env python
'''Writes a NeXus HDF5 file using h5py and numpy'''

import h5py    # HDF5 support
import numpy
import six

print("Write a NeXus HDF5 file")
fileName = u"prj_test.nexus.hdf5"
timestamp = u"2010-10-18T17:17:04-0500"

# load data from two column format
data = numpy.loadtxt(u"input.dat").T
mr_arr = data[0]
i00_arr = numpy.asarray(data[1],'int32')

# create the HDF5 NeXus file
f = h5py.File(fileName, "w")
# point to the default data to be plotted
f.attrs[u'default']          = u'entry'
# give the HDF5 root some more attributes
f.attrs[u'file_name']        = fileName
f.attrs[u'file_time']        = timestamp
f.attrs[u'instrument']       = u'APS USAXS at 32ID-B'
f.attrs[u'creator']          = u'BasicWriter.py'
f.attrs[u'NeXus_version']    = u'4.3.0'
f.attrs[u'HDF5_Version']     = six.u(h5py.version.hdf5_version)
f.attrs[u'h5py_version']     = six.u(h5py.version.version)

# create the NXentry group
nxentry = f.create_group(u'entry')
nxentry.attrs[u'NX_class'] = u'NXentry'
nxentry.attrs[u'default'] = u'mr_scan'
nxentry.create_dataset(u'title', data=u'1-D scan of I00 v. mr')

# create the NXdata group
nxdata = nxentry.create_group(u'mr_scan')
nxdata.attrs[u'NX_class'] = u'NXdata'
nxdata.attrs[u'signal'] = u'I00'      # Y axis of default plot
nxdata.attrs[u'axes'] = u'mr'         # X axis of default plot
nxdata.attrs[u'mr_indices'] = [0,]   # use "mr" as the first dimension of I00

# X axis data
ds = nxdata.create_dataset(u'mr', data=mr_arr)
ds.attrs[u'units'] = u'degrees'
ds.attrs[u'long_name'] = u'USAXS mr (degrees)'    # suggested X axis plot label

# Y axis data
ds = nxdata.create_dataset(u'I00', data=i00_arr)
ds.attrs[u'units'] = u'counts'
ds.attrs[u'long_name'] = u'USAXS I00 (counts)'    # suggested Y axis plot label

f.close()   # be CERTAIN to close the file

print("wrote file:", fileName)

Reading the HDF5 file using h5py¶

The file reader, BasicReader.py, is very simple since the bulk of the work is done by h5py. Our code opens the HDF5 file we wrote above, prints the HDF5 attributes from the file, reads the two datasets, and then prints them out as columns. Real code would likely add error handling and extract other useful content from the file.

Note

Note that we identified each of the two datasets using HDF5 absolute path references (just using the group and dataset names). Also, while coding this example, we were reminded that HDF5 names are case-sensitive. That is, I00 is not the same as i00.
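The case sensitivity is easy to confirm with a membership test on absolute paths. This sketch assumes an in-memory file with the placeholder name `sketch.hdf5` and only two data points.

```python
import h5py
import numpy

with h5py.File("sketch.hdf5", "w", driver="core", backing_store=False) as f:
    # intermediate groups /entry and /entry/mr_scan are created automatically
    f.create_dataset("/entry/mr_scan/I00",
                     data=numpy.array([1037, 1318], dtype="int32"))

    found_upper = "/entry/mr_scan/I00" in f    # exact name
    found_lower = "/entry/mr_scan/i00" in f    # differs only in case
    print(found_upper, found_lower)            # True False
```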

BasicReader.py: Read a NeXus HDF5 file using Python with h5py

#!/usr/bin/env python
'''Reads NeXus HDF5 files using h5py and prints the contents'''

import h5py    # HDF5 support

fileName = "prj_test.nexus.hdf5"
f = h5py.File(fileName,  "r")
for item in f.attrs.keys():
    print(item + ":", f.attrs[item])
mr = f['/entry/mr_scan/mr']
i00 = f['/entry/mr_scan/I00']
print("%s\t%s\t%s" % ("#", "mr", "I00"))
for i in range(len(mr)):
    print("%d\t%g\t%d" % (i, mr[i], i00[i]))
f.close()

Output from BasicReader.py is shown next.

Output from BasicReader.py

file_name: prj_test.nexus.hdf5
file_time: 2010-10-18T17:17:04-0500
creator: BasicWriter.py
HDF5_Version: 1.8.5
NeXus_version: 4.3.0
h5py_version: 1.2.1
instrument: APS USAXS at 32ID-B
#	mr	I00
0	17.9261	1037
1	17.9259	1318
2	17.9258	1704
3	17.9256	2857
4	17.9254	4516
5	17.9252	9998
6	17.9251	23819
7	17.9249	31662
8	17.9247	40458
9	17.9246	49087
10	17.9244	56514
11	17.9243	63499
12	17.9241	66802
13	17.9239	66863
14	17.9237	66599
15	17.9236	66206
16	17.9234	65747
17	17.9232	65250
18	17.9231	64129
19	17.9229	63044
20	17.9228	60796
21	17.9226	56795
22	17.9224	51550
23	17.9222	43710
24	17.9221	29315
25	17.9219	19782
26	17.9217	12992
27	17.9216	6622
28	17.9214	4198
29	17.9213	2248
30	17.9211	1321

Finding the default plottable data¶

Let’s make a new reader that follows the chain of attributes (@default, @signal, and @axes) to find the default plottable data. We’ll use the same data file as the previous example. Our demo here assumes one-dimensional data. (For higher-dimensional data, we’ll need more complexity when handling the @axes attribute and we’ll need to check the field sizes. See section Find the plottable data, subsection Version 3, for the details.)

reader_attributes_trail.py: Read a NeXus HDF5 file using Python with h5py

import h5py

with h5py.File("prj_test.nexus.hdf5", "r") as nx:
    # find the default NXentry group
    nx_entry = nx[nx.attrs["default"]]
    # find the default NXdata group
    nx_data = nx_entry[nx_entry.attrs["default"]]
    # find the signal field
    signal = nx_data[nx_data.attrs["signal"]]
    # find the axes field(s)
    attr_axes = nx_data.attrs["axes"]
    if not isinstance(attr_axes, (str, bytes)):
        # h5py returns a multi-valued attribute as an array;
        # check that attr_axes only describes 1-D data
        if len(attr_axes) == 1:
            attr_axes = attr_axes[0]
        else:
            raise ValueError(f"expected 1-D data but @axes={attr_axes}")
    axes = nx_data[attr_axes]

    print(f"file: {nx.filename}")
    print(f"signal: {signal.name}")
    print(f"axes: {axes.name}")
    print(f"{axes.name} {signal.name}")
    for x, y in zip(axes, signal):
        print(x, y)

Output from reader_attributes_trail.py is shown next.

Output from reader_attributes_trail.py

file: prj_test.nexus.hdf5
signal: /entry/mr_scan/I00
axes: /entry/mr_scan/mr
/entry/mr_scan/mr /entry/mr_scan/I00
17.92608 1037
17.92591 1318
17.92575 1704
17.92558 2857
17.92541 4516
17.92525 9998
17.92508 23819
17.92491 31662
17.92475 40458
17.92458 49087
17.92441 56514
17.92425 63499
17.92408 66802
17.92391 66863
17.92375 66599
17.92358 66206
17.92341 65747
17.92325 65250
17.92308 64129
17.92291 63044
17.92275 60796
17.92258 56795
17.92241 51550
17.92225 43710
17.92208 29315
17.92191 19782
17.92175 12992
17.92158 6622
17.92141 4198
17.92125 2248
17.92108 1321

Plotting the HDF5 file¶

Now that we are certain our file conforms to the NeXus standard, let’s plot it using the NeXpy 3 client tool. To help label the plot, we added the long_name attribute to each of our datasets. We also added metadata to the root level of our HDF5 file, similar to that written by the NAPI, which seemed a useful addition. Compare this with the plot of our mr_scan above and note that the horizontal axis of this plot is mirrored. This is because the data is stored in the file in descending mr order and NeXpy has plotted it that way (in order of appearance) by default.

3: NeXpy: http://nexpy.github.io/nexpy/

fig-Example-H5py-nexpy-plot — plot of our *mr_scan* using NeXpy¶
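If an ascending axis is preferred, the arrays could be reordered before writing or after reading. The following is only a sketch using three of the data points; it is not something BasicWriter.py does.

```python
import numpy

mr = numpy.array([17.92608, 17.92591, 17.92575])     # stored in descending order
i00 = numpy.array([1037, 1318, 1704], dtype="int32")

order = numpy.argsort(mr)             # indices that put mr in ascending order
mr_sorted = mr[order]
i00_sorted = i00[order]               # keep each count paired with its angle

print(mr_sorted.tolist())    # [17.92575, 17.92591, 17.92608]
print(i00_sorted.tolist())   # [1704, 1318, 1037]
```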

Links to Data in External HDF5 Files¶

HDF5 files may contain links to data (or groups) in other files. This can be used to advantage to refer to data in existing HDF5 files and create NeXus-compliant data files. Here, we show such an example, using the same counts v. two_theta data from the examples above.

In h5py, an HDF5 external link is created by assigning an h5py.ExternalLink object to a path in the open file:

f[local_addr] = h5py.ExternalLink(external_file_name, external_addr)

where f is an open h5py.File() object in which we will create the new link, local_addr is an HDF5 path address, external_file_name is the name (relative or absolute) of an existing HDF5 file, and external_addr is the HDF5 path address of the existing data in the external_file_name to be linked.
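A small round trip shows the link in use. This sketch assumes two scratch files in one temporary directory; the names `angles_sketch.hdf5` and `master_sketch.hdf5` are placeholders chosen so that the example files of this section are not overwritten.

```python
import os
import tempfile
import h5py
import numpy

workdir = tempfile.mkdtemp()                  # both files share one directory
angles_file = os.path.join(workdir, "angles_sketch.hdf5")
master_file = os.path.join(workdir, "master_sketch.hdf5")

# the existing file that holds the data
with h5py.File(angles_file, "w") as f:
    f.create_dataset("angles", data=numpy.array([17.92608, 17.92108]))

# the new file holds only a link, given as a file name relative to itself
with h5py.File(master_file, "w") as f:
    f.create_group("/entry/data")
    f["/entry/data/two_theta"] = h5py.ExternalLink("angles_sketch.hdf5", "/angles")

# reading through the link looks like reading a local dataset
with h5py.File(master_file, "r") as f:
    two_theta = f["/entry/data/two_theta"][()]
    print(two_theta.tolist())   # [17.92608, 17.92108]
```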

file: external_angles.hdf5¶

Take, for example, the structure of external_angles.hdf5, a simple HDF5 data file that contains just the two_theta angles in an HDF5 dataset at the root level of the file. Although this is a valid HDF5 data file, it is not a valid NeXus data file:

angles:float64[31] = [17.926079999999999, '...', 17.92108]
  @units = degrees

file: external_counts.hdf5¶

The data in the file external_angles.hdf5 might be referenced from another HDF5 file (such as external_counts.hdf5) by an HDF5 external link. 4 Here is an example of the structure:

entry:NXentry
  instrument:NXinstrument
    detector:NXdetector
      counts:NX_INT32[31] = [1037, '...', 1321]
        @units = counts
      two_theta --> file="external_angles.hdf5", path="/angles"

4: see these URLs for further guidance on HDF5 external links: https://portal.hdfgroup.org/display/HDF5/H5L_CREATE_EXTERNAL, http://docs.h5py.org/en/stable/high/group.html#external-links

file: external_master.hdf5¶

A valid NeXus data file could be created that refers to the data in these files without making a copy of the data files themselves.

Note

It is necessary for all these files to be located together in the same directory for the HDF5 external file links to work properly.

To be a valid NeXus file, it must contain an NXentry group. For the files above, it is simple to make a master file that links to the data we desire, from a structure that we create. We then add the group attributes that describe the default plottable data:

data:NXdata
  @signal = counts
  @axes = "two_theta"
  @two_theta_indices = 0

Here is (the basic structure of) external_master.hdf5, an example:

entry:NXentry
  @default = data
  instrument --> file="external_counts.hdf5", path="/entry/instrument"
  data:NXdata
    @signal = counts
    @axes = "two_theta"
    @two_theta_indices = 0
    counts --> file="external_counts.hdf5", path="/entry/instrument/detector/counts"
    two_theta --> file="external_angles.hdf5", path="/angles"
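The `-->` entries in the structure above are links, not copies. A link can be inspected without following it by passing `getlink=True` to `Group.get()`. The sketch below assumes an in-memory file with the placeholder name `sketch.hdf5`; the link target does not have to exist for this query.

```python
import h5py

with h5py.File("sketch.hdf5", "w", driver="core", backing_store=False) as f:
    nxdata = f.create_group("/entry/data")
    nxdata["two_theta"] = h5py.ExternalLink("external_angles.hdf5", "/angles")

    # getlink=True returns the link object itself instead of the linked data
    link = nxdata.get("two_theta", getlink=True)
    is_external = isinstance(link, h5py.ExternalLink)
    link_file, link_path = link.filename, link.path
    print(is_external, link_file, link_path)
    # True external_angles.hdf5 /angles
```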

source code: externalExample.py¶

Here is the complete code of a Python program, using h5py to write a NeXus-compliant HDF5 file with links to data in other HDF5 files.

externalExample.py: Write using HDF5 external links

#!/usr/bin/env python
'''
Writes a NeXus HDF5 file using h5py with links to data in other HDF5 files.

This example is based on ``writer_2_1``.
'''

import h5py
import numpy

FILE_HDF5_MASTER = u"external_master.hdf5"
FILE_HDF5_ANGLES = u"external_angles.hdf5"
FILE_HDF5_COUNTS = u"external_counts.hdf5"

#---------------------------

# get some data
buffer = numpy.loadtxt("input.dat").T
tthData = buffer[0]                             # float[]
countsData = numpy.asarray(buffer[1],'int32')   # int[]

# put the angle data in an external (non-NeXus) HDF5 data file
f = h5py.File(FILE_HDF5_ANGLES, "w")
ds = f.create_dataset(u"angles", data=tthData)
ds.attrs[u"units"] = u"degrees"
f.close()    # be CERTAIN to close the file


# put the detector counts in an external HDF5 data file 
# with *incomplete* NeXus structure (no NXdata group)
f = h5py.File(FILE_HDF5_COUNTS, "w")
nxentry = f.create_group(u"entry")
nxentry.attrs[u"NX_class"] = u"NXentry"
nxinstrument = nxentry.create_group(u"instrument")
nxinstrument.attrs[u"NX_class"] = u"NXinstrument"
nxdetector = nxinstrument.create_group(u"detector")
nxdetector.attrs[u"NX_class"] = u"NXdetector"
ds = nxdetector.create_dataset(u"counts", data=countsData)
ds.attrs[u"units"] = u"counts"
# link the "two_theta" data stored in separate file
local_addr = nxdetector.name + u"/two_theta"
f[local_addr] = h5py.ExternalLink(FILE_HDF5_ANGLES, u"/angles")
f.close()

# create a master NeXus HDF5 file
f = h5py.File(FILE_HDF5_MASTER, "w")
f.attrs[u"default"] = u"entry"
nxentry = f.create_group(u"entry")
nxentry.attrs[u"NX_class"] = u"NXentry"
nxentry.attrs[u"default"] = u"data"
nxdata = nxentry.create_group(u"data")
nxdata.attrs[u"NX_class"] = u"NXdata"

# link in the signal data
local_addr = '/entry/data/counts'
external_addr = u"/entry/instrument/detector/counts"
f[local_addr] = h5py.ExternalLink(FILE_HDF5_COUNTS, external_addr)
nxdata.attrs[u"signal"] = u"counts"

# link in the axes data
local_addr = u"/entry/data/two_theta"
f[local_addr] = h5py.ExternalLink(FILE_HDF5_ANGLES, u"/angles")
nxdata.attrs[u"axes"] = u"two_theta"
nxdata.attrs[u"two_theta_indices"] = [0,]

local_addr = u"/entry/instrument"
f[local_addr] = h5py.ExternalLink(FILE_HDF5_COUNTS, u"/entry/instrument")

f.close()

downloads¶

The Python code and files related to this section may be downloaded from the following table.

file	description
`input.dat`	2-column ASCII data used in this section
`BasicReader.py`	python code to read example prj_test.nexus.hdf5
`BasicWriter.py`	python code to write example prj_test.nexus.hdf5
`external_angles_h5dump.txt`	h5dump analysis of external_angles.hdf5
`external_angles.hdf5`	HDF5 file written by externalExample
`external_angles_structure.txt`	punx tree analysis of external_angles.hdf5
`external_counts_h5dump.txt`	h5dump analysis of external_counts.hdf5
`external_counts.hdf5`	HDF5 file written by externalExample
`external_counts_structure.txt`	punx tree analysis of external_counts.hdf5
`externalExample.py`	python code to write external linking examples
`external_master_h5dump.txt`	h5dump analysis of external_master.hdf5
`external_master.hdf5`	NeXus file written by externalExample
`external_master_structure.txt`	punx tree analysis of external_master.hdf5
`prj_test.nexus_h5dump.txt`	h5dump analysis of the NeXus file
`prj_test.nexus.hdf5`	NeXus file written by BasicWriter
`prj_test.nexus_structure.txt`	punx tree analysis of the NeXus file
