zea.data.file
zea H5 file functionality.
Functions
- Asserts that a key is in an h5py.File.
- dict_to_sorted_list: Convert a dictionary with sortable keys to a sorted list of values.
- load_file: Loads a zea data file (h5py file).
- load_file_all_data_types: Loads a zea data file (h5py file).
- validate_file: Validate the structure and data of a zea HDF5 file.
Classes
- File: h5py.File in zea format.
- GroupProxy: Lazy proxy for an h5py.Group that exposes children as attributes.
- class zea.data.file.File(name, mode='r', *args, **kwargs)
Bases: File
h5py.File in zea format.
Initialize the file.
- Parameters:
name (str, Path, HFPath) – The path to the file. Can be a string or a Path object. It can also be a string with the prefix ‘hf://’, in which case it is resolved to a Hugging Face path.
mode (str, optional) – The mode to open the file in. Defaults to “r”.
*args – Additional arguments to pass to h5py.File.
**kwargs – Additional keyword arguments to pass to h5py.File.
- copy_key(key, dst)
Copy a specific key to another file.
Will always copy the attributes and the scan data if it exists. Will warn if the key is not in this file or if the key already exists in the destination file.
- Parameters:
key (str) – The key to copy.
dst (File) – The destination file to copy the key to.
- classmethod create(path, data, scan=None, metadata=None, metrics=None, probe_name=None, us_machine=None, description=None, compression='gzip', overwrite=False)
Create a new zea HDF5 file from data, scan, and optional metadata.
All inputs are validated against the FileSpec schema (dtypes, shapes, dimension consistency) before anything is written to disk.
- Parameters:
path – Destination file path.
data (dict) – Data dict accepted by DataSpec.
scan (dict | None) – Scan-parameter dict accepted by ScanSpec.
metadata (dict | None) – Optional metadata dict accepted by MetadataSpec.
metrics (dict | None) – Optional metrics dict accepted by MetricsSpec.
probe_name (str | None) – Name of the probe.
us_machine (str | None) – Name of the ultrasound machine.
description (str | None) – Free-text description of the acquisition.
compression (str) – HDF5 compression filter (default "gzip").
overwrite (bool) – If False (default), raise if the file exists.
- Returns:
The closed File handle (re-open with File(path) to read).
- Return type:
>>> import os, tempfile
>>> import numpy as np
>>> from zea import File
>>> n_frames, n_tx, n_el, n_ax = 2, 4, 8, 64
>>> raw = np.zeros((n_frames, n_tx, n_ax, n_el, 1), dtype=np.float32)
>>> geom = np.zeros((n_el, 3), dtype=np.float32)
>>> scan = {
...     "probe_geometry": geom,
...     "sampling_frequency": np.float32(40e6),
...     "center_frequency": np.float32(5e6),
...     "demodulation_frequency": np.float32(5e6),
...     "initial_times": np.zeros(n_tx, dtype=np.float32),
...     "t0_delays": np.zeros((n_tx, n_el), dtype=np.float32),
...     "tx_apodizations": np.ones((n_tx, n_el), dtype=np.float32),
...     "focus_distances": np.full(n_tx, np.inf, dtype=np.float32),
...     "transmit_origins": np.zeros((n_tx, 3), dtype=np.float32),
...     "polar_angles": np.zeros(n_tx, dtype=np.float32),
...     "time_to_next_transmit": np.ones((n_frames, n_tx), dtype=np.float32) * 1e-4,
... }
>>> _, path = tempfile.mkstemp(suffix=".hdf5")
>>> f = File.create(
...     path, data={"raw_data": raw}, scan=scan, probe_name="L11-4v", overwrite=True
... )
>>> f.probe_name
'L11-4v'
>>> f.close()
>>> os.unlink(path)
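To make "dimension consistency" concrete, here is a minimal sketch of the kind of cross-field shape check that could run before writing. It is not the FileSpec implementation: check_dimensions is a hypothetical helper, and the expected shapes are assumptions read off the example above, with raw data laid out as (n_frames, n_tx, n_ax, n_el, n_ch).

```python
import numpy as np


def check_dimensions(raw_data, scan):
    """Hypothetical helper: compare array shapes before writing anything."""
    if raw_data.ndim != 5:
        raise ValueError(f"raw_data must be 5-D, got {raw_data.ndim}-D")
    n_frames, n_tx, n_ax, n_el, n_ch = raw_data.shape

    # Expected shapes, assumed here from the example above (not from zea).
    expected = {
        "probe_geometry": (n_el, 3),
        "t0_delays": (n_tx, n_el),
        "tx_apodizations": (n_tx, n_el),
        "initial_times": (n_tx,),
    }
    for key, shape in expected.items():
        if key in scan and np.shape(scan[key]) != shape:
            raise ValueError(
                f"{key} has shape {np.shape(scan[key])}, expected {shape}"
            )
    return {"n_frames": n_frames, "n_tx": n_tx, "n_ax": n_ax, "n_el": n_el, "n_ch": n_ch}


raw = np.zeros((2, 4, 64, 8, 1), dtype=np.float32)
scan = {
    "probe_geometry": np.zeros((8, 3), dtype=np.float32),
    "t0_delays": np.zeros((4, 8), dtype=np.float32),
}
dims = check_dimensions(raw, scan)
```

Raising before any write means a mismatched input never leaves a partially written file on disk, which is the property the description above promises.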
- property data: GroupProxy
Lazy proxy for the data group.
Returns a GroupProxy so individual datasets can be accessed as attributes without loading everything into RAM:
with File(path) as f:
    f.data.raw_data[:, :n_tx]  # read a slice
    f.data.image.values[0]  # nested group access
- property description
Reads the description from the data file and returns it.
- get_parameters()
Returns a dictionary of parameters to initialize a scan object that comes with the file (stored inside the data file).
If there are no scan parameters in the hdf5 file, returns an empty dictionary.
- Returns:
The scan parameters.
- Return type:
dict
- get_probe_parameters()
Returns a dictionary of probe parameters to initialize a probe object that comes with the file (stored inside the data file).
- Returns:
The probe parameters.
- Return type:
dict
- get_scan_parameters()
Returns a dictionary of scan parameters stored in the file.
- Return type:
dict
- classmethod get_shape(path, key)
Get the shape of a key in a file.
- Parameters:
path (str) – The path to the file.
key (str) – The key to get the shape of.
- Returns:
The shape of the key.
- Return type:
tuple
- has_key(key)
Check if the file has a specific key.
- Parameters:
key (str) – The key to check.
- Returns:
True if the key exists, False otherwise.
- Return type:
bool
- load_data(data_type, indices=None)
Load data from the file.
Deprecated: Use file.data.<key> with standard h5py slice indexing instead:
with File(path) as f:
    raw = f.data.raw_data[:]  # all frames
    raw = f.data.raw_data[0]  # first frame
    raw = f.data.raw_data[0, [0, 2]]  # frame 0, transmits 0 and 2
The indices parameter can be used to load a subset of the data. This can be:
- 'all' or None to load all data
- an int to load a single frame
- a List[int] to load specific frames
- a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.
For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.
- Parameters:
data_type (str) – The type of data to load. Options are ‘raw_data’, ‘aligned_data’, ‘beamformed_data’, ‘envelope_data’, ‘image’ and ‘image_sc’.
indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None, in which case all data is loaded.
- Return type:
ndarray
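The accepted forms of indices map onto ordinary array indexing. The sketch below mimics them on an in-memory NumPy array rather than an h5py dataset. normalize_indices is a hypothetical helper, not part of zea, and the toy array shape is made up for illustration.

```python
import numpy as np


def normalize_indices(indices):
    """Hypothetical helper: turn the documented options into an index object."""
    if indices is None or (isinstance(indices, str) and indices == "all"):
        return slice(None)  # load everything
    if isinstance(indices, (int, list, tuple)):
        return indices  # single frame, list of frames, or per-axis tuple
    raise TypeError(f"Unsupported indices: {indices!r}")


# Toy data laid out as (n_frames, n_tx, n_ax).
data = np.arange(3 * 4 * 5).reshape(3, 4, 5)

all_frames = data[normalize_indices("all")]
one_frame = data[normalize_indices(0)]
some_frames = data[normalize_indices([0, 2])]
frames_and_tx = data[normalize_indices((slice(None), [0, 2]))]

# Lists on two axes at once: index one axis first, then slice the result.
subset = data[[0, 2]][:, [1, 3]]
```

The last line shows the "slice the data after loading" workaround: applying one list per step selects frames 0 and 2 and then transmits 1 and 3, instead of passing both lists in a single index.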
- load_transmits(key, selected_transmits)
Load raw_data or aligned_data for a given list of transmits.
- Parameters:
key (str) – The type of data to load. Options are ‘raw_data’ and ‘aligned_data’.
selected_transmits (list, np.ndarray) – The transmits to load.
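A minimal sketch of what selecting transmits amounts to, assuming the transmit axis is the second one, as in the (n_frames, n_tx, n_ax, n_el, n_ch) layout documented for load_file. It operates on an in-memory NumPy array. select_transmits is a hypothetical helper, not the zea method.

```python
import numpy as np


def select_transmits(data, selected_transmits, tx_axis=1):
    """Hypothetical helper: keep only the requested transmits."""
    selected = np.asarray(selected_transmits, dtype=int)
    return np.take(data, selected, axis=tx_axis)


# Toy array: 2 frames, 4 transmits, 16 axial samples, 8 elements, 1 channel.
raw = np.random.default_rng(0).normal(size=(2, 4, 16, 8, 1))
picked = select_transmits(raw, [0, 2])
```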
- metadata()
Return a validated MetadataSpec object from the file.
- Returns:
The validated metadata spec.
- Return type:
- Raises:
KeyError – If the file has no metadata group.
Example:
>>> with File("my_file.hdf5") as f:
...     meta = f.metadata()
...     print(meta.subject.id)
- metrics()
Return a validated MetricsSpec object from the file.
- Returns:
The validated metrics spec.
- Return type:
- Raises:
KeyError – If the file has no metrics group.
Example:
>>> with File("my_file.hdf5") as f:
...     met = f.metrics()
...     print(met.coherence_factor.shape)
- property n_ax: int
Number of axial samples.
- property n_frames
Return the number of frames in the file.
- property name
Return the name of the file.
- property path
Return the path of the file.
- probe()
Returns a Probe object initialized with the parameters from the file.
- Returns:
The probe object.
- Return type:
>>> from zea import File
>>> path = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )
>>> with File(path) as f:
...     probe = f.probe()
>>> type(probe).__name__
'Verasonics_l11_4v'
- property probe_name
Reads the probe name from the data file and returns it.
- recursively_load_dict_contents_from_group(path)
Load a dict from the contents of a group.
Values inside the group are converted to numpy arrays or primitive types (int, float, str).
- Parameters:
path (str) – Path to the group.
- Returns:
dictionary with contents of group
- Return type:
dict
- scan(safe=True, **kwargs)
Returns a Scan object initialized with the parameters from the file.
- Parameters:
safe (bool, optional) – If True, will only use parameters that are defined in the Scan class. If False, will use all parameters from the file. Defaults to True.
**kwargs – Additional keyword arguments to pass to the Scan object. These will override the parameters from the file if they are present in the file.
- Returns:
The scan object.
- Return type:
>>> from zea import File
>>> path = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )
>>> with File(path) as f:
...     scan = f.scan()
>>> type(scan).__name__
'Scan'
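The safe flag and the override behaviour of **kwargs can be pictured with a small sketch. ToyScan and build_scan are hypothetical stand-ins, not the zea Scan class or its loader. The sketch only shows filtering file parameters against a constructor signature and letting keyword arguments take precedence.

```python
import inspect


class ToyScan:
    """Hypothetical stand-in for a scan object with two known parameters."""

    def __init__(self, n_tx=1, sampling_frequency=1.0):
        self.n_tx = n_tx
        self.sampling_frequency = sampling_frequency


def build_scan(file_params, safe=True, **kwargs):
    """Merge file parameters with overrides; optionally drop unknown keys."""
    params = {**file_params, **kwargs}  # kwargs override values from the file
    if safe:
        # Keep only the names the constructor actually accepts.
        known = set(inspect.signature(ToyScan.__init__).parameters) - {"self"}
        params = {k: v for k, v in params.items() if k in known}
    return ToyScan(**params)


file_params = {"n_tx": 4, "sampling_frequency": 40e6, "vendor_note": "unused"}
scan = build_scan(file_params, n_tx=2)
```

In this toy version, calling build_scan(file_params, safe=False) raises a TypeError because ToyScan does not accept vendor_note; how the real Scan class treats unknown parameters is not shown here.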
- property stem
Return the stem of the file.
- property us_machine
Reads the ultrasound machine name from the data file and returns it.
- validate()
Lightweight structural validation: no array data is loaded into RAM.
Checks that the file has a data group and that all keys within it are recognised zea data types. For legacy files (before zea v0.1.0) a minimal key-name check is performed. For files created with zea v0.1.0 and later (via File.create()) the keys are checked against the DataSpec schema.
Use validate_spec() for a full validation that loads all data and checks dtypes, shapes, and cross-field dimension consistency.
- Returns:
{"status": "success"} on success.
- Return type:
dict
- Raises:
AssertionError – If the file is missing required groups or contains unrecognised data keys.
- validate_spec()
Full schema validation: loads all data into RAM.
Reads every dataset in the file and runs dtype, shape, and cross-dimension consistency checks as defined by FileSpec. Use this to confirm a file is fully spec-compliant before sharing or processing it.
For a fast, zero-IO structural check use validate() instead.
Note
This method only works on files created with zea v0.1.0 and later. Files written before zea v0.1.0 should be re-saved through File.create().
- Returns:
The fully validated spec object, with all data accessible as typed attributes (e.g. spec.data.raw_data, spec.scan.n_tx).
- Return type:
- Raises:
TypeError, ValueError – If the file does not conform to the spec.
>>> with File("my_file.hdf5") as f:
...     spec = f.validate_spec()
...     print(spec.scan.n_tx)
- property zea_version: str | None
Return the zea version that wrote this file, or None for legacy files.
Files created with zea v0.1.0 and later store a zea_version root attribute. Files written before zea v0.1.0 return None.
- class zea.data.file.GroupProxy(group)
Bases: object
Lazy proxy for an h5py.Group that exposes children as attributes.
Datasets are returned as-is (h5py.Dataset supports slicing without loading everything into RAM). Sub-groups are wrapped in another GroupProxy so the dot-access pattern works recursively:
with File(path) as f:
    # returns h5py.Dataset – no data loaded yet
    f.data.raw_data
    # slicing triggers the actual read, just like plain h5py
    f.data.raw_data[:, :n_tx]
    # nested groups work too
    f.data.image.values[0]
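The recursive dot-access pattern can be sketched without h5py by letting nested dicts stand in for groups. DictGroupProxy below is a hypothetical illustration of the idea, not the zea GroupProxy. It wraps sub-groups on access and returns leaf values untouched.

```python
class DictGroupProxy:
    """Hypothetical proxy: nested dicts stand in for h5py groups."""

    def __init__(self, group):
        self._group = group

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            child = self._group[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap sub-groups so dot access keeps working; return leaves as-is.
        if isinstance(child, dict):
            return DictGroupProxy(child)
        return child


tree = {"raw_data": [1, 2, 3], "image": {"values": [10, 20]}}
data = DictGroupProxy(tree)
first_value = data.image.values[0]
```

Nothing is copied or read until an attribute is requested, which mirrors the laziness described above: the proxy only holds a reference to the underlying group.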
- zea.data.file.dict_to_sorted_list(dictionary)
Convert a dictionary with sortable keys to a sorted list of values.
Note
This function operates on the top level of the dictionary only. If the dictionary contains nested dictionaries, those will not be sorted.
Example
>>> from zea.data.file import dict_to_sorted_list
>>> input_dict = {"number_000": 5, "number_001": 1, "number_002": 23}
>>> dict_to_sorted_list(input_dict)
[5, 1, 23]
- Parameters:
dictionary (dict) – The dictionary to convert. The keys must be sortable.
- Returns:
The list of values, ordered by their sorted keys.
- Return type:
list
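As the example output [5, 1, 23] shows, the ordering follows the keys, not the values. A minimal sketch of equivalent behaviour, using a hypothetical function name and not the zea source:

```python
def sorted_values_by_key(dictionary):
    """Hypothetical re-implementation: values ordered by their sorted keys."""
    return [dictionary[key] for key in sorted(dictionary)]


# Insertion order is deliberately shuffled; the result follows key order.
frames = {"number_002": 23, "number_000": 5, "number_001": 1}
ordered = sorted_values_by_key(frames)  # [5, 1, 23]
```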
- zea.data.file.load_file(path, data_type='raw_data', indices=None, scan_kwargs=None)
Loads a zea data file (h5py file).
Returns the data together with a scan object containing the parameters of the acquisition and a probe object containing the parameters of the probe.
Additionally, it can load a specific subset of frames / transmits.
The indices parameter can be used to load a subset of the data. This can be:
- 'all' or None to load all data
- an int to load a single frame
- a List[int] to load specific frames
- a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.
For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.
- Parameters:
path (str, pathlike) – The path to the hdf5 file.
data_type (str, optional) – The type of data to load. Defaults to ‘raw_data’. Other options are ‘aligned_data’, ‘beamformed_data’, ‘envelope_data’, ‘image’ and ‘image_sc’.
indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None, in which case all frames are loaded.
scan_kwargs (dict) – Additional keyword arguments to pass to the Scan object. These will override the parameters from the file if they are present in the file. Defaults to None.
- Returns:
The raw data of shape (n_frames, n_tx, n_ax, n_el, n_ch).
(Scan): A scan object containing the parameters of the acquisition.
(Probe): A probe object containing the parameters of the probe.
- Return type:
- zea.data.file.load_file_all_data_types(path, indices=None, scan_kwargs=None)
Loads a zea data file (h5py file).
Returns all data types together with a scan object containing the parameters of the acquisition and a probe object containing the parameters of the probe.
Additionally, it can load a specific subset of frames / transmits.
The indices parameter can be used to load a subset of the data. This can be:
- 'all' or None to load all data
- an int to load a single frame
- a List[int] to load specific frames
- a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.
For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.
- Parameters:
path (str, pathlike) – The path to the hdf5 file.
indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None, in which case all frames are loaded.
scan_kwargs (dict) – Additional keyword arguments to pass to the Scan object. These will override the parameters from the file if they are present in the file. Defaults to None.
- Returns:
(dict): A dictionary with all data types as keys and the corresponding data as values.
(Scan): A scan object containing the parameters of the acquisition.
(Probe): A probe object containing the parameters of the probe.
- Return type:
- zea.data.file.validate_file(path=None, file=None)
Validate the structure and data of a zea HDF5 file.
For files created with zea v0.1.0 and later this runs the full FileSpec schema validation (dtypes, shapes, and dimension consistency). Legacy files (before zea v0.1.0) are detected by the presence of the scalar dataset scan/n_frames; for those only a lightweight structural data group check is performed.
Provide either path or file, but not both.
- Parameters:
- Returns:
{"status": "success"} on success.
- Return type:
dict
- Raises:
AssertionError – If the file is missing the data group.
TypeError, ValueError – If spec validation fails on files created with zea v0.1.0 and later.
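The "either path or file, but not both" rule is a common mutually exclusive argument pattern. A minimal sketch follows, with a hypothetical helper name. The exception type raised by zea in this situation is not stated above, so ValueError here is an assumption.

```python
def resolve_source(path=None, file=None):
    """Hypothetical helper: require exactly one of path or file."""
    # True when both are given or both are missing.
    if (path is None) == (file is None):
        raise ValueError("Provide either path or file, but not both.")
    return ("path", path) if path is not None else ("file", file)


kind, source = resolve_source(path="my_file.hdf5")
```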