A dataset is a collection of images, depth maps, camera parameters, etc. from an acquisition. The dataset parameters file defines how the image, depth map, etc. files are organized and numbered on the file system.
All of these tools use a common JSON-based dataset parameters file format. It can describe both 1D and 2D datasets. Several versions of equivalent files can coexist (e.g. for each view: the Kinect raw depth map, the upsampled depth map, and the VSRS disparity map), and each version can use a different numbering.
The classes implemented in src/lib/dataset.h (C++) and src/lib/pylib/dataset.py (Python) are used by all other tools to read the dataset parameters.
{
"y_index_range" : [101, 201],
"x_index_range" : [1, 851],
"camera_name_format" : "cam_{y:04d}{x:04d}",
"cameras_filename" : "cameras.json",
"width" : 1920,
"height" : 1080,
"image_filename_format" : "files/image_{y:04d}{x:04d}.png",
"depth_filename_format" : "files/depth_{y:04d}{x:04d}.png",
"vsrs" : {
"z_far" : 1600.0,
"z_near" : 600.0,
"image_filename_format" : "../3DLicornea_A1/textures/Band2_0101-0201_step1mm/files/cam_{y:04d}{x:04d}.yuv",
"depth_filename_format" : "../3DLicornea_A1/disparity/Band2_0101-0201_step1mm/files/cam_{y:04d}{x:04d}.yuv"
},
"kinect_raw" : {
"filename_y_index_factor" : -1,
"filename_y_index_offset" : 684,
"filename_x_index_factor" : 1,
"filename_x_index_offset" : 0,
"image_filename_format" : "../raw/483-583_step1mm_acq2/texture/{y}.0/Kinect_out_texture_000_{y}.0z_0001_{x:04d}.png",
"depth_filename_format" : "../raw/483-583_step1mm_acq2/depth/{y}.0/Kinect_out_depth_000_{y}.0z_0001_{x:04d}.png"
}
}
{
"x_index_range": [1, 851],
"camera_name_format": "cam_0151{x:04d}",
"cameras_filename": "cameras.json",
"width": 1920,
"height": 1080,
"image_filename_format": "files/image_0151{x:04d}.png",
"depth_filename_format": "files/depth_0151{x:04d}.png",
"vsrs": {
"z_far": 1600.0,
"z_near": 600.0,
"image_filename_format": "../3DLicornea_A1/textures/Band2_0101-0201_step1mm/files/cam_0151{x:04d}.yuv",
"depth_filename_format": "../3DLicornea_A1/disparity/Band2_0101-0201_step1mm/files/cam_0151{x:04d}.yuv"
},
"kinect_raw": {
"filename_x_index_factor": 1,
"filename_x_index_offset": 0,
"image_filename_format": "../raw/483-583_step1mm_acq2/texture/533.0/Kinect_out_texture_000_533.0z_0001_{x:04d}.png",
"depth_filename_format": "../raw/483-583_step1mm_acq2/depth/533.0/Kinect_out_depth_000_533.0z_0001_{x:04d}.png"
}
}
The parameters file is a JSON file. The index ranges (x_index_range, y_index_range), the image and depth filenames (image_filename_format, depth_filename_format), the camera parameters, and the other attributes are global to the dataset. The objects vsrs and kinect_raw are file groups. They contain different versions of the files, possibly with a different numbering.
Any extra attributes added to the files are ignored by the tools. Tools that modify parameter files (in dataset/) leave the extra attributes in place.
The first example above is a 2D dataset, having X and Y indices. The possible index ranges are given by x_index_range and y_index_range.
The minimal and maximal index values are inclusive (i.e. x=851 is a valid index). This differs from Python's range(), which excludes the end value.
The index is only used for numbering and has no direct relation to the camera arrangement. Each index (x,y) corresponds to a view.
The view has a camera, an image file and a depth file. camera_name_format is a format template that gives the name of the camera corresponding to view (x,y) (more below).
2D parameters must contain both x_index_range and y_index_range. 1D parameters must contain only x_index_range.
A parameters file cannot contain only y_index_range, because otherwise all the tools would need to distinguish between X-only and Y-only datasets.
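The rule above can be checked directly from the keys that are present. The following is a minimal sketch, not the code in src/lib/pylib/dataset.py; the function name dataset_dimensions is hypothetical and only illustrates the rule.

```python
def dataset_dimensions(params):
    """Return 2 for a 2D dataset and 1 for a 1D dataset.

    Hypothetical helper: `params` is the parsed JSON object (a dict).
    """
    has_x = "x_index_range" in params
    has_y = "y_index_range" in params
    if has_x and has_y:
        return 2
    if has_x:
        return 1
    # Covers both "no range at all" and "only y_index_range".
    raise ValueError("parameters must contain x_index_range")


print(dataset_dimensions({"x_index_range": [1, 851]}))
print(dataset_dimensions({"x_index_range": [1, 851],
                          "y_index_range": [101, 201]}))
```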
An index range can also have a step. For example, specifying [1, 10, 4] (an array with 3 integers) for an index range corresponds to the index values 1, 5, 9. Here 2 would be an invalid index, and the maximum 10 is no longer included.
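To make the inclusive bounds and the optional step concrete, here is a small sketch that expands an index range into the list of valid indices. The function name expand_index_range is hypothetical, and the sketch assumes a positive step; it is not taken from the tools themselves.

```python
def expand_index_range(index_range):
    """Expand [first, last] or [first, last, step] into a list of indices.

    Hypothetical helper; `last` is inclusive, unlike Python's range().
    """
    if len(index_range) == 2:
        first, last = index_range
        step = 1
    elif len(index_range) == 3:
        first, last, step = index_range
    else:
        raise ValueError("index range must have 2 or 3 integers")
    # Add 1 to the end so that `last` itself is a valid index.
    return list(range(first, last + 1, step))


print(expand_index_range([1, 10, 4]))       # [1, 5, 9]
print(len(expand_index_range([1, 851])))    # 851
```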
Each view corresponds to a camera name, and an image and depth filename. The format strings are in Python syntax. For
2D datasets, there is an argument “x” and an argument “y”. For 1D datasets there is only an argument “x”.
The format syntax is described at https://pyformat.info/.
The C++ tools use fmt to parse the format string (not printf).
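As an illustration, the templates from the example files can be expanded with Python's built-in str.format. The index values below are chosen arbitrarily from the valid ranges.

```python
# Templates copied from the 2D example parameters file.
camera_name_format = "cam_{y:04d}{x:04d}"
image_filename_format = "files/image_{y:04d}{x:04d}.png"

# 2D dataset: both "x" and "y" are passed as named arguments.
camera_name = camera_name_format.format(x=1, y=101)
image_filename = image_filename_format.format(x=851, y=201)
print(camera_name)      # cam_01010001
print(image_filename)   # files/image_02010851.png

# 1D dataset: only "x" is passed (template from the 1D example).
camera_name_1d = "cam_0151{x:04d}".format(x=7)
print(camera_name_1d)   # cam_01510007
```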
“Filenames” are relative file paths. They are always relative to the location of the parameters file, unless the filename begins with “/”, in which case it is treated as an absolute path.
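The path rule can be sketched as follows. The function name resolve_path and the directory /data/band2 are hypothetical, posixpath is used so that the result does not depend on the platform, and the normpath call (which collapses ".." components) is an addition of this sketch, not something the tools are known to do.

```python
import posixpath


def resolve_path(parameters_file, filename):
    """Resolve `filename` relative to the directory of the parameters file.

    posixpath.join discards the base directory when `filename` starts
    with "/", which gives the absolute-path behaviour described above.
    """
    base_dir = posixpath.dirname(parameters_file)
    return posixpath.normpath(posixpath.join(base_dir, filename))


params = "/data/band2/parameters.json"  # hypothetical location
print(resolve_path(params, "files/image_01010001.png"))
print(resolve_path(params, "../raw/depth.png"))
print(resolve_path(params, "/tmp/depth.png"))
```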
kinect_raw and vsrs are groups. They contain additional files belonging to the view. kinect_raw contains the raw, unprocessed files from the Kinect camera. vsrs contains the .yuv files for VSRS.
Groups can optionally have a different numbering. filename_x_index_factor (“factor”) and filename_x_index_offset (“offset”) are both optional. If they are not present, factor defaults to 1 and offset defaults to 0. The same applies to Y, with filename_y_index_factor and filename_y_index_offset.
The factor can be a real number. The offset is always an integer. Both can be negative.
The filename format templates in a group (e.g. vsrs/image_filename_format) are formatted with the local indices.
The mapping is:
local_x = floor(x * factor) + offset
The same mapping applies to Y. In the 2D example, where kinect_raw has a Y factor of -1 and a Y offset of 684, y = 101, 102, 103 becomes in kinect_raw: local_y = 583, 582, 581.
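The mapping can be written out as a short sketch. The function name local_index is hypothetical; the factor, offset and template values are copied from the kinect_raw group of the 2D example. With a Y factor of -1 and a Y offset of 684, y = 151 maps to 533, which matches the 533 that appears in the kinect_raw paths of the 1D example.

```python
import math


def local_index(index, factor=1, offset=0):
    """Map a dataset index to a group's local index (hypothetical helper).

    The defaults mirror the documented ones: factor 1 and offset 0.
    """
    return math.floor(index * factor) + offset


# Values from the kinect_raw group of the 2D example.
y_factor, y_offset = -1, 684
x_factor, x_offset = 1, 0
template = ("../raw/483-583_step1mm_acq2/texture/{y}.0/"
            "Kinect_out_texture_000_{y}.0z_0001_{x:04d}.png")

local_y = local_index(151, y_factor, y_offset)
local_x = local_index(1, x_factor, x_offset)
raw_path = template.format(x=local_x, y=local_y)
print(local_y)    # 533
print(raw_path)
```

A non-integer factor is floored before the offset is added, so local_index(3, 0.5, 10) gives 11.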
The default image_filename_format, etc. values (not in a group) are said to be in the root group or default group.
There are tools for manipulating dataset parameter files in dataset/.
dataset/slice makes a 1D slice of a 2D parameters file by fixing the Y index to a certain value. It produces a 1D dataset parameters file. The example 1D parameters file was generated this way, by fixing Y to 151.
A slice that fixes the X index cannot be made directly, because a parameters file cannot have only a Y index. Instead, dataset/flip can be used to swap the meanings of the X and Y indices, and dataset/slice is then applied to the new, flipped parameters file.
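To show what fixing the Y index means for a format template, the sketch below substitutes only the y field and leaves the x field as a template. It is not the implementation of dataset/slice. The function name fix_y_index is hypothetical, and it uses string.Formatter from the Python standard library to walk the template.

```python
from string import Formatter


def fix_y_index(template, y):
    """Replace every {y...} field by its formatted value, keep other fields.

    Hypothetical helper for illustration only.
    """
    parts = []
    for literal, field, spec, conversion in Formatter().parse(template):
        # Re-escape literal braces so the result is still a valid template.
        parts.append(literal.replace("{", "{{").replace("}", "}}"))
        if field is None:
            continue
        if field == "y":
            parts.append(format(y, spec))
        else:
            conv = "!" + conversion if conversion else ""
            fmt = ":" + spec if spec else ""
            parts.append("{" + field + conv + fmt + "}")
    return "".join(parts)


print(fix_y_index("cam_{y:04d}{x:04d}", 151))              # cam_0151{x:04d}
print(fix_y_index("files/image_{y:04d}{x:04d}.png", 151))  # files/image_0151{x:04d}.png
```

For a template inside a group, the local Y index would be passed instead of the dataset index, which is consistent with the 533 in the kinect_raw paths of the 1D example.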