In this post I record
the multi-scale implementation of the PatchMatch Stereo algorithm.
The aim is to create a
pyramid to process the algorithm in different scale so that speed and accuracy
may be improved.
·
Start from the smallest
scale and for each scale:
o Downsampling the original size image to
current scale
o Initialize a new object of the CostFunction
class (including image border enlarge and grad image calculation)
o For
each pixel in the current scale image:
§ Spatial
propagation: calculate the cost for the neighbors and use the plane that gives
minimal cost
§ View propagation: replace current plane if the
plane of the corresponding pixel in the other view gives smaller cost
§ Plane refinement
§ All above procedures are processed in order (Image A: top-left to bottom right; Image B: top-left to bottom right; Image A: bottom right to top-left; Image B: bottom right to top-left)
§ All above procedures are processed in order (Image A: top-left to bottom right; Image B: top-left to bottom right; Image A: bottom right to top-left; Image B: bottom right to top-left)
·
Post-processing
Indexing the Patch Image after downsample:
For each patch in a patch image, it is indexed based on its (x, y) coordinate and width of the image.
Patch_Img_A[y * _imgSize.width + x]
However, if we want to retrieve the corresponding patch in coarser image, we need to do:
int yy = y / 2, xx = x / 2; Patch_Img_Pre_A[yy * currLImage.cols / 2 + xx];
Negative disparity value:
Although the plane parameters can result in negative disparity, I assume all the disparity are non-negative in my program. If a too large (> maxDisparity) or too small (< 0) disparity is calculated, I replace it with maxDisparity or absolute value respectively. It does not make much sense in terms of the disparity is the pixel difference but it keeps the program simpler.
Time measurement with OpenCV:
double t = (double)getTickCount(); // do something ... t = ((double)getTickCount() - t)/getTickFrequency(); cout << "Times passed in seconds: " << t << endl;
Patch Image from small scale to large scale:
A finer scale patch image is produced by coping corresponding plane parameters (i.e. a, b and c) in coarser scale. Better interpolation strategy should be explored (e.g. bilinear/ bi-cubic)
Fast converging:
It may not be necessary to iterate at a patch that is nearly converged. One idea is detect converged patches at coarser scale which is fast and not update these patches in finer scale.
Runtime (in second):
Input image dimension: 360x288
Layers: 3
Windows: 35x35 (finest layer)
1st (coarsest): 0.364 - 0.350 - 0.363 - 0.350
2nd: 5.744 - 5.500 - 5.743 - 5.503
3rd (finest): 108.073 - 103.807 - 109.202 - 103.938
Reference:
No comments:
Post a Comment