Reconstruction of Three-Dimensional Object from Two-Dimensional Images by Utilizing Distance Regularized Level Algorithm and Mesh Object Generation

Three-dimensional (3D) reconstruction from images is a valuable, photo-realistic method of object regeneration that can be used in many fields. In industry, it can be used to visualize cracks within alloys or walls. In medicine, it has been used as a 3D scanner to reconstruct human organs, such as the internal nose for plastic surgery or the ear canal for fabricating a hearing aid. These applications need highly accurate details and measurements, which is the main issue to be considered; the other issues are cost, portability, and ease of use. This work presents an approach for designing and constructing a low-cost 3D object scanner. We propose a 3D canal reconstruction system (ear or nose) that reconstructs a 3D object from 2D images. A low-cost EndoScope is combined with a proposed program that uses the “Distance Regularized Level” segmentation algorithm to segment active edges from the images and then generates a mesh object, in order to build the 3D structure of small canals or cracks. The results show good accuracy of the reconstructed object in both its details and its measurements, which reflects the success of the reconstruction algorithm in yielding good 3D mesh objects.


Introduction:
3D reconstruction is a part of 3D visualization, which plays a vital role in many applications, such as industrial and medical applications that depend on tools (hardware and software) providing human-machine interaction. The ability to build 3D models from 2D images has a wide range of applications. The first practical approach to reconstructing a three-dimensional model from an image was presented by A. Rockwood and J. Winget in 1997 and was based on the Shape from Shading technique. In their work, they proposed an approach for the automated construction of complex 3D objects and sophisticated irradiance functions with no need for feature recognition. This approach is tolerant to noise and performs as an 'enhancing filter' for large noise. The input can be shaded sketches of created objects or digitized images; hence, this method can be applied both to construction problems, for example in styling, and to reconstruction problems found in reverse engineering (1).
Regarding the reconstruction of a three-dimensional scene, there are generally two methods: passive and active. The first is the image-based approach and the second is laser scanning. Active imaging (laser scanning techniques) is generally used for three-dimensional scanning and for viewing through scattering media; it is performed by controlling the light source in the scene. Active control of the light source is well suited to calculating correspondences in the context of triangulation, as well as to instant evaluation of depth via Time-of-Flight (ToF) measurements. An effective method of active imaging is to focus all the power of the active light source into a point and scan point by point. This is the operating principle of LIDAR, which is effective even in strong sunlight. Active imaging is accurate and robust; however, it is also expensive and places specific restrictions on the surface properties and the size of the objects. Besides that, it may be unable to capture the color details of the objects. Passive imaging (the image-based technique) builds on the research carried out in photogrammetry and computer vision over the past 30 years. It uses an ambient light source to image a scene. The most common design of a passive imager, or camera, is a combination of a lens and a 2D sensor, such as that found in an SLR or a cellphone camera. Although it is less accurate, it reconstructs a three-dimensional model from a minimum of two images, or from a series of images, which can easily be acquired by optical imaging units (such as a CCD camera). It is therefore generally low cost, which can be highly significant for many applications in simulation, virtual reality, and entertainment. With the development of electronic technology in recent years, digital cameras have become more and more common and produce high-quality images.
This improvement in imaging devices and their high-quality images, in addition to their low cost, has attracted researchers' attention, especially in the field of 3D scanning and reconstruction (2,3). This study aims to investigate the ability to construct a three-dimensional structure from two-dimensional images by using the "Regularized Level Segmentation" algorithm with 3D triangle mesh generation. The proposed system should allow users to perform the reconstruction without other expensive and complex devices, using only a low-cost EndoScope and a computer. The user acquires images by freely moving the EndoScope with real-time viewing and then captures the desired images; the proposed software then performs the 3D reconstruction in several steps with few human operations.

Previous Works
Das et al. (2015) (4) designed and constructed a fringe projection ear scope to perform a three-dimensional morphology scan of the tympanic membrane. The device is able to render a high-resolution depth map of the tympanic membrane, which is useful for detecting fluid in the middle ear. Three-dimensional scanning of the tympanic membrane was achieved by using a normal otoscope along with an HD webcam, a telecentric optical system and light, and a portable projector. The high resolution of this system makes it practical to identify abnormalities in the shape of the eardrum from the generated 3D profile.
Liang and Hongliang (2017) (5) proposed an approach to reconstruct a three-dimensional model of the oral surgery scene from 2D endoscope images. They used ORB-SLAM (simultaneous localization and mapping) with a low-cost endoscope to estimate the location of the endoscope and create a 3D map of the oral surgery scene. Laser patterns were used to produce more feature points and reduce data scarcity. In addition, the parameters were tuned to obtain more feature points, and the filtering standard was strengthened to reject mismatches.
Le et al. (2017) (6) demonstrated and analyzed an endoscopic visualization system that uses a plenoptic approach to reconstruct three-dimensional details. The proposed setup combines a plenoptic camera with a clinical surgical endoscope to achieve a depth accuracy error of around 1 mm and a precision error of around 2 mm, within a 25 mm × 25 mm field of view, operating at 11 FPS (frames per second).

Theoretical Background
Three-dimensional scanners have been used to measure and scan small cracks or canals, such as wall cracks and the ear or nose canal. However, reconstructing these small structures is a difficult task because of their relatively small size and complex curvature, which call for a very small and adjustable instrument that can observe the wall and determine the Time of Flight (ToF) from it in order to reconstruct it. There are a few approaches, including laser borescopy, ultrasonic waves, and feature matching-based three-dimensional reconstruction with hybrid CT and endoscopic images; however, these methods are very costly and are available from only a limited number of companies. This work proposes a three-dimensional reconstruction based on an EndoScope. The image segmentation algorithm known as Distance Regularized Level Set Evolution (7) is used to segment the internal surface of the ear or nose canal, which is then reconstructed.
The level set method is a numerical technique used for modeling surfaces or active curves. Recently, this method has been used in machine vision for segmentation problems, specifically for three-dimensional segmentation. Compared with traditional deformable models, the level set method does not depend on a parameterization of the front, only on its velocity. This makes it a very flexible and attractive tool for image segmentation and shape modeling (8).
The level set evolution equation is derived from the curve evolution equation. The early active contour models were formulated in terms of a dynamic parametric contour, which can be expressed as (9):
$$C(s,t) : [0,1] \times [0,\infty) \to \mathbb{R}^2 \tag{1}$$
where $s \in [0,1]$ is the spatial parameter that parameterizes the contour points and $t \in [0,\infty)$ is the temporal variable. The curve evolution is then defined as follows:
$$\frac{\partial C(s,t)}{\partial t} = F\,\mathcal{N} \tag{2}$$
where $F$ is the speed function that controls the contour motion and $\mathcal{N}$ is the inward normal vector of the curve.
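As an illustration of the curve evolution in equation (2), the following minimal Python sketch moves a sampled circular contour along its inward normal with a constant speed. The function name `evolve_circle`, the explicit Euler time stepping, and the constant speed are assumptions chosen for illustration; they are not part of the proposed system. For a circle, the expected behaviour is that the radius shrinks by `speed * dt` per step.

```python
import math

def evolve_circle(radius, speed, dt, steps, n_points=64):
    """Evolve a circular contour with dC/dt = F * N (N = inward unit normal).

    Hypothetical helper: constant speed F and explicit Euler steps are
    simplifying assumptions made for this illustration only.
    """
    # Sample the initial contour C(s, 0) for s in [0, 1).
    points = [
        (radius * math.cos(2 * math.pi * k / n_points),
         radius * math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]
    for _ in range(steps):
        moved = []
        for x, y in points:
            r = math.hypot(x, y)
            # For a circle centred at the origin, the inward normal is -(x, y) / r.
            nx, ny = -x / r, -y / r
            moved.append((x + dt * speed * nx, y + dt * speed * ny))
        points = moved
    return points

contour = evolve_circle(radius=10.0, speed=1.0, dt=0.1, steps=20)
mean_radius = sum(math.hypot(x, y) for x, y in contour) / len(contour)
print(mean_radius)  # close to 10 - 20 * 0.1 = 8
```

A parametric contour like this one has to be resampled when it stretches and cannot easily split or merge, which is the practical motivation for embedding the contour in a level set function instead.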
As described in (9), the curve evolution in (2), expressed in terms of a parameterized contour, can be transformed into a level set formulation by embedding the dynamic contour $C(s,t)$ as the zero-level set of a time-dependent LSF $\phi(x,y,t)$. The embedding LSF is assumed to take negative values inside the zero-level contour and positive values outside it. The inward normal vector is then given by the following formula (9):
$$\mathcal{N} = -\frac{\nabla\phi}{|\nabla\phi|} \tag{3}$$
where $\nabla$ is the gradient operator. By substituting (3) into (2), the curve evolution in equation (2) is transformed into the partial differential equation (PDE):
$$\frac{\partial\phi}{\partial t} = F\,|\nabla\phi| \tag{4}$$
which is referred to as the level set evolution equation.
Distance Regularized Level Set Evolution (DRLSE) is an advancement of the level set approach. It was developed mainly because the level set function develops irregularities during evolution, which cause numerical errors and destroy the stability of the level set evolution. To overcome these challenges, the distance regularization concept is applied, leading to the DRLSE method (10):
$$\frac{\partial\phi}{\partial t} = \mu\,\operatorname{div}\!\left(d_p(|\nabla\phi|)\,\nabla\phi\right) + F\,|\nabla\phi| + A\cdot\nabla\phi \tag{5}$$
where $\phi$ is the LSF that would otherwise need to be reinitialized. With the distance regularization concept, the numerical scheme is stable with no need for reinitialization. Distance regularized level set evolution can be applied to images, using edge-based or region-based image information to define the external energy. Li et al. (10) introduced the Distance Regularized Level Set Evolution algorithm and its application to an active contour model using edge-based information. The algorithm first filters the image with a Gaussian kernel in order to smooth it and reduce the noise.
Energy minimization is a concept from physics, based on the fact that all matter in the universe tends to move from a high energy level toward the minimum energy level. Matter at its minimum energy level is therefore "stable" (i.e., if the energy of the extracted contour is at its minimum, then the contour represents the object border as well as possible). In the proposed method, the LSF is denoted by ϕ; it is a matrix with exactly the dimensions of the given image, and each cell holds a real number between the limits $-c_0$ and $+c_0$. Initially, ϕ is set to the negative value inside the initial region of the object and to the positive value elsewhere. After the last iteration, the ϕ matrix is examined and the cells whose value is 0 are taken as the object border. The ϕ matrix is therefore a very significant variable. The energy function to be minimized is a function of the ϕ matrix, as shown in the following equation (10,11):
$$\mathcal{E}(\phi) = \mu\,\mathcal{R}_p(\phi) + \lambda\,\mathcal{L}_g(\phi) + \alpha\,\mathcal{A}_g(\phi) \tag{6}$$
where $\mathcal{E}(\phi)$ is the energy functional, which depends on the data of interest; μ, λ and α are the coefficients of the functionals $\mathcal{R}_p(\phi)$, $\mathcal{L}_g(\phi)$ and $\mathcal{A}_g(\phi)$; and $\mathcal{R}_p(\phi)$ is the level set regularization term. The method thus builds the energy function as a sum of three parts: the distance regularization term $\mathcal{R}_p(\phi)$ with weight μ, the length term $\mathcal{L}_g(\phi)$ with weight λ, and the area term $\mathcal{A}_g(\phi)$ with weight α. The weights μ and λ are greater than 0, whereas α may be positive or negative. The three weights control the effect of the corresponding terms on the energy function. For example, if α<0, the object area must be as large as possible in order to reduce the function, so the initial object border grows step by step. Conversely, if α is greater than 0, the initial area shrinks step by step. Since the optimization seeks the ϕ that minimizes the energy function, α must be set below 0 to obtain a larger object, and vice versa.
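However, μ and λ need to be greater than 0, since the aim is to keep the object length (object circumference) as short as possible and the regularization term as small as possible. The level set regularization term $\mathcal{R}_p(\phi)$ is defined by the following equation (10,11):
$$\mathcal{R}_p(\phi) \triangleq \int_\Omega p(|\nabla\phi|)\,d\mathbf{x} \tag{7}$$
where $p$ is the potential energy function. The initial level set function (LSF) for a sample edge segmented from the image is shown in Fig. 1. The length term is defined by (10,11):
$$\mathcal{L}_g(\phi) \triangleq \int_\Omega g\,\delta(\phi)\,|\nabla\phi|\,d\mathbf{x} \tag{8}$$
where $\delta$ is the Dirac delta function and $g$ is the edge indicator function; this term is at its minimum when the zero-level contour of the level set function lies on the object boundary. $\mathcal{A}_g(\phi)$ represents the weighted area of the region $\{\mathbf{x} : \phi(\mathbf{x}) < 0\}$ and is introduced to speed up the motion of the zero-level contour during the level set evolution, which is significant if the initial contour is placed far from the desired object boundaries. It is defined by the following equation (10,11):
$$\mathcal{A}_g(\phi) \triangleq \int_\Omega g\,H(-\phi)\,d\mathbf{x} \tag{9}$$
where $H$ is the Heaviside function.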
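The Heaviside function in the functional $\mathcal{A}_g(\phi)$ and the Dirac delta function in the functional $\mathcal{L}_g(\phi)$ are approximated by the smooth functions $H_\varepsilon$ and $\delta_\varepsilon$ in many level set methods, which are expressed by the following formulas (11):
$$\delta_\varepsilon(x) = \begin{cases} \dfrac{1}{2\varepsilon}\left[1+\cos\left(\dfrac{\pi x}{\varepsilon}\right)\right], & |x| \le \varepsilon \\ 0, & |x| > \varepsilon \end{cases} \tag{10}$$
$$H_\varepsilon(x) = \begin{cases} \dfrac{1}{2}\left(1 + \dfrac{x}{\varepsilon} + \dfrac{1}{\pi}\sin\left(\dfrac{\pi x}{\varepsilon}\right)\right), & |x| \le \varepsilon \\ 1, & x > \varepsilon \\ 0, & x < -\varepsilon \end{cases} \tag{11}$$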
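The smoothed functions can be sketched in a few lines of Python. The raised-cosine form used here is the one commonly given in the DRLSE literature and is assumed, not confirmed, to be the form used by the proposed program; the function names `dirac_eps` and `heaviside_eps` are hypothetical, and the default width of 1.5 is taken from the algorithm settings reported in this paper.

```python
import math

def dirac_eps(x, eps=1.5):
    """Smoothed Dirac delta: a raised cosine supported on [-eps, eps]."""
    if abs(x) > eps:
        return 0.0
    return (1.0 / (2.0 * eps)) * (1.0 + math.cos(math.pi * x / eps))

def heaviside_eps(x, eps=1.5):
    """Smoothed Heaviside step whose derivative is dirac_eps."""
    if x > eps:
        return 1.0
    if x < -eps:
        return 0.0
    return 0.5 * (1.0 + x / eps + math.sin(math.pi * x / eps) / math.pi)

# Sample both functions across the transition band.
for value in (-2.0, -0.75, 0.0, 0.75, 2.0):
    print(value, dirac_eps(value), heaviside_eps(value))
```

The width parameter controls how many cells around the zero-level contour contribute to the length term: a wider band smooths the evolution, while a narrower band localizes it more tightly on the contour.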

Materials and Methods: Experimental Hardware
The proposed system consists of two main parts: an EndoScope and a computer. The EndoScope is a low-cost 1600 × 1200 1080P 6-LED USB endoscope (Fig. 2). The specifications of the computer used in the experiment are as follows:
• CPU: 2.2 GHz Core 2 Duo

Reconstruction Algorithm and Software
The proposed system first uses the "Distance Regularized Level Set Evolution" segmentation algorithm to detect the edge of the canal or crack and then generates a mesh. In this work, we used an image captured by the EndoScope and resized to 256 × 256 pixels. The Distance Regularized Level algorithm first detects the edge, and its parameter values were modified so that it can be used for ear canal reconstruction.
The process follows this procedure. First, we captured an ear canal image with the EndoScope and saved it in JPEG format (a). Next, the contour of the LSF (b) was initialized. The LSF then evolves, moving the zero-level set toward the required object boundary. The proposed algorithm for region reconstruction is as follows:

Input: Image or video from Endoscope
Output: 3D reconstructed object (STL format)
1. Get image from Endoscope.
2. Read the input image of size M×N.
3. Define the initial level set function. Set the distance regularization coefficient to 0.2, the weighted length coefficient to 5, the weighted area coefficient to 3, and the parameter that specifies the width of the Dirac delta function to 1.5.
4. Apply the Fast Fourier Transform to the convolution of the Gaussian with the image, setting the scale parameter of the Gaussian kernel to 0.8.
5. Initialize the Level Set Function (LSF) as a binary step function.
6. Generate the initial region R0 as a rectangle.
7. Display the initial level set function and the initial zero-level contour.
8. Start the level set evolution and refine the zero-level contour via further level set evolution with α = 0.
9. Display the final zero-level contour and the final level set function.
10. Compute the distance regularization term $\mathcal{R}_p(\phi)$.
11. Compute the Dirac delta function (from equation 10) and the Heaviside function (from equation 11).
12. Compute the energy functional.
13. If the zero-crossing points stop varying over consecutive iterations, or the maximum number of iterations is exceeded, stop the iteration; otherwise, return to Step 3.
14. Calculate and compare the DSM and the execution time of DRLSE with FFT and DRLSE without FFT.
15. Generate the 3D mesh for the reconstructed region.
16. Convert the mesh to a 3D object in STL file format.
17. End.
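Steps 5 and 6 of the algorithm, and the final read-out of the object border, can be sketched as follows. This is a minimal illustration rather than the MATLAB implementation used in this work: the helper names `initial_lsf` and `border_cells`, the value c0 = 2.0, and the 4-neighbour sign-change test for locating the zero-level contour are all assumptions.

```python
def initial_lsf(rows, cols, top, bottom, left, right, c0=2.0):
    """Binary step LSF: -c0 inside the rectangle R0, +c0 everywhere else.

    The rectangle covers rows top..bottom-1 and columns left..right-1.
    """
    phi = [[c0] * cols for _ in range(rows)]
    for r in range(top, bottom):
        for c in range(left, right):
            phi[r][c] = -c0
    return phi

def border_cells(phi):
    """Cells with phi < 0 that touch a cell with phi >= 0 (4-neighbourhood).

    Assumed stand-in for extracting the zero-level contour from the grid.
    """
    rows, cols = len(phi), len(phi[0])
    found = []
    for r in range(rows):
        for c in range(cols):
            if phi[r][c] >= 0:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if any(0 <= i < rows and 0 <= j < cols and phi[i][j] >= 0
                   for i, j in neighbours):
                found.append((r, c))
    return found

phi0 = initial_lsf(rows=8, cols=8, top=2, bottom=5, left=2, right=5)
print(border_cells(phi0))
```

Because the sign convention places negative values inside the contour, the same `border_cells` check can be applied unchanged to the evolved ϕ after the last iteration.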
The 3D reconstruction program was developed in MATLAB (version 2018a); the graphical user interface of the program is shown in Fig. 3.

Figure 3. Program GUI
As can be seen from Figure 3, at startup the program asks the user to input the desired EndoScope resolution; the user then presses the "take photo" button to capture the desired hole.

Results and Discussion:
As described in section 3.2, the Endoscope is used to capture an image of the crack or canal. The proposed program then detects the region, segments it from the image, and generates a 3D mesh based on the potential value. The generated 3D mesh is then converted to a 3D STL object.
The approach has eight parameters: mu (μ), lambda (λ), alfa (α), epsilon (ε), c0, maxiter, sigma and timestep. The most significant are mu (μ), lambda (λ) and alfa (α), which are the coefficients of the three terms of the energy function: the regularization term, the length term and the area term, respectively. These parameters were set by the inventor to mu=0.2, lambda=5 and alfa=-3. They indicate how significant each term is in the energy function. Reducing the energy function requires reducing the sum of these terms; in fact, all three terms must be reduced together, which makes this a multi-objective minimization. The lambda term tries to fit the object contour to the edge areas in the given image as closely as possible. If lambda is raised, the contour follows more edge regions than before, which makes the object border rougher; the border moves away from a smooth object, which can cause several errors. On the other hand, if lambda is reduced, the object border becomes smoother, but it starts to fit non-edge areas that may not be the real border. The alfa coefficient was chosen to be negative because the initial area has to grow: initially we select a tiny part of the region of interest, so the contour must expand and the area must become larger than in the previous iteration. That is why alfa is negative. The magnitude of the negative value depends on how significant the area term is for our goals. A larger negative value yields a larger segmentation result, whereas a smaller negative value yields a smaller object border. Whether the value should be small or large therefore depends on the image.
The mu parameter sets the importance of the regularization term. In our approach, the resulting image lies in a range of, for instance, -2 to +2. If mu is reduced, the result leaves this range and becomes unstable; if it is raised, the algorithm tends to keep the result in range and does not pay sufficient attention to the border. The epsilon (ε) and c0 parameters are related to each other: c0 represents the level values used in the level set function, while epsilon is the parameter of the Heaviside and Dirac functions. Of the other parameters, maxiter is the number of iterations the algorithm runs before it ends; since the proposed method finds the result step by step, the number of iterations must be specified. Timestep is the step length of the gradient descent algorithm.
In our test, we used an endoscope to scan samples such as cracks and a real human ear canal to check the accuracy of the proposed method in reconstructing these structures; the reconstructions were then compared with the original measurements and structure. The reconstruction results for the hole and the ear canal are shown in Figs. 4 and 5, respectively. As shown in Figs. 4, 5, 6 and 7, the operation begins by streaming video from the Endoscope or camera (the video resolution depends on the camera or Endoscope specification), which is previewed in the viewport field of the GUI. The user finds the best image by moving the Endoscope manually until the desired picture of the crack or canal is obtained and then captures it by pressing the "capture" button. Next, the captured image is passed to the program for processing. The program first invokes the edge segmentation algorithm, and the user inputs the contour dimensions such that the required hole edge lies within them.
The algorithm then starts computing the edge, adjusting the contour dimensions iteratively until the contour surrounds the desired structure (this depends on the number of iterations set in the algorithm). Fig. 8 illustrates the edge detection and mesh generation process.

Reconstruction process for (a) the nose canal and (b) the ear canal
After that, the program starts to generate the mesh from the detected edge, reconstructing it based on the potential value and the selected Gaussian method, and continues until the desired depth is reached. The desired depth of the crack or canal is determined either practically, using a range finder (laser or ultrasonic), or theoretically, based on the light intensity of the pixels, and is input to the program parameters. The generated 3D mesh, shown in Fig. 9, is then converted to a 3D object by converting the mesh to triangles, and the reconstructed object is saved in STL format, which is the final result of the reconstruction process. The reconstructed object was compared with a silicone impression for the ear and gypsum powder for the hole, and the results show that the reconstructed object is relatively close to the real shape and dimensions.
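The final conversion from a detected contour to a triangle mesh in STL format can be illustrated with the following minimal Python sketch. It makes a strong simplifying assumption that is not the method used in this work: the closed contour is swept straight along the depth axis with a constant cross-section, whereas the proposed program shapes the mesh from the potential value. The helper names `extrude_contour`, `facet_normal` and `to_ascii_stl` are hypothetical; only the ASCII STL keywords (`solid`, `facet normal`, `outer loop`, `vertex`, `endloop`, `endfacet`, `endsolid`) are standard.

```python
import math

def extrude_contour(contour, depth, layers):
    """Sweep a closed 2D contour along z and return the wall as triangles."""
    n = len(contour)
    triangles = []
    for k in range(layers):
        z0 = depth * k / layers
        z1 = depth * (k + 1) / layers
        for i in range(n):
            x0, y0 = contour[i]
            x1, y1 = contour[(i + 1) % n]
            a, b = (x0, y0, z0), (x1, y1, z0)
            c, d = (x1, y1, z1), (x0, y0, z1)
            # Split each wall quad into two triangles.
            triangles.append((a, b, c))
            triangles.append((a, c, d))
    return triangles

def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c) from the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz) or 1.0
    return nx / length, ny / length, nz / length

def to_ascii_stl(triangles, name="canal"):
    """Serialize triangles as ASCII STL text."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        nx, ny, nz = facet_normal(a, b, c)
        lines.append(f"  facet normal {nx:.6e} {ny:.6e} {nz:.6e}")
        lines.append("    outer loop")
        for x, y, z in (a, b, c):
            lines.append(f"      vertex {x:.6e} {y:.6e} {z:.6e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"

# Toy contour: a unit square standing in for a segmented canal edge.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
mesh = extrude_contour(square, depth=2.0, layers=3)
stl_text = to_ascii_stl(mesh)
print(stl_text.splitlines()[0], len(mesh))
```

With a counter-clockwise contour the facet normals point away from the canal axis; reversing the contour order flips them, which matters if the printed part is the canal wall rather than a plug of its interior.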

Conclusion:
This study investigates the ability to reconstruct a 3D object from 2D images in real time by using a low-cost Endoscope with a proposed reconstruction program based on an algorithm called "Distance Regularized Level Set Evolution". For this purpose, we tested the proposed method on reconstructing the 3D structure of several cases, such as a small hole in a wall, a pipe canal, a human nose canal and a human ear canal, from 2D images taken by an EndoScope or camera. The proposed system has eight parameters: mu (μ), lambda (λ), alfa (α), epsilon (ε), c0, maxiter, sigma and timestep. To detect the edge, the initial contour dimensions are input as the selected region, and the method then starts refining the edge. The results show that the proposed system succeeds in detecting the exact edge region for several of the cases used. They also show good accuracy in the details of the reconstructed structures and close dimensional measurements, which reflects an accurate reconstruction strategy and algorithm that produces good 3D meshes for the scanned structures. The experimental results highlight some points that should be taken into consideration. First, to get a good segmentation result for the region, the image should be grayscale and resized to a suitable size; a good image size for small holes and cracks is in the range of 100 to 250 pixels. Second, the initial region should be near the hole edge in order to detect the edge efficiently and separate it from the image background. However, these values should be determined experimentally depending on the image and structure type. Compared with traditional methods, this method has several advantages. Unlike a low-cost laser scanner, it can scan narrow regions such as small holes, the ear canal and the nose canal, which cannot be reconstructed with a traditional 3D scanner that is larger in size and can only scan outer surfaces or large inner areas.
Compared with specialist 3D scanners used for reconstructing small inner surfaces, which rely on very expensive laser systems, the proposed system can reconstruct the inner surface with good accuracy at very low cost. This system can therefore be used in practice to design a low-cost 3D scanner for many applications, especially those in which a small area needs to be scanned, such as the internal ear canal for fabricating hearing aids directly with 3D printers. The accuracy of the system can be improved by detecting multiple edges over the whole scanned area and then reconnecting the mesh to obtain more accurate dimensions.