Instant Edit Propagation on Images Based on Bilateral Grid

Date: 2024-07-28

Feng Li, Chaofeng Ou, Yan Gui, and Lingyun Xiang

Abstract: The ability to edit digital content quickly and intuitively has become increasingly important in everyday life. However, existing edit propagation methods for digital images are typically based on optimization, which incurs a high computational cost for large inputs and makes them inefficient and time-consuming. To improve editing efficiency, this paper proposes a novel edit propagation method based on a bilateral grid that achieves instant propagation of sparse image edits. First, given an input image with user interactions, we resample each of its pixels into a regularly sampled bilateral grid, which provides an efficient mapping from the image to the bilateral space. As a result, all pixels with similar feature information (color, coordinates) are clustered into the same grid cell, which reduces both the amount of image data to be processed and the cost of computation. We then reformulate the propagation as a function interpolation problem in the bilateral space, which is solved very efficiently using radial basis functions. Experimental results show that our method improves the efficiency of color editing, runs faster than existing edit propagation approaches, and produces high-quality edited images.

Keywords: Instant edit propagation, bilateral grid, radial basis function, image editing.

1 Introduction

With the development of information technology, digital media technology [Shen, Shen, Liu et al. (2018); Xiang, Shen, Qin et al. (2018); Shen, Shen, Sun et al. (2018)] has become widely used, and people accordingly have higher requirements for visual media quality. Thanks to powerful image editing tools, people without any knowledge of image processing can easily edit a photo in a visually plausible way. The editing of images and videos has become a research hotspot, and color processing technology is an important part of this field. Color processing technology mainly includes color transfer and color editing propagation [He, Gui and Li (2017)].

Color transfer technology maps color information from a reference image to a target image, thereby changing the visual effect of the target image [Faridul, Pouli, Chamaret et al. (2016); Irony, Cohen-Or and Lischinski (2005); Reinhard, Ashikhmin, Gooch et al. (2002); Welsh, Ashikhmin and Mueller (2002)]. However, it is difficult to choose an appropriate reference image for transferring color. To solve the problem of reference image selection [Xie, Qin, Xiang et al. (2018)], machine learning [Xiang, Zhao, Li et al. (2018)] can be applied to color transfer [Cheng, Yang and Sheng (2015); Iizuka, Simo-Serra and Ishikawa (2016)]. Using machine learning methods, an appropriate reference image can be learned by training on a prepared reference image set, which should contain a large number of different types of reference images. Machine-learning-based color transfer methods can reduce the level of required user interaction and the complexity of selecting reference images. However, when color processing is applied to complex natural images, the color consistency and spatial continuity of repeated scene elements cannot be maintained during the color transfer process.

By contrast, editing propagation preserves the color consistency and spatial continuity of repeated scene elements. Color editing propagation first requires the user to draw colored strokes on the image, and then uses an algorithm to propagate the color from the marked positions to the areas of unknown color. Compared with color transfer, this method has the advantage of enabling people to mark different parts of the image with different colors according to their requirements, meaning that the image will satisfy the color requirements to a greater extent. Grayscale image colorization and color image re-coloring are two important methods of editing propagation [He, Gui and Li (2017)].

The grayscale image colorization method based on user interaction is an important part of color editing propagation [Horiuchi and Hirano (2003); Horiuchi and Kotera (2006); Levin, Lischinski and Weiss (2004)]. This method requires numerous interactions, performs gray value similarity determination for each pixel in the image, and edits each pixel in the subsequent edit propagation; as a result, it occupies a large amount of memory and takes a long time.

Compared with the grayscale image colorization method, the color image re-coloring method requires only user-provided color scribbles on local image areas, meaning that the user operation is simpler [Xu, Li, Ju et al. (2009); Pellacini and Lawrence (2007); Huang, Zhang and Martin (2015)] and does not require extracting the local features of the image [Li, Qin, Xiang et al. (2018)]. However, the memory required grows with the image size; for large images, this memory footprint reduces computational efficiency and results in longer re-coloring times.

Existing color processing methods usually take tens or even hundreds of seconds to process a single image. To address this low time efficiency, we propose an instant edit propagation method based on a bilateral grid. Our method is divided into two phases: the image preprocessing phase and the editing propagation phase. In the image preprocessing phase, we resample the input image, together with the user interactions, using a bilateral grid. In the editing propagation phase, the color feature information of the grid vertices covered by the interactive marks is propagated to the other grid vertices according to an interpolation model. Experimental results show that our method can greatly improve the speed of color editing and yields high-quality editing results.

Section 2 describes related work. Section 3 presents the framework of our method. Section 4 presents the experimental results and comparisons with existing editing propagation methods. Finally, Section 5 presents the conclusion.

2 Related work

Our work is inspired by earlier work on color transfer, but mainly builds on edit propagation.

Color transfer. Color transfer involves mapping the color distribution of a reference image to the input image [Tai, Jia, Tang et al. (2005); Pitié, Kokaram and Dahyot (2007)]. Xiao et al. [Xiao, Wan, Leung et al. (2015)] proposed a color transfer method based on gradient mesh optimization that can effectively maintain the gradient mesh structure information of images in color transfer results. On this basis, Liu and Song [Liu and Song (2018)] proposed a multi-source image color transfer method based on edit propagation. This method uses an interactive, automatic correction step that takes the corrected interactive marks as control samples in the edit propagation, finds all regions in the image similar to the interactive marks, and combines color transfer and appearance transfer techniques, such that the corresponding color or texture information is transferred to the target image. Our work draws on the advantages of both approaches: we optimize the interpolation model to accurately propagate the interactive color information to the grid vertices.

Edit propagation. User interaction-based methods can help users to propagate editing operations from one image region to other regions with a similar appearance [An and Pellacini (2008); Bie, Huang and Wang (2011)]. Li et al. [Li, Ju and Hu (2010)] proposed a fast image editing method based on a radial basis function interpolation model, which can edit image regions that are not marked by the user but are similar to the sample points. To resolve the color mixing in Li’s method, Chen et al. [Chen, Chen and Zhao (2012)] employed locally linear embedding (LLE); however, when the neighborhood size K [Xiao, Wang and Liu (2018)] is large, the expected color conversion effect cannot be achieved. Xu et al. [Xu, Yan and Jia (2013)] proposed an image editing method based on a sparse control model, which can achieve better color editing results with a small amount of user interaction. This method resolves the data overflow present in the first two methods. Combining the technical characteristics of Li et al. [Li, Ju and Hu (2010)], Chen et al. [Chen, Chen and Zhao (2012)] and Xu et al. [Xu, Yan and Jia (2013)], we propose an instant image editing method based on the bilateral grid. The method requires only a small amount of interaction while ensuring both the quality and the immediacy of image editing.

Recently, Chen et al. [Chen, Li, Chen et al. (2016)] proposed learning sparse dictionaries for edit propagation. This method follows the principle of sparse representation to obtain a representative and compact dictionary and performs edit propagation on the dictionary instead; accordingly, it reduces memory consumption while still maintaining a high degree of visual fidelity.

3 The proposed method

3.1 Framework

In order to improve the speed of image color editing, we propose an interactive color editing method, a flowchart of which is shown in Fig. 1. The method includes two main stages: image preprocessing and color editing propagation.

(1) Image preprocessing. In this stage, the user marks local areas with colored strokes. However, when the grayscale channel of an interactively input color is very different from the grayscale channel of the original image, the color will exhibit a large deviation after coloring is complete. Accordingly, we adjust the grayscale channels to accommodate the grayscale channels of the interactive colors. Finally, the bilateral grid is used to resample the input image together with the user interactions.

(2) Color editing propagation. This stage consists of two parts: the construction of the interpolation model and the color editing propagation. When constructing the interpolation model, the feature vectors of the grid vertices of the original image are used as input, while the feature vectors of the grid vertices marked by user edits are used as interpolation targets to construct the radial basis interpolation model. During color editing propagation, the color values of the marked grid vertices are propagated to the target grid, after which the color value of each grid vertex is inversely mapped to the individual pixels belonging to that vertex. Finally, all pixels in the input image grid are inversely mapped to the RGB color space to obtain the output image after re-coloring.

Figure 1: Method flowchart: (a) the input image; (b) the bilateral grid after image preprocessing; (c) the bilateral grid after color editing; (d) the output image

The advantage of our method is that it replaces image pixels with grid vertices, which significantly reduces the amount of image data to be edited and can greatly improve the speed of color editing.

3.2 Image preprocessing

To improve time efficiency, we resample input images with interactions by using a bilateral grid [Chen, Paris and Durand (2007)]. The bilateral grid can be described as a five-dimensional array in which the first three dimensions represent the Lab value and the last two dimensions correspond to the image space coordinates. As a result, the input image data can be described in a high-dimensional feature space.

It should be noted that all experiments in this paper were performed in the Lab space. The related parameters $s_l$, $s_c$ and $s_s$ are, respectively, the sampling rates of the gray value axis, the ab channels, and the spatial coordinate axes $(x, y)$. Intuitively, the size of the bilateral grid is inversely proportional to the sampling rate of each dimension. By setting appropriate sampling rates, the number of bilateral grid vertices will be much smaller than the total number of pixels in the image, which can greatly reduce the amount of image data to be edited during color editing propagation.

Given an input image $I$ with user interaction and a bilateral grid of a specified size, each image pixel is mapped to the corresponding grid cell according to the following formula:

where $[\cdot]$ is the closest-integer operator for calculating the grid vertex coordinate corresponding to each pixel. The homogeneous coordinates $(I(x,y), 1)$ are used to accumulate the color values and the total number of pixels in each grid cell. In order to define the subsequent bilateral grid-based editing propagation, we also need to obtain the color values of each grid vertex $v_i$:

Thus, the input feature vector of each grid vertex $v_i$ is represented as $f_{v_i}$, while the interactive feature vector of each marked grid vertex $v_j$ is represented as $g_{v_j}$. Since the feature vector of a grid vertex does not contain gray values, both feature vectors are four-dimensional.

It should be noted that grid cells containing black marker pixels do not require calculation of the color values and coordinate values of their vertices; these grid cells are also ignored in the subsequent re-coloring of the grid vertices.
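The resampling step described in this section can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the pixel layout `(x, y, L, a, b)`, the sparse dictionary representation, and the function names `nearest`, `splat` and `vertex_features` are assumptions made for the example, and the four-dimensional feature vector is assumed to be `(a, b, x, y)`.

```python
import math
from collections import defaultdict


def nearest(v):
    """Closest-integer operator [.] (ties round up)."""
    return math.floor(v + 0.5)


def splat(pixels, s_l, s_c, s_s):
    """Accumulate pixels into a sparse bilateral grid.

    pixels: iterable of (x, y, L, a, b) tuples (assumed layout).
    Returns {cell: [sum_a, sum_b, sum_x, sum_y, count]}.
    """
    grid = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0, 0])
    for x, y, L, a, b in pixels:
        cell = (nearest(L / s_l), nearest(a / s_c), nearest(b / s_c),
                nearest(x / s_s), nearest(y / s_s))
        acc = grid[cell]
        # Homogeneous accumulation: running sums plus a pixel count.
        acc[0] += a
        acc[1] += b
        acc[2] += x
        acc[3] += y
        acc[4] += 1
    return dict(grid)


def vertex_features(grid):
    """Divide the sums by the count: one (a, b, x, y) vector per vertex."""
    return {cell: tuple(s / acc[4] for s in acc[:4])
            for cell, acc in grid.items()}


pixels = [
    (0, 0, 50.0, 10.0, 20.0),
    (1, 0, 52.0, 12.0, 22.0),   # lands in the same cell as the first pixel
    (40, 30, 80.0, -30.0, 5.0),
]
grid = splat(pixels, s_l=15, s_c=15, s_s=10)
features = vertex_features(grid)
print(len(grid), "occupied cells for", len(pixels), "pixels")
```

Only occupied cells are stored, so the number of vertices handed to the propagation step depends on how many distinct cells the image actually touches, not on the size of the dense five-dimensional array.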

3.3 Color editing propagation

In the bilateral space, we transform the editing propagation into a discrete point interpolation problem. We define the set of all marked grid vertices as $G$.

In this paper, we use the least squares method to define the energy function [Amiri-Simkooei, Mortazavi and Asgari (2015)]. Minimizing an energy function constructed in this way minimizes the editing difference between the re-colored grid vertices and the marked grid vertices. The energy function is as follows:

where $g_{v_i}$ is the marked feature vector corresponding to the grid vertex $v_i$, $f_{v_i}$ is the input feature vector of the grid vertex $v_i$, and $h(f_{v_i})$ is the edited feature vector of the grid vertex. This paper defines $h(f)$ as a radial basis function [Wang (2017)]:

where $a_{v_i}$ is the unknown weight coefficient of the control grid vertex $v_i$ in $h(f)$; how to solve for $a_{v_i}$ is explained in the optimization below. $f$ is the feature vector of the grid vertex to be edited, and the radial basis function is $\varphi(r) = \exp(-r^2)$.
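For reference, one form of the energy and the interpolant that is consistent with the definitions in this section can be written as below. This is a sketch under two assumptions that are not confirmed by the text: the energy is a plain sum of squared differences over the marked vertices in $G$, and one Gaussian kernel is centered at each marked vertex.

```latex
% Least-squares energy over the marked vertices G (assumed form)
J = \sum_{v_i \in G} \bigl\lVert g_{v_i} - h(f_{v_i}) \bigr\rVert^2

% Radial basis interpolant with one kernel per marked vertex (assumed form)
h(f) = \sum_{v_i \in G} a_{v_i}\, \varphi\bigl( \lVert f - f_{v_i} \rVert \bigr),
\qquad \varphi(r) = \exp(-r^2)
```

Under these assumptions, substituting $h$ into $J$ leaves the weights $a_{v_i}$ as the only unknowns, and $J$ is quadratic in them.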

Minimizing the energy function amounts to solving a system of linear equations. The classical method for solving such a least-squares problem is to use the normal equations [Huang and Chen (2002)].

Substituting Eq. (3) into Eq. (4) yields the complete energy function $J$:

In Eq. (5), the only unknown is the weight coefficient, which is obtained by rewriting the energy in matrix form:

where $X$ is the input matrix, $a$ is the coefficient matrix to be solved, $Y$ is the output matrix, and $G = [i_1, i_2, \ldots, i_i]$.

Taking the partial derivative of the above energy with respect to the coefficient matrix gives, by matrix differentiation:

Setting the derivative $\nabla_a J(a) = 0$ yields the normal equations $X^T X a = X^T Y$. Therefore, the minimizer is $a_{\min} = (X^T X)^{-1} X^T Y$.
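The solve via the normal equations can be sketched in plain Python as follows. The sketch assumes that $X_{ij} = \varphi(\lVert f_{v_i} - f_{v_j} \rVert)$ over the marked vertices, that the features are scaled to roughly unit range, and that the targets are two-channel chroma values. The names `fit_weights`, `evaluate` and `solve` are hypothetical, and in practice a linear algebra library would normally replace the hand-written solver.

```python
import math


def phi(r):
    """Gaussian radial basis function from the text: phi(r) = exp(-r^2)."""
    return math.exp(-r * r)


def dist(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n, m = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [[M[i][n + k] / M[i][i] for k in range(m)] for i in range(n)]


def fit_weights(centers, targets):
    """Least-squares weights from the normal equations X^T X a = X^T Y."""
    n, m = len(centers), len(targets[0])
    # Assumed design matrix: one Gaussian kernel per marked vertex.
    X = [[phi(dist(p, q)) for q in centers] for p in centers]
    XtX = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    XtY = [[sum(X[k][i] * targets[k][c] for k in range(n)) for c in range(m)]
           for i in range(n)]
    return solve(XtX, XtY)


def evaluate(f, centers, weights):
    """h(f) = sum_i a_i * phi(||f - f_i||), one value per target channel."""
    return [sum(w[c] * phi(dist(f, p)) for w, p in zip(weights, centers))
            for c in range(len(weights[0]))]


# Three marked vertices with (a, b, x, y) features and edited chroma targets.
centers = [(0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)]
targets = [(0.5, 0.2), (-0.3, 0.1), (0.0, 0.4)]
weights = fit_weights(centers, targets)
print(evaluate((0.5, 0.0, 0.0, 0.0), centers, weights))
```

Because the number of unknowns equals the number of marked grid vertices rather than the number of pixels, the linear system stays small even for large images. Forming $X^T X$ squares the condition number, so a QR-based or SVD-based least-squares routine is usually preferable when many vertices lie close together.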

Once the function $h$ is determined, the grid vertices can be re-colored. The next step is to map the color of each grid vertex, including the marked grid vertices, back to the pixels it contains, so that the color value of each pixel in a grid cell is consistent with the color value of its grid vertex while its gray value remains unchanged. Finally, all pixels in the input image grid are inversely mapped to the RGB color space to obtain the output image after re-coloring.
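The inverse mapping from grid vertices back to pixels can be sketched as follows. This is an illustrative nearest-cell lookup under the same assumed pixel layout `(x, y, L, a, b)`; the name `slice_pixels` and the dictionary `vertex_ab` of edited chroma values are hypothetical, and the final Lab-to-RGB conversion is omitted.

```python
import math


def nearest(v):
    """Closest-integer operator [.] (ties round up)."""
    return math.floor(v + 0.5)


def slice_pixels(pixels, vertex_ab, s_l, s_c, s_s):
    """Give each pixel the edited (a, b) of its grid vertex; keep L unchanged.

    pixels: list of (x, y, L, a, b) tuples (assumed layout).
    vertex_ab: {cell: (new_a, new_b)} for the re-colored vertices.
    """
    out = []
    for x, y, L, a, b in pixels:
        cell = (nearest(L / s_l), nearest(a / s_c), nearest(b / s_c),
                nearest(x / s_s), nearest(y / s_s))
        # Cells without an edited color leave the pixel as it was.
        new_a, new_b = vertex_ab.get(cell, (a, b))
        out.append((x, y, L, new_a, new_b))
    return out


pixels = [
    (0, 0, 50.0, 10.0, 20.0),
    (1, 0, 52.0, 12.0, 22.0),
    (40, 30, 80.0, -30.0, 5.0),
]
vertex_ab = {(3, 1, 1, 0, 0): (40.0, -10.0)}
edited = slice_pixels(pixels, vertex_ab, s_l=15, s_c=15, s_s=10)
print(edited)
```

Each pixel costs one cell lookup, so this step is linear in the number of pixels and independent of the number of user strokes.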

4 Experimental results

4.1 Experimental settings

Experiments were run on a PC with a 2.80 GHz CPU and 8 GB of memory under the Windows 7 operating system. We first determine suitable sampling rate parameters, and then verify the effectiveness of the proposed method in terms of the quality and efficiency of the experimental results.

Figure 2: Color editing results at different sampling rates. The bottom row shows the color editing results with different parameters, while the top row shows the corresponding probability maps

The division of the bilateral grid is determined by the sampling rates. The number of grid cells greatly affects both the color editing results and the time efficiency. In general, we fix the coordinate sampling rate $s_s$ at one-tenth of the image size, while the sampling rates of the gray value axis and the ab channels ($s_l$ and $s_c$) are set to 15. As shown in Fig. 2(c), when the sampling rate is lower, the excessive number of bilateral grid cells leaves most pixels out of a grid. The number of bilateral grid cells decreases as the sampling rate increases, which facilitates subsequent color editing and propagation. However, an overly high sampling rate results in too few grid cells; once a grid cell contains a tremendous number of pixels, numerical overflow occurs, as shown in Fig. 2(b). After repeated adjustment of the parameters, we selected $s_l = s_c = 15$ as the sampling rates for the subsequent experiments.

We selected four images randomly downloaded from the Internet as the experimental images, named starfish, fruit, children and flower. All of these are natural images that can be divided into two classes: starfish and fruit are images with repeated scene elements, while children and flower are images with complex scene elements.

For comparison, we evaluated four existing color editing methods against our method [Li, Ju and Hu (2010); Chen, Chen and Zhao (2012); Xu, Yan and Jia (2013); Chen, Li, Chen et al. (2016)], which are referred to as Li’s method, Chen’s method (2012), Xu’s method and Chen’s method (2016), respectively (because Chen has proposed two different methods, we distinguish between them using the year of publication). To obtain the color-edited output image, each method requires user interaction on the input image before color editing takes place. Tab. 1 presents the user interaction statistics of the images for the different methods. The images with interactions are then color edited, and the results are presented in Fig. 3.

Table 1: User interaction statistics

Chen et al.’s [Chen, Chen and Zhao (2012)] method requires fewer user interactions for a single-color image. However, when color editing is performed on an image with large color differences, the number of user interactions increases. The advantage of Xu et al.’s [Xu, Yan and Jia (2013)] method is that the color editing propagation takes the spatial position features into account, enabling variable color editing of the same area. Its shortcoming is that the number of interactions is slightly increased. Both Chen et al.’s [Chen, Li, Chen et al. (2016)] method and our method require few user interactions; we therefore use the same number of interactions on the same images for these two methods, and more interactions for the methods of Chen et al. [Chen, Chen and Zhao (2012)] and Xu et al. [Xu, Yan and Jia (2013)].

Figure 3: The color editing results of different color editing methods: (a) Li’s method; (b) Chen’s method (2012); (c) Xu’s method; (d) Chen’s method (2016); (e) our method. The odd-numbered rows show the input images with interactions, while the rows below them show the corresponding color editing results

In the subsequent quality comparison, the quality of the color editing results was quantitatively evaluated using the structural similarity index (SSIM) [Wang and Bovik (2002); Wang, Bovik, Sheikh et al. (2004)]. For the original target image $I_1$ and the color-edited image $I_2$, the SSIM can be obtained using the following formula:

where $\mu_{I_1}$ and $\mu_{I_2}$ are the average color values of images $I_1$ and $I_2$, respectively; $\sigma_{I_1}$ and $\sigma_{I_2}$ are the color variances of images $I_1$ and $I_2$, respectively; and $c_1$ and $c_2$ are constants added to prevent the denominator from being zero.
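As an illustration, the widely used form of SSIM from Wang, Bovik, Sheikh et al. (2004) can be computed over a single global window as sketched below. Beyond the means, variances and constants named above, that form also uses the covariance of the two images. The function name `ssim_global`, the flat-list input layout, and the default constants (the usual choice for 8-bit data) are assumptions of this sketch; the windowing and color handling used in the experiments are not specified here.

```python
def ssim_global(img1, img2, c1=6.5025, c2=58.5225):
    """Single-window SSIM over two equal-length sequences of intensities.

    The defaults c1 = (0.01 * 255)^2 and c2 = (0.03 * 255)^2 are the
    constants commonly used for 8-bit data (an assumption of this sketch).
    """
    n = len(img1)
    if n != len(img2) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mu1 = sum(img1) / n
    mu2 = sum(img2) / n
    var1 = sum((p - mu1) ** 2 for p in img1) / (n - 1)
    var2 = sum((q - mu2) ** 2 for q in img2) / (n - 1)
    cov = sum((p - mu1) * (q - mu2) for p, q in zip(img1, img2)) / (n - 1)
    numerator = (2 * mu1 * mu2 + c1) * (2 * cov + c2)
    denominator = (mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2)
    return numerator / denominator


original = [10.0, 20.0, 30.0, 40.0]
reversed_copy = [40.0, 30.0, 20.0, 10.0]
print(ssim_global(original, original))
print(ssim_global(original, reversed_copy))
```

Published SSIM scores are normally averaged over local sliding windows rather than computed once over the whole image, so values from this global variant are not directly comparable with them.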

4.2 Image quality comparison

Fig. 3 shows the color editing results of our method, Li’s method, Chen et al.’s [Chen, Chen and Zhao (2012)] method, Xu et al.’s [Xu, Yan and Jia (2013)] method and Chen et al.’s [Chen, Li, Chen et al. (2016)] method for images with repeated scene elements and complex scene elements. For the color-edited images in Fig. 3, we calculated the corresponding SSIM values, which are presented in Tab. 2. As shown in Tab. 2, compared with the methods of Li et al. [Li, Ju and Hu (2010)], Chen et al. [Chen, Chen and Zhao (2012)] and Xu et al. [Xu, Yan and Jia (2013)], our method has the highest SSIM values for all images, with both repeated and complex scene elements. Li’s method results in color mixing, and its SSIM values are the lowest. Chen et al.’s [Chen, Chen and Zhao (2012)] method uses locally linear embedding to effectively eliminate the color-mixing ring at boundaries and make up for the shortcomings of Li et al.’s [Li, Ju and Hu (2010)] method; as a result, its SSIM values are higher than those of Li et al.’s method. However, Chen et al.’s [Chen, Chen and Zhao (2012)] method results in color loss when the edited colors are located in multiple different areas. Xu et al.’s [Xu, Yan and Jia (2013)] method considers spatial position features, enabling variable color editing of the same area. Moreover, it overcomes the overflow and color loss exhibited by Chen et al.’s [Chen, Chen and Zhao (2012)] method, meaning that it has higher SSIM values than that method, although it also requires more interactions. For its part, Chen et al.’s [Chen, Li, Chen et al. (2016)] method obtains a representative and compact dictionary and performs edit propagation on the dictionary instead. While this method has excellent SSIM values, its time efficiency could still be improved. Finally, our method uses the optimized radial basis interpolation model to greatly improve the editing quality; as a result, our method generates good-quality color-edited images with the highest SSIM values.

Table 2: Statistics of SSIM values

4.3 Image color editing efficiency comparison

Color editing methods usually contain two stages: (1) importing image data and building a model; (2) color editing propagation. We measured the running time of the two stages of each method for the images in Fig. 3; the results are listed in Tab. 3.

Table 3: Running time statistics

As can be seen from Tab. 3, although the proposed method takes the most time in the first stage, it takes by far the least time in the second stage, with the result that its total running time is the lowest. In the proposed method, the pixels are clustered into a grid in the bilateral space, which greatly reduces the number of sample points, the computational complexity and the memory consumption. Because the bilateral grid must be built in the first stage, our method requires more time in that stage, but much less time in the second stage. Thus, our method requires only a small amount of user interaction to complete the color editing propagation task in a short time, and its time efficiency is higher than that of the existing color editing methods. The methods of Li et al. [Li, Ju and Hu (2010)] and Chen et al. [Chen, Chen and Zhao (2012)] perform similarity judgment and color editing propagation on individual pixels; therefore, these methods take a long time. Xu et al.’s [Xu, Yan and Jia (2013)] method needs to calculate the spatial position features, meaning that it takes an even longer time. Finally, Chen et al.’s [Chen, Li, Chen et al. (2016)] method uses a sparse dictionary for editing propagation; sparse dictionaries reduce the amount of computation and significantly reduce the running time.

5 Conclusion

In this paper, we propose a fast image color processing method based on a bilateral grid. With only a small amount of user interaction, different regions of the image can be edited simultaneously by combining the color features and coordinate features of the image. Using a bilateral grid to cluster pixels greatly reduces the computational effort and memory consumption, with the result that the time efficiency of the proposed method is improved. A grid cell whose pixels are similar in color to the sample pixels and that contains no marked pixels is defined as an edited grid cell, and the vertices of the edited grid cells can be re-colored accurately. Accordingly, the proposed method realizes fast, accurate and high-quality color editing for images.

Acknowledgments: This project is supported by the National Natural Science Foundation of China (No. U1836208, No. 61402053 and No. 61202439), the Natural Science Foundation of Hunan Province of China (No. 2019JJ50666 and No. 2019JJ50655), and partly supported by the Open Fund of Hunan Key Laboratory of Smart Roadway and Cooperative Vehicle-Infrastructure Systems (Changsha University of Science & Technology) (No. KFJ180701).