A Non-Extensive Entropy Algorithm for Multi-Region Segmentation: Generalization and Comparison

Since the 1980s, the concept of Shannon entropy has been applied in the field of image processing and analysis for image binarization. The general idea underlying this type of image segmentation is the maximization of the information (entropy) associated with background and foreground pixels. In the context of statistical mechanics, physical systems that are well described by Shannon entropy are called extensive systems. It is well known, however, that several physical systems have non-extensive behavior, which makes the Shannon entropy inadequate to describe them. A generalization of the Shannon entropy is the so-called q-entropy, proposed for non-extensive systems. This paper presents an algorithm for image segmentation based on this new kind of entropy. We show that, by applying the proposed algorithm recursively, it is possible to highlight the main image regions without oversegmentation. In order to validate our proposal, we compare it with four well-known approaches for image clustering, namely: bootstrap, fuzzy c-means, k-means and self-organizing maps. In addition, we have used synthetic images to compare these methods quantitatively. Our results show that the proposed algorithm outperforms the traditional Shannon entropy for image binarization and retrieves multiple regions with similar performance in terms of precision and execution time.


Index Terms
SEG-CLST Clustering-based methods; SEG-VDOB Video object segmentation and tracking; BMI-SEGM Biomedical image segmentation and quantitative analysis

I. INTRODUCTION
Image segmentation plays an important role at the basis of computer vision tasks, such as image analysis, recognition and tracking, to name a few. It is a basic but important and complex problem which has intrigued researchers for decades. The issue behind image segmentation is to decompose the image into regions of coherent properties in an attempt to identify objects and their parts. Gray-level image segmentation techniques can be classified into the following categories: thresholding, methods based on feature space, edge-detection-based methods, region-based methods, fuzzy logic techniques and neural networks. Most of these methods can be extended to color images by representing color information in an appropriate color space. In addition, it is possible to combine more than one approach to achieve better performance. Among these methods, thresholding is fundamental for this work. Thresholding is a large class of segmentation techniques based on the assumption that objects can be distinguished and extracted from the background by their gray levels. The output of traditional thresholding operations is a binary image whose intensity pattern distinguishes the foreground (gray level 0, for example) from the background (gray level 255). Surveys can be found in references [12] and [27].
In general, threshold selection can be categorized into two classes, local methods and global methods.
In the global thresholding methods, a single threshold is selected from the image histogram and applied to the segmentation task, while in the local methods the image is partitioned into a number of subimages and a specific threshold is selected for each one. Global thresholding techniques are computationally cheaper than local methods, as well as easier to implement. However, their efficiency is highly dependent on the histogram distribution, specifically on peaks separated by a valley, which limits their applications. In this sense, local methods tend to be more efficient, although also more computationally expensive, as a general threshold estimation is not necessarily an easy task.
May 23, 2006 DRAFT
Applying the concept of entropy in order to segment a digital image has been a common practice since Pun's work [25] showed how to find a threshold that maximizes the information measure, the Shannon entropy, of the resulting binary image. Other works following the same philosophy were proposed, e.g., Kapur et al. [14] maximized an upper bound of the total a posteriori entropy in order to obtain the threshold level. Abutaleb [1] extended the method using two-dimensional entropies. Li and Lee [18] and Pal [24] used the directed divergence of Kullback for the selection of the threshold, and Sahoo et al. [27] used the Rényi entropy model for image thresholding.
The concept of Shannon entropy was proposed for Information Theory based on the Boltzmann-Gibbs entropy from the context of classical thermodynamics. However, it has been well known for several decades that this concept fails to explain some phenomena having complex behaviors, such as long-range interactions and long-time memories [35], [37]. Such systems are called "non-extensive systems", and those following the Boltzmann-Gibbs-Shannon (BGS) formalism are called "extensive systems".
In 1988 Tsallis proposed a new formalism for the generalization of BGS entropy, which is called q-entropy or Tsallis entropy. This new entropy has achieved relative success in explaining complex phenomena in several applications. The list of applications of non-extensive entropy is vast and can be found in [37].
In 2004, Albuquerque et al. [2] applied the concept of non-extensive entropy to mammographic gray-scale images. They assumed two probability distributions of gray-scale luminance, one for the background and another for the foreground class of pixels. Then, they took the threshold that maximizes the separation between these two classes.
However, in some classes of images, where the region around the boundaries carries important information for the application, a simple thresholding or clustering method does not guarantee a good separation between the classes. Examples of such images are mammographic ones from ultrasound devices showing breast tumors, where, on the boundary of such lesions (e.g., malignant ones), dead cells are mixed with healthy ones, generating a third region in the image. As these regions generally form a narrow band, simple threshold methods may fail to capture their information content adequately. On the other hand, more sophisticated methods, which combine several basic approaches, may have a high computational time.
In this paper, we propose an extension of the work of Albuquerque et al. [2], applying the q-entropy concept recursively at each partition (background and foreground). Our proposal retrieves more information (in terms of number of retrieved regions) at a low computational time. Also, we have total control over the number of recursions and, as a consequence, over the number of retrieved regions.
The paper is organized as follows. In Section II we indicate some related works; in Section III we introduce the q-entropy in the context of non-extensive systems; the proposed method is presented in Section IV; in Section V we compare our methodology against other well-known methods on natural and synthetic images. Finally, in Section VI we offer some discussion and conclusions about the results.

II. RELATED WORKS
The mean-shift algorithm is a general nonparametric technique proposed by Comaniciu and Meer [8] for clustering of complex multimodal feature spaces. It randomly tessellates the space with search windows and moves them until convergence is achieved at the nearest mode of the underlying probability distribution of density gradients. Several applications of this algorithm to color clustering can also be found in [20], [26], [38].
The bootstrap clustering technique is similar to other resampling schemes, such as cross-validation and jackknifing. A bootstrap sample is obtained by sampling with replacement from an empirical distribution function of the training set. Chen et al. [6] applied a bootstrap implementation to computer-aided diagnosis in breast ultrasound images. Dutendas et al. [11] presented a Bayesian approach combined with a bootstrap algorithm in order to segment images of the retina. Two other applications of bootstrap techniques to image segmentation can be found in [21], [40].
The Watershed transform is a reliable tool for initial image segmentation. A significant advantage of Watershed segmentation, and a reason behind its extended utilization, is that boundaries on the image plane are always guaranteed to be connected and closed, and each gradient minimum corresponds to one region [22]. A good explanation of the Watershed algorithm can also be found in [13].
As an unsupervised clustering algorithm we can cite the fuzzy c-means (FCM), which has been applied successfully to a number of problems involving feature analysis, clustering and classifier design. It has been applied to a wide variety of fields, such as agricultural engineering, astronomy, chemistry, geology, image analysis, medical diagnosis, shape analysis and target recognition [4].
Unlabeled data are classified by the definition of a norm and a cluster prototype, and by minimizing an objective function. Although the description of the original algorithm dates back to 1973 [3], [10], further variations have been described with modified definitions for the norm and for the prototypes of the cluster centers [9], [17].
One of the most used methods for image clustering in recent years is the so-called self-organizing map (SOM), proposed by Kohonen [16]. The SOM neural network consists of two layers, and for every neuron in the input layer there is a link to every neuron in the output layer. During the training process of the SOM network, for each input vector we get one best-matching neuron in the output layer.
Here a competitive learning algorithm is used to adjust the weight vectors in the neighborhood of the best-matching neuron. The adjustment decreases as the time and the distance within the neighborhood increase.

III. TSALLIS ENTROPY
Entropy is an idea born at the heart of classical thermodynamics, not as something fundamentally intuitive but as something fundamentally quantitative, defined through an equation, which has become known as the Boltzmann-Gibbs (BG) entropy. Later, Shannon redefined the concept of BG entropy (now called BGS entropy) as an uncertainty measure associated with the information content of a system. This traditional form of entropy is well known by the following equation:

$$S = -\sum_{i} p_i \ln(p_i) \qquad (1)$$

Generically speaking, systems having statistics of the BGS type are called extensive systems and have an additive property, defined as follows. Let P and Q be two random variables, with probability distributions $P = (p_1, \ldots, p_n)$ and $Q = (q_1, \ldots, q_m)$, respectively, and let S be the entropy associated with P or Q. If P and Q are independent, in the sense of probability theory, the entropy of the composed distribution, also called the direct product of P and Q and defined as $P * Q = \{p_i q_j\}_{i,j}$ with $1 \le i \le n$ and $1 \le j \le m$, verifies the so-called additivity rule:

$$S(P * Q) = S(P) + S(Q) \qquad (2)$$

This traditional form of entropy has for years achieved relative success in explaining several phenomena, provided that the effective microscopic interactions are short-ranged (i.e., close spatial connections), the effective microscopic memory is short-ranged (i.e., close time connections) and the boundary conditions are non(multi)fractal. Roughly speaking, the standard formalism is applicable whenever (and probably only whenever) the relevant space-time is non(multi)fractal. If this is not the case, some kind of extension appears to become necessary. We can make a complete analogy with Newtonian mechanics, which becomes only an approximation (an increasingly bad one) when the involved velocities approach that of light or the masses are as small as, say, the electron mass; likewise, standard statistical mechanics does not apply when the above requirements (short-ranged microscopic interactions, short-ranged microscopic memory and non(multi)fractal boundary conditions) are not met.
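As a minimal numerical check of the additivity rule for independent distributions, the following Python sketch computes the Shannon entropy of two distributions and of their direct product. The function names and the example distributions are illustrative assumptions, not part of the original work.

```python
import math
from itertools import product

def shannon_entropy(p):
    """Shannon (BGS) entropy S = -sum p_i ln p_i; zero-probability states are skipped."""
    return -sum(x * math.log(x) for x in p if x > 0)

def direct_product(p, q):
    """Composed distribution P * Q = {p_i q_j} of two independent distributions."""
    return [pi * qj for pi, qj in product(p, q)]

# Hypothetical example distributions (each sums to 1).
P = [0.5, 0.3, 0.2]
Q = [0.6, 0.4]

lhs = shannon_entropy(direct_product(P, Q))    # S(P * Q)
rhs = shannon_entropy(P) + shannon_entropy(Q)  # S(P) + S(Q)
print(abs(lhs - rhs) < 1e-12)
```

The two sides agree up to floating-point rounding, which is the extensive (additive) behavior discussed above.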
Recent developments based on the concept of non-extensive entropy, also called Tsallis entropy, have generated a new interest in the study of Shannon entropy for Information Theory [28], [36]. Tsallis entropy (or q-entropy) is a new proposal for the generalization of the traditional Boltzmann/Gibbs entropy applied to non-extensive physical systems.
The non-extensive character of Tsallis entropy is introduced through the inclusion of a parameter q, which generates several mathematical properties. The general equation is the following:

$$S_q(p_1, \ldots, p_k) = \frac{1 - \sum_{i=1}^{k} (p_i)^q}{q - 1} \qquad (3)$$

where k is the total number of possibilities of the whole system and the real number q is the entropic index that characterizes the degree of non-extensiveness. In the limit q → 1, Equation (3) recovers the traditional BGS entropy defined by Equation (1). These characteristics give the q-entropy flexibility in explaining several physical systems. On the other hand, this new kind of entropy does not fail to explain the traditional physical systems, since it is a generalization.
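The q-entropy and its q → 1 limit can be illustrated with a short Python sketch. It assumes the standard Tsallis form $S_q = (1 - \sum_i p_i^q)/(q - 1)$; the function name `tsallis_entropy`, the convention of skipping zero-probability states and the test distribution are choices made here for illustration only.

```python
import math

def tsallis_entropy(p, q):
    """q-entropy S_q = (1 - sum p_i^q) / (q - 1) of a normalized distribution.

    At q = 1 the expression is undefined, so the Shannon form is used instead.
    Zero-probability states are skipped (an assumed convention).
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

uniform = [0.25, 0.25, 0.25, 0.25]  # hypothetical test distribution

s_two = tsallis_entropy(uniform, 2.0)        # entropic index q = 2
s_near_one = tsallis_entropy(uniform, 1.000001)
s_shannon = tsallis_entropy(uniform, 1.0)    # Shannon value, ln(4) for this input
print(s_two, s_near_one, s_shannon)
```

For q close to 1 the value approaches the Shannon entropy of the same distribution, which is the limit behavior stated in the text.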
Furthermore, a generalization of a theory may imply the violation of one of its postulates. In the case of the generalized entropy proposed by Tsallis, the additive property described by Equation (2) is violated and takes the form of Equation (4), which applies if the system has a non-extensive characteristic.
In this case, the Tsallis statistics is useful and the q-additivity describes better the composed system, which is defined as:

$$S_q(P * Q) = S_q(P) + S_q(Q) + (1 - q)\, S_q(P)\, S_q(Q) \qquad (4)$$

In this equation, the additive property holds only in the limit q = 1.
Considering $S_q \ge 0$ in the pseudo-additive formalism of Equation (4), the following classification of entropic systems is defined: subextensive system (when q > 1 and $S_q(P * Q) < S_q(P) + S_q(Q)$); extensive system (when q = 1 and $S_q(P * Q) = S_q(P) + S_q(Q)$); and superextensive system (when q < 1 and $S_q(P * Q) > S_q(P) + S_q(Q)$).
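The pseudo-additive rule can be checked numerically. The sketch below assumes the standard form $S_q(P * Q) = S_q(P) + S_q(Q) + (1 - q) S_q(P) S_q(Q)$ and compares the q-entropy of a direct product with the plain sum of the two entropies for q below, at and above 1. All names and example distributions are hypothetical.

```python
import math
from itertools import product

def tsallis_entropy(p, q):
    """q-entropy of a normalized distribution; Shannon form at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

def pseudo_additive(s_p, s_q, q):
    """Composed entropy from the two marginal entropies (pseudo-additivity)."""
    return s_p + s_q + (1.0 - q) * s_p * s_q

P = [0.5, 0.3, 0.2]  # hypothetical independent distributions
Q = [0.6, 0.4]
joint = [a * b for a, b in product(P, Q)]

results = {}
for q in (0.5, 1.0, 2.0):
    s_p, s_q = tsallis_entropy(P, q), tsallis_entropy(Q, q)
    s_joint = tsallis_entropy(joint, q)
    # The cross term (1 - q) * s_p * s_q is positive for q < 1 and negative
    # for q > 1, so the joint entropy lies above or below the plain sum.
    results[q] = (s_joint, pseudo_additive(s_p, s_q, q), s_p + s_q)
    print(q, results[q])
```

The direct computation on the product distribution and the pseudo-additive composition agree for every q, while the deviation from the plain sum changes sign at q = 1.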
Taking into account the similarities between the formalisms of the Shannon and Boltzmann/Gibbs entropies, it is interesting to investigate the possibility of generalizing the Shannon entropy in the case of information theory, as has been recently shown by Yamano [39]. This generalization may be extended to image segmentation tasks by applying Tsallis entropy, which has nonadditive information contents.
Albuquerque et al. [2] proposed an algorithm using the concept of q-entropy to segment ultrasound (US) images.
Since this concept may be naturally applied to any statistical distribution, in this paper we propose a natural extension of the algorithm proposed by Albuquerque et al. [2], which leads to a recursive procedure by applying the concept of q-entropy to each distribution P and Q. We have named our extended algorithm NESRA.
The motivations for using the q-entropy are: 1) managing only a single parameter q leads to a more controllable system; 2) as suggested in [2] for mammographic images, it is interesting to study the behavior of non-extensive segmentation on several other classes of images; 3) it is simple and makes the implementation easy, with a low computational time.
In the following section, we fully describe the NESRA proposal.

IV. THE NON-EXTENSIVE SEGMENTATION APPROACHES
Applying the concept of entropy in order to segment a digital image has been a common practice since Pun [25] showed how to maximize the difference between the foreground and background using the Shannon entropy over a gray-level distribution. Then, other works following the same line were proposed, e.g., Kapur et al. [14] maximized an upper bound of the total a posteriori entropy in order to obtain the threshold level. Abutaleb [1] extended the method using two-dimensional entropies. Li and Lee [18] and Pal [24] used the directed divergence of Kullback for the selection of the threshold, and Sahoo et al. [27] used the Rényi entropy model for image thresholding.
In 2004, Albuquerque et al. [2] presented the concept of non-extensive entropy applied to mammographic gray-scale images. They assume two probability distributions, one for the background and another for the foreground, and take the threshold that maximizes the non-additivity characteristic given by Equation (4). However, like several methods designed to produce binary images, this approach does not work as well for multi-region segmentation. We therefore propose an extension of the method presented in [2], applying the maximization of Equation (4) recursively over the background and the foreground in order to obtain multiple regions of homogeneous gray-level distribution.
In this section, we formalize the NESRA algorithm. First, we review the non-extensive procedure for image segmentation proposed in [2].

A. Non Extensive Segmentation Algorithm for Image Binarization
Suppose an image with k gray levels, and let the probability distribution of these levels be $P = \{p_i\} = \{p_1, p_2, \ldots, p_k\}$. Now, we want to consider two probability distributions derived from P, one for the foreground ($P_A$) and another for the background ($P_B$). We can make a partition at luminance level t, dividing the pixels from P into A and B. In order to maintain the constraints $0 \le P_A \le 1$ and $0 \le P_B \le 1$, we must renormalize both distributions as:

$$P_A: \frac{p_1}{p_A}, \frac{p_2}{p_A}, \ldots, \frac{p_t}{p_A} \quad \text{and} \quad P_B: \frac{p_{t+1}}{p_B}, \frac{p_{t+2}}{p_B}, \ldots, \frac{p_k}{p_B},$$

where $p_A = \sum_{i=1}^{t} p_i$ and $p_B = \sum_{i=t+1}^{k} p_i$. Now, we calculate the a priori Tsallis entropy for each distribution as

$$S_q^A(t) = \frac{1 - \sum_{i=1}^{t} (p_i / p_A)^q}{q - 1} \qquad (5)$$

$$S_q^B(t) = \frac{1 - \sum_{i=t+1}^{k} (p_i / p_B)^q}{q - 1} \qquad (6)$$

We can observe that the Tsallis entropy represented by Equations (3), (5) and (6) depends directly on the parameter t for the foreground and background. The entropy of the partitioned image is formulated as the sum of each entropy, allowing for the pseudo-additive property for statistically independent systems, defined in Equation (7):

$$S_q^{A+B}(t) = S_q^A(t) + S_q^B(t) + (1 - q)\, S_q^A(t)\, S_q^B(t) \qquad (7)$$
To accomplish the segmentation task, in [2] the information measure between the two classes (foreground and background) is maximized. In this case, the luminance level t is considered to be the optimum threshold value ($t_{opt}$), which can be obtained with a cheap computational effort as

$$t_{opt} = \arg\max_{t} \left[ S_q^A(t) + S_q^B(t) + (1 - q)\, S_q^A(t)\, S_q^B(t) \right] \qquad (8)$$

Note that the value of t which maximizes Equation (8) depends mainly on the parameter q. This is an advantage due to its simplicity. In the next section, we present the recursive formulation of this algorithm.
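The binarization step described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of [2]: it assumes the image is summarized by a gray-level histogram `hist` (a list of pixel counts, indexed from 0 rather than from 1), that the two class entropies are combined with the pseudo-additive rule, and that ties are broken by the lowest threshold. The names `tsallis_entropy` and `tsallis_threshold` are hypothetical.

```python
import math

def tsallis_entropy(p, q):
    """q-entropy of a normalized distribution; Shannon form at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

def tsallis_threshold(hist, q):
    """Bin index t maximizing the composed q-entropy of bins 0..t and t+1..end."""
    total = float(sum(hist))
    probs = [h / total for h in hist]
    best_t, best_s = None, -math.inf
    for t in range(len(probs) - 1):
        class_a, class_b = probs[:t + 1], probs[t + 1:]
        p_a, p_b = sum(class_a), sum(class_b)
        if p_a <= 0 or p_b <= 0:
            continue  # one class would be empty, so no valid partition here
        s_a = tsallis_entropy([p / p_a for p in class_a], q)  # renormalized class A
        s_b = tsallis_entropy([p / p_b for p in class_b], q)  # renormalized class B
        s = s_a + s_b + (1.0 - q) * s_a * s_b                 # pseudo-additive sum
        if s > best_s:
            best_t, best_s = t, s
    return best_t

# Hypothetical bimodal histogram: two populated blocks separated by empty bins.
hist = [10] * 4 + [0] * 4 + [10] * 4
for q in (0.5, 1.0, 2.0):
    print(q, tsallis_threshold(hist, q))
```

On this toy histogram the selected threshold falls at the end of the first block or inside the empty gap, so that each mode is assigned to a different class.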
In the recursive formulation, the partition is applied again to each of the two distributions, which yields four distributions, $P_{A_1}$, $P_{A_2}$, $P_{B_1}$ and $P_{B_2}$, having the constraints $p_{A_1} = \sum_{i=1}^{t} p_i$, $p_{A_2} = \sum_{i=t+1} p_i$, $p_{B_1} = \sum_{i=\,+1}^{\upsilon} p_i$, $p_{B_2} = \sum_{i=\upsilon+1}^{k} p_i$. For each one of these four distributions we can compute its respective non-extensive entropy as follows:

$$S_q^{X} = \frac{1 - \sum_{i \in X} (p_i / p_X)^q}{q - 1}, \quad X \in \{A_1, A_2, B_1, B_2\} \qquad (9)\text{-}(12)$$

Using Equations (4), (9) and (10) to compute $S(A) = S(A_1 + A_2)$ and, similarly, (11) and (12) to compute $S(B) = S(B_1 + B_2)$, we have:

$$S_q(A + B) = S_q(A) + S_q(B) + (1 - q)\, S_q(A)\, S_q(B) \qquad (13)$$

with $S_q(A) = S_q^{A_1} + S_q^{A_2} + (1 - q)\, S_q^{A_1} S_q^{A_2}$ and $S_q(B) = S_q^{B_1} + S_q^{B_2} + (1 - q)\, S_q^{B_1} S_q^{B_2}$. In this case, we have a set of optimal values t opt = {t, , υ, k} which maximizes Equation (13), which is equivalent to the calculation of the following expression:

$$t_{opt} = \arg\max \left[ S_q(A + B) \right] \qquad (14)$$

Equation (13) is simple, although it has several terms. Developing a third recursion would raise this number to sixteen, which is not necessary for our discussion. Two observations can now be made. First, the experimental results show that no more than two or three recursions are necessary in order to obtain results which are equal to or better than those of traditional methods. Second, increasing the number of recursions does not increase the algorithm's complexity or computation, since this increase is accompanied by a drop in the number of states to be computed at each recursion. Since at each recursion the data are partitioned into two new groups, the computational complexity of the NESRA algorithm is O(cN), where c is the number of recursions and N = lines × columns is the size of the image. Also, we can predict the maximum number of retrieved regions as $2^{c+1}$. When c = 0, NESRA outputs only two regions (background and foreground) and behaves as the proposal of Albuquerque et al. [2], which makes NESRA a generalization of [2].
The recursive algorithm for the previous formulation is simple and is given in the following.

    for all t = i until k do
        compute normalization for background
        compute normalization for foreground
        compute q-entropy for foreground according to Equation (5)
        compute q-entropy for background according to Equation (6)
        compute composed q-entropy according to Equations (7) and (8)
    end for
    topt = argmax of the composed q-entropy
    Call NESRA(H, i, topt) procedure
    Call NESRA(H, topt + 1, k) procedure
    END: there is nothing to do, return to calling procedure

It can be argued that any parametric or non-parametric method whose objective is to find an ideal threshold, producing a binary image, can be used recursively on the foreground and on the background in order to obtain multiple regions. However, our approach, based on non-extensive entropy, outperforms several well-known approaches under the same conditions, as we show in Section V.
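The recursive procedure can be sketched in Python as below. This is an illustrative reading of the pseudo-code, not the authors' implementation. It assumes a histogram `hist` of pixel counts indexed from 0, a stopping rule based on a fixed number of recursions (the text does not spell out the stopping condition), and the pseudo-additive composition of the two class entropies. The names `best_threshold` and `nesra` and their signatures are hypothetical.

```python
import math

def tsallis_entropy(p, q):
    """q-entropy of a normalized distribution; Shannon form at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

def best_threshold(hist, lo, hi, q):
    """Bin t in [lo, hi) maximizing the composed q-entropy of bins lo..t and t+1..hi.

    Returns None when the range cannot be split into two non-empty classes.
    """
    best_t, best_s = None, -math.inf
    for t in range(lo, hi):
        part_a, part_b = hist[lo:t + 1], hist[t + 1:hi + 1]
        mass_a, mass_b = sum(part_a), sum(part_b)
        if mass_a == 0 or mass_b == 0:
            continue
        s_a = tsallis_entropy([h / mass_a for h in part_a], q)
        s_b = tsallis_entropy([h / mass_b for h in part_b], q)
        s = s_a + s_b + (1.0 - q) * s_a * s_b
        if s > best_s:
            best_t, best_s = t, s
    return best_t

def nesra(hist, lo, hi, q, recursions):
    """Sorted thresholds found on bins lo..hi (inclusive).

    recursions = 0 yields a single threshold (two regions); each extra
    recursion splits both sides again (assumed stopping rule).
    """
    t = best_threshold(hist, lo, hi, q)
    if t is None:
        return []  # nothing to do, return to the calling procedure
    if recursions == 0:
        return [t]
    left = nesra(hist, lo, t, q, recursions - 1)
    right = nesra(hist, t + 1, hi, q, recursions - 1)
    return left + [t] + right

# Hypothetical histogram with four populated blocks separated by empty bins.
hist = ([10] * 4 + [0] * 2) * 3 + [10] * 4
thresholds = nesra(hist, 0, len(hist) - 1, q=1.0, recursions=1)
print(thresholds)
```

With one recursion the sketch returns three thresholds on this toy histogram, one in each gap between blocks, which corresponds to four regions; the thresholds found at an earlier level are preserved by the later ones.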

V. EXPERIMENTAL RESULTS
To show the robustness of our proposed multi-region segmentation algorithm, as well as the range of its application, we have tested it on four classes of gray-scale images: a natural image (sunset), a mammographic ultrasound image, an object with a homogeneous background and the Lenna image. The sunset image (Fig. 1.a) is suitable for testing the method under low-contrast intensity patterns. On the other hand, medical image applications have been a source of scientific investigation with several open challenges, which demand good segmentation algorithms. Therefore, we have chosen a well-known class of medical images to test the robustness, namely mammographic images (Fig. 1.b). The main characteristics of ultrasound images are high speckle noise, low resolution, many spurious regions and a poor definition of the region of interest, which demand specific algorithms. They are therefore also suitable for testing any image segmentation algorithm.

A. Non-Extensive × Extensive Results
The range of q values is a key issue in our proposal. In this paper, we have tested the q-entropy segmentation (recursive and non-recursive) for the range 0 < q ≤ 10.0. All experiments were carried out under three different values of q in Equation (13). The main purpose of this section is to compare the results of our proposed algorithm with those of the non-recursive version. The first row of Fig. 2 (images a-c) presents the results of the segmentation of Fig. 1.a with Tsallis entropy (non-recursive) for values of q = 1.0, 6.0 and 10.0, respectively. Clearly, this image has five main regions: the solar circumference, the solar crown, the sky, the mountains and the buildings. However, only two regions were retrieved by the algorithm: with threshold t = 101, q = 1.0 (Fig. 2.a); t = 113, q = 6.0 (Fig. 2.b); and t = 55, q = 10.0 (Fig. 2.c).
The second row of Fig. 2 (images d-f) presents the results of the segmentation of Fig. 1.a with our NESRA algorithm (the recursive version), also for values of q = 1.0, 6.0 and 10.0, respectively.
In this case, three more regions were outlined and the results, under the same q values, are clearly better than those in the first row. Fig. 2.d shows four regions which were retrieved with thresholds t = {101, 135, 55}, all for q = 1.0. Similarly, Fig. 2.e shows four regions retrieved with thresholds t = {113, 161, 63}, all with q = 6.0; and Fig. 2.f shows four regions retrieved with thresholds t = {55, 79, 40}, for q = 10.0.
Below each image is the sequence of retrieved thresholds. We can find more regions if we apply a second recursion of NESRA, finding up to $2^{2+1} = 8$ regions. The third row of Fig. 2 (images g-i) shows the results when we use the same q values as before; the sequence of generated thresholds is again given below each image. A third recursion, which could yield up to $2^{3+1} = 16$ regions, is not necessary for this image, as no more salient regions could be retrieved. A first important observation about this experiment is that Fig. 2.h, generated with two recursions and q = 6.0 (subextensive system), seems to be the most interesting result, as it delineates the target regions better. A second important observation is that the regions given by recursion i always preserve the separation achieved by recursion i − 1, which may be a useful characteristic of the NESRA algorithm, as the user gets more control over local details. This control is achieved by setting the number of recursions. This is specifically useful in applications where we can have coarse transitions at the boundary of the main regions, such as ultrasound images, which have large central regions (representing lesions) with a mixture of dead and healthy cells on the lesion's boundaries. In this case, we may have a good separation between the lesion's nucleus, the background and the narrow boundary region (transition). This can be observed in the next experiment.
The second class of images we have tested with the NESRA algorithm comes from mammographic ultrasound devices (Fig ... even more powerful, reducing the physician's workload (see [34] and references therein).
The first row of Fig. 3 shows three segmentation results (without recursion) of Fig. 1.b. The given thresholds are 91, 52 and 10 for q = 1.0, 6.0 and 10.0, respectively. As in the previous experiment, we can preserve the partition given by the first NESRA application and find more regions by applying NESRA recursively. The results are shown in the second row of Fig. 3. The values of t below the images of Fig. 3.d-f are the thresholds obtained for this first recursion. The best result seems to be that of Fig. 3.e, as we clearly have a good segmentation of the lesion nucleus, separated from the background by a narrow boundary with a distinct threshold as a third region. This is particularly useful if a CAD system needs to analyze the boundary region separately in order to decide whether it belongs to the lesion region or to the background. It may also help to extract the lesion's nucleus from the image. We can also note that the thickness of the boundary region may be controlled by setting the q value. As q → 1.0 (extensive systems), the region of interest increases and tends to merge with the background. On the contrary, as q > 1.0 increases (subextensive systems), the region of interest is easier to define, but at the cost of missing (important) information.
Fig. 3. (a) t = 91, q = 1.0; (b) t = 52, q = 6.0; (c) t = 10, q = 10.0; (d) t = {46, 91, 137}, q = 1.0; (e) t = {35, 52, 99}, q = 6.0; (f) t = {4, 10, 19}, q = 10.0.
The third data set we have used in our experiments is composed of synthetic images from the Columbia database [7]. Although these images have a homogeneous background, which helps the foreground extraction, several applications need the recognition of parts of objects. For this kind of image, binarization may therefore not work. This fact may be observed in the segmentation of Fig. 1.c shown in Fig. 4.a-c. In this figure, the only acceptable segmentation seems to be that of Fig. 4.a, when q = 0. When q = 0.5 or 1.0 the results are clearly inferior. When applying our proposed method, however (Figs. 4.d-f), we can reach better results for all values of q. The best result seems to be that of Fig. 4.d (q = 0).
Finally, we have tested our proposed algorithm on the Lenna image, with the same q = {0, 0.5, 1.0} values as before. Fig. 5.a-c shows the application of the extensive algorithm; by applying the proposed non-extensive algorithm to the same original image, we obtain better results in terms of segmentation, as may be seen in the corresponding images of Fig. 5.d-f. In this case, a single recursion was applied, which was sufficient to get four main regions.

B. Comparisons with Other Methods
In this section we compare our proposed approach with four well-known algorithms which were briefly described in Section II, namely: bootstrap, fuzzy c-means, k-means and SOM. In Fig. 6 there are four columns and five rows. Each column corresponds to an image from Fig. 1, and each row corresponds to a different method (listed from top to bottom): our proposed NESRA algorithm, bootstrap, fuzzy c-means, k-means and SOM.
Figure panel labels: (a) t = 129, q = 0; (b) t = 124, q = 0.5; (c) t = 122, q = 1.0; (d) t = {129, 184, 75}, q = 0; (e) t = {124, 176, 78}, q = 0.5; (f) t = {122, 174, 79}, q = 1.0. The second row shows the corresponding three segmentations when we applied our proposed method for the same q values as before.

NESRA was applied with two recursions, which means that there are at most four main regions. Bootstrap, fuzzy c-means and k-means were set to find up to five clusters, and the SOM network was set to spread its input data over a 3 × 2 rectangular grid of neurons. However, similar results were obtained with 4 × 4 and 6 × 6 rectangular grids. Given this setup, we can make the following discussion about the results presented in Fig. 6.
In the first column of Fig. 6 (which corresponds to Fig. 1.a) we can see that NESRA and KM give similar results, in contrast to Bootstrap, FCM and SOM, since NESRA and KM give four main salient regions, which correspond to the light around the sun, the sky, the horizon and the buildings.
Additionally, NESRA seems to delineate these four regions better.
The second column of Fig. 6 shows the application of all algorithms to a mammographic image.
All algorithms traced the tumor (central circular region) well. KM obtained more homogeneous regions for the background. For the remaining methods, the background results are all similar.
In the third column (a cup with some drawings on a homogeneous background, corresponding to Fig. 1.c), we may discard only the SOM results (row 5) and analyze the other four, which seem to generate similar results. Among them, FCM generates small noise, while NESRA, Bootstrap and KM seem to produce similar (good) segmentations.
The last column shows the results for the Lenna image. All approaches except NESRA and bootstrap generated an oversegmentation, but NESRA clearly has low noise and homogeneous main regions.

C. Synthetic Images
Finally, in order to compare the methods quantitatively, we have run each algorithm on a synthetic image consisting of two concentric circles with radii 100 and 50. The intensities of the background, the outer circle and the inner circle are 150, 100 and 50, respectively. These intensities are not far apart in the histogram. This image is shown in Fig. 7, with three examples of increasing Gaussian noise with µ = 0: σ² = 0.01, 0.05 and 0.1.
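A synthetic test image of this kind can be generated with a few lines of Python. The sketch below is only an approximation of the setup described above: the image size (256 × 256), the centered placement of the circles and the interpretation of σ² as a variance on intensities normalized to [0, 1] are assumptions, since the text does not state them. The function names are hypothetical.

```python
import random

def synthetic_circles(size=256, outer_r=100, inner_r=50, levels=(150, 100, 50)):
    """Two concentric circles; levels = (background, outer circle, inner circle)."""
    background, outer, inner = levels
    c = size / 2.0  # assumed: circles centered in the image
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            d2 = (x - c) ** 2 + (y - c) ** 2
            if d2 <= inner_r ** 2:
                row.append(inner)
            elif d2 <= outer_r ** 2:
                row.append(outer)
            else:
                row.append(background)
        image.append(row)
    return image

def add_gaussian_noise(image, variance, seed=0):
    """Zero-mean Gaussian noise; variance is given on the [0, 1] intensity scale (assumed)."""
    rng = random.Random(seed)
    sigma = 255.0 * variance ** 0.5  # rescale the standard deviation to 0..255
    return [[min(255, max(0, round(v + rng.gauss(0.0, sigma)))) for v in row]
            for row in image]

clean = synthetic_circles()
noisy = add_gaussian_noise(clean, variance=0.01)
print(clean[128][128], clean[128][203], clean[0][0])
```

The noisy image is clipped to the range 0..255, so large noise variances saturate some pixels instead of producing out-of-range intensities.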
The experiment consists of applying to this synthetic image Gaussian noise with zero mean and variance σ² > 0. Afterwards, a 2D n × n adaptive noise removal filter is applied, which reduces the noise but also blurs the frontiers between the inner and outer circles and between the outer circle and the background. Although the value n = 9 was chosen empirically, it is the same for all algorithms in order to put them under the same conditions. Also, in the case of the NESRA method, the value q = 0.001 was used for the synthetic image. This value was chosen empirically, and the good results indicate the non-extensiveness of the system.
Afterwards, a procedure based on the Door-In-Door-Out algorithm was implemented in order to extract the circles' boundaries and then their coordinates. Our goal is to measure the robustness of the methods in recovering the original boundaries under increasing values of noise variance. To this end, we use the original boundary as ground truth and compute how far the estimated one is from it. The coordinates of the inner and outer original curves (ground truth) were obtained in a straightforward manner by analytical calculation, given their radii and center.
The comparison between an original and an estimated curve was carried out through the PolyLine Distance Measure (PDM) method, proposed by Suri et al. [30]-[33]. The PDM is defined as the closest distance from each estimated boundary point to the ideal/ground-truth region of interest.
The closest distance of each estimated boundary point can be the perpendicular distance (shortest Euclidean distance) to one skin-line, or the distance to one of the end boundary points joining the points of the closest interval. The PDM is a measure of the average polyline distance over all boundary points of the estimated and ground-truth curve boundaries.
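The verbal description of the polyline distance can be sketched in Python as follows. This is an illustrative reading of the description, not the exact formulation of [30]-[33]: it assumes closed polylines given as lists of (x, y) vertices, takes for each vertex the shortest distance to any segment of the other polyline (perpendicular distance or distance to a segment end point), and averages these distances symmetrically over both boundaries. All function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Projection parameter, clamped so that the closest point stays on the segment.
    u = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    u = max(0.0, min(1.0, u))
    return math.hypot(px - (ax + u * dx), py - (ay + u * dy))

def point_polyline_distance(p, polyline):
    """Shortest distance from p to a closed polyline (last vertex joins the first)."""
    n = len(polyline)
    return min(point_segment_distance(p, polyline[i], polyline[(i + 1) % n])
               for i in range(n))

def polyline_distance(b1, b2):
    """Average closest distance over all vertices of both boundaries."""
    total = sum(point_polyline_distance(p, b2) for p in b1)
    total += sum(point_polyline_distance(p, b1) for p in b2)
    return total / (len(b1) + len(b2))

# Hypothetical example: a square and a smaller square nested inside it.
outer = [(0, 0), (4, 0), (4, 4), (0, 4)]
inner = [(1, 1), (3, 1), (3, 3), (1, 3)]
print(polyline_distance(outer, inner))
```

The measure is zero when the two boundaries coincide and grows as the estimated boundary moves away from the ground truth.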
Letting B1 be the first boundary and B2 the second, the PDM measure, d_poly^Error, is given by Equation (17), and its maximum value for an N × M image by Equation (18), where R is the larger image's diagonal and r is the average radius of the estimated curve. For fixed N and M, dividing (17) by (18) defines the robustness R of an algorithm in achieving the ideal boundary, Equation (19), which is a percentage of the maximum error. Note that the robustness R → 0 when the error is maximum and R → 1 when the error is minimum. Therefore, in our experiments, we compute R as a function of increasing values of σ² in order to see the general performance of the methods in achieving the ideal boundary. We also compare them under the same conditions and scenarios.
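The closest-distance rule behind the PDM is described here only in words, so a small Python sketch may help. It assumes the commonly used symmetric form of the polyline distance (sum of vertex-to-boundary distances in both directions, divided by the total number of vertices) and closed boundaries; it is not a transcription of Equation (17). The `robustness` helper is likewise an assumption: it takes the maximum error `d_max` as an input and returns 1 - d/d_max, which is merely consistent with the stated limits (0 at maximum error, 1 at zero error). All function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to segment ab.

    This is the perpendicular distance when the projection of p falls
    inside the segment, and the distance to the nearer end point otherwise.
    """
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_to_boundary(p, boundary):
    """Closest distance from p to a closed polyline given as a vertex list."""
    n = len(boundary)
    return min(point_segment_distance(p, boundary[i], boundary[(i + 1) % n])
               for i in range(n))


def polyline_distance(b1, b2):
    """Symmetric average polyline distance between two closed boundaries.

    The two boundaries may have different numbers of vertices.
    """
    total = (sum(point_to_boundary(p, b2) for p in b1)
             + sum(point_to_boundary(p, b1) for p in b2))
    return total / (len(b1) + len(b2))


def robustness(d_error, d_max):
    """Assumed normalisation: 1 for zero error, 0 for the maximum error."""
    return 1.0 - d_error / d_max
```

With this form the distance is zero when the two boundaries coincide and is expressed in pixel units, and nothing requires the two vertex lists to have the same length.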
To compare the robustness of the methods visually, we superimpose the estimated curves on the original images after applying the 2D adaptive filter. To give a visual idea of the result, we illustrate this paper with the segmented synthetic images for the highest variance used (σ² = 0.1).

Fig. 8. Segmentation results of the five algorithms used in this paper (panels: NESRA, Fuzzy C-Means, K-Means, SOM, Bootstrap). Gaussian noise with zero mean and σ² = 0.1, the highest noise level used, was applied to the original image, followed by a 9 × 9 2D adaptive filter to smooth the noise. For the NESRA algorithm we used q = 0.001, since it generated the best visual result, with more homogeneous and noiseless regions.

Fig. 8 shows the segmentation of the synthetic image obtained with the five methodologies. In this figure, for viewing purposes only, each segmented region was assigned a different gray-scale intensity: black for the background, gray for the outer circle and white for the inner circle.
Fig. 9 shows the original (in white) and estimated (in black) curves superimposed over the original image after the addition of Gaussian noise and the application of the 2D adaptive filter. A final plot is shown in Fig. 11, which gives the average percentage of the original area recovered for the inner circle, outer circle and background as a function of increasing Gaussian noise.
The running time of the NESRA algorithm is proportional to that of the binary procedure, O(log N), where N is the image dimension. This is because, as we split the image at each iteration, the number of pixels to be evaluated drops in proportion to N.

VI. DISCUSSION AND CONCLUSION
We have presented a new method for image segmentation based on non-extensive entropy. The method, called NESRA, is a recursive version of the one proposed in [2].
NESRA is a basic segmentation procedure, intended to be used as an initial segmentation step. As such, it was compared under the same conditions with other algorithms, namely Bootstrap, Fuzzy C-Means, K-Means and the SOM network. All of them were run on several classes of real and synthetic images.
The results show that the NESRA procedure is able to retrieve all the main clusters for all tested images. The quality of the retrieved clusters is also appreciable. The proposed technique gives results at least similar to the best performance for three images (sunset, object and ultrasound) and clearly outperforms the other approaches for the lenna image. This indicates not only that NESRA is a good strategy for image segmentation but also that, by considering an image as a non-extensive system, q-entropy is a promising line of investigation for image clustering. The proposed technique gives the best results with the tested images and has lower time complexity, since it reduces the number of pixels examined at each recursion.
The NESRA algorithm retrieved more homogeneous and noiseless regions, avoiding oversegmentation. This is because NESRA always splits the data set into only two new regions, which makes it less sensitive to noise and spurious regions.
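The recursive binary splitting described above can be illustrated with a short Python sketch. It is a hedged reading rather than the authors' implementation: it assumes the standard Tsallis form of the q-entropy, S_q = (1 - Σ p_i^q)/(q - 1), and the pseudo-additive rule S_q(A + B) = S_q(A) + S_q(B) + (1 - q) S_q(A) S_q(B) as the criterion maximized by each binary split. The function names and the `depth` parameter (number of recursions) are hypothetical.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis q-entropy of a distribution; Shannon entropy in the limit q = 1."""
    probs = [p for p in probs if p > 0.0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def best_threshold(hist, lo, hi, q):
    """Return the t in (lo, hi) maximising the q-entropy of [lo, t) and [t, hi).

    Returns None when the range cannot be split into two non-empty classes.
    """
    total = sum(hist[lo:hi])
    best_t, best_score = None, -math.inf
    for t in range(lo + 1, hi):
        mass_a = sum(hist[lo:t])
        mass_b = total - mass_a
        if mass_a == 0 or mass_b == 0:
            continue
        s_a = tsallis_entropy([h / mass_a for h in hist[lo:t]], q)
        s_b = tsallis_entropy([h / mass_b for h in hist[t:hi]], q)
        # Pseudo-additivity of the q-entropy for two independent classes.
        score = s_a + s_b + (1.0 - q) * s_a * s_b
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def recursive_thresholds(hist, q, depth, lo=0, hi=None):
    """Split [lo, hi) in two, then recurse `depth` more times on each half."""
    if hi is None:
        hi = len(hist)
    t = best_threshold(hist, lo, hi, q)
    if t is None:
        return []
    if depth == 0:
        return [t]
    left = recursive_thresholds(hist, q, depth - 1, lo, t)
    right = recursive_thresholds(hist, q, depth - 1, t, hi)
    return left + [t] + right
```

With `depth=0` the sketch returns a single threshold (plain binarization). With `depth=2` it returns up to seven thresholds, that is, up to eight luminance ranges, and every threshold still comes from a two-class split of a sub-range of the histogram.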
It may be argued that any algorithm for binary segmentation (e.g., the well-known iterative threshold) may be extended to a recursive version and applied like NESRA. However, NESRA always splits the luminance space into two blocks, generating background and foreground regions that maximize the information (associated entropy). In comparison with the traditional method proposed by Pun [25], which uses the Shannon entropy, NESRA performs better for all images presented in Figs. 2-5.
This indicates that the advantage of NESRA lies not only in its recursion but also in a better separation between foreground and background, which may be due to the flexibility of the q parameter.
Regarding the choice of the q parameter, its value cannot yet be computed automatically. As future work, we propose to investigate the optimal q through an iterative method, since the entropic equations are functions of q. For now, depending on the class of image, the value of q may be taken from a small discrete range within 0 ≤ q ≤ 1. This is the case for all experiments in this paper. As can be seen in all the results, the results change little when q varies. Generally speaking, the q value may be used as a tuning parameter when we consider the system to be non-extensive.

Fig. 1.c we have a photo of the Columbia database, which was taken under controlled light and

Fig. 1. The four classes of images used in our experiments.

Fig. 2. Segmentation results for Fig. 1.a: with the non-recursive Tsallis entropy algorithm (a-c) and with our proposed extended algorithm (NESRA) (d-f), for three different q values and two recursions. t stands for the retrieved threshold.
. 1.b is an example). This image shows a benign region at the center of the image with a low SNR background and a coarse transition on the lesion boundary. Adequate extraction of the breast lesion from ultrasound images is an important and open problem, as breast cancer ranks second in the list of women's cancers and Computer Aided Diagnosis (CAD) has become

Fig. 3. Segmentation results for Fig. 1.b: with the non-recursive Tsallis entropy algorithm (a-c) and with our proposed extended algorithm (NESRA) (d-f), for three different q values and two recursions. t stands for the achieved thresholds.

Fig. 5. The segmentation of the lenna image for q = {0, 0.5, 1.0}. The first row presents the simple non-extensive algorithm and the second row presents the corresponding recursive version. t stands for the retrieved threshold value.

Fig. 7. The synthetic image used to compare the robustness of the methods. The two concentric circles have radii 100 and 50, and the intensities of the background, outer circle and inner circle are 150, 100 and 50, respectively. The leftmost image is the original image; the three others, from left to right, have Gaussian noise with µ = 0 and σ² = 0.01, 0.05 and 0.1, respectively.

Fig. 9. The estimated (black) and original (white) curves superimposed over the original image, corresponding to the segmentations of Fig. 8.

Fig. 10. Comparative performance of the five methods as a function of increasing Gaussian noise. The left image is for the outer circle and the right one is for the inner circle. The x-axis is σ² and the y-axis is R according to Equation (19). std is the standard deviation.

The first application of NESRA achieves the partition value t_opt = 101. Then, the process is repeated (first recursion) on the range t < 101, achieving t_opt = 55, and on the range t ≥ 101, achieving t_opt = 135. Then, a second recursion is applied to the ranges t < 55, achieving t_opt = 39; 55 ≤ t < 101, achieving t_opt = 81; 101 ≤ t < 135, achieving t_opt = 117; and t ≥ 135, achieving t_opt = 164. Therefore, the image of Fig. 2.g has the following eight ranges of luminance distributions: 0 ≤ t1 < 39 ≤ t2 < 55 ≤ t3 < 81 ≤ t4 < 101 ≤ t5 < 117 ≤ t6 < 135 ≤ t7 < 164 ≤ t8 < 256. We have applied arbitrary luminance values for each one of these ranges. The same reasoning can be carried out to the images of Fig. 2.h and Fig. 2.i,

Also, the PDM Equation (17) does not need the two boundaries B1 and B2 to have the same length. Note the two main characteristics of the PDM equation: first, as B1 → B2, d_poly^Error → 0; second, it is a distance in pixel units. If N × M is the image dimension, it is straightforward to show that the maximum value of d_poly^Error is given by Equation (18).