Abstract
For applications that cannot be shielded sufficiently by a passive shield alone, it is useful to combine a passive and an active shield: the active shield fine-tunes the field reduction that is mainly achieved by the passive shield. The design requires the optimization of the geometry of the passive shield, the position of all coils of the active shield, and the real and imaginary components of the currents (when working in the frequency domain). Because there are many variables, the computational effort for the optimization becomes very large. An optimization using genetic algorithms is compared with a classical gradient optimization and with a design sensitivity approach that uses an adjoint system. Several types of active and/or passive shields with constraints are designed. For each type, the optimization was carried out with all three techniques in order to compare them in terms of CPU time and accuracy.
1. Introduction
As the computational cost has decreased significantly in recent years, more and more commercial software packages offer standard algorithms for optimization in order to solve design problems. Many researchers in industry and in academia now use commercial finite-element software in combination with commercial optimization routines, such as genetic algorithms. This approach is usually successful but very time consuming. For example, the design of a combined active and passive shield in [1] required between 8 and 14 days of CPU time. In spite of the increased computational power of computers, the conventional combination of a genetic algorithm and finite elements remains a computational challenge.
This paper is devoted to designing, in much less time, a passive and active shield for the same axisymmetric induction heater discussed in [1] and shown in Figure 1. To this end, not only is another algorithm explored (the adjoint variable method, explained in Section 3), but the optimization problem itself is also changed in order to take advantage of the adjoint system. The optimization technique is similar to [2, 3] and allows the designer to use commercial software. The numerical model is quasistatic (and not static as in [2, 3]), which results in complex equations in the frequency domain (instead of real equations), and is based on Maxwell's equations. Both currents and geometrical shapes are optimized. Moreover, the optimization is multiobjective, and all five contributions to the cost are modelled in the same adjoint system. After validating the adjoint variable approach in Section 4, the method is compared in Section 5 with a classical gradient algorithm and with a genetic algorithm regarding the computation time and the efficiency of the shield.
2. Magnetic Shielding Problem
A magnetic shield is developed to reduce the magnetic stray field of an axisymmetric induction heating device for the heat treatment of aluminum discs. After adding the shield, the shielded device should comply with the reference levels of ICNIRP [4] or the European Community [5]. In addition to the field reduction in a predefined “target area,” several other constraints are involved in a magnetic shielding problem: limits on the resistive heating in the shields and on the influence on the heating process, development costs, and geometrical constraints to guarantee the accessibility of the shielded device. In the literature, many papers have been published concerning passive shields [6], studying the effect of dimensions, permeability (including nonlinearity, hysteresis), and conductivity of the metal sheets on the shielding effectiveness. A number of papers have been written about active shielding [7–9]: field reduction caused by a counter field that is produced by controlling the currents in a number of compensation coils.
Passive shields can be very efficient, especially at frequencies above 1 kHz where induced currents become effective, and particularly if they are closed, that is, if they completely enclose the source like a Faraday cage. Active shields can be more attractive at lower frequencies, especially if the compensation coils are close to the source, because the generator of the compensation current can then have a lower voltage and current rating. However, many devices cannot easily be shielded by only a passive or only an active shield because of geometrical constraints. It is shown in [1] that a combination of both types of shielding may improve the performance significantly. In such a combination, the passive shield should be rather close to the source. By flux shunting or by induced currents, the passive shield causes a significant field reduction in the target area, but this reduction is usually not homogeneous: a high field reduction is achieved in parts of the target region, while high fields still occur in other parts. Active shields can additionally reduce the field in these remaining regions.
The equations to solve are Maxwell's equations in the frequency domain:where the total current density consists of the external current density and the induced current density due to the conductivity and the electric field . The system of (2.1) is solved together with the constitutive law and with the following boundary conditions (see Figure 2):expressing the fact that the electric field should be perpendicular to and that the magnetic field should be perpendicular to . Introducing the vector potential so that leads toSubstitution of the electric field yields the equation to be solved by a finite-element model (FEM) [10]:which is written in weak formulation with test variable :taking into account the boundary conditions. The variable p contains all parameters to be optimized: the position and height of the passive shield with thickness in steel, and the (complex) currents and positions , of the active shield coils in the axisymmetric problem. Symbol denotes the standard scalar product defined by Notice that and depend on the position and size of the passive shield. Therefore, they both depend on . The weak formulation has the advantage that and can be discontinuous quantities.
The shielding application that we study is axisymmetric so that the vector potential can be written as . Here, is the azimuthal component of the vector potential of which the - and -components are zero. Maxwell's equations in the frequency domain (2.5) are in the axisymmetric case:On the boundary (see Figure 2) we prescribewhere
The objective value is a function of the scalar potential and of . First of all, it takes into account the magnetic field in the target area (): the main objective is to minimize the magnetic induction in the reference region This can be achieved by minimizing the cost functionalIn order to reduce the complexity of the cost function, we ignore the last terms and choose as cost value. Other objective values in the multiobjective optimization are the dissipation in the passive shield (), the dissipation in the active shield (), the change of the heating of the workpiece by the adding of shields (), and the volume of the passive shield as an investment cost ():Herein, is the cross-section of an active shield coil and is the resistivity of the coil material. The constant is the power dissipated in the workpiece without shields present. The weighting factors , should be chosen as explained in Section 4.
3. Adjoint System
In order to use any gradient-based optimization technique, one has to evaluate the derivative of with respect to in some direction . For simplicity of the exposition, we use the symbol defined asFormal derivation of the cost functional gives uswhere particular derivatives read asThe symbol “" denotes the complex conjugate. The extraboundary integral in is necessary because the domain , on which is evaluated, changes in size when the height of the passive shield changes. Here, we have to evaluate the topological derivative, which depends on concrete settings. In our case of a rectangular passive shield domain between -coordinates and -coordinates , where we change just one dimension, height , we have to evaluateWe used here the mean value theorem. Thus in our case,where is the upper boundary of .
The derivative is used to handle the variation of domain size, considering a transition region. For , one can use an explicit expression: instead of modelling one solid domain with high conductivity (dashed line in Figure 3), we model next to the domain with high conductivity a small transition domain (the domains 2-3-4-5 in solid line in Figure 3) that surrounds the domain with high . In this transition zone, the conductivity decreases linearly from to zero. The derivative differs from zero in this transition domain only.
For , however, one needs to set up corresponding PDE. For clarity, the dependence of is not explicitly written any more in the following equations. Differentiation of (2.6) with respect to results inWe obtain a PDE for . This is, however, not so practical. For the implementation, one needs to know a partial derivative in one component of . Imagine consists of three components. So for computations we need to know three values , , and , that is, we need to compute for three different values of , namely, , , and This is, however, time-consuming since every computation of means solving one PDE.
To solve this problem, we apply the so-called adjoint variable method. This method was successfully used in the optimal design problem in micromagnetism involving the shape optimization of the ferromagnetic core in MRAM memories [11]. We set up a dual problem to (3.6). Find such that for all , the following equality holds:where the functional is composed of those terms from that contain , namely,The functional serves as a pseudosource for the adjoint problem derived from the cost functional. Since the terms in the cost functional are nonzero only on the corresponding subdomains , and , so is the pseudosource. Notice that the cost terms of the active shield dissipation and the volume of the passive shield () in (2.9) do not result in a pseudosource in the dual problem. Although (3.7) and (3.8) represent only one adjoint system, its solution requires two finite-element evaluations. Indeed, the test functions are real-valued functions in the commercial software we use while they should be complex in order to solve (3.7) and (3.8). In practice, the adjoint system is solved a first time using only the real parts of the pseudosources. Afterwards, it is solved a second time using the imaginary parts of the pseudosources. This means that in total three finite-element calculations are needed to obtain the objective value and all gradients. Possibly, as the two adjoint calculations are independent from each other, they can be executed in parallel on appropriate computer architectures.
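The two-pass solution of the adjoint system rests on linearity, which can be illustrated with plain linear algebra. The sketch below is not the finite-element implementation: the matrix `A` is a hypothetical small complex matrix standing in for the discretized adjoint operator, and `g` is a hypothetical complex right-hand side standing in for the pseudosource. Solving once with the real part of `g` and once with its imaginary part, and then recombining, should reproduce the solution for the full complex source.

```python
import numpy as np

def solve_in_two_passes(A, g):
    """Solve A x = g by superposing the responses to Re(g) and Im(g)."""
    x_from_real_source = np.linalg.solve(A, g.real.astype(complex))
    x_from_imag_source = np.linalg.solve(A, g.imag.astype(complex))
    # Linearity: A (x_r + 1j * x_i) = Re(g) + 1j * Im(g) = g
    return x_from_real_source + 1j * x_from_imag_source

# Hypothetical 4x4 complex system (stand-in for the discretized adjoint problem).
rng = np.random.default_rng(1)
A = 4.0 * np.eye(4) + rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
g = rng.standard_normal(4) + 1j * rng.standard_normal(4)

x_two_pass = solve_in_two_passes(A, g)
x_direct = np.linalg.solve(A, g)
print(np.allclose(x_two_pass, x_direct))
```

The two solves inside `solve_in_two_passes` do not depend on each other, which is why they could run in parallel.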
Knowing the solution of the dual problem (3.7), one can settle the following result: by setting in (3.7) and by setting in (3.6) we arrive atWhen comparing obtained by setting in (3.8) with the derivative of the cost (3.3), it can be observed that all terms occurring in (3.8) also occur in (3.3). By adding the missing terms, we obtain the explicit expression for the derivative of the cost:With this explicit expression for the derivative of with respect to , we are able to evaluate for any . We just need to solve the PDE for .
The actual minimization algorithm starts from some initial guess, for example, or another value obtained, for instance, from a genetic algorithm.
The gradient-adjoint optimization algorithm can in principle be seen as a classical gradient algorithm that tries to minimize a cost function by iteratively evaluating a finite-element model (step 2). There is, however, one major difference: the gradients are obtained by solving another (adjoint) finite-element model (step 3) and by using the solution in an expression for the gradient (step 4). In the classical algorithm, the gradients are determined by applying perturbations in the parameters.
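The difference between the two ways of obtaining gradients can be sketched on a toy problem. Everything in the example below is an assumption made for illustration: the finite-element model is replaced by a small complex linear system `K(p) u = f` whose matrix depends linearly on real parameters `p`, and the cost is `J(p) = ||C u||^2` with a hypothetical real matrix `C`. In this toy setting a single complex adjoint solve suffices, whereas the perturbation approach needs one extra forward solve per parameter.

```python
import numpy as np

def make_toy_problem(n_state=6, n_param=4, seed=0):
    """Hypothetical stand-in for the finite-element model: K(p) u = f."""
    rng = np.random.default_rng(seed)
    K0 = 5.0 * np.eye(n_state) + 0.1j * rng.standard_normal((n_state, n_state))
    dK = [0.1 * (rng.standard_normal((n_state, n_state))
                 + 1j * rng.standard_normal((n_state, n_state)))
          for _ in range(n_param)]
    f = rng.standard_normal(n_state) + 1j * rng.standard_normal(n_state)
    C = rng.standard_normal((3, n_state))  # real "observation" matrix
    return K0, dK, f, C

def solve_state(p, K0, dK, f):
    """Forward solve; K depends linearly on the real parameters p."""
    K = K0 + sum(p_i * K_i for p_i, K_i in zip(p, dK))
    return K, np.linalg.solve(K, f)

def cost(p, K0, dK, f, C):
    """Real-valued cost J(p) = ||C u(p)||^2."""
    _, u = solve_state(p, K0, dK, f)
    return float(np.vdot(C @ u, C @ u).real)

def gradient_adjoint(p, K0, dK, f, C):
    """One forward solve plus one adjoint solve, independent of len(p)."""
    K, u = solve_state(p, K0, dK, f)
    lam = np.linalg.solve(K.conj().T, C.T @ (C @ u))  # adjoint system
    # dJ/dp_i = -2 Re(lam^H (dK/dp_i) u)
    return np.array([-2.0 * np.vdot(lam, K_i @ u).real for K_i in dK])

def gradient_forward_difference(p, K0, dK, f, C, step=1e-6):
    """Perturbation approach: one extra forward solve per parameter."""
    base = cost(p, K0, dK, f, C)
    grad = np.zeros(len(p))
    for i in range(len(p)):
        q = np.array(p, dtype=float)
        q[i] += step
        grad[i] = (cost(q, K0, dK, f, C) - base) / step
    return grad

problem = make_toy_problem()
p0 = np.array([0.5, -0.3, 0.2, 0.1])
print(gradient_adjoint(p0, *problem))
print(gradient_forward_difference(p0, *problem))
```

Either gradient could then be handed to the same descent or SQP update; only the cost of producing it differs.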
4. Validation of the Adjoint Variable Method
The application for which shields are designed is an axisymmetric induction heating device with an excitation coil of 0.2012 m radius carrying a current of 4000 A at 1 kHz. The workpiece of 10 mm height and 191 mm radius is made from aluminum.
We now add a passive shield of 0.65 mm thickness at radial position 0.3 m. Figure 4 shows the several cost terms from (2.9) using the weighting factors , = 8 , = 2 , = 2 , and as a function of the passive shield height. The choice of the weighting factors depends on the relative importance that is assigned to each cost term. No active shield was considered in this situation (). It can be seen that the term related to the magnetic field in the target area decreases strongly with increasing height of the passive shield, which is a purely conductive shield with and S/m. The losses caused by induced currents in this shield increase with the height for rather low shields with m, but remain almost constant for m. The cost increases with the height of the passive shield: as the shield is conductive, it reduces the flux produced by the excitation current. Consequently, the heating induced in the workpiece decreases, which causes the cost .
Figure 5 illustrates the derivatives with respect to the parameter , calculated on the one hand by the adjoint method and on the other hand by the conventional approach: applying a perturbation of 2 mm in the height of the shield, re-evaluating the cost, and approximating the derivative by dividing the difference between the two cost values by the perturbation step. The correspondence is good for this conductive shield. However, the finite-element mesh in the 1.25 mm wide and 2 mm high transition domain, where the relevant terms of (3.10) are integrated, had a maximal edge size of 0.1 mm: much finer than the mesh needed to obtain accurate gradients by the conventional method (0.25 mm).
In order to illustrate the limits of the adjoint method, we compared the conventional and adjoint gradients for several permeabilities in Figure 6. In the expression for the adjoint gradients (3.10), the term appears. For high permeability in the shield, the coefficient is very small in the region of the transition zone near the shield because here the permeability is high and is very small. In the region of the transition zone near the air, the relative permeability is close to 1 and the term is much larger. The mesh refinement for sufficient accuracy is here much more critical than for a conductive shield. As can be observed in Figure 6 for , the position at which the adjoint gradient crosses zero—important to find the correct solution in optimization problems—is about 30% too low.
Figure 7 depicts the gradients as a function of the current in an active shield coil and as a function of the position of such a coil. For the amplitude of the current, the accuracy is comparable to that of classical derivatives, even for a coarse mesh. The first of the three main reasons for the better accuracy compared to of the passive shield is that the shape is square in contrast to the shape of the passive shield, which is a very narrow and tall rectangle. Secondly, the term is constant and not quadratic like . Thirdly, the quantities and to integrate on the active shield coil transition domain are much smoother than the quantities and to integrate on the passive shield transition domain. For the position, however, the difference is larger (up to 20%, as can be seen in Figure 7(b)). The reason for this is that the calculation of the gradient requires the subtraction of integrals in two domains: the domains 2 and 4 in Figure 3 if the object is an active shield coil that moves in the horizontal direction. The integrals have the same order of magnitude, which makes their difference sensitive to numerical noise.
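The sensitivity of a difference of two nearly equal integrals can be made concrete with a small calculation. The numbers below are hypothetical and are not taken from the finite-element results; the sketch only shows how a small relative error in each integral is amplified in their difference.

```python
def relative_error_amplification(integral_a, integral_b, relative_noise):
    """Worst-case relative error of (a - b) when a and b each carry the
    given relative error with opposite signs."""
    exact_difference = integral_a - integral_b
    worst_case_shift = (abs(integral_a) + abs(integral_b)) * relative_noise
    return worst_case_shift / abs(exact_difference)

# Hypothetical values: two integrals that agree to within 1 %,
# each evaluated with a relative error of 0.1 %.
amplified = relative_error_amplification(1.00, 0.99, relative_noise=1e-3)
print(f"{amplified:.1%}")
```

With these illustrative values, an error of 0.1% in each integral grows to roughly 20% in the difference, which is why a finer mesh is needed for position gradients than for current gradients.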
5. Comparison of Optimization Techniques for the Shielding Problem
The parameters to optimize are the height of the passive shield (between 10 and 200 mm), the horizontal position of each compensation coil (between 0.250 and 0.900 m and m), and the real and imaginary component of the compensation current in each coil (between −400 and +400 A, which is 10% of the excitation current). The radius of the passive shield (0.3 m) and the vertical position of the compensation coils (1.15 m) were not optimized. For all optimizations described below, genetic algorithms are compared with the conventional gradient method and with the gradient method that uses the adjoint system. The efficiency of several designed shields is also compared with the efficiency of the shields in [1] for the same induction heating application.
Genetic algorithms are stochastic optimization routines that iteratively search for a global optimum by applying selection, mutation, and recombination (cross-over) techniques [12] to a “population.” The initial population consists of individuals whose parameters are randomly chosen within the boundaries.
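The selection, recombination, and mutation steps can be sketched in a few lines. The example below is a minimal real-coded genetic algorithm written for illustration only; it is not the routine of [12], and the cost function `toy_cost`, the population size, and the mutation settings are all assumptions. Only the parameter bounds are borrowed from the shielding problem.

```python
import random

def genetic_minimize(cost, bounds, pop_size=30, generations=40,
                     mutation_rate=0.2, seed=0):
    """Minimal real-coded genetic algorithm with elitism (illustrative)."""
    rng = random.Random(seed)
    # Initial population: parameters drawn uniformly within the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best cost per generation
    for _ in range(generations):
        ranked = sorted(population, key=cost)
        history.append(cost(ranked[0]))
        parents = ranked[: pop_size // 2]      # selection: keep the better half
        children = [ranked[0][:]]              # elitism: best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Recombination: convex combination of two parents.
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            # Mutation: random Gaussian shift, clipped to the bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    shifted = child[i] + rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = max(lo, min(hi, shifted))
            children.append(child)
        population = children
    best = min(population, key=cost)
    return best, cost(best), history

def toy_cost(x):
    """Hypothetical smooth cost; NOT the finite-element objective."""
    height_mm, radius_m = x
    return ((height_mm - 100.0) / 100.0) ** 2 + (radius_m - 0.5) ** 2

bounds = [(10.0, 200.0), (0.25, 0.9)]  # shield height in mm, coil radius in m
best, best_cost, history = genetic_minimize(toy_cost, bounds)
print(best, best_cost)
```

In the actual design problem every call to `cost` would be a finite-element evaluation, which is where the long computation times of the genetic algorithm come from.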
The classical gradient algorithm finds the solution of the constrained nonlinear optimization problem by using sequential quadratic programming (SQP) [13]. The minimum is found using numerical approximations of the Jacobian and the Hessian. Each element of the Jacobian is determined by applying a small perturbation in one parameter and re-evaluating the cost value. In case of parameters to optimize, this procedure requires evaluations if the gradient is approximated by or if the gradient is approximated by . For gradient algorithms, the starting value is important in order to have convergence. The starting height of the passive shield was chosen arbitrarily to be 60 mm. The starting values for the active shield parameters were chosen taking the following considerations into account: the coils should be positioned rather close to the axis of symmetry, because the field lines enter the target area from that zone. The coil with the smallest radius should have the highest currents because the field intensity decreases with increasing . The ratio between the real and imaginary part of the current was chosen to be equal to in case of a 90-mm high passive shield with .
The design sensitivity approach also solves a constrained nonlinear optimization problem without, however, the need to evaluate the model or times in order to determine the gradients. The gradients are retrieved from the solution of the adjoint system, which requires two extra solutions (one for the real part of the source term and one for the imaginary part), however, without the need to generate a new geometry or mesh. Similar to the classical gradient algorithm, the starting value is important and the solution may be a local minimum.
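The scaling of the three ways of obtaining a gradient can be summarized in a small helper. The counts below are an assumption based on the standard finite-difference formulas (a forward difference reuses one base evaluation, a central difference perturbs each parameter in both directions) and on the three finite-element solves of the adjoint approach described in Section 3; the function name is hypothetical.

```python
def solves_per_gradient(n_params):
    """Model solves needed for the cost value plus its full gradient."""
    return {
        # base point + one perturbed evaluation per parameter
        "forward_difference": n_params + 1,
        # base point for the cost value + two perturbed evaluations per parameter
        "central_difference": 2 * n_params + 1,
        # forward solve + adjoint solves for the real and imaginary source parts
        "adjoint": 3,
    }

# Parameter counts of the optimization cases considered in this section.
for n in (1, 3, 4, 9):
    print(n, solves_per_gradient(n))
```

These counts cover a single gradient evaluation only; the total CPU time also depends on the number of iterations and on the mesh refinement that each method needs.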
The following optimizations are carried out each time using the three algorithms (see Table 1).
(1) Optimization of the Height of the Passive Shield (1 Parameter)
In optimization 1a, the passive shield has conductivity S/m and permeability . All terms in the cost function and their derivatives can be seen in Figures 4 and 5, respectively, as well as the position of the optimum at 83.9 mm and its cost of 4.3924. Table 1 shows that the optimum is found by the conventional gradient method and by the genetic algorithm. The adjoint method finds a very good approximation (80.7 mm) of the optimum and a slightly higher objective value. This can be explained by the fact that the gradients of the two methods almost coincide in Figure 5. Evidently, for only one parameter to optimize, the adjoint method is slower than the conventional gradient method. The genetic algorithm is very slow.
In optimization 1b, the shield has the same conductivity S/m and also a permeability of . In Figure 6, it is observed that in this case the accuracy of the adjoint method is much lower. This explains why the optimal height found by the adjoint method is 71.7 mm (the height at which the adjoint gradient crosses zero) instead of 100.1 mm. The difference in cost, however, is very small (see Table 1), so the solution obtained by the adjoint method is still acceptable. Notice that the shield with high permeability has a lower cost value than the purely conductive one. The average field in the target area is 7.76 µT versus 12.05 µT for the non-ferromagnetic shield: a reduction compared to the average induction in the unshielded case (28.9 µT) of 11.4 and 7.6 decibel, respectively. The optimal height of the non-ferromagnetic shield is lower than the optimal height of the ferromagnetic one, because it has more losses ( versus 0.51) and much more disturbance of the heating process ( versus 0.61).
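The quoted reductions can be checked from the field ratios 28.9/7.76 and 28.9/12.05. The short sketch below assumes the usual convention of 20 log10 for ratios of field quantities; the function name is hypothetical.

```python
import math

def field_reduction_db(unshielded, shielded):
    """Shielding effect in decibel for a ratio of field amplitudes."""
    return 20.0 * math.log10(unshielded / shielded)

# Average induction values quoted in the text (same unit for all three).
print(round(field_reduction_db(28.9, 7.76), 1))   # ferromagnetic shield
print(round(field_reduction_db(28.9, 12.05), 1))  # non-ferromagnetic shield
```

Under this convention the two ratios give values that round to the 11.4 and 7.6 decibel stated above.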
(2) Optimization of an Active Shield Consisting of Only One Coil
The complex current and the horizontal position are optimized (3 parameters). A passive shield with height = 90 mm is present but not optimized. In order to test the gradient algorithms, a “bad” starting value was chosen = . The genetic algorithm found a very good solution while the gradient algorithms found (conventional) and (adjoint), both having almost the same cost and requiring almost the same CPU time. The correct position is not found by the gradient algorithms, but their solution has a cost that is only 0.5% higher than that of the genetic algorithm.
(3) Optimization of Both the Passive and the Active Shield with One Coil
In addition to the three parameters of case 2, the height of the passive shield is also optimized (4 parameters in total). In case 3a, the starting value was chosen according to the description above: , resulting in a slightly higher cost for the conventional gradient algorithm compared to the genetic algorithm that found the optimal value: . In case 3b, the starting values for the active shield are chosen equal to the optimum of case 2. This time, the gradient algorithm finds the global minimum. Both for 3a and 3b, the gradient method using the adjoint system ends up with a higher cost because the height of the passive shield is chosen too low for the same reason as in case 1b. The calculation times of both gradient algorithms are of the same order of magnitude.
(4) Optimization of an Active Shield Consisting of 3 Coils
This optimization results in 9 optimization parameters (three times real part of the current, imaginary part of the current, and horizontal position). The starting positions were 0.3, 0.4, and 0.6 m, the starting values for the currents were and . Neither gradient algorithm found the low cost value of the genetic algorithm (2.86) or the corresponding optimum [219.69, −72.442, 89.033, 17.53, −73.202, −6.1, 0.274, 0.404, 0.891]. For 9 variables, the approach using the adjoint variable is much faster than the conventional algorithm. Moreover, it finds a lower cost value although the gradients do not deviate much from the conventional ones.
When comparing these results to the ones from [1], it is seen that the calculation time of the genetic algorithm is much shorter than that of the genetic algorithm in the cited paper. This is because the objective function in the latter paper contained one finite-element calculation for every compensation coil considered, as well as an inner genetic algorithm nested in the main objective function, and up to nine compensation coils. The resulting shield had a better efficiency than the ones presented here (with fewer compensation coils), but the CPU time could be reduced from several days to a few hours. The gradient techniques solve the problem in less than one hour, but their solution usually has a slightly higher cost value.
6. Conclusion
The gradient algorithm using an adjoint system was compared to a conventional gradient method and a genetic algorithm for the design of a passive and active shield. It is observed that the accuracy of the adjoint variable method is good regarding the height of a conductive, non-ferromagnetic shield and regarding the current in compensation coils. The accuracy is somewhat lower concerning the position of active shield coils. For a shield with high permeability, the accuracy may be unacceptable. For one evaluation of the objective function, the CPU time of the adjoint approach is about three times longer (as it requires two extra finite-element calculations). The optimization using the adjoint variables is slower than the conventional gradient method in the case of fewer than three parameters to optimize, comparable in the case of three or four parameters, and faster in the case of more than four parameters.
Acknowledgments
This work was supported by the FWO project G.0082.06, by the GOA project BOF 07/GOA/006 and the IAP project P6/21. P. Sergeant and I. Cimrak are supported by the Fund for Scientific Research—Flanders FWO (Belgium).