Published: 2022-06-18 | Categories: [»] Tutorialsand[»] Optics.

Up to now we have developed a pretty extended methodology to analyze paraxial systems based on matrixial formulation. We will now use this toolbox to implement our second improvement over traditional optical design softwares!

In the traditional design workflow, the optical designer first develops an extended nominal design to which a lot of care is applied (tons of optimization to compensate aberrations over the whole field of view etc.) before he even questions himself about manufacturability of his design in details. More specifically, how the manufacturing process may influence the system performances and how sensitive the system can be to errors in manufacturing. It is not a secret that, when ordering parts, you never get exactly what you asked for but something close to it. Thicknesses of lenses, radii of curvature, refractive indices are all subject to small changes from what you initially expect, and these changes can sometimes severely impact your design.

Before placing an order for his lenses, the optical designer will run a tolerance analysis. It is an extremely costly operation where hundreds of copies of his design are evaluated (spot diagram, mtf, magnification change etc.) for random tiny changes over the various parameters of the system (thicknesses, radius of curvature, residual shape error etc.). Sometimes the conclusion of the analysis is that the system can be completely thrown away because it exists no manufacturing technology that is able to provide errors small enough to yield acceptable performances. Most of the time however the analysis helps to identify which of the elements are the most sensitive and orient redesign or tighten tolerancing (paying more for some of the element to get better performances). It still remains an expensive operation that occurs very late in the design phase and this goes really against the whole Agile methodology.

What I propose here is to integrate a part of tolerancing very early in the design phase. While it is not possible to do everything upfront, some effects can be analyzed even before actual lenses are inserted – only from a paraxial description of the system! These are also the elements that may have the most dramatic effect on the system performances such as large deviation of the focal planes, erratic magnification or excessive wavefront fluctuations from builds to builds.

Traditional optical design software evaluates tolerancing effects through a Monte-Carlo procedure. Copies of the nominal system are generated and randomly perturbated and histograms are built for given system parameters. As an example, the analysis will conclude that 90% of the systems have wavefront error less than, say, λ/2 while 15% will be better than λ/5. That means if you would build 100 systems, you could expect to have about 15 of them with performances better than λ/5 and 90 of them better than λ/2. From this analysis, it becomes merely a strategic decision on how you are going to market your system. Do you need only a few pieces of very high performances? Can you bin your product and sell them at different prices depending on their quality? It is not uncommon in single-shot projects to order up to five objectives when you need only one or two just to be sure that you will meet the hardware requirements!

For paraxial systems, we can use a very different strategy that is also much faster. It is based on an analytical sensitivity analysis of the system.

Imagine you have an analytical formula that tells you what is the focal length of a system composed of a single lens knowing its parameters R1, R2, t and n. From a Taylor expansion limited to the first order you also expect that the function will be linear for very small deviations to the nominal values

where the partial derivatives ∂f/∂… are computed around the nominal values R1, R2, t and n.

That is, the focal length of the system perturbated by the values R1, R2, t and n can be computed from the nominal focal length plus terms that depend on both the partial derivatives of the analytical formula and the perturbations value.

The error in focal length is therefore given by the deviation to the nominal value

In practice, the deviations will be random and are therefore best described by their statistical distributions. As a consequence, f, the error on the focal length, also becomes a statistical variable. If we consider that each source of errors R1, R2, t and n are independent random variables, we can sum their variances and get a statistical estimation of the overall focal length variance as

For a more complex system with many parameters xj, the formula generalizes to

The values xj must be entered by the optical designer and translate the accuracy requested (or achievable) during the manufacturing process. The value f is, in our example, an expectation of how much the focal length is susceptible to change given the requested accuracy. Knowing this value, we can either adapt other properties of the system (e.g. include some compensation mechanism for focusing), or simply reject the design as being too sensitive. We could also compute where we can loosen-up some tolerances or make other tighten to achieve a desired target (e.g. max 2% change in overall focal length).

That still leaves the question of the f/∂xj that we have to compute. This has to be done either numerically or analytically. I chose here to go for the analytical path.

Knowing that, in our [»] ABCD matrix formalism, the effective focal length is given by

we are therefore looking to evaluate analytically

The effective focal length is not the only property that we can evaluate as we also have the back-focal and front-focal lengths:

Any other property based on the ABCD matrix can be computed as well. For instance the paraxial focus image position is

and therefore

You can directly reuse the derivative of EFL, BFL and FFL computed here-above. By the way, the sets of xj can also include the value of o which is also a system parameter! Including it or not is merely a question of what you are trying to compute at the end. If the object (e.g. a target) is included in the mechanics and have a fixed position with some uncertainty, you might be interested to see by how much the image plane shift. On the other hand, if you can freely adjust the object position to get a sharp image then you can throw away any variations of o as they don’t make sense in that case (f/∂o=0).

All the formula shown here-above relied on the derivatives of the ABCD matrix

Knowing that the matrix of the system can be written as

where Mi is the matrix representation of the i-th element of the system and Mi is the concatenated matrix of the system

We can apply the derivative multiplication rule

which can be applied recursively on the Mn-1 concatenated matrices.

We can identify two important cases for the thin paraxial lens and the thickness matrices:

but we can also derive formula for the thick-lens system, the plane window system etc.

Although you can develop the analytical expression of M/∂xj for simple systems, you will soon struggle as the complexity increases. This applies not only to complex systems but also to more complex quantities. For instance, in our [»] previous post we have shown how to approximate the wavefront through the third-order aberration theory and Seidel formula:

with the Seidel terms summed over the N interfaces of the system


and (h,u), (h,u) are the marginal and chief ray height and angles respectively, which can be obtained from the Mi matrices.

The rms of the wavefront can also be computed directly from the coefficients ci through

where the Qi,j are well-defined coefficients for the aperture shape.

Computing the rms/∂xj for every xj of the system therefore requires a lot of computations, even for simple cases – a task that would be extremely cumbersome to do by hand!

To solve this problem, I spent some time building a symbolic math solver to compute analytical solutions for expressions like the ones we are facing (focal lengths, image position, wavefront rms…). A sample C++ program can be downloaded on our [∞] companion website under a LGPL license. The results in this post refers to version 1.1 of the software.

In the approach I followed, every property of the system (focal length, wavefront rms etc.) are represented as a tree where leaves are the toleranced parameters (radii of curvatures, thicknesses, indices of refraction…) and nodes are mathematical operators (addition, multiplication, square root etc.). This is similar to the Hariharan tree structure I gave in my earlier post about [»] job systems.

Evaluating the direct value of the property is easy as you just have to follow the tree and evaluate each node bottom-up. Evaluating derivatives is not much more complex because you can use the well-known [∞] analytical differentiation rules. You can create a new tree for each node based on its known analytical differentiation.

For instance, Figure 1 represents the tree structure conversion of the operation x0*x1 for which we expect the differential to be x0/∂xj * x1 + x0 * ∂x1/∂xj.

Figure 1 – Tree structure conversion for the multiplication node

Leaves have a special differentiation rule which is

Each property differential therefore has its own (new) tree structure that is based on the initial property tree. Evaluation is then performed bottom-up as for any other tree, except this time you get the property partial derivative to xj.

In practice, some tricks are necessary to keep the performances to acceptable levels:

Null branches shall be pruned as they are created. If an expression always yields zero, it should be removed from the tree to simplify it. Upon differentiation, many branches of the tree will yield null results just because the leaves differential are evaluated to zero when the parameters do not match (xi/∂xj=0 for all i≠j). This will bloat your tree structure and hinder your performances. I experienced 5× speed-ups by pruning empty branches in the example. I did not tracked memory but I expect similar performances increase there as well.

Proxy values shall be used to avoid recomputing the same operations. Upon differentiation of operators like multiplication, division and square roots, entire branches will be repeated again and again. By default, they will be evaluated every time they are repeated which is particularly slow. By using proxies, you can reuse pre-existing evaluation to speed-up your program. In the example, I experienced 80× speed-ups by enabling proxies!

The example I used in the program is the system we already exposed in the [»] previous post on third-order aberration theory and which is repeated here in Figure 2.

Figure 2 – Example system

For each lens, I used 1% tolerance on each radii of curvature (front and back), 50 µm tolerance on all thickness and 2×10-4 tolerance on the refractive index for a total of 20 parameters. The program evaluated the effects of these tolerances on back-focal length, seidel aberration (S1 to S5) and the wavefront rms (excluding defocus and tilt) for on-axis and off-axis configurations (y=0 and y=1). I tested the software in various conditions and got the following results:

On my old Intel i7 6700 I got a median (probable) time of 14.141 ms when compiling with Visual Studio CE 2022 optimized for speed (/O2, x64). This translates to about 70 systems per second.

On the same Intel i7 6700 machine, I got this time a median of 10.555 ms when compiling with G++ optimized for speed (-Ofast, x64) and running under the Windows Subsystem for Linux (WSL). This translates to about 95 systems per second.

On my EXSi server which is based on an Intel i3 10100f processor, using G++ optimized for speed and running under Debian 10 with 1 vCore, I got a median time of 5.761 ms which translates to about 175 systems per second.

All computations were performed over 100 computations and include both the creation of the system and the computation of the derivatives themselves.

It is worth noting the big gap between the execution time of the Visual Studio compiled code vs. G++ compiled code although both were compiled with speed optimizations enabled. Also, the 10th generation i3 clearly outperforms the 6th generation i7.

In all cases, the results are well within the real-time domain with speeds going from 70 to 175 systems per second for a moderate system comprising 4 lens and 20 parameters. This is to be compared to Monte-Carlo simulations which would have taken typically 20-30 minutes if not more to give us a quick feedback (in-depth analysis can take several hours with Monte-Carlo analysis).

The performances obtained in the example enable direct feedback during the design process which opens new door. In particular, the system not only report the total expected variation on the final property but also how each element affects the total variation. For instance, using the provided tolerances of above, the program evaluates that the wavefront is susceptible to vary by 0.25 lambdas (compared to 0.02 lambdas nominal) with the repartition of variances given by the results in Figure 3. We clearly see that lens 2 is responsible for most of the change and we can therefore focus our attention on its re-design. A look at the variation of the Seidel coefficients reveals that most of the wavefront variation comes from distortion followed by astigmatism.

Figure 3 – Repartition of variance across the system parameters

The overall idea of the paraxial tolerance analysis is to provide quick feedback during the design stage. We can therefore imagine some sort of color or star-based systems where the software automatically informs the designer which elements contributes the most to the variance of the system.

Before we end this post, I would like to detail something about the computation of the differentials. I said above that parameters differentials were evaluated at the leaves in the form of xi/∂xj. This is actually not correct and was a bit of an approximation to avoid putting too much information at once. To be more complete, nothing prevents us from computing the differential at the node level as well. Let me illustrate with an example to explain why we would want to do that.

Imagine you have a lens made of some glass material n. So far, we have studied the influence of a bad mix of the glass in the form of a variation on the absolute index of refraction of the material (±2×10-4 in the example). However, the index of refraction of the glass is only valid for the wavelength at which it is specified, and it is common to develop an analytical expression to find the index of refraction at any wavelength through expressions like the Sellmeier equations:

where the Bi and Ci are well-defined coefficients for the glass material.

When using such an expression, the leaf of the system is now λ and n becomes a node. You can therefore compute the value Y/∂λ but nothing prevents you from also computing Y/∂n for some property Y. This way, you can both represent variations that are due to a bad glass mix but also the variations that are due to a shift in wavelength. The only difference is where you stop the regression into the tree during the evaluation of the derivatives.

Similarly, we have seen [url=" /-devoptical-part-7--replacing-thin-lenses-by-real-lenses/"]here[/url] how to express lens radii R1 and R2 based on the desired focal length f and lens bending X. During the design phase, it is indeed much easier to work directly on f and X rather than on R1 and R2 directly. However, the manufacturing tolerances will apply to R1 and R2 and not f and X. In this case, R1 and R2 are therefore nodes as well and f and X are the leaves. At contrario to the index of refraction however, we have no reason to study variations relative to Y/∂f and Y/∂X but only variations relative to Y/∂R1 and Y/∂R2.

The process to obtain the differential of a node is exactly the same as for a leaf. Either replace the node by a one, a zero or a branch depending on the case you met (respectively, xj is the node, the node depends on xj or the node is not xj and do not depends on xj).

That is all for today! This closes a big part of the discussion I wanted to have about #DevOptical which required to explain first what are PSF, MTF and WFE before I could switch to how to approximate MTF and WFE quality through the paraxial equations via the third-order aberration theory, just because I intended to develop this paraxial tolerance analysis framework!

We are currently reaching quite a lot of content in this #DevOptical series that has now started more than one year ago. I still have some important concepts to present but we are reaching a point where I can soon start working on an alpha version for the software! I would say that we have already made 75% of our way there and I have 4/5 posts pending before we can close the series and have enough asset to produce something that will have sufficient added value to be released. I would therefore expect that we come to a first conclusion by the end of the year 2022!

I would like to give a big thanks to James, Lilith, Samuel, Themulticaster, Sivaraman, Vaclav and Jesse who have supported this post through [∞] Patreon. I remind you to take the occasion to invite you to donate through Patreon, even as little as $1. I cannot stress it more, you can really help me to post more content and make more experiments! The device presented in this post was paid 100% through the money collected on Patreon!

[⇈] Top of Page

You may also like:

[»] #DevOptical Part 14: Third-Order Aberration Theory

[»] #DevOptical Part 7: Replacing Thin-Lenses by Real Lenses

[»] #DevOptical Part 11: The Diffractive PSFs

[»] #DevOptical Part 9: Geometrical Image Simulation

[»] Custom 5x Plan Objective from Stock Elements