# Taming Pattern and Focus Variation in VLSI Design

Fook-Luen Heng<sup>a</sup>, Puneet Gupta<sup>b</sup>, Kafai Lai<sup>c</sup>, Ron Gordon<sup>c</sup>, and Jin-Fuw Lee<sup>a</sup>

<sup>a</sup>IBM T.J. Watson Research Center, Yorktown Heights, NY, USA; <sup>b</sup>Department of ECE, UC San Diego, CA, USA; <sup>c</sup>IBM Microelectronics, East Fishkill, NY, USA

## ABSTRACT

Tight control of across-chip linewidth variation (ACLV) has become increasingly difficult due to the diminishing process constant, K1. Focus variation and pitch variation are two major systematic components of ACLV. In this paper, we demonstrate these systematic effects and propose a design flow that exploits them.

We demonstrate the systematic ACLV by showing a Bossung plot for a nominal 90nm technology node. The plot is generated by simulation with lithographic parameters closely resembling a production technology node.

Traditionally, tight CD control is achieved by sophisticated RET such as OPC, SRAF, AltPSM and, more recently, Dense Template Design.<sup>1</sup> The CD variation is specified in the design manual, and circuit designs ensure functionality by building in enough margin to account for the variability. Even though the systematic components of CD variation are understood, they have always been lumped together with the random components and treated as random. This approach has left design performance on the table.

We propose a holistic design flow that integrates the technology development process, the design process and the manufacturing process. This holistic approach aims to tame the systematic through-pitch and through-focus CD variation. We quantify the design timing benefit of this approach through circuit design experiments. Results of our experiments show that timing uncertainty can be reduced by up to 30%.

We also discuss other possibilities that are infeasible in the traditional approach, with its silos of technology development, design and manufacturing.

Keywords: ACLV, Systematic Variation, through-focus variation, through-pitch variation

# 1. INTRODUCTION

Solutions to control Across Chip Linewidth Variation (ACLV) are very important to VLSI designs, since ACLV directly impacts the electrical timing and functionality of the designs. Many sources contribute to ACLV: through-pitch variation, through-process variation, topography variation, mask variation, etching, etc. Due to the complex interaction between these sources of variation, ACLV has been modeled as a random phenomenon.<sup>2</sup> In reality, at least 50% of ACLV is systematic.<sup>3,4</sup> The systematic through-pitch variation is the major contributor to variation at the nominal process condition, and the systematic through-focus variation is the major contributor across process conditions. These systematic variations can be modeled very accurately once a physical layout is completed.

The systematic variation is treated as random in a traditional design flow, since no mechanism exists in the traditional flow to allow effective use of this information. With the traditional silos of technology development, design and manufacturing, various guard-banding steps are set up to simplify interaction between the different silos. Design ground rules and electrical device models are created to insulate designers from the details of a technology. Layouts as well as the Nominal Delay Rules (NDR) of basic circuit building blocks are created so that more complex designs can be constructed using these basic building blocks. In an ASIC design environment, these basic building blocks are cells in a standard cell library. In the structured custom design methodology of an advanced microprocessor design environment,<sup>5</sup> a similar approach is also employed.

Further author information: (Send correspondence to Fook-Luen Heng)

Fook-Luen Heng: E-mail: heng@us.ibm.com

Puneet Gupta: E-mail: puneet@ucsd.edu

Kafai Lai: E-mail: kafailai@us.ibm.com

Ron Gordon: E-mail: rlgordon@us.ibm.com

Jin-Fuw Lee: E-mail: jinfuw@us.ibm.com

Once a design is taped out, i.e. completed and released for manufacturing, further processing such as OPC is applied to the completed layout so that the original specification of the technology is met. For example, a drawn transistor with a gate length of 80nm is expected to print on wafer at 80nm. OPC is used to correct the drawn rectangle so that it prints as close to 80nm as possible on wafer at the nominal process condition. Since process variation needs to be taken into consideration, an ACLV tolerance is specified in the design manual so that designs can account for the variation in gate length and still work.

This guard-banding process has been the standard practice for designing VLSI circuits, and it has served the industry well since the beginning of CMOS technology. Since critical dimensions, e.g. the effective gate length of a transistor, are scaling faster than our ability to control them, variability has become an increasingly important design issue.<sup>6,7</sup> This has led to a very active effort in the industry to address the problem. It is recognized that the traditional static timing approach is becoming too conservative to predict the actual performance of a design.<sup>2,6,8,9</sup> Progress has been made in employing statistical techniques to model the variability of circuit performance. A general probabilistic framework has been proposed to improve the accuracy of timing prediction.<sup>8</sup> Several approaches to address the correlations due to path re-convergence and proximity of gates have been studied.<sup>6,9,10</sup>

In this paper, we propose a holistic design flow that integrates the technology development process, the design process and the manufacturing process. This holistic approach aims to exploit the systematic through-pitch and through-focus CD variation in design. We quantify the design timing benefit of this approach through circuit design experiments. Results of our experiments show that timing uncertainty can be reduced by up to 30%.

In the next section, we discuss the systematic ACLV in more detail by showing some linewidth data corresponding to a 90nm technology node. Section 3 describes our proposed flow to integrate the traditional semiconductor silos. Section 4 describes our experiments and results. Then, in Section 5, we describe other possibilities which are infeasible to carry out in the traditional approach, but are made possible with the holistic approach. Finally, in Section 6, we summarize our findings and describe future work.

# 2. SIMULATION DATA FOR SYSTEMATIC ACLV

It is a fundamental fact of optical lithography that the optical response to a particular structure in a photomask depends on its proximity environment. To illustrate, focus-exposure matrices for line-space patterns of differing pitches are shown in Fig. 1. Here, the goal is to print 85 nm lines at various pitches under a particular illumination condition, typically designed for maximum insensitivity to dose and focus errors. In this case, the pitches are 245, 315, 450, and 1050 nm; the 193 nm illumination has a numerical aperture of 0.75 and is disk shaped with a pupil fill of 0.7.

The line sizes on the mask are set to 85 nm at 1X for all of the pitches here. The mask is an attenuated phase-shifting mask, where the intensity transmission of the material is 6.5% and the phase shift is  $\pi$  radians. The imaging takes place in a resist with a refractive index of 1.7. The range of focus sampled lies between -0.5 and 0.4 µm; the reason for the asymmetric range is that the best focus is shifted from zero due to refraction in the resist. The dose range sampled lies between -40% and 40% deviation from the dose that produces the wafer linewidth of 85 nm.

The optical model used here takes into account the vector nature of the light in the lens system, and assumes unpolarized light there. The develop model is known as a "lumped parameter model", and assumes a resist thickness of 300 nm, a diffusion length of 22 nm, and a contrast of 10.

Note that, in Fig. 1, each curve corresponds to a constant dose. For the same dose, the behavior of the linewidth through focus can vary greatly from pitch to pitch. Here, the dense case yields a "smile" plot with positive curvature, whereas the other cases show negative curvature.



Figure 1. Focus-exposure matrices for line-space patterns of varying pitch, according to optical parameters specified in the text. (a) 245 nm pitch, (b) 315 nm pitch, (c) 450 nm pitch, and (d) 1025 nm pitch.
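The smile and frown shapes described above can be mimicked with a toy quadratic model of CD versus defocus. The sketch below is only an illustration and is not the vector optical model or lumped parameter resist model used for Fig. 1; the function names, the curvature values and the best-focus offset are placeholders chosen for the example.

```python
def bossung_cd(focus_um, cd_best_nm, curvature_nm_per_um2, best_focus_um):
    """Toy Bossung curve: printed CD (nm) at one dose, quadratic in defocus."""
    return cd_best_nm + curvature_nm_per_um2 * (focus_um - best_focus_um) ** 2


# Placeholder best focus, shifted from zero as noted in the text.
BEST_FOCUS_UM = -0.05


def dense_cd(focus_um):
    # Positive curvature: a "smile" (line gets thicker with defocus).
    return bossung_cd(focus_um, 85.0, 40.0, BEST_FOCUS_UM)


def isolated_cd(focus_um):
    # Negative curvature: a "frown" (line gets thinner with defocus).
    return bossung_cd(focus_um, 85.0, -60.0, BEST_FOCUS_UM)


for focus_um in (-0.35, -0.05, 0.25):
    print(f"{focus_um:+.2f} um  dense={dense_cd(focus_um):.2f} nm  "
          f"isolated={isolated_cd(focus_um):.2f} nm")
```

With these placeholder coefficients both features print at 85 nm at best focus and move in opposite directions away from it, which is the behavior the rest of the paper exploits.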

# 3. HOLISTIC APPROACH TO TAME SYSTEMATIC VARIATION

The timing model for a standard cell is characterized through a very intensive simulation process. It is reduced to a set of formulas that predict the delay of input-to-output paths based on parameters such as gate length, temperature, voltage, oxide thickness, etc. The corners of the model assume the worst-case condition for each parameter. In particular, the worst-case gate length assumes the maximum possible gate length variation. In reality, as described above, gate length variation can be predicted more accurately based on the spatial environment of each gate. The accurate prediction will remove at least half of the best-case to worst-case spread of the gate length. In this section, we describe a timing methodology which takes into consideration the systematic variation of gate length. We also quantify the pessimism caused by using the worst-case assumption.

## 3.1. Accounting for Pattern Dependent Variation

Traditional timing methodology assumes perfect printing of the gates under the nominal process condition and hence computes the timing of a design based on the target gate length. Model-based OPC tries to achieve the target gate length but is never able to correct the design perfectly. The reasons may include geometrical limitations of the layout, limitations of the OPC algorithm, and constraints on runtime. As a result, there is always some iso-dense bias in the printing of polysilicon shapes. Isolated lines tend to print smaller (or larger, depending on the process) than nested or dense shapes. This pitch-dependent variation of printed gate length is systematic and hence can be predicted. After placement, the spacing between all gate shapes is known, and hence the printed shapes can be predicted accurately.

OPC can be performed on the layout and lithography simulations can be done to predict the printed shape on the final wafer. The critical dimension or gate length can then be measured from this simulated print-image of the layout for each device. This more accurate gate length can then be used to predict the timing of the device, cell and hence the entire design more accurately. The problems with such an elaborate approach are as follows.

- OPC is computation intensive. Typical model-based OPC runtimes range from about 1100 seconds for a small 5900 gate design to several CPU days for modern multi-million gate designs.<sup>11</sup> Moreover, image simulation of the entire design is also very time consuming and hence not suitable for use during the design process, which may involve many synthesis, place and route iterations.
- Library characterization is an involved process. Characterizing a standard cell for continuously varying gate lengths (or *Critical Dimension*, *CD*) of all the devices within it is a herculean task, if not an impossible one. Performing circuit-level timing on the entire design with accurate gate lengths is also not feasible due to runtime and scalability constraints.

Our method of accounting for through-pitch variation in static timing has three major components: accurate CD measurement, construction of timing libraries, and contextual timing analysis. We describe these parts of our flow next.

### 3.1.1. CD measurement

To circumvent the problems of full-chip OPC and elaborate characterization, we adopt a library-based OPC approach similar to one described in the literature.<sup>11</sup> Individual library cells are corrected conservatively in a typical placement environment. The placement environment is emulated using a set of dummy geometries. The average gate length\* is then measured for all devices in the cell. These "printed" gate lengths are then used to predict timing for the devices.

This library-based OPC approach is accurate enough because the radius of influence for 193nm steppers is about 600nm, as estimated by using the range of sampling equation.<sup>12</sup> That is, features more than 600nm away from any given device have negligible impact on its printing. As a result, the devices which are not at the periphery of the cell have an environment which is almost identical to their actual placement environment. Therefore, the CD predicted for them after library-based OPC is very close to the CD predicted for them after full-chip OPC. Further details of the library-based OPC approach can be found elsewhere.<sup>11</sup>

Devices which lie at the boundary of the cell are not as accurately predictable by the library-OPC approach. For these devices, we use a through-pitch CD simulation approach. We construct a look-up table which matches pitch to printed CD for the given process. The CD measurements are again done post-OPC. The empirical model is constructed for a number of spacings up to 600nm. The placement of the cell in layout determines the CD to be used for these border devices. An example is shown in Figure 2.
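A minimal sketch of such a look-up table follows. The spacing and CD samples are hypothetical placeholders (real entries would come from post-OPC simulation), and the linear interpolation between samples is an assumption made for illustration; the only property taken from the text is that the table stops at 600nm, beyond which the printed CD is treated as unchanged.

```python
from bisect import bisect_right

# Hypothetical post-OPC samples: spacing to nearest poly (nm) -> printed CD (nm).
SPACING_NM = [150, 250, 400, 500, 600]
PRINTED_CD_NM = [92.0, 90.0, 88.0, 86.5, 86.0]


def printed_cd(spacing_nm):
    """Interpolate the printed CD; clamp outside the sampled spacing range."""
    if spacing_nm <= SPACING_NM[0]:
        return PRINTED_CD_NM[0]
    if spacing_nm >= SPACING_NM[-1]:
        # Beyond the radius of influence the neighbor no longer matters.
        return PRINTED_CD_NM[-1]
    hi = bisect_right(SPACING_NM, spacing_nm)
    lo = hi - 1
    t = (spacing_nm - SPACING_NM[lo]) / (SPACING_NM[hi] - SPACING_NM[lo])
    return PRINTED_CD_NM[lo] + t * (PRINTED_CD_NM[hi] - PRINTED_CD_NM[lo])


print(printed_cd(450), printed_cd(900))
```

For a border device, `printed_cd` would be queried with the spacing determined by the cell's placement, such as the `nps` values shown for cell B in Figure 2.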

<sup>\*</sup>The gate length varies along the width of the device. We do a simple averaging of the CD. We believe this to be a reasonable approximation, as device delay varies almost linearly with gate length.



Figure 2. An example placement of cells A, B and C. For cell B,  $nps_B^{LT} = 900, nps_B^{RT} = 950, nps_B^{LB} = 750, nps_B^{RB} = 900.$ 

### 3.1.2. Constructing Timing Libraries

In a placement, a cell's environment will depend on the neighboring cells (left and right in a horizontal cell placement row)<sup> $\dagger$ </sup> and the whitespace between the cell and its neighbors.

In a placement for a cell  $C_i$ , its environment is described by a set of four spacings  $nps_i^{LT}$  (distance of the device on the "left-top" to the nearest poly feature on the left in the neighboring cell),  $nps_i^{RB}$  (distance of the device on the "right-bottom" to the nearest poly feature on the right),  $nps_i^{LB}$  and  $nps_i^{RT}$ .<sup>‡</sup> These four space parameters enable us to determine the printed CD for the border poly features in the cell in the placement context using the through-pitch CD simulation results. Since continuous variation of these parameters makes a library difficult to characterize, we use *three* different values for each of these parameters. This gives rise to  $3^4 = 81$  different versions of each cell.<sup>§</sup>

For our current experiments, we assume the delay of any timing arc from an input pin to an output pin in a cell to be linearly proportional to the gate lengths of the devices involved in the transition. For a given input vector, the devices involved are those that are on the critical input-to-output path at the nominal gate length. As the gate lengths vary, the same set of devices is used to compute the delay. Though we use this linear approximation for simplicity, more accurate circuit-simulation-based analysis is also feasible. We construct timing look-up tables (with varying load capacitance and input slews) for these 81 versions of the library cell master. As a result, we obtain a .lib which has 81 versions of each cell in the original library.
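The count of 81 is simply three choices for each of the four spacings. As a small illustration, the versions of one cell can be enumerated as below; the cell name `NAND2` and the naming scheme are hypothetical and not taken from the library used in this work.

```python
from itertools import product

PARAMS = ("LT", "RT", "LB", "RB")  # the four nps spacings of a cell
BINS = (0, 1, 2)                   # three canonical values per spacing


def variant_name(cell, bins):
    """Hypothetical naming scheme, e.g. NAND2_LT0_RT2_LB1_RB2."""
    return cell + "".join(f"_{p}{b}" for p, b in zip(PARAMS, bins))


variants = [variant_name("NAND2", combo)
            for combo in product(BINS, repeat=len(PARAMS))]
print(len(variants))  # 3**4 = 81
```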
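One possible reading of the linear approximation is sketched below: every entry of a nominal delay table is scaled by the ratio of the mean printed gate length of the devices on the arc to the nominal gate length. Averaging over the devices is an assumption made here for illustration, and the function name and table values are placeholders.

```python
def scale_delay_table(nominal_table, printed_lengths_nm, nominal_length_nm):
    """Scale a (load x slew) delay table by mean printed L / nominal L."""
    mean_printed = sum(printed_lengths_nm) / len(printed_lengths_nm)
    ratio = mean_printed / nominal_length_nm
    return [[delay * ratio for delay in row] for row in nominal_table]


# Placeholder 2x2 delay table in ns (rows: load capacitance, columns: input slew).
nominal = [[0.10, 0.14], [0.12, 0.18]]

# Two devices on the arc, both printing at 88 nm against an 80 nm target.
scaled = scale_delay_table(nominal, [88.0, 88.0], 80.0)
print(scaled)
```

Applying this once per cell version would yield the 81 scaled tables per cell master that make up the expanded library.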

### 3.1.3. In-Context Timing Analysis

After the library generation, the next step is to identify the correct canonical environment for every cell instance in the layout and perform a contextual static timing analysis. We define four parameters for a cell  $C_i$ :  $s_i^{LT}$ (the distance of the cell outline from the closest device on the "left-top" corner of the cell),  $s_i^{LB}$  (spacing between the left-bottom device and the cell outline),  $s_i^{RT}$  and  $s_i^{RB}$ . Analyzing the placement (i.e., the whitespace around the cell and the four *s* parameters for the given cell and its immediate neighbors) puts the given cell in the given layout into one of the 81 categories.

After annotating each cell instance with its correct version, we run static timing analysis with the expanded library. The result of this timing analysis takes into account iso-dense effects and the resulting through-pitch variation at the nominal focus and exposure.
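The placement analysis can be sketched as follows, under the assumption that each `nps` spacing is the sum of the cell's own *s* parameter, the whitespace to the neighbor, and the neighbor's facing *s* parameter. The `Cell` class, the function names and the individual numbers are hypothetical; the numbers are chosen only so that the totals match the example of Figure 2, and the bin edges follow the 400-500nm, 500-600nm and ≥600nm bins used in Section 4.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    s: dict  # keys "LT", "LB", "RT", "RB": device-to-outline distances in nm


def bin_index(spacing_nm, edges=(500, 600)):
    """0: below 500 nm, 1: 500-600 nm, 2: 600 nm and above."""
    return sum(spacing_nm >= edge for edge in edges)


def neighbor_spacings(cell, left, right, gap_left_nm, gap_right_nm):
    """Assumed composition: own s + whitespace + neighbor's facing s."""
    return {
        "LT": cell.s["LT"] + gap_left_nm + left.s["RT"],
        "LB": cell.s["LB"] + gap_left_nm + left.s["RB"],
        "RT": cell.s["RT"] + gap_right_nm + right.s["LT"],
        "RB": cell.s["RB"] + gap_right_nm + right.s["LB"],
    }


def variant_key(nps):
    """(LT, RT, LB, RB) bin tuple that selects one of the 81 versions."""
    return tuple(bin_index(nps[k]) for k in ("LT", "RT", "LB", "RB"))


a = Cell({"LT": 300, "LB": 300, "RT": 300, "RB": 200})
b = Cell({"LT": 300, "LB": 250, "RT": 350, "RB": 300})
c = Cell({"LT": 300, "LB": 300, "RT": 300, "RB": 300})

nps_b = neighbor_spacings(b, a, c, gap_left_nm=300, gap_right_nm=300)
print(nps_b, variant_key(nps_b))
```

Every spacing of cell B in this example exceeds 600nm, so the instance maps to the fully isolated version of its cell master.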

## 3.2. Taming Focus Variation

The next systematic component of variation that we account for in our proposed timing analysis methodology is the CD variation arising from focus variation. Isolated and dense lines behave differently with defocus. Isolated lines tend to get thinner with defocus while dense lines get thicker. As a result, isolated devices get faster with focus variation while dense devices tend to get slower than nominal.

Figure 3. An example cell layout depicting isolated, dense and self-compensated devices.

<sup>†</sup>We do not consider "vertical" neighbors as they have negligible impact on gate CD.

<sup>‡</sup>Note that the top and bottom spacings can be different as they correspond to p and n devices respectively, which may not be aligned in the cell layout.

<sup>§</sup>The number 81 was arrived at as a compromise between accuracy and ease of implementation.

An important component of the "process" corner for timing is gate length variation, and a very important component of gate length variation is focus variation. The systematic "smile-frown" behavior of focus-based variation of CD implies that, depending on whether a certain timing arc involves isolated devices or dense ones, the worst-casing in one of its corners can be reduced. Moreover, there is some "self-compensation" of focus variation for timing arcs which involve both isolated and dense devices.

As before, we analyze the devices in the layout and label them as isolated, dense or self-compensated depending on the spacing to the nearest poly line on the left and the right.<sup>¶</sup> For example, a standard-cell layout with the three kinds of devices labeled is shown in Figure 3. Next we label each timing arc (input pin to output pin transition) as "smiling", "frowning" or "self-compensating" depending on whether the devices involved in the transition are dense, isolated or self-compensated.<sup> $\parallel$ </sup>

We assume a given percentage contribution of focus variation to CD variation. For smiling timing arcs, we trim off that portion from the best-case gate length. For frowning timing arcs, the worst-case gate length is reduced, while for self-compensated timing arcs both the worst-case and the best-case gate lengths are impacted. As a result, timing uncertainty arising out of focus variation is reduced for *all* timing arcs in the design.
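The labeling step can be sketched as below, following Figure 4 and the footnote example (dense devices smile, isolated devices frown, and the majority decides the arc). Two details are assumptions made for this sketch and not stated in the text: a device with one dense side and one isolated side is treated as self-compensated, and a tie in the vote is labeled self-compensating. The contacted pitch value in the example is a placeholder.

```python
from collections import Counter

ARC_LABEL = {
    "dense": "smiling",
    "isolated": "frowning",
    "self-compensated": "self-compensating",
}


def classify_device(left_nm, right_nm, contacted_pitch_nm):
    """Dense if both neighbor spacings are below the contacted pitch,
    isolated if neither is, self-compensated otherwise (assumption)."""
    dense_sides = (left_nm < contacted_pitch_nm) + (right_nm < contacted_pitch_nm)
    return {2: "dense", 0: "isolated"}.get(dense_sides, "self-compensated")


def label_arc(device_kinds):
    """Majority vote over the devices on the arc; ties are assumed
    to be self-compensating."""
    counts = Counter(device_kinds).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "self-compensating"
    return ARC_LABEL[counts[0][0]]


print(classify_device(200, 900, 400))
print(label_arc(["isolated", "isolated", "dense"]))
```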

## 3.3. Computing the Corners

Traditional corner-based timing analysis uses slow, nominal and fast corners for process. The systematic variation aware static timing analysis flow proposed in this work reduces the pessimism and uncertainty caused by these variations.

To compute the impact of through-pitch variation, we draw test layouts consisting of parallel poly lines with fixed width and length but varying spacing. These test layouts are then corrected with the standard OPC flow and CD is measured to construct the lookup table described in section 3.1. Denote the total range of CD variation after OPC by  $\pm lvar_{pitch}$ . We calculate the similarly defined  $\pm lvar_{focus}$  using the FEM (Focus Exposure Matrix) curves built from fabrication of test structures. We measure the CD variation with defocus (focus variation range is taken to be  $\pm 300$ nm) for a number of pitches (ranging from minimum pitch to a pitch slightly larger than the contacted pitch). These variations are shown in the artificial Bossung plot in Figure 4.

Figure 4. An artificial Bossung curve at some given nominal exposure. The smile denotes the "most dense" feature in the technology while the frown denotes the "most isolated" one. It should be clear that the total span of CD variation  $(= 2(lvar_{pitch} + lvar_{focus}))$  is too pessimistic.

<sup>¶</sup>We assume "dense" spacing to be less than the contacted-pitch and anything larger to be "isolated".

<sup>||</sup>For the purpose of this work, we assume the majority determines the nature. For example, if a timing arc involves two isolated devices and one dense device, then it is labeled as frowning. Better focus-sensitivity-based characterization is possible, but we limit ourselves for want of an accurate defocus print-image simulator.

Let  $l^{nom}$  and  $l^{nom}_{new}$  denote the traditional nominal gate length (independent of the cell layout and placement) and the iso-dense aware gate length respectively. Define  $l^{WC}_{pitch}$  and  $l^{BC}_{pitch}$  to be the worst-case and best-case gate lengths after accounting for through-pitch variation in CD. Similarly, let  $l^{WC}$  and  $l^{BC}$  be the corresponding numbers in the conventional flow. Then

$$\begin{aligned}
l_{pitch}^{WC} &= l_{new}^{nom} + (l^{WC} - l^{nom} - lvar_{pitch}) \\
l_{pitch}^{BC} &= l_{new}^{nom} - (l^{nom} - l^{BC} - lvar_{pitch})
\end{aligned} \tag{1}$$

There are many factors affecting the best and worst case gate length. We removed the variation due to pitch. In reality, there are dependencies between the pitch and the non-pitch factors. For the purpose of quantifying the potential impact of the systematic variation, this is a very good first order assumption. We will discuss what can be done to improve the accuracy in an actual systematic variation aware timing methodology in section 5.

Focus variation does not affect the nominal process corner, but it may affect worst-case and best-case corners differently depending on whether the timing arc under consideration is smiling, frowning or self-compensating.\*\* For smiling timing arcs, the values are

$$\begin{aligned}
l_{smile}^{WC} &= l_{pitch}^{WC} \\
l_{smile}^{BC} &= l_{pitch}^{BC} + lvar_{focus}
\end{aligned} \tag{2}$$

Here, we are removing the variation due to focus from the best case, since defocus only makes dense lines thicker and therefore cannot shorten the gate. Similarly, for frowning timing arcs,

$$\begin{aligned}
l_{frown}^{WC} &= l_{pitch}^{WC} - lvar_{focus} \\
l_{frown}^{BC} &= l_{pitch}^{BC}
\end{aligned} \tag{3}$$

\*\*In this work we do not consider the "degree" of compensation for lack of supporting data.



Figure 5. Distribution of error for model-based OPC for C3540 ISCAS85 benchmark.

| Testcase | #Gates | Traditional Nom (ns) | Traditional BC (ns) | Traditional WC (ns) | New "Accurate" Nom (ns) | New "Accurate" BC (ns) | New "Accurate" WC (ns) | % Reduction in Uncertainty |
|----------|--------|------|------|------|------|------|------|----|
| C1355    | 2058   | 2.15 | 1.57 | 2.88 | 2.15 | 1.70 | 2.62 | 29 |
| C2670    | 3655   | 5.07 | 3.74 | 6.64 | 5.05 | 4.04 | 5.96 | 33 |
| C3540    | 5903   | 6.32 | 4.72 | 8.34 | 6.26 | 5.20 | 7.35 | 40 |
| C432     | 968    | 5.77 | 4.21 | 7.70 | 5.70 | 4.53 | 6.88 | 32 |
| C499     | 1728   | 2.30 | 1.66 | 3.10 | 2.29 | 1.79 | 2.82 | 28 |

**Table 1.** Comparison of traditional worst-case timing with systematic variation aware timing methodology. Nom, BC, WC denote nominal, best-case and worst-case corners of the library respectively.

For self-compensated arcs, both the worst-case and the best-case gate lengths are modified.

$$l_{selfcomp}^{WC} = l_{pitch}^{WC} - lvar_{focus} \tag{4}$$

$$l_{selfcomp}^{BC} = l_{pitch}^{BC} + lvar_{focus} \tag{5}$$
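Equations (1) through (5) can be collected into a single corner calculation. The sketch below is a direct transcription of those equations; the function name and the example gate lengths (in nm) are placeholders chosen so that the arithmetic is easy to follow, not values from our process.

```python
def corners(l_nom, l_nom_new, l_wc, l_bc, lvar_pitch, lvar_focus, arc):
    """Return (best-case, worst-case) gate length for one timing arc."""
    # Eq. (1): remove the through-pitch component around the new nominal.
    wc = l_nom_new + (l_wc - l_nom - lvar_pitch)
    bc = l_nom_new - (l_nom - l_bc - lvar_pitch)
    # Eqs. (3) and (4): frowning and self-compensated arcs tighten the worst case.
    if arc in ("frowning", "self-compensating"):
        wc -= lvar_focus
    # Eqs. (2) and (5): smiling and self-compensated arcs tighten the best case.
    if arc in ("smiling", "self-compensating"):
        bc += lvar_focus
    return bc, wc


# Placeholder numbers: 80 nm target, +/-10 nm conventional spread,
# 3 nm each attributed to pitch and focus, 78 nm iso-dense aware nominal.
for arc in ("smiling", "frowning", "self-compensating"):
    print(arc, corners(80, 78, 90, 70, 3, 3, arc))
```

In this example the conventional 20 nm spread shrinks to 11 nm for smiling and frowning arcs and to 8 nm for self-compensated arcs.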

# 4. EXPERIMENTS AND RESULTS

To quantify the magnitude of the pessimism of traditional STA, we take the 10 most frequently used cells in a 90nm standard-cell library, synthesize ISCAS85 benchmark circuits with the 10 cells, and then time the synthesized and placed circuits for best-case, nominal and worst-case. The corner case libraries are constructed with just the process corners, while the voltage and temperature are kept the same across all the libraries. We do this to evaluate the benefit of the proposed timing methodology independent of any orthogonal effects.

We apply OPC to these 10 cell masters as described in section 3.1.1 using commercial EDA software. Model-based OPC is performed using IBM 90nm pre-production process models. To verify that through-pitch variation is sizeable even after model-based OPC, we measure the CDs from a simulation of the full-chip layout after standard model-based OPC and compare them with the simulated nominal gate length. The distribution of error is given for an example circuit in Figure 5. We see up to 20% variation in printed gate length even after model-based OPC.

We perform in-context timing analysis for the synthesized and placed circuits with the in-context timing model described in section 3, by substituting the correct version of the timing model for each cell based on its placement. We generate the 81 versions of each cell as described in section 3.1 with values of  $nps^{LT}$ ,  $nps^{RT}$ ,  $nps^{LB}$  and  $nps^{RB}$  each being put into one of the three bins: {400-500nm, 500-600nm,  $\geq$  600nm}. Since the radius of influence of 193nm steppers is about 600nm, any spacing larger than 600nm is isolated spacing and prints almost the same as a 600nm spacing. Since dense geometries print larger in the process, we use the lower of the bin extremes (e.g., 400nm for 400-500nm bin) to be pessimistic in our timing estimates.

We compare the best-case, nominal and worst-case timing with the standard timing as described above. Assuming  $lvar_{focus}$  and  $lvar_{pitch}$  each to be 30% of the total gate length variation,<sup>4</sup> the results of systematic-variation aware STA are shown in Table 1. Our results show that the best-case to worst-case timing spread is reduced by 28% to 40% in the systematic variation aware approach. Since the majority of the devices in the layout are isolated (due to the whitespace distribution or the cell layout itself), the nominal timing improves when through-pitch variation is accounted for.
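The uncertainty reduction can be recomputed from the corner delays in Table 1 as one minus the ratio of the new best-case to worst-case spread to the traditional spread. The short sketch below does this; the function name is a placeholder, and truncating to a whole percent is an observation about the table values and not a stated rule.

```python
import math

# (traditional BC, traditional WC, new BC, new WC) in ns, copied from Table 1.
TABLE1 = {
    "C1355": (1.57, 2.88, 1.70, 2.62),
    "C2670": (3.74, 6.64, 4.04, 5.96),
    "C3540": (4.72, 8.34, 5.20, 7.35),
    "C432": (4.21, 7.70, 4.53, 6.88),
    "C499": (1.66, 3.10, 1.79, 2.82),
}


def uncertainty_reduction(bc_old, wc_old, bc_new, wc_new):
    """Percent reduction of the best-case to worst-case timing spread."""
    return 100.0 * (1.0 - (wc_new - bc_new) / (wc_old - bc_old))


reductions = {name: math.floor(uncertainty_reduction(*row))
              for name, row in TABLE1.items()}
print(reductions)
```

Truncated to whole percent, the computed values agree with the last column of Table 1.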

# 5. OTHER POSSIBILITIES

Our experiment demonstrates that there is substantial pessimism in traditional static timing analysis because it does not consider the systematic components of ACLV. In this section, we propose a practical systematic variation aware timing methodology.

In order to produce a more accurate in-context timing model for each standard cell, each cell will need to be "corrected" by the OPC process before it is characterized. This can be done by the library-based OPC methodology,<sup>11</sup> in which gates in the cell are corrected by standard OPC on a per-cell-definition basis, as opposed to being corrected on a per-instance basis. Gates on the boundary can have several versions of correction based on context. In such an OPC methodology, the timing characterization of a cell can be performed based on the actual wafer image of the corrected gates in the cell.

Furthermore, we need to develop a parameterized gate length model for each gate on the cell boundary. The model will predict the actual gate length and its variation based on the proximity spatial information, i.e. distance of the neighboring gate. From our discussion in section 3, the nominal gate length can be predicted by through-pitch gate length simulation, and the through-focus gate length variation can be predicted by a Focus Exposure Matrix (FEM) plot.

A timing model which includes the proximity spatial information as a parameter for input to output path delay will need to be constructed. More specifically, the input to output delay is parameterized by  $s_i^{LT}$ ,  $s_i^{LB}$ ,  $s_i^{RT}$ ,  $s_i^{RB}$  as described in section 3.1.3. One naive way to construct such a model will be to perform extensive input to output delay path simulation for each value of the boundary gate length. A more efficient construction of such a model is a topic which will require separate investigation.

With such a timing model parameterized by proximity spatial information, the systematic variation aware static timing analysis can be performed after placement.

# 6. CONCLUSIONS AND FUTURE WORK

In this paper, we have proposed a novel static timing methodology which accounts for systematic variation arising from proximity effects and focus variation. The methodology brings process and design closer and has elements of RET, library characterization and conventional static timing analysis. We quantify the magnitude of the pessimism of traditional static timing analysis, which neglects systematic components of ACLV. This can amount to as much as a 40% tightening of the best-case to worst-case timing spread. In practice, ASIC hardware always performs better than traditional STA predicts. Even though different compensating mechanisms have been built into traditional STA, e.g. IBM EinsTimer,<sup>6</sup> systematic variation could be one key component which contributes to the discrepancy, as suggested by our results.

We are refining our experiment for process technology which includes other RET such as Sub-Resolution Assist Features. We also plan to further quantify such pessimism by using statistical timing methodology with more realistic gate length distribution based on iso-dense attributes and proximity spatial information, as opposed to the simplistic Gaussian distribution of gate length variation. Another process phenomenon not accounted for in our current experiments is exposure dose variation. Exposure variation can alter the nature of devices (i.e. dense or isolated).

Our current work also investigates the implications of exposure variation for the proposed timing methodology. The systematic nature of focus-dependent CD variation also suggests the possibility of compensating for such focus variation.

#### ACKNOWLEDGMENTS

The authors would like to thank Ruchir Puri and Prabhakar Kudva for very insightful input on the standard-cell methodology. We would also like to thank Jim Culp and Yuping Cui for providing help with the OPC software, and Daniel L Ostapko and Michael Lercel for reviewing the manuscript.

#### REFERENCES

- 1. M. Fritze, B. Tyrrell, R. Mallen, B. Wheeler, P. Rhyins, and P. Martin, "Dense only phase shift template lithography," in *SPIE Conf. on Design and Process Integration for Microelectronics Manufacturing*, pp. 15–29, 2003.
- 2. D. Blaauw, S. Nassif, L. Scheffer, and A. Strojwas, "Design for manufacturing in the sub-100nm era," in *Design Automation Conference*, Tutorial, 2003.
- 3. S. Postnikov and S. Hector, *ITRS CD Error Budgets: Proposed Simulation Study Methodology*, ITRS, May 2003.
- 4. W. Chu, personal communication, July 2003.
- 5. G. Northrop and P.-F. Lu, "A semi-custom design flow in high-performance microprocessor design," in *Design Automation Conference*, pp. 426–431, 2001.
- 6. C. Visweswariah, "Death, taxes and failing chips," in *Design Automation Conference*, pp. 343–347, 2003.
- 7. *2001 International Technology Roadmap for Semiconductors (ITRS)*, http://public.itrs.net.
- 8. M. Orshansky and K. Keutzer, "A general probabilistic framework for worst case timing analysis," in *Design Automation Conference*, pp. 556–561, 2002.
- 9. J. Hess, K. Kalafala, S. Naidu, R. Otten, and C. Visweswariah, "Statistical timing for parametric yield prediction of digital integrated circuits," in *Design Automation Conference*, pp. 932–937, 2003.
- 10. A. Agrawal, D. Blaauw, V. Zolotov, and S. Vrudhula, "Statistical timing analysis using bounds and selective enumeration," in *Design Automation Conference*, pp. 348–353, 2003.
- 11. P. Gupta, F. Heng, and M. Lavin, "Merits of cellwise model-based OPC," in *SPIE Conf. on Design and Process Integration for Microelectronic Manufacturing*, to appear, 2004.
- 12. A. Wong and R. Ferguson, "Kernel-based fast aerial-image computation for a large-scale design of integrated circuit patterns," US Patent 6,223,139, 2001.