A Neural Network to Detect Thin hBN Layers on Wafers
Introduction
During my undergraduate research at the University of California, Berkeley (Sep 2023 - Jan 2024), I worked on Charge Transfer Dynamics in RuCl₃/hBN/WSe₂ Heterostructures under Professor Feng Wang. To enhance experimental efficiency, I independently developed a Convolutional Neural Network (CNN) that detects thin hBN layers on wafers with 95% accuracy.
Project Motivation
The existing flake-finding system used in the lab struggles with thin hBN because thin hBN, the wafer, and residue left by the tape often have nearly identical colors. Traditional color-based algorithms frequently misidentify residue as flakes, forcing researchers to spend considerable time separating residue from actual flakes. This motivated integrating machine learning to improve both accuracy and throughput.
Neural Network Architecture
The developed CNN architecture is both simple and effective. Below is the structure implemented:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)   # 6 output channels; 60x60 feature maps for an assumed 64x64 input
        self.pool = nn.MaxPool2d(2, 2)    # Halves each spatial dimension
        self.conv2 = nn.Conv2d(6, 16, 5)  # 16 output channels; 26x26 feature maps, 13x13 after pooling
        self.fc1 = nn.Linear(16 * 13 * 13, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 2)       # Output: 2 classes (Sample, Tape)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)           # Flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = Net()
net.to(device)
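As a quick sanity check, a dummy forward pass can confirm that the flattened feature size matches fc1. This is a minimal sketch; the 64x64 crop size is an assumption inferred from the 16 * 13 * 13 input of fc1, not a value stated in the project.

# Hypothetical shape check (assumes 64x64 RGB crops):
# 64 -> conv1 -> 60 -> pool -> 30 -> conv2 -> 26 -> pool -> 13
dummy = torch.randn(1, 3, 64, 64).to(device)
with torch.no_grad():
    logits = net(dummy)
print(logits.shape)  # torch.Size([1, 2])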
Dataset and Preprocessing
The dataset comprised approximately 50,000 images, manually labeled as tape or flake, with flake images constituting about 2% of the dataset.
Preprocessing Steps:
- Color Normalization: images are adjusted against a wafer color standard to compensate for color variation between wafers scanned at different times; the standardization result is shown below.
Color variation in wafers scanned at different months and the standardized result after applying color normalization
- Standard Transformations:
- Scaling
- Type Conversion
- Normalization
- Data Augmentation for Flake Images:
- Rotation
- Symmetry Transformations
- Scaling
These methods helped enhance the dataset, particularly augmenting the minority flake class for better model training.
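A minimal sketch of such a preprocessing and augmentation pipeline, using torchvision, is shown below. The crop size, normalization statistics, and reference wafer color are placeholder assumptions, not the values used in the actual project.

import torch
from torchvision import transforms

# Hypothetical reference wafer color (per-channel means in [0, 1]); in practice this
# would be measured from a reference wafer scan.
REFERENCE_WAFER_RGB = torch.tensor([0.45, 0.40, 0.55])

def normalize_to_wafer_standard(img, wafer_rgb):
    """Shift each channel so the image's mean color matches the reference wafer color."""
    offset = wafer_rgb - img.mean(dim=(1, 2))
    return (img + offset.view(3, 1, 1)).clamp(0.0, 1.0)

# Standard transformations applied to every image: scaling, type conversion, normalization.
base_transform = transforms.Compose([
    transforms.Resize((64, 64)),  # scaling (64x64 is an assumed crop size)
    transforms.ToTensor(),        # type conversion to a float tensor in [0, 1]
    transforms.Lambda(lambda t: normalize_to_wafer_standard(t, REFERENCE_WAFER_RGB)),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # placeholder statistics
])

# Extra augmentation for the minority flake class: rotation, symmetry (flips), scaling.
flake_augmentation = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=180),
    transforms.RandomResizedCrop(64, scale=(0.8, 1.0)),
    transforms.ToTensor(),
    transforms.Lambda(lambda t: normalize_to_wafer_standard(t, REFERENCE_WAFER_RGB)),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])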
Training and Testing
The dataset was split for training, with 10% held out as a validation set. The results after 400 epochs of training are shown below:
[Epoch 396/400] [Loss: 19.741972] [Acc: T_S: 99.32% T_T: 95.51% V_S: 94.95% V_T: 94.71%] [Time: 109.3]
Accuracy Metrics:
- T_S (Sample): Training set accuracy for Sample (flake)
- T_T (Tape): Training set accuracy for Tape
- V_S (Sample): Validation set accuracy for Sample (flake)
- V_T (Tape): Validation set accuracy for Tape
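A sketch of how these per-class accuracies could be computed and logged is shown below. The Adam optimizer, the train_loader/val_loader names, and the loss aggregation are assumptions for illustration, not the project's exact training code.

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=7e-4)  # 0.0007, one of the learning rates explored

def per_class_accuracy(model, loader, device, num_classes=2):
    """Return percentage accuracy per class (0 = Sample/flake, 1 = Tape)."""
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            for c in range(num_classes):
                mask = labels == c
                correct[c] += (preds[mask] == c).sum().item()
                total[c] += mask.sum().item()
    return (100 * correct / total.clamp(min=1)).tolist()

for epoch in range(400):
    net.train()
    running_loss = 0.0
    for images, labels in train_loader:  # train_loader / val_loader are assumed DataLoader names
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(net(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    t_s, t_t = per_class_accuracy(net, train_loader, device)
    v_s, v_t = per_class_accuracy(net, val_loader, device)
    print(f"[Epoch {epoch + 1}/400] [Loss: {running_loss:.6f}] "
          f"[Acc: T_S: {t_s:.2f}% T_T: {t_t:.2f}% V_S: {v_s:.2f}% V_T: {v_t:.2f}%]")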
The following plots show the model's performance at different learning rates and epoch counts:
Accuracy of Sample identification (flake) at different learning rates and epochs.
Accuracy of Tape identification at different learning rates and epochs.
Accuracy at learning rate 0.0007 over epochs.
Results and Visualizations
The validation set’s weight distributions and samples correctly/incorrectly identified by the model are illustrated below:
Weight Distribution of Sample (flake)
Weight Distribution of Tape
Incorrectly Identified Samples (flake)
Incorrectly Identified Tape
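A minimal sketch of how misclassified validation samples like these could be collected for inspection; the helper name and the val_loader name are assumptions, not part of the original project code.

import torch

def collect_misclassified(model, loader, device, target_class):
    """Gather images of `target_class` that the model labels incorrectly."""
    wrong = []
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            mask = (labels == target_class) & (preds != labels)
            if mask.any():
                wrong.append(images[mask].cpu())
    return torch.cat(wrong) if wrong else torch.empty(0)

# Example: misclassified flakes (class 0) and misclassified tape (class 1)
wrong_flakes = collect_misclassified(net, val_loader, device, target_class=0)
wrong_tape = collect_misclassified(net, val_loader, device, target_class=1)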