2.2 Machine vision sensors

Machine vision sensors should be selected according to the application they are used for. The two scanning types of sensors are line-scan and area-scan sensors, where “line” and “area” refer to the shape of the sensor inside the camera. A line-scan sensor captures a single one-dimensional line of pixels at a time, so building up a full image requires relative motion between the camera and the target, which makes line scanning the more demanding option. An area-scan sensor produces two-dimensional image data directly and is therefore the typical choice for general applications.

The two main types of sensors are charge-coupled devices (CCD) and complementary metal oxide semiconductors (CMOS). The main difference between the sensor types lies in the pixel-level conversion from charge to voltage: a CCD transfers and converts the charge sequentially, whereas a CMOS sensor converts and reads the pixels in parallel. Currently, the performance of CMOS sensors has exceeded that of CCDs, and CMOS sensors are widely used in machine vision applications.

With CMOS sensors, each pixel is addressable on a row and column basis. The voltages are read in parallel, enabling high frame rates and letting users define regions of interest (ROIs). Depending on the number of transistors per pixel, the sensor may offer a global shutter or a higher signal-to-noise ratio. The sensor’s pixel-level operating principles and basic characteristics are described in the following subsections. Next, we will go through the basic principles of an imaging sensor: how the photons are converted into digital numbers.

2.2.1 A simplified physical model of a pixel and imaging sensor

The pixel’s function is to convert the incoming photons into electrons. The electrons are converted into a voltage, which can be measured. Each sensor has a full-well capacity, which describes the maximum number of electrons a pixel can store. Figure 1 below illustrates a pixel well and a physical model of an imaging sensor that converts the photons into digital numbers. As can be seen in Figure 1, the full-well capacity is proportional to the pixel’s light-sensitive front area.

Figure 1. A simplified physical model of a pixel and imaging sensor. A number of photons hit the photosensitive pixel area during the exposure time and are converted to photo-electrons. The charge formed by the electrons (e−) is then converted to a voltage by a capacitor before being amplified and digitised, resulting in the digital grey values. The red e− denotes temporal noise. The pixel-depth-dependent physical properties, from the full-well capacity to the dynamic range, are visualised in the pixel well.
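
To make the chain in Figure 1 concrete, the sketch below simulates a single pixel under the usual simplifying assumptions: photons arrive with shot noise, the quantum efficiency determines how many of them become electrons, the collected charge is clipped at the full-well capacity, temporal noise is added, and an overall gain plus quantisation produces the digital grey value. All numbers (quantum efficiency 0.6, full well 10 000 e−, 5 e− read noise, 12-bit output) are illustrative and not taken from any particular sensor.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Illustrative pixel parameters (not from any specific sensor)
quantum_efficiency = 0.6       # electrons generated per incident photon
full_well_capacity = 10_000    # maximum number of electrons a pixel can store
read_noise_e = 5.0             # temporal (read) noise, electrons RMS
adc_max = 4095                 # 12-bit output, 2**12 - 1
system_gain = adc_max / full_well_capacity  # DN per electron; full well maps to ADC maximum

def expose_pixel(mean_photons: float) -> int:
    """Simulate one exposure of a single pixel: photons -> electrons -> digital number."""
    photons = rng.poisson(mean_photons)                      # photon shot noise
    electrons = rng.binomial(photons, quantum_efficiency)    # quantum efficiency
    electrons = min(electrons, full_well_capacity)           # full-well saturation
    electrons = electrons + rng.normal(0.0, read_noise_e)    # temporal noise (the red e− in Figure 1)
    dn = electrons * system_gain                             # charge-to-voltage conversion and gain
    return int(np.clip(round(dn), 0, adc_max))               # digitisation to a grey value

print([expose_pixel(p) for p in (100, 1_000, 20_000)])       # dark, mid-range and saturated pixel
```

For the brightest case the pixel saturates: the collected charge exceeds the full-well capacity, so the grey value stays near the maximum ADC code no matter how many more photons arrive.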

Dynamic range (DR, Figure 1) is the ratio between the largest and smallest signal amplitudes a sensor can produce (Baumer 2022). It describes the imager’s ability to simultaneously provide detailed information from bright and dark areas. The parameter is important when the illumination conditions change rapidly or the targets have a strong contrast. The proportion of the pixel area that can capture light is called the fill factor. Since each pixel in a CMOS sensor has its own readout, charge-conversion and digitisation structures (visualised as the substrate in Figure 2 below), the fill factor might be only 20–50% of the pixel size (Stemmer 2022). This affects the sensor’s overall photosensitivity, which can be improved, for instance, by using microlenses in front of the pixels (Figure 2).
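
As a rough, illustrative calculation (the numbers are made up, not measured), the dynamic range is often quoted in decibels as the ratio between the saturation capacity and the temporal noise floor:

```python
import math

full_well_capacity = 10_000   # e−, illustrative saturation capacity
read_noise_e = 5.0            # e− RMS, illustrative temporal noise floor

dynamic_range_db = 20 * math.log10(full_well_capacity / read_noise_e)
print(f"{dynamic_range_db:.1f} dB")   # about 66 dB for these example values
```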

Figure 2. Full-well capacity, i.e. the maximum possible electrical charge, is limited by the physical pixel depth. One way to improve the pixel’s photosensitivity is to use microlenses above the pixel well, so that some of the light heading towards the sensor’s substrate areas can be directed into the pixel.

Photons are converted into electrons, and the conversion ratio is called the quantum efficiency (QE), which depends on the wavelength. The sensor’s light sensitivity depends on the number of photons converted into electrons: the greater the quantum efficiency, the more of the incoming light contributes to the signal and the more information the sensor can provide. If the sensor is used with filters, the measured quantum efficiency of the system might differ from the sensor-level quantum efficiency (EMVA 2016).
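
In the notation of the EMVA 1288 linear camera model (EMVA 2016), the relationship between photons, electrons and the digital grey value can be written as

\[
\mu_e = \eta(\lambda)\,\mu_p, \qquad \mu_y = K\,(\mu_e + \mu_d),
\]

where μp is the mean number of photons hitting the pixel during the exposure, η(λ) the wavelength-dependent quantum efficiency, μe the mean number of photo-electrons, μd the mean dark signal in electrons and K the overall system gain in digital numbers per electron.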

References

Baumer 2022. Baumer Group, Operating principles of CMOS sensors. https://www.baumer.com/es/en/service-support/function-principle/operating-principle-and-features-of-cmos-sensors/a/EMVA1288. (A leading manufacturer of sensors, encoders, measuring instruments and components for automated image processing. Accessed on 7.4.2022.)

Stemmer 2022. Stemmer Imaging, The Imaging and Vision Handbook. https://www.stemmer-imaging.com/en/the-imaging-vision-handbook/. (A leading international machine vision technology provider. Accessed on 7.4.2022.)

EMVA 2016. The European Machine Vision Association, EMVA Standard 1288, Release 3.1. https://www.emva.org/standards-technology/emva-1288/emva-standard-1288-downloads-2/. (Sensor and camera standards. Accessed on 8.4.2022.)

2.1 Standards

Two important standards, hosted by the European Machine Vision Association (EMVA), a non-profit organisation founded in 2003, are briefly introduced in this section. The EMVA aims to connect organisations working in machine vision, computer vision, embedded vision and imaging technologies, including manufacturers, system and machine builders, integrators, distributors, consultancies, research organisations and academia (EMVA 2022).

The reason for introducing these standards is that a standard can be valuable for comparing different sensors, and it may enable a more versatile programming interface than an ordinary programming library.

The first standard is EMVA 1288, which can be used for comparing sensors; the second standard is GenICam, which provides a generic programming interface for sensors and other devices.

When it comes to the author’s hands-on experience, the GenICam standard is the foundation of the Python library Camazing (Jääskeläinen et al. 2019a), which is used with VTT’s HS imagers, also mentioned in my posts. Camazing was developed at the Spectral Imaging Laboratory of the University of Jyväskylä (JYU 2022) and released under an MIT licence.

EMVA 1288

Sensor selection is the first phase, and one of the most important ones, in the image-capturing process. Manufacturers offer a large selection of different sensors, which vary in price. Some of the properties introduced in Post 2.2 can be decided directly from the requirements of the target, such as the resolution, the choice between a monochromatic or colour sensor, the shutter type and the camera interface. The challenge lies in the measurable features, such as the sensor’s sensitivity, temporal noise and spectral response.

EMVA 1288 is a standard for the specification and measurement of machine vision sensors and cameras (EMVA 2022). Its main purpose is to create transparency by defining reliable and exact measurement procedures and data presentation guidelines to make comparing cameras and image sensors easier.

Manufacturers that do not hold an EMVA licence might measure and present the sensor performance and feature measurements differently, and may even leave some important details for the buyers to test themselves, which can be expensive and difficult.

Manufacturers that adhere to the EMVA 1288 standard measure the sensors and cameras according to the standard’s exact measurement procedures, making the presented data sheets of different sensors and sensor manufacturers comparable. The “EMVA standard 1288 compliant” logo ensures that products are licensed, measured and presented transparently and reliably.

The complete list of measurement groups, and which of them are mandatory, can be found in the standard specification (EMVA 2016). The standard’s documentation includes mathematical formulas, descriptions of the measurement setups and guidelines for presenting the results in a consistent format. The current release cited in this post is 3.1.
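
To give a flavour of the standard’s formulas, the sketch below estimates the overall system gain K (digital numbers per electron) with a photon-transfer-style calculation: under a linear camera model, the temporal variance of the grey values grows linearly with their mean, and K is obtained from the slope between a bright and a dark measurement. This is only an illustration of the idea; the standard itself prescribes the exact measurement setup, the number of operating points and the required corrections.

```python
import numpy as np

def system_gain(bright_a, bright_b, dark_a, dark_b):
    """Estimate the system gain K (DN per electron) from two frame pairs.

    Each argument is a 2-D numpy array captured under identical conditions.
    Using the difference of two frames removes the fixed-pattern noise, so
    half the variance of the difference estimates the temporal variance.
    """
    mu_bright = 0.5 * (bright_a.mean() + bright_b.mean())
    mu_dark = 0.5 * (dark_a.mean() + dark_b.mean())
    var_bright = 0.5 * np.var(bright_a.astype(float) - bright_b.astype(float))
    var_dark = 0.5 * np.var(dark_a.astype(float) - dark_b.astype(float))
    # Slope of temporal variance versus mean grey value
    return (var_bright - var_dark) / (mu_bright - mu_dark)
```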

GenICam standard

Generic Interface for Cameras (GenICam™) provides a generic programming interface for various sensors, cameras and other devices, regardless of the interface technology they use (GigE Vision, USB3 Vision, CoaXPress, Camera Link HS, Camera Link, etc.) or the features they implement.

Table 1. The GenICam standard modules (GenICam 2022).

The GenICam standard comprises five modules, which are introduced above in Table 1. The role of GenApi is to define the mechanism used to provide the generic application programming interface (API) via a self-describing eXtensible Markup Language (XML) file in the device. The XML file format is defined in the Schema, which is part of GenApi. Underneath GenApi is the standard features naming convention (SFNC), which standardises the name, type, meaning and use of device features, resulting in standard functionality across manufacturers. The features are typically presented in a tree view and can be controlled via an application. The related standard for the consistent naming of pixel formats is the pixel format naming convention (PFNC).

GenTL is the standard for the transport layer programming interface. As a low-level API, it provides standard device interfaces and allows one to enumerate devices, access device registers, stream data and deliver asynchronous events. The generic data container (GenDC) module allows devices to send any form of data (e.g. 1D, 2D, 3D, multispectral or metadata) in a transport layer protocol (TLP) independent format and lets all the TLP standards share a common data container format. The generic control protocol (GenCP) is a low-level standard that defines the packet format for device control and can be used as the control protocol for each new standard.

The main purpose and benefit of GenICam is to provide an API that is identical regardless of the interface technology used in standardised devices (GenICam 2022).
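
As a practical illustration, the sketch below uses the Camazing library mentioned earlier in this post to open the first GenICam-compliant camera found and to configure it through SFNC-named features (ExposureTime, OffsetX, OffsetY, Width and Height define a region of interest). The feature names come from the SFNC; the exact Camazing calls follow its documentation but may need adjusting for a particular device and library version, and not every camera implements every feature.

```python
from camazing import CameraList

cameras = CameraList()     # enumerates GenICam-compliant cameras; requires a GenTL producer (.cti)
camera = cameras[0]        # pick the first camera found
camera.initialize()

# SFNC-standardised feature names, identical across manufacturers
camera["ExposureTime"].value = 10_000   # exposure time in microseconds
camera["OffsetX"].value = 0             # region of interest: position ...
camera["OffsetY"].value = 0
camera["Width"].value = 640             # ... and size in pixels
camera["Height"].value = 480

camera.start_acquisition()
frame = camera.get_frame()              # grab a single frame
camera.stop_acquisition()
```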

References

GenICam 2022. The European Machine Vision Association, GenICam standard, version 2.1.1. https://www.emva.org/standards-technology/genicam/introduction-new/.

EMVA 2022. The European Machine Vision Association, home page. https://www.emva.org/. (Organisation behind sensor and camera standards. Accessed on 8.4.2022.)

Jääskeläinen, S., Eskelinen, M., Annala, L. & Raita-Hakola, A.-M. 2019a. Camazing Python library. https://pypi.org/project/camazing/. (Machine vision library for GenICam-compliant cameras. Developed at the University of Jyväskylä, Spectral Imaging Laboratory. Released under the MIT licence. Accessed on 9.4.2022.)

2 Introduction to image capturing processes and related process phases

We will now begin a series of posts that aims to point out which sensor and optical details should be considered when selecting components for a machine vision system.

First, let us take a look at Figure 1, which visualises the image-capturing processes and their related process phases.

Figure 1. Image-capturing processes and related process phases. The full theory can be found in the author’s dissertation, to which the pink circles refer. The related posts are listed below.

The structure of this series of posts is divided into four image-capturing topics and their related process phases, which are visualised above in Figure 1. The phases are design, construction of the imager, control and capture. In our scope, image capturing aims to produce a high-quality optical image on a detector array for the subsequent computer vision processes.

This series of posts will cover all four topics and processes with their related process phases. The posts are organised as follows. First, Post 2.1 briefly introduces two important standards to enhance the reader’s understanding of how sensors are compared and how a standard can make the device-controlling phase easier.

Post 2.2 presents more advanced information about machine vision sensors by introducing their physical model and describing how the optical image is formed from the captured photons. Post 2.3 discusses the sensor properties and the Bayer pattern. The final post in Series 2 is Post 2.4, which shows how to pre-process colour filter arrays.

We will leave the details of optics and optomechanical components to be explored in Series 3, and we will finalise the processes with device controlling and imaging setups in Series 4.

List of related Posts

1 From machine vision terminology to machine vision fundamentals

Machine vision and computer vision terminology can be confusing. Here is the definition that we will use on this website.

The first high-level term in this field was computer vision. Since the core of the 1970s intelligent robot was its vision, the research area was named after it (Ejiri 2007). In terms of research content, however, the work and its name correspond more closely to the current interpretation of machine vision.

According to Smith et al. (2021), the current de facto interpretation of machine vision is “computer vision techniques to help solve practical industrial problems that involve a significant visual component”. The state-of-the-art interpretation combines machine vision and deep learning methods and considers machine vision as one of the core technologies of artificial intelligence (Smith et al. 2021).

Figure 1. Relationship between artificial intelligence, machine learning and deep learning. Machine learning is a subset of artificial intelligence, and deep learning is a particular subset of machine learning.

Figure 2. Relationship between the terms machine vision, computer vision and machine learning. Machine vision has two sub-terms, computer vision and image capturing. The essence of computer vision lies in machine learning.