Issue #1/2021
DOI: 10.22184/1993-7296.FRos.2021.15.1.100.106
Eye-Tracking Systems With a Single-Point Calibration on the Monitor
V. S. Zakharikov 1, D. V. Klusov 1, A. V. Gusachenko 2, B. N. Novgorodov 2, G. N. Popov 3
1 Special Design and Technological Bureau, Rostov Optical and Mechanical Plant PJSC, Rostov, Yaroslavl region, Russia
2 Design and Technological Institute of Applied Microelectronics, Branch of the Institute of Semiconductor Physics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
3 Ekran Experimental Design Bureau, Rostov Optical and Mechanical Plant PJSC, Rostov, Yaroslavl region, Russia
Existing gaze-tracking systems require calibration at three or more points. This article presents an algorithm for calibrating an eye tracker using only a single point on the monitor. Such a system allows the operator to start working within a few seconds. The presented system can also be used in the military sphere: in aviation and in ground vehicles with controlled sighting systems.
Keywords: Eye tracking technology, tracking system, interface, ergonomics, video equipment, eye-tracker, regression polynomial
Received on: 10.12.2020
Accepted on: 18.01.2021
INTRODUCTION
Technologies for tracking the direction of a person’s gaze (video oculography, eye tracking) are used in various systems for designing computer interfaces (e.g., Tobii Eye Tracker, Microsoft’s Eye Control). The duration of eye fixations and the density of the gaze trajectory allow one to conclude which elements of the object or image under consideration receive the most attention.
A person perceives most information about the surrounding world through sight.
The elements of the visual system (the eye, the optic nerve, and the visual analyzer of the brain) are closely interconnected; therefore, studying the trajectory of eyeball movement allows one to draw conclusions about the process of recognizing visual images and about a person’s thought processes as a whole. Tracking the gaze direction also makes it possible to build fundamentally new interfaces between humans and technical means. This significantly improves the convenience of working with computing equipment (a personal computer, communicator, tablet, e-book, smartphone, or the navigator in a modern car), simplifying and speeding up actions such as following links and working with applications. Such systems allow for more ergonomic user interaction with multi-window applications, and the software can be adapted to a large number of applications. A large area of application of the technology is interactive systems that allow various devices to be controlled with the eyes [1–3].
There are several ways to determine the gaze direction from eye movement, e.g., measuring the electrical potential in the eye muscles using electrodes placed around the eye, or using special contact lenses to simplify the tracking procedure. However, infrared eye scanning is currently the most practical method. It uses an emitter in the form of an infrared LED and a receiver in the form of an infrared video camera or multi-element photodetector [4, 5]. Both the emitter and the receiver are installed within the line of sight of the operator’s face. The light emitted by the LED is reflected from the eyes and captured by the photodetector. Since infrared radiation is virtually invisible to humans, it does not interfere with normal vision. One option for the relative position of the emitter and receiver is to place them on different axes at some distance from each other (Fig. 1), as a result of which the pupil appears dark on the receiver. From the point of view of physics, the pupil is almost a perfect black body, i.e., the ratio of reflected to incident light flux is close to zero. This provides strong contrast between the pupil and iris regions in the frame.
The gaze position can be judged from the relative positions of the reflections of directional infrared illumination from different parts of the eyeball – the so-called Purkinje image method. However, this requires special equipment and laboratory conditions. The task of creating a gaze-tracking system that does not require them is therefore urgent.
Image processing and recognition algorithms
For an accurate assessment of the gaze, the position of the pupils in the image must be detected precisely [1–3, 5–7]. Research has shown that pupil detection algorithms are unstable with low-resolution cameras. Detection quality improves significantly when infrared illumination is used and the pupil is resolved to at least 15 pixels in diameter. Image processing makes it possible to isolate the reflection from the cornea (the flare) and the pupil; from their relative position, the direction of sight can be calculated. The flare serves as the reference point for the pupil position in the sensor plane.
Image processing and recognition algorithms are not simple, since they must take into account many factors:
- the light level and pupil size can vary greatly;
- daylight and infrared radiation of various origins create perceptible interference;
- various insignificant factors must be ignored, e.g., blinking and the involuntary small jumps of the pupil in all directions that are inherent in visual fixation.
The developed algorithm consists of the following steps:
1. The grayscale image coming from the infrared camera is binarized with a certain threshold, so that areas darker than the threshold are considered possible pupil candidates.
2. Connected components are extracted from the resulting binary image.
3. Some of the components are eliminated in accordance with the following rules:
- components touching the image border are removed;
- components whose area is above or below the specified thresholds are removed.
4. From the extreme points (top, bottom, right, left) of the found component, four arrays of points are sampled several points inward, and a check is made that the numbers of points in opposite arrays do not differ by more than a specified value (protection against blinking). The centre of mass of the left and right arrays is taken as the x coordinate of the pupil centre, and the centre of mass of the upper and lower arrays as the y coordinate.
The centre of the flare can be found using a similar algorithm, with the change that in the first step areas brighter than a certain threshold are considered possible flare candidates. Figure 2 illustrates the eye image, the flare, and the result of calculating the pupil centre.
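The fragment below is a minimal sketch of these detection steps in Python with OpenCV. It is an illustration under assumptions: the threshold and area limits are placeholder values, and the blink-protection check on the four edge arrays is omitted, with the component centroid used in place of the left/right and top/bottom centre-of-mass rule.

```python
# Minimal sketch of the pupil/flare detection steps described above.
# Thresholds and area limits are illustrative, not the authors' tuned values.
import cv2
import numpy as np

def find_blob_centre(gray, threshold, dark=True, min_area=200, max_area=5000):
    """Binarize, extract connected components, filter them by the rules
    in the text, and return the centroid of the best candidate."""
    if dark:
        # Areas darker than the threshold are pupil candidates.
        _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY_INV)
    else:
        # Areas brighter than the threshold are flare candidates.
        _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)

    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    h, w = gray.shape
    best, best_area = None, 0
    for i in range(1, n):  # label 0 is the background
        x, y, bw, bh, area = stats[i]
        if x == 0 or y == 0 or x + bw == w or y + bh == h:
            continue  # rule: discard components touching the image border
        if not (min_area <= area <= max_area):
            continue  # rule: discard components with out-of-range area
        if area > best_area:
            best, best_area = centroids[i], area
    return best  # (x, y) in camera pixels, or None if nothing passed

# Usage: the pupil is a dark blob, the corneal flare a small bright one.
# gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# pupil = find_blob_centre(gray, threshold=40, dark=True)
# flare = find_blob_centre(gray, threshold=220, dark=False, min_area=5, max_area=300)
```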
To convert from the camera coordinate system to the coordinate system of the screen at which the operator is looking, a regression polynomial is used:
sx = a0 + a1 · x + a2 · x³ + a3 · y²
sy = b0 + b1 · x + b2 · y + b3 · x² · y,
where x, y are the coordinates of the pupil centre relative to the IR flare in the photodetector plane, and sx, sy are the coordinates of the gaze point on the operator’s screen. Thus, before the system is started, calibration is required to calculate the coefficients ai, bi, where i = 0…3.
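As a sketch of how such coefficients could be obtained, the fragment below fits ai and bi by least squares from calibration samples. This multi-point fitting scheme is a standard approach shown for illustration, not the article’s own procedure; the single-point method the article proposes follows below.

```python
# Fit a0..a3 and b0..b3 of the regression polynomial above by least squares,
# given pupil-minus-flare offsets (x, y) paired with known screen targets.
import numpy as np

def fit_coefficients(x, y, sx, sy):
    x, y = np.asarray(x, float), np.asarray(y, float)
    A = np.column_stack([np.ones_like(x), x, x**3, y**2])    # terms of sx
    B = np.column_stack([np.ones_like(x), x, y, x**2 * y])   # terms of sy
    a, *_ = np.linalg.lstsq(A, np.asarray(sx, float), rcond=None)
    b, *_ = np.linalg.lstsq(B, np.asarray(sy, float), rcond=None)
    return a, b

def map_to_screen(a, b, x, y):
    """Apply the fitted polynomial to a new pupil offset."""
    sx = a[0] + a[1] * x + a[2] * x**3 + a[3] * y**2
    sy = b[0] + b[1] * x + b[2] * y + b[3] * x**2 * y
    return sx, sy
```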
Studies and experiments have shown that, once the eye is “tied” to the monitor, the coefficients a1 and b2 have the dominant effect; the remaining coefficients can be neglected, even when the distance between the operator and the monitor changes.
Considering the above, the regression polynomial can be reduced to linear form:
sx = kh · (xt – xc) + W / 2
sy = kv · (yt – yc) + H / 2,
where xt, yt are the coordinates of the pupil centre relative to the IR flare in the photodetector plane when looking at the current point; xc, yc are the same coordinates when the operator looks at the centre of the monitor screen; kh and kv are the horizontal and vertical transfer coefficients, which depend on the distance from the eye to the camera; sx, sy are the coordinates of the gaze point on the operator’s screen; and W × H is the monitor screen resolution.
Thus, to calibrate and start the eye-tracker system, the operator only needs to look at a point in the centre of the screen for a few seconds. This is much faster than systems in which calibration is carried out at three or more points and takes at least 9 s. When the distance between the operator and the monitor changes, the coefficients kh and kv are recalculated automatically; their values range from 10 to 12, depending on the eye and on the distance to the monitor, and increase as the distance grows.
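A minimal sketch of this single-point calibration follows, implementing the linear formula above. The class name, the seed value kh = kv = 11.0 (inside the 10–12 range reported above), and the averaging of fixation samples are illustrative assumptions of this sketch; the article does not detail how kh and kv are recalculated with distance.

```python
# Single-point calibration: the operator fixates the screen centre for a few
# seconds; the averaged pupil-minus-flare offset becomes (xc, yc).
import numpy as np

class OnePointCalibration:
    def __init__(self, width, height, kh=11.0, kv=11.0):
        self.w, self.h = width, height   # screen resolution, pixels
        self.kh, self.kv = kh, kv        # horizontal/vertical transfer coefficients
        self.xc = self.yc = None         # offset observed at the screen centre

    def calibrate(self, samples):
        """samples: (x, y) pupil offsets relative to the IR flare,
        collected while the operator looks at the screen centre."""
        self.xc, self.yc = np.mean(np.asarray(samples, float), axis=0)

    def to_screen(self, x, y):
        """Map a current pupil offset to screen coordinates (formula above)."""
        sx = self.kh * (x - self.xc) + self.w / 2
        sy = self.kv * (y - self.yc) + self.h / 2
        return sx, sy

# cal = OnePointCalibration(1920, 1080)
# cal.calibrate(centre_fixation_samples)  # a few seconds of data
# sx, sy = cal.to_screen(x_cur, y_cur)
```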
Scope of application
In military technology, such systems are used practically only in aviation (Fig. 3). The image is projected onto a transparent screen attached to the pilot’s helmet in front of his eyes. Since the screen is transparent, the pilot can simultaneously observe the environment and the displayed information. The image is collimated to infinity, eliminating the need for eye accommodation.
Currently, the field of application of these systems is limited to combat aircraft and helicopters, where they allow for flight control and aiming without looking down at the indicators in the cockpit, i. e. without being distracted from the environment, which is very important in combat conditions and during intense flight phases.
Conclusion
The developed system has broad application possibilities in computing, robotics, medicine, and practically any field of activity in which displays and actuators are used, since it accelerates the transmission of commands to actuating devices. In addition, unlike similar systems, its simplified and quick calibration process, which allows work to begin in a few seconds, means the presented system can also be used in the military sphere: in aviation and in ground vehicles with controlled sighting systems.
REFERENCES
1. Gonzalez R. C., Woods R. E. Digital Image Processing. Moscow: Technosphera; 2005. 1072 p. ISBN 5-94836-028-8. [In Russ.].
2. Mesteckij L. M. Matematicheskie metody raspoznavaniya obrazov. Moscow: VMiK MSU; 2002–2004: 42–44. [In Russ.].
3. Malin I. K. Algoritm nahozhdeniya raduzhki dlya sistemy otslezhivaniya napravleniya vzglyada na osnove dostupnoj videoapparatury. Trudy MAI. 2009; 36. [In Russ.].
4. Froimson M. I., Mihajlov D. M., Korsakova A. I., Sorokina M. A., Kondrat’ev M. D. Sistema opredeleniya napravleniya vzglyada pol’zovatelya v rezhime real’nogo vremeni. Spectekhnika i svyaz’. 2013; 3: 32–34. [In Russ.].
5. Chinaev N. N., Matveev I. A. Opredelenie tochnoj granicy zrachka. Mashinnoe obuchenie i analiz dannyh. 2013; 1: 5. [In Russ.].
6. Yang Fu. Regression-Based Gaze Estimation with Natural Head Movement. Master’s thesis. Concordia University, Montreal, Quebec, Canada; 2015. 87 p.
7. Strupczewski A. Commodity Camera Eye Gaze Tracking. Warsaw University of Technology, Warsaw; 2016. 203 p.
Contribution of authors
All authors contributed to the work and subsequent discussion of the manuscript: Popov G. N. – concept of the algorithm; Zakharikov V. S. – mathematical implementation of the algorithm; Klusov D. V. – implementation of the algorithm in a software environment, conducting a number of experiments; Gusachenko A. V., Novgorodov B. N. – implementation of hardware and software for eye-tracker image processing.
Conflict of interests
The authors declare no conflict of interest.
About the authors
Zakharikov V. S., Candidate of Technical Sciences, Head of the Special Design and Technological Bureau, Rostov Optical and Mechanical Plant PJSC; corresponding author; sktb@romz.ru; Rostov, Yaroslavl region, Russia.
Klusov D. V., Design Engineer of the 2nd category, Design Department of the Special Design and Technological Bureau, Rostov Optical and Mechanical Plant PJSC, Rostov, Yaroslavl region, Russia.
Gusachenko A. V., Leading Electronics Engineer, Electronic Systems Department, Design and Technological Institute of Applied Microelectronics, Branch of the Institute of Semiconductor Physics SB RAS, Novosibirsk, Russia.
Novgorodov B. N., Leading Software Engineer, Thermal Imaging and Television Department, KTIPM Branch of the IFP SB RAS, Novosibirsk, Russia; mbo@mail.ru.
Popov G. N., Director of Ekran Experimental Design Bureau, Rostov Optical and Mechanical Plant PJSC, Rostov, Yaroslavl region, Russia.