Virtual Advertising

## Introduction

An important task in image processing and computer vision is finding vanishing lines, which can be used to calibrate a camera. For this project, we looked at vanishing lines in video clips of tennis and soccer games. The aim was to project a banner into the original video clips, transformed so that it looks like it is actually on the field. This means the projection of the static banner image needs to be updated every frame to account for changes in camera position. We first take a closer look at edge detection techniques.

### Edge detection

To detect objects and other interesting variation in an image, we can apply edge detection. There are many methods for this, but most are either search-based or zero-crossing based. Search-based methods find edges by calculating an edge strength (e.g. the gradient magnitude) and then searching for local directional maxima of that strength. Zero-crossing based methods look for zero-crossings (points where the sign of a function changes, from positive to negative or vice versa) in a second-order derivative of the image (e.g. the Laplacian).
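The difference between the two families can be seen on a single image row. The snippet below is a minimal sketch on a made-up 1D signal (a blurred step edge), using plain finite differences rather than any particular library's filters.

```python
# Toy 1D "image row": a smooth dark-to-bright ramp, i.e. a blurred step edge.
# The values are made up for illustration.
signal = [0, 0, 1, 3, 7, 9, 10, 10]

# First derivative (forward differences): d1[i] = signal[i + 1] - signal[i].
d1 = [b - a for a, b in zip(signal, signal[1:])]
# Second derivative: differences of the first derivative.
d2 = [b - a for a, b in zip(d1, d1[1:])]

# Search-based: the edge is where the gradient magnitude peaks.
search_edge = max(range(len(d1)), key=lambda i: abs(d1[i]))

# Zero-crossing based: the edge is where the second derivative changes sign.
zero_crossings = [i for i in range(len(d2) - 1) if d2[i] * d2[i + 1] < 0]

print(search_edge, zero_crossings)
```

Here `d1[3]` is the step from `signal[3]` to `signal[4]`, and the sign change between `d2[2]` and `d2[3]` straddles that same step, so both approaches locate the edge at the same place.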

Two classic edge detection methods are Marr-Hildreth [7] and Canny [4]. Marr-Hildreth, the older of the two, is zero-crossing based and works as follows: first, the image is convolved with the Laplacian of the Gaussian; then zero crossings are located in the filtered result to find the edges. This approach can generate false edges, and depending on your dataset, Canny detection might be a better option. Canny is a search-based method: it also starts with a Gaussian filter, after which the image's intensity gradient is computed. Non-maximum suppression and hysteresis thresholding then produce the final edges. Despite its age, Canny is still widely used; few other classical edge detectors perform significantly better without large drawbacks.
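The last Canny step, hysteresis thresholding, is the least obvious one. A minimal sketch follows; the function name `hysteresis`, the thresholds, and the tiny gradient-magnitude grid are all illustrative assumptions, and the input is assumed to have already gone through non-maximum suppression.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))

    # Grow edges into 8-connected neighbours that pass the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc]
                        and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


# Made-up gradient magnitudes: one strong pixel, a weak tail, one isolated weak pixel.
grad = [
    [0, 0, 0, 0, 0],
    [9, 5, 4, 0, 4],
    [0, 0, 0, 0, 0],
]
result = hysteresis(grad, low=3, high=8)
print(result[1])
```

The weak pixels next to the strong one are kept because they continue the same edge, while the isolated weak pixel at the end of the row is discarded as likely noise.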

Figure 1. Canny edge detection applied to an image

## Strategy

The placement and projection of a virtual advertisement is based on a couple of aspects:

1. The 3D localization of the side lines of the game with respect to the arbitrarily chosen world coordinate system.
2. The position and orientation of the camera with respect to this world coordinate system.
3. The intrinsic parameters of the camera.

Since there were no specific calibration objects in the scene, the court lines of the field and their known lengths are used instead. In computer vision this falls under the umbrella of auto-calibration (self-calibration). See, for instance, Zhang [1] and Orghidan et al. [2].

We chose some sample clips from tennis matches in which the (orthogonal) court lines were sufficiently visible during the panning, rotation, and zooming of the camera. Then, the lines were located using Canny edge detection.

A Hough transform is applied to the resulting binary edge map to obtain an accumulator matrix indexed by the rho and theta values of candidate lines. Rho is the distance from the origin to the line, measured along a vector perpendicular to the line. Theta is the angle of that perpendicular vector in degrees, ranging from -90 to 90. The peaks of the transform are then found, and the line segments are retrieved from the edge map and the Hough output.
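The voting step can be sketched in a few lines. This is a minimal illustration, not the implementation used for the project: the function name `hough_lines` is hypothetical, the accumulator is a sparse dictionary instead of a matrix, the parameterization is assumed to be rho = x cos(theta) + y sin(theta), and the edge map is a synthetic vertical line.

```python
import math


def hough_lines(edge_points, thetas_deg):
    """Vote in (rho, theta) space using rho = x*cos(theta) + y*sin(theta)."""
    accumulator = {}
    for theta in thetas_deg:
        cos_t = math.cos(math.radians(theta))
        sin_t = math.sin(math.radians(theta))
        for x, y in edge_points:
            rho = round(x * cos_t + y * sin_t)  # quantise rho to 1-pixel bins
            key = (rho, theta)
            accumulator[key] = accumulator.get(key, 0) + 1
    return accumulator


# Synthetic edge map: a vertical line at x = 7 spanning 60 rows.
edge_points = [(7, y) for y in range(60)]

# Theta in whole degrees from -90 up to (but not including) 90.
acc = hough_lines(edge_points, range(-90, 90))

# The strongest peak corresponds to the dominant line in the edge map.
(peak_rho, peak_theta), votes = max(acc.items(), key=lambda kv: kv[1])
print(peak_rho, peak_theta, votes)
```

Every edge pixel votes for all lines that could pass through it, so pixels lying on a common line pile up in a single (rho, theta) bin. For the vertical line above, that bin is rho = 7, theta = 0.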

Gap filling and maximum length for line segments are implemented to restrict this selection. Lines are tracked across frames, and the parameters of the detected lines are then used to calibrate the camera. Using the calibration parameters, we can then project a virtual banner near a court line. Figure 2 shows this line tracking and advertisement placement for the tennis video.
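For a banner lying flat on the court, the projection step can be illustrated with a plane-to-image homography. Note that this is a simplified stand-in for the full camera calibration described above, not the method used in the project: it assumes four court corners have already been located in the frame, and the helper names, pixel coordinates, and banner position are all made up for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def homography(src, dst):
    """Estimate H (3x3, with h33 fixed to 1) mapping four src points to four dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(H, point):
    """Map a 2D point through H, including the perspective division."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Court corners in metres (a doubles tennis court is 10.97 m by 23.77 m).
court = [(0.0, 0.0), (10.97, 0.0), (10.97, 23.77), (0.0, 23.77)]
# Hypothetical pixel positions of those corners in one video frame.
image = [(300.0, 600.0), (980.0, 600.0), (820.0, 200.0), (460.0, 200.0)]

H = homography(court, image)

# Hypothetical banner rectangle on the ground, just outside the left side line.
banner_court = [(-3.0, 5.0), (-1.0, 5.0), (-1.0, 11.0), (-3.0, 11.0)]
banner_px = [apply_h(H, corner) for corner in banner_court]
print(banner_px)
```

Because the camera pans and zooms, the corner positions change from frame to frame, so H would have to be re-estimated per frame from the tracked lines. Upright banners are not covered by a ground-plane homography and need the full camera model.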

Figure 2. Line tracking and ad placement in a tennis match

We then applied this method to other videos, including one of a football match. Due to the larger variety in camera changes (zooming, large movements) in football matches versus tennis matches, this was a harder case. Figure 3 shows some results.

Figure 3. Visualization of the placement of 8 banners. The edge map is shown on the left side, and the final image with banner projections is shown on the right side. The top four are vertical banners; the four placed on the field are horizontal banners angled at 60 degrees.

## References

[1] Z. Zhang, "Camera Calibration", in Emerging Topics in Computer Vision, G. Medioni and S. B. Kang, Eds. Prentice Hall Professional Technical Reference, 2004, pp. 4-43.

[2] R. Orghidan, J. Salvi, M. Gordan, and B. Orza, "Camera calibration using two or three vanishing points", Computer Science and Information Systems (FedCSIS), 2012 Federated Conference on, 2012, pp. 123-130.

[3] Beardsley, Paul, and David Murray. "Camera calibration using vanishing points." BMVC92. Springer London, 1992. 416-425.

[4] Canny, John. "A computational approach to edge detection." Pattern Analysis and Machine Intelligence, IEEE Transactions on 6 (1986): 679-698.

[5] Ding, Lijun, and Ardeshir Goshtasby. "On the Canny edge detector." Pattern Recognition 34.3 (2001): 721-725.

[6] Lv, Fengjun, Tao Zhao, and Ram Nevatia. "Self-calibration of a camera from video of a walking human." Pattern Recognition, 2002. Proceedings. 16th International Conference on. Vol. 1. IEEE, 2002.

[7] Marr, David, and Ellen Hildreth. "Theory of edge detection." Proceedings of the Royal Society of London B: Biological Sciences 207.1167 (1980): 187-217.

[8] Sharifi, Mohsen, Mahmoud Fathy, and Maryam Tayefeh Mahmoudi. "A classified and comparative study of edge detection algorithms." Information Technology: Coding and Computing, 2002. Proceedings. International Conference on. IEEE, 2002.