{"id":1449,"date":"2018-03-05T12:08:29","date_gmt":"2018-03-05T02:08:29","guid":{"rendered":"https:\/\/www.cognav.net\/?p=1449"},"modified":"2018-03-05T12:08:29","modified_gmt":"2018-03-05T02:08:29","slug":"how-to-process-panoramic-images-for-scene-recognition","status":"publish","type":"post","link":"https:\/\/braininspirednavigation.com\/?p=1449","title":{"rendered":"How to process panoramic images for scene recognition?"},"content":{"rendered":"<p style=\"text-align: justify;\">This note excerpts the panoramic image processing steps from Zhang (2007).<\/p>\n<p style=\"text-align: justify;\">Zhang, A. M. (2007). <a href=\"https:\/\/pdfs.semanticscholar.org\/ab53\/aba3d1e95b7787a58bed958fde1a7a230e5d.pdf\">Robust appearance based visual route following in large scale outdoor environments<\/a>. Proceedings of the Australasian Conference on Robotics and Automation, Brisbane, Australia, 2007.<\/p>\n<p style=\"text-align: justify;\"><strong>Image Pre-processing<br \/>\n <\/strong><\/p>\n<p style=\"text-align: justify;\">Identical image pre-processing steps are applied to both reference and measurement images. The input colour image is first converted to greyscale (colour information is unstable under changing lighting conditions), then &#8220;un-warped&#8221; (i.e. remapped) onto azimuth-elevation coordinates. Examples of the original colour image and its unwarped greyscale image are shown in Figures 3a and 3b respectively, where the horizontal axis is azimuth and the vertical axis is elevation. The vertical field of view is restricted to [-50 deg, 20 deg].<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces1.png\" alt=\"\" \/><\/p>\n<p style=\"text-align: center;\">Fig. 3: (a) Original colour image. (b) Converted to greyscale and mapped into azimuth-elevation coordinates, where the azimuth-axis is horizontal.
(c) Patch normalised to remove lighting variations, using a neighbourhood of 17 by 17 pixels.<\/p>\n<p style=\"text-align: justify;\">Patch normalisation is then applied to compensate for changes in lighting conditions. It transforms the pixel values as follows:<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces2.png\" alt=\"\" \/> (1)<\/p>\n<p style=\"text-align: justify;\">Where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces3.png\" alt=\"\" \/> and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces4.png\" alt=\"\" \/> are the original and normalised pixels respectively, and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces5.png\" alt=\"\" \/> and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces6.png\" alt=\"\" \/> are the mean and standard deviation of pixel values in a neighbourhood centred around <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces7.png\" alt=\"\" \/>. Figure 3c shows the result of applying patch normalisation to Figure 3b. A neighbourhood size of 17 by 17 pixels has worked well in the experiments.<\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: justify;\"><strong>Image Cross Correlation<br \/>\n <\/strong><\/p>\n<p style=\"text-align: justify;\">This section addresses the problem of measuring the orientation difference between a measurement image and a reference image.<\/p>\n<p style=\"text-align: justify;\">In azimuth-elevation coordinates, the orientation difference between the reference and measurement images is only a shift along the azimuth axis.
This shift is recovered using Image Cross Correlation (ICC), performed efficiently in the Fourier domain. Let <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces8.png\" alt=\"\" \/> denote azimuth and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces9.png\" alt=\"\" \/> elevation. The frontal 180 degree field of view of the reference image serves as the template, i.e. <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces10.png\" alt=\"\" \/>. Let the search range be <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces11.png\" alt=\"\" \/> such that the measurement image is limited to the angular range <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces12.png\" alt=\"\" \/>. Because only a 1D cross-correlation along the azimuth axis is performed, each row in the image is transformed into the Fourier domain separately. The reference image is padded with zeros to the same size as the measurement image. If the measurement image is <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces13.png\" alt=\"\" \/> by <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces14.png\" alt=\"\" \/> pixels, then the Fourier domain image consists of <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces15.png\" alt=\"\" \/> sets of 1D Fourier coefficients, each of a single row.
The algorithmic complexity for a single image is <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces16.png\" alt=\"\" \/>. Convolution in the spatial domain is equivalent to multiplication in the Fourier domain:<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces17.png\" alt=\"\" \/><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces18.png\" alt=\"\" \/><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces19.png\" alt=\"\" \/> (2)<\/p>\n<p style=\"text-align: justify;\">Where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces20.png\" alt=\"\" \/> denotes the Image Cross Correlation (ICC) coefficients, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces21.png\" alt=\"\" \/> and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces22.png\" alt=\"\" \/> are the i-th rows of the reference and measurement images respectively, * is the convolution operator and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces23.png\" alt=\"\" \/> is the Fourier transform operator. Equation 2 states that corresponding rows of the measurement and reference images are multiplied in the Fourier domain. The results are then summed, followed by an inverse Fourier transform to obtain the spatial domain cross-correlation coefficients.
The complexity of the multiplication in the Fourier domain is <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces24.png\" alt=\"\" \/> and that of the inverse Fourier transform is <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces25.png\" alt=\"\" \/>. The Fourier transforms of the reference images are calculated offline after the teaching run and stored. The complexity of a complete ICC is thus <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces26.png\" alt=\"\" \/>, where m is the number of reference images to compare against. This is significantly better than the complexity of ICC performed in the spatial domain, which is <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/03\/030518_0211_Howtoproces27.png\" alt=\"\" \/>. Comparing one measurement image against 11 reference images takes only 2.3 ms on a 2.4 GHz mobile Pentium 4.<\/p>\n<p style=\"text-align: justify;\">For further details, see Zhang (2007).<\/p>\n<p style=\"text-align: justify;\">Zhang, A. M. (2007). <a href=\"https:\/\/pdfs.semanticscholar.org\/ab53\/aba3d1e95b7787a58bed958fde1a7a230e5d.pdf\">Robust appearance based visual route following in large scale outdoor environments<\/a>. Proceedings of the Australasian Conference on Robotics and Automation, Brisbane, Australia, 2007.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This note excerpts the panoramic image processing steps from Zhang (2007). Zhang, A. M. (2007). Robust appearance based visual route following in large scale outdoor environments. Proceedings of the Australasian Conference on Robotics and Automation, Brisbane, Australia, 2007. Image Pre-processing Identical image pre-processing steps are applied to both reference and measurement images.
Input colour [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[126,114,96],"tags":[327,328,326,319],"_links":{"self":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts\/1449"}],"collection":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1449"}],"version-history":[{"count":2,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts\/1449\/revisions"}],"predecessor-version":[{"id":1451,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts\/1449\/revisions\/1451"}],"wp:attachment":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1449"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1449"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}