{"id":623,"date":"2018-01-04T15:17:55","date_gmt":"2018-01-04T05:17:55","guid":{"rendered":"https:\/\/www.cognav.net\/?p=623"},"modified":"2018-01-04T15:33:27","modified_gmt":"2018-01-04T05:33:27","slug":"how-to-implement-the-visual-processing-module-for-pose-calibration-in-ratslam","status":"publish","type":"post","link":"https:\/\/braininspirednavigation.com\/?p=623","title":{"rendered":"How to implement the visual processing module for pose calibration in RatSLAM?"},"content":{"rendered":"<p><span style=\"font-size: 12pt; color: #000000;\">This report summarizes key methods for the visual processing module in RatSLAM and RatSLAM-based systems. Several approaches exist; six are described below. Based on comparisons and some practical experiments, I think that the intensity scanline profile and feature matching approaches are good choices for the visual processing module.<\/span><\/p>\n<hr \/>\n<h4><span style=\"background-color: white; font-size: 12pt; color: #000000;\">Approach 1: Features based visual template matching <\/span><\/h4>\n<p><span style=\"font-size: 12pt; color: #000000;\">Step 1: Feature extraction<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">Step 2: Feature matching<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0Compare the current image with all visual templates<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 If any features match<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0Compute the distances, then take the minimum distance.<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0If the minimum distance is more than the vt_threshold,<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0add a new 
template and associate it with the current direction<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 Else<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0treat the image as a familiar scene<\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">End<\/span><\/p>\n<p style=\"text-align: justify;\">\u00a0<\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\"><strong>Biologically inspired visual landmark processing for simultaneous localization and mapping <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">(Feature extraction and template matching) <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Prasser, D. P., et al. 2004 proposed a method, based loosely on biological principles, that uses layers of filtering and pooling to create learned templates corresponding to different views of the environment. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Rather than using a set of landmarks and reporting range and bearing to each landmark, this system maps views to poses. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The challenge is to build a system that produces the same view for small changes in robot pose, but different views for larger changes in pose. 
<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The complex cells of the visual cortex are generally considered to detect or respond to edges or bars at a particular orientation within a region of the retina, which is known as the cell&#8217;s receptive field. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Vision information is converted into a local view representation which, if familiar, injects activity into the particular pose cells that are associated with that specific local view. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Significantly, the local view contains no explicit spatial information such as distance and bearing to a landmark; instead, RatSLAM learns to associate visual appearance with different poses. The approach has two levels: a biologically motivated feature extraction of the image using the complex cells, followed by a primitive scene recognition stage. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The units in the local view are controlled by a two-stage computer vision system. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The first stage uses a complex cell model to extract features from the image. These features are then used to represent the image. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The second stage uses a sum of absolute differences metric to compare the output of the complex cells against previously learnt templates. 
Each template has a corresponding local view unit which is activated when the input matches the template. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\"><strong>A. Feature Extraction <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The complex cell model used in this method is based on the first layer of D. H. Hubel (1988). In this model the input image is first normalized and then convolved with a number of odd Gabor filters to produce edge-detected images. Each of these is then passed through a winner-takes-most mechanism across the orientation dimension. In the current implementation only two orientations are used for the complex cell filters: horizontal and vertical. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\"><strong>B. Template Matching <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">The output of the complex cell filters is the input to a sum of absolute differences (SAD) module. The SAD module operates as a learning system that compares the input against a previously learned set of templates. When the minimum distance between the input and every template in the learnt set is larger than a threshold <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem1.png\" alt=\"\" \/>, the input is added to the set as a new template. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\"><span style=\"font-family: Arial; background-color: white;\">The output of the SAD module is a linear vector of cells, each of which corresponds to one particular template. 
<\/span>The activity level of a cell,<img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem2.png\" alt=\"\" \/>, varies between 0 and 1, and is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem3.png\" alt=\"\" \/>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0\u00a0 (1)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where<img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem4.png\" alt=\"\" \/>is introduced to avoid infinite activation levels when the distance is zero, and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem5.png\" alt=\"\" \/>is a distance threshold. 
The distance, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem6.png\" alt=\"\" \/>, is the sum of absolute differences between pixel intensity values in the template and the current image and is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem7.png\" alt=\"\" \/>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0(2)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem8.png\" alt=\"\" \/> is the intensity value of the <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem9.png\" alt=\"\" \/> pixel in the visual template associated with cell <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem10.png\" alt=\"\" \/>, and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem11.png\" alt=\"\" \/> is the value for the <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem12.png\" alt=\"\" \/> pixel in the current camera image. The total activity level in the set of local view cells is normalised to unity.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\"><span style=\"font-family: Arial; background-color: white;\">Finally, the cell activation is normalized to unity over all of the LV cells. 
By using this representation the SAD module can respond to uncertain matches by weakly activating two or more cells rather than just signalling the most correct template. The performance of the SAD module is chiefly controlled by the <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem13.png\" alt=\"\" \/><\/span> parameter. Large values for this parameter will result in a small number of more ambiguous templates being found, whereas small values will produce many highly specific templates.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">At several points in the environment the complex cell outputs can be all zero or very near zero. This causes a template to form which represents seeing no features. The frequency with which this template is found means that this template contributes no useful information to the RatSLAM system and does not help in localization. To prevent this, there is a special template that corresponds to the case of no visual input. This template has no corresponding LV unit and is never linked to any pose cells.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The vision system implements a crude form of expectation within the template matching process. In an ideal situation the order in which templates are added to the set of learned templates will correspond roughly to the relative physical position of the templates. The system uses the best template match from the previous frame, template b, as the input to an expectation system which suppresses templates that are not nearby in template index. This means the pose cell system now needs two frames before it can start to recover from global kidnapping.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Prasser, D. 
P., Gordon Fraser Wyeth, and M. J. Milford. &#8220;Biologically inspired visual landmark processing for simultaneous localization and mapping.&#8221; In Intelligent Robots and Systems, 2004 (IROS 2004). Proceedings. 2004 IEEE\/RSJ International Conference on, vol. 1, pp. 730-735. IEEE, 2004.<\/span><\/p>\n<hr \/>\n<h4><span style=\"background-color: white; font-size: 12pt; color: #000000;\">Approach 2: Intensity Scanline Profiles based visual template matching <\/span><\/h4>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">David Ball, et al. 2013 use intensity scanline profile based visual template matching in OpenRatSLAM. <\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem14.png\" alt=\"\" \/><\/span><\/p>\n<p><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Place recognition using view templates for the iRat dataset. The bottom half of the image is discarded by cropping, as it contains perceptually ambiguous black floor. The Local View node compares the cropped, sub-sampled and grayscale current view to all of the stored view templates to find the best match. The delta operator indicates that comparisons are made while shifting the current view and visual templates relative to each other. The result is the currently active view template, which may be a new view template. <\/span><\/p>\n<p><span style=\"font-family: 'Albertus Md'; font-size: 12pt; color: #000000;\"><strong>Sum of Absolute Differences Module <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The 96-pixel images are processed by a Sum of Absolute Differences (SAD) module that produces the local view (Prasser, Wyeth et al. 2004). 
The SAD module compares each image with its repository of stored template images. New images that are sufficiently similar to template images are recognised as such, while significantly different images are learned as new image templates and added to the repository.<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem15.png\" alt=\"\" \/><\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\">High resolution and down-sampled greyscale camera images. The low resolution 12 \u00d7 8 pixel images are used by the RatSLAM system.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Using this vision module, the local view consists of a one-dimensional array of cells, with each cell corresponding to a particular image template. The activity level of a cell, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem16.png\" alt=\"\" \/>, varies between 0 and 1, and is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem17.png\" alt=\"\" \/> (1)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem18.png\" alt=\"\" \/> is introduced to avoid infinite activation levels when the distance is zero, and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem19.png\" alt=\"\" \/> is a distance threshold. 
The distance,<img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem20.png\" alt=\"\" \/>, is the sum of absolute differences between pixel intensity values in the template and the current image and is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem21.png\" alt=\"\" \/> (2)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem22.png\" alt=\"\" \/>is the intensity value of the <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem23.png\" alt=\"\" \/>pixel in the visual template associated with cell <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem24.png\" alt=\"\" \/>, and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem25.png\" alt=\"\" \/> is the value for the<img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem26.png\" alt=\"\" \/> pixel in the current camera image. The total activity level in the set of local view cells is normalised to unity.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The SAD vision processing method suited the indoor environments and robot movement scheme. 
The corridor and wall following movement behaviours and the small field of view of the forward facing camera ensured that the robot only learned forward and backward representations of large portions of the environment, rather than complete panoramic visual representations. The advantage of the scheme is its simplicity \u2013 it is a low resolution, appearance based method that requires only consistency in the visual appearance of the environment. There is no geometric processing of scenes and no feature extraction or tracking. The disadvantages of the scheme are its sensitivity to illumination variation and its inability to recognise new images as being rotated versions of stored template images. These issues were addressed in outdoor environments by the second visual processing method, Image Histograms Matching.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Ball, David, Scott Heath, Janet Wiles, Gordon Wyeth, Peter Corke, and Michael Milford. &#8220;OpenRatSLAM: an open source brain-based SLAM system.&#8221; Autonomous Robots 34, no. 3 (2013): 149-176.<\/span><\/p>\n<hr \/>\n<h4><span style=\"background-color: white; font-size: 12pt; color: #000000;\">Approach 3: ORB based visual template matching <\/span><\/h4>\n<p><span style=\"font-size: 12pt; color: #000000;\"><span style=\"font-family: Arial; background-color: white;\">Zhou, Sun-Chun<\/span>, et al. 2017 proposed an ORB-based method that extracts features from images to form visual templates. When the current image matches a prior visual template, the robot is considered to have reached this place previously. 
Otherwise, a new visual template is added to the local view cells.<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem27.png\" alt=\"\" \/><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Template matching process for local view cells. ORB features are extracted from environment scenes to form a visual template, which is compared against all visual templates associated with local view cells <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem28.png\" alt=\"\" \/>. When the current visual template matches a prior template, the associated local view cell <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem29.png\" alt=\"\" \/> fires and injects energy into the pose cells. Otherwise, a new local view cell <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem30.png\" alt=\"\" \/> is added to <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem31.png\" alt=\"\" \/>.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Zhou, Sun-Chun, Rui Yan, Jia-Xin Li, Ying-Ke Chen, and Huajin Tang. &#8220;A brain-inspired SLAM system based on ORB features.&#8221;\u00a0<em>International Journal of Automation and Computing<\/em> 14, no. 
5 (2017): 564-575.<\/span><\/p>\n<hr \/>\n<h4><span style=\"background-color: white; font-size: 12pt; color: #000000;\">Approach 4: RGB-D based visual template matching (Intensity Scanline Profiles) <\/span><\/h4>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Tian, Bo<span style=\"font-family: Arial; background-color: white;\">, et al. 2013 proposed an RGB-D based visual template matching approach. The main idea of the visual processing method in Tian&#8217;s work, which is based on RGB-D information, is shown in the figure below. Intensity profiles of neighbouring environment scenes are first extracted. These profiles are then processed by a sum of absolute differences (SAD) method to get the distance <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem32.png\" alt=\"\" \/>. This distance is the sum of absolute differences between pixel values in these intensity profiles. The distances <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem33.png\" alt=\"\" \/> from the RGB and depth frames are finally combined using different weights to construct the distance <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem34.png\" alt=\"\" \/>. In short, the method extracts one-dimensional intensity profiles from both RGB and depth images and uses them to calculate the distance between the current image and the visual templates recorded in local view cells. Although simple and efficient, this method discards much information about the physical environment. 
<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem35.png\" alt=\"\" \/><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">A pair of neighbouring RGB-D frames is shown. The top row is RGB information. The bottom row is depth information. Both RGB and depth information are captured simultaneously. Intensity profiles of neighbouring environment scenes are then extracted. These intensity profiles are processed by a sum of absolute differences to get a distance <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem36.png\" alt=\"\" \/>. Distances <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem37.png\" alt=\"\" \/> from RGB and depth images are weighted and combined into <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem38.png\" alt=\"\" \/>, which is used to distinguish different scenes.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Zhou, Sun-Chun, Rui Yan, Jia-Xin Li, Ying-Ke Chen, and Huajin Tang. &#8220;A brain-inspired SLAM system based on ORB features.&#8221;\u00a0<em>International Journal of Automation and Computing<\/em> 14, no. 5 (2017): 564-575. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Tian, Bo, Vui Ann Shim, Miaolong Yuan, Chithra Srinivasan, Huajin Tang, and Haizhou Li. &#8220;RGB-D based cognitive map building and navigation.&#8221; In Intelligent Robots and Systems (IROS), 2013 IEEE\/RSJ International Conference on, pp. 1562-1567. 
IEEE, 2013.<\/span><\/p>\n<hr \/>\n<h4><span style=\"background-color: white; font-size: 12pt; color: #000000;\">Approach 5: Cylinder Landmarks based visual template matching <\/span><\/h4>\n<p><span style=\"font-size: 12pt; color: #000000;\"><strong>Cylinder Recognition System <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Prasser, et al. 2003 propose a cylinder landmarks based visual place recognition approach. The vision problem was simplified by using a system of artificial visual landmarks. Coloured sheets of paper were rolled into 210 mm tall cylinders with a diameter of 90 mm. Cylinder colour schemes were split between the upper and lower halves of the cylinder. Four different colours \u2013 red, green, blue, and magenta \u2013 provided a total of 16 types of unique landmark. The vision system could report the bearing, range to and colour of cylinders within its field of vision, along with an associated uncertainty value for each reading (Prasser and Wyeth 2003). Cylinders were consistently visible at ranges between one and three metres, with a distance uncertainty of about 10%.<\/span><\/p>\n<p><span style=\"color: #000000; font-family: Arial; font-size: 12pt; background-color: white;\">Prasser, D. and Wyeth, G., 2003, September. Probabilistic visual recognition of artificial landmarks for simultaneous localization and mapping. In\u00a0<em>Robotics and Automation, 2003. Proceedings. ICRA&#8217;03. IEEE International Conference on<\/em>\u00a0(Vol. 1, pp. 1291-1296). 
IEEE.<\/span><\/p>\n<hr \/>\n<h4><span style=\"background-color: white; font-size: 12pt; color: #000000;\">Approach 6: Image Histograms based visual template matching <\/span><\/h4>\n<p style=\"text-align: justify;\"><span style=\"font-family: 'Albertus Md'; font-size: 12pt; color: #000000;\"><strong>Image Histograms Matching <\/strong><\/span><\/p>\n<p><span style=\"font-size: 12pt; color: #000000;\">Milford MJ proposed a histogram matching system for generating the local view in outdoor environments. Outdoor experiments were performed on a robot platform with a panoramic camera. The larger field of view and greater variations in illumination required a more sophisticated vision processing system than the SAD module. Histograms are invariant to rotations about the camera axis, allowing recognition of familiar places with novel robot orientations. The red-green-blue colour space was used because it is more robust to changes in illumination in the testing environments than the camera&#8217;s native YUV colour space (Tews, Robert et al. 2005).<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem39.png\" alt=\"\" \/><\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\">Region of the panoramic image used to generate histograms<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Only part of the camera&#8217;s panoramic image is used to produce an image histogram, as shown in the figure above. Other areas of the image contain no useful information. After conversion from YUV to a normalised red-green-blue colour space, the image was used to calculate a 32 \u00d7 32 two-dimensional histogram of the red-green components. 
From a local view perspective, these histograms serve the same purpose as the 12 \u00d7 8 image templates described in the previous section.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">As the robot moves around the environment, the current image histogram is either matched to an already learned histogram or added to the reference set. The distance between two histograms,<img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem40.png\" alt=\"\" \/>, is calculated using a modified <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem41.png\" alt=\"\" \/> statistic:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem42.png\" alt=\"\" \/>(3)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem43.png\" alt=\"\" \/>is the reference histogram and c is the current histogram. The j subscript refers to the bin number within each histogram.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Each reference histogram has an associated local view cell that is activated by similar new histograms. New local view cells are created for each new reference histogram. 
The activity level of a local view cell, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem44.png\" alt=\"\" \/>, is calculated using the <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem45.png\" alt=\"\" \/>distance, and normalised:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem46.png\" alt=\"\" \/> (4)<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem47.png\" alt=\"\" \/><\/span><\/p>\n<p><span style=\"font-family: 'Albertus Md'; font-size: 12pt; color: #000000;\">The constant <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem48.png\" alt=\"\" \/>is a distance threshold beyond which a cell is inactive. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">P 103 ~ P111 in Michael&#8217;s Book, 2008<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Milford MJ. Robot navigation from nature: Simultaneous localisation, mapping, and path planning based on hippocampal models. 
Springer Science &amp; Business Media; 2008 Feb 11.<\/span><\/p>\n<hr \/>\n<h4><span style=\"font-size: 12pt; color: #000000;\">Some extended content<\/span><\/h4>\n<p><span style=\"font-family: 'Albertus Md'; font-size: 12pt; color: #000000;\"><strong>Extracting Orientation Information <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The approach that was used incorporates a compass to provide robot orientation information. The compass is assumed only to be locally consistent rather than globally accurate. Other methods of determining relative orientation, such as feature tracking in the panoramic images, would also be suitable under this approach.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Upon creation, each local view cell stores not only the image template but also the compass orientation. The compass orientation when each image template <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem49.png\" alt=\"\" \/> is first stored becomes the reference orientation <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem50.png\" alt=\"\" \/>. 
When the robot passes through the same environment location again, the difference in orientation, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem51.png\" alt=\"\" \/>, is calculated and converted into the discrete heading space of the pose cells:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem52.png\" alt=\"\" \/> (5)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem53.png\" alt=\"\" \/> is the relative orientation, \u03b1 is the current compass orientation, and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem54.png\" alt=\"\" \/> is the angular width of a pose cell. <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem55.png\" alt=\"\" \/> has a value of 10 since each pose cell initially represents 10 degrees of orientation.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Eqs. 8 and 9 were adjusted to incorporate the extra relative orientation information stored with each local view cell. 
Equation 8 became:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem56.png\" alt=\"\" \/> (6)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">This equation reinforces the association between the reference orientation version of the histogram template and the pose cells that were active the first time it was seen. Multiple viewings of an image always reinforce the original template association, regardless of the orientation of the robot each time.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">To re-localise the robot, active local view cells inject energy into the pose cells. Which pose cells are activated depends on the relative orientation <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem57.png\" alt=\"\" \/> of the current image, which determines the target of the LV-PC links. 
The change in pose cell activation due to visual input, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem58.png\" alt=\"\" \/>, is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem59.png\" alt=\"\" \/> (7)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem60.png\" alt=\"\" \/> subscript shifts the activity injection point within the pose cells to adjust for the shift in orientation between the original vision template and the current image.<\/span><\/p>\n<p style=\"text-align: justify;\">\u00a0<\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: 'Albertus Md'; font-size: 12pt; color: #000000;\"><strong>Learning Visual Scenes<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Visual scenes are associated with the robot&#8217;s believed position by a learning function which increases the connection strengths between co-activated local view and pose cells. 
The updated connection strength, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem61.png\" alt=\"\" \/>, is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem62.png\" alt=\"\" \/> (8)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">where <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem63.png\" alt=\"\" \/> is the activity level of the local view cell and <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem64.png\" alt=\"\" \/> is the activity level of the pose cell. These links form the pose-view map.<\/span><\/p>\n<p style=\"text-align: justify;\">\u00a0<\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: 'Albertus Md'; font-size: 12pt; color: #000000;\"><strong>Re-localising Using Familiar Visual Scenes <\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The pose-view map can be used to inject activity into the pose cells, which is the mechanism that RatSLAM uses to maintain or correct its believed pose. Active local view cells project their activity into the pose cells to which they are associated, by an amount proportional to the association strength. 
The change in pose cell activity, <img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem65.png\" alt=\"\" \/>, is given by:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem66.png\" alt=\"\" \/> (9)<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">The learning method that RatSLAM uses to build the pose-view map cannot usefully associate raw camera data with pose cells. The data must be processed to reduce the dimensionality of the camera image while preserving distinctive information. The single layer learning mechanism between the local view and pose cells works best when the local view is a sparse feature vector, as this avoids problems with linearly inseparable inputs. To constrain the local view structure to a practical number of cells, the vision processing system should perform spatial generalisation; activity in the local view cells should not change significantly for small changes in robot pose.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Source: pp. 103&#8211;111 in Michael Milford&#8217;s book, 2008<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 12pt; color: #000000;\">Milford MJ. Robot navigation from nature: Simultaneous localisation, mapping, and path planning based on hippocampal models. 
Springer Science &amp; Business Media; 2008 Feb 11.<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\"><img decoding=\"async\" src=\"https:\/\/www.braininspirednavigation.com\/wp-content\/uploads\/2018\/01\/010418_0526_Howtoimplem67.png\" alt=\"\" \/><\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\">Visual Data Association<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-size: 12pt; color: #000000;\">Gordon Wyeth, Michael Milford, Will Maddern. From Rats to Robots: Bio-inspired Localization and Navigation.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this report, I summarized some key methods for visual processing module in RatSLAM or RatSLAM-based System. There are more than six approaches as following. By comparing and doing some practical experiments, I think that the intensity scanline profile and feature matching approaches are good for the visual processing module. 
Approach 1: Features based visual [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[114,96],"tags":[100,213,215,115,214,216,212],"_links":{"self":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts\/623"}],"collection":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=623"}],"version-history":[{"count":7,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts\/623\/revisions"}],"predecessor-version":[{"id":698,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=\/wp\/v2\/posts\/623\/revisions\/698"}],"wp:attachment":[{"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/braininspirednavigation.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=623"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}