Munaro, Matteo (2014) Robust perception of humans for mobile robots: RGB-depth algorithms for people tracking, re-identification and action recognition. [Doctoral thesis]

Full text available as:

PDF document - Submitted version
Available under License Creative Commons Attribution Non-commercial Share Alike.

49 MB

Abstract (English)

Human perception is one of the most important skills for a mobile robot sharing its workspace with humans.
This is true not only for navigation, where people must be avoided differently from other obstacles, but also because mobile robots must be able to truly interact with humans.
In the near future, we can imagine that robots will be increasingly present in every house and will perform services useful to the well-being of humans.
For this purpose, robust people tracking algorithms must be exploited, and person re-identification techniques play an important role in allowing robots to recognize a person after a full occlusion or after long periods of time.
Moreover, robots must be able to recognize what humans are doing in order to react accordingly, helping them if needed or learning from them.
This thesis tackles these problems by proposing approaches that combine algorithms based on both RGB and depth information, which can be obtained with recently introduced consumer RGB-D sensors.
Our key contribution to people detection and tracking research is a depth-clustering method that allows a robust image-based people detector to be applied only to a small subset of the possible detection windows, thus decreasing the number of false detections while achieving high computational efficiency.
We also advance person re-identification research by proposing two techniques exploiting depth-based skeletal tracking algorithms: one targets short-term re-identification and creates a compact yet discriminative signature of people by computing features at skeleton keypoints, which are highly repeatable and semantically meaningful; the other extracts long-term features, such as 3D shape, to compare people by matching the corresponding 3D point clouds acquired with an RGB-D sensor. To account for the fact that people are articulated rather than rigid objects, the latter exploits 3D skeleton information to warp people's point clouds to a standard pose, thus making them directly comparable by means of least-squares fitting.
Finally, we describe an extension of flow-based action recognition methods to the RGB-D domain, which computes the motion over time of a person's 3D points by exploiting joint color and depth information and recognizes human actions by classifying gridded descriptors of 3D flow.
A further contribution of this thesis is the creation of a number of new RGB-D datasets that allow different algorithms to be compared on data acquired by consumer RGB-D sensors. All these datasets have been publicly released in order to foster research in these fields.
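
As a rough illustration of the depth-clustering idea summarized above, the following Python sketch groups 3D points into clusters and projects each cluster to an image window, so that an image-based detector would only need to be evaluated on those windows. It is a minimal sketch under stated assumptions, not the method of the thesis: the naive region-growing clustering, the function names `euclidean_clusters` and `candidate_windows`, the radius and minimum cluster size, and the pinhole intrinsics `fx`, `fy`, `cx`, `cy` are all hypothetical choices made for the example.

```python
from collections import deque
from math import dist


def euclidean_clusters(points, radius=0.3, min_size=3):
    """Group 3D points by naive O(n^2) region growing (illustrative only).

    Two points end up in the same cluster when they are linked by a chain
    of neighbours closer than `radius` (assumed to be in metres).
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited if dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        if len(members) >= min_size:  # drop tiny clusters as noise
            clusters.append(sorted(members))
    return clusters


def candidate_windows(points, clusters, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Project each cluster to an image bounding box (u_min, v_min, u_max, v_max).

    Uses a pinhole camera model; the intrinsics are placeholder values.
    """
    windows = []
    for members in clusters:
        us = [fx * points[i][0] / points[i][2] + cx for i in members]
        vs = [fy * points[i][1] / points[i][2] + cy for i in members]
        windows.append((min(us), min(vs), max(us), max(vs)))
    return windows
```

In such a scheme, the image-based detector would be run once per returned window instead of over a dense sliding-window search, which is where the reduction in false detections and computation described in the abstract would come from.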

Abstract (translated from Italian)

One of the most important abilities for a mobile robot operating in an environment populated by people is the capacity to perceive human beings.
This is true not only for navigation, because people must be avoided differently from other obstacles, but also because mobile robots must be able to truly interact with human beings.
In the near future, one can imagine that robots will be increasingly present in every home and will carry out tasks useful to people's well-being.
To this end, robust tracking algorithms are needed, and re-identification techniques play an important role in enabling robots to recognize a person even after a total occlusion or after long periods of time.
Moreover, robots must be able to recognize people's actions in order to react appropriately, helping them if necessary or learning from them.
This thesis addresses these problems by proposing approaches that combine algorithms based on RGB and depth information, which can be obtained with the RGB-D sensors recently introduced on the market.
Our key contribution to research on people detection and tracking is a clustering method based on depth information that allows a robust image-based people detector to be applied only to a small set of the possible detection windows, thereby reducing the number of false alarms and achieving high computational efficiency.
Research on person re-identification is advanced by proposing two techniques that exploit depth-based skeleton tracking algorithms: one is designed for short-term re-identification and creates a compact yet discriminative signature of people by computing features at the skeleton keypoints, which are highly repeatable and semantically meaningful; the other extracts long-term features, such as 3D shape, to compare people on the basis of their 3D point clouds acquired with an RGB-D sensor. To account for the fact that people are not rigid objects but articulated ones, this technique exploits 3D skeleton information to bring people's point clouds to a standard pose that makes them directly comparable through least-squares fitting.
Finally, an extension of optical-flow-based action recognition techniques to the RGB-D domain is described. This extension computes the flow over time of a person's 3D points by jointly exploiting color and depth information, and recognizes human actions by classifying grid descriptors of the 3D flow.
A further contribution of this thesis is the creation of a series of RGB-D datasets that allow different algorithms to be compared on data acquired with consumer RGB-D sensors. All these datasets have been publicly released to foster research in these fields.
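
The gridded descriptors of 3D flow mentioned in the abstract can be pictured with the following Python sketch. It assumes that a 3D flow vector has already been estimated for every point of a person (for example as the displacement between matched points in consecutive frames), lays a regular grid over the bounding box of the points, and concatenates the mean flow of each cell. The function name `gridded_flow_descriptor`, the grid size, and the choice of the mean as the per-cell statistic are assumptions made for illustration; the descriptor actually used in the thesis may differ.

```python
def gridded_flow_descriptor(points, flows, grid=(2, 2, 2)):
    """Concatenate the mean 3D flow of each cell of a grid over the points.

    points: list of (x, y, z) positions of a person's 3D points
    flows:  list of (dx, dy, dz) flow vectors, one per point
    grid:   number of cells along x, y and z (hypothetical default)
    """
    n_cells = grid[0] * grid[1] * grid[2]
    if not points:
        return [0.0] * (3 * n_cells)
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    sums = [[0.0, 0.0, 0.0] for _ in range(n_cells)]
    counts = [0] * n_cells
    for p, f in zip(points, flows):
        idx = []
        for a in range(3):
            extent = hi[a] - lo[a]
            k = int((p[a] - lo[a]) / extent * grid[a]) if extent > 0 else 0
            idx.append(min(k, grid[a] - 1))  # points on the upper face stay in the last cell
        cell = (idx[0] * grid[1] + idx[1]) * grid[2] + idx[2]
        counts[cell] += 1
        for a in range(3):
            sums[cell][a] += f[a]
    descriptor = []
    for s, c in zip(sums, counts):
        # empty cells contribute a zero flow vector
        descriptor.extend(v / c if c else 0.0 for v in s)
    return descriptor
```

A descriptor of this kind has a fixed length (three values per cell), so descriptors computed over a sequence of frames could be fed to any standard classifier to recognize the action.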

EPrint type: Doctoral thesis
Supervisor: Menegatti, Emanuele
Doctoral programme (courses and schools): Cycle 26 > Schools 26 > INFORMATION ENGINEERING > INFORMATION SCIENCE AND TECHNOLOGY
Thesis deposit date: 29 January 2014
Year of publication: 28 January 2014
Keywords (Italian / English): Human perception, RGB-D, Mobile robots, People tracking, Person re-identification, Action recognition, Real time
MIUR scientific-disciplinary sectors: Area 09 - Industrial and information engineering > ING-INF/05 Information processing systems
Reference structure: Departments > Department of Information Engineering
ID code: 6576
Deposited on: 19 May 2015 15:48

Bibliografia

I riferimenti della bibliografia possono essere cercati con Cerca la citazione di AIRE, copiando il titolo dell'articolo (o del libro) e la rivista (se presente) nei campi appositi di "Cerca la Citazione di AIRE".
Le url contenute in alcuni riferimenti sono raggiungibili cliccando sul link alla fine della citazione (Vai!) e tramite Google (Ricerca con Google). Il risultato dipende dalla formattazione della citazione.

[1] A. Alahi, R. Ortiz, and P. Vandergheynst. Freak: Fast retina keypoint. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), pages 510–517, 2012. Cerca con Google

[2] S. Ali and M. Shah. Human action recognition in videos using kinematic features and multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32(2):288–303, 2010. Cerca con Google

[3] M. Bajracharya, B. Moghaddam, A. Howard, S. Brennan, and L. H. Matthies. A fast stereo-based system for detecting and tracking pedestrians from a moving vehicle. In International Journal of Robotics Research, volume 28, pages 1466–1485, 2009. Cerca con Google

[4] G. Ballin, M. Munaro, and E. Menegatti. Human action recognition from rgb-d frames based on real-time 3d optical flow estimation. In A. Chella, R. Pirrone, R. Sorbello, and K. R. J´ohannsd´ottir, editors, Biologically Inspired Cognitive Architectures 2012, volume 196 of Advances in Intelligent Systems and Computing, pages 65–74. Springer Berlin Heidelberg, 2012. Cerca con Google

[5] D. Baltieri, R. Vezzani, R. Cucchiara, A. Utasi, C. Benedek, and T. Sziranyi. Multi-view people surveillance using 3d information. In Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops 2011), pages 1817–1824, 2011. Cerca con Google

[6] I. B. Barbosa, M. Cristani, A. Del Bue, L. Bazzani, and V. Murino. Reidentification with rgb-d sensors. In European Conference on Computer Vision (ECCV) Workshops 2012, pages 433–442. Springer, 2012. Cerca con Google

[7] F. Basso, M. Munaro, S. Michieletto, E. Pagello, and E. Menegatti. Fast and robust multi-people tracking from rgb-d data for a mobile robot. In 12th Intelligent Autonomous Systems Conference (IAS-12), pages 265–276, Jeju Island, Korea, June 2012. Cerca con Google

[8] M. Bauml and R. Stiefelhagen. Evaluation of local features for person reidentification in image sequences. In 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2011), pages 291–296. IEEE, 2011. Cerca con Google

[9] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool. Speeded-up robust features (surf). Computer Vision and Image Understanding, 110(3):346–359, June 2008. Cerca con Google

[10] N. Bellotto and H. Hu. Computationally efficient solutions for tracking people with a mobile robot: an experimental evaluation of bayesian filters. Autonomous Robots, 28:425–438, May 2010. Cerca con Google

[11] K. Berger. The role of rgb-d benchmark datasets: an overview. arXiv preprint arXiv:1310.2053, 2013. Cerca con Google

[12] K. Bernardin and R. Stiefelhagen. Evaluating multiple object tracking performance: the clear mot metrics. Journal of Image Video Processing, 2008:1:1–1:10, January 2008. Cerca con Google

[13] P. J. Besl and N. McKay. A method for registration of 3-d shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, 14:239–256, 1992. Cerca con Google

[14] A. Bhattacharyya. On a measure of divergence between two statistical populations defined by their probability distributions. Bulletin of Calcutta Mathematical Society, 35:99–109, 1943. Cerca con Google

[15] M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri. Actions as spacetime shapes. In Proc. Tenth IEEE Int. Conf. Computer Vision ICCV 2005, volume 2, pages 1395–1402, 2005. Cerca con Google

[16] V. Bloom, D. Makris, and V. Argyriou. G3d: A gaming action dataset and real time action recognition evaluation framework. In 2012 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 7 –12, june 2012. Cerca con Google

[17] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000. Cerca con Google

[18] M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. V. Gool. Robust tracking-by-detection using a detector confidence particle filter. In International Conference on Computer Vision (ICCV) 2009, volume 1, pages 1515–1522, October 2009. Cerca con Google

[19] A. M. Bronstein, M. M. Bronstein, , and R. Kimmel. Three-dimensional face recognition. International Journal of Computer Vision, 64:5–30, 2005. Cerca con Google

[20] A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Topology-invariant similarity of nonrigid shapes. International Journal of Computer Vision, 81:281–301, March 2009. Cerca con Google

[21] M. Calonder, V. Lepetit, C. Strecha, and P. Fua. Brief: binary robust independent elementary features. In Proc. of the 2010 European Conference on Computer Vision (ECCV 2010), pages 778–792. Springer, 2010. Cerca con Google

[22] A. Carballo, A. Ohya, and S. Yuta. Reliable people detection using range and intensity data from multiple layers of laser range finders on a mobile robot. International Journal of Social Robotics, 3(2):167–186, 2011. Cerca con Google

[23] S. Carlsson and J. Sullivan. Action recognition by shape matching to key frames. In IEEE Computer Society Workshop on Models versus Exemplars in Computer Vision, 2001. Cerca con Google

[24] J. Chen, D. Bautembach, and S. Izadi. Scalable real-time volumetric surface reconstruction. ACM Transactions on Graphics (TOG), 32(4):113, 2013. Cerca con Google

[25] D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, and V. Murino. Custom pictorial structures for re-identification. In British Machine Vision Conference, volume 2, page 6, 2011. Cerca con Google

[26] W. Choi, C. Pantofaru, and S. Savarese. Detecting and tracking people using an rgb-d camera via multiple detector fusion. In International Conference on Computer Vision (ICCV) Workshops 2011, pages 1076–1083, 2011. Cerca con Google

[27] W. Choi, C. Pantofaru, and S. Savarese. A general framework for tracking multiple people from a moving camera. Pattern Analysis and Machine Intelligence (PAMI), 35(7):1577–1591, 2012. Cerca con Google

[28] C. Cortes and V. N. Vapnik. Support-vector networks. Machine Learning, 20, 1995. Cerca con Google

[29] C. Dal Mutto, P. Zanuttigh, and G. M. Cortelazzo. Time-of-Flight Cameras and Microsoft Kinect. Springer, 2012. Cerca con Google

[30] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition (CVPR) 2005, volume 1, pages 886–893, June 2005. Cerca con Google

[31] M. Dantone, J. Gall, G. Fanelli, and L. V. Gool. Real-time facial feature detection using conditional regression forests. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2578–2585, 2012. Cerca con Google

[32] P. Dollar, V. Rabaud, G. Cottrell, and S. Belongie. Behavior recognition via sparse spatio-temporal features. In Proc. 2nd Joint IEEE Int Visual Surveillance and Performance Evaluation of Tracking and Surveillance Workshop, pages 65–72, 2005. Cerca con Google

[33] P. Dollar, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: A benchmark. In Computer Vision and Pattern Recognition (CVPR) 2009, pages 304–311, 2009. Cerca con Google

[34] A. Efros, A. Berg, G. Mori, and J. Malik. Recognizing action at a distance. In 9th IEEE International Conference on Computer Vision (ICCV) 2003, pages 726 –733 vol.2, oct. 2003. Cerca con Google

[35] A. Ess, B. Leibe, K. Schindler, and L. Van Gool. A mobile vision system for robust multi-person tracking. In Computer Vision and Pattern Recognition (CVPR) 2008, pages 1–8, 2008. Cerca con Google

[36] A. Ess, B. Leibe, K. Schindler, and L. Van Gool. Moving obstacle detection in highly dynamic scenes. In International Conference on Robotics and Automation (ICRA) 2009, pages 4451–4458, 2009. Cerca con Google

[37] M. Everingham, L. Gool, C. K. Williams, J. Winn, and A. Zisserman. The pascal visual object classes (voc) challenge. International Journal of Computer Vision, 88:303–338, June 2010. Cerca con Google

[38] M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani. Person reidentification by symmetry-driven accumulation of local features. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2360 –2367, june 2010. Cerca con Google

[39] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. Pattern Analysis and Machine Intelligence (PAMI), 32(9):1627–1645, September 2010. Cerca con Google

[40] S. Ghidoni, S. Anzalone, M. Munaro, M. S., and E. Menegatti. A distributed perception infrastructure for robot assisted living. To appear in Robotics and Autonomous Systems (RAS) Journal, 2014. Cerca con Google

[41] H. Grabner and H. Bischof. On-line boosting and vision. In Computer Vision and Pattern Recognition (CVPR), volume 1, pages 260–267. IEEE Computer Society, 2006. Cerca con Google

[42] D. Gray and H. Tao. Viewpoint invariant pedestrian recognition with an ensemble of localized features. In European Conference on Computer Vision, volume 5302, pages 262–275, 2008. Cerca con Google

[43] S. S. H. Jin and A. Yezzi. Multi-view stereo reconstruction of dense shape and complex appearance. International Journal of Computer Vision, 63:175–189, 2005. Cerca con Google

[44] E. Hall. The Hidden Dimension. Anchor books editions, 1966. Cerca con Google

[45] M. Holte and T. Moeslund. View invariant gesture recognition using 3d motion primitives. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2008, pages 797 –800, 31 2008-april 4 2008. Cerca con Google

[46] M. Holte, T. Moeslund, N. Nikolaidis, and I. Pitas. 3d human action recognition for multi-view camera systems. In 2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), pages 342 –349, may 2011. Cerca con Google

[47] L. Hu, S. Jiang, Q. Huang, and W. Gao. People re-detection using adaboost with sift and color correlogram. In Proc. of the 15th International Conference on Image Processing (ICIP 2008), pages 1348–1351, 2008. Cerca con Google

[48] S. Izadi, D. Kim, O. Hilliges, D. Molyneaux, R. Newcombe, P. Kohli, J. Shotton, S. Hodges, D. Freeman, A. Davison, et al. Kinectfusion: real-time 3d reconstruction and interaction using a moving depth camera. In Proceedings of the 24th annual ACM symposium on User interface software and technology, pages 559–568. ACM, 2011. Cerca con Google

[49] A. K. Jain, S. C. Dass, and K. Nandakumar. Can soft biometric traits assist user recognition? Proc. SPIE, Biometric Technology for Human Identification, 5404:561–572, 2004. Cerca con Google

[50] S. Jain. A survey of laser range finding. National Science Foundation, 2003. Cerca con Google

[51] A. Janoch, S. Karayev, Y. Jia, J. Barron, M. Fritz, K. Saenko, and T. Darrell. A category-level 3-D object dataset: putting the Kinect to work. In International Conference on Computer Vision (ICCV) Workshop on Consumer Depth Cameras in Computer Vision, November 2011. Cerca con Google

[52] G. Johansson. Visual perception of biological motion and a model for its analysis. Attention, Perception, & Psychophysics, 14:201–211, 1973. 10.3758/BF03212378. Cerca con Google

[53] K. Jungling and M. Arens. Feature based person detection beyond the visible spectrum. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops 2009), pages 30–37, 2009. Cerca con Google

[54] Y. Ke, R. Sukthankar, and M. Hebert. Efficient visual event detection using volumetric features. In 10th IEEE International Conference on Computer Vision (ICCV) 2005, volume 1, pages 166–173, 2005. Cerca con Google

[55] A. Klaser, M. Marszałek, and C. Schmid. A spatio-temporal descriptor based on 3d-gradients. In British Machine Vision Conference, pages 995–1004, sep 2008. Cerca con Google

[56] P. Konstantinova, A. Udvarev, and T. Semerdjiev. A study of a target tracking algorithm using global nearest neighbor approach. In CompSysTec 2003: e-Learning, pages 290–295. ACM, 2003. Cerca con Google

[57] H. S. Koppula, A. Anand, T. Joachims, and A. Saxena. Semantic labeling of 3d point clouds for indoor scenes. In Neural Information Processing Systems (NIPS), pages 244–252, 2011. Cerca con Google

[58] M. Korner and J. Denzler. Analyzing the subspaces obtained by dimensionality reduction for human action recognition from 3d data. In IEEE 9th International Conference on Advanced Video and Signal-Based Surveillance (AVSS) 2012, pages 130 –135, sept. 2012. Cerca con Google

[59] K. Lai, L. Bo, X. Ren, , and D. Fox. A large-scale hierarchical multi-view rgb-d object dataset. In International Conference on Robotics and Automation (ICRA) 2011, pages 1817–1824, May 2011. Cerca con Google

[60] I. Laptev, B. Caputo, C. Schuldt, and T. Lindeberg. Local velocity-adapted motion events for spatio-temporal recognition. Computer Vision and Image Understanding, 108:207–229, December 2007. Cerca con Google

[61] I. Laptev and T. Lindeberg. Space-time interest points. In 9th IEEE International Computer Vision Conference (ICCV), pages 432–439, 2003. Cerca con Google

[62] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning realistic human actions from movies. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2008, pages 1–8, 2008. Cerca con Google

[63] J. Lei, X. Ren, and D. Fox. Fine-grained kitchen activity recognition using rgb-d. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing, UbiComp ’12, pages 208–211, New York, NY, USA, 2012. ACM. Cerca con Google

[64] T. Leyvand, C. Meekhof, Y.-C. Wei, J. Sun, and B. Guo. Kinect identity: Technology and experience. Computer, 44(4):94 –96, april 2011. Cerca con Google

[65] W. Li, Z. Zhang, and Z. Liu. Action recognition based on a bag of 3d points. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on, pages 9 –14, june 2010. Cerca con Google

[66] J. Liu, S. Ali, and M. Shah. Recognizing human actions using multiple features. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1 –8, june 2008. Cerca con Google

[67] R. Liu, S. Z. Li, X. Yuan, and R. He. Online determination of track loss using template inverse matching. In The Eighth International Workshop on Visual Surveillance - VS2008, Marseille, France, 2008. Graeme Jones and Tieniu Tan and Steve Maybank and Dimitrios Makris. Cerca con Google

[68] D. Lowe. Object recognition from local scale-invariant features. In Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV 1999), volume 2, pages 1150–1157, 1999. Cerca con Google

[69] M. Luber, L. Spinello, and K. O. Arras. People tracking in rgb-d data with online boosted target models. In International Conference On Intelligent Robots and Systems (IROS) 2011, pages 3844–3849, 2011. Cerca con Google

[70] B. Lukas and T. Kanade. An iterative image registration technique with an application to stereo vision. In International Joint Conferences on Artificial Intelligence (IJCAI) 1981, pages 674–679, 1981. Cerca con Google

[71] C. Martin, E. Schaffernicht, A. Scheidig, and H.-M. Gross. Multi-modal sensor fusion using a probabilistic aggregation scheme for people detection and tracking. Robotics and Autonomous Systems, 54(9):721–728, 2006. Cerca con Google

[72] Y. Ming, Q. Ruan, and A. Hauptmann. Activity recognition from rgb-d camera with 3d local spatio-temporal features. In IEEE International Conference on Multimedia and Expo (ICME) 2012, pages 344 –349, july 2012. Cerca con Google

[73] D. Mitzel and B. Leibe. Real-time multi-person tracking with detector assisted structure propagation. In International Conference on Computer Vision (ICCV) Workshops 2011, pages 974–981. IEEE, 2011. Cerca con Google

[74] O. Mozos, R. Kurazume, and T. Hasegawa. Multi-part people detection using 2d range data. International Journal of Social Robotics, 2:31–40, 2010. Cerca con Google

[75] M. Munaro, G. Ballin, S. Michieletto, and E. Menegatti. 3D flow estimation for human action recognition from colored point clouds. Journal on Biologically Inspired Cognitive Architectures, page 4251, 2013. Cerca con Google

[76] M. Munaro, A. Basso, A. Fossati, L. Van Gool, and E. Menegatti. 3d reconstruction of freely moving persons for re-identification with a depth sensor. In IEEE International Conference on Robotics and Automation (ICRA), Hong Kong (China), June 2014. Cerca con Google

[77] M. Munaro, F. Basso, and E. Menegatti. Tracking people within groups with rgb-d data. In Proc. of the International Conference on Intelligent Robots and Systems (IROS), pages 2101–2107, Algarve, Portugal, October 2012. Cerca con Google

[78] M. Munaro, F. Basso, S. Michieletto, E. Pagello, and E. Menegatti. A software architecture for rgb-d people tracking based on ros framework for a mobile robot. In Frontiers of Intelligent Autonomous Systems, volume 466, pages 53–68. Springer, 2013. Cerca con Google

[79] M. Munaro, A. Fossati, A. Basso, E. Menegatti, and E. Van Gool. One-shot person re-identification with a consumer depth camera. In Person Re-Identification, pages 161–181. Springer, 2014. Cerca con Google

[80] M. Munaro, S. Ghidoni, D. Tartaro Dizmen, and E. Menegatti. A feature-based approach to people re-identification using skeleton keypoints. In IEEE International Conference on Robotics and Automation (ICRA), Hong Kong (China). Elsevier, June 2014. Cerca con Google

[81] M. Munaro and E. Menegatti. Fast rgb-d people tracking for service robots. To appear in Autonomous Robots Journal, 2014. Cerca con Google

[82] M. Munaro, S. Michieletto, and E. Menegatti. An evaluation of 3D motion flow and 3D pose estimation for human action recognition. In RSS Workshops: RGB-D: Advanced Reasoning with Depth Cameras, 2013. Cerca con Google

[83] C. L. Naberezny Azevedo. The multivariate normal distribution [online]. http://www.ime.unicamp.br/~cnaber/mvnprop.pdf. Vai! Cerca con Google

[84] L. E. Navarro-Serment, C. Mertz, and M. Hebert. Pedestrian detection and tracking using three-dimensional ladar data. In The International Journal of Robotics Research, Special Issue on the Seventh International Conference on Field and Service Robots, pages 103–112, 2009. Cerca con Google

[85] B. Ni, G.Wang, and P. Moulin. Rgbd-hudaact: A color-depth video database for human daily activity recognition. In IEEE International Conference on Computer VisionWorkshops (ICCVWorkshops), 2011, pages 1147 –1153, nov. 2011. Cerca con Google

[86] J. C. Niebles, H. Wang, and L. Fei-Fei. Unsupervised learning of human action categories using spatial-temporal words. International Journal of Computer Vision, 79:299–318, September 2008. Cerca con Google

[87] D. Ober, S. Neugebauer, and P. Sallee. Training and feature-reduction techniques for human identification using anthropometry. In 4th IEEE International Conference on Biometrics: Theory Applications and Systems (BTAS) 2010, pages 1 –8, sept. 2010. Cerca con Google

[88] F. Ofli, R. Chaudhry, G. Kurillo, R. Vidal, and R. Bajcsy. Sequence of the most informative joints (smij): A new representation for human skeletal action recognition. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, pages 8 –13, june 2012. Cerca con Google

[89] F. Ofli, R. Chaudhry, G. Kurillo, R. Vidal, and R. Bajcsy. Berkeley mhad: A comprehensive multimodal human action database. In Proc. of the IEEE Workshop on Applications on Computer Vision, 2013. Cerca con Google

[90] J. Oliver, A. Albiol, and A. Albiol. 3d descriptor for people re-identification. In Proceedings of the 21st IEEE International Conference on Pattern Recognition (ICPR 2012), pages 1395–1398, 2012. Cerca con Google

[91] C. Pantofaru. The moving people, moving platform dataset. http://bags.willowgarage.com/downloads/people_dataset/. Vai! Cerca con Google

[92] N. Pears, Y. Liu, and P. Bunting. 3D imaging, analysis and applications. Springer, 2012. Cerca con Google

[93] M. Popa, A. Koc, L. Rothkrantz, C. Shan, and P. Wiggers. Kinect sensing of shopping related actions. In undefined, K. Van Laerhoven, and J. Gelissen, editors, Constructing Ambient Intelligence: AmI 2011 Workshops, Amsterdam, Netherlands, 11 2011. Cerca con Google

[94] R. Poppe. A survey on vision-based human action recognition. Image and Vision Computing, 28(6):976–990, June 2010. Cerca con Google

[95] J. Preis, M. Kessel, M. Werner, and C. Linnhoff-Popien. Gait recognition with kinect. In Proceedings of the First Workshop on Kinect in Pervasive Computing, 2012. Cerca con Google

[96] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler, and A. Ng. Ros: an open-source robot operating system. In International Conference on Robotics and Automation (ICRA), 2009. Cerca con Google

[97] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski. Orb: An efficient alternative to sift or surf. In Proc. of the 2011 IEEE International Conference on Computer Vision (ICCV 2011), pages 2564–2571, 2011. Cerca con Google

[98] R. Rusu, N. Blodow, and M. Beetz. Fast point feature histograms (fpfh) for 3d registration. In Proc. of the 2009 International Conference on Robotics and Automation (ICRA 2009), pages 3212–3217, 2009. Cerca con Google

[99] R. Rusu, N. Blodow, Z. Marton, and M. Beetz. Aligning point cloud views using persistent feature histograms. In Proc. of the 2008 International Conference on Intelligent Robots and Systems (IROS 2008), pages 3384–3391, 2008. Cerca con Google

[100] R. B. Rusu. Semantic 3d object maps for everyday manipulation in human living environments, 2010. Cerca con Google

[101] R. B. Rusu, J. Bandouch, F. Meier, I. A. Essa, and M. Beetz. Human action recognition using global point feature histograms and action shapes. Advanced Robotics, 23(14):1873–1908, 2009. Cerca con Google

[102] R. B. Rusu and S. Cousins. 3D is here: Point Cloud Library (PCL). In International Conference on Robotics and Automation (ICRA) 2011, pages 1–4, Shanghai, China, May 9-13 2011. Cerca con Google

[103] J. Satake and J. Miura. Robust stereo-based person detection and tracking for a person following robot. In Workshop on People Detection and Tracking (ICRA 2009), 2009. Cerca con Google

[104] R. Satta, F. Pala, G. Fumera, and F. Roli. Real-time appearance-based person re-identification over multiple Kinect cameras. In International Conference on Computer Vision and Applications (VisApp), 2013. Cerca con Google

[105] C. Schuldt, I. Laptev, and B. Caputo. Recognizing human actions: a local svm approach. In 17th International Conference on Pattern Recognition (ICPR) 2004, volume 3, pages 32–36, 2004. Cerca con Google

[106] P. Scovanner, S. Ali, and M. Shah. A 3-dimensional sift descriptor and its application to action recognition. In Proceedings of the 15th international conference on Multimedia, MULTIMEDIA ’07, pages 357–360, New York, NY, USA, 2007. ACM. Cerca con Google

[107] J. Shotton, A. W. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake. Real-time human pose recognition in parts from single depth images. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1297–1304, 2011. Cerca con Google

[108] N. Silberman and R. Fergus. Indoor scene segmentation using a structured light sensor. In ICCV 2011 - Workshop on 3D Representation and Recognition, pages 601–608, 2011. Cerca con Google

[109] L. Spinello and K. O. Arras. People detection in rgb-d data. In International Conference On Intelligent Robots and Systems (IROS) 2011, pages 3838–3843, 2011. Cerca con Google

[110] L. Spinello, K. O. Arras, R. Triebel, and R. Siegwart. A layered approach to people detection in 3d range data. In Conference on Artificial Intelligence AAAI’10, PGAI Track, Atlanta, USA, 2010. Cerca con Google

[111] L. Spinello, M. Luber, and K. O. Arras. Tracking people in 3d using a bottomup top-down people detector. In International Conference on Robotics and Automation (ICRA) 2011, pages 1304–1310, Shanghai, 2011. Cerca con Google

[112] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers. A benchmark for the evaluation of rgb-d slam systems. In International Conference On Intelligent Robots and Systems (IROS) 2012, pages 573–580, Oct. 2012. Cerca con Google

[113] J. Sung, C. Ponce, B. Selman, and A. Saxena. Human activity detection from rgbd images. In Plan, Activity, and Intent Recognition, 2011. Cerca con Google

[114] J. Sung, C. Ponce, B. Selman, and A. Saxena. Unstructured human activity detection from rgbd images. In International Conference on Robotics and Automation, ICRA, pages 842–849, May 2012. Cerca con Google

[115] H. L. U. Thuc, P. V. Tuan, and J.-N. Hwang. An effective 3d geometric relational feature descriptor for human action recognition. In IEEE RIVF International Conference on Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012, pages 1 –6, 27 2012-march 1 2012. Cerca con Google

[116] F. Tombari, S. Salti, and L. Di Stefano. Unique signatures of histograms for local surface description. In Proc. of the 2010 European Conference on Computer Vision (ECCV 2010), pages 356–369. Springer, 2010.

[117] F. Tombari, S. Salti, and L. Di Stefano. A combined texture-shape descriptor for enhanced 3D feature matching. In Proceedings of the 18th IEEE International Conference on Image Processing (ICIP 2011), pages 809–812. IEEE, 2011.

[118] C. Velardo and J.-L. Dugelay. Improving identification by pruning: A case study on face recognition and body soft biometric. In International Workshop on Image and Audio Analysis for Multimedia Interactive Services, pages 1–4, 2012.

[119] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Computer Vision and Pattern Recognition (CVPR) 2001, volume 1, pages 511–518, 2001.

[120] J. Wang, Z. Liu, Y. Wu, and J. Yuan. Mining actionlet ensemble for action recognition with depth cameras. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), Providence, Rhode Island, pages 1290–1297, June 2012.

[121] W. Li, Z. Zhang, and Z. Liu. Action recognition based on a bag of 3D points. In IEEE International Workshop on CVPR for Human Communicative Behavior Analysis (in conjunction with CVPR 2010), San Francisco, CA, June 2010.

[122] D. Weinland, R. Ronfard, and E. Boyer. Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 104(2):249–257, Nov. 2006.

[123] A. Weiss, D. Hirshberg, and M. Black. Home 3D body scans from noisy image and range data. In IEEE International Conference on Computer Vision (ICCV), pages 1951–1958, 2011.

[124] C. Wolf, J. Mille, E. Lombardi, O. Celiktutan, M. Jiu, M. Baccouche, E. Dellandra, C.-E. Bichot, C. Garcia, and B. Sankur. The LIRIS Human activities dataset and the ICPR 2012 human activities recognition and localization competition. Technical report, LIRIS Laboratory, 2012.

[125] L. Xia, C.-C. Chen, and J. Aggarwal. View invariant human action recognition using histograms of 3D joints. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, pages 20–27, June 2012.

[126] J. Xing, H. Ai, and S. Lao. Multi-object tracking through occlusions by local tracklets filtering and global tracklets association with detection responses. In Computer Vision and Pattern Recognition (CVPR), pages 1200–1207, 2009.

[127] Y. Yacoob and M. Black. Parameterized modeling and recognition of activities. In 6th International Conference on Computer Vision (ICCV), pages 120–127, Jan. 1998.

[128] X. Yang and Y. Tian. Eigenjoints-based action recognition using naive-Bayes nearest-neighbor. In IEEE Workshop on CVPR for Human Activity Understanding from 3D Data, 2012.

[129] A. Yilmaz and M. Shah. Actions sketch: a novel action representation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 984–989, June 2005.

[130] K. Yoon, D. Harwood, and L. Davis. Appearance-based person recognition using color/path-length profile. Journal of Visual Communication and Image Representation (JVCIR 2006), 17(3):605–622, 2006.

[131] H. Zhang and L. E. Parker. 4-dimensional local spatio-temporal features for human activity recognition. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 2044–2049, September 2011.

[132] L. Zhang, Y. Li, and R. Nevatia. Global data association for multi-object tracking using network flows. In Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2008.

[133] Y. Zhao, Z. Liu, L. Yang, and H. Cheng. Combining RGB and depth map features for human activity recognition. In Signal Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific, pages 1–4, Dec. 2012.
