Prof. João Gama (photo: Portal DeGóis)
The Instituto de Ciências Matemáticas e de Computação (ICMC, Institute of Mathematical and Computer Sciences) at USP São Carlos is offering the outreach course Learning from Distributed Data Streams, to be taught by João Manuel Portela da Gama, a researcher at the University of Porto, Portugal. The in-person course, with a total of eight hours, will take place in the afternoons from October 15 to 17, in room 5-004.
The aim is to present the state of the art in data streams and to discuss the problems and challenges in this area. The focus will be on the processing of distributed data streams and on current techniques for change detection, clustering, classification, and frequent pattern mining. The course is intended for students and researchers with basic knowledge of data mining and machine learning concepts and techniques, and for anyone else interested.
Registration is free and must be completed by Thursday, October 11, through an online form (link: http://goo.gl/wUDzg). Places are limited and will be filled on a first-come, first-served basis.
Course outline
1- Introduction to Data Streams
• Data Stream Models
• Basic Streaming Methods: windowing, sampling and summarization
• Illustrative Examples of Distributed Streaming Methods
2- Change Detection
• Introduction to Change Detection
• Characterization of Drift Detection Methods
• Illustrative Algorithms for Change Detection
3- Learning Descriptive Models from Data Streams
• Introduction to Clustering from Data Streams
• Micro/Macro Clustering
• Distributed Clustering of Sensor Network data
• Distributed Clustering of Streaming Data Sources
4- Learning Predictive Models from Data Streams
• Decision Trees and Rules from Data Streams
• Characteristics of VFDT-like Algorithms
• Neural Networks and Data Streams
5- Frequent Pattern Mining
• Introduction to Frequent Pattern Mining
• Introduction to Frequent Pattern Mining in Data Streams
• Illustrative Algorithms for Frequent Pattern Stream Mining
6- Future Directions
• Open Issues
• Potential Applications
References
1. Gama, J. Knowledge Discovery from Data Streams, CRC Press, 2010.
2. Gama, J., and Gaber, M. M. (Eds.), Learning from Data Streams: Processing Techniques in Sensor Networks, Springer Verlag, 2007.
3. C. C. Aggarwal (Ed), Data Streams: Models and Algorithms (Advances in Database Systems), Springer 2007.
4. B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and Issues in Data Stream Systems, in Proceedings of PODS, 2002.
5. P. Domingos and G. Hulten. Mining High-Speed Data Streams. In Proceedings of the Association for Computing Machinery Sixth International Conference on Knowledge Discovery and Data Mining, pages 71--80, 2000.
6. P. Domingos and G. Hulten, A General Method for Scaling Up Machine Learning Algorithms and its Application to Clustering, Proceedings of the Eighteenth International Conference on Machine Learning, 2001, 106--113, Williamstown, MA, Morgan Kaufmann.
7. Y. Dora Cai, D. Clutter, G. Pape, J. Han, M. Welge, L. Auvil. MAIDS: Mining Alarming Incidents from Data Streams, Proceedings of the 23rd ACM SIGMOD (International Conference on Management of Data), June 13-18, 2004, Paris, France.
8. M M Gaber, A B. Zaslavsky, S Krishnaswamy: Mining data streams: a review. SIGMOD Record 34(2): 18-26 (2005)
9. J. Gama, R. Rocha and P. Medas, Accurate Decision Trees for Mining High-Speed Data Streams, Proceedings of the Ninth International Conference on Knowledge Discovery and Data Mining, Edited by P. Domingos and C. Faloutsos, ACM Press, 2003.
10. P.P. Rodrigues and J. Gama: A system for analysis and prediction of electricity-load streams. Intelligent Data Analysis 13(3): 477-496 (2009)
11. P.D. Haghighi, A.B. Zaslavsky, S. Krishnaswamy, M.M. Gaber, S.W. Loke: Context-aware adaptive data stream mining. Intelligent Data Analysis 13(3):423-434 (2009)
12. O. Horovitz, S Krishnaswamy, M M Gaber, A fuzzy approach for interpretation of ubiquitous data stream clustering and its application in road safety. Intell. Data Anal. 11(1): 89-108 (2007)
13. G. Hulten, L. Spencer, and P. Domingos, Mining Time-Changing Data Streams. ACM SIGKDD 2001.
14. Kargupta, H., Park, B., Pittie, S., Liu, L., Kushraj, D. and Sarkar, K. (2002). MobiMine: Monitoring the Stock Market from a PDA. ACM SIGKDD Explorations. January 2002, Volume 3, Issue 2, Pages 37-46. ACM Press.
15. H Kargupta, R Bhargava, K Liu, M Powers, P Blair, S Bushra, J Dull, K Sarkar, M Klein, M Vasa, and D Handy, VEDAS: A Mobile and Distributed Data Stream Mining System for Real-Time Vehicle Monitoring, Proceedings of SIAM International Conference on Data Mining 2004.
16. E. Keogh, J. Lin and A. Fu, HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence. In Proc. of the 5th IEEE International Conference on Data Mining (ICDM 2005), pp. 226 - 233, Houston, Texas, Nov 27-30, 2005.
17. S. K. Chong, S. Krishnaswamy, S. W. Loke, M. M. Gaber, Using association rules for energy conservation in wireless sensor networks. SAC 2008: 971-975
18. J. Lin, E. Keogh, S. Lonardi, and B. Chiu, A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. In proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. San Diego, CA. June 13, 2003.
19. S. Muthukrishnan, Data streams: Algorithms and Applications, Proceedings of the fourteenth annual ACM-SIAM symposium on discrete algorithms, 2003
20. P P Rodrigues, J Gama, J P Pedroso: Hierarchical Clustering of Time-Series Data Streams. IEEE Trans. Knowl. Data Eng. 20(5): 615-627 (2008)
21. J. Gama, P.P. Rodrigues, L. Lopes: Clustering distributed sensor data streams using local processing and reduced communication. Intell. Data Anal. 15(1): 3-28 (2011)
22. P.P. Rodrigues, J. Gama, J. Araújo, L. Lopes: L2GClust: local-to-global clustering of stream sources. SAC 2011: 1006-1011
Information
ICMC Events Office (Seção de Eventos do ICMC)
Phone: (16) 3373-9146