IEICE Technical Committee Submission System
Conference Paper's Information

Paper Abstract and Keywords
Presentation 2008-09-22 12:45
[Invited Talk] Data Stream Processing Research at IMC of East China Normal University
Aoying Zhou, Cheqing Jin, Weining Qian (East China Normal Univ.) DE2008-49
Abstract (in Japanese) (See Japanese page) 
(in English) Data stream processing has been attracting increasing attention in the research and industry communities because of its broad range of potential applications. In this talk, we briefly introduce the research work that has been done in our group. Our research interests in data streams are frequent item(set) mining, clustering, and burst detection over data streams. Some work on practical applications and some considerations on future work will be introduced as well.
For the basic problem of mining frequent items over data streams, an algorithm called hCount is proposed. It has low space complexity, low per-tuple processing cost, and high recall and precision. For mining frequent itemsets, we develop a new false-negative frequent itemset mining algorithm that obtains a condensed representation of frequent itemsets in transactional data streams by discovering a false-negative collection of special itemsets that covers the frequent itemsets with high probability with respect to the set inclusion relationship among itemsets.
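The abstract does not describe hCount's internals, so the following is only a minimal sketch of a generic hash-based counting structure (count-min style) for estimating item frequencies in a stream with bounded memory. The class name `HashCountSketch`, the helper `frequent_items`, and the parameters `depth`, `width`, and `support` are assumptions made for illustration, not the authors' design.

```python
import hashlib


class HashCountSketch:
    """Illustrative count-min style sketch (not the hCount algorithm itself)."""

    def __init__(self, depth=4, width=256):
        self.depth = depth    # number of independent hash rows
        self.width = width    # counters per row
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Salt the hash with the row number to get one hash function per row.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Constant work per tuple: one counter update per row.
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum over rows
        # never underestimates the true count.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))


def frequent_items(stream, candidates, support):
    """Return the candidates whose estimated frequency is at least support * n."""
    sketch = HashCountSketch()
    n = 0
    for item in stream:
        sketch.add(item)
        n += 1
    return [c for c in candidates if sketch.estimate(c) >= support * n]
```

Because the estimate is never below the true count, a truly frequent candidate is always reported; hash collisions can only cause an infrequent candidate to be reported by mistake.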
Our research on data stream mining has also focused on clustering of data streams. SWClustering is the algorithm we proposed to cluster data streams over sliding windows, and EHCF (Exponential Histogram of Cluster Features) is the synopsis that maintains the statistical information of clusters in sliding windows. With SWClustering, not only the changing distribution of clusters but also the evolving behavior of individual clusters can be captured. CluDistream clusters distributed data streams and can effectively handle a huge volume of data with noisy, corrupted, or incomplete records generated in a distributed environment. In CluDistream, which uses EM-based (Expectation Maximization) algorithms, each data record is assigned to a cluster with a certain degree of membership.
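To illustrate what assigning a record to a cluster "with a certain degree of membership" means in an EM setting, here is a minimal sketch of the E-step for a one-dimensional Gaussian mixture. It is not CluDistream itself; the function names and the restriction to one dimension are assumptions made to keep the example short.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def memberships(x, weights, means, variances):
    """E-step for one record: degree of membership of x in each cluster.

    Returns one value per cluster; the values are non-negative and sum to 1.
    """
    weighted = [w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
    total = sum(weighted)
    if total == 0.0:
        # x is numerically far from every cluster; fall back to the mixing weights.
        norm = sum(weights)
        return [w / norm for w in weights]
    return [d / total for d in weighted]
```

For example, with two equally weighted unit-variance clusters centred at 0 and 4, a record at 2.0 belongs to each cluster with degree 0.5, while a record at 0.0 belongs almost entirely to the first cluster.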
The other important piece of work is on burst detection, or monitoring, over data streams. The fractal analysis method is adapted to enable the monitoring of both monotonic and non-monotonic aggregates on time-changing data streams. The monotonicity property of aggregate monitoring is revealed, and a monotonic search space is built to decrease the time overhead for detecting bursts from O(m) to O(log m), where m is the number of windows to be monitored. With the help of a novel piecewise fractal model, the statistical summary is compressed to fit in limited main memory, so that high aggregates on windows of any length can be detected accurately and efficiently online.
A practical data stream processing system for telecommunication network flow data analysis will also be introduced in this talk.
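The reduction from O(m) to O(log m) can be illustrated with a simple monotonic aggregate. The sketch below assumes non-negative values and a sum aggregate, so the sum over the k most recent values never decreases as k grows and the m monitored window lengths can be binary-searched. It does not implement the fractal model from the talk; the function name and parameters are hypothetical.

```python
from itertools import accumulate


def shortest_bursting_window(recent_values, window_lengths, threshold):
    """Find the shortest monitored window whose sum reaches the threshold.

    recent_values:  non-negative numbers, newest value last
    window_lengths: monitored lengths, sorted ascending, each <= len(recent_values)
    Returns the window length, or None if no monitored window bursts.
    """
    # suffix_sums[k] is the sum of the k most recent values. Building it is
    # linear in the buffer size; a streaming system would maintain it incrementally.
    suffix_sums = [0] + list(accumulate(reversed(recent_values)))

    # The aggregate is non-decreasing in window length, so binary search
    # over the m monitored windows needs only O(log m) comparisons.
    lo, hi = 0, len(window_lengths)
    while lo < hi:
        mid = (lo + hi) // 2
        if suffix_sums[window_lengths[mid]] >= threshold:
            hi = mid
        else:
            lo = mid + 1
    return window_lengths[lo] if lo < len(window_lengths) else None
```

Every window longer than the returned one also reaches the threshold, which is what makes a single binary search sufficient for monotonic aggregates. Non-monotonic aggregates need a different treatment, which this sketch does not cover.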
Keyword (in Japanese) (See Japanese page) 
(in English) Data stream processing / Frequent item / Clustering / Burst Detection
Reference Info. IEICE Tech. Rep., vol. 108, no. 211, DE2008-49, pp. 39-40, Sept. 2008.
Paper # DE2008-49 
Date of Issue 2008-09-14 (DE) 
ISSN Print edition: ISSN 0913-5685    Online edition: ISSN 2432-6380
Copyright and reproduction All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034)

Conference Information
Committee DE  
Conference Date 2008-09-21 - 2008-09-22 
Place (in Japanese) (See Japanese page) 
Place (in English)  
Topics (in Japanese) (See Japanese page) 
Topics (in English)  
Paper Information
Registration To DE 
Conference Code 2008-09-DE 
Language English 
Title (in Japanese) (See Japanese page) 
Sub Title (in Japanese) (See Japanese page) 
Title (in English) Data Stream Processing Research at IMC of East China Normal University 
Sub Title (in English)  
Keyword(1) Data stream processing  
Keyword(2) Frequent item  
Keyword(3) Clustering  
Keyword(4) Burst Detection  
1st Author's Name Aoying Zhou  
1st Author's Affiliation East China Normal University (East China Normal Univ.)
2nd Author's Name Cheqing Jin  
2nd Author's Affiliation East China Normal University (East China Normal Univ.)
3rd Author's Name Weining Qian  
3rd Author's Affiliation East China Normal University (East China Normal Univ.)
Speaker Author-1 
Date Time 2008-09-22 12:45:00 
Presentation Time 45 minutes 
Registration for DE 
Paper # DE2008-49 
Volume (vol) vol.108 
Number (no) no.211 
Page pp.39-40 
#Pages
Date of Issue 2008-09-14 (DE) 


The Institute of Electronics, Information and Communication Engineers (IEICE), Japan