Resume of Data Analyst, Data Miner, Consultant




Title
Data Analyst, Data Miner, Consultant

Primary Skills
Data Mining, Knowledge Management, Data Privacy

Location
US-MD-Baltimore (will consider relocating)

Posted
Jan-31-07

RESUME DETAILS
Summary

· Ph.D. in Computer Science (Distributed Data Mining)
· 4+ years of research experience
· Industry experience at IBM (both Software and Research labs)
· 1 patent (filed)
· 10 accredited publications

Education

Doctor of Philosophy (Computer Science) Expected: Summer 2007
University of Maryland, Baltimore County (UMBC)

Dissertation topic: Approximate Distributed Algorithms for Mining Data in Peer-to-Peer Networks
Advisor: Dr. Hillol Kargupta


Master of Science (Computer Science) Summer 2004
University of Maryland, Baltimore County (UMBC)

GPA: 3.7/4.00
Dissertation topic: On Random Additive Perturbation for Privacy Preserving Data Mining


Bachelor of Engineering (Electrical Engineering) Summer 2001
Jadavpur University, India

Major Area (Electrical Engineering) GPA: 3.85/4.00
Ranked within top 10% in a class of 100



Industry Experience

* IBM Research Lab, Delhi, India May 2005-August 2005

Summer intern, Preserving individual privacy in unstructured text data

* IBM Global Services, Calcutta, India April 2002-July 2002

Software consultant, CRM software development in SAP.

* Electronic Research & Development Center of India, Calcutta, India October 2001-March 2002

Project trainee, Simulation of virtual laboratory for remote technical education



Technical Knowledge

Programming Languages Known:
C, Java, FORTRAN, Visual Basic, MATLAB, Perl, Servlets and JSP, JADE-LEAP, HTML, Oracle & SQL.

Special Software Knowledge:
C4.5, LabVIEW, SAP, IBM IntelliMiner.

Operating Systems:
DOS, Windows variants, Linux, Mac.

Patent

Data Obfuscation of text data using entity detection and replacement, filed, Souptik Datta, Sreeram Balakrishnan and Rema Ananthanarayanan, IBM Research Lab, India.

Awards

Recipient of the best research paper award as lead student author at the IEEE International Conference on Data Mining (ICDM-2003).


Research Work

Graduate Doctoral Research Assistant, CSEE, UMBC, September 2004 -- Present

* Sampling-based approximate data mining techniques in peer-to-peer networks.
* Clustering in peer-to-peer networks.


Graduate Masters Research Assistant, CSEE, UMBC, August 2002 -- August 2004

* Evaluation of random additive perturbation in privacy-preserving data mining.
* Data filtering in spectral domain.


Publications

Journal

1. Approximate K-means Clustering Over a Peer-to-peer Network.

Souptik Datta, Chris Giannella and Hillol Kargupta (2006). IEEE Transactions on Knowledge & Data Engineering (in communication).

2. Clustering Distributed Data Streams in Peer-to-peer Environments.

Sanghamitra Bandyopadhyay, Chris Giannella, Ujjwal Maulik, Hillol Kargupta, Souptik Datta, Kun Liu (2006). Journal of Information Sciences, volume 176, number 14, pages 1952-1985, 2006.

3. Random Data Perturbation Techniques and Privacy Preserving Data Mining.

Hillol Kargupta, Souptik Datta, Qi Wang, and Krishnamoorthy Sivakumar (2004). Knowledge and Information Systems Journal, volume 7, number 4, pages 387-414, 2004 (Extended version of the ICDM'03 paper).



Conference

1. Uniform Data Sampling from a Peer-to-peer Network.

Souptik Datta and Hillol Kargupta (2007). Submitted to 2007 IEEE International Conference on Distributed Computing Systems (ICDCS 2007).

2. K-means Clustering Over a Large, Dynamic Network.

Souptik Datta, Chris Giannella and Hillol Kargupta (2006). Proceedings of 2006 SIAM Conference on Data Mining (SDM-2006), Bethesda, MD, April, 2006.

3. On the Privacy Preserving Properties of Random Data Perturbation Techniques.

Hillol Kargupta, Souptik Datta, Qi Wang, and Krishnamoorthy Sivakumar (2003). Proceedings of the Third IEEE International Conference on Data Mining (ICDM-2003), Melbourne, FL, November, 2003.

(winner of the 'Best Paper Award', 2003 IEEE International Conference on Data Mining)

4. Homeland Security and Privacy Sensitive Data Mining from Multi-party Distributed Resources.

Hillol Kargupta, Kun Liu, Souptik Datta, Jessica Ryan, Krishnamoorthy Sivakumar (2003). Proceedings of the 12th IEEE International Conference on Fuzzy Systems, St. Louis, MO, May, 2003.


Workshop

1. K-Means Clustering over Peer-to-peer Networks.

Souptik Datta, Chris Giannella and Hillol Kargupta (2005). Proceedings of the 8th International Workshop on High Performance and Distributed Mining (HPDM'05), Newport Beach, CA, April 2005.

2. Privacy Preserving Data Mining and Random Perturbation.

Hillol Kargupta, Haimonti Dutta, Souptik Datta, Krishnamoorthy Sivakumar (2003). Proceedings of the Workshop on Privacy in the Electronic Society (WPES'03), Washington DC, October, 2003.

3. Link Analysis, Privacy Preservation, and Random Perturbations.

Hillol Kargupta, Kun Liu, Souptik Datta, Jessica Ryan, Krishnamoorthy Sivakumar (2003). Proceedings of the KDD Workshop on Link Analysis for Detecting Complex Behavior (LinkKDD'03), Washington D.C., July 2003.

4. Homeland Defense, Privacy-Sensitive Data Mining, and Random Value Distortion.

Souptik Datta, Hillol Kargupta and Krishnamoorthy Sivakumar (2003). Proceedings of the SIAM Workshop on Data Mining for Counter Terrorism and Security (SDM'03), San Francisco, CA, May, 2003.



Invited Magazine Article

1. Distributed Data Mining in Peer-to-Peer Networks.

Souptik Datta, Kanishka Bhaduri, Chris Giannella, Ran Wolff, Hillol Kargupta (2006). IEEE Internet Computing Special Issue on Distributed Data Mining, pages 18-26, July-August issue, 2006.


Research Related Activities

· External paper reviewer: ICDM '03, ICDM '04, SDM '05, TKDE Journal '05, SDM '06, ICDM '06, SDM '07, TKDE Journal '06.
· Proposal reviewer: Kentucky Science & Engineering Foundation, 2006.
· Proposal writing: National Science Foundation CFP on 'Information Integration and Informatics', 2006.
· Invited talk on “Clustering over Peer to peer Network”, organized by the GOLD Affinity Group of the IEEE Calcutta section in collaboration with CMATER, Department of Computer Science & Engineering, Jadavpur University, Calcutta, India, 2005.
· Research project demonstration: NSF IDM PI Meeting, Boston, MA, 2004.
· Student member: IEEE.
· Lead student volunteer, KDD, Washington DC, 2003.



Achievements

1. Received full research assistantship for pursuing M.S. + Ph.D. from Dr. Hillol Kargupta, Department of Computer Science & Electrical Engineering, University of Maryland, Baltimore County (August 2002).
2. Recipient of J.N. Tata Endowment award scholarship for Higher Studies of Indians Abroad, 2002.
3. Jadavpur University Alumni Association Annual Award Holder for 2001.
4. Recipient of National Science Talent Search Scholarship by the National Council of Educational Research and Training (NCERT), India, for the period 1995-2001.
5. Received national merit scholarship for ranking 21st among approximately four hundred thousand students in the Higher Secondary Examination of the state of West Bengal, India (1997).
6. Received state government award and national merit scholarship for ranking 12th among approximately six hundred thousand students in the Secondary Examination of the state of West Bengal, India (1995).
7. Ranked 1st in the State Science Talent Search Test in grade X (1995).


Relevant Coursework

Introduction to Data Mining, Information Retrieval, Advanced Operating Systems, Principles of Database Systems, Java Server Technologies, Design and Analysis of Algorithms, Basic Probability, Introduction to Neural Networks, Information Theory, Distributed Data Mining.


Relevant Course Projects

* Detection of top-k ranking attributes in a peer-to-peer network.
* Privacy preserving correlation computation, outlier detection and decomposition of non-convex polygons into convex subparts.
* Implementation of backpropagation for training a multilayer perceptron for parity detection.
* Design of an online music store using a JSP front end and an Oracle back end, with a song suggestion mechanism based on user interest.
* Design and implementation of a distributed file system.
* Design and implementation of a web search engine with special ability to show results in clustered format.
* Implementation of a shell in Linux with special commands to view and transfer remote files.
* Implementation of Tomasulo pipeline in C.


References

Available upon request

Certifications
See above
