Sanstech

Ideas, Knowledge, Technology, Computer Science, Experience associated with my work and some geeky stuff I progressively encounter during my journey towards enlightenment. Read on…

Archive for April, 2011

Data Mining and Text Mining Resources

Posted by sanstechbytes on April 21, 2011

With the objective of learning data mining concepts and applying them to my MS course project (I had promised to talk about this in one of my earlier posts), I explored and compiled links to some books, blogs, articles, papers, etc. Here’s a listing of those; it should be useful to anyone interested in Data Mining, Text Mining, NLP, Information Retrieval, and related areas. It can also serve as a one-stop location for my own quick reference! 🙂

Academic/University Stuff:
http://infolab.stanford.edu/~ullman/mining/2009/index.html
http://www.cs.waikato.ac.nz/ml/weka/
http://www.cs.waikato.ac.nz/~ihw/papers/04-IHW-Textmining.pdf
http://www.laits.utexas.edu/~norman/BUS.FOR/course.mat/Alex/#9
http://www.cs.umbc.edu/~nicholas/clustering/
http://web.ccsu.edu/datamining/resources.html
http://www.stanford.edu/class/cs276/
http://www.cs.sunysb.edu/~cse634/presentations/TextMining.pdf
http://scpd.stanford.edu/ppc/kdnuggets-2011-04.jsp?_vsrefdom=EmailMarketing
http://ocw.mit.edu/courses/sloan-school-of-management/15-062-data-mining-spring-2003/lecture-notes/
http://www.crisp-dm.org/Process/index.htm
http://www.the-data-mine.com/
http://www.kdnuggets.com/
http://www.csc.kth.se/~rosell/undervisning/sprakt/irintro060801.pdf
http://www.autonlab.org/tutorials/kmeans.html

http://filebox.vt.edu/users/wfan/text_mining.html

http://people.ischool.berkeley.edu/~hearst/research.html

Machine Learning Lectures

Books:
Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, Vipin Kumar

Java Data Mining: Strategy, Standard, and Practice: A Practical Guide for Architecture, Design, and Implementation by Mark F. Hornick, Erik Marcadé, Sunil Venkayala

Collective Intelligence in Action by Satnam Alag

Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze

Introduction to Modern Information Retrieval (popular book) by Gerard Salton and Michael J. McGill

Electronics Statistics

Some more books

Companies:
http://www.oracle.com/technetwork/database/options/odm/index.html

http://download.oracle.com/docs/html/B14340_01/java_using.htm
http://www.dataminingcasestudies.com/

http://www.kxen.com/Products/Explorer/Text+Analytics

http://www.sas.com/textminer
http://www.ibm.com/developerworks/library/wa-wbdm/
http://www.developertutorials.com/tutorials/java/java-data-mining-822/
http://www2.parc.com/istl/projects/ia/sg-clustering.html

ACM KDD Special Interest Group

Papers:
Ontology-based Distance Measure for Text Clustering

Data Mining: Extending the Information Warehouse Framework

Pragmatic Text Mining: Minimizing Human Effort to Quantify Many Issues in Call Logs

Differentiating data- and text-mining terminology by Jan H. Kroeze, Machdel C. Matthee, Theo J. D. Bothma

Mining Text Data: Special Features and Patterns by Miguel Delgado, Maria J. Martín-Bautista, Daniel Sánchez, María Amparo Vila Miranda

Overview and semantic issues of text mining by Anna Stavrianou, Periklis Andritsos, Nicolas Nicoloyannis

Better Rules, Fewer Features: A Semantic Approach to Selecting Features from Text

Blogs:
http://irthoughts.wordpress.com/about/
http://glinden.blogspot.com/2006/11/excellent-data-mining-lecture-notes.html

http://sujitpal.blogspot.com/2009_09_01_archive.html
http://www.kdnuggets.com/websites/blogs.html

Analytics:
http://www.kaushik.net/avinash/

NLP:
http://nlpers.blogspot.com/search/label/sentiment
http://blog.sematext.com/

Mother Link for Data Mining Resources from a Librarian (you may not find all of those mentioned above there, though!)
