Sanstech

Ideas, Knowledge, Technology, Computer Science, Experience associated with my work and some geeky stuff I progressively encounter during my journey towards enlightenment. Read on…


Archive for May, 2012

My Master’s Dissertation, Revisited Series – Part 3

Posted by sanstechbytes on May 1, 2012

In this final part of the series, let us look at the implementation choices I made and the results of my work. Rather than describe everything here, I have included the details in the download.

Although I liked leading open-source data mining software such as WEKA and RapidMiner, I don’t recall exactly why I stuck with a trial version of the commercial Oracle Data Mining suite. We used Oracle 10g at work, and we use SAS Enterprise Miner for our data mining needs. Most likely I wanted to see how well Oracle Data Mining would fit the bill in that context. As it turned out, that idea was too speculative at the time.

The repository for all my work, along with other useful artifacts, can be downloaded here (around 93 MB, I think!).

Once you have downloaded the RAR file:

For my explanation of how I’ve used the ODM APIs, refer to dissertationrepository\results\My Dissertation POC Demo.docx.

For detailed steps on installing the database, creating a user, setting up the sample schemas, etc.: refer to the Oracle Data Mining Administrator’s Guide, which can be found in the RAR file at dissertationrepository\pdf\datamining_admin_guide.pdf.

For installing software such as odminer and importing the Java project: refer to dissertationrepository\software. The Oracle database itself can probably be downloaded from here.

For screenshots and results of my POC work: refer to dissertationrepository\results\My Dissertation POC Demo.docx.

For other sample projects, books, ideas, papers, etc.: refer to dissertationrepository\extra.
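If you want to confirm that the archive extracted completely, a small script along these lines can check for the items mentioned above. This is only a sketch: the function name `missing_items` is my own invention, it assumes the archive was extracted into a folder called dissertationrepository, and it only checks the paths named in this post, not the full contents of the RAR file.

```python
from pathlib import Path

# Paths mentioned in this post, relative to the extracted folder.
# (The real archive contains more than this; these are only the ones named above.)
EXPECTED = [
    "results/My Dissertation POC Demo.docx",
    "pdf/datamining_admin_guide.pdf",
    "software",
    "extra",
]


def missing_items(repo_root):
    """Return the expected entries that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    import sys

    # Assumed default: the folder name used in the paths above.
    root = sys.argv[1] if len(sys.argv) > 1 else "dissertationrepository"
    missing = missing_items(root)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected items found under", root)
```

Run it with the extracted folder as the only argument, for example `python check_repo.py dissertationrepository` (the script file name is hypothetical too).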

REFERENCES

1. Campos, M.M., Milenova, B.L., “O-Cluster: Scalable Clustering of Large High Dimensional Data Sets”, Oracle Data Mining Technologies, 10 Van De Graaff Drive, Burlington, MA 01803.

2. Oracle Data Miner –  http://www.oracle.com/technology/products/bi/odm/

3. JSR Specifications – http://jcp.org/en/jsr/detail?id=247

4. Introduction to Data Mining by Vipin Kumar, Michael Steinbach, Pang-Ning Tan, Addison Wesley, 2006.

5. Java Data Mining: Standard, Theory and Practice: A Practical Guide for Architecture, Design and Implementation by Mark Hornick, Erik Marcade, and Sunil Venkayala, Morgan Kaufmann Publishers, 2007.

6. Collective Intelligence in Action by Satnam Alag, Manning Publications, 2007

7. Using text mining to understand the call center customers’ claims by G. M. Caputo, V. M. Bastos & N. F. F. Ebecken – http://library.witpress.com/pages/PaperInfo.asp?PaperID=16715

8. Oracle Database 11g / Data Mining API Documentation

9. KD Nuggets www.kdnuggets.com

10. ACM www.acm.org

Posted in Data Mining, Research