FYI: I have an undergraduate degree in Sociology, which included plenty of instruction in quantitative research methods, and I began my career as a data processing manager (cross tabs in WinCross, other analysis in SPSS, etc.). I'm currently known in-house as the stat/numbers nerd.
I'm registered for a course in SQL and aspire to attend training in SPSS and Cognos BI. I am proficient in Crystal Reports but consider it a pretty crude tool (I liken using Crystal to trying to paint a fine portrait with kindergarten jumbo crayons). Any other suggestions on launching a new leg of my career in data mining?
-Sarah
After you can access the data that you are interested in, the next step is to analyze it. A good statistical background is critical, since every data mining technique needs to operate within a statistically robust framework. Take several stats classes (if you haven't done this already) and try to work with real data. Learn how to evaluate and compare statistical models. You should also learn about the various machine learning algorithms (neural networks, decision trees, nearest neighbors, etc.); depending on where you take your classes, these might be taught in either the statistics or the computer science department. The goal is to build up a general knowledge of the tools that you can apply to data analysis problems. This isn't something that will happen quickly. If you are looking for a couple of books on techniques, I can recommend "Intelligent Data Analysis" by Berthold & Hand and "The Elements of Statistical Learning" by Hastie, Tibshirani, and Friedman.
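As a rough illustration of what "evaluate and compare statistical models" can look like in practice, here is a minimal Python sketch that scores two simple classifiers with k-fold cross-validation. Everything in it is an assumption made up for the example: the toy data (a single "monthly spend" feature and a yes/no response label), the function names, and the choice of a majority-class baseline versus a one-nearest-neighbor model. It is not a method taken from either book.

```python
from collections import Counter
from statistics import mean

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def majority_class(train_x, train_y, x):
    """Baseline: ignore x and predict the most common training label
    (ties resolve to the label seen first)."""
    return Counter(train_y).most_common(1)[0][0]

def nearest_neighbor(train_x, train_y, x):
    """1-nearest-neighbor on a single numeric feature."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]

def cross_val_accuracy(model, xs, ys, k=5):
    """Average accuracy over k folds; each fold is held out once."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct = sum(model(train_x, train_y, xs[i]) == ys[i] for i in fold)
        scores.append(correct / len(fold))
    return mean(scores)

# Hypothetical toy data: monthly spend, and whether the customer responded.
xs = [12, 85, 30, 90, 15, 70, 25, 95, 40, 60,
      10, 80, 35, 75, 20, 88, 45, 65, 22, 92]
ys = [1 if x >= 50 else 0 for x in xs]

baseline_acc = cross_val_accuracy(majority_class, xs, ys)
knn_acc = cross_val_accuracy(nearest_neighbor, xs, ys)
print("baseline:", baseline_acc, "nearest neighbor:", knn_acc)
```

The point of the baseline is that a model's accuracy means little on its own; it only becomes informative when compared against something trivial that was scored on the same held-out data.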
I would also recommend looking into data visualization and information design. Presenting the results of a data analysis project is often as important as the analysis itself: if your results are not understood and trusted, they will not have an impact. Unfortunately, most data visualization courses focus on the technology and on whiz-bang presentations, so be careful not to waste your time in this area. I highly recommend the books of Edward Tufte (especially his second book, "Envisioning Information") as an introduction to this subject. Tufte also teaches a very popular one-day course on the subject that is quite good.
Finally, if you are mathematically inclined, you might want to look into the field of operations research and optimization. In many cases, the output of a data mining model is not a single result but a collection of predictions. Optimization procedures can take these predictions and select one or more optimal actions.
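To make that last point concrete, here is a small Python sketch of turning a collection of predictions into an action. It assumes a hypothetical scenario that is not from the answer above: a model has produced a response probability for each customer, every contact costs the same amount, and there is a fixed contact budget. All names and numbers are invented for illustration. With a uniform cost per contact, ranking by expected profit and taking the top candidates is enough; real operations research problems usually need more general techniques such as linear or integer programming.

```python
def choose_contacts(predictions, value_per_response, cost_per_contact, budget):
    """Pick the customers to contact so that expected profit is maximized.

    predictions: dict mapping customer id -> predicted response probability.
    Only customers with positive expected profit are considered, and the
    number of contacts is capped by the budget.
    """
    scored = [
        (p * value_per_response - cost_per_contact, cid)
        for cid, p in predictions.items()
    ]
    # Keep profitable contacts only, best expected profit first.
    profitable = sorted((s for s in scored if s[0] > 0), reverse=True)
    max_contacts = int(budget // cost_per_contact)
    return [cid for _, cid in profitable[:max_contacts]]

# Hypothetical model output: response probability per customer.
predictions = {"A": 0.30, "B": 0.05, "C": 0.12, "D": 0.50, "E": 0.08}

chosen = choose_contacts(predictions, value_per_response=40,
                         cost_per_contact=4, budget=10)
print(chosen)
```

Note that the model itself never says "contact D and A"; that decision only appears once costs, values, and the budget constraint are brought in.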
For more information, check out SearchCRM's Best Web Links on Data Mining.
This was first published in April 2002
