Skills for Data Mining/Analysis

  • 01-09-2015 12:35pm
    #1
    Registered Users, Registered Users 2 Posts: 8,593 ✭✭✭funkey_monkey


    Hi,

    I'm currently being made redundant and am looking for opportunities outside of my current sector which is a niche area of development.

    I was looking at the possibility of switching into data mining / analysis for some of the big finance houses (or similar).

    Does anyone here work in this sector and/or can you tell me what skills (soft skills and languages/toolsets) are required to work in this area?


    Thanks.


Comments

  • Registered Users, Registered Users 2 Posts: 112 ✭✭JigglyMcJabs


    Before you jump into toolsets and languages, I would suggest focusing on statistics. You need a solid base in stats first; then you could look at R and Python.

    There are a few good free courses in data analytics at the likes of NCI, DIT and DBS that you could look at too.
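
    As a small illustration of why the stats come before the tools, here is a minimal Python sketch using only the standard library's statistics module. The figures are made up for illustration (think of them as transaction amounts in thousands); the point is only that one outlier drags the mean well away from the median, and you need the stats background to know which one to trust.

```python
import statistics

# Hypothetical transaction amounts (in thousands); the last value is an outlier.
amounts = [32, 35, 36, 38, 41, 250]

mean = statistics.mean(amounts)      # sensitive to the outlier
median = statistics.median(amounts)  # robust to the outlier
spread = statistics.stdev(amounts)   # sample standard deviation

print(f"mean={mean}, median={median}, stdev={spread:.1f}")
```

    The same calculation is a one-liner in R, so the language matters less than knowing what the numbers are telling you.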


  • Registered Users, Registered Users 2 Posts: 7,521 ✭✭✭jmcc


    I don't deal with Financial data but rather with Internet data. (Domain name transactions covering about 720 top level domains and measuring website usage in TLDs and Global IP address mapping.)

    Mathematics.
    If you have an Arts background with little Mathematics, you are going to be in trouble. Apart from the Statistics suggested earlier, you will also need a working knowledge of Numerical Computations and Parallelisation.

    Thinking.
    This is not the Philosophy wánkathon stuff but something far more important. You have to be able to think in terms of data and computations. You have to be able to specify a problem, define the data you need to solve it, define the calculations necessary to solve the problem and, most importantly, know when you have solved it or made a mistake. Some Big Data problems have highly counter-intuitive solutions so when some academic with no combat experience suggests the usual textbook approach, you have to understand the data and your software well enough to know that they are talking bullsh!t. The textbook for what you are attempting may not have been written yet.

    Persistence.
    Big Data is big. It can take time to create a solution and crunch the data. And that doesn't even get into the whole ETL (Extract, Transform, Load) part of cleaning data. You have to have a long attention span (the longer the better) because problem solving at this level is not like building a toy website with a mickey mouse 100KB database.

    Know Your Tools.
    Remember the movie "Ronin" where the spoofer asked the professional which was his favourite gun? The professional responds that it's a toolbox and he uses the right weapon for the job. Well, you have to have a working knowledge of most of the major tools used and know how they work. You have to know database software, hardware (very important when it comes to calculations) and analysis software. You also have to be capable of writing your own software and tools. (Unless you have clean data, you are going to be cleaning the data, so learn about the REGEX of your favourite parsing language.)
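
    To give a flavour of the regex-based cleaning mentioned above, here is a rough Python sketch that normalises a messy domain-name field. The function name clean_domain, the sample inputs and the validation pattern are all assumptions made for illustration, not anyone's production rules; real domain data has far more edge cases (internationalised names, for one).

```python
import re

# Simplified pattern for a hostname: dot-separated labels ending in a TLD of
# two or more letters. This is an illustrative assumption, not a full validator.
DOMAIN_RE = re.compile(r"(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}")

def clean_domain(raw):
    """Return a normalised domain name, or None if the field does not look like one."""
    text = raw.strip().lower()
    text = re.sub(r"^[a-z][a-z0-9+.-]*://", "", text)  # drop a leading scheme such as http://
    text = re.sub(r"^www\.", "", text)                 # drop a leading www.
    text = re.split(r"[/?#:]", text, maxsplit=1)[0]    # drop any path, query, fragment or port
    text = text.rstrip(".")                            # drop a trailing root dot
    return text if DOMAIN_RE.fullmatch(text) else None

# Hypothetical dirty records, as they might arrive before the "Transform" step of ETL.
dirty = ["  HTTP://WWW.Example.IE/path?x=1 ", "boards.ie.", "example.com:8080", "not a domain"]
cleaned = [clean_domain(d) for d in dirty]
print(cleaned)
```

    Rows that come back as None are the ones to set aside and inspect by hand rather than silently drop.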


    You can also ask this question in the Big Data forum:
    http://www.boards.ie/vbulletin/forumdisplay.php?f=1630

    Regards...jmcc


  • Closed Accounts Posts: 22,648 ✭✭✭✭beauf


    If you have the skill set, is it possible to get into it without any experience of the finance sector? I would have assumed business knowledge would be a requirement.


  • Registered Users, Registered Users 2 Posts: 8,593 ✭✭✭funkey_monkey


    What about the toolset - is it difficult to pick up on these?




  • What about the toolset - is it difficult to pick up on these?

    There are many different toolsets for many different jobs / tasks. I can speak only for myself, but I'm not aware of any single market-leading technology; instead we use bits of X, Y and Z everywhere.

    Tools / languages / databases that might be of interest:

    Python - (SciPy stack includes an IDE and relevant packages)
    R - CRAN has ****tonnes of packages
    C#
    F#
    IronPython
    q

    SQL
    MongoDB
    Riak
    kdb+

    Any familiarity you already have with these will probably guide you in what you should pick up first to get a foot on the ladder.

    There are some excellent free courses on Udemy / Coursera / NewBoston that can take you around some of the packages and get you up to speed quicker.

    https://www.kaggle.com/competitions is worth a glance to see what you might be expected to be able to work through. As far as I remember, you can use some kernels on the site without needing to install anything yourself.


  • Registered Users, Registered Users 2 Posts: 1,026 ✭✭✭whatever76


    This is very Microsoft-centric, but it does give a good grounding in the basic concepts of machine learning techniques, and it has some good labs as well: https://studio.azureml.net/

    JMCC above summed it up perfectly - statistics is key to getting you started!
