Skills Needed For Machine Learning
1. Computer Science Fundamentals and Programming:
Computer science fundamentals important for Machine Learning engineers include data structures (stacks, queues, multi-dimensional arrays, trees, graphs, etc.), algorithms (searching, sorting, optimization, dynamic programming, etc.), computability and complexity (P vs. NP, NP-complete problems, big-O notation, approximation algorithms, etc.), and computer architecture (memory, cache, bandwidth, deadlocks, distributed processing, etc.).
You must be able to apply, implement, adapt or address them (as appropriate) when programming. Practice problems, coding competitions and hackathons are a great way to hone your skills.
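As a small illustration of why big-O notation matters in practice, the sketch below compares a linear scan with a binary search over sorted data and counts the comparisons each one makes. The function names and the step counter are illustrative choices, not part of any library.

```python
from typing import Optional, Sequence, Tuple


def linear_search(items: Sequence[int], target: int) -> Tuple[Optional[int], int]:
    """Scan left to right; O(n) comparisons in the worst case."""
    steps = 0
    for index, value in enumerate(items):
        steps += 1
        if value == target:
            return index, steps
    return None, steps


def binary_search(items: Sequence[int], target: int) -> Tuple[Optional[int], int]:
    """Halve a sorted range each step; O(log n) comparisons in the worst case."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return None, steps


if __name__ == "__main__":
    data = list(range(1_000_000))  # already sorted
    target = 999_999               # worst case for the linear scan
    print("linear (index, steps):", linear_search(data, target))
    print("binary (index, steps):", binary_search(data, target))
```

Both functions return the same index, but the number of steps differs by several orders of magnitude on large inputs, which is the kind of trade-off the complexity topics above help you reason about.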
2. Probability and Statistics:
A formal characterization of probability (conditional probability, Bayes' rule, likelihood, independence, etc.) and techniques derived from it (Bayes Nets, Markov Decision Processes, Hidden Markov Models, etc.) are at the heart of many Machine Learning algorithms; these are a means to deal with uncertainty in the real world. Closely related to this is the field of statistics, which provides various measures (mean, median, variance, etc.), distributions (uniform, normal, binomial, Poisson, etc.) and analysis methods (ANOVA, hypothesis testing, etc.) that are necessary for building and validating models from observed data. Many Machine Learning algorithms are essentially extensions of statistical modeling procedures.
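To make conditional probability concrete, here is a minimal sketch of Bayes' rule applied to a hypothetical diagnostic test. The function name, parameter names and the numbers in the example are assumptions chosen for illustration only.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) using Bayes' rule.

    prior               -- P(condition)
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    # Law of total probability: overall chance of seeing a positive result.
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


if __name__ == "__main__":
    # Hypothetical numbers: 1% base rate, 95% sensitivity, 5% false positives.
    print(round(posterior(0.01, 0.95, 0.05), 3))
```

With these made-up numbers the posterior is only about 16%, even though the test is "95% accurate", because the low prior dominates. This is the kind of reasoning under uncertainty that probabilistic models automate.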
3. Data Modeling and Evaluation:
Data modeling is the process of estimating the underlying structure of a given dataset, with the goal of finding useful patterns (correlations, clusters, eigenvectors, etc.) and/or predicting properties of previously unseen instances (classification, regression, anomaly detection, etc.). A key part of this estimation process is continually evaluating how good a given model is. Depending on the task at hand, you will need to choose an appropriate accuracy/error measure (e.g. log-loss for classification, sum-of-squared-errors for regression, etc.) and an evaluation strategy (training-testing split, sequential vs. randomized cross-validation, etc.). Iterative learning algorithms often directly utilize resulting errors to tweak the model (e.g. backpropagation for neural networks), so understanding these measures is very important even for just applying standard algorithms.
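The error measures and evaluation strategies mentioned above can be sketched in a few lines of plain Python. This is a simplified illustration, not a replacement for a library implementation: the function names are hypothetical, and "sequential" is interpreted here as contiguous folds taken in the original order, which is one common reading.

```python
import math
import random
from typing import List, Sequence, Tuple


def log_loss(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-15) -> float:
    """Mean binary log-loss; y_prob holds predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def sum_squared_errors(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Sum of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def k_fold_indices(n: int, k: int, shuffle: bool = True,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits over n samples.

    shuffle=True gives randomized folds; shuffle=False keeps the
    original order, so each test fold is a contiguous block.
    """
    indices = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Spread any remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, test))
        start += size
    return splits


if __name__ == "__main__":
    print(log_loss([1, 0, 1], [0.9, 0.2, 0.6]))
    print(sum_squared_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    for train, test in k_fold_indices(10, 3, shuffle=False):
        print("train:", train, "test:", test)
```

Keeping the order intact matters when samples are time-ordered, since shuffling would let the model train on data from "the future" relative to its test fold.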
4. Applying Machine Learning Algorithms and Libraries:
Standard implementations of Machine Learning algorithms are widely available through libraries/packages/APIs (e.g. scikit-learn, Theano, Spark MLlib, H2O, TensorFlow, etc.), but applying them effectively involves choosing a suitable model (decision tree, nearest neighbor, neural net, support vector machine, ensemble of multiple models, etc.), a learning procedure to fit the data (linear regression, gradient descent, genetic algorithms, bagging, boosting, and other model-specific methods), as well as understanding how hyperparameters affect learning. You also need to be aware of the relative advantages and disadvantages of different approaches, and the numerous gotchas that can trip you up (bias and variance, overfitting and underfitting, missing data, data leakage, etc.). Data science and Machine Learning challenges such as those on Kaggle are a great way to get exposed to different kinds of problems and their nuances.
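To see how a hyperparameter affects learning, consider this minimal sketch of batch gradient descent fitting a straight line. It is written in plain Python rather than with any of the libraries named above, and the function names, toy data and learning-rate values are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def mean_squared_error(xs: Sequence[float], ys: Sequence[float],
                       w: float, b: float) -> float:
    """Average squared error of the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line_gd(xs: Sequence[float], ys: Sequence[float],
                learning_rate: float = 0.05, epochs: int = 2000) -> Tuple[float, float]:
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

    w, b = fit_line_gd(xs, ys, learning_rate=0.05, epochs=2000)
    print("small step:", w, b, mean_squared_error(xs, ys, w, b))

    # Same data, same model, larger step size: the updates overshoot.
    w, b = fit_line_gd(xs, ys, learning_rate=0.5, epochs=20)
    print("large step:", w, b, mean_squared_error(xs, ys, w, b))
```

On this toy data the smaller learning rate recovers a slope near 2 and an intercept near 1, while the larger one makes the error grow with every epoch. Library implementations expose the same kind of knob, so it helps to know what it does before tuning it.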
5. Software Engineering and System Design:
At the end of the day, a Machine Learning engineer’s typical output or deliverable is software. And often it is a small component that fits into a larger ecosystem of products and services. You need to understand how these different pieces work together, communicate with them (using library calls, REST APIs, database queries, etc.) and build appropriate interfaces for your component that others will depend on. Careful system design may be necessary to avoid bottlenecks and let your algorithms scale well with increasing volumes of data. Software engineering best practices (including requirements analysis, system design, modularity, version control, testing, documentation, etc.) are invaluable for productivity, collaboration, quality and maintainability.
Machine Learning Job Roles
Jobs related to Machine Learning are growing rapidly as companies try to get the most out of emerging technologies. The chart below depicts the relative importance of core skills for these general types of roles, with a typical Data Analyst role for comparison.
The Future of Machine Learning
What is perhaps most compelling about Machine Learning is its seemingly limitless applicability. Many fields are already being affected by Machine Learning, including education, finance, computer science, and more, and there are virtually no fields to which Machine Learning doesn’t apply. In some cases, Machine Learning techniques are in fact desperately needed. Healthcare is an obvious example. Machine Learning techniques are already being applied to critical areas within healthcare, affecting everything from care variation reduction efforts to medical scan analysis. David Sontag, an assistant professor at New York University’s Courant Institute of Mathematical Sciences and NYU’s Center for Data Science, recently gave a talk on Machine Learning and the healthcare system, in which he discussed “how machine learning has the potential to change health care across the industry, from enabling the next-generation electronic health record to population-level risk stratification from health insurance claims.”
The world is changing in rapid and dramatic ways, and the demand for Machine Learning engineers is likely to keep growing. The world’s challenges are complex, and they will require complex systems to solve them. Machine Learning engineers are building these systems. If this is your future, then there’s no time like the present to start mastering the skills and developing the mindset you’re going to need to succeed.
Link: https://en.wikipedia.org/wiki/Machine_learning