Everyone should have a conceptual understanding of machine learning, so they can communicate more effectively with practitioners. To really understand what machine learning can and can't do, you have to get hands-on with it, which is what the curious, the career builders, and the DIY problem-solvers are doing.
The starting point differs for individuals based on their education and experience. However, the titles of resources may not necessarily reflect that fact. Following is a short list of resources with a bit of insight into their requirements and value. Deep learning, a subcategory of machine learning, has been omitted intentionally to keep the focus of this article on machine learning in general.
Competitions provide an opportunity for anyone to get hands-on with machine learning. Don't let the word "competition" scare you, because you'll find a lot of helpful resources at these sites available free to anyone. Later, if you decide to compete, and if you achieve a prominent position on the leaderboard, you'll have something more to add to your resume.
Kaggle is a data science platform that businesses use to crowdsource problem-solving. Members can get access to datasets, kernels, free mini courses, a forum, blogs, job postings, documentation and more.
OpenML (beta 2) describes itself as "an inclusive movement to build an open, organized, online ecosystem for machine learning." It builds open source tools for discovering and sharing data. Participants can pull the open data into their favorite machine learning environments and build models themselves or with the help of community data scientists.
AnalyticsVidhya positions itself as "a next-gen data science ecosystem." Its website provides access to competitions, community, tutorials, blogs, certifications and job listings.
Online courses, bootcamps and certificate programs
Be forewarned that many "introductory" courses assume a base level of knowledge not everyone possesses, so resource titles can be misleading. For example, intermediate and advanced "Introduction to Machine Learning" courses assume R or Python programming skills and college-level knowledge of calculus, linear algebra, and statistics. There are also courses targeted at business leaders and others that require only basic programming skills (not necessarily in R or Python) and basic math skills.
Bear in mind that prerequisite courses are also available online that can help prepare you for classes that require skills you do not yet possess. If the course is free, you can simply drop it if it's too basic or too advanced. If money is involved, take time to understand the prerequisites as well as payment and refund terms before you commit.
Note that some courses do not include prerequisites in their course descriptions. However, the details may be discoverable in the syllabus. Alternatively, you may have to do some sleuth work, which means using the chat function, calling a number, sending an email or posting a query to an online community. Some courses encourage potential students to take a pre-test or review a problem to gauge whether the course is a fit for them.
ColumbiaX Machine Learning for Data Science and Analytics is taught by Columbia University professors and hosted on edX. It's an introductory course targeted at business professionals that requires some exposure to programming and high school math skills. The course is free or $99 for graded exams and assignments plus a certificate.
The Coursera Machine Learning course is taught by Stanford University Adjunct Professor Andrew Ng (Google Brain founder and formerly Baidu's chief scientist). This course requires knowledge of linear algebra (per the syllabus). The course is free or $79 for a certificate.
DataCamp offers online courses on a subscription basis for $29 per month or $25 per month on a yearly basis. The "What is Machine Learning?" chapter of the Introduction to Machine Learning course is free.
The eCornell Machine Learning Certificate Program consists of 7 two-week courses aimed at developers, software and data engineers, data scientists and statisticians. Interested parties can take a pre-test to gauge their level of knowledge. The cost is $3,600.
HarvardX Data Science: Machine Learning is taught by Harvard University professors and hosted on edX. It has set open and close dates. This introductory course may be taken as part of a 9-course professional certificate program in data science, which includes R basics, statistical modeling and linear regression. The previous courses in the certificate program are recommended as prerequisites. The Machine Learning course is free (free on edX translates to no certificate); the entire certificate program is currently discounted from $491 to $441.90.
LinkedIn Learning now has 555 machine learning-related videos for beginners. Courses may be purchased individually or accessed with a $29.99 per month Premium membership.
The Simplilearn Machine Learning Certification Training Course includes 44 hours of instruction and 4 industry projects. Students should have a basic understanding of Python programming, math and statistics. For those who wonder what "basic" means, the course FAQs provide links to three prerequisite classes, which are Data Science for Python, Programming in Python 3.x, and Math Refresher. To get access to the Math Refresher Preview or the syllabus, one must provide an email address and phone number. Pricing starts at $699 for individuals.
Springboard Machine Learning Bootcamp is a 6-month machine learning engineer course for those with deep development skills and knowledge of calculus, linear algebra, probability and descriptive statistics. Employment is guaranteed (but the graduate must submit several resumes and make several networking calls per week). Visit the site for pricing options and additional requirements. Pricing starts at $4,500.
The Stanford Online Machine Learning course has set enrollment and attendance dates. It requires knowledge of linear algebra as well as basic probability and statistics. Potential enrollees are encouraged to review the first problem set to determine whether they are a fit for the course. The for-credit course cost is $5,040.
The Udacity Introduction to Machine Learning course is an intermediate course that requires Python programming skills and knowledge of probability and statistics. The course is free or can be taken as part of a paid 3-month Machine Learning Engineer "nano degree" certificate program. The cost is $399 monthly or $1,077 pre-paid for 3 months.
University of California at Berkeley (UC Berkeley) School of Information Machine Learning online short course is a six-week course with set attendance dates. It is targeted at business leaders and technology professionals. The cost is $3,500.
Machine learning books
The following books are available on Amazon.com. Since Amazon's pricing is dynamic, the prices listed below are subject to change.
Machine Learning for Absolute Beginners (Second Edition) by Oliver Theobald. $10.70 for the paperback on Amazon or free with Kindle Unlimited.
Machine Learning for Beginners & Machine Learning with Python by Hein Smith. $23.83 for the one remaining paperback on Amazon or free with Kindle Unlimited.
Machine Learning: The Absolute Complete Beginner's Guide to Learn and Understand Machine Learning from Beginners, Intermediate, Advanced, to Expert Concepts by Steven Samelson. $13.13 for the paperback on Amazon, which includes a free Kindle version, or free with Kindle Unlimited.
Machine Learning with Python: Hands-On Learning for Beginners by Travis Booth. $19.99 on Amazon or free with Kindle Unlimited.
Introduction to Machine Learning with Python: A Beginner's Guide to Learn Concepts and Practical Solutions from Data. Methods, Benefits and Case Studies applied to Artificial Intelligence by William Gray. $23.99 for the paperback on Amazon or free with Kindle Unlimited.
Machine learning tools and platforms
Amazon SageMaker is an online machine learning service from AWS for developers and data scientists. Visit the site for pricing and related specifics.
Azure Machine Learning SDK for Python is used for building and running machine learning workflows. Visit the site to install the SDK.
Scikit-learn, a machine learning library for Python, is a set of tools for data mining and analysis. Visit the site to install the latest version.
TensorFlow (tensorflow.org) is an open source machine learning platform from the Google Brain team. Visit the site to install TensorFlow or the TensorFlow 2.0 beta. Alternatively, if you want to learn TensorFlow without installing anything, you can use Google Colab, which is a hosted Jupyter notebook environment with TensorFlow pre-installed.
Words of advice from professionals
If you want to succeed with machine learning, Josh Fleischer, a machine learning engineer at AI and machine learning consultancy Atrium, offers three pieces of advice:
- Learn how to pull and manipulate data
- Understand that real-life data is never as clean as it is in textbooks
- In order to utilize the models you build, you have to know what data is meaningful. You cannot know this without an understanding of business operations
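To make the first two points concrete, here is a minimal sketch of pulling and cleaning a small, messy dataset using only the Python standard library. The data, the column names and the helper functions (`parse_total`, `load_rows`, `fill_missing_totals`) are all hypothetical, invented for illustration. Filling gaps with the median is just one possible choice; whether it is appropriate depends on what the data means to the business, which is the third point.

```python
import csv
import io
from statistics import median

# Hypothetical raw export: stray whitespace, inconsistent casing,
# a missing value, and a non-numeric entry in a numeric column.
RAW = """customer,region,order_total
 Alice ,WEST,120.50
bob,west,
Carol,East,n/a
dave,EAST,89.00
"""


def parse_total(text):
    """Return the value as a float, or None if it is missing or not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def load_rows(raw):
    """Pull rows out of CSV text and normalize the text fields."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw)):
        rows.append({
            "customer": record["customer"].strip().title(),
            "region": record["region"].strip().lower(),
            "order_total": parse_total(record["order_total"]),
        })
    return rows


def fill_missing_totals(rows):
    """Replace missing totals with the median of the known totals."""
    known = [r["order_total"] for r in rows if r["order_total"] is not None]
    fallback = median(known)
    return [
        {**r, "order_total": fallback if r["order_total"] is None else r["order_total"]}
        for r in rows
    ]


rows = fill_missing_totals(load_rows(RAW))
for row in rows:
    print(row)
```

In practice, many practitioners would reach for a data analysis library rather than hand-written helpers, but the steps are the same: load, normalize, decide what to do about the gaps.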
Ameen Kazerouni, Zappos lead scientist, recommends starting with the business problem rather than machine learning.
"Inventing problems for cool solutions is rarely a sound investment," said Kazerouni. "Instead, look at your biggest problems and the data you have and then match those problems to your data. This allows you to formulate machine learning solutions for core business problems [and] sets you up for a much higher likelihood for success in the space."
Konrad Pabianczyk, AI business team lead at Netguru, is a big fan of Scikit-learn for Python and TensorFlow.
"Scikit-learn has good documentation targeting quick understanding of different methods," said Pabianczyk. "It can be a great starter for learning, e.g., where multiple algorithms are compared for improved understanding. When looking for specific machine learning frameworks, look into starting with TensorFlow."
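As a rough illustration of the kind of side-by-side comparison Pabianczyk describes, the sketch below scores three Scikit-learn classifiers on the library's built-in iris dataset using five-fold cross-validation. The choice of dataset, algorithms and settings is an assumption made for this example, not something recommended in the interviews.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Three common classifiers. The first two are sensitive to feature scale,
# so they are wrapped in a pipeline that standardizes the inputs first.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k-nearest neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score every model the same way: mean accuracy over 5 cross-validation folds.
scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=5)
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy {scores[name]:.3f}")
```

Because every model exposes the same interface, swapping in another algorithm only means adding an entry to the `models` dictionary, which is part of what makes the library a convenient starting point.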