Small-Scale Question Sunday for March 19, 2023

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

I keep seeing this term "data science" come up lately. What is it and what is the connection to machine learning?

You got a few answers and they are all on point. It's a very broad, underdefined term that can mean anything from building dashboards or data pipelines to building ML models to plain analyst work.

The standard definition is that data scientists combine math/statistics, programming, and knowledge of a business domain to solve business problems with data. A key piece of the toolkit is machine learning models, which are at heart statistical tools.
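To make the "statistical tools" point concrete, here is a minimal sketch of about the simplest model that counts as both statistics and machine learning: a straight line fit by ordinary least squares. The ad-spend and sales numbers are made up for illustration, and `fit_line` and `predict` are hypothetical helper names, not any library's API.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Use the fitted line to predict y for a new x."""
    intercept, slope = model
    return intercept + slope * x


# Made-up toy data: ad spend vs. sales, lying exactly on y = 1 + 2x.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

model = fit_line(spend, sales)
print(predict(model, 5.0))
```

"Training" here is just estimating two numbers from the data, and "prediction" is plugging a new input into the fitted formula. Fancier models have many more parameters, but the overall shape is the same.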

It's very fuzzy and inconsistently used, but the basic idea is: Suppose you have a huge amount of data. How do you curate the data set, how do you answer questions with it, and how do you make predictions with it?

Machine learning can be used for any of those things. For example, you can curate the data set by first running a clustering algorithm to sort it into groups that are easier to handle (and possibly throwing outliers away). You can answer questions by training a model on the data set, for instance a CNN if the data are images. Etc.
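The clustering step could look roughly like the sketch below: a bare-bones k-means in plain Python, followed by dropping points that sit unusually far from their cluster's center. Everything here is an assumption for illustration: the toy points, the starting centroids, the helper names `kmeans` and `drop_outliers`, and the "more than 2x the cluster's median distance" outlier rule. In practice you would likely reach for a library implementation instead.

```python
import math
from statistics import median


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute."""
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                # Empty cluster: keep its old centroid.
                new_centroids.append(centroids[c])
                continue
            # New centroid is the coordinate-wise mean of the members.
            new_centroids.append(
                tuple(sum(coord) / len(members) for coord in zip(*members))
            )
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return labels, centroids


def drop_outliers(points, labels, centroids, factor=2.0):
    """Keep points within factor * (median distance to centroid) per cluster.

    The cutoff rule is an arbitrary choice for this sketch.
    """
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    kept = []
    for c in range(len(centroids)):
        cluster_d = [d for d, lab in zip(dists, labels) if lab == c]
        if not cluster_d:
            continue
        cutoff = factor * median(cluster_d)
        kept.extend(
            p for p, d, lab in zip(points, dists, labels)
            if lab == c and d <= cutoff
        )
    return kept


# Made-up data: two tight groups plus one stray point at (30, 30).
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11),
          (30, 30)]

labels, centroids = kmeans(points, centroids=[(0, 0), (10, 10)])
cleaned = drop_outliers(points, labels, centroids)
print(labels)
print(cleaned)
```

Note that the stray point first gets lumped into the second group and drags that group's center toward itself, which is one reason people check for outliers after clustering rather than trusting the groups blindly.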

I'm not in the field but afaik it's an interdisciplinary field where you use statistics, math, programming, and domain-specific expertise to analyze and interpret big or complex sets of data, and build models for predictions.

Machine learning is used to make programs or AIs that can more or less autonomously run and do the work of analysis and model building.