How to Search for Data Science Jobs

by Robert A. Muenchen

This article describes the technical details of how to search for jobs in the field of data science. The results of the searches are displayed and discussed in The Popularity of Data Science Software. The protocols were implemented on 2/27/2017 and differ significantly from the previous set, posted on 2/20/2014.

Data Science Terms

Some software used for data science is also used for a wide range of other tasks. Let’s consider a few examples. General-purpose languages such as C, Java, or Python are used heavily for some data science tasks, but if you do a job search on just their names, the great majority of jobs found will not be for data science. Other software, such as Cognos, SAS, and Tableau, is very popular for simple report writing as well as for data science, so simple searches will find a blend of both types of jobs. Finally, some software, such as Apache Spark, SPSS, or Stata, is very specific to data science. With such a mix of software, the challenge is to use search terms that will yield values that are comparable across all types of software.

To compile a list of search terms that are specific to data science jobs, I started out searching for jobs that required software that is used specifically for data science. I then looked for terms that often appeared in those job descriptions. Next, I searched for jobs that featured only those terms, one at a time. Some, such as “analytics”, resulted in searches that were not well focused; jobs that had nothing to do with data science would appear. Others, such as “econometrics”, did indeed focus on data science jobs, but only in the field of economics. As I worked my way through these searches, I found more search terms to test. The results are shown in Table 1.

Search Terms Jobs Found
Analytics (not well focused) 123,895
Survey (not well focused) 72,323
Statistics (not well focused) 66,201
Statistical (not well focused) 55,998
Big Data 20,646
Analyze data (not well focused) 20,068
Business intelligence (too much reporting) 19,709
Data analytics 15,774
Machine learning 12,499
Statistical analysis 11,397
Data mining 9,757
Data Science 6,873
Quantitative analysis 4,095
Business analytics * 4,043
Research associate (too vague) 3,794
Advanced Analytics 3,479
Data Scientist 3,272
Statistical software 2,835
Predictive analytics 2,411
Artificial intelligence 2,404
Predictive modeling 2,264
Statistical modeling 2,040
Econometrics (too focused) 1,860
Quantitative research 1,837
Research analyst 1,756
Statistical tools 1,414
Statistician 904
Statistical packages 784
Survey research 440
Quantitative modeling 352
Statistical research 208
Statistical computing 153
Research computing 133
Statistical analyst 125
Data miner 34

Table 1. Terms used in data science job descriptions
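
The notes in parentheses in Table 1 mark the terms that were set aside. The filtering step can be sketched in Python using a handful of rows copied from Table 1. The tuple layout and the function name focused_terms are illustrative assumptions; the original screening was done by reading job descriptions, not by code.

```python
# A few rows copied from Table 1: (term, note, jobs found).
# An empty note means the term was judged focused enough to keep.
TABLE_1_SAMPLE = [
    ("Analytics", "not well focused", 123895),
    ("Big Data", "", 20646),
    ("Business intelligence", "too much reporting", 19709),
    ("Data analytics", "", 15774),
    ("Machine learning", "", 12499),
    ("Research associate", "too vague", 3794),
    ("Econometrics", "too focused", 1860),
    ("Statistician", "", 904),
]


def focused_terms(rows):
    """Return the unflagged terms, most jobs first, lower-cased and quoted."""
    kept = [(term, jobs) for term, note, jobs in rows if not note]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return ['"' + term.lower() + '"' for term, _ in kept]


print(focused_terms(TABLE_1_SAMPLE))
```

Sorting by job count puts the most common focused terms first, so if a search string has to be shortened, the terms lost are the ones that find the fewest jobs.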

Ideally, one could include all the focused terms in a search, but Indeed.com’s search feature limits the size of the search string. To determine the maximum string size, I entered the longest software string and then added the data science terms; the point at which the terms were truncated showed the limit. Table 2 shows the resulting set of search terms that I appended to each software title. For example, when searching for Java, I would enter: Java and (“big data” or “data analytics” or … “statistician”).

and ("big data"
or "data analytics"
or "machine learning"
or "statistical analysis"
or "data mining"
or "data science"
or "quantitative analysis"
or "business analytics"
or "advanced analytics"
or "data scientist"
or "statistical software"
or "predictive analytics"
or "artificial intelligence"
or "predictive modeling"
or "statistical modeling"
or "quantitative research"
or "research analyst"
or "statistical tools"
or "statistician")

Table 2. The data science terms and logic that are appended to every job search.
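
Assembling a full search string from a software name and the terms in Table 2 is simple string concatenation. Here is a minimal Python sketch; the names DATA_SCIENCE_TERMS and build_query are made up for illustration, and the output is meant to be pasted into the search box by hand, not sent to any API.

```python
# The 19 data science terms from Table 2, in the same order.
DATA_SCIENCE_TERMS = [
    "big data", "data analytics", "machine learning",
    "statistical analysis", "data mining", "data science",
    "quantitative analysis", "business analytics", "advanced analytics",
    "data scientist", "statistical software", "predictive analytics",
    "artificial intelligence", "predictive modeling",
    "statistical modeling", "quantitative research", "research analyst",
    "statistical tools", "statistician",
]


def build_query(software, terms=DATA_SCIENCE_TERMS):
    """Append the quoted data science terms to a software search expression."""
    quoted = " or ".join('"' + term + '"' for term in terms)
    return software + " and (" + quoted + ")"


query = build_query("Java")
print(query)
print(len(query), "characters")
```

Printing the length alongside the query makes it easy to see how close a given search comes to whatever size limit the search box enforces.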

Additional Challenges

Some software offered additional challenges. Those with single-letter names, C and R, were found using spaces before and after their names, such as (“ R ” or “ R,”). This isn’t a perfect solution since it would count an advertisement for a data scientist skilled in SAS at the “Toys R Us” company as a job for someone with R skills. Conversely, the search for an R programmer at the SAS Shoe company would also be counted as one for a SAS programmer. Many of these searches have flaws like that, but the limit on the size of the search string restricts how precise they can be. However, if you look through the resulting job advertisements, you’ll see that errors with this search approach are rare.
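
The “Toys R Us” problem is easy to reproduce with a toy model. The sketch below assumes the search behaves like plain substring matching, which is a simplification and not a description of how Indeed.com actually matches text. The advertisement strings are invented for illustration.

```python
def mentions_r(ad_text):
    """Naive stand-in for the (" R " or " R,") search: plain substring tests."""
    return " R " in ad_text or " R," in ad_text


ads = [
    "Data scientist skilled in Python, R, and SQL",  # genuine R job
    "Experience with R or SAS required",             # genuine R job
    "SAS programmer wanted at Toys R Us",            # false positive
    "SAS programmer with Tableau experience",        # correctly ignored
]

for ad in ads:
    print(mentions_r(ad), "|", ad)
```

The third advertisement is counted as an R job even though it only asks for SAS, which is the kind of error described above.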

When advertisements list the C language, it’s most often in the form of “C, C++, or C#”, so no attempt was made to differentiate those variants. However, Objective C was usually advertised for iPad or iPhone application development, so it was excluded.

Microsoft presented another challenge. Just its name combined with the data science terms yielded results that were heavily biased by the inclusion of general-purpose tools such as Microsoft SQL Server. Focusing the search with (“Azure Machine” or “Azure Stream” or “Microsoft R” or “Cortana Intelligence” or “Microsoft Cognitive” or CNTK) used up so much space that two of the data science terms had to be dropped: “statistical tools” and “statistician”.
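
Dropping terms from the end of the list until the search fits can be sketched as follows. The limit of 512 characters is purely a hypothetical value chosen for this example; the actual limit of Indeed.com’s search box is not stated here. The function name fit_to_limit is also an assumption.

```python
TERMS = [
    "big data", "data analytics", "machine learning",
    "statistical analysis", "data mining", "data science",
    "quantitative analysis", "business analytics", "advanced analytics",
    "data scientist", "statistical software", "predictive analytics",
    "artificial intelligence", "predictive modeling",
    "statistical modeling", "quantitative research", "research analyst",
    "statistical tools", "statistician",
]

MICROSOFT = ('("Azure Machine" or "Azure Stream" or "Microsoft R" '
             'or "Cortana Intelligence" or "Microsoft Cognitive" or CNTK)')


def fit_to_limit(software, terms, max_chars):
    """Drop terms from the end of the list until the query fits max_chars.

    Returns the query and the list of terms that had to be dropped.
    """
    kept = list(terms)
    while kept:
        quoted = " or ".join('"' + term + '"' for term in kept)
        query = software + " and (" + quoted + ")"
        if len(query) <= max_chars:
            return query, list(terms[len(kept):])
        kept.pop()
    raise ValueError("no data science term fits within the limit")


# 512 is a hypothetical limit used only to illustrate the idea.
query, dropped = fit_to_limit(MICROSOFT, TERMS, max_chars=512)
print(len(query), dropped)
```

Because the terms are ordered from most to least common, trimming from the end sacrifices the terms that contribute the fewest jobs.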

Another challenging search was for Domino Data Labs’ Data Science Platform. The search (Domino and “Data Science Platform”) found no jobs, not even for those from the company itself! Just the term “Domino” along with the data science terms found mostly job descriptions that mentioned Lotus Domino. For the 2017 search, I simply culled the small number of results down by hand.

Similarly, the search for Alpine and the data science terms yielded hits that were mostly irrelevant, so I culled them manually.

The Search Terms

Table 3 shows the search terms used for each software. See Table 2 for the data science terms that were appended to every search except Microsoft, whose complete search is shown below.

Alteryx and data sci terms

Amazon:
I'm leaving Amazon and Microsoft very loosely defined:
(Amazon or AWS) and data sci terms
Because when I focus it using the following, it misses too many:
("Amazon Machine Learning" 
or "AWS Certified Machine Learning")
and data sci terms

Anaconda and data sci terms
(drop Flink next year; not general enough)
Apache Hadoop: Hadoop and data sci terms

Apache Mahout: Mahout and data sci terms

Apache Flink: Flink and data sci terms

Apache MXNet: MXnet and data sci terms
Apache Pig: Pig and data sci terms
Apache Spark: Spark and data sci terms
BigML and data sci terms
"BlueSky Statistics" and data sci terms

BMDP and data sci terms
("C programmer" or "C programming" or "C developer" or
 "C++" or "C#") and !("objective c") and data sci terms
Caffe and data sci terms

"Civis Analytics" and data sci terms

Cognos and data sci terms

Databricks and data sci terms
Dataiku and data sci terms

DataRobot and data sci terms
Deducer and data sci terms

Domino Data Labs: "domino data" and data sci terms
[Beware of just "domino" as it gets pizza data sci jobs!]
FICO [avoid this one; too hard to focus away from credit scores]

FORTRAN and data sci terms

Google: 
Google and data sci terms
I've greatly simplified this one along with Amazon and Microsoft as it's hard not to undercount otherwise. Compare to this old version:
("Google Cloud Machine Learning"
or "Google Cloud AutoML"
or "cloud Dataproc"
or "Cloud Datalab")
and ("big data" or "data analytics" or "machine learning" or "statistical analysis" or "data mining" or "data science" or "quantitative analysis" or "business analytics" or "advanced analytics" or "data scientist" or "statistical software" or "predictive analytics" or "artificial intelligence" or "predictive modeling" or "statistical modeling" or "quantitative research" or "research analyst") 
so it's missing: or "statistical tools" or "statistician"

(graphPad and Prism) and data sci terms

H2O: 
"H2O" and data sci terms

Hadoop and data sci terms

IBM SPSS: (SPSS and !"SPSS Modeler") and data sci terms

IBM SPSS Modeler: "SPSS Modeler" and data sci terms

IBM Watson: "IBM Watson" and data sci terms

JASP and data sci terms

jamovi and data sci terms

Java and data sci terms 
JMP and data sci terms
Julia and data sci terms

Keras and data sci terms
KNIME and data sci terms
Lavastorm and data sci terms

Lasagne and data sci terms 
[check this carefully]

Mathematica and data sci terms
MATLAB and data sci terms
(Megaputer or Polyanalyst) and data sci terms
Minitab and data sci terms

Microsoft:
Azure and data sci terms

MLlib and data sci terms 
NCSS and data sci terms

OpenText and data sci terms

("OriginPro" or "OriginLab") and data sci terms
[Just "origin" vastly overcounts.]

Pentaho and data sci terms

Python and data sci terms

Pytorch and data sci terms

R:
(" R " or " R,") and data sci terms
 
"R Commander" and data sci terms

RapidMiner and data sci terms

Rattle: (Rattle and !"Rattle off") and data sci terms

RKWard and data sci terms

("SAP Predictive" 
or "SAP Automated" 
or "SAP Leonardo"  
or "SAP Hana") and data sci terms

SAS:
(SAS and !"Enterprise Miner") and data sci terms 

SAS Enterprise Miner: 
"Enterprise Miner" and data sci terms

Scala: "Scala" and data sci terms 

"Scikit Learn" and data sci terms

"Splunk" and data sci terms

SQL and data sci terms

Stata and data sci terms 

Statgraphics and data sci terms 

Systat and data sci terms 

Tableau and data sci terms 

Tensorflow and data sci terms

Theano and data sci terms 

Tibco: 
(tibco or spotfire or statistica) and data sci terms 

Vowpal Wabbit

WEKA and data sci terms

World Programming:
"WPS Analytics" and data sci terms
[Test it this way]

Table 3. Search terms used for each software (see Table 2 for data science terms).
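
Several entries in Table 3 combine alternative names with “or” and exclude phrases with “!”. Here is a small Python sketch of a helper that writes such expressions in the notation used in the table. The function name and its arguments are hypothetical, and the “!” syntax is simply copied from the entries above.

```python
def software_expression(include, exclude=()):
    """Build a software search expression in the notation of Table 3.

    include: alternative names, joined with "or".
    exclude: phrases to rule out, each added as "and !phrase".
    Names containing a space are wrapped in double quotes.
    """
    def fmt(name):
        return '"' + name + '"' if " " in name else name

    expr = " or ".join(fmt(name) for name in include)
    if len(include) > 1:
        expr = "(" + expr + ")"
    for phrase in exclude:
        expr = expr + " and !" + fmt(phrase)
    return "(" + expr + ")" if exclude else expr


print(software_expression(["SPSS"], exclude=["SPSS Modeler"]))
print(software_expression(["tibco", "spotfire", "statistica"]))
print(software_expression(["Rattle"], exclude=["Rattle off"]))
```

Each result can then be followed by “and” plus the data science terms to form the complete search.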

Searching for Trends

Indeed.com has a Job Trends tool that lets you see how jobs have changed over the last several years. You can enter one or more searches from the examples above to see the trends. Unfortunately, the search for trends must be much simpler than Indeed.com’s main job search. The best pair of queries I could get to compare R and SAS is:

R 
and ("big data"
or "data analytics"
or "machine learning"
or "statistical analysis"
or "data mining"
or "data science")

SAS
and ("big data"
or "data analytics"
or "machine learning"
or "statistical analysis"
or "data mining"
or "data science")

Now that you’ve got the details, check out the results here. I’m very interested in improving this methodology so if you have ideas, please comment below or send me email at muenchen.bob@gmail.com.

21 Responses to How to Search for Data Science Jobs

  3. Soumya Boral says:

    A very good study indeed. By the way do you use the Advanced Job Search Option on indeed.com or on the ordinary search option .

  4. boral1 says:

    Very good study . By the way are you using the Advanced Job Search to get the numbers or the ordinary search option ?

  6. Thanks Bob – applied to several positions today using this precise methodology… Wonderful!

    • Bob Muenchen says:

      Hi Joe,

      Since my goal was to measure the popularity or market share of analytics tools, it did not even occur to me that people would use this info to actually find jobs. Doh! I’m glad it helped!

      Cheers,
      Bob

  9. ProQuotient says:

    This is an extremely helpful post for people who are looking for jobs in the Data Science industry. Using advanced search is quite effective to find the posts you are looking for. Thank you for sharing this!

  14. Africa says:

    Indeed.com should have a classification for employers for “Data Science ” for all of the “data scientists ” from different industries. Therefore, job hunting would be much easier for potential data scientists. Thanks for the articles.

    • Bob Muenchen says:

      Hi Africa,

      If you look at their “advanced search” you’ll see the fields they allow you to search on. I’m searching on all fields, and it’s very fast. They could create a classification system where you would choose a category, like “Science & Technology” and then choose a subcategory like “Data Scientist”, but it would be a big job for them to maintain as terminology changes.

      Cheers,
      Bob
