Anaconda Releases 2022 State of Data Science Survey Results

source link: https://devm.io/databases/state-of-data-science-survey-2022

From April 25 to May 14, 2022, 3,493 individuals from 133 countries and regions took Anaconda's State of Data Science online survey. The survey collected demographic information on the data science, machine learning, and artificial intelligence community, analysed how the community works, and gathered details on the major issues and trends currently on people’s minds. Here are some of the findings.

Geographical distribution of survey respondents (source: 2022 State of Data Science Report)

Who are the respondents?

  • 66.54% of the respondents are either Generation Z or Millennials
  • 76% are male, indicating that the industry is still male-dominated
  • over 80% have a college degree
  • 16.46% are data scientists
  • 33.93% hold senior-level positions
  • most work in technology, finance, consulting, healthcare, and automotive
  • more than half work for companies with fewer than 1,000 employees

How do the respondents spend their time?

  • respondents spend the largest shares of their time on data preparation and cleansing (37.75%) and on reporting and presentation (16%)

Data science and ML measures and tools organisations use

  • 46.83% of commercial respondents said their organisations use Anaconda
  • other popular tools are GitHub (44.94%), Posit (33.33%), Stack Overflow (31.57%), and Tableau (30.65%)
  • 30.61% of organisations evaluate data collection methods according to internally-set criteria
  • 24.84% manually assess data sets for fairness and bias
  • 35.36% perform a series of controlled tests to assess model interpretability
  • 30.23% ensure model outcomes are applicable to all related groups and treatments in test samples

Where do organisations deploy models into production?

  • the majority of commercial respondents (69.20%) deploy models into production, typically via an on-premises local server (41.32%) or the cloud (27.88%)

The most important skills/areas of expertise missing in the data science/ML areas of organisations

  • the five skills/areas of expertise respondents most often report as missing are engineering skills (38.12%), probability and statistics (33.26%), business knowledge (32.22%), communication skills (30.56%), and big data management (29.24%)

Attitudes towards open-source software

  • 87% of organisations allow the use of open-source software
  • 52% encourage employees to contribute to open-source projects
  • when it comes to the benefits of OSS, respondents most often value affordability (20.84%) and speed of innovation (20.54%)

Addressing security challenges

  • 31.08% of the respondents identified security vulnerabilities as the biggest challenge in the open-source community
  • 40.39% claim they use vulnerability and security scanning software
  • 32.76% develop and use custom and proprietary software
  • 27.48% perform manual model and application audits

Programming languages used in data science and machine learning

  • Python remains the programming language of choice among data scientists
  • SQL and Bash/Shell follow
