Data & AI Pillar guide 18 min read · Updated Apr 2026

Data Scientist Career Guide 2025:
Salaries, LLM Skills & Resume Strategy

The data science job title is fragmenting. Generative AI has created three new roles, collapsed entry-level hiring, and added an 18–30% salary premium for LLM skills. This guide maps what actually changed, what the real salary data says across seniority and specialisation, and exactly what to put on your resume to pass ATS in the new landscape.

BLS: 34% job growth 2024–2034 — fastest outside healthcare
ML Engineer roles grew 350%+ over the last decade
LLM skills command 18–30% salary premium over baseline
4.2M unfilled AI roles globally — 320K qualified developers

The actual market

What's actually happening in the data science job market in 2025

Two contradictory narratives dominate data science career discourse in 2025, and both contain real signal. The pessimistic read: entry-level data scientist postings have declined significantly; a 2025 study examining 285,000 companies found that GenAI adoption reduced junior hiring; and tech layoffs hit the field hard in 2022–2023. The optimistic read: the BLS projects 34% employment growth for data scientists from 2024 to 2034 — roughly 23,400 openings per year — and LinkedIn ranked AI Engineer as the #1 fastest-growing job category in early 2025. Both are true. The resolution: the data science title is declining, while data science work is exploding into more specialised, better-paid roles.

What's gone: the "Swiss Army knife" data scientist of 2015 — one person expected to clean data, build models, deploy to production, and present to the CEO. Companies found this delivered poor ROI. The role has split into at least three distinct profiles that look completely different in job descriptions and on resumes. What's surging: ML Engineering (350%+ job posting growth over the decade), AI Engineering (LLM-focused application building), and MLOps roles that operationalise models at scale. And notably: even amid a 27% year-over-year decline in overall tech hiring in 2025, AI-related postings grew 16% over the same period.

34%

BLS-projected job growth for data scientists, 2024–2034. Fastest of any non-healthcare tech role.

BLS Occupational Outlook Handbook, 2024

$112,590

BLS median annual wage for data scientists (May 2024). Total comp at FAANG ranges $180K–$450K+.

BLS OEWS 2024

38%+

ML Engineer salary premium over Data Scientists at the same seniority level, per Jobs-in-data.com analysis of thousands of listings.

Jobs-in-data.com, 2025

The entry-level collapse is real — but it's not the whole story

A September 2025 study examining 285,000 companies found that GenAI adoption reduced junior and entry-level hiring. This tracks: tasks that used to require a junior data scientist — basic data cleaning, routine EDA, simple dashboards — are increasingly automated. But the same study found that senior and mid-level roles were minimally affected. The implication for job seekers: getting to mid-level faster by building production-grade portfolio projects (RAG systems, deployed ML APIs, LLM applications) is more valuable than spending years in junior analyst roles. The floor has risen; the ceiling is much higher.

The skill premium data is the most actionable signal. PwC's 2025 Global AI Jobs Barometer found that skills in AI-exposed jobs are changing 66% faster than in other jobs, and that workers with AI skills can command a substantial wage premium. Specifically in data science: Machine Learning skills add ~25% to base salary, Deep Learning adds ~30%, and GenAI/LLM skills add 18–30% over industry baseline. The half-life of a data science skill set has shortened to roughly 18 months — which means what you put on your resume must reflect what you've done in the last two years, not your entire career arc.

Role fragmentation

The three data scientist archetypes — and why your resume needs to pick one

The biggest resume mistake data scientists make in 2025 is writing a generic resume that tries to be all three archetypes simultaneously. ATS systems and recruiters are pattern-matching against specific job descriptions. When your resume mentions Jupyter Notebooks, A/B testing, and Tableau alongside Kubernetes and Kubeflow, it signals confusion about which role you're applying for — and ranks lower in systems tuned for a specific profile. Here's what actually distinguishes the three archetypes in job descriptions:

The Analytics Data Scientist

Business-aligned · Insights-first · Experimentation

$100K–$145K median · $180K+ at top companies

Most common resume mistake: Writing ML Engineer-style bullets ('Deployed containerised model serving 50K req/s') when applying for analytics roles. This signals you're over-engineered for the role and may leave quickly.

Primary tools stack

Python (pandas, NumPy)

SQL

Tableau / Power BI / Looker

A/B testing / experimentation platforms

Statistical modelling (scikit-learn)

Jupyter Notebooks

How to identify if this is the right archetype

Statistical significance, p-values, confidence intervals appear in job description

'Stakeholder communication' or 'presenting to leadership' mentioned prominently

Role at a product company, e-commerce, fintech, or consulting firm

Job title: Data Scientist, Senior Data Scientist, Analytics Scientist, Applied Scientist

Resume priority for this archetype

Lead with business impact quantified in dollars or percentage improvements. 'Built churn prediction model that retained $2.3M ARR.' Recruiters for this archetype care that you translate models into decisions, not just that you built them.

Bullet example — same experience, two framings

Implemented XGBoost classifier using Python to analyse customer data.

Built XGBoost churn classifier for 450K-user subscription platform; model identified 78% of high-risk accounts 30 days before churn, enabling $2.1M in targeted retention campaigns.

The ML Engineer / Applied Scientist

Production-first · Scale-oriented · Systems-thinking

$150K–$200K median · $250K–$400K+ total comp at FAANG

Most common resume mistake: Describing research-style work ('Explored various architectures') when applying to ML engineering roles. These roles need evidence you can take a model from experiment to production.

Primary tools stack

Python + PyTorch / TensorFlow

MLOps: MLflow, Kubeflow, Weights & Biases

Docker + Kubernetes

Cloud: AWS SageMaker / GCP Vertex / Azure ML

Feature stores: Feast, Tecton

Apache Spark, Databricks

How to identify if this is the right archetype

'Production', 'deploy', 'inference', 'latency', 'throughput' in job description

'MLOps', 'model monitoring', 'data drift', 'retraining pipelines' mentioned

Role at a tech platform, AI product company, or large enterprise ML team

Job title: ML Engineer, Applied Scientist, Research Engineer, MLE

Resume priority for this archetype

Lead with systems you shipped, not models you trained. Scale signals (requests/second, users served, latency improvements) are the ATS-ranking keywords for this archetype. MLOps certifications and cloud platform credentials carry real weight.

Bullet example — same experience, two framings

Trained neural network models for recommendation system using TensorFlow.

Designed and deployed real-time recommendation engine serving 12M users at <50ms p95 latency; reduced inference cost 40% through quantisation and batching on AWS SageMaker.

The AI / LLM Engineer

GenAI-focused · Application-builder · Integration-first

$150K–$250K median · Explosive demand since 2023

Most common resume mistake: Writing 'experience with ChatGPT' or 'familiar with LLMs' without showing a deployed RAG system, an agent, or a fine-tuned model. These are table stakes by 2025 — not differentiators.

Primary tools stack

Python + LLM APIs (OpenAI, Anthropic, Google)

LangChain / LlamaIndex / Haystack

RAG pipelines + vector databases (Pinecone, Weaviate, Chroma)

Hugging Face Transformers

Fine-tuning: LoRA, QLoRA, RLHF

Prompt engineering + evaluation frameworks

How to identify if this is the right archetype

'LLM', 'RAG', 'generative AI', 'prompt engineering', 'agents' in job description

'LangChain', 'vector database', 'embeddings', 'fine-tuning' in requirements

Role at a startup building AI-first product, or enterprise AI transformation team

Job title: AI Engineer, LLM Engineer, GenAI Engineer, Applied AI Scientist

Resume priority for this archetype

This archetype cares almost entirely about what you built with LLMs, not whether you trained one. Per Chip Huyen's AI Engineering (O'Reilly, 2025): AI engineers integrate foundation models into applications; ML engineers build those models. Your resume needs to show that you have built, deployed, and evaluated LLM-powered systems.

Bullet example — same experience, two framings

Used GPT-4 API to build a chatbot for customer service queries.

Built production RAG pipeline over 500K enterprise knowledge base documents using LangChain + Pinecone; reduced hallucination rate from 34% to 6% via hybrid retrieval and context reranking; deployed on AWS Lambda serving 8K daily queries.

The LLM skills layer

The LLM skill stack every data scientist needs to understand — and how to show it on your resume

In 2025, nearly every data science job description at a company with more than 50 engineers contains at least one of: RAG, LLM fine-tuning, prompt engineering, vector databases, LangChain, Hugging Face, or agentic AI. These are no longer specialist qualifications — they're base-level literacy. The salary premium for genuine LLM competence (18–30% over baseline) exists because there is a massive gap between candidates who list these keywords and those who can actually build with them. Here's what each skill means in practice, and what evidence belongs on your resume.

Retrieval-Augmented Generation (RAG)

A technique that grounds LLM responses in external data by retrieving relevant document chunks at query time and passing them to the model as context. Mitigates hallucination and knowledge-cutoff problems without expensive fine-tuning.

What to list on your resume

Vector databases used (Pinecone, Weaviate, Chroma, Qdrant), embedding models (text-embedding-3-small, BGE-large), chunking strategy, retrieval metrics achieved. Name the orchestration layer: LangChain, LlamaIndex, Haystack.

Bullet signal format

"Built production RAG pipeline over [N] documents achieving [X]% reduction in hallucinations / [Y]% retrieval precision"

Top ATS keywords

RAG · Retrieval-Augmented Generation · vector database · embeddings · semantic search · LangChain · LlamaIndex · Pinecone · Weaviate
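Stripped of frameworks, the retrieval step of RAG is nearest-neighbour search over embeddings. The sketch below is a minimal, dependency-light illustration: `embed` is a deterministic stand-in for a real embedding model (text-embedding-3-small, BGE-large), and a vector database such as Pinecone or Weaviate replaces the brute-force loop at scale. All names and documents here are hypothetical.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 32) -> np.ndarray:
    """Toy stand-in for a real embedding model -- deterministic, not semantic."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with highest cosine similarity to the query."""
    q = embed(query)
    scores = [float(q @ embed(c)) for c in chunks]  # unit vectors: dot = cosine
    top = np.argsort(scores)[::-1][:k]
    return [chunks[int(i)] for i in top]

chunks = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Enterprise plans include SSO and audit logs.",
]
context = retrieve("How long do refunds take?", chunks)
# Retrieved chunks are prepended to the LLM prompt as grounding context:
prompt = "Answer using only this context:\n" + "\n".join(context)
```

The production pipeline keeps this shape: chunk documents, embed them once into a vector store, embed each query at request time, and pass the top-k chunks to the model — the hallucination and retrieval-precision metrics in the bullet format above are measured over exactly this loop.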

Fine-tuning & Parameter-Efficient Methods (LoRA, QLoRA)

Training a pre-trained model on domain-specific data to improve performance on targeted tasks. LoRA (Low-Rank Adaptation) and QLoRA (quantised LoRA) allow fine-tuning on consumer hardware by training only adapter layers, not the full model.

What to list on your resume

Base model used (LLaMA-3, Mistral, GPT-3.5 via OpenAI fine-tuning API), method (LoRA, QLoRA, SFT, RLHF), dataset size, performance improvement achieved. Specify the task (classification, summarisation, code generation, domain Q&A).

Bullet signal format

"Fine-tuned [model] using [method] on [N] domain examples; achieved [X]% improvement in [metric] vs. zero-shot baseline"

Top ATS keywords

fine-tuning · LoRA · QLoRA · RLHF · DPO · SFT · Hugging Face · PEFT · transformers · domain adaptation
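The arithmetic behind LoRA's efficiency fits in a few lines. This is an illustrative NumPy sketch, not the Hugging Face `peft` API: the frozen weight W is augmented with a low-rank update (alpha/r)·B·A, and only the small matrices A and B are trained. The dimensions are assumptions chosen to mimic a single projection layer.

```python
import numpy as np

d, r, alpha = 1024, 8, 16               # hidden size, LoRA rank, scaling (assumed)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen pre-trained weight (not trained)
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialised

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Equivalent to (W + (alpha / r) * B @ A) @ x, without materialising the sum.
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = d * d            # what full fine-tuning would update
lora_params = A.size + B.size  # what LoRA actually trains: 2 * r * d
print(f"trainable: {lora_params:,} vs {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

Because B starts at zero, the adapted model initially behaves exactly like the base model, and training moves only the ~1.6% of parameters in A and B — which is why LoRA and QLoRA fit on consumer hardware.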

Prompt Engineering

Systematic design of inputs to elicit reliable, structured outputs from LLMs. Includes few-shot prompting, chain-of-thought, structured output parsing, and evaluation frameworks. At scale, this is a software engineering discipline.

What to list on your resume

Techniques used (chain-of-thought, few-shot, structured output parsing with Pydantic/Instructor), evaluation framework (RAGAS, TruLens, custom eval harness), business context. Avoid listing 'prompt engineering' alone — it reads as table stakes by 2025.

Bullet signal format

"Designed prompt evaluation harness testing [N] prompts across [K] scenarios; identified and resolved 3 systematic failure modes, improving task accuracy by [X]%"

Top ATS keywords

prompt engineering · chain-of-thought · few-shot learning · structured outputs · evaluation · LLM evaluation · RAGAS · prompt optimisation
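An evaluation harness does not need a framework to be real. The sketch below is illustrative: `fake_llm` is a deterministic stub standing in for an actual API call, and the check enforces the structured-output contract that tools like Pydantic or Instructor formalise. The prompts and outputs are hypothetical.

```python
import json

def fake_llm(prompt: str) -> str:
    """Deterministic stub for a real LLM call (OpenAI, Anthropic, etc.)."""
    if "JSON" in prompt:
        return '{"sentiment": "negative", "confidence": 0.9}'
    return "The sentiment seems pretty negative overall."

def passes(output: str) -> bool:
    """Eval check: output must be valid JSON containing the required keys."""
    try:
        data = json.loads(output)
        return {"sentiment", "confidence"} <= set(data)
    except json.JSONDecodeError:
        return False

prompts = {
    "bare": "Classify the sentiment of: 'This product broke in a day.'",
    "structured": ("Classify the sentiment of: 'This product broke in a day.' "
                   'Respond with JSON: {"sentiment": ..., "confidence": ...}'),
}
results = {name: passes(fake_llm(p)) for name, p in prompts.items()}
print(results)  # only the structured variant passes the format check
```

Swap `fake_llm` for a real API call and `passes` for task-specific checks, and this loop is the "eval harness testing [N] prompts across [K] scenarios" from the bullet format above.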

Agentic AI & Orchestration

Systems where LLMs autonomously select tools, plan multi-step tasks, and interact with external APIs to complete goals. Built with frameworks like LangChain Agents, LlamaIndex, AutoGen, CrewAI, or custom orchestration.

What to list on your resume

Framework used, tools integrated (code interpreter, web search, databases, APIs), reliability patterns implemented (human-in-the-loop, fallback handling), scope of deployment. This is the fastest-growing sub-skill in LLM engineering.

Bullet signal format

"Built multi-agent system using [framework] to automate [workflow]; [N] agents coordinating across [K] tools, processing [X] tasks daily with [Y]% task completion rate"

Top ATS keywords

AI agents · agentic AI · LangChain agents · AutoGen · CrewAI · tool use · function calling · orchestration · multi-agent systems

MLOps & Model Lifecycle

The engineering discipline of taking ML/AI models from experiment to reliable production. Covers experiment tracking, model registries, serving infrastructure, monitoring, drift detection, and retraining pipelines.

What to list on your resume

Platforms: MLflow, Weights & Biases, Kubeflow, Airflow, Prefect. Serving: Ray Serve, Triton, ONNX, FastAPI. Monitoring: Evidently AI, Arize, Fiddler. Specify the scale: models served, requests per second, latency SLAs met.

Bullet signal format

"Built MLOps pipeline using [platform] for [N] models; reduced time-to-production from [X] weeks to [Y] days, with automated retraining triggered on [Z]% data drift"

Top ATS keywords

MLOps · MLflow · Kubeflow · model monitoring · data drift · Weights & Biases · feature store · model registry · CI/CD for ML · Airflow
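Drift detection — the retraining trigger in the bullet format above — can be illustrated with the Population Stability Index; monitoring products such as Evidently AI and Arize compute variants of the same statistic. A rough sketch under simple assumptions: one numeric feature and the common 0.2 rule-of-thumb threshold.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between training-time and live feature values."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf          # catch values outside the training range
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(42)
train_feature = rng.normal(0.0, 1.0, 10_000)   # distribution at training time
live_stable   = rng.normal(0.0, 1.0, 10_000)   # live traffic, no drift
live_shifted  = rng.normal(0.8, 1.2, 10_000)   # live traffic after drift

RETRAIN_THRESHOLD = 0.2   # rule of thumb; tune per feature in practice
for name, live in [("stable", live_stable), ("shifted", live_shifted)]:
    score = psi(train_feature, live)
    print(f"{name}: PSI={score:.3f} retrain={score > RETRAIN_THRESHOLD}")
```

In a pipeline, this check runs on a schedule (Airflow, Prefect) over each monitored feature, and a breach of the threshold is what triggers the automated retraining job.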

The portfolio signal that separates real LLM experience from resume inflation

Hiring managers for AI/LLM roles report a significant gap between candidates who list RAG and fine-tuning on resumes and those who can actually discuss implementation decisions in technical screens. The differentiator: a public GitHub repository with deployed code. A resume bullet saying "built RAG pipeline" paired with a linked GitHub repo containing the code, evaluation results, and a deployed demo is far more credible than the same bullet without the link. Add your GitHub profile URL to your resume header — not just LinkedIn.

Salary intelligence

Data science salary reality in 2025: by seniority, role, and specialisation

The BLS median of $112,590 (May 2024) captures the full distribution but doesn't tell the real compensation story for data professionals. The spread is enormous — entry-level analytics roles at regional companies start at $75K–$85K, while senior ML engineers at FAANG companies earn $250K–$450K in total compensation including equity. Here's the complete picture across experience levels and specialisations.

Salary by seniority & archetype — US market 2025

Level | Analytics DS | ML Engineer | AI/LLM Engineer | FAANG total comp
Entry (0–2 yrs) | $80K–$105K | $105K–$130K | $110K–$150K | $180K–$280K
Mid (3–5 yrs) | $110K–$140K | $140K–$180K | $150K–$220K | $250K–$380K
Senior (6–9 yrs) | $140K–$175K | $175K–$220K | $200K–$280K | $350K–$500K
Staff / Principal | $175K–$220K | $210K–$280K | $250K–$350K | $450K–$600K+
Director / VP | $220K–$300K | $280K–$350K | $300K–$400K | $600K+

Sources: BLS OEWS 2024, Jobs-in-data.com, Glassdoor, Levels.fyi. FAANG total comp includes base + bonus + equity.

Skill premiums above baseline salary

+25%

Machine Learning (general)

above baseline data scientist salary

+30%

Deep Learning / Neural networks

above baseline data scientist salary

+18–30%

LLM / Generative AI

above baseline data scientist salary

+20%

Cloud + MLOps

above baseline data scientist salary

The ML Engineer salary gap (38% above Data Scientists at the same seniority) is the most underappreciated compensation story in data careers. A mid-level Data Scientist earns roughly $119,550 median; a mid-level ML Engineer earns roughly $165,000 — a $45K difference for work that is often at the same company, on the same team. The reason: ML Engineers command engineering compensation bands, not data science bands. If you're doing production model deployment, monitoring, and inference optimisation, your title and resume framing are worth $45K+ per year.

Resume architecture

How to structure your data science resume for ATS in 2025

Over 97% of tech companies use an ATS to filter data science resumes. The technical density of the field makes resumes particularly vulnerable to two failure modes: (1) tool-dense skills sections that list 40 items but provide no evidence of proficiency, and (2) research-style experience bullets that describe activities rather than outcomes. Here's the structure that performs.
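To make the first failure mode concrete, here is a toy version of the keyword-matching layer of an ATS. This is illustrative only — commercial systems add weighting, synonym expansion, and semantic scoring on top — but the lesson holds: exact terms from the job description need to appear somewhere in your resume text. The job description and resume snippet below are hypothetical.

```python
def keyword_coverage(resume: str, jd_keywords: list[str]) -> float:
    """Fraction of job-description keywords found verbatim in the resume."""
    text = resume.lower()
    return sum(kw.lower() in text for kw in jd_keywords) / len(jd_keywords)

jd = ["PyTorch", "MLflow", "Kubernetes", "SageMaker", "model monitoring"]
resume = ("Deployed PyTorch recommendation model via MLflow registry to "
          "Kubernetes; monitored drift on AWS SageMaker.")
print(f"coverage: {keyword_coverage(resume, jd):.0%}")  # 80%
```

The miss is instructive: "monitored drift" does not match the exact phrase "model monitoring", which is why mirroring the job description's phrasing — not just describing the same work in your own words — affects how you rank.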

Optimal section order for data science resumes

01

Header — with GitHub and portfolio links (data science-specific)

Name, email, LinkedIn, GitHub (required), and a portfolio URL or deployed project link if you have one. In data science, GitHub is treated as part of your credential set — recruiters for ML and AI roles check it. If your GitHub has deployed models or LLM projects, a live link on your resume is a direct conversion driver. Format: "github.com/yourname | portfolio.dev"

02

Technical Skills — categorised, not listed alphabetically (data science-specific)

Categorise your skills by type, not by importance. Recruiters and ATS scan for specific tool names. Example categories: Programming Languages: Python, R, SQL | ML Frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face | LLM/GenAI: LangChain, RAG, fine-tuning (LoRA/QLoRA), prompt engineering | MLOps: MLflow, Kubeflow, Docker, Kubernetes | Cloud: AWS (SageMaker, S3, Lambda), GCP (Vertex AI), Azure ML | Data/Viz: Spark, dbt, Tableau, Airflow. Keep it to skills you can discuss in depth — ATS scans for presence; humans verify in interviews.

03

Professional Summary — 2–3 sentences, archetype-specific

Lead with your archetype signal and your most differentiating recent work. Do not use "passionate about data" or "results-driven professional." Formula: [Title] with [X years] in [archetype-specific specialty]. [One quantified achievement]. [2–3 most relevant tools/skills for the target role]. Example for ML Engineer: "Machine Learning Engineer with 5 years building and deploying production ML systems at scale. Designed recommendation engine serving 12M users at <50ms p95 latency, reducing infrastructure cost by $1.2M annually. Expert in PyTorch, MLflow, Kubernetes, and AWS SageMaker."

04

Experience — impact-first, tool-evidence

For each role: job title, company, location, dates (MM/YYYY — required for ATS date parsing). Then 3–5 bullets using the formula: [Action verb] + [what you built/did] + [technical specifics] + [quantified outcome]. The technical specifics serve as ATS keyword carriers; the quantified outcome is what human reviewers look for. Avoid: "responsible for," "worked on," "involved in." Use: designed, built, deployed, optimised, reduced, improved, automated.

05

Projects — essential for AI/LLM engineers and new grads

For AI/LLM engineers especially: 2–3 projects with GitHub links and deployed demos carry as much weight as work experience. Include: project name, GitHub link, tech stack, and one quantified outcome. "RAG-powered enterprise knowledge base — LangChain + Pinecone + GPT-4 — 87% retrieval accuracy on 500K documents — github.com/yourname/rag-kb." This section is where career changers and bootcamp graduates can compensate for limited work history.

06

Education & certifications

Degree (BS/MS/PhD in CS, Statistics, Engineering, or related). For data science roles, the field matters more than the institution tier for non-elite programs. List relevant certifications: AWS Certified Machine Learning Specialty, Google Professional ML Engineer, Azure AI Engineer Associate, DeepLearning.AI specialisations (Andrew Ng courses carry real weight in the field). Kaggle Competition Master or Expert status is meaningful signal for analytics DS roles.

The tool-list skills section is killing data science resumes

The most common data science resume failure: a Technical Skills section listing 35+ tools in a flat alphabetical list. This creates two problems simultaneously. For ATS: it passes, but lacks category structure that allows weight-based matching. For human reviewers: it reads as credential inflation — a candidate listing both "beginner SQL" and "production Kubernetes" is immediately suspect. The fix is categorisation with honesty about depth. 12 tools you genuinely know deeply beats 40 tools you've touched once. Recruiters will ask about every item in your skills section in technical screens.


What to avoid

The 8 data science resume mistakes that cost you interviews

01

Listing tools without evidence of depth

Writing 'PyTorch, TensorFlow, Keras, JAX, MXNet' in a skills section without any project or role that demonstrates production use of more than one. Recruiters for ML roles will ask about every item. List the frameworks you've used in production; acknowledge others as 'familiar.'

02

Describing model training as model deployment

A bullet saying 'Trained XGBoost model with 94% accuracy' is research-level evidence. Hiring managers for DS and MLE roles in 2025 want to know if the model reached production. 'Deployed XGBoost model to REST API (FastAPI + Docker) serving 50K daily predictions' is 3× stronger.

03

'Passionate about data' in the summary

Every data scientist is 'passionate about data.' Your summary needs your archetype signal, a quantified achievement, and your 3 most relevant tools — not a personality statement. Generic openers cause ATS semantic scoring to weight your summary near-zero.

04

Omitting GitHub when applying for ML/AI roles

Recruiters and engineers for ML roles check GitHub before interviews. A resume with GitHub omitted leaves the reviewer to assume you have no public code. A resume with a GitHub link showing deployed ML projects is a first-page differentiator for technical roles.

05

Listing 'Machine Learning' without specifying which ML

ML is an umbrella covering supervised classification, regression, clustering, reinforcement learning, NLP, CV, generative models, and more. 'Machine Learning' as a keyword is table stakes. What surfaces in ATS searches is the specific technique or framework: 'XGBoost', 'transformer fine-tuning', 'SARIMA forecasting', 'ResNet50', 'BERT'. Be specific.

06

Research-style bullets that describe process instead of outcome

'Conducted extensive exploratory data analysis and feature engineering to prepare data for model training' describes activity. What was the model? What outcome did it drive? Restructure as: 'Engineered 47 features from raw clickstream data to train LightGBM conversion model, increasing predicted conversion precision from 61% to 84%.' Same work; 5× the signal.

07

Applying to ML Engineer roles with a DS resume (and vice versa)

ML Engineer job descriptions contain signals like 'production', 'deployment', 'latency', 'Kubernetes', 'model serving'. If your resume uses 'stakeholder reporting', 'A/B testing', and 'business intelligence', you're not speaking the same language. Maintain two resume variants: one analytics-focused, one engineering-focused.

08

Listing 'ChatGPT' or 'AI tools' as LLM skills

Using ChatGPT as an end-user is not LLM engineering experience. Recruiters for GenAI roles are looking for: LLM API integration, RAG pipeline implementation, fine-tuning workflows, evaluation frameworks. 'Used ChatGPT for productivity' belongs nowhere on a data science resume.

Pre-submit

Data science resume checklist — run this before every application

Archetype alignment: every bullet speaks the language of the target role (analytics, ML engineering, or AI/LLM).

Technical content: every listed tool is backed by a role or project you can discuss in depth.

ATS formatting: single column, standard section headings, MM/YYYY dates.

Keyword coverage: the job description's key terms appear verbatim in your skills and experience sections.


Build your data science resume.
Archetype-matched, ATS-safe.

FluidBright's data science templates are structured by archetype — Analytics DS, ML Engineer, and AI/LLM Engineer — with salary data and the specific keywords that surface in recruiter searches for each role. Free to start.
