45+ Must-Try ChatGPT Prompts for Data Science

Welcome to our blog post on ChatGPT prompts for Data Science. Well-written prompts can speed up routine data work, from drafting SQL queries to cleaning datasets and sketching models, and they give researchers and analysts a quick way to explore ideas.

In Data Science, where uncovering insights and making data-driven decisions are paramount, well-structured prompts are valuable. You give ChatGPT, a large language model, a specific query and receive a response that you can review and refine. Used within the Data Science workflow, prompts help professionals refine their understanding, ask nuanced questions, and explore alternative perspectives.

While ChatGPT is a capable tool, it is worth mentioning Arvin, an alternative to ChatGPT. Arvin integrates with any website and offers a simple interface for entering prompts and receiving answers, giving you another option for an AI companion in your data exploration.

In the following sections, we will explore more than 45 ChatGPT prompts for Data Science, each tailored to a specific task. Let's dive in.
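The prompts in the following sections use curly-brace placeholders such as {table name} that you replace with your own details before sending. As a minimal sketch, the hypothetical helper fill_prompt below (our own illustration, not part of any ChatGPT tooling) substitutes the values you supply and leaves unknown placeholders untouched so they are easy to spot:

```python
import re

# Matches one {placeholder}; names may contain spaces, slashes, or dots.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def fill_prompt(template: str, values: dict) -> str:
    """Replace each {placeholder} with its value; keep unknown ones as-is."""
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def missing_placeholders(prompt: str) -> list:
    """Return the placeholder names that are still unfilled."""
    return PLACEHOLDER.findall(prompt)


template = (
    "I want you to act as a SQL code programmer. "
    "I am running {database version}. "
    "Can you rewrite this query using CTE? {Insert query}"
)
prompt = fill_prompt(template, {"database version": "PostgreSQL 14"})
print(prompt)
print(missing_placeholders(prompt))
```

A regular expression is used instead of str.format because some placeholders contain dots and slashes, which str.format would misinterpret.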

Best ChatGPT Prompts for Data Science:

ChatGPT Prompts for Data Science: Data Analysis Workflows

SQL data analysis workflows

1. Data generation & creating tables

I want you to act as a data generator. Can you write SQL queries in {database version} that create a table {table name} with the columns {column name}? Include relevant constraints and indexes.
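To give a sense of the kind of answer this prompt aims for, here is a small sketch that runs a CREATE TABLE statement with constraints and an index against an in-memory SQLite database from Python. The customers table, its columns, and the index name are assumptions made up for illustration; ChatGPT's actual output will depend on the details you supply.

```python
import sqlite3

# In-memory SQLite database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,
        signup_date TEXT NOT NULL,
        age         INTEGER CHECK (age >= 0)
    );
    CREATE INDEX idx_customers_signup_date ON customers (signup_date);
    """
)

insert_sql = "INSERT INTO customers (email, signup_date, age) VALUES (?, ?, ?)"
conn.execute(insert_sql, ("ana@example.com", "2023-01-15", 34))

# The UNIQUE constraint rejects a second row with the same email.
try:
    conn.execute(insert_sql, ("ana@example.com", "2023-02-01", 40))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# List the indexes SQLite holds for the table.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'customers'"
    )
]
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(duplicate_rejected, index_names, row_count)
```

Running generated DDL against a scratch database like this is a quick way to confirm that the constraints behave as intended before using them elsewhere.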

2. Common table expressions

I want you to act as a SQL code programmer. I am running {database version}. Can you rewrite this query using CTE? {Insert query}
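As an illustration of what a CTE rewrite looks like, the sketch below runs the same aggregation twice in SQLite, once as a nested subquery and once as a common table expression, and checks that both return the same rows. The orders table and its data are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('ana', 120.0), ('ana', 80.0), ('ben', 40.0), ('cy', 300.0);
    """
)

# Original form: the aggregation is nested as a subquery.
subquery_sql = """
    SELECT customer, total
    FROM (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    ) AS totals
    WHERE total > 100
    ORDER BY customer
"""

# Rewritten form: the same aggregation named up front as a CTE.
cte_sql = """
    WITH totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > 100
    ORDER BY customer
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(cte_rows)
```

Comparing the two result sets on sample data is a simple check that a rewrite suggested by ChatGPT preserves the original query's behavior.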

3. Write SQL queries from natural language

Example: Data aggregation in SQL

I want you to act as a data scientist. {Insert description of tables}. Can you {count/sum/take the average of} {value} where {insert filters}?

Example: 7 day running average in SQL

Please act as a data scientist. I am running {PostgreSQL 14/MySQL 8/SQLite 3.4/other versions}. I have the tables {table_name}, which are {table description}. The {table_name} table consists of the columns {column names}. Can you please write a query that finds the 7-day running average of {quantity}?
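One common way to express a 7-day running average is a window function over the current row and the six preceding rows. The sketch below assumes a hypothetical sales table with exactly one row per day and no gaps, and runs in SQLite (window functions need SQLite 3.25 or later); other databases may call for a different frame clause.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT PRIMARY KEY, quantity INTEGER)")

# Eight consecutive days with quantities 1..8 (made-up data).
start = date(2023, 1, 1)
rows = [((start + timedelta(days=i)).isoformat(), i + 1) for i in range(8)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Average over the current row and the six rows before it.
running_avg_sql = """
    SELECT
        sale_date,
        AVG(quantity) OVER (
            ORDER BY sale_date
            ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
        ) AS avg_7d
    FROM sales
    ORDER BY sale_date
"""
result = conn.execute(running_avg_sql).fetchall()
for sale_date, avg_7d in result:
    print(sale_date, avg_7d)
```

Note that the first six rows average over fewer than seven days, and that a ROWS frame counts rows rather than calendar days, so missing dates would need to be filled in first.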

Example: Window functions in SQL

Please act as a data scientist. I am running {PostgreSQL 14/MySQL 8/SQLite 3.4/other versions}. I have the tables {table_name}, which are {table description}. The {table_name} table consists of the columns {column names}. Can you please write a query that finds {required window function}?

Python data analysis workflows

1. Data generation workflow

Example: Generate Markdown

I want you to act as a data generator in Python. Can you generate a Markdown file that contains {data requirement}. Save the file to {filename}

Example: Generate CSV

I want you to act as a data generator in Python. Can you generate a CSV file that contains {data requirement}. Save the file to {filename}

Example: Generate JSON

I want you to act as a data generator in Python. Can you generate a JSON file that contains {data requirement}. Save the file to {filename}
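The CSV and JSON prompts above typically lead to a short script along these lines. This is a sketch using only the Python standard library; the record fields (order_id, region, amount) and file names are assumptions chosen for illustration, and the files are written to a temporary directory.

```python
import csv
import json
import random
import tempfile
from pathlib import Path


def generate_rows(n, seed=0):
    """Create n synthetic order records; the fields are made up."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    return [
        {
            "order_id": i + 1,
            "region": rng.choice(["north", "south", "east", "west"]),
            "amount": round(rng.uniform(5, 500), 2),
        }
        for i in range(n)
    ]


def write_csv(rows, path):
    """Write the records as CSV with a header row."""
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def write_json(rows, path):
    """Write the records as a JSON array."""
    path.write_text(json.dumps(rows, indent=2), encoding="utf-8")


out_dir = Path(tempfile.mkdtemp())
rows = generate_rows(5)
csv_path = out_dir / "orders.csv"
json_path = out_dir / "orders.json"
write_csv(rows, csv_path)
write_json(rows, json_path)
print(csv_path, json_path)
```

Seeding the random generator keeps the generated dataset identical between runs, which helps when the data feeds tests or tutorials.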

2. Data cleaning workflow

I want you to act as a data scientist programming in Python Pandas. Given a CSV file that contains data of {dataframe name} with the columns {column names} for {dataset context}, write code to clean the data. {Insert requirements for data}
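A cleaning script produced from this prompt might resemble the sketch below. The column names, the sample rows, and the choice to fill missing values with the median are all assumptions for illustration; your own requirements determine the actual steps.

```python
import io

import pandas as pd

# Made-up raw data standing in for a CSV file on disk.
raw_csv = io.StringIO(
    "Customer ID, Signup Date ,Plan,Monthly Spend\n"
    "1,2023-01-05,basic,10.5\n"
    "1,2023-01-05,basic,10.5\n"
    "2,not a date,PRO ,\n"
    "3,2023-02-11, pro,25\n"
)
df = pd.read_csv(raw_csv)

# Normalize column names: trim, lowercase, snake_case.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Drop exact duplicate rows.
df = df.drop_duplicates()
# Unparseable dates become NaT instead of raising an error.
df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")
# Make category labels consistent.
df["plan"] = df["plan"].str.strip().str.lower()
# Fill missing spend with the median of the observed values.
df["monthly_spend"] = df["monthly_spend"].fillna(df["monthly_spend"].median())
df = df.reset_index(drop=True)
print(df)
```

Spelling out requirements such as "treat invalid dates as missing" in the prompt makes it more likely that the generated code handles these cases the way you expect.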

3. Data analysis workflow in pandas

Example: Data Aggregation

I want you to act as a data scientist programming in Python Pandas. Given a table {table name} that consists of the columns {column names}, can you please write code that finds {requirement}?

Example: Data Merging

I want you to act as a data scientist programming in Python Pandas. Given a table {table 1 name} that consists of the columns {column names} and another table {table 2 name} with the columns {column names}, please merge the two tables. {Insert additional requirement, if any}
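For reference, a merge in pandas often looks like the sketch below. The customers and orders tables are made up, and the left join is just one possible choice; state the join type you need in the "additional requirement" part of the prompt.

```python
import pandas as pd

# Two small made-up tables sharing a customer_id key.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]}
)
orders = pd.DataFrame(
    {
        "order_id": [10, 11, 12],
        "customer_id": [1, 1, 4],
        "amount": [20.0, 35.0, 50.0],
    }
)

# Left join keeps every customer, including those with no orders.
merged = customers.merge(
    orders,
    on="customer_id",
    how="left",
    validate="one_to_many",  # fail loudly if customer_id repeats in customers
    indicator=True,          # adds a _merge column showing each row's origin
)
print(merged)
```

The indicator column makes unmatched rows visible: customers without orders are marked left_only, while the order for the unknown customer 4 is dropped by the left join.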

Example: Data Reshaping

I want you to act as a data scientist programming in Python Pandas. Given a table {table name} that consists of the columns {column names}, can you aggregate the {value} by {column} and convert it from long to wide format?
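Aggregating and reshaping from long to wide can be done in one step with pivot_table. The sketch below uses a made-up sales table; the column names and the choice of sum as the aggregation are assumptions.

```python
import pandas as pd

# Made-up long-format sales data: one row per individual sale.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "north", "south", "south"],
        "quarter": ["Q1", "Q1", "Q2", "Q1", "Q2"],
        "revenue": [100, 50, 70, 80, 90],
    }
)

# Sum revenue per region and quarter, spreading quarters into columns.
wide = sales.pivot_table(
    index="region",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
    fill_value=0,
)
print(wide)
```

Each region becomes one row and each quarter one column, with the two north Q1 sales combined into a single total.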

R data analysis workflows

1. Data generation workflow 

Example: Generate Markdown

I want you to act as a data generator in R. Can you generate a Markdown file that contains {data requirement}. Save the file to {filename}

Example: Generate CSV

I want you to act as a data generator in R. Can you generate a CSV file that contains {data requirement}. Save the file to {filename}

Example: Generate JSON

I want you to act as a data generator in R. Can you generate a JSON file that contains {data requirement}. Save the file to {filename}

2. Data cleaning workflow

I want you to act as a data scientist programming in R tidyr. You are given the {dataframe name} dataframe containing the columns {column name}. {Insert requirement}

3. Data analysis workflow in tidyr

Example: Data Aggregation

I want you to act as a data scientist programming in R tidyr. You are given the {dataframe name} dataframe containing the columns {column name}. {Insert requirement}

Example: Data Merging

I want you to act as a data scientist programming in R tidyr. You are given the {dataframe 1 name} dataframe containing the columns {column name}. You also have a {dataframe 2 name} dataframe containing the columns {column name}. Find the {required output}.

Example: Data Reshaping (Long to Wide)

I want you to act as a data scientist programming in R tidyr. You are given the {dataframe name} dataframe containing the columns {column name}.  Please convert the data to wide format.

Example: Data Reshaping (Wide to Long)

I want you to act as a data scientist programming in R tidyr. You are given the {dataframe name} dataframe containing the columns {column name}.  Please convert the data to long format.

ChatGPT Prompts for Data Science: Data Visualization Workflows

R data visualization workflows

1. Creating plots in ggplot2

I want you to act as a data scientist coding in R. Given a dataframe {dataframe name} containing the columns {column names}, use ggplot2 to plot a {chart type and requirement}.

2. Gridplot visualizations in ggplot2

I want you to act as a data scientist coding in R. Given a dataframe {dataframe name} containing the columns {column names}, use ggplot2 to plot a pair plot that shows the relationship of one variable against another.

3. Annotating and formatting plots

I want you to act as a data scientist coding in R. Given a dataframe {dataframe name} containing the columns {column names}, use ggplot2 to plot a {chart type} showing the relationship between {variables}. {Insert annotation and formatting requirements}

4. Changing plot themes in ggplot2

I want you to act as a data scientist coding in R. Given a dataframe {dataframe name} containing the columns {column names}, use ggplot2 to plot a {chart type} showing the relationship between {variables}. Change the color theme to match that of {theme}.

Python data visualization workflows

1. Creating plots with matplotlib

I want you to act as a data scientist coding in Python. Given a dataframe {dataframe name} containing the columns {column names}, use matplotlib to plot a {chart type and requirement}.

2. Creating pairplots with matplotlib

I want you to act as a data scientist coding in Python. Given a dataframe {dataframe name} containing the columns {column names}, use matplotlib to plot a pair plot that shows the relationship of one variable against another.

3. Annotating and formatting plots in matplotlib

I want you to act as a data scientist coding in Python. Given a dataframe {dataframe name} containing the columns {column names}, use matplotlib to plot a {chart type} showing the relationship between {variables}. {Insert annotation and formatting requirements}

4. Changing plot themes in matplotlib

I want you to act as a data scientist coding in Python. Given a dataframe {dataframe name} containing the columns {column names}, use matplotlib to plot a {chart type} showing the relationship between {variables}. Change the color theme to match that of {theme}.

ChatGPT Prompts for Data Science: Machine Learning Workflows

General machine learning workflow

1. Feature engineering ideation

I want you to act as a data scientist. Given a dataset of {dataset name} that contains the {columns}, you are to predict {predicted variable}. Suggest additional data that would be helpful and propose feature engineering steps for this problem.

Python machine learning workflow

1. Model training workflow

I want you to act as a data scientist programming in Python. Given a dataset of {dataframe name} that contains the {column name}, write code to predict {output variable}.

2. Hyperparameter tuning workflow

I want you to act as a data scientist programming in Python. Given a {type of model} model, write code to tune its hyperparameters.

3. Model explainability workflow

I want you to act as a data scientist programming in Python. Given a {type of model} that predicts the {target variable}, write code that explains an output using SHAP values.

R machine learning workflow

1. Model training workflow

I want you to act as a data scientist programming in R. Given a dataframe of {dataframe name} that contains {column names}, write code to predict {output}.

2. Hyperparameter tuning workflow

I want you to act as a data scientist programming in R. Given a {type of model} model, write code to tune its hyperparameters.

3. Model explainability workflow

I want you to act as a data scientist programming in R. Given a {type of model} that predicts the {target variable}, write code that explains an output using SHAP values.

ChatGPT Prompts for Data Science: Time Series Analysis Workflows

Python time series analysis workflows

1. Changing time horizons using pandas

I want you to act as a data scientist coding in Python. Given time series data in a Pandas dataframe {dataframe name} with a timestamp index at {original frequency} frequency and one column {column name}, convert the timestamp frequency to {desired frequency}.
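In pandas, changing the frequency of a time series is usually done with resample. The sketch below uses a made-up daily series and shows both directions; summing when downsampling and forward-filling when upsampling are assumptions, and the right choice depends on what the column measures.

```python
import pandas as pd

# Made-up daily series: 14 days with values 1..14, starting on a Monday.
daily = pd.DataFrame(
    {"sales": range(1, 15)},
    index=pd.date_range("2023-01-02", periods=14, freq="D"),
)

# Downsample daily -> weekly (weeks ending Sunday), summing within each week.
weekly = daily.resample("W").sum()

# Upsample weekly -> daily, carrying each weekly value forward.
back_to_daily = weekly.resample("D").ffill()

print(weekly)
print(back_to_daily)
```

Upsampling does not recover the original daily values: each day simply repeats the most recent weekly total, so say in the prompt how the gaps should be filled.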

2. Build time series model

I want you to act as a data scientist coding in Python. Given time series data in a dataframe {dataframe name} with a timestamp index at {original frequency} frequency and one column {column name}, build a forecasting model, assuming the data is stationary.

3. Perform stationarity test

I want you to act as a data scientist coding in Python. Given time series data in a dataframe {dataframe name} with a timestamp index at {original frequency} frequency and one column {column name}, perform a Dickey-Fuller test.

R time series analysis workflows

1. Changing time horizons 

I want you to act as a data scientist coding in R. Given time series data in a dataframe {dataframe name} with a timestamp index at {original frequency} frequency and one column {column name}, convert the timestamp frequency to {desired frequency}.

2. Perform stationarity test

I want you to act as a data scientist coding in R. Given time series data in a dataframe {dataframe name} with a timestamp index at {original frequency} frequency and one column {column name}, perform a Dickey-Fuller test.

ChatGPT Prompts for Data Science: Natural Language Processing Workflows

Classify text sentiment

I want you to act as a sentiment classifier. Classify the following text, which came from {describe text origin}, as “positive”, “negative”, “neutral” or “unsure”: {Insert text to be classified}.

Create regular expressions

I want you to act as a programmer coding in Python. Use regular expressions to test if a string {insert requirements}.
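As an example of what this prompt can return, suppose the requirement is "is a date written as YYYY-MM-DD". The requirement, the pattern, and the function name below are assumptions for illustration. The pattern checks the shape of the string only and does not reject impossible dates such as 2023-02-31.

```python
import re

# Hypothetical requirement: the string is a date written as YYYY-MM-DD.
# Month must be 01-12 and day 01-31; real calendar validity is not checked.
ISO_DATE = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])")


def looks_like_iso_date(text: str) -> bool:
    """Return True if the whole string has the form YYYY-MM-DD."""
    return ISO_DATE.fullmatch(text) is not None


for candidate in ["2023-07-14", "2023-13-01", "14/07/2023", "2023-07-14 "]:
    print(repr(candidate), looks_like_iso_date(candidate))
```

Using fullmatch rather than search means that extra characters, such as the trailing space in the last candidate, cause the test to fail.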

Text dataset generation

I want you to act as a dataset generator. Please generate {number of text} texts on {required text and the context}. {Insert additional requirements}.

Machine translation

I want you to act as a translator. Please translate {phrase} from {origin language} to {translated language}.

ChatGPT has emerged as a powerful tool that is changing the way professionals interact with AI and explore data. ChatGPT prompts for data science can enhance creativity, streamline workflows, and provide valuable insights, helping data scientists push the boundaries of their research and make data-driven decisions with confidence.
