Scikit-learn is one of the most approachable Python libraries: with just four or five lines of code, you can train machine learning models such as linear regression or a decision tree. If you are a beginner and want to explore what Python can do, Scikit-learn is a great place to start. You can build real predictive models without mastering the math or complex programming.
By the end of this blog post, you will know what Scikit-learn can do, have worked through one hands-on project with it, and know how to check the Scikit-learn version in CMD, Jupyter Notebook, Anaconda, and Google Colab.
Let's get started.
What Is Scikit-learn?
Scikit-learn is an open-source Python library that makes machine learning easy to use. It helps you to build models that can learn from data and make predictions without the need to write complex algorithms from scratch.
In short, Scikit-learn gives you the tools to teach a computer how to make smart decisions.
What Can You Do with Scikit-learn?
Here are some of the main things you can do with Scikit-learn:
1. Train Machine Learning Models
You can build models that learn patterns in data. For example:
- Predict stock prices.
- Classify emails as spam or not.
- Group customers by shopping habits.
2. Preprocess Data
Raw data often needs cleaning before you can use it. Scikit-learn helps you to:
- Fill in missing values.
- Scale numbers.
- Convert categories to numbers.
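As a rough illustration of these three preprocessing steps, here is a minimal sketch on made-up data. The `ages` and `cities` arrays are invented for this example; `SimpleImputer`, `StandardScaler`, and `OneHotEncoder` are the Scikit-learn classes commonly used for these jobs, though they are not the only options. Note that `SimpleImputer` fills gaps with a value (here the column mean) rather than deleting rows.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, invented for illustration: one numeric column with a gap
# and one categorical column.
ages = np.array([[20.0], [30.0], [np.nan], [40.0]])
cities = np.array([["Paris"], ["Tokyo"], ["Paris"], ["Lima"]])

# Fill the missing age with the mean of the known values: (20 + 30 + 40) / 3.
ages_filled = SimpleImputer(strategy="mean").fit_transform(ages)

# Rescale the ages to zero mean and unit variance.
ages_scaled = StandardScaler().fit_transform(ages_filled)

# Convert each city name into a row of 0s and 1s (one column per city).
encoder = OneHotEncoder()
cities_encoded = encoder.fit_transform(cities).toarray()

print(ages_filled.ravel())
print(ages_scaled.ravel())
print(encoder.categories_[0])
print(cities_encoded)
```

Each tool follows the same pattern: create the object, then call `fit_transform` on your data.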
3. Split Data for Training and Testing
You can easily divide your data into training and testing sets. This helps you to check how well your model is performing.
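For instance, a small sketch of splitting ten made-up samples into 80% training and 20% testing data might look like this. The arrays `X` and `y` are placeholders invented for the example; `random_state` is set only so the split is repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, invented for illustration: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold back 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("Training samples:", X_train.shape[0])
print("Testing samples:", X_test.shape[0])
```

The model only sees the training set while learning, so its score on the testing set gives a fairer picture of how it handles new data.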
4. Choose the Right Algorithm
Scikit-learn offers many ready-to-use machine learning algorithms, like:
- Linear Regression (for predicting numbers)
- Decision Trees (for making decisions based on rules)
- Support Vector Machines (SVM) (for classifying data)
- K-Means (for clustering similar data points)
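One convenient thing about these algorithms is that they share the same basic interface: you call `fit` to train and `predict` to get an answer. The sketch below tries all four on a tiny made-up dataset. The numbers are invented for illustration, and settings such as `kernel="linear"` and `n_clusters=2` are choices made for this toy case, not recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Toy data, invented for illustration: two well-separated groups of points.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_number = 2 * X.ravel() + 1              # numeric target following y = 2x + 1
y_class = np.array([0, 0, 0, 1, 1, 1])    # class label for each point

# Every estimator is trained with the same fit() call.
reg = LinearRegression().fit(X, y_number)
tree = DecisionTreeClassifier(random_state=0).fit(X, y_class)
svm = SVC(kernel="linear").fit(X, y_class)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)  # no labels needed

new_point = np.array([[4.0]])
print("Linear Regression:", reg.predict(new_point))
print("Decision Tree:", tree.predict(new_point))
print("SVM:", svm.predict(new_point))
print("K-Means cluster labels:", kmeans.labels_)
```

K-Means is the odd one out because it is unsupervised: it groups the points on its own, so the cluster numbers it assigns are arbitrary.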
5. Evaluate Your Model
Once your model is trained, Scikit-learn helps you to measure how well it is working using tools like:
- Accuracy scores.
- Confusion matrices.
- Cross-validation.
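Here is a minimal sketch of those three evaluation tools used together. It assumes the small Iris dataset that ships with Scikit-learn and a decision tree as the model; both are picked only to keep the example self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Built-in toy dataset: 150 flowers, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy: the fraction of test samples predicted correctly.
accuracy = accuracy_score(y_test, y_pred)

# Confusion matrix: rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, y_pred)

# Cross-validation: train and score the model on 5 different splits.
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)

print("Accuracy:", accuracy)
print("Confusion matrix:\n", matrix)
print("Cross-validation scores:", cv_scores)
```

A single accuracy number depends on one particular split, which is why cross-validation scores are often a more reliable guide.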
Email Spam Classifier Using Scikit-learn
We are going to train a machine learning model that classifies messages as spam or not spam. Although the goal is an email-style spam filter, we train on the SMS Spam Collection Dataset, a public dataset of 5,000+ SMS messages labeled as spam or ham (not spam).
Step-by-Step Working Code
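Please install scikit-learn on your computer before running the code. The example also uses pandas to load the dataset, so install it the same way (pip install pandas) if you don't already have it.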
How to install Scikit-learn:
If you are using a terminal or command line:
Run this command:
pip install scikit-learn
Or, if you are using Python 3 specifically:
pip3 install scikit-learn
If you are using Jupyter Notebook or an online IDE like Replit:
Add this line at the very top of your code in a new cell (or just run it once):
!pip install scikit-learn
Now, your environment to train the model is ready. Run this code.
# Step 1: Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
# Step 2: Load the dataset
# You can use the "SMS Spam Collection Dataset"
url = "https://raw.githubusercontent.com/justmarkham/pycon-2016-tutorial/master/data/sms.tsv"
data = pd.read_csv(url, sep="\t", header=None, names=["label", "message"])
# Step 3: Convert labels to numbers (ham = 0, spam = 1)
data["label_num"] = data.label.map({"ham": 0, "spam": 1})
# Step 4: Split the data
X_train, X_test, y_train, y_test = train_test_split(
    data["message"], data["label_num"], test_size=0.2, random_state=42
)
# Step 5: Vectorize the text (convert words to numbers)
vectorizer = CountVectorizer()
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)
# Step 6: Train the model
model = MultinomialNB()
model.fit(X_train_vectorized, y_train)
# Step 7: Make predictions
y_pred = model.predict(X_test_vectorized)
# Step 8: Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
Expected Output
Accuracy: 0.98
Classification Report:
               precision    recall  f1-score   support

           0       0.99      0.99      0.99       966
           1       0.94      0.94      0.94       149

    accuracy                           0.98      1115
   macro avg       0.96      0.96      0.96      1115
weighted avg       0.98      0.98      0.98      1115
What Do You Learn from This Example?
- How to handle text data.
- How to vectorize text using CountVectorizer.
- How to train a Naive Bayes classifier.
- How to evaluate a text classification model.
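To see what vectorizing actually does, it can help to try it on a handful of messages you can read at a glance. The sketch below uses four made-up messages (not taken from the dataset) and chains `CountVectorizer` and `MultinomialNB` with `make_pipeline`, which is one convenient way to bundle the two steps; it is an alternative arrangement, not the code used in the project above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up messages for illustration: 1 = spam, 0 = ham.
messages = [
    "win a free prize now",
    "free cash prize claim now",
    "are we meeting for lunch today",
    "see you at lunch",
]
labels = [1, 1, 0, 0]

# Each message becomes a row of word counts, one column per known word.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(messages)
vocabulary = sorted(vectorizer.vocabulary_)
print("Vocabulary:", vocabulary)
print("Word counts:\n", counts.toarray())

# A pipeline vectorizes and classifies in one step, so new text can be
# passed in as plain strings.
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(messages, labels)
predictions = spam_filter.predict(["claim your free prize", "lunch today"])
print("Predictions:", predictions)
```

By default, `CountVectorizer` ignores one-letter words such as "a", and words the model never saw during training (like "your" here) are simply skipped at prediction time.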
Now let's find out how to check the Scikit-learn version in CMD, Jupyter Notebook, Anaconda, and Google Colab.
How to Check Version of Scikit-learn in CMD (Command Prompt)
To check the Scikit-learn version using the command line (CMD), just run the following command:
pip show scikit-learn
This will return information about the installed package, including the version. You’ll see something like this:
Name: scikit-learn
Version: 1.3.2
If this command does not work, make sure pip is installed and you are in the correct Python environment.
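If you would rather look up the version from a Python script than from pip, one possible approach is the standard library's `importlib.metadata` module (available in Python 3.8 and later). The helper name `installed_version` is made up for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Use the package name as pip knows it: "scikit-learn", not "sklearn".
result = installed_version("scikit-learn")
if result is None:
    print("scikit-learn is not installed in this environment")
else:
    print("scikit-learn version:", result)
```

This reports on whichever Python environment runs the script, which can help when pip and Python seem to disagree.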
How to Check Version of Scikit-learn in Jupyter Notebook
If you are working in a Jupyter Notebook, you can check the version using Python code:
import sklearn
print(sklearn.__version__)
This will print the version directly in your notebook output.
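Once you can read the version, you can also compare it against a minimum that your code needs. The sketch below is one simple way to do that using only the standard library; the helper `version_tuple` and the `MINIMUM` value of (1, 0) are hypothetical and chosen purely for illustration.

```python
import re

import sklearn


def version_tuple(version_string):
    """Turn a version like '1.3.2' into a tuple of its leading integers."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at suffixes such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


MINIMUM = (1, 0)  # hypothetical requirement, for illustration only

if version_tuple(sklearn.__version__) >= MINIMUM:
    print("scikit-learn", sklearn.__version__, "meets the minimum")
else:
    print("scikit-learn", sklearn.__version__, "is older than the minimum")
```

Comparing tuples of integers avoids the trap of comparing version strings directly, where "1.10" would sort before "1.9".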
How to Check Version of Scikit-learn in Anaconda
If you are using Anaconda, you can check the version in a couple of ways:
Option 1: Using Anaconda Prompt
Open the Anaconda Prompt and run:
conda list scikit-learn
This will display the version and installation details.
Option 2: Using Anaconda Navigator
- Open Anaconda Navigator
- Go to the Environments tab
- Select your environment (e.g., base)
- Search for scikit-learn in the list
- You will see the version number right next to it
How to Check Scikit-learn Version in Google Colab
Run this code in a Google Colab cell:
import sklearn
print(sklearn.__version__)
Colab runs Python in the cloud, but the method to check versions is exactly the same as in Jupyter.
What Is the Difference Between Conda vs Pip3?
There are a few key differences between conda and pip3.
pip3
- Comes with Python by default.
- Installs packages from PyPI (Python Package Index).
- Works well for most Python packages.
conda
- Comes with Anaconda.
- Manages both Python packages and environments.
- Can install non-Python dependencies (like C libraries).
In short:
- Use pip3 if you are working in a basic Python setup.
- Use conda if you want a full-featured data science environment with package + environment management.
Both tools can install scikit-learn, but you should stick to one (conda or pip) in each environment to avoid conflicts.
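If you are not sure which kind of environment you are currently in, a short script can give you a hint. The sketch below prints the interpreter path and checks for a `conda-meta` folder, which conda environments normally contain. That check is a heuristic assumed for this example, not an official API, and the function name `describe_environment` is made up.

```python
import os
import sys


def describe_environment():
    """Report which interpreter is running and whether it looks like conda."""
    prefix = sys.prefix
    # Heuristic (an assumption, not an official API): conda environments
    # normally have a "conda-meta" directory at their root.
    looks_like_conda = os.path.isdir(os.path.join(prefix, "conda-meta"))
    return {
        "python": sys.executable,
        "prefix": prefix,
        "looks_like_conda": looks_like_conda,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Knowing which interpreter is active tells you where a package will land when you install it.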
Final Thoughts
Scikit-learn is a beginner-friendly Python library that can be used to train many different machine learning models. It is simple, powerful, and flexible. You don’t need to write complex math formulas or invent new algorithms; you just need to know your problem, your data, and how to use the tools.
Knowing your Scikit-learn version helps avoid compatibility issues and makes your development process smoother. Whether you’re using CMD, Jupyter, Anaconda, or Google Colab, checking the version is easy and only takes a line or two.
If you want to keep your machine learning environment clean and organized, choose between pip3 and conda; don’t mix them unless you know what you are doing.
Happy coding!
People Also Ask
How to check Scikit-learn version in CMD?
Run this command:
pip show scikit-learn
This will display the version along with other details.
How do I know if Scikit-learn is installed?
In the command line, run:
pip show scikit-learn
Or in Python:
import sklearn
If there is no error, it’s installed!
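If you want a yes-or-no answer without triggering an error, one option is the standard library's `importlib.util.find_spec`, which looks for a module without importing it. The helper name `is_installed` is made up for this sketch. Remember that the import name is `sklearn`, even though the package is installed as `scikit-learn`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The import name is "sklearn", not "scikit-learn".
print("sklearn installed:", is_installed("sklearn"))
```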
What version of Python is sklearn?
Scikit-learn supports different Python versions depending on the release. You can check compatibility on the official Scikit-learn documentation. To see your Python version, run:
python --version
How to check Scikit-learn version in Colab?
In a Colab cell, run:
import sklearn
print(sklearn.__version__)
It will show the version number right there in the output.