Build Your First Machine Learning Model with Python (Linear & Logistic Regression)

Data Analytics, Machine Learning
August 1, 2025
10:00 am
Author: Kalpana

Download the sample dataset used for the models below: https://go1digital.com/wp-content/uploads/sample_data.csv

Machine Learning (ML) may sound intimidating, but with the right steps, anyone can build and test a model. In this guide, we’ll walk through building Linear Regression and Logistic Regression models in Python, step by step.

Step 1: Import Libraries and Load the Data

We start by importing the required libraries and loading the dataset.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import mean_squared_error, r2_score, accuracy_score, confusion_matrix, classification_report

# Load the dataset (replace 'sample_data.csv' with the path to your file)
data = pd.read_csv('sample_data.csv')

Step 2: Clean the Data

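We drop rows that contain missing values, then separate the input features (X) from the column we want to predict (y). The code expects that column to be named 'target'.

# Remove every row that has at least one missing value
data.dropna(inplace=True)

# Features: every column except 'target'
X = data.drop('target', axis=1)
# Label: the column the models will learn to predict
y = data['target']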
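Before dropping rows, it can help to check how much data dropna() will actually remove. The following is a minimal sketch on a hypothetical toy frame (the feature_a and feature_b columns are made up for illustration and are not taken from sample_data.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real dataset
toy = pd.DataFrame({
    "feature_a": [1.0, 2.0, np.nan, 4.0],
    "feature_b": [10.0, np.nan, 30.0, 40.0],
    "target": [0, 1, 0, 1],
})

# Count missing values per column before removing anything
missing_per_column = toy.isna().sum()
print(missing_per_column)

# dropna() without inplace=True returns a new frame and leaves `toy` untouched
cleaned = toy.dropna()
rows_dropped = len(toy) - len(cleaned)
print(f"Dropped {rows_dropped} of {len(toy)} rows")
```

If a large share of rows disappears, filling the gaps (for example with a column mean) may be worth considering instead of dropping them.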

Step 3: Split Data into Training & Testing Sets

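Holding out 20% of the rows as a test set lets us measure performance on data the model never saw during training. Setting random_state=42 makes the split reproducible.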

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
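When the target holds class labels and one class is rare, a plain random split can leave the test set with a skewed share of that class. As a minimal sketch, assuming hypothetical imbalanced labels rather than the tutorial's dataset, the stratify argument of train_test_split keeps the class proportions the same in both sets:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 rows, 2 features, 80 zeros and 20 ones
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([0] * 80 + [1] * 20)

# stratify=y_demo preserves the 80/20 class ratio in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Train shape:", X_tr.shape, "Test shape:", X_te.shape)
print("Share of class 1 in train:", y_tr.mean())
print("Share of class 1 in test:", y_te.mean())
```

Stratification only makes sense for classification targets; for a continuous target, leave the argument out.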

Step 4: Build and Train the Models

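We’ll train both a Linear Regression and a Logistic Regression model. Keep in mind that they answer different questions: Linear Regression predicts a continuous number, while Logistic Regression predicts a class label. This walkthrough fits both on the same target column for simplicity, which only works if target holds class labels such as 0 and 1. With a continuous target, LogisticRegression raises an error.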

# Linear Regression
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)

# Logistic Regression
logistic_model = LogisticRegression(max_iter=1000)
logistic_model.fit(X_train, y_train)
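Which model fits your data depends on what the target column contains. One way to check is scikit-learn's type_of_target helper. The sketch below uses two hypothetical targets (not taken from sample_data.csv) and shows that a classifier rejects a continuous one:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets for illustration only
class_labels = np.array([0, 1, 1, 0, 1, 0])
measurements = np.array([12.5, 7.25, 3.1, 9.8, 4.4, 6.6])
X_demo = np.arange(6, dtype=float).reshape(-1, 1)

# Ask scikit-learn how it interprets each target
label_kind = type_of_target(class_labels)
measurement_kind = type_of_target(measurements)
print("class_labels:", label_kind)
print("measurements:", measurement_kind)

# A classifier refuses to fit a continuous target
try:
    LogisticRegression().fit(X_demo, measurements)
    fit_failed = False
except ValueError as err:
    fit_failed = True
    print("LogisticRegression rejected the target:", err)
```

A target reported as continuous calls for a regression model. One reported as binary or multiclass calls for a classifier.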

Step 5: Make Predictions

Once the models are trained, we can use them to predict the target for the held-out test set.

linear_preds = linear_model.predict(X_test)
logistic_preds = logistic_model.predict(X_test)
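For Logistic Regression, predict() returns only the final class label. If you also want to know how confident the model is, predict_proba() returns one probability per class. Here is a minimal sketch on synthetic data from make_classification (an assumption for illustration, not the tutorial's dataset):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data standing in for the real dataset
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

# Hard labels versus class probabilities for the first five rows
labels = clf.predict(X_demo[:5])
probabilities = clf.predict_proba(X_demo[:5])

print("Predicted labels:", labels)
print("Class order:", clf.classes_)
print("Probabilities per class:\n", np.round(probabilities, 3))
```

Each row of probabilities sums to 1, and the predicted label is the class with the higher probability.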

Step 6: Evaluate the Models

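We assess each model with metrics suited to its task: Mean Squared Error (MSE) and R² for Linear Regression, and accuracy, a confusion matrix, and a classification report for Logistic Regression.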

# Linear Regression Evaluation
print("Linear Regression:")
print("MSE:", mean_squared_error(y_test, linear_preds))
print("R² Score:", r2_score(y_test, linear_preds))

# Logistic Regression Evaluation
print("\nLogistic Regression:")
print("Accuracy:", accuracy_score(y_test, logistic_preds))
print("Confusion Matrix:\n", confusion_matrix(y_test, logistic_preds))
print("Classification Report:\n", classification_report(y_test, logistic_preds))
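To see what these numbers mean, it can help to compute a few of them by hand. The sketch below uses small hypothetical arrays (not results from the tutorial's dataset) to reproduce MSE and R² from their definitions, and to read accuracy, precision, and recall off a confusion matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, mean_squared_error, r2_score

# Hypothetical regression values
y_true_reg = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_reg = np.array([2.5, 5.5, 7.0, 10.0])

# MSE is the mean of the squared residuals
residuals = y_true_reg - y_pred_reg
mse_manual = np.mean(residuals ** 2)

# R² compares the model's squared error with that of always predicting the mean
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true_reg - y_true_reg.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot
print("MSE:", mse_manual, "R²:", r2_manual)

# Hypothetical binary labels
y_true_clf = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred_clf = np.array([0, 0, 0, 1, 1, 1, 0, 0])

# Rows are true classes, columns are predicted classes
tn, fp, fn, tp = confusion_matrix(y_true_clf, y_pred_clf).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found
print("Accuracy:", accuracy, "Precision:", precision, "Recall:", recall)
```

A lower MSE and an R² closer to 1 indicate a better regression fit. Precision and recall are the same quantities that classification_report prints for each class.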

Step 7: Improve Your Model

Once your first model works, you can improve its performance:

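# 1. Feature Scaling
# Put the scaler and the model in one pipeline so that, during
# cross-validation, the scaler is fit only on each training fold
# (scaling all of X up front would leak test-fold statistics).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 2. Add More Features
# Try adding new columns or transforming existing ones (feature engineering)

# 3. Use Cross-Validation
from sklearn.model_selection import cross_val_score
scores = cross_val_score(scaled_model, X, y, cv=5)
print("Cross-Validation Scores:", scores)
print("Average CV Score:", scores.mean())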
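Feature engineering usually means deriving new columns from the ones you already have. As a minimal sketch, assuming hypothetical columns named length, width, and income (none of which come from sample_data.csv), you could add a product of two features and a log-transformed version of a skewed one:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the column names are made up for illustration
toy = pd.DataFrame({
    "length": [2.0, 3.0, 4.0],
    "width": [1.0, 2.0, 5.0],
    "income": [100.0, 1000.0, 10000.0],
})

# assign() returns a new frame with the extra columns appended
engineered = toy.assign(
    area=toy["length"] * toy["width"],    # interaction of two features
    log_income=np.log10(toy["income"]),   # compress a wide-ranging feature
)
print(engineered)
```

Whether a new feature helps is an empirical question, so compare cross-validation scores with and without it.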

Conclusion

You’ve successfully built, trained, evaluated, and even improved your machine learning models.
Now try these steps on real-world datasets and test out different algorithms for better results.