Beyond Backpropagation: Smarter Neural Networks for Smart Manufacturing
(Code and results of experiments related to the article submitted to ISM-2025: “International Conference on Industry of the Future and Smart Manufacturing”)
ABSTRACT
As neural networks (NNs) become integral to advanced applications in smart manufacturing, the demand for models that are both accurate and robust continues to grow. A persistent challenge in NN training is avoiding local minima, which can prevent the model from minimizing the loss function effectively, both when fitting training data and when generalizing to unseen test data, and thus from reaching globally optimal performance. To address this, we propose an extension of traditional backpropagation that incorporates a self-adaptive mechanism encouraging exploration of underutilized regions of the optimization landscape. The method adds an auxiliary objective to the training process, complementing gradient-based exploitation with an exploration component that dynamically adjusts the network's internal state. We provide a mathematical formulation of the algorithm and conduct comparative experiments showing that our approach achieves lower training loss and superior accuracy. We analyze its connections to existing methods such as momentum and entropy-based regularization, emphasizing its unique contributions. Finally, we discuss the implications for the industry of the future, where NNs must perform reliably under dynamic, real-world conditions. By enabling smarter, self-critical models, this approach advances the development of more reliable and adaptive NNs for smart manufacturing.
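For orientation before the full scripts, the following is a minimal sketch of the two-stage update they all share: a standard backpropagation step, followed by a small homogeneity-driven step computed from the gap between the current weight vector and a running average of earlier weight states. It is illustrative only: the one-layer model, random data, fixed step size, and the sign-based surrogate for the homogeneity gradient are stand-ins for the MNIST setup and the exact derivative used in the code below.

import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(4, 2)                                    # stand-in model
criterion = nn.CrossEntropyLoss()
opt_bp = optim.Adam(model.parameters(), lr=0.01)           # exploitation: backpropagation
opt_hom = optim.Adam(model.parameters(), lr=0.001)         # exploration: homogeneity-driven

# running average of past weight states ("average" in the full scripts)
average = torch.cat([p.data.flatten() for p in model.parameters()]).clone()

for step in range(1, 101):
    x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))    # dummy batch
    # 1) standard backpropagation step
    opt_bp.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    opt_bp.step()
    # 2) homogeneity-driven step: nudge weights away from their historical average
    status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
    delta = 0.001 * torch.sign(status - average)            # simplified surrogate for the homogeneity update
    idx = 0
    for p in model.parameters():
        n = p.nelement()
        p.grad = -delta[idx:idx + n].view_as(p)             # optimizer then moves roughly along delta
        idx += n
    opt_hom.step()
    opt_hom.zero_grad()
    # 3) update the running average of weight states
    average = (status + (step - 1) * average) / step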
CODE AND EXPERIMENTS
Part I (Restricted Implementation)
(Simplified option in which the homogeneity gradient depends only on the current change, without the history part)
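For reference, the quantities implemented below in calculate_similarity and the training loop can be stated compactly. Writing w for the flattened weight vector (status), a for the running average of past weight vectors (average), ε for the small stability constant epsilon, and t for the iteration counter, the Absolute Difference Similarity and the running homogeneity value are

S(w, a) = 1 - \frac{\sum_i |w_i - a_i|}{\varepsilon + \sum_i \left(|w_i| + |a_i|\right)}, \qquad H_t = (1 - \lambda_t)\, S(w_t, a_t) + \lambda_t H_{t-1}, \qquad H_0 = 1,

where λ_t is homogeneity_lambda: λ_t = (t - 1)/T with T = total_iterations in variant [A-A], and λ_t = (t - 1)/t in variant [A-B]. The exploration step delta = -homogeneity_learning_rate · g is then built from the case-by-case partial derivative g computed in calculate_partial_derivative from these same quantities.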
1. HOMOGENEITY NN TRAINING (FINAL CODE) [manual setup for learning rate; lambda depends on total iterations] – [A-A] – MNIST DATASET
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.fc1 = nn.Linear(28 * 28, 128)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(128, 10)
def forward(self, x):
x = x.view(-1, 28 * 28)
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status - average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status - average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status > average, average > 0), torch.logical_and(status < average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0), torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative = torch.zeros_like(status)
partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3] = -(1 - homogeneity_lambda) / D
partial_derivative[condition4] = (1 - homogeneity_lambda) / D
partial_derivative[condition5] = 0
remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2), torch.logical_or(condition3, condition4)), condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (D * torch.sign(status[remaining_indices] - average[remaining_indices]) - N * torch.sign(status[remaining_indices]))
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)
delta = -homogeneity_learning_rate * partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
homogeneity_values = []
homogeneity = 1.0
total_start_time = time.time()
iteration_counter = 1
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
# --- Backpropagation Update ---
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# --- Homogeneity-driven Update ---
status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
weights_before_update = status.clone()
# print("Model weights before homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
similarity = calculate_similarity(status, average)
homogeneity_lambda = (iteration_counter - 1) / total_iterations
homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
homogeneity_values.append(homogeneity.item())
delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
current_index = 0
for param in model.parameters():
param_size = param.nelement()
update_value = delta[current_index: current_index + param_size].view_as(param.data)
param.grad = -update_value
current_index += param_size
optimizer_homogeneity.step()
optimizer_homogeneity.zero_grad()
# print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
distance = torch.norm(weights_before_update - weights_after_update)
# print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")
average = (status + (epoch * len(train_loader) + batch_idx) * average) / (epoch * len(train_loader) + batch_idx + 1)
optimizer.zero_grad()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "
#       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
total_start_time = time.time()
iteration_counter = 1  # Initialize iteration_counter here
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
# Calculate total_bp_update for each iteration
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
model.eval()
correct = 0
total = 0
with torch.no_grad():
for images, labels in data_loader:
images, labels = images.to(device), labels.to(device)
outputs = model(images)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum().item()
return 100 * correct / total
# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!
num_epochs = 8
learning_rate = 0.1
batch_size = 420
homogeneity_learning_rate = 0.1
# Load and split MNIST dataset
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True,
transform=transform)
training_samples_number = int(0.7 * len(full_dataset))
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
print("Hyperparameters and Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size
train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# Initialize models
model_homogeneity = Net().to(device)
model_traditional = Net().to(device)
average_model = Net().to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)
print("\nTraining with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional
print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h,
label='Homogeneity')
plt.title('Homogeneity Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Sample [A-A-MNIST] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.0005
batch_size: 140
training_samples_number: 42000
total_iterations: 2400
Training with Homogeneity-driven update:
Total training time: 82.51 seconds
Training with traditional backpropagation:
Total training time: 58.64 seconds
Final Test Accuracy (Homogeneity-driven): 97.12%
Final Test Accuracy (Traditional): 97.05%
-------------------------------------------------------
Sample [A-A-MNIST] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.001
batch_size: 280
training_samples_number: 42000
total_iterations: 1200
Training with Homogeneity-driven update:
Total training time: 67.44 seconds
Training with traditional backpropagation:
Total training time: 53.19 seconds
Final Test Accuracy (Homogeneity-driven): 97.18%
Final Test Accuracy (Traditional): 97.18%
-------------------------------------------------------
Sample [A-A-MNIST] 3:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.001
batch_size: 420
training_samples_number: 42000
total_iterations: 800
Training with Homogeneity-driven update:
Total training time: 61.46 seconds
Training with traditional backpropagation:
Total training time: 51.95 seconds
Final Test Accuracy (Homogeneity-driven): 97.53%
Final Test Accuracy (Traditional): 97.49%
-------------------------------------------------------
Sample [A-A-MNIST] 4:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 1
learning_rate: 0.01
homogeneity_learning_rate: 0.02
batch_size: 42000
training_samples_number: 42000
total_iterations: 1
Training with Homogeneity-driven update:
Total training time: 7.19 seconds
Training with traditional backpropagation:
Total training time: 6.57 seconds
Final Test Accuracy (Homogeneity-driven): 64.99%
Final Test Accuracy (Traditional): 58.69%
-------------------------------------------------------
2. HOMOGENEITY NN TRAINING (Option with manual setting of learning rate; lambda depends on current iteration) – [A-B] – MNIST DATASET
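Relative to variant [A-A], the only substantive change in this option is the schedule for homogeneity_lambda in the training loop, which here depends on the current iteration counter rather than on the total number of iterations:

# Variant [A-A]: lambda grows with the share of training completed
homogeneity_lambda = (iteration_counter - 1) / total_iterations
# Variant [A-B] (this option): lambda approaches 1 as iterations accumulate
homogeneity_lambda = (iteration_counter - 1) / iteration_counter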
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.fc1 = nn.Linear(28 * 28, 128)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(128, 10)
def forward(self, x):
x = x.view(-1, 28 * 28)
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status - average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status - average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status > average, average > 0), torch.logical_and(status < average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0), torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative = torch.zeros_like(status)
partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3] = -(1 - homogeneity_lambda) / D
partial_derivative[condition4] = (1 - homogeneity_lambda) / D
partial_derivative[condition5] = 0
remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2), torch.logical_or(condition3, condition4)), condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (D * torch.sign(status[remaining_indices] - average[remaining_indices]) - N * torch.sign(status[remaining_indices]))
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)
delta = -homogeneity_learning_rate * partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
homogeneity_values = []
homogeneity = 1.0
total_start_time = time.time()
iteration_counter = 1
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
# --- Backpropagation Update ---
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# --- Homogeneity-driven Update ---
status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
weights_before_update = status.clone()
# print("Model weights before homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
similarity = calculate_similarity(status, average)
homogeneity_lambda = (iteration_counter - 1) / iteration_counter
homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
homogeneity_values.append(homogeneity.item())
delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
current_index = 0
for param in model.parameters():
param_size = param.nelement()
update_value = delta[current_index: current_index + param_size].view_as(param.data)
param.grad = -update_value
current_index += param_size
optimizer_homogeneity.step()
optimizer_homogeneity.zero_grad()
# print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
distance = torch.norm(weights_before_update - weights_after_update)
# print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")
average = (status + (epoch * len(train_loader) + batch_idx) * average) / (epoch * len(train_loader) + batch_idx + 1)
optimizer.zero_grad()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "
#       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
total_start_time = time.time()
iteration_counter = 1  # Initialize iteration_counter here
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
# Calculate total_bp_update for each iteration
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
model.eval()
correct = 0
total = 0
with torch.no_grad():
for images, labels in data_loader:
images, labels = images.to(device), labels.to(device)
outputs = model(images)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum().item()
return 100 * correct / total
# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!
num_epochs = 8
learning_rate = 0.1
batch_size = 420
homogeneity_learning_rate = 0.1
# Load and split MNIST dataset
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='./data', train=True, download=True,
transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True,
transform=transform)
training_samples_number = int(0.7 * len(full_dataset))
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
print("Hyperparameters and Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size
train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# Initialize models
model_homogeneity = Net().to(device)
model_traditional = Net().to(device)
average_model = Net().to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)
print("\nTraining with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional
print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h,
label='Homogeneity')
plt.title('Homogeneity Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Sample [A-B-MNIST] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.1
homogeneity_learning_rate: 0.1
batch_size: 420
training_samples_number: 42000
total_iterations: 800
Training with Homogeneity-driven update:
Total training time: 59.27 seconds
Training with traditional backpropagation:
Total training time: 51.43 seconds
Final Test Accuracy (Homogeneity-driven): 92.24%
Final Test Accuracy (Traditional): 90.14%
-------------------------------------------------------
Sample [A-B-MNIST] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 18
learning_rate: 0.001
homogeneity_learning_rate: 0.001
batch_size: 4200
training_samples_number: 42000
total_iterations: 180
Training with Homogeneity-driven update:
Total training time: 119.41 seconds
Training with traditional backpropagation:
Total training time: 115.07 seconds
Final Test Accuracy (Homogeneity-driven): 93.64%
Final Test Accuracy (Traditional): 93.27%
-------------------------------------------------------
Sample [A-B-MNIST] 3:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.008
batch_size: 21000
training_samples_number: 42000
total_iterations: 32
Training with Homogeneity-driven update:
Total training time: 104.28 seconds
Training with traditional backpropagation:
Total training time: 103.38 seconds
Final Test Accuracy (Homogeneity-driven): 94.03%
Final Test Accuracy (Traditional): 93.01%

-------------------------------------------------------
Sample [A-B-MNIST] 4:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 210
training_samples_number: 42000
total_iterations: 3200
Training with Homogeneity-driven update:
Total training time: 141.07 seconds
Training with traditional backpropagation:
Total training time: 108.91 seconds
Final Test Accuracy (Homogeneity-driven): 96.85%
Final Test Accuracy (Traditional): 96.63%

-------------------------------------------------------
Sample [A-B-MNIST] 5:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 840
training_samples_number: 42000
total_iterations: 800
Training with Homogeneity-driven update:
Total training time: 109.00 seconds
Training with traditional backpropagation:
Total training time: 100.24 seconds
Final Test Accuracy (Homogeneity-driven): 97.72%
Final Test Accuracy (Traditional): 97.31%

-------------------------------------------------------
Sample [A-B-MNIST] 6:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 18
learning_rate: 0.001
homogeneity_learning_rate: 0.0008
batch_size: 2100
training_samples_number: 42000
total_iterations: 360
Training with Homogeneity-driven update:
Total training time: 119.29 seconds
Training with traditional backpropagation:
Epoch processing time: 5.94 seconds
Total training time: 115.52 seconds
Final Test Accuracy (Homogeneity-driven): 95.20%
Final Test Accuracy (Traditional): 94.99%
-------------------------------------------------------
Sample [A-B-MNIST] 7:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 840
training_samples_number: 42000
total_iterations: 800
Training with Homogeneity-driven update:
Total training time: 109.48 seconds
Training with traditional backpropagation:
Total training time: 100.49 seconds
Final Test Accuracy (Homogeneity-driven): 97.69%
Final Test Accuracy (Traditional): 97.51%
-------------------------------------------------------
Sample [A-B-MNIST] 8:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 40
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 42000
training_samples_number: 42000
total_iterations: 40
Training with Homogeneity-driven update:
Total training time: 259.55 seconds
Training with traditional backpropagation:
Total training time: 257.16 seconds
Final Test Accuracy (Homogeneity-driven): 94.60%
Final Test Accuracy (Traditional): 94.24%



-------------------------------------------------------
Sample [A-B-MNIST] 9:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 40
learning_rate: 0.01
homogeneity_learning_rate: 0.001
batch_size: 42000
training_samples_number: 42000
total_iterations: 40
Training with Homogeneity-driven update:
Total training time: 258.01 seconds
Training with traditional backpropagation:
Total training time: 258.25 seconds
Final Test Accuracy (Homogeneity-driven): 94.61%
Final Test Accuracy (Traditional): 94.25%

-------------------------------------------------------
Sample [A-B-MNIST] 10:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 40
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 4200
training_samples_number: 42000
total_iterations: 400
Training with Homogeneity-driven update:
Total training time: 260.06 seconds
Training with traditional backpropagation:
Total training time: 253.40 seconds
Final Test Accuracy (Homogeneity-driven): 97.13%
Final Test Accuracy (Traditional): 97.02%

-------------------------------------------------------
Sample [A-B-MNIST] 11:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 18
learning_rate: 0.001
homogeneity_learning_rate: 0.0008
batch_size: 1050
training_samples_number: 42000
total_iterations: 720
Final Test Accuracy (Homogeneity-driven): 96.40%
Final Test Accuracy (Traditional): 96.24%
-------------------------------------------------------
Sample [A-B-MNIST] 12:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
learning_rate: 0.001
homogeneity_learning_rate: 0.001
batch_size: 672
training_samples_number: 42000
total_iterations: 630
Training with Homogeneity-driven update:
Total training time: 70.79 seconds
Training with traditional backpropagation:
Total training time: 63.73 seconds
Final Test Accuracy (Homogeneity-driven): 95.69%
Final Test Accuracy (Traditional): 95.63%
-------------------------------------------------------
Sample [A-B-MNIST] 13:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 40
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 420
training_samples_number: 42000
total_iterations: 4000
Training with Homogeneity-driven update:
Total training time: 304.14 seconds
Training with traditional backpropagation:
Total training time: 263.50 seconds
Final Test Accuracy (Homogeneity-driven): 97.42%
Final Test Accuracy (Traditional): 97.46%

-------------------------------------------------------
Sample [A-B-MNIST] 14:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.001
homogeneity_learning_rate: 0.0008
batch_size: 672
training_samples_number: 42000
total_iterations: 1008
Training with Homogeneity-driven update:
Total training time: 116.50 seconds
Training with traditional backpropagation:
Total training time: 102.51 seconds
Final Test Accuracy (Homogeneity-driven): 96.65%
Final Test Accuracy (Traditional): 96.60%
-------------------------------------------------------
Sample [A-B-MNIST] 15:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 20
learning_rate: 0.01
homogeneity_learning_rate: 0.003
batch_size: 42
training_samples_number: 42000
total_iterations: 20000
Training with Homogeneity-driven update:
Total training time: 409.74 seconds
Training with traditional backpropagation:
Total training time: 213.06 seconds
Final Test Accuracy (Homogeneity-driven): 96.65%
Final Test Accuracy (Traditional): 96.07%
-------------------------------------------------------
Sample [A-B-MNIST] 16:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 20
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 168
training_samples_number: 42000
total_iterations: 5000
Training with Homogeneity-driven update:
Total training time: 199.35 seconds
Training with traditional backpropagation:
Total training time: 144.97 seconds
Final Test Accuracy (Homogeneity-driven): 97.07%
Final Test Accuracy (Traditional): 96.94%
-------------------------------------------------------
Sample [A-B-MNIST] 17:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 20
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 336
training_samples_number: 42000
total_iterations: 2500
Training with Homogeneity-driven update:
Total training time: 160.23 seconds
Training with traditional backpropagation:
Total training time: 131.83 seconds
Final Test Accuracy (Homogeneity-driven): 97.34%
Final Test Accuracy (Traditional): 97.16%
-------------------------------------------------------
Sample [A-B-MNIST] 18:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 24
learning_rate: 0.001
homogeneity_learning_rate: 0.0005
batch_size: 4200
training_samples_number: 42000
total_iterations: 240
Training with Homogeneity-driven update:
Total training time: 157.12 seconds
Training with traditional backpropagation:
Total training time: 153.61 seconds
Final Test Accuracy (Homogeneity-driven): 94.25%
Final Test Accuracy (Traditional): 94.03%
-------------------------------------------------------
Sample [A-B-MNIST] 19:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 12
learning_rate: 0.1
homogeneity_learning_rate: 0.001
batch_size: 3360
training_samples_number: 42000
total_iterations: 156
Training with Homogeneity-driven update:
Epoch [1/12], Train Loss: 8.7319, Train Accuracy: 30.93%, Val Loss: 1.4389, Val Accuracy: 48.40%
Epoch processing time: 7.59 seconds
Epoch [2/12], Train Loss: 1.1013, Train Accuracy: 64.49%, Val Loss: 0.8079, Val Accuracy: 73.32%
Epoch processing time: 5.98 seconds
Epoch [3/12], Train Loss: 0.6478, Train Accuracy: 80.60%, Val Loss: 0.5574, Val Accuracy: 85.28%
Epoch processing time: 6.99 seconds
Epoch [4/12], Train Loss: 0.4568, Train Accuracy: 87.32%, Val Loss: 0.4280, Val Accuracy: 88.48%
Epoch processing time: 5.81 seconds
Epoch [5/12], Train Loss: 0.3534, Train Accuracy: 89.93%, Val Loss: 0.3521, Val Accuracy: 90.40%
Epoch processing time: 7.00 seconds
Epoch [6/12], Train Loss: 0.2978, Train Accuracy: 91.44%, Val Loss: 0.3133, Val Accuracy: 91.67%
Epoch processing time: 6.05 seconds
Epoch [7/12], Train Loss: 0.2655, Train Accuracy: 92.46%, Val Loss: 0.3019, Val Accuracy: 91.56%
Epoch processing time: 6.93 seconds
Epoch [8/12], Train Loss: 0.2472, Train Accuracy: 92.78%, Val Loss: 0.2807, Val Accuracy: 92.37%
Epoch processing time: 6.01 seconds
Epoch [9/12], Train Loss: 0.2282, Train Accuracy: 93.35%, Val Loss: 0.2738, Val Accuracy: 92.56%
Epoch processing time: 6.92 seconds
Epoch [10/12], Train Loss: 0.2132, Train Accuracy: 93.80%, Val Loss: 0.2575, Val Accuracy: 92.93%
Epoch processing time: 6.06 seconds
Epoch [11/12], Train Loss: 0.1981, Train Accuracy: 94.22%, Val Loss: 0.2553, Val Accuracy: 92.82%
Epoch processing time: 6.95 seconds
Epoch [12/12], Train Loss: 0.1878, Train Accuracy: 94.39%, Val Loss: 0.2489, Val Accuracy: 93.12%
Epoch processing time: 5.82 seconds
Total training time: 78.10 seconds
Training with traditional backpropagation:
Epoch [1/12], Train Loss: 6.9806, Train Accuracy: 37.27%, Val Loss: 1.5671, Val Accuracy: 49.42%
Epoch processing time: 6.84 seconds
Epoch [2/12], Train Loss: 1.1666, Train Accuracy: 60.90%, Val Loss: 0.9454, Val Accuracy: 72.51%
Epoch processing time: 5.91 seconds
Epoch [3/12], Train Loss: 0.7834, Train Accuracy: 75.20%, Val Loss: 0.6856, Val Accuracy: 78.37%
Epoch processing time: 6.80 seconds
Epoch [4/12], Train Loss: 0.5979, Train Accuracy: 81.24%, Val Loss: 0.5383, Val Accuracy: 83.37%
Epoch processing time: 5.91 seconds
Epoch [5/12], Train Loss: 0.4758, Train Accuracy: 85.61%, Val Loss: 0.4466, Val Accuracy: 86.86%
Epoch processing time: 6.86 seconds
Epoch [6/12], Train Loss: 0.3828, Train Accuracy: 88.98%, Val Loss: 0.3825, Val Accuracy: 89.56%
Epoch processing time: 5.87 seconds
Epoch [7/12], Train Loss: 0.3210, Train Accuracy: 91.02%, Val Loss: 0.3443, Val Accuracy: 90.71%
Epoch processing time: 6.86 seconds
Epoch [8/12], Train Loss: 0.2844, Train Accuracy: 92.08%, Val Loss: 0.3211, Val Accuracy: 91.11%
Epoch processing time: 5.70 seconds
Epoch [9/12], Train Loss: 0.2688, Train Accuracy: 92.48%, Val Loss: 0.3099, Val Accuracy: 91.96%
Epoch processing time: 6.86 seconds
Epoch [10/12], Train Loss: 0.2522, Train Accuracy: 93.00%, Val Loss: 0.2966, Val Accuracy: 92.21%
Epoch processing time: 5.94 seconds
Epoch [11/12], Train Loss: 0.2417, Train Accuracy: 93.25%, Val Loss: 0.2952, Val Accuracy: 91.84%
Epoch processing time: 7.90 seconds
Epoch [12/12], Train Loss: 0.2306, Train Accuracy: 93.39%, Val Loss: 0.2825, Val Accuracy: 92.67%
Epoch processing time: 5.94 seconds
Total training time: 77.39 seconds
Final Test Accuracy (Homogeneity-driven): 93.44%
Final Test Accuracy (Traditional): 92.67%
3. OPTION WITH DYNAMIC HOMOGENEITY LEARNING RATE (lambda depends on total iterations) – [B-A] – MNIST DATASET
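The difference from the manual variants above is that the homogeneity learning rate is no longer set by hand; it is derived from the backpropagation learning rate and the total number of iterations, as computed in the script below:

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
homogeneity_learning_rate = (2 * learning_rate) / (learning_rate * total_iterations + 1)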
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.fc1 = nn.Linear(28 * 28, 128)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(128, 10)
def forward(self, x):
x = x.view(-1, 28 * 28)
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status - average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status - average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status > average, average > 0), torch.logical_and(status < average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0), torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative = torch.zeros_like(status)
partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3] = -(1 - homogeneity_lambda) / D
partial_derivative[condition4] = (1 - homogeneity_lambda) / D
partial_derivative[condition5] = 0
remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2), torch.logical_or(condition3, condition4)), condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (D * torch.sign(status[remaining_indices] - average[remaining_indices]) - N * torch.sign(status[remaining_indices]))
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)
delta = -homogeneity_learning_rate * partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
homogeneity_values = []
homogeneity = 1.0
total_start_time = time.time()
iteration_counter = 1
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
# --- Backpropagation Update ---
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# --- Homogeneity-driven Update ---
status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
weights_before_update = status.clone()
# print("Model weights before homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
similarity = calculate_similarity(status, average)
homogeneity_lambda = (iteration_counter - 1) / total_iterations
homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
homogeneity_values.append(homogeneity.item())
delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
current_index = 0
for param in model.parameters():
param_size = param.nelement()
update_value = delta[current_index: current_index + param_size].view_as(param.data)
param.grad = -update_value
current_index += param_size
optimizer_homogeneity.step()
optimizer_homogeneity.zero_grad()
# print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
distance = torch.norm(weights_before_update - weights_after_update)
# print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")
average = (status + (epoch * len(train_loader) + batch_idx) * average) / (epoch * len(train_loader) + batch_idx + 1)
optimizer.zero_grad()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "
#       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
total_start_time = time.time()
iteration_counter = 1  # Initialize iteration_counter here
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
# Calculate total_bp_update for each iteration
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
model.eval()
correct = 0
total = 0
with torch.no_grad():
for images, labels in data_loader:
images, labels = images.to(device), labels.to(device)
outputs = model(images)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum().item()
return 100 * correct / total
# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!
num_epochs = 4
learning_rate = 0.01
batch_size = 10
# removed to test dynamic value: homogeneity_learning_rate = 0.01
# Load and split MNIST dataset
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='./data', train=True, download=True,
transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True,
transform=transform)
training_samples_number = int(0.7 * len(full_dataset))
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
homogeneity_learning_rate = (2 * learning_rate) / (learning_rate * total_iterations + 1)
print("Hyperparameters and Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size
train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# Initialize models
model_homogeneity = Net().to(device)
model_traditional = Net().to(device)
average_model = Net().to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with
Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity,
train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)
print("\nTraining
with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional
print(f"\nFinal Test Accuracy
(Homogeneity-driven): {calculate_accuracy(model_homogeneity,
test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional,
test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h, label='Homogeneity')
plt.title('Homogeneity Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Sample [B-A-MNIST] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 4
learning_rate: 0.01
homogeneity_learning_rate: 0.00011834319526627219
batch_size: 10
training_samples_number: 42000
total_iterations: 16800
Training with Homogeneity-driven update:
Total training time: 249.94 seconds
Training with traditional backpropagation:
Epoch processing time: 22.37 seconds
Total training time: 90.24 seconds
Final Test Accuracy (Homogeneity-driven): 95.12%
Final Test Accuracy (Traditional): 94.38%
-------------------------------------------------------
Sample [B-A-MNIST] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.00011834319526627219
batch_size: 20
training_samples_number: 42000
total_iterations: 16800
Training with Homogeneity-driven update:
Total training time: 277.78 seconds
Training with traditional backpropagation:
Total training time: 112.23 seconds
Final Test Accuracy (Homogeneity-driven): 95.91%
Final Test Accuracy (Traditional): 95.78%
-------------------------------------------------------
Sample [B-A-MNIST] 3:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
learning_rate: 0.01
homogeneity_learning_rate: 0.006666666666666667
batch_size: 2100
training_samples_number: 42000
total_iterations: 200
Training with Homogeneity-driven update:
Total training time: 65.42 seconds
Training with traditional backpropagation:
Total training time: 62.62 seconds
Final Test Accuracy (Homogeneity-driven): 96.92%
Final Test Accuracy (Traditional): 96.87%
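For reference, the dynamic homogeneity learning rate printed above follows directly from the code: homogeneity_learning_rate = 2*learning_rate / (learning_rate*total_iterations + 1). A minimal stand-alone check (values taken from Sample 1; not part of the training scripts):
import math
learning_rate = 0.01
batch_size = 10
num_epochs = 4
training_samples_number = 42000
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
print(total_iterations)  # 16800
print((2 * learning_rate) / (learning_rate * total_iterations + 1))  # ~0.000118343195266..., as in Samples 1 and 2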
4. HOMOGENEITY NN TRAINING (Option with lambda based on current iteration and dynamic learning rate computation) – [B-B] – MNIST DATASET
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model
class Net(nn.Module):
def __init__(self):
super(Net,
self).__init__()
self.fc1 = nn.Linear(28 * 28, 128)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(128, 10)
def forward(self, x):
x = x.view(-1, 28 * 28)
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status - average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status - average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),
torch.logical_and(status < average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),
torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative = torch.zeros_like(status)
partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3] = -(1 - homogeneity_lambda) / D
partial_derivative[condition4] = (1 - homogeneity_lambda) / D
partial_derivative[condition5] = 0
remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),
torch.logical_or(condition3, condition4)), condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (
D * torch.sign(status[remaining_indices] - average[remaining_indices]) -
N * torch.sign(status[remaining_indices])
)
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)
delta = -homogeneity_learning_rate * partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),
lr=learning_rate)
optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
homogeneity_values = []
homogeneity = 1.0
total_start_time = time.time()
iteration_counter = 1
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
# --- Backpropagation Update ---
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# --- Homogeneity-driven Update ---
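# In this variant homogeneity_lambda grows as (iteration_counter - 1) / iteration_counter,
# so the running 'homogeneity' value quickly weights its own history more and the
# current similarity less as training progresses.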
status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
weights_before_update = status.clone()
# print("Model weights before homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
similarity = calculate_similarity(status, average)
homogeneity_lambda = (iteration_counter - 1) / iteration_counter
homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
homogeneity_values.append(homogeneity.item())
delta = weights_update(status,
homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
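# The homogeneity delta is loaded into param.grad with flipped sign (optimizers descend
# along the gradient), so optimizer_homogeneity.step() below moves the weights in the
# direction of delta, scaled by Adam's adaptive step.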
current_index = 0
for param in model.parameters():
param_size = param.nelement()
update_value = delta[current_index: current_index + param_size].view_as(param.data)
param.grad = -update_value
current_index += param_size
optimizer_homogeneity.step()
optimizer_homogeneity.zero_grad()
# print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
distance = torch.norm(weights_before_update - weights_after_update)
# print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")
average = (status + (epoch * len(train_loader) + batch_idx) * average) / (epoch * len(train_loader) + batch_idx + 1)
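# The 'average' update above keeps an incremental (running) mean of all parameter
# vectors visited so far, which the similarity and homogeneity terms compare against.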
optimizer.zero_grad()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "
#       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%,
"
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),
lr=learning_rate)
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
total_start_time = time.time()
iteration_counter = 1  # Initialize iteration_counter here
for epoch in range(num_epochs):
epoch_start_time = time.time()
model.train()
epoch_train_loss = 0
epoch_train_correct = 0
epoch_train_total = 0
for batch_idx, (data, target) in enumerate(train_loader):
iteration_counter += 1
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
# Calculate total_bp_update for each iteration
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
#       f"Backpropagation Update: {total_bp_update:.4f}")
epoch_train_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss /= len(train_loader)
epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss = 0
epoch_val_correct = 0
epoch_val_total = 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss /= len(val_loader)
epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time = time.time()
epoch_time = epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%,
"
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
print(f"Epoch processing time: {epoch_time:.2f} seconds")
total_end_time = time.time()
total_training_time = total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
model.eval()
correct = 0
total = 0
with torch.no_grad():
for images, labels in data_loader:
images, labels = images.to(device), labels.to(device)
outputs = model(images)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum().item()
return 100 * correct / total
# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!
num_epochs = 4
learning_rate = 0.01
batch_size = 10
# removed to test dynamic value: homogeneity_learning_rate = 0.01
# Load and split MNIST dataset
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True,
transform=transform)
training_samples_number = int(0.7 * len(full_dataset))
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
homogeneity_learning_rate = (2 * learning_rate) / (learning_rate * total_iterations + 1)
print("Hyperparameters and
Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size
train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# Initialize models
model_homogeneity = Net().to(device)
model_traditional = Net().to(device)
average_model = Net().to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with
Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity,
train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)
print("\nTraining
with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional
print(f"\nFinal Test Accuracy
(Homogeneity-driven): {calculate_accuracy(model_homogeneity,
test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional,
test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h, label='Homogeneity')
plt.title('Homogeneity Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Sample [B-B-MNIST] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.0005780346820809249
batch_size: 100
training_samples_number: 42000
total_iterations: 3360
Training with Homogeneity-driven update:
Total training time: 133.06 seconds
Training with traditional backpropagation:
Total training time: 58.51 seconds
Final Test Accuracy (Homogeneity-driven): 96.67%
Final Test Accuracy (Traditional): 96.27%
-------------------------------------------------------
Sample [B-B-MNIST] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.001
homogeneity_learning_rate: 0.00022222222222222223
batch_size: 42
training_samples_number: 42000
total_iterations: 8000
Training with Homogeneity-driven update:
Total training time: 201.09 seconds
Training with traditional backpropagation:
Total training time: 71.01 seconds
Final Test Accuracy (Homogeneity-driven): 97.42%
Final Test Accuracy (Traditional): 97.33%
-------------------------------------------------------
Sample [B-B-MNIST] 3:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.1
homogeneity_learning_rate: 0.00012492192379762648
batch_size: 21
training_samples_number: 42000
total_iterations: 16000
Training with Homogeneity-driven update:
Total training time: 316.75 seconds
Training with traditional backpropagation:
Total training time: 127.37 seconds
Final Test Accuracy (Homogeneity-driven): 49.31%
Final Test Accuracy (Traditional): 45.58%
SAME (basic) CODE IDEA FOR IRIS DATASET (lambda depends on number of iterations and dynamic learning rate) [B-A]
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, random_split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model (adjusted for Iris dataset)
class Net(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(Net,
self).__init__()
self.fc1 = nn.Linear(input_size, hidden_size)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status
- average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status
- average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status
> average, average > 0),
torch.logical_and(status <
average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status
> 0, average < 0),
torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative
= torch.zeros_like(status)
partial_derivative[condition1]
= (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2]
= (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3]
= -(1 - homogeneity_lambda) / D
partial_derivative[condition4]
= (1 - homogeneity_lambda) / D
partial_derivative[condition5]
= 0
remaining_indices
= torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1,
condition2),
torch.logical_or(condition3,
condition4)),
condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda)
* (1 / D**2) * (
D * torch.sign(status[remaining_indices]
- average[remaining_indices]) -
N * torch.sign(status[remaining_indices])
)
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative
= calculate_partial_derivative(status, average, homogeneity_lambda,
epsilon)
delta = -homogeneity_learning_rate
* partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size):
# Added batch_size as an argument
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),
lr=learning_rate)
optimizer_homogeneity
= optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
train_losses =
[]
val_losses =
[]
train_accuracies
= []
val_accuracies
= []
homogeneity_values
= []
homogeneity = 1.0
total_start_time
= time.time()
iteration_counter
= 1
for epoch in range(num_epochs):
epoch_start_time
= time.time()
model.train()
epoch_train_loss
= 0
epoch_train_correct
= 0
epoch_train_total
= 0
for batch_idx,
(data, target) in enumerate(train_loader):
iteration_counter += 1
data, target
= data.to(device), target.to(device)
# --- Backpropagation Update ---
optimizer.zero_grad()
output =
model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# --- Homogeneity-driven Update ---
status =
torch.cat([p.data.flatten() for p in model.parameters()]).detach()
weights_before_update = status.clone()
# print("Model weights
before homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
similarity = calculate_similarity(status, average)
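# In this IRIS variant homogeneity_lambda ramps linearly with the global iteration index
# relative to total_iterations, so the weight given to the homogeneity history grows
# gradually over the whole run instead of saturating after a few batches.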
homogeneity_lambda = (iteration_counter - 1) / total_iterations
homogeneity
= (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
homogeneity_values.append(homogeneity.item())
delta = weights_update(status,
homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
current_index = 0
for param in model.parameters():
param_size = param.nelement()
update_value = delta[current_index: current_index + param_size].view_as(param.data)
param.grad = -update_value
current_index += param_size
optimizer_homogeneity.step()
optimizer_homogeneity.zero_grad()
# print("\nModel weights after homogeneity update (first 5 values of
fc1.weight):")
# print(model.fc1.weight.data[:5])
weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
distance = torch.norm(weights_before_update
- weights_after_update)
# print(f"Distance
between weights before and after homogeneity update: {distance.item()}\n")
average =
(status + (epoch * len(train_loader) + batch_idx) * average) / (
epoch * len(train_loader) + batch_idx + 1)
optimizer.zero_grad()
# print(f"Epoch
[{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
# f"Backpropagation
Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f},
"
# f"homogeneity_lambda:
{homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity:
{homogeneity:.4f}")
epoch_train_loss += loss.item()
_, predicted
= torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss
/= len(train_loader)
epoch_train_accuracy
= 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss
= 0
epoch_val_correct
= 0
epoch_val_total
= 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss
/= len(val_loader)
epoch_val_accuracy
= 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time
= time.time()
epoch_time
= epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%,
"
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
#print(f"Epoch processing time:
{epoch_time:.2f} seconds")
total_end_time
= time.time()
total_training_time
= total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate, batch_size):
# Added batch_size as an argument
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),
lr=learning_rate)
train_losses =
[]
val_losses =
[]
train_accuracies
= []
val_accuracies
= []
total_start_time
= time.time()
iteration_counter
= 1 # Initialize iteration_counter here
for epoch in range(num_epochs):
epoch_start_time
= time.time()
model.train()
epoch_train_loss
= 0
epoch_train_correct
= 0
epoch_train_total
= 0
for batch_idx,
(data, target) in enumerate(train_loader):
iteration_counter += 1
data, target
= data.to(device), target.to(device)
optimizer.zero_grad()
output =
model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
# Calculate total_bp_update for each iteration
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# print(f"Epoch
[{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
# f"Backpropagation
Update: {total_bp_update:.4f}")
epoch_train_loss += loss.item()
_, predicted
= torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss
/= len(train_loader)
epoch_train_accuracy
= 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss
= 0
epoch_val_correct
= 0
epoch_val_total
= 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss
/= len(val_loader)
epoch_val_accuracy
= 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time
= time.time()
epoch_time
= epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%,
"
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
#print(f"Epoch processing time: {epoch_time:.2f}
seconds")
total_end_time
= time.time()
total_training_time
= total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
model.eval()
correct = 0
total = 0
with torch.no_grad():
for images, labels in data_loader:
images,
labels = images.to(device), labels.to(device)
outputs =
model(images)
_, predicted
= torch.max(outputs.data, 1)
total += labels.size(0)
correct +=
(predicted == labels).sum().item()
return 100 * correct / total
# -------------------- Hyperparameters --------------------
num_epochs = 10
learning_rate = 0.01
# removed, to be replaced by the dynamic option: homogeneity_learning_rate = 0.01
input_size = 4 # Number of features in Iris dataset
hidden_size = 10
output_size = 3 # Number of classes in Iris dataset
batch_size = 32 # Added batch size definition
# ---------------------------------------------------------
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split data into training, validation, and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train,
test_size=0.25, random_state=42) # 0.25 x 0.8 = 0.2
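# Resulting split of the 150 Iris samples: 90 train / 30 validation / 30 test
# (training_samples_number: 90 in the runs below).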
# Scale the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)
# Create custom dataset class for Iris data
class IrisDataset(Dataset):
def __init__(self, X, y):
self.X = torch.tensor(X,
dtype=torch.float32)
self.y = torch.tensor(y, dtype=torch.long)
def __len__(self):
return len(self.X)
def __getitem__(self, idx):
return self.X[idx], self.y[idx]
# Create data loaders
train_dataset = IrisDataset(X_train, y_train)
val_dataset = IrisDataset(X_val, y_val)
test_dataset = IrisDataset(X_test, y_test)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) # Updated to use batch_size variable
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable
training_samples_number = len(train_dataset)
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs  # Updated to use batch_size variable
homogeneity_learning_rate = (2 * learning_rate) / (learning_rate * total_iterations + 1)
print("Hyperparameters and
Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
# Initialize models
model_homogeneity = Net(input_size,
hidden_size, output_size).to(device)
model_traditional = Net(input_size,
hidden_size, output_size).to(device)
average_model = Net(input_size,
hidden_size, output_size).to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with Homogeneity-driven
update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity,
train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations,
batch_size)
print("\nTraining
with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate, batch_size)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h,
homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t
= results_traditional
print(f"\nFinal Test Accuracy
(Homogeneity-driven): {calculate_accuracy(model_homogeneity,
test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional,
test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and
Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and
Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h,
label='Homogeneity')
plt.title('Homogeneity
Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Sample [B-A-IRIS] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
learning_rate: 0.01
homogeneity_learning_rate: 0.015384615384615384
batch_size: 32
training_samples_number: 90
total_iterations: 30
Training with Homogeneity-driven update:
Total training time: 0.16 seconds
Training with traditional backpropagation:
Final Test Accuracy (Homogeneity-driven): 86.67%
Final Test Accuracy (Traditional): 80.00%
-------------------------------------------------------
Sample [B-A-IRIS] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 2
learning_rate: 0.01
homogeneity_learning_rate: 0.0196078431372549
batch_size: 90
training_samples_number: 90
total_iterations: 2
hidden_size: 16
Training with Homogeneity-driven update:
Total training time: 0.02 seconds
Training with traditional backpropagation:
Total training time: 0.01 seconds
Final Test Accuracy (Homogeneity-driven): 63.33%
Final Test Accuracy (Traditional): 46.67%
-------------------------------------------------------
Sample [B-A-IRIS] 3:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 2
learning_rate: 0.01
homogeneity_learning_rate: 0.0196078431372549
batch_size: 90
training_samples_number: 90
total_iterations: 2
hidden_size: 4
Training with Homogeneity-driven update:
Total training time: 0.02 seconds
Training with traditional backpropagation:
Total training time: 0.01 seconds
Final Test Accuracy (Homogeneity-driven): 40.00%
Final Test Accuracy (Traditional): 16.67%
-------------------------------------------------------
Sample [B-A-IRIS] 4:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 2
hidden_size: 2
learning_rate: 0.01
homogeneity_learning_rate: 0.0196078431372549
batch_size: 90
training_samples_number: 90
total_iterations: 2
Training with Homogeneity-driven update:
Total training time: 0.02 seconds
Training with traditional backpropagation:
Total training time: 0.01 seconds
Final Test Accuracy (Homogeneity-driven): 43.33%
Final Test Accuracy (Traditional): 36.67%
-------------------------------------------------------
Sample [B-A-IRIS] 5:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 2
hidden_size: 24
learning_rate: 0.01
homogeneity_learning_rate: 0.0196078431372549
batch_size: 90
training_samples_number: 90
total_iterations: 2
Training with Homogeneity-driven update:
Total training time: 0.06 seconds
Training with traditional backpropagation:
Total training time: 0.03 seconds
Final Test Accuracy (Homogeneity-driven): 70.00%
Final Test Accuracy (Traditional): 43.33%
-------------------------------------------------------
Sample [B-A-IRIS] 6:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
hidden_size: 3
learning_rate: 0.01
homogeneity_learning_rate: 0.015384615384615384
batch_size: 30
training_samples_number: 90
total_iterations: 30
Training with Homogeneity-driven update:
Total training time: 0.20 seconds
Training with traditional backpropagation:
Total training time: 0.14 seconds
Final Test Accuracy (Homogeneity-driven): 90.00%
Final Test Accuracy (Traditional): 70.00%
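Before the next variant, a short illustrative sketch (separate from the experiments above) of how the two homogeneity_lambda schedules used so far behave; total_iterations = 30 mirrors the IRIS runs above:
total_iterations = 30
for t in [2, 3, 5, 10, 30]:
    print(f"t={t:2d}  (t-1)/t = {(t - 1) / t:.3f}   (t-1)/total_iterations = {(t - 1) / total_iterations:.3f}")
# The per-iteration schedule (t-1)/t saturates near 1 after only a few batches,
# while (t-1)/total_iterations ramps up linearly over the whole run.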
SAME (basic) CODE IDEA FOR IRIS DATASET (lambda depends on current iteration, manual setting of learning rate) [A-B]
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, random_split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model (adjusted for Iris dataset)
class Net(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(Net,
self).__init__()
self.fc1 = nn.Linear(input_size, hidden_size)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status
- average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status
- average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status
> average, average > 0),
torch.logical_and(status <
average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status
> 0, average < 0),
torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative
= torch.zeros_like(status)
partial_derivative[condition1]
= (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2]
= (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3]
= -(1 - homogeneity_lambda) / D
partial_derivative[condition4]
= (1 - homogeneity_lambda) / D
partial_derivative[condition5]
= 0
remaining_indices
= torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1,
condition2),
torch.logical_or(condition3,
condition4)),
condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda)
* (1 / D**2) * (
D * torch.sign(status[remaining_indices]
- average[remaining_indices]) -
N * torch.sign(status[remaining_indices])
)
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative
= calculate_partial_derivative(status, average, homogeneity_lambda,
epsilon)
delta = -homogeneity_learning_rate
* partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size):
# Added batch_size as an argument
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),
lr=learning_rate)
optimizer_homogeneity
= optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
train_losses =
[]
val_losses =
[]
train_accuracies
= []
val_accuracies
= []
homogeneity_values
= []
homogeneity = 1.0
total_start_time
= time.time()
iteration_counter
= 1
for epoch in range(num_epochs):
epoch_start_time
= time.time()
model.train()
epoch_train_loss
= 0
epoch_train_correct
= 0
epoch_train_total
= 0
for batch_idx,
(data, target) in enumerate(train_loader):
iteration_counter += 1
data, target
= data.to(device), target.to(device)
# --- Backpropagation Update ---
optimizer.zero_grad()
output =
model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# --- Homogeneity-driven Update ---
status =
torch.cat([p.data.flatten() for p in model.parameters()]).detach()
weights_before_update = status.clone()
# print("Model weights
before homogeneity update (first 5 values of fc1.weight):")
# print(model.fc1.weight.data[:5])
similarity =
calculate_similarity(status, average)
homogeneity_lambda = (iteration_counter - 1) / iteration_counter
homogeneity
= (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
homogeneity_values.append(homogeneity.item())
delta = weights_update(status,
homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
current_index = 0
for param in model.parameters():
param_size = param.nelement()
update_value = delta[current_index: current_index + param_size].view_as(param.data)
param.grad = -update_value
current_index += param_size
optimizer_homogeneity.step()
optimizer_homogeneity.zero_grad()
# print("\nModel weights after homogeneity update (first 5 values of
fc1.weight):")
# print(model.fc1.weight.data[:5])
weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
distance = torch.norm(weights_before_update
- weights_after_update)
# print(f"Distance
between weights before and after homogeneity update: {distance.item()}\n")
average =
(status + (epoch * len(train_loader) + batch_idx) * average) / (
epoch * len(train_loader) + batch_idx + 1)
optimizer.zero_grad()
# print(f"Epoch
[{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
# f"Backpropagation
Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f},
"
# f"homogeneity_lambda:
{homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity:
{homogeneity:.4f}")
epoch_train_loss += loss.item()
_, predicted
= torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss
/= len(train_loader)
epoch_train_accuracy
= 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss
= 0
epoch_val_correct
= 0
epoch_val_total
= 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss
/= len(val_loader)
epoch_val_accuracy
= 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time
= time.time()
epoch_time
= epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%,
"
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
#print(f"Epoch processing time: {epoch_time:.2f}
seconds")
total_end_time
= time.time()
total_training_time
= total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate, batch_size):
# Added batch_size as an argument
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),
lr=learning_rate)
train_losses =
[]
val_losses =
[]
train_accuracies
= []
val_accuracies
= []
total_start_time
= time.time()
iteration_counter
= 1 # Initialize iteration_counter here
for epoch in range(num_epochs):
epoch_start_time
= time.time()
model.train()
epoch_train_loss
= 0
epoch_train_correct
= 0
epoch_train_total
= 0
for batch_idx,
(data, target) in enumerate(train_loader):
iteration_counter += 1
data, target
= data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
# Calculate total_bp_update for each iteration
total_bp_update = 0
for param in model.parameters():
if param.grad is not None:
total_bp_update += torch.sum(torch.abs(param.grad)).item()
# print(f"Epoch
[{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
# f"Backpropagation
Update: {total_bp_update:.4f}")
epoch_train_loss += loss.item()
_, predicted
= torch.max(output.data, 1)
epoch_train_total += target.size(0)
epoch_train_correct += (predicted == target).sum().item()
epoch_train_loss
/= len(train_loader)
epoch_train_accuracy
= 100 * epoch_train_correct / epoch_train_total
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_accuracy)
model.eval()
epoch_val_loss
= 0
epoch_val_correct
= 0
epoch_val_total
= 0
with torch.no_grad():
for data, target in val_loader:
data, target = data.to(device), target.to(device)
output = model(data)
loss = criterion(output, target)
epoch_val_loss += loss.item()
_, predicted = torch.max(output.data, 1)
epoch_val_total += target.size(0)
epoch_val_correct += (predicted == target).sum().item()
epoch_val_loss
/= len(val_loader)
epoch_val_accuracy
= 100 * epoch_val_correct / epoch_val_total
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_accuracy)
epoch_end_time
= time.time()
epoch_time
= epoch_end_time - epoch_start_time
print(f"Epoch [{epoch + 1}/{num_epochs}], "
f"Train Loss: {epoch_train_loss:.4f}, "
f"Train Accuracy: {epoch_train_accuracy:.2f}%,
"
f"Val Loss: {epoch_val_loss:.4f}, "
f"Val Accuracy: {epoch_val_accuracy:.2f}%")
#print(f"Epoch processing time: {epoch_time:.2f}
seconds")
total_end_time
= time.time()
total_training_time
= total_end_time - total_start_time
print(f"Total training time: {total_training_time:.2f} seconds")
return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
model.eval()
correct = 0
total = 0
with torch.no_grad():
for images, labels in data_loader:
images,
labels = images.to(device), labels.to(device)
outputs =
model(images)
_, predicted
= torch.max(outputs.data, 1)
total += labels.size(0)
correct +=
(predicted == labels).sum().item()
return 100 * correct / total
# -------------------- Hyperparameters --------------------
num_epochs = 10
learning_rate = 0.01
homogeneity_learning_rate = 0.01
input_size = 4 # Number of features in Iris dataset
hidden_size = 10
output_size = 3 # Number of classes in Iris dataset
batch_size = 32 # Added batch size definition
# ---------------------------------------------------------
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split data into training, validation, and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train,
test_size=0.25, random_state=42) # 0.25 x 0.8 = 0.2
# Scale the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)
# Create custom dataset class for Iris data
class IrisDataset(Dataset):
def __init__(self, X, y):
self.X = torch.tensor(X,
dtype=torch.float32)
self.y = torch.tensor(y, dtype=torch.long)
def __len__(self):
return len(self.X)
def __getitem__(self, idx):
return self.X[idx], self.y[idx]
# Create data loaders
train_dataset = IrisDataset(X_train, y_train)
val_dataset = IrisDataset(X_val, y_val)
test_dataset = IrisDataset(X_test, y_test)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)  # Updated to use batch_size variable
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable
training_samples_number = len(train_dataset)
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs  # Updated to use batch_size variable
# homogeneity_learning_rate = (2 * learning_rate) / (learning_rate * total_iterations + 1)
print("Hyperparameters and
Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
# Initialize models
model_homogeneity = Net(input_size,
hidden_size, output_size).to(device)
model_traditional = Net(input_size,
hidden_size, output_size).to(device)
average_model = Net(input_size,
hidden_size, output_size).to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with
Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity,
train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations,
batch_size)
print("\nTraining
with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate, batch_size)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h,
homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t
= results_traditional
print(f"\nFinal Test Accuracy
(Homogeneity-driven): {calculate_accuracy(model_homogeneity,
test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional,
test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and
Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and
Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h,
label='Homogeneity')
plt.title('Homogeneity
Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Sample [A-B-IRIS] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 32
training_samples_number: 90
total_iterations: 30
Training with Homogeneity-driven update:
Total training time: 0.14 seconds
Training with traditional backpropagation:
Total training time: 0.09 seconds
Final Test Accuracy (Homogeneity-driven): 90.00%
Final Test Accuracy (Traditional): 83.33%
-------------------------------------------------------
Sample [A-B-IRIS] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 64
training_samples_number: 90
total_iterations: 20
Training with Homogeneity-driven update:
Total training time: 0.12 seconds
Training with traditional backpropagation:
Total training time: 0.06 seconds
Final Test Accuracy (Homogeneity-driven): 90.00%
Final Test Accuracy (Traditional): 80.00%
EXPERIMENTS WITH HYBRID DYNAMIC LAMBDAs – MNIST DATASET
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model
class Net(nn.Module):
def __init__(self):
super(Net,
self).__init__()
self.fc1 = nn.Linear(28 * 28, 128)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(128, 10)
def forward(self, x):
x = x.view(-1, 28 * 28)
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
num = torch.sum(torch.abs(status
- average))
den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
similarity = 1 - (num / den)
return similarity
# Function to calculate partial derivative of Homogeneity
def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):
N = torch.sum(torch.abs(status
- average))
D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
condition1 = torch.logical_or(torch.logical_and(status
> average, average > 0),
torch.logical_and(status <
average, average < 0))
condition2 = torch.logical_or(torch.logical_and(status
> 0, average < 0),
torch.logical_and(status < 0, average > 0))
condition3 = torch.logical_and(status == 0, average > 0)
condition4 = torch.logical_and(status == 0, average < 0)
condition5 = status == average
partial_derivative
= torch.zeros_like(status)
partial_derivative[condition1]
= (1 - homogeneity_lambda) * (1 / D**2) * (D - N)
partial_derivative[condition2]
= (1 - homogeneity_lambda) * (1 / D**2) * (D + N)
partial_derivative[condition3]
= -(1 - homogeneity_lambda) / D
partial_derivative[condition4]
= (1 - homogeneity_lambda) / D
partial_derivative[condition5]
= 0
remaining_indices
= torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1,
condition2),
torch.logical_or(condition3,
condition4)),
condition5))
partial_derivative[remaining_indices] = (1 - homogeneity_lambda)
* (1 / D**2) * (
D * torch.sign(status[remaining_indices]
- average[remaining_indices]) -
N * torch.sign(status[remaining_indices])
)
return partial_derivative
# Function to perform Homogeneity-driven weight update
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):
partial_derivative
= calculate_partial_derivative(status, average, homogeneity_lambda,
epsilon)
delta = -homogeneity_learning_rate
* partial_derivative
return delta
# Function to train the model with Homogeneity-driven update
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
    train_losses = []
    val_losses = []
    train_accuracies = []
    val_accuracies = []
    homogeneity_values = []
    homogeneity = 1.0
    total_start_time = time.time()
    iteration_counter = 1
    for epoch in range(num_epochs):
        epoch_start_time = time.time()
        model.train()
        epoch_train_loss = 0
        epoch_train_correct = 0
        epoch_train_total = 0
        for batch_idx, (data, target) in enumerate(train_loader):
            iteration_counter += 1
            data, target = data.to(device), target.to(device)
            # --- Backpropagation Update ---
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            total_bp_update = 0
            for param in model.parameters():
                if param.grad is not None:
                    total_bp_update += torch.sum(torch.abs(param.grad)).item()
            # --- Homogeneity-driven Update ---
            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
            weights_before_update = status.clone()
            # print("Model weights before homogeneity update (first 5 values of fc1.weight):")
            # print(model.fc1.weight.data[:5])
            similarity = calculate_similarity(status, average)
            beta = 0.999
            alpha = (total_iterations - iteration_counter * (1 - beta) - beta) / (total_iterations - 1)
            homogeneity_lambda = alpha * ((iteration_counter - 1) / iteration_counter)
            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
            homogeneity_values.append(homogeneity.item())
            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
            current_index = 0
            for param in model.parameters():
                param_size = param.nelement()
                update_value = delta[current_index: current_index + param_size].view_as(param.data)
                param.grad = -update_value
                current_index += param_size
            optimizer_homogeneity.step()
            optimizer_homogeneity.zero_grad()
            # print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")
            # print(model.fc1.weight.data[:5])
            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
            distance = torch.norm(weights_before_update - weights_after_update)
            # print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")
            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (
                epoch * len(train_loader) + batch_idx + 1)
            optimizer.zero_grad()
            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
            #       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "
            #       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")
            epoch_train_loss += loss.item()
            _, predicted = torch.max(output.data, 1)
            epoch_train_total += target.size(0)
            epoch_train_correct += (predicted == target).sum().item()
        epoch_train_loss /= len(train_loader)
        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
        train_losses.append(epoch_train_loss)
        train_accuracies.append(epoch_train_accuracy)
        model.eval()
        epoch_val_loss = 0
        epoch_val_correct = 0
        epoch_val_total = 0
        with torch.no_grad():
            for data, target in val_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                loss = criterion(output, target)
                epoch_val_loss += loss.item()
                _, predicted = torch.max(output.data, 1)
                epoch_val_total += target.size(0)
                epoch_val_correct += (predicted == target).sum().item()
        epoch_val_loss /= len(val_loader)
        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
        val_losses.append(epoch_val_loss)
        val_accuracies.append(epoch_val_accuracy)
        epoch_end_time = time.time()
        epoch_time = epoch_end_time - epoch_start_time
        print(f"Epoch [{epoch + 1}/{num_epochs}], "
              f"Train Loss: {epoch_train_loss:.4f}, "
              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
              f"Val Loss: {epoch_val_loss:.4f}, "
              f"Val Accuracy: {epoch_val_accuracy:.2f}%")
        print(f"Epoch processing time: {epoch_time:.2f} seconds")
    total_end_time = time.time()
    total_training_time = total_end_time - total_start_time
    print(f"Total training time: {total_training_time:.2f} seconds")
    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    train_losses = []
    val_losses = []
    train_accuracies = []
    val_accuracies = []
    total_start_time = time.time()
    iteration_counter = 1  # Initialize iteration_counter here
    for epoch in range(num_epochs):
        epoch_start_time = time.time()
        model.train()
        epoch_train_loss = 0
        epoch_train_correct = 0
        epoch_train_total = 0
        for batch_idx, (data, target) in enumerate(train_loader):
            iteration_counter += 1
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            # Calculate total_bp_update for each iteration
            total_bp_update = 0
            for param in model.parameters():
                if param.grad is not None:
                    total_bp_update += torch.sum(torch.abs(param.grad)).item()
            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "
            #       f"Backpropagation Update: {total_bp_update:.4f}")
            epoch_train_loss += loss.item()
            _, predicted = torch.max(output.data, 1)
            epoch_train_total += target.size(0)
            epoch_train_correct += (predicted == target).sum().item()
        epoch_train_loss /= len(train_loader)
        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
        train_losses.append(epoch_train_loss)
        train_accuracies.append(epoch_train_accuracy)
        model.eval()
        epoch_val_loss = 0
        epoch_val_correct = 0
        epoch_val_total = 0
        with torch.no_grad():
            for data, target in val_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                loss = criterion(output, target)
                epoch_val_loss += loss.item()
                _, predicted = torch.max(output.data, 1)
                epoch_val_total += target.size(0)
                epoch_val_correct += (predicted == target).sum().item()
        epoch_val_loss /= len(val_loader)
        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
        val_losses.append(epoch_val_loss)
        val_accuracies.append(epoch_val_accuracy)
        epoch_end_time = time.time()
        epoch_time = epoch_end_time - epoch_start_time
        print(f"Epoch [{epoch + 1}/{num_epochs}], "
              f"Train Loss: {epoch_train_loss:.4f}, "
              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
              f"Val Loss: {epoch_val_loss:.4f}, "
              f"Val Accuracy: {epoch_val_accuracy:.2f}%")
        print(f"Epoch processing time: {epoch_time:.2f} seconds")
    total_end_time = time.time()
    total_training_time = total_end_time - total_start_time
    print(f"Total training time: {total_training_time:.2f} seconds")
    return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in data_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total
# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!
num_epochs = 10
learning_rate = 0.1
batch_size = 420
homogeneity_learning_rate = 0.1
# Load and split MNIST dataset
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
training_samples_number = int(0.7 * len(full_dataset))
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
print("Hyperparameters and Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print("\n")
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size
train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# Initialize models
model_homogeneity = Net().to(device)
model_traditional = Net().to(device)
average_model = Net().to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with
Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity,
train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)
print("\nTraining
with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h,
homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t
= results_traditional
print(f"\nFinal Test Accuracy
(Homogeneity-driven): {calculate_accuracy(model_homogeneity,
test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional,
test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h, label='Homogeneity')
plt.title('Homogeneity Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
Samples of run:
-------------------------------------------------------
Samples [HYBRID-MNIST] 1-10:
-------------------------------------------------------
Final Test Accuracy (Traditional): 92.00%
beta = 0.500   Final Test Accuracy (Homogeneity-driven): 61.28%
beta = 0.800   Final Test Accuracy (Homogeneity-driven): 78.19%
beta = 0.900   Final Test Accuracy (Homogeneity-driven): 86.76%
beta = 0.950   Final Test Accuracy (Homogeneity-driven): 90.48%
beta = 0.980   Final Test Accuracy (Homogeneity-driven): 91.71%
beta = 0.990   Final Test Accuracy (Homogeneity-driven): 91.78%
beta = 0.995   Final Test Accuracy (Homogeneity-driven): 92.06%
beta = 0.999   Final Test Accuracy (Homogeneity-driven): 92.30%
beta = 1.000   Final Test Accuracy (Homogeneity-driven): 91.66%
beta = 1.010   Final Test Accuracy (Homogeneity-driven): 91.02%
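The sweep above varies only beta in the hybrid schedule. As a minimal standalone sketch (the helper name hybrid_lambda is illustrative and not part of the code above), the schedule can be isolated as follows; it reuses the alpha and homogeneity_lambda formulas from the training loop:

def hybrid_lambda(iteration_counter, total_iterations, beta=0.999):
    # alpha scales the dynamic (t-1)/t schedule and depends on beta and training progress
    alpha = (total_iterations - iteration_counter * (1 - beta) - beta) / (total_iterations - 1)
    return alpha * (iteration_counter - 1) / iteration_counter

# Illustration with total_iterations = 1000: lambda grows toward 1 over training;
# a larger beta keeps it closer to the purely dynamic schedule (t-1)/t.
for t in (2, 10, 100, 1000):
    print(t, round(hybrid_lambda(t, 1000), 4))

In the runs listed above, test accuracy increases monotonically with beta up to 0.999 and drops slightly for beta at or above 1.0, which motivates beta = 0.999 as the value hard-coded in the code.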
Part II (Full Implementation)
(Complete option: the homogeneity gradient depends on both the current and the historical change; all lambda options are included in one code)
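As a brief illustration of the history term (a minimal sketch with an illustrative name, not the full implementation below): the homogeneity gradient at each iteration mixes the current similarity derivative with the gradient from the previous iteration, so setting homogeneity_lambda to zero recovers the purely current-change form of Part I.

def homogeneity_gradient(current_partial, previous_gradient, homogeneity_lambda):
    # (1 - lambda) weights the current change; lambda carries the historical part forward,
    # mirroring the per-condition terms in weights_update in the code below
    return (1 - homogeneity_lambda) * current_partial + homogeneity_lambda * previous_gradient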
=========================================
# BEGINNING OF THE CODE
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, random_split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
import time
import math
# Import transforms from torchvision
from torchvision import transforms, datasets
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define the neural network model
class Net(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
# Function to calculate Absolute Difference Similarity (ADS)
def calculate_similarity(status, average, epsilon=1e-8):
    num = torch.sum(torch.abs(status - average))
    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
    similarity = 1 - (num / den)
    return similarity
# Function to perform Homogeneity-driven weight update (Modified)
def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, gradients, epsilon=1e-8):
    N = torch.sum(torch.abs(status - average))
    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))
    # Calculate partial derivative based on conditions from Table 1
    partial_derivative = torch.zeros_like(status)
    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),
                                  torch.logical_and(status < average, average < 0))
    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),
                                  torch.logical_and(status < 0, average > 0))
    condition3 = torch.logical_and(status == 0, average < 0)
    condition4 = torch.logical_and(status == 0, average > 0)
    condition5 = status == average
    partial_derivative[condition1] = (1 - homogeneity_lambda) / (D ** 2) * (D - N) + homogeneity_lambda * gradients[condition1]
    partial_derivative[condition2] = (1 - homogeneity_lambda) / (D ** 2) * (D + N) + homogeneity_lambda * gradients[condition2]
    partial_derivative[condition3] = (1 - homogeneity_lambda) / D + homogeneity_lambda * gradients[condition3]
    partial_derivative[condition4] = -(1 - homogeneity_lambda) / D + homogeneity_lambda * gradients[condition4]
    partial_derivative[condition5] = homogeneity_lambda * gradients[condition5]
    # Update gradients for the next iteration
    gradients = partial_derivative.clone()
    delta = -homogeneity_learning_rate * partial_derivative
    return delta, gradients
# Function to train the model with Homogeneity-driven update (Modified)
def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size, lambda_type, lambda_value):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)
    train_losses = []
    val_losses = []
    train_accuracies = []
    val_accuracies = []
    homogeneity_values = []
    # Initialization (t=0)
    homogeneity = 1.0
    gradients = torch.cat([torch.zeros_like(p.data.flatten()) for p in model.parameters()])
    total_start_time = time.time()
    iteration_counter = 1
    for epoch in range(num_epochs):
        epoch_start_time = time.time()
        model.train()
        epoch_train_loss = 0
        epoch_train_correct = 0
        epoch_train_total = 0
        for batch_idx, (data, target) in enumerate(train_loader):
            iteration_counter += 1
            data, target = data.to(device), target.to(device)
            # 1. Compute Cross-Entropy Loss
            optimizer.zero_grad()
            # Flatten MNIST data if needed
            data = data.view(data.size(0), -1)  # Always flatten for MNIST
            output = model(data)
            loss = criterion(output, target)
            # 2. Update Parameters via Backpropagation
            loss.backward()
            optimizer.step()
            # Calculate total_bp_update for monitoring
            total_bp_update = 0
            for param in model.parameters():
                if param.grad is not None:
                    total_bp_update += torch.sum(torch.abs(param.grad)).item()
            # --- Homogeneity-driven Update ---
            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()
            # Calculate lambda based on lambda_type
            if lambda_type == 'fixed':
                homogeneity_lambda = lambda_value
            elif lambda_type == 'linear':
                homogeneity_lambda = (iteration_counter - 1) / total_iterations
            elif lambda_type == 'dynamic':
                homogeneity_lambda = (iteration_counter - 1) / iteration_counter if iteration_counter > 1 else 0
            else:
                raise ValueError("Invalid lambda_type. Choose from 'fixed', 'linear', or 'dynamic'.")
            # 3. Compute Updated Homogeneity (Formula 1 and 3)
            similarity = calculate_similarity(status, average)
            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity
            homogeneity_values.append(homogeneity.item())
            # 4. Update NN Parameters Using Homogeneity Gradients (Formula 4 and 5)
            delta, gradients = weights_update(status, homogeneity, homogeneity_learning_rate,
                                              homogeneity_lambda, average, gradients)
            current_index = 0
            for param in model.parameters():
                param_size = param.nelement()
                update_value = delta[current_index: current_index + param_size].view_as(param.data)
                param.data.add_(-homogeneity_learning_rate * update_value)  # Equation 5
                current_index += param_size
            # 5. Update Average Parameter Vector (Formula 2)
            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (epoch * len(train_loader) + batch_idx + 1)
            # Reset optimizer state after homogeneity update
            optimizer.zero_grad()
            # --- End of Homogeneity-driven Update ---
            epoch_train_loss += loss.item()
            _, predicted = torch.max(output.data, 1)
            epoch_train_total += target.size(0)
            epoch_train_correct += (predicted == target).sum().item()
        epoch_train_loss /= len(train_loader)
        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
        train_losses.append(epoch_train_loss)
        train_accuracies.append(epoch_train_accuracy)
        model.eval()
        epoch_val_loss = 0
        epoch_val_correct = 0
        epoch_val_total = 0
        with torch.no_grad():
            for data, target in val_loader:
                data, target = data.to(device), target.to(device)
                # Flatten MNIST data if needed
                data = data.view(data.size(0), -1)  # Always flatten for MNIST
                output = model(data)
                loss = criterion(output, target)
                epoch_val_loss += loss.item()
                _, predicted = torch.max(output.data, 1)
                epoch_val_total += target.size(0)
                epoch_val_correct += (predicted == target).sum().item()
        epoch_val_loss /= len(val_loader)
        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
        val_losses.append(epoch_val_loss)
        val_accuracies.append(epoch_val_accuracy)
        epoch_end_time = time.time()
        epoch_time = epoch_end_time - epoch_start_time
        print(f"Epoch [{epoch + 1}/{num_epochs}], "
              f"Train Loss: {epoch_train_loss:.4f}, "
              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
              f"Val Loss: {epoch_val_loss:.4f}, "
              f"Val Accuracy: {epoch_val_accuracy:.2f}%")
        print(f"Epoch processing time: {epoch_time:.2f} seconds")
    total_end_time = time.time()
    total_training_time = total_end_time - total_start_time
    print(f"Total training time: {total_training_time:.2f} seconds")
    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values
# Function to train the model with traditional backpropagation
def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    train_losses = []
    val_losses = []
    train_accuracies = []
    val_accuracies = []
    total_start_time = time.time()
    iteration_counter = 1
    for epoch in range(num_epochs):
        epoch_start_time = time.time()
        model.train()
        epoch_train_loss = 0
        epoch_train_correct = 0
        epoch_train_total = 0
        for batch_idx, (data, target) in enumerate(train_loader):
            iteration_counter += 1
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            # Flatten MNIST data if needed
            data = data.view(data.size(0), -1)  # Always flatten for MNIST
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            # Calculate total_bp_update for each iteration
            total_bp_update = 0
            for param in model.parameters():
                if param.grad is not None:
                    total_bp_update += torch.sum(torch.abs(param.grad)).item()
            epoch_train_loss += loss.item()
            _, predicted = torch.max(output.data, 1)
            epoch_train_total += target.size(0)
            epoch_train_correct += (predicted == target).sum().item()
        epoch_train_loss /= len(train_loader)
        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total
        train_losses.append(epoch_train_loss)
        train_accuracies.append(epoch_train_accuracy)
        model.eval()
        epoch_val_loss = 0
        epoch_val_correct = 0
        epoch_val_total = 0
        with torch.no_grad():
            for data, target in val_loader:
                data, target = data.to(device), target.to(device)
                # Flatten MNIST data if needed
                data = data.view(data.size(0), -1)  # Always flatten for MNIST
                output = model(data)
                loss = criterion(output, target)
                epoch_val_loss += loss.item()
                _, predicted = torch.max(output.data, 1)
                epoch_val_total += target.size(0)
                epoch_val_correct += (predicted == target).sum().item()
        epoch_val_loss /= len(val_loader)
        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total
        val_losses.append(epoch_val_loss)
        val_accuracies.append(epoch_val_accuracy)
        epoch_end_time = time.time()
        epoch_time = epoch_end_time - epoch_start_time
        print(f"Epoch [{epoch + 1}/{num_epochs}], "
              f"Train Loss: {epoch_train_loss:.4f}, "
              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "
              f"Val Loss: {epoch_val_loss:.4f}, "
              f"Val Accuracy: {epoch_val_accuracy:.2f}%")
        print(f"Epoch processing time: {epoch_time:.2f} seconds")
    total_end_time = time.time()
    total_training_time = total_end_time - total_start_time
    print(f"Total training time: {total_training_time:.2f} seconds")
    return train_losses, val_losses, train_accuracies, val_accuracies
# Function to calculate accuracy (Modified)
def calculate_accuracy(model, data_loader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in data_loader:
            data, target = data.to(device), target.to(device)
            # Flatten MNIST data if needed
            data = data.view(data.size(0), -1)  # Always flatten for MNIST
            outputs = model(data)
            _, predicted = torch.max(outputs.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()
    return 100 * correct / total
# Hyperparameters
num_epochs = 16
learning_rate = 0.01
batch_size = 210
homogeneity_learning_rate = 0.005
input_size = 28 * 28 # For MNIST
hidden_size = 128
output_size = 10
# Homogeneity Hyperparameters
lambda_type = 'dynamic' # Options: 'fixed', 'linear', 'dynamic'
lambda_value = 0.9  # Default value for lambda if lambda_type is 'fixed'
# Load and split MNIST dataset
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
training_samples_number = int(0.7 * len(full_dataset))
total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs
print("Hyperparameters and Calculated Values:")
print(f" num_epochs: {num_epochs}")
print(f" learning_rate: {learning_rate}")
print(f" homogeneity_learning_rate: {homogeneity_learning_rate}")
print(f" batch_size: {batch_size}")
print(f" training_samples_number: {training_samples_number}")
print(f" total_iterations: {total_iterations}")
print(f" lambda_type: {lambda_type}")
print(f" lambda_value: {lambda_value}") # Only relevant if lambda_type
is 'fixed'
print("\n")
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size
train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# Initialize models
model_homogeneity = Net(input_size, hidden_size, output_size).to(device)
model_traditional = Net(input_size, hidden_size, output_size).to(device)
average_model = Net(input_size, hidden_size, output_size).to(device)
average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
# Train the models
print("Training with
Homogeneity-driven update:")
results_homogeneity = train_homogeneity_driven(model_homogeneity,
train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations,
batch_size, lambda_type, lambda_value)
print("\nTraining
with traditional backpropagation:")
results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)
# Evaluate and plot results
train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h,
homogeneity_values_h = results_homogeneity
train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t
= results_traditional
print(f"\nFinal Test Accuracy
(Homogeneity-driven): {calculate_accuracy(model_homogeneity,
test_loader):.2f}%")
print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional,
test_loader):.2f}%")
plt.figure(figsize=(10, 5))
plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')
plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')
plt.plot(train_losses_t, label='Traditional Train Loss')
plt.plot(val_losses_t, label='Traditional Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')
plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')
plt.plot(train_accuracies_t, label='Traditional Train Accuracy')
plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
plt.figure(figsize=(10, 5))
plt.plot(homogeneity_values_h, label='Homogeneity')
plt.title('Homogeneity Values over Iterations')
plt.xlabel('Iteration')
plt.ylabel('Homogeneity')
plt.legend()
plt.show()
# END OF THE CODE
=========================================
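For reference, a minimal sketch of the three lambda schedules selected by lambda_type in the code above (the helper name schedule_lambda is illustrative only); homogeneity_lambda controls both the homogeneity moving average (Formulas 1 and 3) and the weight given to the historical gradient in weights_update:

def schedule_lambda(lambda_type, iteration_counter, total_iterations, lambda_value=0.9):
    # Mirrors the branch inside train_homogeneity_driven
    if lambda_type == 'fixed':
        return lambda_value  # constant weight on history
    if lambda_type == 'linear':
        return (iteration_counter - 1) / total_iterations  # grows linearly over training
    if lambda_type == 'dynamic':
        return (iteration_counter - 1) / iteration_counter if iteration_counter > 1 else 0  # (t-1)/t
    raise ValueError("Invalid lambda_type. Choose from 'fixed', 'linear', or 'dynamic'.")

# Example: values at iteration 100 of 3200
for kind in ('fixed', 'linear', 'dynamic'):
    print(kind, round(schedule_lambda(kind, 100, 3200), 4))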
Samples of run:
-------------------------------------------------------
Samples [COMPLETE-MNIST] 1:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 10
learning_rate: 0.01
homogeneity_learning_rate: 0.001
batch_size: 210
training_samples_number: 42000
total_iterations: 2000
lambda_type: dynamic
Training with Homogeneity-driven update:
Total training time: 90.20 seconds
Training with traditional backpropagation:
Total training time: 69.26 seconds
Final Test Accuracy (Homogeneity-driven): 96.87%
Final Test Accuracy (Traditional): 96.77%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 2:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 210
training_samples_number: 42000
total_iterations: 3200
lambda_type: dynamic
Training with Homogeneity-driven update:
Total training time: 142.09 seconds
Training with traditional backpropagation:
Total training time: 112.01 seconds
Final Test Accuracy (Homogeneity-driven): 97.07%
Final Test Accuracy (Traditional): 96.66%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 3:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 105
training_samples_number: 42000
total_iterations: 6400
lambda_type: dynamic
Training with Homogeneity-driven update:
Total training time: 184.55 seconds
Training with traditional backpropagation:
Total training time: 123.41 seconds
Final Test Accuracy (Homogeneity-driven): 97.18%
Final Test Accuracy (Traditional): 96.60%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 4:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 105
training_samples_number: 42000
total_iterations: 3200
lambda_type: linear
Training with Homogeneity-driven update:
Total training time: 91.41 seconds
Training with traditional backpropagation:
Total training time: 61.28 seconds
Final Test Accuracy (Homogeneity-driven): 96.60%
Final Test Accuracy (Traditional): 96.38%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 5:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 1050
training_samples_number: 42000
total_iterations: 320
lambda_type: linear
Training with Homogeneity-driven update:
Total training time: 53.81 seconds
Training with traditional backpropagation:
Total training time: 50.97 seconds
Final Test Accuracy (Homogeneity-driven): 97.27%
Final Test Accuracy (Traditional): 97.00%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 6:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 2100
training_samples_number: 42000
total_iterations: 160
lambda_type: linear
Training with Homogeneity-driven update:
Total training time: 53.00 seconds
Training with traditional backpropagation:
Total training time: 51.18 seconds
Final Test Accuracy (Homogeneity-driven): 96.76%
Final Test Accuracy (Traditional): 96.32%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 7:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 2100
training_samples_number: 42000
total_iterations: 160
lambda_type: fixed
lambda_value: 0.9
Training with Homogeneity-driven update:
Total training time: 59.20 seconds
Training with traditional backpropagation:
Total training time: 55.72 seconds
Final Test Accuracy (Homogeneity-driven): 96.67%
Final Test Accuracy (Traditional): 96.52%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 8:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 2100
training_samples_number: 42000
total_iterations: 160
lambda_type: fixed
lambda_value: 0.5
Training with Homogeneity-driven update:
Total training time: 52.22 seconds
Training with traditional backpropagation:
Total training time: 51.21 seconds
Final Test Accuracy (Homogeneity-driven): 96.80%
Final Test Accuracy (Traditional): 96.73%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 9:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 2100
training_samples_number: 42000
total_iterations: 160
lambda_type: fixed
lambda_value: 0.1
Training with Homogeneity-driven update:
Total training time: 53.95 seconds
Training with traditional backpropagation:
Total training time: 51.14 seconds
Final Test Accuracy (Homogeneity-driven): 96.81%
Final Test Accuracy (Traditional): 96.72%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 10:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 8
learning_rate: 0.01
homogeneity_learning_rate: 0.001
batch_size: 105
training_samples_number: 42000
total_iterations: 3200
lambda_type: fixed
lambda_value: 0.9
Training with Homogeneity-driven update:
Total training time: 94.03 seconds
Training with traditional backpropagation:
Epoch processing time: 7.23 seconds
Total training time: 61.98 seconds
Final Test Accuracy (Homogeneity-driven): 96.57%
Final Test Accuracy (Traditional): 96.57%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 11:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 21000
training_samples_number: 42000
total_iterations: 32
lambda_type: fixed
lambda_value: 0.9
Training with Homogeneity-driven update:
Total training time: 105.42 seconds
Training with traditional backpropagation:
Total training time: 106.84 seconds
Final Test Accuracy (Homogeneity-driven): 93.55%
Final Test Accuracy (Traditional): 93.02%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 12:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 16
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 5
training_samples_number: 42000
total_iterations: 134400
lambda_type: fixed
lambda_value: 0.9
Training with Homogeneity-driven update:
Total training time: 2068.69 seconds
Training with traditional backpropagation:
Total training time: 785.92 seconds
Final Test Accuracy (Homogeneity-driven): 94.54%
Final Test Accuracy (Traditional): 94.47%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 13:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 4
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 2
training_samples_number: 42000
total_iterations: 84000
lambda_type: fixed
lambda_value: 0.9
Training with Homogeneity-driven update:
Total training time: 1199.32 seconds
Training with traditional backpropagation:
Total training time: 377.90 seconds
Final Test Accuracy (Homogeneity-driven): 92.46%
Final Test Accuracy (Traditional): 90.46%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 14:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 20
learning_rate: 0.01
homogeneity_learning_rate: 0.005
batch_size: 42000
training_samples_number: 42000
total_iterations: 20
lambda_type: fixed
lambda_value: 0.9
Training with Homogeneity-driven update:
Total training time: 131.89 seconds
Training with traditional backpropagation:
Total training time: 132.32 seconds
Final Test Accuracy (Homogeneity-driven): 91.40%
Final Test Accuracy (Traditional): 91.30%
-------------------------------------------------------
Samples [COMPLETE-MNIST] 15:
-------------------------------------------------------
Hyperparameters and Calculated Values:
num_epochs: 4
learning_rate: 0.01
homogeneity_learning_rate: 0.01
batch_size: 1
training_samples_number: 42000
total_iterations: 168000
lambda_type: linear
Training with Homogeneity-driven update:
Total training time: 2556.08 seconds
Training with traditional backpropagation:
Total training time: 833.99 seconds
Final Test Accuracy (Homogeneity-driven): 89.85%
Final Test Accuracy (Traditional): 87.74%