Beyond Backpropagation: Smarter Neural Networks for Smart Manufacturing

(Code and results of experiments related to the article submitted to ISM-2025: “International Conference on Industry of the Future and Smart Manufacturing”)

 

ABSTRACT

As neural networks (NNs) become integral to advanced applications in smart manufacturing, the demand for models that are both accurate and robust continues to grow. A persistent challenge in NN training lies in avoiding local minima, which can prevent the model from minimizing the loss function effectively, both in fitting the training data and in generalizing to unseen test data, and thus from achieving globally optimal performance. To address this, we propose an extension to traditional backpropagation that incorporates a self-adaptive mechanism encouraging exploration of underutilized regions of the optimization landscape. This method adds an auxiliary objective to the training process, complementing gradient-based exploitation with an exploration component that dynamically adjusts the network’s internal state. We provide a mathematical formulation of the algorithm and conduct comparative experiments showing that our approach achieves lower training loss and superior accuracy. We analyze its connections to existing methods such as momentum and entropy-based regularization, emphasizing its unique contributions. Finally, we discuss the implications for the industry of the future, where NNs must perform reliably under dynamic, real-world conditions. By enabling smarter, self-critical models, this approach advances the development of more reliable and adaptive NNs for smart manufacturing.

 

 

CODE AND EXPERIMENTS

 

Part I (Restricted Implementation)

(Simplified option in which the homogeneity gradient depends only on the current change, without the history term)
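
For reference, the quantities computed by the listings below can be written compactly as follows (a reconstruction from the code, with $w$ the flattened weight vector, called "status" in the code, and $a$ the running mean of past weight snapshots):

$S(w, a) = 1 - \dfrac{\sum_i |w_i - a_i|}{\varepsilon + \sum_i \left(|w_i| + |a_i|\right)}$   (Absolute Difference Similarity, calculate_similarity)

$H_t = (1 - \lambda_t)\, S(w_t, a) + \lambda_t\, H_{t-1}, \qquad H_0 = 1$   (homogeneity)

$\Delta w_t = -\eta_H \, \dfrac{\partial H_t}{\partial w}$   (delta in weights_update, with $\eta_H$ = homogeneity_learning_rate; the partial derivative is the case-wise expression implemented in calculate_partial_derivative, and the resulting step is applied through a second Adam optimizer)

In listing 1 ([A-A]), $\lambda_t = (t - 1)/T$ with $T$ = total_iterations; in listing 2 ([A-B]), $\lambda_t = (t - 1)/t$, where $t$ is the iteration counter.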

 

1. HOMOGENEITY NN TRAINING (FINAL CODE) [manual setup for the learning rate; lambda depends on the total number of iterations] –

[A-A] – MNIST DATASET

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import DataLoader, random_split

from torchvision import datasets, transforms

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model

class Net(nn.Module):

    def __init__(self):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(28 * 28, 128)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(128, 10)

 

    def forward(self, x):

        x = x.view(-1, 28 * 28)

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity
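# The explicit conditions below enumerate sign combinations of status and average
# (including status == 0 and status == average); elements not covered by them fall
# through to the general sign-based expression at the end of the function.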

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

           

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---
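            # Flow of this block: (1) flatten all weights into 'status' and measure
            # their ADS similarity to 'average', the running mean of past snapshots;
            # (2) blend the similarity into the running 'homogeneity' value using
            # homogeneity_lambda (here (iteration_counter - 1) / total_iterations,
            # with total_iterations defined at module level further below);
            # (3) write the negated delta into param.grad and take a step with the
            # second optimizer (optimizer_homogeneity); (4) update 'average' as an
            # incremental mean over all snapshots seen so far.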

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

 #           print("Model weights before homogeneity update (first 5 values of fc1.weight):")

 #           print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)

            homogeneity_lambda = (iteration_counter - 1) / total_iterations

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

   #         print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

   #        print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

   #         print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

   #         print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

   #               f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

   #               f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!

num_epochs = 8

learning_rate = 0.1

batch_size = 420

homogeneity_learning_rate = 0.1

 

# Load and split MNIST dataset

transform = transforms.ToTensor()

full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

 

training_samples_number = int(0.7 * len(full_dataset))

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

train_size = int(0.7 * len(full_dataset))

val_size = int(0.15 * len(full_dataset))

test_size = len(full_dataset) - train_size - val_size

train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

 

# Initialize models

model_homogeneity = Net().to(device)

model_traditional = Net().to(device)

 

average_model = Net().to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()
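# Note: 'average' is seeded with the weights of a separate, freshly initialized
# network (average_model) and is then refined as a running mean of weight
# snapshots inside train_homogeneity_driven.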

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

 

Samples of run:

-------------------------------------------------------

Sample [A-A-MNIST] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.0005

  batch_size: 140

  training_samples_number: 42000

  total_iterations: 2400

 

Training with Homogeneity-driven update:

Total training time: 82.51 seconds

 

Training with traditional backpropagation:

Total training time: 58.64 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.12%

Final Test Accuracy (Traditional): 97.05%

 

-------------------------------------------------------

Sample [A-A-MNIST] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.001

  batch_size: 280

  training_samples_number: 42000

  total_iterations: 1200

 

Training with Homogeneity-driven update:

Total training time: 67.44 seconds

 

Training with traditional backpropagation:

Total training time: 53.19 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.18%

Final Test Accuracy (Traditional): 97.18%

 

-------------------------------------------------------

Sample [A-A-MNIST] 3:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.001

  batch_size: 420

  training_samples_number: 42000

  total_iterations: 800

 

Training with Homogeneity-driven update:

Total training time: 61.46 seconds

 

Training with traditional backpropagation:

Total training time: 51.95 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.53%

Final Test Accuracy (Traditional): 97.49%

 

-------------------------------------------------------

Sample [A-A-MNIST] 4:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 1

  learning_rate: 0.01

  homogeneity_learning_rate: 0.02

  batch_size: 42000

  training_samples_number: 42000

  total_iterations: 1

 

Training with Homogeneity-driven update:

Total training time: 7.19 seconds

 

Training with traditional backpropagation:

Total training time: 6.57 seconds

 

Final Test Accuracy (Homogeneity-driven): 64.99%

Final Test Accuracy (Traditional): 58.69%

-------------------------------------------------------


 

2. HOMOGENEITY NN TRAINING (Option with manual setting of the learning rate; lambda depends on the current iteration) – [A-B] – MNIST DATASET
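
As far as this excerpt shows, the only functional difference from listing 1 ([A-A]) is the schedule for homogeneity_lambda inside the training loop. A minimal sketch of the two schedules, with names following the listings:

def lambda_schedule_A_A(iteration_counter, total_iterations):
    # [A-A] (listing 1): grows linearly toward 1 over the whole training budget
    return (iteration_counter - 1) / total_iterations

def lambda_schedule_A_B(iteration_counter):
    # [A-B] (this listing): depends only on the current iteration counter and
    # approaches 1 quickly, independent of the total budget
    return (iteration_counter - 1) / iteration_counter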

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import DataLoader, random_split

from torchvision import datasets, transforms

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model

class Net(nn.Module):

    def __init__(self):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(28 * 28, 128)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(128, 10)

 

    def forward(self, x):

        x = x.view(-1, 28 * 28)

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

           

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

 #           print("Model weights before homogeneity update (first 5 values of fc1.weight):")

 #           print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)

            homogeneity_lambda = (iteration_counter - 1) / iteration_counter

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

   #         print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

   #        print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

   #         print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

   #         print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

   #               f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

   #               f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!

num_epochs = 8

learning_rate = 0.1

batch_size = 420

homogeneity_learning_rate = 0.1

 

# Load and split MNIST dataset

transform = transforms.ToTensor()

full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

 

training_samples_number = int(0.7 * len(full_dataset))

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

train_size = int(0.7 * len(full_dataset))

val_size = int(0.15 * len(full_dataset))

test_size = len(full_dataset) - train_size - val_size

train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

 

# Initialize models

model_homogeneity = Net().to(device)

model_traditional = Net().to(device)

 

average_model = Net().to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

 

Samples of run:

-------------------------------------------------------

Sample [A-B-MNIST] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.1

  homogeneity_learning_rate: 0.1

  batch_size: 420

  training_samples_number: 42000

  total_iterations: 800

 

Training with Homogeneity-driven update:

Total training time: 59.27 seconds

 

Training with traditional backpropagation:

Total training time: 51.43 seconds

 

Final Test Accuracy (Homogeneity-driven): 92.24%

Final Test Accuracy (Traditional): 90.14%

 

-------------------------------------------------------

Sample [A-B-MNIST] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:
  num_epochs: 18
  learning_rate: 0.001
  homogeneity_learning_rate: 0.001
  batch_size: 4200
  training_samples_number: 42000
  total_iterations: 180
 
Training with Homogeneity-driven update:
Total training time: 119.41 seconds
 
Training with traditional backpropagation:
Total training time: 115.07 seconds
 
Final Test Accuracy (Homogeneity-driven): 93.64%
Final Test Accuracy (Traditional): 93.27%

 

-------------------------------------------------------

Sample [A-B-MNIST] 3:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.01

  homogeneity_learning_rate: 0.008

  batch_size: 21000

  training_samples_number: 42000

  total_iterations: 32

 

Training with Homogeneity-driven update:

Total training time: 104.28 seconds

 

Training with traditional backpropagation:

Total training time: 103.38 seconds

 

Final Test Accuracy (Homogeneity-driven): 94.03%

Final Test Accuracy (Traditional): 93.01%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 4:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 210

  training_samples_number: 42000

  total_iterations: 3200

 

Training with Homogeneity-driven update:

Total training time: 141.07 seconds

 

Training with traditional backpropagation:

Total training time: 108.91 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.85%

Final Test Accuracy (Traditional): 96.63%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 5:

-------------------------------------------------------

Hyperparameters and Calculated Values:
  num_epochs: 16
  learning_rate: 0.01
  homogeneity_learning_rate: 0.01
  batch_size: 840
  training_samples_number: 42000
  total_iterations: 800
 
Training with Homogeneity-driven update:
Total training time: 109.00 seconds
 
Training with traditional backpropagation:
Total training time: 100.24 seconds
 
Final Test Accuracy (Homogeneity-driven): 97.72%
Final Test Accuracy (Traditional): 97.31%
 

 
 

-------------------------------------------------------

Sample [A-B-MNIST] 6:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 18

  learning_rate: 0.001

  homogeneity_learning_rate: 0.0008

  batch_size: 2100

  training_samples_number: 42000

  total_iterations: 360

 

Training with Homogeneity-driven update:

Total training time: 119.29 seconds

 

Training with traditional backpropagation:

Epoch processing time: 5.94 seconds

Total training time: 115.52 seconds

 

Final Test Accuracy (Homogeneity-driven): 95.20%

Final Test Accuracy (Traditional): 94.99%
 
 

-------------------------------------------------------

Sample [A-B-MNIST] 7:

-------------------------------------------------------

Hyperparameters and Calculated Values:
  num_epochs: 16
  learning_rate: 0.01
  homogeneity_learning_rate: 0.005
  batch_size: 840
  training_samples_number: 42000
  total_iterations: 800
 
Training with Homogeneity-driven update:
Total training time: 109.48 seconds
 
Training with traditional backpropagation:
Total training time: 100.49 seconds
 
Final Test Accuracy (Homogeneity-driven): 97.69%
Final Test Accuracy (Traditional): 97.51%

 

-------------------------------------------------------

Sample [A-B-MNIST] 8:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 40

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 42000

  training_samples_number: 42000

  total_iterations: 40

 

Training with Homogeneity-driven update:

Total training time: 259.55 seconds

 

Training with traditional backpropagation:

Total training time: 257.16 seconds

 

Final Test Accuracy (Homogeneity-driven): 94.60%

Final Test Accuracy (Traditional): 94.24%

 

-------------------------------------------------------

Sample [A-B-MNIST] 9:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 40

  learning_rate: 0.01

  homogeneity_learning_rate: 0.001

  batch_size: 42000

  training_samples_number: 42000

  total_iterations: 40

 

Training with Homogeneity-driven update:

Total training time: 258.01 seconds

 

Training with traditional backpropagation:

Total training time: 258.25 seconds

 

Final Test Accuracy (Homogeneity-driven): 94.61%

Final Test Accuracy (Traditional): 94.25%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 10:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 40

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 4200

  training_samples_number: 42000

  total_iterations: 400

 

Training with Homogeneity-driven update:

Total training time: 260.06 seconds

 

Training with traditional backpropagation:

Total training time: 253.40 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.13%

Final Test Accuracy (Traditional): 97.02%

 

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 11:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 18

  learning_rate: 0.001

  homogeneity_learning_rate: 0.0008

  batch_size: 1050

  training_samples_number: 42000

  total_iterations: 720

 

Final Test Accuracy (Homogeneity-driven): 96.40%

Final Test Accuracy (Traditional): 96.24%

 

-------------------------------------------------------

Sample [A-B-MNIST] 12:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  learning_rate: 0.001

  homogeneity_learning_rate: 0.001

  batch_size: 672

  training_samples_number: 42000

  total_iterations: 630

 

Training with Homogeneity-driven update:

Total training time: 70.79 seconds

 

Training with traditional backpropagation:

Total training time: 63.73 seconds

 

Final Test Accuracy (Homogeneity-driven): 95.69%

Final Test Accuracy (Traditional): 95.63%

 

-------------------------------------------------------

Sample [A-B-MNIST] 13:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 40

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 420

  training_samples_number: 42000

  total_iterations: 4000

 

Training with Homogeneity-driven update:

Total training time: 304.14 seconds

 

Training with traditional backpropagation:

Total training time: 263.50 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.42%

Final Test Accuracy (Traditional): 97.46%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 14:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.001

  homogeneity_learning_rate: 0.0008

  batch_size: 672

  training_samples_number: 42000

  total_iterations: 1008

 

Training with Homogeneity-driven update:

Total training time: 116.50 seconds

 

Training with traditional backpropagation:

Total training time: 102.51 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.65%

Final Test Accuracy (Traditional): 96.60%

 

-------------------------------------------------------

Sample [A-B-MNIST] 15:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 20

  learning_rate: 0.01

  homogeneity_learning_rate: 0.003

  batch_size: 42

  training_samples_number: 42000

  total_iterations: 20000

 

Training with Homogeneity-driven update:

Total training time: 409.74 seconds

 

Training with traditional backpropagation:

Total training time: 213.06 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.65%

Final Test Accuracy (Traditional): 96.07%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 16:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 20

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 168

  training_samples_number: 42000

  total_iterations: 5000

 

Training with Homogeneity-driven update:

Total training time: 199.35 seconds

 

Training with traditional backpropagation:

Total training time: 144.97 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.07%

Final Test Accuracy (Traditional): 96.94%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 17:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 20

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 336

  training_samples_number: 42000

  total_iterations: 2500

 

Training with Homogeneity-driven update:

Total training time: 160.23 seconds

 

Training with traditional backpropagation:

Total training time: 131.83 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.34%

Final Test Accuracy (Traditional): 97.16%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 18:

-------------------------------------------------------

Hyperparameters and Calculated Values:
  num_epochs: 24
  learning_rate: 0.001
  homogeneity_learning_rate: 0.0005
  batch_size: 4200
  training_samples_number: 42000
  total_iterations: 240
 
Training with Homogeneity-driven update:
Total training time: 157.12 seconds
 
Training with traditional backpropagation:
Total training time: 153.61 seconds
 
Final Test Accuracy (Homogeneity-driven): 94.25%
Final Test Accuracy (Traditional): 94.03%

 

 

-------------------------------------------------------

Sample [A-B-MNIST] 19:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 12

  learning_rate: 0.1

  homogeneity_learning_rate: 0.001

  batch_size: 3360

  training_samples_number: 42000

  total_iterations: 156

 

Training with Homogeneity-driven update:

 

Epoch [1/12], Train Loss: 8.7319, Train Accuracy: 30.93%, Val Loss: 1.4389, Val Accuracy: 48.40%

Epoch processing time: 7.59 seconds

Epoch [2/12], Train Loss: 1.1013, Train Accuracy: 64.49%, Val Loss: 0.8079, Val Accuracy: 73.32%

Epoch processing time: 5.98 seconds

Epoch [3/12], Train Loss: 0.6478, Train Accuracy: 80.60%, Val Loss: 0.5574, Val Accuracy: 85.28%

Epoch processing time: 6.99 seconds

Epoch [4/12], Train Loss: 0.4568, Train Accuracy: 87.32%, Val Loss: 0.4280, Val Accuracy: 88.48%

Epoch processing time: 5.81 seconds

Epoch [5/12], Train Loss: 0.3534, Train Accuracy: 89.93%, Val Loss: 0.3521, Val Accuracy: 90.40%

Epoch processing time: 7.00 seconds

Epoch [6/12], Train Loss: 0.2978, Train Accuracy: 91.44%, Val Loss: 0.3133, Val Accuracy: 91.67%

Epoch processing time: 6.05 seconds

Epoch [7/12], Train Loss: 0.2655, Train Accuracy: 92.46%, Val Loss: 0.3019, Val Accuracy: 91.56%

Epoch processing time: 6.93 seconds

Epoch [8/12], Train Loss: 0.2472, Train Accuracy: 92.78%, Val Loss: 0.2807, Val Accuracy: 92.37%

Epoch processing time: 6.01 seconds

Epoch [9/12], Train Loss: 0.2282, Train Accuracy: 93.35%, Val Loss: 0.2738, Val Accuracy: 92.56%

Epoch processing time: 6.92 seconds

Epoch [10/12], Train Loss: 0.2132, Train Accuracy: 93.80%, Val Loss: 0.2575, Val Accuracy: 92.93%

Epoch processing time: 6.06 seconds

Epoch [11/12], Train Loss: 0.1981, Train Accuracy: 94.22%, Val Loss: 0.2553, Val Accuracy: 92.82%

Epoch processing time: 6.95 seconds

Epoch [12/12], Train Loss: 0.1878, Train Accuracy: 94.39%, Val Loss: 0.2489, Val Accuracy: 93.12%

 

Epoch processing time: 5.82 seconds

Total training time: 78.10 seconds

 

Training with traditional backpropagation:

Epoch [1/12], Train Loss: 6.9806, Train Accuracy: 37.27%, Val Loss: 1.5671, Val Accuracy: 49.42%

Epoch processing time: 6.84 seconds

Epoch [2/12], Train Loss: 1.1666, Train Accuracy: 60.90%, Val Loss: 0.9454, Val Accuracy: 72.51%

Epoch processing time: 5.91 seconds

Epoch [3/12], Train Loss: 0.7834, Train Accuracy: 75.20%, Val Loss: 0.6856, Val Accuracy: 78.37%

Epoch processing time: 6.80 seconds

Epoch [4/12], Train Loss: 0.5979, Train Accuracy: 81.24%, Val Loss: 0.5383, Val Accuracy: 83.37%

Epoch processing time: 5.91 seconds

Epoch [5/12], Train Loss: 0.4758, Train Accuracy: 85.61%, Val Loss: 0.4466, Val Accuracy: 86.86%

Epoch processing time: 6.86 seconds

Epoch [6/12], Train Loss: 0.3828, Train Accuracy: 88.98%, Val Loss: 0.3825, Val Accuracy: 89.56%

Epoch processing time: 5.87 seconds

Epoch [7/12], Train Loss: 0.3210, Train Accuracy: 91.02%, Val Loss: 0.3443, Val Accuracy: 90.71%

Epoch processing time: 6.86 seconds

Epoch [8/12], Train Loss: 0.2844, Train Accuracy: 92.08%, Val Loss: 0.3211, Val Accuracy: 91.11%

Epoch processing time: 5.70 seconds

Epoch [9/12], Train Loss: 0.2688, Train Accuracy: 92.48%, Val Loss: 0.3099, Val Accuracy: 91.96%

Epoch processing time: 6.86 seconds

Epoch [10/12], Train Loss: 0.2522, Train Accuracy: 93.00%, Val Loss: 0.2966, Val Accuracy: 92.21%

Epoch processing time: 5.94 seconds

Epoch [11/12], Train Loss: 0.2417, Train Accuracy: 93.25%, Val Loss: 0.2952, Val Accuracy: 91.84%

Epoch processing time: 7.90 seconds

Epoch [12/12], Train Loss: 0.2306, Train Accuracy: 93.39%, Val Loss: 0.2825, Val Accuracy: 92.67%

 

Epoch processing time: 5.94 seconds

Total training time: 77.39 seconds

 

Final Test Accuracy (Homogeneity-driven): 93.44%

Final Test Accuracy (Traditional): 92.67%

 

 

 

 

 

 


 

3. OPTION WITH DYNAMIC HOMOGENEITY LEARNING RATE [lambda depends on the total number of iterations] – [B-A] – MNIST DATASET

 

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import DataLoader, random_split

from torchvision import datasets, transforms

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model

class Net(nn.Module):

    def __init__(self):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(28 * 28, 128)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(128, 10)

 

    def forward(self, x):

        x = x.view(-1, 28 * 28)

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

           

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

 #           print("Model weights before homogeneity update (first 5 values of fc1.weight):")

 #           print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)

            homogeneity_lambda = (iteration_counter - 1) / total_iterations

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

   #         print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

   #        print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

   #         print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")
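            # Incrementally update the running mean ('average') of the flattened weight vectors
            # observed over all iterations processed so far.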

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

   #         print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

   #               f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

   #               f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!

num_epochs = 4

learning_rate = 0.01

batch_size = 10

# removed to test dynamic value   homogeneity_learning_rate = 0.01

 

# Load and split MNIST dataset

transform = transforms.ToTensor()

full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

 

training_samples_number = int(0.7 * len(full_dataset))

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs

 

homogeneity_learning_rate = (2* learning_rate) / (learning_rate * total_iterations + 1)
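# Note: for the defaults above (42,000 training samples, batch_size = 10, num_epochs = 4,
# learning_rate = 0.01) this gives total_iterations = 4200 * 4 = 16800 and
# homogeneity_learning_rate = 0.02 / 169 ≈ 1.18e-4, matching Sample [B-A-MNIST] 1 below.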

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

train_size = int(0.7 * len(full_dataset))

val_size = int(0.15 * len(full_dataset))

test_size = len(full_dataset) - train_size - val_size

train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

 

# Initialize models

model_homogeneity = Net().to(device)

model_traditional = Net().to(device)

 

average_model = Net().to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

Sample runs:

-------------------------------------------------------

Sample [B-A-MNIST] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 4

  learning_rate: 0.01

  homogeneity_learning_rate: 0.00011834319526627219

  batch_size: 10

  training_samples_number: 42000

  total_iterations: 16800

 

Training with Homogeneity-driven update:

Total training time: 249.94 seconds

 

Training with traditional backpropagation:

Epoch processing time: 22.37 seconds

Total training time: 90.24 seconds

 

Final Test Accuracy (Homogeneity-driven): 95.12%

Final Test Accuracy (Traditional): 94.38%

 

 

-------------------------------------------------------

Sample [B-A-MNIST] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.00011834319526627219

  batch_size: 20

  training_samples_number: 42000

  total_iterations: 16800

 

Training with Homogeneity-driven update:

Total training time: 277.78 seconds

 

Training with traditional backpropagation:

Total training time: 112.23 seconds

 

Final Test Accuracy (Homogeneity-driven): 95.91%

Final Test Accuracy (Traditional): 95.78%

 

 

-------------------------------------------------------

Sample [B-A-MNIST] 3:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  learning_rate: 0.01

  homogeneity_learning_rate: 0.006666666666666667

  batch_size: 2100

  training_samples_number: 42000

  total_iterations: 200

 

Training with Homogeneity-driven update:

Total training time: 65.42 seconds

 

Training with traditional backpropagation:

Total training time: 62.62 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.92%

Final Test Accuracy (Traditional): 96.87%

 

 

 


 

4. HOMOGENEITY NN TRAINING (Option with lambda based on current iteration and dynamic learning rate computation) – [B-B] – MNIST DATASET
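
The only functional change from the [B-A] listing above is the homogeneity_lambda schedule. A minimal standalone sketch (not part of the experiment code) contrasting the two schedules, assuming total_iterations = 16800 as in the MNIST sample runs:

# '-A' variants: lambda grows linearly with overall training progress;
# '-B' variants: lambda depends only on the current iteration index, so the homogeneity value
# effectively becomes an incremental running mean of the per-iteration similarity values.
total_iterations = 16800
for iteration_counter in (2, 100, 8400, 16800):
    lambda_a = (iteration_counter - 1) / total_iterations
    lambda_b = (iteration_counter - 1) / iteration_counter
    print(iteration_counter, round(lambda_a, 4), round(lambda_b, 4))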

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import DataLoader, random_split

from torchvision import datasets, transforms

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model

class Net(nn.Module):

    def __init__(self):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(28 * 28, 128)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(128, 10)

 

    def forward(self, x):

        x = x.view(-1, 28 * 28)

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

           

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

 #           print("Model weights before homogeneity update (first 5 values of fc1.weight):")

 #           print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)
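            # [B-B] change: homogeneity_lambda is tied to the current iteration index only
            # (the [B-A] listing above uses (iteration_counter - 1) / total_iterations instead)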

            homogeneity_lambda = (iteration_counter - 1) / iteration_counter

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

   #         print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

   #        print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

   #         print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

   #         print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

   #               f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

   #               f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# !!!!!!!!!!!! Hyperparameters !!!!!!!!!!!!!!!!!!!!!

num_epochs = 4

learning_rate = 0.01

batch_size = 10

# removed to test dynamic value   homogeneity_learning_rate = 0.01

 

# Load and split MNIST dataset

transform = transforms.ToTensor()

full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

 

training_samples_number = int(0.7 * len(full_dataset))

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs

 

homogeneity_learning_rate = (2* learning_rate) / (learning_rate * total_iterations + 1)

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

train_size = int(0.7 * len(full_dataset))

val_size = int(0.15 * len(full_dataset))

test_size = len(full_dataset) - train_size - val_size

train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

 

# Initialize models

model_homogeneity = Net().to(device)

model_traditional = Net().to(device)

 

average_model = Net().to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

 

# END OF THE CODE

=========================================

 

 

Sample runs:

-------------------------------------------------------

Sample [B-B-MNIST] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.0005780346820809249

  batch_size: 100

  training_samples_number: 42000

  total_iterations: 3360

 

Training with Homogeneity-driven update:

Total training time: 133.06 seconds

 

Training with traditional backpropagation:

Total training time: 58.51 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.67%

Final Test Accuracy (Traditional): 96.27%

 

-------------------------------------------------------

Sample [B-B-MNIST] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.001

  homogeneity_learning_rate: 0.00022222222222222223

  batch_size: 42

  training_samples_number: 42000

  total_iterations: 8000

 

Training with Homogeneity-driven update:

Total training time: 201.09 seconds

 

Training with traditional backpropagation:

Total training time: 71.01 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.42%

Final Test Accuracy (Traditional): 97.33%

 

 

-------------------------------------------------------

Sample [B-B-MNIST] 3:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.1

  homogeneity_learning_rate: 0.00012492192379762648

  batch_size: 21

  training_samples_number: 42000

  total_iterations: 16000

 

Training with Homogeneity-driven update:

Total training time: 316.75 seconds

 

Training with traditional backpropagation:

Total training time: 127.37 seconds

 

Final Test Accuracy (Homogeneity-driven): 49.31%

Final Test Accuracy (Traditional): 45.58%

 

 


 

SAME (basic) CODE IDEA FOR IRIS DATASET (lambda depends on the total number of iterations and dynamic learning rate) [B-A]
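
For reference, a short standalone sketch (not part of the listing) of how the dynamic homogeneity learning rate works out for the Iris defaults used below (90 training samples after the 60/20/20 split, batch_size = 32, num_epochs = 10, learning_rate = 0.01):

import math
total_iterations = math.ceil(90 / 32) * 10                              # = 3 * 10 = 30
homogeneity_learning_rate = (2 * 0.01) / (0.01 * total_iterations + 1)  # = 0.02 / 1.3 ≈ 0.01538
print(total_iterations, homogeneity_learning_rate)

This matches the values printed in Sample [B-A-IRIS] 1 below.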

 

=========================================

# BEGINNING OF THE CODE

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import Dataset, DataLoader, random_split

from sklearn.datasets import load_iris

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model (adjusted for Iris dataset)

class Net(nn.Module):

    def __init__(self, input_size, hidden_size, output_size):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(input_size, hidden_size)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(hidden_size, output_size)

 

    def forward(self, x):

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size):

    # Added batch_size as an argument

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

            # print("Model weights before homogeneity update (first 5 values of fc1.weight):")

            # print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)

            homogeneity_lambda = (iteration_counter - 1) / total_iterations

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

            # print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

            # print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

            # print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

            #       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        #print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate, batch_size):

    # Added batch_size as an argument

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        #print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# -------------------- Hyperparameters --------------------

num_epochs = 10

learning_rate = 0.01

# removed to be replaced by dynamic option   homogeneity_learning_rate = 0.01

input_size = 4  # Number of features in Iris dataset

hidden_size = 10

output_size = 3  # Number of classes in Iris dataset

batch_size = 32 # Added batch size definition

# ---------------------------------------------------------

 

# Load the Iris dataset

iris = load_iris()

X = iris.data

y = iris.target

 

# Split data into training, validation, and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)  # 0.25 x 0.8 = 0.2

 

# Scale the data

scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)

X_val = scaler.transform(X_val)

X_test = scaler.transform(X_test)

 

# Create custom dataset class for Iris data

class IrisDataset(Dataset):

    def __init__(self, X, y):

        self.X = torch.tensor(X, dtype=torch.float32)

        self.y = torch.tensor(y, dtype=torch.long)

 

    def __len__(self):

        return len(self.X)

 

    def __getitem__(self, idx):

        return self.X[idx], self.y[idx]

 

# Create data loaders

train_dataset = IrisDataset(X_train, y_train)

val_dataset = IrisDataset(X_val, y_val)

test_dataset = IrisDataset(X_test, y_test)

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) # Updated to use batch_size variable

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable

 

training_samples_number = len(train_dataset)

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs # Updated to use batch_size variable

 

homogeneity_learning_rate = (2* learning_rate) / (learning_rate * total_iterations + 1)

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

# Initialize models

model_homogeneity = Net(input_size, hidden_size, output_size).to(device)

model_traditional = Net(input_size, hidden_size, output_size).to(device)

 

average_model = Net(input_size, hidden_size, output_size).to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate, batch_size)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

 

Sample runs:

-------------------------------------------------------

Sample [B-A-IRIS] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  learning_rate: 0.01

  homogeneity_learning_rate: 0.015384615384615384

  batch_size: 32

  training_samples_number: 90

  total_iterations: 30

 

Training with Homogeneity-driven update:

Total training time: 0.16 seconds

 

Training with traditional backpropagation:

 

Final Test Accuracy (Homogeneity-driven): 86.67%

Final Test Accuracy (Traditional): 80.00%

 

 

-------------------------------------------------------

Sample [B-A-IRIS] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 2

  learning_rate: 0.01

  homogeneity_learning_rate: 0.0196078431372549

  batch_size: 90

  training_samples_number: 90

  total_iterations: 2

  hidden_size: 16

 

Training with Homogeneity-driven update:

Total training time: 0.02 seconds

 

Training with traditional backpropagation:

Total training time: 0.01 seconds

 

Final Test Accuracy (Homogeneity-driven): 63.33%

Final Test Accuracy (Traditional): 46.67%

 

 

-------------------------------------------------------

Sample [B-A-IRIS] 3:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 2

  learning_rate: 0.01

  homogeneity_learning_rate: 0.0196078431372549

  batch_size: 90

  training_samples_number: 90

  total_iterations: 2

  hidden_size: 4

 

Training with Homogeneity-driven update:

Total training time: 0.02 seconds

 

Training with traditional backpropagation:

Total training time: 0.01 seconds

 

Final Test Accuracy (Homogeneity-driven): 40.00%

Final Test Accuracy (Traditional): 16.67%

 

-------------------------------------------------------

Sample [B-A-IRIS] 4:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 2

  hidden_size: 2

  learning_rate: 0.01

  homogeneity_learning_rate: 0.0196078431372549

  batch_size: 90

  training_samples_number: 90

  total_iterations: 2

 

Training with Homogeneity-driven update:

Total training time: 0.02 seconds

 

Training with traditional backpropagation:

Total training time: 0.01 seconds

 

Final Test Accuracy (Homogeneity-driven): 43.33%

Final Test Accuracy (Traditional): 36.67%

 

 

-------------------------------------------------------

Sample [B-A-IRIS] 5:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 2

  hidden_size: 24

  learning_rate: 0.01

  homogeneity_learning_rate: 0.0196078431372549

  batch_size: 90

  training_samples_number: 90

  total_iterations: 2

 

Training with Homogeneity-driven update:

Total training time: 0.06 seconds

 

Training with traditional backpropagation:

Total training time: 0.03 seconds

 

Final Test Accuracy (Homogeneity-driven): 70.00%

Final Test Accuracy (Traditional): 43.33%

 

 

-------------------------------------------------------

Sample [B-A-IRIS] 6:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  hidden_size: 3

  learning_rate: 0.01

  homogeneity_learning_rate: 0.015384615384615384

  batch_size: 30

  training_samples_number: 90

  total_iterations: 30

 

Training with Homogeneity-driven update:

Total training time: 0.20 seconds

 

Training with traditional backpropagation:

Total training time: 0.14 seconds

 

Final Test Accuracy (Homogeneity-driven): 90.00%

Final Test Accuracy (Traditional): 70.00%

 

 

 

 


 

SAME (basic) CODE IDEA FOR IRIS DATASET (lambda depends on current iteration, manual setting of learning rate) [A-B]
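
In this variant the homogeneity learning rate is set manually as a hyperparameter instead of being derived from total_iterations, while homogeneity_lambda again follows the current-iteration schedule. A minimal sketch of the manual setting (the concrete value is illustrative, taken from the manual value noted as removed in the dynamic listings above):

homogeneity_learning_rate = 0.01  # manual setting; not computed from total_iterations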

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import Dataset, DataLoader, random_split

from sklearn.datasets import load_iris

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model (adjusted for Iris dataset)

class Net(nn.Module):

    def __init__(self, input_size, hidden_size, output_size):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(input_size, hidden_size)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(hidden_size, output_size)

 

    def forward(self, x):

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size):

    # Added batch_size as an argument

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

            # print("Model weights before homogeneity update (first 5 values of fc1.weight):")

            # print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)
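
            # With lambda = (t - 1) / t, the update below keeps homogeneity equal to the
            # running mean of the similarity values (seeded with the initial value 1.0)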

            homogeneity_lambda = (iteration_counter - 1) / iteration_counter

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)
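
            # Write the homogeneity-driven delta into .grad so that the dedicated
            # Adam optimizer (optimizer_homogeneity) applies it as a parameter update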

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

            # print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

            # print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

            # print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

            #       f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        #print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate, batch_size):

    # Added batch_size as an argument

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        #print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# -------------------- Hyperparameters --------------------

num_epochs = 10

learning_rate = 0.01

homogeneity_learning_rate = 0.01

input_size = 4  # Number of features in Iris dataset

hidden_size = 10

output_size = 3  # Number of classes in Iris dataset

batch_size = 32 # Added batch size definition

 

# ---------------------------------------------------------

 

# Load the Iris dataset

iris = load_iris()

X = iris.data

y = iris.target

 

# Split data into training, validation, and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)  # 0.25 x 0.8 = 0.2

 

# Scale the data

scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)

X_val = scaler.transform(X_val)

X_test = scaler.transform(X_test)

 

# Create custom dataset class for Iris data

class IrisDataset(Dataset):

    def __init__(self, X, y):

        self.X = torch.tensor(X, dtype=torch.float32)

        self.y = torch.tensor(y, dtype=torch.long)

 

    def __len__(self):

        return len(self.X)

 

    def __getitem__(self, idx):

        return self.X[idx], self.y[idx]

 

# Create data loaders

train_dataset = IrisDataset(X_train, y_train)

val_dataset = IrisDataset(X_val, y_val)

test_dataset = IrisDataset(X_test, y_test)

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) # Updated to use batch_size variable

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False) # Updated to use batch_size variable

 

training_samples_number = len(train_dataset)

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs # Updated to use batch_size variable

 

# homogeneity_learning_rate = (2* learning_rate) / (learning_rate * total_iterations + 1)

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

# Initialize models

model_homogeneity = Net(input_size, hidden_size, output_size).to(device)

model_traditional = Net(input_size, hidden_size, output_size).to(device)

 

average_model = Net(input_size, hidden_size, output_size).to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate, batch_size)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

 

Sample runs:

-------------------------------------------------------

Sample [A-B-IRIS] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 32

  training_samples_number: 90

  total_iterations: 30

 

Training with Homogeneity-driven update:

Total training time: 0.14 seconds

 

Training with traditional backpropagation:

Total training time: 0.09 seconds

 

Final Test Accuracy (Homogeneity-driven): 90.00%

Final Test Accuracy (Traditional): 83.33%

 

 

-------------------------------------------------------

Sample [A-B-IRIS] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 64

  training_samples_number: 90

  total_iterations: 20

 

Training with Homogeneity-driven update:

Total training time: 0.12 seconds

 

Training with traditional backpropagation:

Total training time: 0.06 seconds

 

Final Test Accuracy (Homogeneity-driven): 90.00%

Final Test Accuracy (Traditional): 80.00%


EXPERIMENTS WITH HYBRID DYNAMIC LAMBDAS – MNIST DATASET
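
Before the full script, a minimal sketch of the hybrid schedule (illustrative only; the helper name hybrid_lambda is not part of the script below). The dynamic lambda (t - 1) / t is scaled by a factor alpha that decays linearly from 1 down to beta over the run, so lambda approaches beta instead of saturating at 1; beta = 1 recovers the purely dynamic schedule.

def hybrid_lambda(t, total_iterations, beta=0.999):
    # alpha goes from 1 (t = 1) down to beta (t = total_iterations)
    alpha = (total_iterations - t * (1 - beta) - beta) / (total_iterations - 1)
    # scale the dynamic schedule (t - 1) / t by alpha
    return alpha * (t - 1) / t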

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import DataLoader, random_split

from torchvision import datasets, transforms

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model

class Net(nn.Module):

    def __init__(self):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(28 * 28, 128)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(128, 10)

 

    def forward(self, x):

        x = x.view(-1, 28 * 28)

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to calculate partial derivative of Homogeneity

def calculate_partial_derivative(status, average, homogeneity_lambda, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average > 0)

    condition4 = torch.logical_and(status == 0, average < 0)

    condition5 = status == average

 

    partial_derivative = torch.zeros_like(status)

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) * (1 / D**2) * (D - N)

    partial_derivative[condition2] = (1 - homogeneity_lambda) * (1 / D**2) * (D + N)

    partial_derivative[condition3] = -(1 - homogeneity_lambda) / D

    partial_derivative[condition4] = (1 - homogeneity_lambda) / D

    partial_derivative[condition5] = 0

 

    remaining_indices = torch.logical_not(torch.logical_or(torch.logical_or(torch.logical_or(condition1, condition2),

                                                                           torch.logical_or(condition3, condition4)),

                                                           condition5))

    partial_derivative[remaining_indices] = (1 - homogeneity_lambda) * (1 / D**2) * (

        D * torch.sign(status[remaining_indices] - average[remaining_indices]) -

        N * torch.sign(status[remaining_indices])

    )

 

    return partial_derivative

 

# Function to perform Homogeneity-driven weight update

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, epsilon=1e-8):

    partial_derivative = calculate_partial_derivative(status, average, homogeneity_lambda, epsilon)

    delta = -homogeneity_learning_rate * partial_derivative

    return delta

 

# Function to train the model with Homogeneity-driven update

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average):
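    # Note: this variant reads total_iterations from module scope (it is computed
    # below, before training starts)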

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    homogeneity = 1.0

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # --- Backpropagation Update ---

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

           

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            weights_before_update = status.clone()

 

 #           print("Model weights before homogeneity update (first 5 values of fc1.weight):")

 #           print(model.fc1.weight.data[:5])

 

            similarity = calculate_similarity(status, average)

           

 

           

            beta = 0.999
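
            # Hybrid schedule: alpha decays approximately linearly from 1 at the first
            # iteration towards beta at the last one, so lambda approaches beta rather
            # than saturating at 1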

           

            alpha = (total_iterations - iteration_counter * (1 - beta) - beta) / (total_iterations - 1)

            homogeneity_lambda = alpha * ((iteration_counter - 1) / iteration_counter)

           

           

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

           

           

           

            homogeneity_values.append(homogeneity.item())

 

            delta = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average)

 

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.grad = -update_value

                current_index += param_size

 

            optimizer_homogeneity.step()

            optimizer_homogeneity.zero_grad()

 

   #         print("\nModel weights after homogeneity update (first 5 values of fc1.weight):")

   #        print(model.fc1.weight.data[:5])

 

            weights_after_update = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

            distance = torch.norm(weights_before_update - weights_after_update)

   #         print(f"Distance between weights before and after homogeneity update: {distance.item()}\n")

 

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (

                        epoch * len(train_loader) + batch_idx + 1)

 

            optimizer.zero_grad()

 

   #         print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

   #               f"Backpropagation Update: {total_bp_update:.4f}, Homogeneity Update: {torch.sum(torch.abs(delta)).item():.4f}, "

   #               f"homogeneity_lambda: {homogeneity_lambda:.4f}, similarity: {similarity:.4f}, homogeneity: {homogeneity:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1  # Initialize iteration_counter here

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # print(f"Epoch [{epoch + 1}/{num_epochs}], Iteration [{batch_idx + 1}/{len(train_loader)}], "

            #       f"Backpropagation Update: {total_bp_update:.4f}")

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for images, labels in data_loader:

            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

 

            total += labels.size(0)

            correct += (predicted == labels).sum().item()

 

    return 100 * correct / total

 

# -------------------- Hyperparameters --------------------

num_epochs = 10

learning_rate = 0.1

batch_size = 420

homogeneity_learning_rate = 0.1

 

# Load and split MNIST dataset

transform = transforms.ToTensor()

full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

 

training_samples_number = int(0.7 * len(full_dataset))

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print("\n")

 

train_size = int(0.7 * len(full_dataset))

val_size = int(0.15 * len(full_dataset))

test_size = len(full_dataset) - train_size - val_size

train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

 

# Initialize models

model_homogeneity = Net().to(device)

model_traditional = Net().to(device)

 

average_model = Net().to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

 

Sample runs:

-------------------------------------------------------

Samples [HYBRID-MNIST] 1-10:

-------------------------------------------------------

Final Test Accuracy (Traditional): 92.00%

beta = 0.500  Final Test Accuracy (Homogeneity-driven): 61.28%

beta = 0.800  Final Test Accuracy (Homogeneity-driven): 78.19%

beta = 0.900  Final Test Accuracy (Homogeneity-driven): 86.76%

beta = 0.950  Final Test Accuracy (Homogeneity-driven): 90.48%

beta = 0.980  Final Test Accuracy (Homogeneity-driven): 91.71%

beta = 0.990  Final Test Accuracy (Homogeneity-driven): 91.78%

beta = 0.995  Final Test Accuracy (Homogeneity-driven): 92.06%

beta = 0.999  Final Test Accuracy (Homogeneity-driven): 92.30% 

beta = 1.000   Final Test Accuracy (Homogeneity-driven): 91.66% 

beta = 1.010   Final Test Accuracy (Homogeneity-driven): 91.02%

 

 


 

Part II (Full Implementation)

(Complete option: the homogeneity gradient depends on both the current change and the history part; all lambda options are included in a single script)
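
A minimal sketch of the history-aware gradient used here (illustrative only; the helper name blended_gradient and its arguments are not part of the script below): each per-parameter derivative blends the current similarity term with the gradient kept from the previous iteration, and the blended value is stored as the gradient history for the next iteration.

def blended_gradient(current_term, previous_gradient, homogeneity_lambda):
    # (1 - lambda) weights the current similarity-based term, lambda weights the history
    return (1 - homogeneity_lambda) * current_term + homogeneity_lambda * previous_gradient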

 

 

=========================================

# BEGINNING OF THE CODE

 

import torch

import torch.nn as nn

import torch.optim as optim

from torch.utils.data import Dataset, DataLoader, random_split

from sklearn.datasets import load_iris

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

import numpy as np

import matplotlib.pyplot as plt

import time

import math

 

# Import transforms from torchvision

from torchvision import transforms, datasets

 

# Device configuration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

 

# Define the neural network model

class Net(nn.Module):

    def __init__(self, input_size, hidden_size, output_size):

        super(Net, self).__init__()

        self.fc1 = nn.Linear(input_size, hidden_size)

        self.relu = nn.ReLU()

        self.fc2 = nn.Linear(hidden_size, output_size)

 

    def forward(self, x):

        x = self.fc1(x)

        x = self.relu(x)

        x = self.fc2(x)

        return x

 

# Function to calculate Absolute Difference Similarity (ADS)

def calculate_similarity(status, average, epsilon=1e-8):

    num = torch.sum(torch.abs(status - average))

    den = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

    similarity = 1 - (num / den)

    return similarity

 

# Function to perform Homogeneity-driven weight update (Modified)
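
# In this full implementation the per-parameter derivative mixes the current
# similarity term (weight 1 - homogeneity_lambda) with the gradient carried over
# from the previous iteration (weight homogeneity_lambda); the blended result is
# returned as the new 'gradients' history for the next call.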

def weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, gradients, epsilon=1e-8):

    N = torch.sum(torch.abs(status - average))

    D = epsilon + torch.sum(torch.abs(status) + torch.abs(average))

 

    # Calculate partial derivative based on conditions from Table 1

    partial_derivative = torch.zeros_like(status)

 

    condition1 = torch.logical_or(torch.logical_and(status > average, average > 0),

                                  torch.logical_and(status < average, average < 0))

    condition2 = torch.logical_or(torch.logical_and(status > 0, average < 0),

                                  torch.logical_and(status < 0, average > 0))

    condition3 = torch.logical_and(status == 0, average < 0)

    condition4 = torch.logical_and(status == 0, average > 0)

    condition5 = status == average

 

    partial_derivative[condition1] = (1 - homogeneity_lambda) / (D ** 2) * (D - N) + homogeneity_lambda * gradients[condition1]

    partial_derivative[condition2] = (1 - homogeneity_lambda) / (D ** 2) * (D + N) + homogeneity_lambda * gradients[condition2]

    partial_derivative[condition3] = (1 - homogeneity_lambda) / D + homogeneity_lambda * gradients[condition3]

    partial_derivative[condition4] = -(1 - homogeneity_lambda) / D + homogeneity_lambda * gradients[condition4]

    partial_derivative[condition5] = homogeneity_lambda * gradients[condition5]

 

    # Update gradients for the next iteration

    gradients = partial_derivative.clone()

 

    delta = -homogeneity_learning_rate * partial_derivative

    return delta, gradients

 

# Function to train the model with Homogeneity-driven update (Modified)

def train_homogeneity_driven(model, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size, lambda_type, lambda_value):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    optimizer_homogeneity = optim.Adam(model.parameters(), lr=homogeneity_learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

    homogeneity_values = []

 

    # Initialization (t=0)

    homogeneity = 1.0

    gradients = torch.cat([torch.zeros_like(p.data.flatten()) for p in model.parameters()])

 

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

            data, target = data.to(device), target.to(device)

 

            # 1. Compute Cross-Entropy Loss

            optimizer.zero_grad()

           

            # Flatten MNIST data if needed

            data = data.view(data.size(0), -1)  # Always flatten for MNIST

 

            output = model(data)

            loss = criterion(output, target)

 

            # 2. Update Parameters via Backpropagation

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for monitoring

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            # --- Homogeneity-driven Update ---

            status = torch.cat([p.data.flatten() for p in model.parameters()]).detach()

 

            # Calculate lambda based on lambda_type
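
            #   'fixed'   - lambda is held constant at lambda_value
            #   'linear'  - lambda grows linearly with the iteration count, reaching ~1 at total_iterations
            #   'dynamic' - lambda = (t - 1) / t, rising quickly towards 1 as training proceeds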

            if lambda_type == 'fixed':

                homogeneity_lambda = lambda_value

            elif lambda_type == 'linear':

                homogeneity_lambda = (iteration_counter - 1) / total_iterations

            elif lambda_type == 'dynamic':

                homogeneity_lambda = (iteration_counter - 1) / iteration_counter if iteration_counter > 1 else 0

            else:

                raise ValueError("Invalid lambda_type. Choose from 'fixed', 'linear', or 'dynamic'.")

 

            # 3. Compute Updated Homogeneity (Formula 1 and 3)

            similarity = calculate_similarity(status, average)

            homogeneity = (1 - homogeneity_lambda) * similarity + homogeneity_lambda * homogeneity

            homogeneity_values.append(homogeneity.item())

 

            # 4. Update NN Parameters Using Homogeneity Gradients (Formula 4 and 5)

            delta, gradients = weights_update(status, homogeneity, homogeneity_learning_rate, homogeneity_lambda, average, gradients)

            current_index = 0

            for param in model.parameters():

                param_size = param.nelement()

                update_value = delta[current_index: current_index + param_size].view_as(param.data)

                param.data.add_(-homogeneity_learning_rate * update_value)  # Equation 5

                current_index += param_size

 

            # 5. Update Average Parameter Vector (Formula 2)

            average = (status + (epoch * len(train_loader) + batch_idx) * average) / (epoch * len(train_loader) + batch_idx + 1)

 

            # Clear accumulated gradients after the homogeneity update

            optimizer.zero_grad()

 

            # --- End of Homogeneity-driven Update ---

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

           

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

 

                # Flatten MNIST data if needed

                data = data.view(data.size(0), -1) # Always flatten for MNIST

 

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies, homogeneity_values

 

# Function to train the model with traditional backpropagation

def train_traditional(model, train_loader, val_loader, num_epochs, learning_rate):

    criterion = nn.CrossEntropyLoss()

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

 

    train_losses = []

    val_losses = []

    train_accuracies = []

    val_accuracies = []

 

    total_start_time = time.time()

    iteration_counter = 1

 

    for epoch in range(num_epochs):

        epoch_start_time = time.time()

        model.train()

        epoch_train_loss = 0

        epoch_train_correct = 0

        epoch_train_total = 0

 

        for batch_idx, (data, target) in enumerate(train_loader):

            iteration_counter += 1

 

            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()

           

            # Flatten MNIST data if needed

            data = data.view(data.size(0), -1)  # Always flatten for MNIST

 

            output = model(data)

            loss = criterion(output, target)

            loss.backward()

            optimizer.step()

 

            # Calculate total_bp_update for each iteration

            total_bp_update = 0

            for param in model.parameters():

                if param.grad is not None:

                    total_bp_update += torch.sum(torch.abs(param.grad)).item()

 

            epoch_train_loss += loss.item()

            _, predicted = torch.max(output.data, 1)

            epoch_train_total += target.size(0)

            epoch_train_correct += (predicted == target).sum().item()

 

        epoch_train_loss /= len(train_loader)

        epoch_train_accuracy = 100 * epoch_train_correct / epoch_train_total

        train_losses.append(epoch_train_loss)

        train_accuracies.append(epoch_train_accuracy)

 

        model.eval()

        epoch_val_loss = 0

        epoch_val_correct = 0

        epoch_val_total = 0

 

        with torch.no_grad():

            for data, target in val_loader:

                data, target = data.to(device), target.to(device)

               

                # Flatten MNIST data if needed

                data = data.view(data.size(0), -1)  # Always flatten for MNIST

               

                output = model(data)

                loss = criterion(output, target)

                epoch_val_loss += loss.item()

 

                _, predicted = torch.max(output.data, 1)

                epoch_val_total += target.size(0)

                epoch_val_correct += (predicted == target).sum().item()

 

        epoch_val_loss /= len(val_loader)

        epoch_val_accuracy = 100 * epoch_val_correct / epoch_val_total

        val_losses.append(epoch_val_loss)

        val_accuracies.append(epoch_val_accuracy)

 

        epoch_end_time = time.time()

        epoch_time = epoch_end_time - epoch_start_time

 

        print(f"Epoch [{epoch + 1}/{num_epochs}], "

              f"Train Loss: {epoch_train_loss:.4f}, "

              f"Train Accuracy: {epoch_train_accuracy:.2f}%, "

              f"Val Loss: {epoch_val_loss:.4f}, "

              f"Val Accuracy: {epoch_val_accuracy:.2f}%")

 

        print(f"Epoch processing time: {epoch_time:.2f} seconds")

 

    total_end_time = time.time()

    total_training_time = total_end_time - total_start_time

    print(f"Total training time: {total_training_time:.2f} seconds")

 

    return train_losses, val_losses, train_accuracies, val_accuracies

 

# Function to calculate accuracy (Modified)

def calculate_accuracy(model, data_loader):

    model.eval()

    correct = 0

    total = 0

 

    with torch.no_grad():

        for data, target in data_loader:

            data, target = data.to(device), target.to(device)

           

            # Flatten MNIST data if needed

            data = data.view(data.size(0), -1)  # Always flatten for MNIST

           

            outputs = model(data)

            _, predicted = torch.max(outputs.data, 1)

 

            total += target.size(0)

            correct += (predicted == target).sum().item()

 

    return 100 * correct / total

 

# Hyperparameters

num_epochs = 16

learning_rate = 0.01

batch_size = 210

homogeneity_learning_rate = 0.005

input_size = 28 * 28  # For MNIST

hidden_size = 128

output_size = 10

 

# Homogeneity Hyperparameters

lambda_type = 'dynamic'  # Options: 'fixed', 'linear', 'dynamic'

lambda_value = 0.9  # Default value for lambda if lambda_type is 'fixed'

 

# Load and split MNIST dataset

transform = transforms.ToTensor()

full_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

 

training_samples_number = int(0.7 * len(full_dataset))

total_iterations = math.ceil(training_samples_number / batch_size) * num_epochs

 

print("Hyperparameters and Calculated Values:")

print(f"  num_epochs: {num_epochs}")

print(f"  learning_rate: {learning_rate}")

print(f"  homogeneity_learning_rate: {homogeneity_learning_rate}")

print(f"  batch_size: {batch_size}")

print(f"  training_samples_number: {training_samples_number}")

print(f"  total_iterations: {total_iterations}")

print(f"  lambda_type: {lambda_type}")

print(f"  lambda_value: {lambda_value}")  # Only relevant if lambda_type is 'fixed'

print("\n")

 

train_size = int(0.7 * len(full_dataset))

val_size = int(0.15 * len(full_dataset))

test_size = len(full_dataset) - train_size - val_size

train_dataset, val_dataset, _ = random_split(full_dataset, [train_size, val_size, test_size])

 

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

 

# Initialize models

model_homogeneity = Net(input_size, hidden_size, output_size).to(device)

model_traditional = Net(input_size, hidden_size, output_size).to(device)

 

average_model = Net(input_size, hidden_size, output_size).to(device)

average = torch.cat([p.data.flatten() for p in average_model.parameters()]).detach()

 

# Train the models

print("Training with Homogeneity-driven update:")

results_homogeneity = train_homogeneity_driven(model_homogeneity, train_loader, val_loader, num_epochs, learning_rate, homogeneity_learning_rate, average, total_iterations, batch_size, lambda_type, lambda_value)

 

print("\nTraining with traditional backpropagation:")

results_traditional = train_traditional(model_traditional, train_loader, val_loader, num_epochs, learning_rate)

 

# Evaluate and plot results

train_losses_h, val_losses_h, train_accuracies_h, val_accuracies_h, homogeneity_values_h = results_homogeneity

train_losses_t, val_losses_t, train_accuracies_t, val_accuracies_t = results_traditional

 

print(f"\nFinal Test Accuracy (Homogeneity-driven): {calculate_accuracy(model_homogeneity, test_loader):.2f}%")

print(f"Final Test Accuracy (Traditional): {calculate_accuracy(model_traditional, test_loader):.2f}%")

 

plt.figure(figsize=(10, 5))

plt.plot(train_losses_h, label='Homogeneity-Driven Train Loss')

plt.plot(val_losses_h, label='Homogeneity-Driven Validation Loss')

plt.plot(train_losses_t, label='Traditional Train Loss')

plt.plot(val_losses_t, label='Traditional Validation Loss')

plt.title('Training and Validation Loss')

plt.xlabel('Epoch')

plt.ylabel('Loss')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(train_accuracies_h, label='Homogeneity-Driven Train Accuracy')

plt.plot(val_accuracies_h, label='Homogeneity-Driven Validation Accuracy')

plt.plot(train_accuracies_t, label='Traditional Train Accuracy')

plt.plot(val_accuracies_t, label='Traditional Validation Accuracy')

plt.title('Training and Validation Accuracy')

plt.xlabel('Epoch')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

 

plt.figure(figsize=(10, 5))

plt.plot(homogeneity_values_h, label='Homogeneity')

plt.title('Homogeneity Values over Iterations')

plt.xlabel('Iteration')

plt.ylabel('Homogeneity')

plt.legend()

plt.show()

 

# END OF THE CODE

=========================================

 

Sample runs:

-------------------------------------------------------

Samples [COMPLETE-MNIST] 1:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 10

  learning_rate: 0.01

  homogeneity_learning_rate: 0.001

  batch_size: 210

  training_samples_number: 42000

  total_iterations: 2000

  lambda_type: dynamic

 

Training with Homogeneity-driven update:

Total training time: 90.20 seconds

 

Training with traditional backpropagation:

Total training time: 69.26 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.87%

Final Test Accuracy (Traditional): 96.77%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 2:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 210

  training_samples_number: 42000

  total_iterations: 3200

  lambda_type: dynamic

 

Training with Homogeneity-driven update:

Total training time: 142.09 seconds

 

Training with traditional backpropagation:

Total training time: 112.01 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.07%

Final Test Accuracy (Traditional): 96.66%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 3:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 105

  training_samples_number: 42000

  total_iterations: 6400

  lambda_type: dynamic

 

Training with Homogeneity-driven update:

Total training time: 184.55 seconds

 

Training with traditional backpropagation:

Total training time: 123.41 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.18%

Final Test Accuracy (Traditional): 96.60%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 4:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 105

  training_samples_number: 42000

  total_iterations: 3200

  lambda_type: linear

 

Training with Homogeneity-driven update:

Total training time: 91.41 seconds

 

Training with traditional backpropagation:

Total training time: 61.28 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.60%

Final Test Accuracy (Traditional): 96.38%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 5:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 1050

  training_samples_number: 42000

  total_iterations: 320

  lambda_type: linear

 

Training with Homogeneity-driven update:

Total training time: 53.81 seconds

 

Training with traditional backpropagation:

Total training time: 50.97 seconds

 

Final Test Accuracy (Homogeneity-driven): 97.27%

Final Test Accuracy (Traditional): 97.00%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 6:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 2100

  training_samples_number: 42000

  total_iterations: 160

  lambda_type: linear

 

Training with Homogeneity-driven update:

Total training time: 53.00 seconds

 

Training with traditional backpropagation:

Total training time: 51.18 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.76%

Final Test Accuracy (Traditional): 96.32%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 7:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 2100

  training_samples_number: 42000

  total_iterations: 160

  lambda_type: fixed    lambda_value: 0.9

 

Training with Homogeneity-driven update:

Total training time: 59.20 seconds

 

Training with traditional backpropagation:

Total training time: 55.72 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.67%

Final Test Accuracy (Traditional): 96.52%

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 8:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 2100

  training_samples_number: 42000

  total_iterations: 160

  lambda_type: fixed      lambda_value: 0.5

 

Training with Homogeneity-driven update:

Total training time: 52.22 seconds

 

Training with traditional backpropagation:

Total training time: 51.21 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.80%

Final Test Accuracy (Traditional): 96.73%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 9:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 2100

  training_samples_number: 42000

  total_iterations: 160

  lambda_type: fixed      lambda_value: 0.1

 

Training with Homogeneity-driven update:

Total training time: 53.95 seconds

 

Training with traditional backpropagation:

Total training time: 51.14 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.81%

Final Test Accuracy (Traditional): 96.72%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 10:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 8

  learning_rate: 0.01

  homogeneity_learning_rate: 0.001

  batch_size: 105

  training_samples_number: 42000

  total_iterations: 3200

  lambda_type: fixed     lambda_value: 0.9

 

Training with Homogeneity-driven update:

Total training time: 94.03 seconds

 

Training with traditional backpropagation:

Epoch processing time: 7.23 seconds

Total training time: 61.98 seconds

 

Final Test Accuracy (Homogeneity-driven): 96.57%

Final Test Accuracy (Traditional): 96.57%

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 11:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 21000

  training_samples_number: 42000

  total_iterations: 32

  lambda_type: fixed      lambda_value: 0.9

 

Training with Homogeneity-driven update:

Total training time: 105.42 seconds

 

Training with traditional backpropagation:

Total training time: 106.84 seconds

 

Final Test Accuracy (Homogeneity-driven): 93.55%

Final Test Accuracy (Traditional): 93.02%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 12:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 16

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 5

  training_samples_number: 42000

  total_iterations: 134400

  lambda_type: fixed       lambda_value: 0.9

 

Training with Homogeneity-driven update:

Total training time: 2068.69 seconds

 

Training with traditional backpropagation:

Total training time: 785.92 seconds

 

Final Test Accuracy (Homogeneity-driven): 94.54%

Final Test Accuracy (Traditional): 94.47%

 

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 13:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 4

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 2

  training_samples_number: 42000

  total_iterations: 84000

  lambda_type: fixed      lambda_value: 0.9

 

Training with Homogeneity-driven update:

Total training time: 1199.32 seconds

 

Training with traditional backpropagation:

Total training time: 377.90 seconds

 

Final Test Accuracy (Homogeneity-driven): 92.46%

Final Test Accuracy (Traditional): 90.46%

 

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 14:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 20

  learning_rate: 0.01

  homogeneity_learning_rate: 0.005

  batch_size: 42000

  training_samples_number: 42000

  total_iterations: 20

  lambda_type: fixed       lambda_value: 0.9

 

Training with Homogeneity-driven update:

Total training time: 131.89 seconds

 

Training with traditional backpropagation:

Total training time: 132.32 seconds

 

Final Test Accuracy (Homogeneity-driven): 91.40%

Final Test Accuracy (Traditional): 91.30%

 

 

-------------------------------------------------------

Samples [COMPLETE-MNIST] 15:

-------------------------------------------------------

Hyperparameters and Calculated Values:

  num_epochs: 4

  learning_rate: 0.01

  homogeneity_learning_rate: 0.01

  batch_size: 1

  training_samples_number: 42000

  total_iterations: 168000

  lambda_type: linear

 

Training with Homogeneity-driven update:

Total training time: 2556.08 seconds

 

Training with traditional backpropagation:

Total training time: 833.99 seconds

 

Final Test Accuracy (Homogeneity-driven): 89.85%

Final Test Accuracy (Traditional): 87.74%