
9.6 Neural Networks


Neural networks (NN) are computational models inspired by the way biological neural systems, such as the brain, process information. In this section, we will explore the fundamental concepts behind artificial neural networks (ANNs), their types, and key algorithms used in their learning processes.


1. Biological Neural Networks vs. Artificial Neural Networks (ANN)

Biological Neural Networks

Biological neural networks refer to the complex networks of neurons in the human brain and nervous system that process information. A biological neuron consists of three main components:

  • Dendrites: Receive signals from other neurons.

  • Axon: Transmits signals to other neurons.

  • Synapse: The connection between neurons that enables signal transmission.

Neurons in the brain are highly interconnected and process information through electrical and chemical signals.

Artificial Neural Networks (ANNs)

Artificial neural networks are computational models that simulate the behavior of biological neural networks. They consist of artificial neurons (nodes) connected by edges (synapses) that transmit signals in the form of numerical values. ANNs are primarily used for pattern recognition, classification, and regression tasks.


2. McCulloch-Pitts Neuron

The McCulloch-Pitts neuron, introduced in 1943, is a mathematical model of a simple artificial neuron. It is primarily designed to simulate the working of biological neurons in performing basic logical operations. This model laid the foundation for modern artificial neural networks.

Mathematically, the McCulloch-Pitts neuron can be represented as:

$$\text{Output} = f\left( \sum_{i=1}^{n} x_i w_i - \theta \right)$$

Where:

  • $x_i$ is the input,

  • $w_i$ is the weight,

  • $\theta$ is the threshold,

  • $f$ is the activation function (typically a step function).


Components of the McCulloch-Pitts Neuron

Inputs

  • The neuron accepts a set of binary inputs: x₁, x₂, ..., xₙ, where each input is either 0 or 1.

  • These inputs represent the external signals received by the neuron.

Weights

  • Each input has an associated weight: w₁, w₂, ..., wₙ.

  • Weights determine the importance or influence of each input on the output. A higher weight signifies a stronger influence.

Weighted Sum

  • The neuron calculates the weighted sum of the inputs:

    • $S = \sum_{i=1}^{n} x_i \cdot w_i$

This sum determines the combined effect of all inputs based on their weights.

Threshold (θ)

  • A predefined threshold value, θ, is set.

  • The weighted sum, S, is compared against θ to decide the neuron's output.

Activation Function

A step function is used as the activation function:

$$y = \begin{cases} 1 & \text{if } S \geq \theta \\ 0 & \text{if } S < \theta \end{cases}$$

The output, $y$, is binary (0 or 1).

Mathematical Representation

$$y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i \cdot w_i \geq \theta \\ 0 & \text{otherwise} \end{cases}$$

Where:

  • xᵢ: Input signals (binary values, 0 or 1).

  • wᵢ: Weights of the inputs.

  • θ: Threshold value.

  • y: Output of the neuron (binary, 0 or 1).

Applications

AND Gate

  • Outputs 1 only when all inputs are 1.

  • Example:

    • Inputs: x₁ = 1, x₂ = 1

    • Weights: w₁ = 1, w₂ = 1

    • Threshold: θ = 2

    • Calculation:

      • $S = x_1 \cdot w_1 + x_2 \cdot w_2 = 1 \cdot 1 + 1 \cdot 1 = 2 \geq 2 \implies y = 1$

OR Gate

  • Outputs 1 if at least one input is 1.

  • Example:

    • Inputs: x₁ = 1, x₂ = 0

    • Weights: w₁ = 1, w₂ = 1

    • Threshold: θ = 1

    • Calculation:

      • $S = x_1 \cdot w_1 + x_2 \cdot w_2 = 1 \cdot 1 + 0 \cdot 1 = 1 \geq 1 \implies y = 1$

NOT Gate

  • Outputs the opposite of the input.

  • Example:

    • Input: x₁ = 1

    • Weight: w₁ = -1

    • Threshold: θ = -0.5

    • Calculation:

      • $S = x_1 \cdot w_1 = 1 \cdot (-1) = -1 < -0.5 \implies y = 0$
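
To make the gate examples above concrete, here is a minimal Python sketch of a McCulloch-Pitts neuron, using the same illustrative weights and thresholds as the calculations above:

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of inputs reaches the threshold."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= threshold else 0

# AND gate: both inputs must be 1 (threshold = 2)
print(mcculloch_pitts([1, 1], [1, 1], threshold=2))   # 1
print(mcculloch_pitts([1, 0], [1, 1], threshold=2))   # 0

# OR gate: at least one input is 1 (threshold = 1)
print(mcculloch_pitts([1, 0], [1, 1], threshold=1))   # 1

# NOT gate: negative weight with a negative threshold
print(mcculloch_pitts([1], [-1], threshold=-0.5))     # 0
print(mcculloch_pitts([0], [-1], threshold=-0.5))     # 1
```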

Limitations

  1. Linear Separability: McCulloch-Pitts neurons cannot solve problems that are not linearly separable (e.g., the XOR problem).

  2. Static Model: The neuron lacks learning capabilities; weights and thresholds are fixed and not adjusted through training.

Significance

The McCulloch-Pitts neuron was an early milestone in AI and computational neuroscience. It demonstrated how a network of simple neurons could model complex logical operations, paving the way for modern neural networks and deep learning.


3. Mathematical Model of ANN

The basic unit of an artificial neural network is the artificial neuron, which is modeled mathematically by the following steps:

  1. Inputs: The network receives a set of inputs $x_1, x_2, \dots, x_n$.

  2. Weights: Each input is multiplied by a corresponding weight $w_1, w_2, \dots, w_n$.

  3. Summation: The weighted inputs are summed.

  4. Bias: An additional term called bias is added to the summation to adjust the output independently of the inputs.

  5. Activation Function: The sum is passed through an activation function to produce the output.

Mathematically:

$$\text{Output} = f\left( \sum_{i=1}^{n} x_i w_i + b \right)$$

Where:

  • $x_i$ are the inputs,

  • $w_i$ are the weights,

  • $b$ is the bias,

  • $f$ is the activation function.
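
Putting the five steps together, a minimal Python sketch of a single artificial neuron follows; the sigmoid activation and the input, weight, and bias values are illustrative assumptions, not values from the text:

```python
import math

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through an activation function."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(s)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Example: two inputs with arbitrary weights and bias
output = neuron(inputs=[0.5, 0.8], weights=[0.4, -0.2], bias=0.1, activation=sigmoid)
print(round(output, 4))  # a value between 0 and 1
```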


4. Activation Functions

Activation functions are mathematical functions used to determine the output of a neuron. They introduce non-linearity into the neural network, allowing it to learn complex patterns. Common activation functions include:

  • Step Function: Outputs 1 if the input is greater than a threshold, otherwise 0.

  • Sigmoid Function: Outputs values between 0 and 1, often used in binary classification.

    • $\sigma(x) = \frac{1}{1 + e^{-x}}$

  • Hyperbolic Tangent (tanh): Outputs values between -1 and 1.

    • $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$

  • ReLU (Rectified Linear Unit): Outputs the input directly if it is positive, otherwise zero. It is widely used in modern neural networks.

    • $\text{ReLU}(x) = \max(0, x)$

  • Leaky ReLU: A variant of ReLU that allows small negative values when the input is less than zero.

    • $\text{Leaky ReLU}(x) = \max(0.01x, x)$
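
These definitions translate directly into code. A small sketch (the sample inputs are arbitrary):

```python
import math

def step(x, threshold=0.0):
    return 1.0 if x > threshold else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x):
    return max(0.01 * x, x)

# Evaluate each activation function on a few sample inputs
for f in (step, sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, [round(f(x), 3) for x in (-2.0, 0.5, 2.0)])
```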


5. Architectures of Neural Networks

The architecture of a neural network refers to the arrangement of neurons in layers and the connections between them. Key types of architectures include:

  • Feedforward Neural Networks (FNN): Neurons are arranged in layers, and information moves only in one direction — from input to output.

  • Recurrent Neural Networks (RNN): In RNNs, information can flow in both directions, and they maintain memory of previous computations, making them useful for sequential data like time series or language modeling.

  • Convolutional Neural Networks (CNN): Specialized for processing grid-like data such as images, CNNs include layers that apply convolutions (filters) to extract features from data.


6. The Perceptron

The Perceptron is a single-layer neural network model used for binary classification. It consists of a set of inputs, weights, a bias term, and an activation function. The perceptron algorithm updates weights based on the error between the predicted and actual output, typically using a simple learning rule.

Perceptron Learning Rule:

  • $w_i = w_i + \Delta w_i$

Where:

  • $\Delta w_i = \eta (y - \hat{y}) x_i$

Where:

  • $\eta$ is the learning rate,

  • $y$ is the actual output,

  • $\hat{y}$ is the predicted output,

  • $x_i$ is the input value.

Two essential steps for adjusting weights in a perceptron

Step 1: Calculate the Prediction

Formula: $z = w_1 \cdot x_1 + w_2 \cdot x_2 + b$

If $z \geq 0$, predict 1; otherwise, predict 0.

Step 2: Adjust Weights if Prediction is Wrong

If $y_{\text{true}} \neq y_{\text{pred}}$, update the weights and bias:

Weight Update: $w_i \leftarrow w_i + \eta \cdot (y_{\text{true}} - y_{\text{pred}}) \cdot x_i$

Bias Update: $b \leftarrow b + \eta \cdot (y_{\text{true}} - y_{\text{pred}})$
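
The two steps above are the whole training loop. A minimal sketch that learns the AND function (the learning rate, number of epochs, and zero initialization are illustrative assumptions):

```python
# Training data for the AND function (linearly separable)
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]   # weights
b = 0.0          # bias
eta = 0.1        # learning rate

for epoch in range(20):
    for (x1, x2), y_true in data:
        # Step 1: calculate the prediction
        z = w[0] * x1 + w[1] * x2 + b
        y_pred = 1 if z >= 0 else 0
        # Step 2: adjust weights and bias if the prediction is wrong
        error = y_true - y_pred
        w[0] += eta * error * x1
        w[1] += eta * error * x2
        b    += eta * error

print(w, b)  # a separating line for AND, e.g. positive weights and a negative bias
for (x1, x2), _ in data:
    print((x1, x2), 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0)
```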

Limitations

  • Linear separability: The perceptron can only solve problems where the classes are linearly separable. For example, it can easily solve AND and OR, but it cannot solve XOR.

  • Single-layer: The perceptron is a single-layer neural network, meaning it has no hidden layers. It can only model linear decision boundaries.

Advantages

  • Binary classification: The perceptron is primarily used for binary classification tasks where the goal is to classify inputs into one of two classes (e.g., 0 or 1).

  • Simple problems: It is often used in introductory machine learning tasks or when working with simple, linearly separable data.


7. The Learning Rate

The learning rate is a hyperparameter that controls the magnitude of the weight updates during training. A small learning rate makes the network learn slowly but more precisely, while a large learning rate speeds up learning but risks overshooting optimal solutions.


8. Gradient Descent

Gradient descent is an optimization algorithm used to minimize the loss function in neural networks. It works by iteratively adjusting the weights in the direction that reduces the error.

  • Batch Gradient Descent: Computes the gradient using the entire dataset.

  • Stochastic Gradient Descent (SGD): Computes the gradient using a single data point.

  • Mini-batch Gradient Descent: A compromise between batch and stochastic gradient descent, using a subset of the data.
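
A minimal sketch contrasting the three variants on a one-parameter least-squares problem; the toy data, learning rate, and batch size are illustrative assumptions. The learning rate $\eta$ from the previous section controls the step size here, and a value that is too large would make the updates diverge.

```python
import random

# Toy data: y = 3x plus a little noise; we fit a single weight w by least squares
data = [(x, 3.0 * x + random.uniform(-0.1, 0.1)) for x in [i / 10 for i in range(1, 21)]]

def gradient(batch, w):
    """Gradient of the mean squared error 0.5*(w*x - y)^2 over a batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def train(make_batches, eta=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for batch in make_batches():
            w -= eta * gradient(batch, w)   # move against the gradient
    return w

batch_gd      = lambda: [data]                                           # whole dataset per step
stochastic_gd = lambda: [[p] for p in data]                              # one point per step
minibatch_gd  = lambda: [data[i:i + 5] for i in range(0, len(data), 5)]  # subsets of 5

print(train(batch_gd), train(stochastic_gd), train(minibatch_gd))  # each should end up close to 3.0
```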


9. The Delta Rule

The Delta Rule is used in the perceptron and other neural networks to adjust weights during training. It updates the weights to reduce the difference (error) between the predicted and actual outputs. The update rule is:

$$w_i = w_i + \eta (y - \hat{y}) x_i$$

Where:

  • $\eta$ is the learning rate,

  • $y$ is the target output,

  • $\hat{y}$ is the predicted output,

  • $x_i$ is the input value.

10. Hebbian Learning

Hebbian learning is a learning principle based on the idea that if two neurons are activated together, their connection (synapse) is strengthened. This is often summarized by the phrase "cells that fire together, wire together."

The update rule for Hebbian learning is:

$$\Delta w = \eta x_i y$$

Where:

  • $\eta$ is the learning rate,

  • $x_i$ is the input,

  • $y$ is the output.
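
A minimal sketch of the Hebbian update (the activity pairs and learning rate are illustrative): the weight grows only when the input and the output are active together.

```python
eta = 0.1
w = 0.0

# Pairs of (input, output) activity; the weight strengthens only when both are active
for x_i, y in [(1, 1), (1, 1), (0, 1), (1, 0)]:
    w += eta * x_i * y   # Hebbian rule: delta_w = eta * x_i * y
    print(w)             # 0.1, 0.2, 0.2, 0.2
```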

11. Adaline Network

The ADALINE (Adaptive Linear Neuron) network is a type of neural network similar to the perceptron but with continuous activation functions. It uses the least mean squares (LMS) rule for training, which minimizes the squared error between the actual and predicted outputs.
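
A minimal sketch of ADALINE trained with the LMS (delta) rule described above: unlike the perceptron, the update uses the raw linear output rather than the thresholded prediction. The data, learning rate, and the 0.5 decision threshold are illustrative assumptions.

```python
# Learn the AND function with a linear unit trained by the LMS / delta rule
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]
b = 0.0
eta = 0.1

for epoch in range(100):
    for (x1, x2), y in data:
        y_hat = w[0] * x1 + w[1] * x2 + b        # linear (continuous) output
        error = y - y_hat                        # delta rule uses the raw error
        w[0] += eta * error * x1
        w[1] += eta * error * x2
        b    += eta * error

# Classify by thresholding the trained linear output at 0.5
for (x1, x2), y in data:
    z = w[0] * x1 + w[1] * x2 + b
    print((x1, x2), y, 1 if z >= 0.5 else 0)    # target and prediction should agree
```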


12. Multilayer Perceptron Neural Networks (MLP)

The Multilayer Perceptron (MLP) is a type of feedforward neural network that consists of multiple layers of neurons. It is a powerful model used for learning complex patterns, especially non-linear ones. Here's a breakdown of its key features:

Key Features of MLP:

  1. Layers:

    • Input Layer: This is where the data is fed into the network.

    • Hidden Layers: These are the intermediate layers between the input and output. An MLP can have one or more hidden layers. The neurons in these layers process the input data and learn complex patterns.

    • Output Layer: This layer produces the final output, which can be a classification, prediction, etc.

  2. Neurons: Each layer contains multiple neurons (also called nodes). Each neuron in one layer is connected to every neuron in the next layer, creating a fully connected network.

  3. Activation Functions: MLP uses non-linear activation functions (such as ReLU, Sigmoid, or Tanh) in hidden layers to enable learning of non-linear relationships in the data.

  4. Training using Backpropagation:

    • The training process of an MLP involves the use of an algorithm called backpropagation. Backpropagation helps the network adjust its weights by calculating the error (the difference between the predicted output and the actual target) and propagating it backward through the network to update the weights.

    • Gradient Descent is commonly used to minimize the error and find the optimal weights.

Why MLP is Powerful:

  • Non-Linear Learning: With multiple hidden layers and non-linear activation functions, MLPs are capable of learning complex, non-linear patterns that simpler models (like perceptrons) cannot.

  • Versatile: MLPs can be used for various tasks such as classification, regression, and time series prediction.

Example of MLP in Action:

  • Input: A set of features (e.g., images, text, or numerical data).

  • Hidden Layers: The data is processed through multiple layers where the model learns to extract relevant features and patterns.

  • Output: The final output could be a classification label (e.g., "cat" or "dog") or a numerical prediction.

MLP is a type of neural network that uses multiple layers to learn non-linear patterns in data and is trained using backpropagation.
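
A minimal sketch of one forward pass through an MLP with a single hidden layer, using NumPy; the layer sizes, random weights, and the choice of ReLU are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: 3 inputs -> 4 hidden neurons -> 2 outputs (illustrative)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

def relu(z):
    return np.maximum(0.0, z)

def forward(x):
    """Forward pass: input layer -> hidden layer (ReLU) -> output layer."""
    h = relu(x @ W1 + b1)      # hidden layer: fully connected + non-linear activation
    return h @ W2 + b2         # output layer (raw scores)

x = np.array([0.2, -1.0, 0.5])  # one example with 3 features
print(forward(x))               # 2 output values
```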


13. Backpropagation Algorithm

The backpropagation algorithm is a key method for training neural networks, particularly multi-layer neural networks like Multilayer Perceptrons (MLP). It helps the network learn by adjusting its weights based on the error in the predictions. Here's how it works in two main steps:

1. Forward Pass

  • Input Data: The input data is fed into the network through the input layer.

  • Pass Through Layers: The data passes through the hidden layers where each neuron processes the data using weights and an activation function.

  • Output Calculation: The final output is produced in the output layer. This output is a prediction made by the neural network based on the input data.

In this step, the network computes the predicted output but doesn't adjust the weights yet. The output is then compared to the true target (label) to calculate the error (loss).

2. Backward Pass (Backpropagation)

  • Calculate Error: First, calculate the error (or loss), which is the difference between the predicted output and the actual target. A common loss function is mean squared error (MSE) for regression or cross-entropy for classification.

  • Apply Chain Rule: To update the weights, backpropagation uses the chain rule of calculus to compute the gradient of the loss function with respect to each weight. This tells us how much each weight contributed to the error. The gradient is calculated layer by layer, starting from the output layer and moving backward through the hidden layers.

  • Gradient Calculation: For each weight $w_i$, calculate the derivative of the loss function with respect to $w_i$, using the chain rule to propagate errors backward through the layers.

  • Update Weights: Using gradient descent, the weights are updated to minimize the error. The update rule for each weight is:

    $$w_i \leftarrow w_i - \eta \cdot \frac{\partial \text{Loss}}{\partial w_i}$$

    where:

  • $\eta$ is the learning rate, which determines the step size for each weight update.

  • $\frac{\partial \text{Loss}}{\partial w_i}$ is the gradient of the loss function with respect to the weight $w_i$, which tells how much to change the weight to reduce the error.
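
A minimal sketch of the two passes for a tiny one-hidden-layer network trained on XOR with sigmoid units and squared-error loss; the architecture, learning rate, and epoch count are illustrative assumptions, not a prescribed recipe:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR data: not linearly separable, so a hidden layer is required
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden (4 units)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
eta = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):
    # 1. Forward pass: compute the prediction
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # 2. Backward pass: chain rule from the output layer back toward the input
    # Loss = 0.5 * sum((y_hat - Y)^2); derivative through each sigmoid is s*(1-s)
    delta2 = (y_hat - Y) * y_hat * (1 - y_hat)   # error signal at the output layer
    delta1 = (delta2 @ W2.T) * h * (1 - h)       # error propagated to the hidden layer

    # Gradient descent updates: w <- w - eta * dLoss/dw
    W2 -= eta * h.T @ delta2
    b2 -= eta * delta2.sum(axis=0)
    W1 -= eta * X.T @ delta1
    b1 -= eta * delta1.sum(axis=0)

print(np.round(y_hat, 2))  # typically close to [[0], [1], [1], [0]]; depends on the random initialization
```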


14. Hopfield Neural Network

The Hopfield network is a type of recurrent neural network used for associative memory. It is capable of storing and recalling binary patterns. The network converges to a stable state, which corresponds to the closest stored pattern.

Key Differences between Multilayer Perceptron Neural Network & Hopfield Neural Network:

  • Architecture: Hopfield networks are recurrent (with feedback loops), while MLPs are feedforward (no feedback).

  • Training: Hopfield networks use an energy minimization approach, while MLPs use backpropagation.

  • Purpose: Hopfield is used for pattern storage and retrieval, whereas MLP is used for more general tasks like classification and regression.
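
A minimal sketch of a Hopfield network that stores two bipolar (+1/-1) patterns with the Hebbian outer-product rule and recalls one of them from a corrupted cue; the patterns and update schedule are illustrative assumptions:

```python
import numpy as np

# Store bipolar (+1/-1) patterns with the Hebbian outer-product rule
patterns = np.array([[ 1, -1,  1, -1,  1, -1],
                     [ 1,  1,  1, -1, -1, -1]])
n = patterns.shape[1]
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)          # no self-connections

def recall(state, steps=20):
    """Asynchronously update neurons until the state settles into a stored pattern."""
    state = state.copy()
    for _ in range(steps):
        for i in range(n):                      # update one neuron at a time
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

# Start from a corrupted version of the first pattern (one flipped bit)
noisy = patterns[0].copy()
noisy[2] *= -1
print(recall(noisy))            # converges back to [ 1 -1  1 -1  1 -1]
```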


Conclusion

Neural networks are a powerful class of models inspired by biological systems, and they play a fundamental role in AI. From simple perceptrons to deep multi-layer networks, these models use various algorithms like backpropagation and Hebbian learning to adjust weights and improve their performance. Understanding the core concepts like activation functions, gradient descent, and the delta rule is essential for building and training effective neural networks.
