Introduction to nerual network —— 把“神经网络” 拉下“神坛”
Introduction to nerual network
基本上玩过数模或者模式识别的研究生就会知道神经网络这个新兴的技术,一般人都会觉得“很神”,不知道具体原理。当然多数人也就只会在matlab里面调用API而已,真正神经网络是什么,怎么实现的...就呵呵了.
仅有基础数据类型元素,没有过多的抽象的C语言实现的时候,才是真正把逻辑思维转化为可用,同时也是对自身逻辑思维的检测。别人已经帮你实现了的功能,提供API给你调用的时候,那不是你学会了,你只是会调用而已。
不能自欺欺人. 实干兴邦,空谈误国。
-----------------------------------------------------------------------------------------------------------------------------------------------
题外话:
Matlab的神经网络工具箱使得神经网络得到大力的推广,得到更多的人关注与研究使用. 然而,很多人也就是简单调用matlab的API而已,别人高手已经帮你写好了所有相关的处理函数,自己调用一下API就是了,有时候甚至都不用写代码。直接打开matlab工具箱就能用.
-------------------------------------------------------------------------------------------------------------------------------------------------
建议先跳到最后瞥一眼实现代码,然后再从这里看 :)
一 . 什么是神经网络?
Humans and other animals process information with neural networks. These are formed from trillions of neurons (nerve cells) exchanging brief electrical pulses called action potentials. Computer algorithms that
mimic these biological structures are formally called artificial neural networks to distinguish them from the squishy things inside of animals. However, most scientists and engineers are not this formal and use the term neural network to
include both biological and nonbiological systems.
This neural network is formed in three layers, called the input layer,hidden layer, and
output layer. Each layer consists of one or more nodes, represented in this diagram by the small circles. The lines between the nodes indicate the flow of information from one node to the next. In this particular type of neural
network, the information flows only from the input to the output (that is, from left-to-right). Other types of neural networks have more intricate connections, such as feedback paths.
神经网络其实就是 “非线性方程组基于随机概率的暴力近似求解”,这是我的理解。至于为什么,单扯理论是讲不明白的,来个实际的例子会好很多.
二. 神经网络的简单实现——检测数字
这里又要感谢开明的washington university。我的程序根据他们的demo改变而来,原来他们的程序测试输入单一,我对程序进行了简单的改进,希望能更好的阐述nerual network :)
这里demo要求实现的功能就是允许用户输入这样的10*5的点阵(大写字母O以及空格组成),图中例子为数字0 和数字1数2的点整表示形式,计算机原本是不会识别这样的点阵的,就像计算机原本无法识别人脸。这些点阵的排布几乎不能用数学方程式表示出来,是典型的非线性模型.
首先我们对我们的对象——神经网络,进行数据抽象.
需要说明的是,这里仅仅是介绍神经网络的实现,使用的是最最简单的神经网络,没有中间层,
只有输入(InputLayyer)输出(OutputLayyer)两层。 INT REAL 分别是int 和double类型的数据(不得不说对于C的抽象封装来说typedef简直棒呆~)
typedef struct { /* A LAYER OF A NET: */ INT Units; /* - number of units in this layer */ REAL* Activation; /* - activation of ith unit */ INT* Output; /* - output of ith unit */ REAL* Error; /* - error term of ith unit */ REAL** Weight; /* - connection weights to ith unit */ } LAYER; typedef struct { /* A NET: */ LAYER* InputLayer; /* - input layer */ LAYER* OutputLayer; /* - output layer */ REAL Eta; /* - learning rate */ REAL Error; /* - total net error */ REAL Epsilon; /* - net error to terminate training */ } NET;
这个实现的层和网络的各个子成员后面还会又介绍.
这里是基于随机概率产生某个系数的函数实现,比方说RandomEqualINT(),这个函数就是随机的产生一个位于[ Low,High ]之间的整数,而RandEqualREAL则是产生一个实数(double类型嘛~)
void InitializeRandoms() { srand(4711); } INT RandomEqualINT(INT Low, INT High) { return rand() % (High-Low+1) + Low; } REAL RandomEqualREAL(REAL Low, REAL High) { return ((REAL) rand() / RAND_MAX) * (High-Low) + Low; }
Pattern数组定义了NUM_DATA ( 10 )个“二维点阵”,用来表示象形的阿拉伯数字1~9
宏定义X Y分别是这里点阵的列数和行数,N是一个点阵内所有的点的数目
网络是X×Y(35)个元素输入,M(10)个元素输出.
InitializeApplication 实现了把二维向量转化为一维向量的功能
(就是把二维的点阵转化到一维去,比方说
0 1 0
0 1 0
0 1 0
这个点阵转换成 0 1 0 0 1 0 0 1 0
)
如果Pattern 点阵里面是储存的字符‘O’ 就把全局变量Input数组对应的位设置成HI (1) 否则设置成LO(-1)
这里有NUM_DATA个点阵,于是转化成了十个一维向量,组合起来就是全局变量Input
void InitializeApplication(NET* Net) { INT n,i,j; Net->Eta = 0.001; Net->Epsilon = 0.0001; for (n=0; n<NUM_DATA; n++) { for (i=0; i<Y; i++) { for (j=0; j<X; j++) { Input[n][i*X+j] = (Pattern[n][i][j] == 'O') ? HI : LO; } } } f = fopen("ADALINE.txt", "w"); }
RandomWeights用于初始化输出层的权重. 取-0.5到0.5之间是随机数,这里就是随机初始化而已. Don‘t panic :)
void RandomWeights(NET* Net) { INT i,j; for (i=1; i<=Net->OutputLayer->Units; i++) { for (j=0; j<=Net->InputLayer->Units; j++) { Net->OutputLayer->Weight[i][j] = RandomEqualREAL(-0.5, 0.5); } } }
介绍了部分函数,接下来介绍main函数的主要的部分,借此进而介绍其他没有介绍的函数. 遇到问题,解决问题.
可以看到之里有个BOOL类型的变量Stop,每次循环开始都会被赋值为TRUE(1).默认是要while停止循环的
while(NOT Stop)。NOT 是 !的宏定义. main函数内部的局部变量Error用于记录全网络的误差
一旦网络误差Net.Error小于网络求解精度要求Net.Epsilon,那么就停止训练,否则一直训练.
第一个for循环调用了SimulateNet(),注意这里 BOOL变量Traning 和Protocoling被置为FALSE(0)
void SimulateNet(NET* Net, INT* Input, INT* Target, BOOL Training, BOOL Protocoling) { INT Output[M]; SetInput(Net, Input, Protocoling); PropagateNet(Net); GetOutput(Net, Output, Protocoling); ComputeOutputError(Net, Target); if (Training) AdjustWeights(Net); }
调用了三个函数
SetInput(), 还是要记得,这里我们在第一个for循环的时候,Protocoling是FALSE.
这里把当前输入向量Input_current_vector,作为当前网络的输入层的输出。
void SetInput(NET* Net, INT* Input_current_vector, BOOL Protocoling) { INT i; for (i=1; i<=Net->InputLayer->Units; i++) { Net->InputLayer->Output[i] = Input_current_vector[i-1]; } if (Protocoling) { WriteInput(Net, Input_current_vector); } }
Net->OutputLayer->Weight[i][j] 这是随机初始化得来的权重(后面会根据偏差进行不断的修正,直达达到误差精度要求).
Propagate(), 对把每个输入进行不同权重的组合相加,然后得到一个值Sum,大于0,取HI,小于0,取LO赋值给输出层的输出.
void PropagateNet(NET* Net) { INT i,j; REAL Sum; for (i=1; i<=Net->OutputLayer->Units; i++) { Sum = 0; for (j=0; j<=Net->InputLayer->Units; j++) { Sum += Net->OutputLayer->Weight[i][j] * Net->InputLayer->Output[j]; } Net->OutputLayer->Activation[i] = Sum; if (Sum >= 0) Net->OutputLayer->Output[i] = HI; else Net->OutputLayer->Output[i] = LO; } }
把当前网络输出Net->OutputLayer->Output输出到输出向量Output_current_vector当中.
GetOutput().
void GetOutput(NET* Net, INT* Output_current_vector, BOOL Protocoling) { INT i; for (i=1; i<=Net->OutputLayer->Units; i++) { Output_current_vector[i-1] = Net->OutputLayer->Output[i]; } if (Protocoling) { WriteOutput(Net, Output_current_vector); } }
ComputeOutputError()把输出层的输出和Target(训练目标,这里是矩阵Output)做差值,作为Err和输出层的Error[i]
网络的Error被更新为 原有误差 + 0.5×sqr(Err)。
void ComputeOutputError(NET* Net, INT* Target) { INT i; REAL Err; Net->Error = 0; for (i=1; i<=Net->OutputLayer->Units; i++) { Err = Target[i-1] - Net->OutputLayer->Activation[i]; Net->OutputLayer->Error[i] = Err; Net->Error += 0.5 * sqr(Err); } }
接着main函数内部第二层for循环对网络进行训练,注意这里SImulateNet函数的第四个参数Traning变成了TURE
for (m=0; m<10*NUM_DATA; m++) { n = RandomEqualINT(0, NUM_DATA-1); SimulateNet(&Net, Input[n], Output[n], TRUE, FALSE); }
于是会调用函数AdjustWeights(),这才是反馈的过程 调整输出层的权重,具体方法是当前权重加上输出偏差Error[i]*输入层的输出Output[i]×精度Net->Eta.
可能会又疑问,为什么这么干?
void AdjustWeights(NET* Net) { INT i,j; INT Out; REAL Err; for (i=1; i<=Net->OutputLayer->Units; i++) { for (j=0; j<=Net->InputLayer->Units; j++) { Out = Net->InputLayer->Output[j]; Err = Net->OutputLayer->Error[i]; Net->OutputLayer->Weight[i][j] += Net->Eta * Err * Out; } } }
回想输出层的Error[i]如果大于0标志着,当前输出小于期望输出,如果小于0,反之。
就说明要加大该输入点j对应的输出点i的权重,以缩小实际输出和期望的差值(误差从大于0到趋近于0).
同理,如果当前输出层的输出比期望输出要大,那么输出层的Error[i]就是个负数,以减小对应节点间的权重,缩小误差.(误差从小于0趋向于0)
AHa~框架就是这样,剩下的部分都是测试,基本上把主要重要的代码都讲清楚了.接下来我们可以开心的玩测试了.
我故意对输入造成一定的误差(点阵数字都是有标准定义的Parttern数组),得到的检测结果,其实感觉效果还蛮好的,嘿嘿,具有一定的容错性。
毕竟都没有中间层,神经网络的鲁棒性还没有发挥出来,神经网络的中间层越多,鲁棒性就越强,同时计算量会增大,时间消耗加大.
下面是我经过修改过的代码:
/********************************************************* Code writer : EOF Code date : 2014.10.24 Code file : adaline_netwrok.c e-mail: : [email protected] Code Description: This program is a demo for beginner to understand nerual network. If there is something wrong with my code, please touch me by e-mail. **********************************************************/ #include <stdlib.h> #include <stdio.h> typedef int BOOL; typedef char CHAR; typedef int INT; typedef double REAL; #define FALSE 0 #define TRUE 1 #define NOT ! #define AND && #define OR || #define MIN(x,y) ((x)<(y) ? (x) : (y)) #define MAX(x,y) ((x)>(y) ? (x) : (y)) #define LO -1 #define HI +1 #define BIAS 1 #define sqr(x) ((x)*(x)) typedef struct { /* A LAYER OF A NET: */ INT Units; /* - number of units in this layer */ REAL* Activation; /* - activation of ith unit */ INT* Output; /* - output of ith unit */ REAL* Error; /* - error term of ith unit */ REAL** Weight; /* - connection weights to ith unit */ } LAYER; typedef struct { /* A NET: */ LAYER* InputLayer; /* - input layer */ LAYER* OutputLayer; /* - output layer */ REAL Eta; /* - learning rate */ REAL Error; /* - total net error */ REAL Epsilon; /* - net error to terminate training */ } NET; /****************************************************************************** R A N D O M S D R A W N F R O M D I S T R I B U T I O N S ******************************************************************************/ void InitializeRandoms() { srand(4711); } INT RandomEqualINT(INT Low, INT High) { return rand() % (High-Low+1) + Low; } REAL RandomEqualREAL(REAL Low, REAL High) { return ((REAL) rand() / RAND_MAX) * (High-Low) + Low; } /****************************************************************************** A P P L I C A T I O N - S P E C I F I C C O D E ******************************************************************************/ #define NUM_DATA 10 #define X 5 #define Y 7 #define N (X * Y) #define M 10 CHAR Pattern[NUM_DATA][Y][X] = { { " OOO ", "O O", "O O", "O O", "O O", "O O", " OOO " }, { " O ", " OO ", "O O ", " O ", " O ", " O ", " O " }, { " OOO ", "O O", " O", " O ", " O ", " O ", "OOOOO" }, { " OOO ", "O O", " O", " OOO ", " O", "O O", " OOO " }, { " O ", " OO ", " O O ", "O O ", "OOOOO", " O ", " O " }, { "OOOOO", "O ", "O ", "OOOO ", " O", "O O", " OOO " }, { " OOO ", "O O", "O ", "OOOO ", "O O", "O O", " OOO " }, { "OOOOO", " O", " O", " O ", " O ", " O ", "O " }, { " OOO ", "O O", "O O", " OOO ", "O O", "O O", " OOO " }, { " OOO ", "O O", "O O", " OOOO", " O", "O O", " OOO " } }; CHAR Pattern_for_testing[NUM_DATA][Y][X] = { { " OOO ", " ", " O", " O", " O", " O", " O " }, { " O ", " O ", " O ", " O ", " O ", " O ", " O "}, { " OOO ", "O O", " O", " O ", " O ", " O ", "OOOOO" }, { " OOO ", "O O", "O O", "OOOOO", " O", "O O", " OOO " }, { " O ", " OO ", " O O ", "O O ", "OOOOO", " ", " " }, { "OOOOO", "O ", "O ", "OOOOO", "O O", "O O", "OOOOO" }, { " OOO ", "O O", "O ", "OOOO ", "O O", "O O", " OOO " }, { "OOOOO", " O", " O", " O ", " O ", " O ", "O " }, { " OOO ", "O O", "O O", " ", "O O", "O O", " OOO " }, { " OOO ", "O O", "O O", "O O", "O O", "O O", " OOO " } }; INT Input [NUM_DATA][N]; INT Input_for_testing [NUM_DATA][N];//EOF added INT Output[NUM_DATA][M] = { {HI, LO, LO, LO, LO, LO, LO, LO, LO, LO}, {LO, HI, LO, LO, LO, LO, LO, LO, LO, LO}, {LO, LO, HI, LO, LO, LO, LO, LO, LO, LO}, {LO, LO, LO, HI, LO, LO, LO, LO, LO, LO}, {LO, LO, LO, LO, HI, LO, LO, LO, LO, LO}, {LO, LO, LO, LO, LO, HI, LO, LO, LO, LO}, {LO, LO, LO, LO, LO, LO, HI, LO, LO, LO}, {LO, LO, LO, LO, LO, LO, LO, HI, LO, LO}, {LO, LO, LO, LO, LO, LO, LO, LO, HI, LO}, {LO, LO, LO, LO, LO, LO, LO, LO, LO, HI} }; FILE* f; void InitializeApplication(NET* Net) { INT n,i,j; Net->Eta = 0.001; Net->Epsilon = 0.0001; for (n=0; n<NUM_DATA; n++) { for (i=0; i<Y; i++) { for (j=0; j<X; j++) { Input[n][i*X+j] = (Pattern[n][i][j] == 'O') ? HI : LO; /* ** EOF added */ Input_for_testing[n][i*X+j] = (Pattern_for_testing[n][i][j] == 'O') ? HI : LO; } } } f = fopen("ADALINE.txt", "w"); } void WriteInput(NET* Net, INT* Input_current_vector) { INT i; for (i=0; i<N; i++) { if (i%X == 0) { fprintf(f, "\n"); } fprintf(f, "%c", (Input_current_vector[i] == HI) ? 'O' : ' '); } fprintf(f, " -> "); } void WriteOutput(NET* Net, INT* Output_current_vector) { INT i; INT Count, Index; Count = 0; for (i=0; i<M; i++) { if (Output_current_vector[i] == HI) { Count++; Index = i; } } if (Count == 1) fprintf(f, "%i\n", Index); else fprintf(f, "%s\n", "invalid"); } void FinalizeApplication(NET* Net) { fclose(f); } /****************************************************************************** I N I T I A L I Z A T I O N ******************************************************************************/ void GenerateNetwork(NET* Net) { INT i; Net->InputLayer = (LAYER*) malloc(sizeof(LAYER)); Net->OutputLayer = (LAYER*) malloc(sizeof(LAYER)); Net->InputLayer->Units = N; Net->InputLayer->Output = (INT*) calloc(N+1, sizeof(INT)); Net->InputLayer->Output[0] = BIAS; Net->OutputLayer->Units = M; Net->OutputLayer->Activation = (REAL*) calloc(M+1, sizeof(REAL)); Net->OutputLayer->Output = (INT*) calloc(M+1, sizeof(INT)); Net->OutputLayer->Error = (REAL*) calloc(M+1, sizeof(REAL)); Net->OutputLayer->Weight = (REAL**) calloc(M+1, sizeof(REAL*)); for (i=1; i<=M; i++) { Net->OutputLayer->Weight[i] = (REAL*) calloc(N+1, sizeof(REAL)); } Net->Eta = 0.1; Net->Epsilon = 0.01; } void RandomWeights(NET* Net) { INT i,j; for (i=1; i<=Net->OutputLayer->Units; i++) { for (j=0; j<=Net->InputLayer->Units; j++) { Net->OutputLayer->Weight[i][j] = RandomEqualREAL(-0.5, 0.5); } } } void SetInput(NET* Net, INT* Input_current_vector, BOOL Protocoling) { INT i; for (i=1; i<=Net->InputLayer->Units; i++) { Net->InputLayer->Output[i] = Input_current_vector[i-1]; } if (Protocoling) { WriteInput(Net, Input_current_vector); } } void GetOutput(NET* Net, INT* Output_current_vector, BOOL Protocoling) { INT i; for (i=1; i<=Net->OutputLayer->Units; i++) { Output_current_vector[i-1] = Net->OutputLayer->Output[i]; } if (Protocoling) { WriteOutput(Net, Output_current_vector); } } /****************************************************************************** P R O P A G A T I N G S I G N A L S ******************************************************************************/ void PropagateNet(NET* Net) { INT i,j; REAL Sum; for (i=1; i<=Net->OutputLayer->Units; i++) { Sum = 0; for (j=0; j<=Net->InputLayer->Units; j++) { Sum += Net->OutputLayer->Weight[i][j] * Net->InputLayer->Output[j]; } Net->OutputLayer->Activation[i] = Sum; if (Sum >= 0) Net->OutputLayer->Output[i] = HI; else Net->OutputLayer->Output[i] = LO; } } /****************************************************************************** A D J U S T I N G W E I G H T S ******************************************************************************/ void ComputeOutputError(NET* Net, INT* Target) { INT i; REAL Err; Net->Error = 0; for (i=1; i<=Net->OutputLayer->Units; i++) { Err = Target[i-1] - Net->OutputLayer->Activation[i]; Net->OutputLayer->Error[i] = Err; Net->Error += 0.5 * sqr(Err); } } void AdjustWeights(NET* Net) { INT i,j; INT Out; REAL Err; for (i=1; i<=Net->OutputLayer->Units; i++) { for (j=0; j<=Net->InputLayer->Units; j++) { Out = Net->InputLayer->Output[j]; Err = Net->OutputLayer->Error[i]; Net->OutputLayer->Weight[i][j] += Net->Eta * Err * Out; } } } /****************************************************************************** S I M U L A T I N G T H E N E T ******************************************************************************/ void SimulateNet(NET* Net, INT* Input, INT* Target, BOOL Training, BOOL Protocoling) { INT Output[M]; SetInput(Net, Input, Protocoling); PropagateNet(Net); GetOutput(Net, Output, Protocoling); ComputeOutputError(Net, Target); if (Training) AdjustWeights(Net); } /****************************************************************************** M A I N ******************************************************************************/ void main() { NET Net; REAL Error; BOOL Stop; INT n,m; InitializeRandoms(); GenerateNetwork(&Net); RandomWeights(&Net); InitializeApplication(&Net); do { Error = 0; Stop = TRUE; for (n=0; n<NUM_DATA; n++) { SimulateNet(&Net, Input[n], Output[n], FALSE, FALSE); Error = MAX(Error, Net.Error); Stop = Stop AND (Net.Error < Net.Epsilon); } Error = MAX(Error, Net.Epsilon); printf("Training %0.0f%% completed ...\n", (Net.Epsilon / Error) * 100); if (NOT Stop) { for (m=0; m<10*NUM_DATA; m++) { n = RandomEqualINT(0, NUM_DATA-1); SimulateNet(&Net, Input[n], Output[n], TRUE, FALSE); } } } while (NOT Stop); for (n=0; n<NUM_DATA; n++) { SimulateNet(&Net, Input_for_testing[n], Output[n], FALSE, TRUE); } FinalizeApplication(&Net); }
郑重声明:本站内容如果来自互联网及其他传播媒体,其版权均属原媒体及文章作者所有。转载目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,也不构成任何其他建议。