In the previous part we covered how to generate simple synthetic license plates; in this part we will use PaddlePaddle to recognize the plates we generated.
Reading the Data
When generating the plates in the previous part, we can produce the training data and the test data separately, as follows (the full code is available in the author's repository; see the references):
# Write the generated plate images to a folder and the corresponding labels to label.txt
# (requires os and cv2; G is the plate generator built in the previous part)
def genBatch(self, batchSize, pos, charRange, outputPath, size):
    if not os.path.exists(outputPath):
        os.mkdir(outputPath)
    outfile = open('label.txt', 'w')
    for i in xrange(batchSize):
        plateStr, plate = G.genPlateString(-1, -1)
        print plateStr, plate
        img = G.generate(plateStr)
        img = cv2.resize(img, size)
        cv2.imwrite(outputPath + "/" + str(i).zfill(2) + ".jpg", img)
        outfile.write(str(plate) + "\n")
    outfile.close()
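The training code later in this section expects the data as NumPy arrays (X_train, Y_train, X_val, Y_val). Below is a minimal loading sketch, assuming each line of label.txt holds a single integer label per image; the load_dataset name and the folder/label-file names are only illustrative and should be adapted to how genPlateString actually encodes the label.

import os
import cv2
import numpy as np

def load_dataset(imageDir, labelFile):
    # Read every generated plate image as grayscale and flatten it to a vector
    images = []
    for name in sorted(os.listdir(imageDir)):
        img = cv2.imread(os.path.join(imageDir, name), cv2.IMREAD_GRAYSCALE)
        images.append(img.astype('float32').flatten() / 255.0)  # scale pixels to [0, 1]
    labels = [int(line.strip()) for line in open(labelFile)]
    return np.array(images), np.array(labels)

X_train, Y_train = load_dataset('train_data', 'train_label.txt')
X_val, Y_val = load_dataset('val_data', 'val_label.txt')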
Once the data is generated, we write a reader to load it (reador.py):
def reader_creator(data, label):
    def reader():
        for i in xrange(len(data)):
            yield data[i, :], int(label[i])
    return reader
When feeding the model, we call paddle.batch to shuffle the data and feed it in batches:
# Training data reader: shuffle, then batch
train_reader = paddle.batch(paddle.reader.shuffle(
    reador.reader_creator(X_train, Y_train), buf_size=200),
    batch_size=16)

# Validation data reader
val_reader = paddle.batch(paddle.reader.shuffle(
    reador.reader_creator(X_val, Y_val), buf_size=200),
    batch_size=16)

trainer.train(reader=train_reader, num_passes=20, event_handler=event_handler)
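Before starting a long training run, it can be worth pulling a single batch from the reader to confirm the shapes are what the network expects. A small, purely illustrative sanity check:

# Grab one batch and inspect it: a list of (image_vector, label) tuples
first_batch = next(train_reader())
print len(first_batch)                            # 16, the batch size
print first_batch[0][0].shape, first_batch[0][1]  # flattened image dimension and its label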
Building the Network Model
Since we are training an end-to-end plate recognizer, the network first stacks two convolution-pooling blocks, then trains 7 fully connected layers in parallel, one for each of the 7 characters on the plate. Their outputs are concatenated, passed through Softmax, and compared against the original label to compute the classification cost that drives training.
def get_network_cnn(self):
    # Input data and label; self.data is the dimension of the flattened
    # image vector and self.label is the number of label classes
    x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(self.data))
    y = paddle.layer.data(name='y', type=paddle.data_type.integer_value(self.label))
    # Convolution-pooling block 1
    conv_pool_1 = paddle.networks.simple_img_conv_pool(
        input=x,
        filter_size=12,
        num_filters=50,
        num_channel=1,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    drop_1 = paddle.layer.dropout(input=conv_pool_1, dropout_rate=0.5)
    # Convolution-pooling block 2 (input channels must match the 50 filters above)
    conv_pool_2 = paddle.networks.simple_img_conv_pool(
        input=drop_1,
        filter_size=5,
        num_filters=50,
        num_channel=50,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    drop_2 = paddle.layer.dropout(input=conv_pool_2, dropout_rate=0.5)

    # Shared fully connected layer, followed by 7 parallel branches,
    # one per plate character (65 outputs = one per candidate character)
    fc = paddle.layer.fc(input=drop_2, size=120)
    fc1_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc1 = paddle.layer.fc(input=fc1_drop, size=65, act=paddle.activation.Linear())

    fc2_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc2 = paddle.layer.fc(input=fc2_drop, size=65, act=paddle.activation.Linear())

    fc3_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc3 = paddle.layer.fc(input=fc3_drop, size=65, act=paddle.activation.Linear())

    fc4_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc4 = paddle.layer.fc(input=fc4_drop, size=65, act=paddle.activation.Linear())

    fc5_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc5 = paddle.layer.fc(input=fc5_drop, size=65, act=paddle.activation.Linear())

    fc6_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc6 = paddle.layer.fc(input=fc6_drop, size=65, act=paddle.activation.Linear())

    fc7_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc7 = paddle.layer.fc(input=fc7_drop, size=65, act=paddle.activation.Linear())

    # Concatenate the 7 per-character branches, apply Softmax, and
    # compute the classification cost against the label
    fc_concat = paddle.layer.concat(
        input=[fc1, fc2, fc3, fc4, fc5, fc6, fc7],
        act=paddle.activation.Softmax())
    cost = paddle.layer.classification_cost(input=fc_concat, label=y)
    return cost
Training the Model
With the network built, the remaining steps are the usual ones: initialization, defining the optimizer, defining the training parameters, creating the trainer, and so on. Plug in the data readers from the first step and the model can be trained.
class NeuralNetwork(object):
    def __init__(self, X_train, Y_train, X_val, Y_val):
        # with_gpu is assumed to be defined globally (e.g. from an environment variable)
        paddle.init(use_gpu=with_gpu, trainer_count=1)

        self.X_train = X_train
        self.Y_train = Y_train
        self.X_val = X_val
        self.Y_val = Y_val

    # get_network_cnn(self): identical to the network definition shown above

    # Define the trainer
    def get_trainer(self):
        cost = self.get_network_cnn()

        # Create the parameters (kept on self so the event handler can save them)
        self.parameters = paddle.parameters.create(cost)

        # Momentum optimizer with L2 regularization
        optimizer = paddle.optimizer.Momentum(
            momentum=0.9,
            regularization=paddle.optimizer.L2Regularization(rate=0.0002 * 128),
            learning_rate=0.001,
            learning_rate_schedule="pass_manual")

        # Create the trainer
        trainer = paddle.trainer.SGD(
            cost=cost, parameters=self.parameters, update_equation=optimizer)
        return trainer

    # Start training
    def start_trainer(self, X_train, Y_train, X_val, Y_val):
        trainer = self.get_trainer()

        result_lists = []

        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 10 == 0:
                    print "\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics)
            if isinstance(event, paddle.event.EndPass):
                # Save the parameters learned in this pass
                with open('params_pass_%d.tar' % event.pass_id, 'w') as f:
                    self.parameters.to_tar(f)
                # Evaluate on the validation set
                result = trainer.test(reader=val_reader)
                print "\nTest with Pass %d, %s" % (event.pass_id, result.metrics)

                result_lists.append((event.pass_id, result.cost,
                                     result.metrics['classification_error_evaluator']))

        # Build the shuffled batch readers and start training
        train_reader = paddle.batch(paddle.reader.shuffle(
            reador.reader_creator(X_train, Y_train), buf_size=200),
            batch_size=16)

        val_reader = paddle.batch(paddle.reader.shuffle(
            reador.reader_creator(X_val, Y_val), buf_size=200),
            batch_size=16)

        trainer.train(reader=train_reader, num_passes=20, event_handler=event_handler)
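Putting it together, training could be launched roughly as in the sketch below. Note that the listing above never sets self.data (the flattened input dimension) and self.label (the number of label classes) in __init__, so this sketch assigns them by hand; the 72×272 image size is only an assumption and must match whatever size was passed to genBatch.

# Hypothetical driver script
network = NeuralNetwork(X_train, Y_train, X_val, Y_val)
network.data = 72 * 272    # dimension of one flattened grayscale plate image (assumed size)
network.label = 65         # number of candidate characters per position
network.start_trainer(X_train, Y_train, X_val, Y_val)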
Output
After the training above finishes, save the trained model and write a test.py script for prediction. Note that the network structure built at prediction time must be identical to the one used for training.
# Batch-predict accuracy over a folder of test images
python test.py /Users/shelter/test

# Sample output
output:
Predicted plate number: 津 K 4 2 R M Y
Number of input images: 100
Row accuracy of the input images: 0.72
Column accuracy of the input images: 0.86
If only a single image is predicted, the terminal shows the original image together with the predicted value; in batch mode it prints the overall prediction accuracy, including the row and column accuracies.
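For reference, the core of test.py could look roughly like the sketch below. This is not the author's exact script: build_network is assumed to rebuild the same layers as get_network_cnn but return the concatenated prediction layer instead of the cost, load_image is assumed to resize a plate photo and flatten it exactly as during training, and params_pass_19.tar is the file saved after the last of the 20 passes above.

import paddle.v2 as paddle

paddle.init(use_gpu=False, trainer_count=1)

# Rebuild the same network structure that was used for training
predict_layer = build_network()

# Load the parameters saved by the trainer at the end of the last pass
with open('params_pass_19.tar') as f:
    parameters = paddle.parameters.Parameters.from_tar(f)

# Run inference on one preprocessed test image
img = load_image('/Users/shelter/test/00.jpg')
probs = paddle.infer(output_layer=predict_layer,
                     parameters=parameters,
                     input=[(img,)])
print probs.argmax()    # index of the most likely character class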
Summary
There are many approaches to license plate recognition, and commercial solutions are quite mature. Traditional methods need grayscale conversion, character segmentation, and a lot of other preprocessing. An end-to-end approach, by contrast, feeds the raw image straight into training and outputs the predicted plate characters. The method here builds two convolution-pooling blocks and then trains 7 fully connected layers in parallel to recognize the plate characters, achieving end-to-end recognition. Some issues remain in practice: the accuracy of the first few fully connected branches is higher than that of the last one or two. You can print the training accuracy of each branch and compare; it may be that training has not yet converged, or there may be other causes. If you run into problems while following along, or have a better approach, feel free to leave a comment.
References:
1. My GitHub: https://github.com/huxiaoman7/mxnet-cnn-plate-recognition
Author: Charlotte77