1. Foreword
Up to the previous installment we had been working exclusively on binary classification, both in the earlier machine learning posts and in the recent image classification ones. In real work, however, we often need multi-class classification. A chest X-ray, for example, can be used to diagnose not only tuberculosis but also COVID-19 and bacterial (viral) pneumonia, which turns image recognition into a multi-class task.
In this installment we build a MobileNet multi-class model on a dataset with four groups: healthy, tuberculosis, COVID-19, and bacterial (viral) pneumonia. MobileNet is chosen, again, because it trains quickly.
As before, the coding was assisted by GPT-4; the rewriting process is described later.
2. Misjudged-case analysis in practice
We use a chest X-ray dataset: 900 images from healthy people, 700 from tuberculosis patients, 549 from COVID-19 patients, and 900 from the bacterial (viral) pneumonia group, with each class stored in its own folder.
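Before loading anything, it can help to sanity-check the folder layout. The following is a minimal sketch, not part of the original post, assuming the images live under ./MTB-1 with one sub-folder per class (as the loading code below expects); it simply counts the images in each class folder.

import pathlib

data_dir = pathlib.Path("./MTB-1")  # one sub-folder per class, e.g. Normal/, Tuberculosis/, ...
for class_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
    print(class_dir.name, ":", len(list(class_dir.glob("*"))), "images")
print("Total:", len(list(data_dir.glob("*/*"))), "images")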
(a) The full code
###################################### Import packages ###################################
from tensorflow import keras
import tensorflow as tf
from tensorflow.python.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Dropout, Activation, Reshape, Softmax, GlobalAveragePooling2D, BatchNormalization
from tensorflow.python.keras.layers.convolutional import Convolution2D, MaxPooling2D
from tensorflow.python.keras import Sequential
from tensorflow.python.keras import Model
from tensorflow.python.keras.optimizers import adam_v2
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.keras.preprocessing.image import ImageDataGenerator, image_dataset_from_directory
from tensorflow.python.keras.layers.preprocessing.image_preprocessing import RandomFlip, RandomRotation, RandomContrast, RandomZoom, RandomTranslation
import os,PIL,pathlib
import warnings
# Configure the GPU
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    gpu0 = gpus[0]  # If there are multiple GPUs, use only the first one
    tf.config.experimental.set_memory_growth(gpu0, True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0], "GPU")

warnings.filterwarnings("ignore")  # Suppress warning messages
plt.rcParams['font.sans-serif'] = ['SimHei']  # Display Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False    # Display minus signs correctly

################################ Load the dataset #####################################
# 1. Load the data
data_dir = "./MTB-1"  # Updated path
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir.glob('*/*')))
print("Total number of images:", image_count)

batch_size = 32
img_height = 100
img_width = 100

train_ds = image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)

class_names = train_ds.class_names
print(class_names)
print(train_ds)

# 2. Inspect the data
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

# 3. Configure the datasets
AUTOTUNE = tf.data.AUTOTUNE

def train_preprocessing(image, label):
    return (image / 255.0, label)

train_ds = (
    train_ds.cache()
    .shuffle(800)
    .map(train_preprocessing)    # Preprocessing function can be set here
    # .batch(batch_size)         # batch_size was already set in image_dataset_from_directory
    .prefetch(buffer_size=AUTOTUNE)
)

val_ds = (
    val_ds.cache()
    .map(train_preprocessing)    # Preprocessing function can be set here
    # .batch(batch_size)         # batch_size was already set in image_dataset_from_directory
    .prefetch(buffer_size=AUTOTUNE)
)

# 4. Visualize the data
plt.figure(figsize=(10, 8))  # Figure width 10, height 8
plt.suptitle("Data display")

class_names = ["COVID-19", "Normal", "Pneumonia", "Tuberculosis"]  # Updated class labels
for images, labels in train_ds.take(1):
    for i in range(15):
        plt.subplot(4, 5, i + 1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        # Show the image
        plt.imshow(images[i])
        # Show the label
        plt.xlabel(class_names[labels[i]])
plt.show()

###################################### Data augmentation ################################
data_augmentation = Sequential([
    RandomFlip("horizontal_and_vertical"),
    RandomRotation(0.2),
    RandomContrast(1.0),
    RandomZoom(0.5, 0.2),
    RandomTranslation(0.3, 0.5),
])

def prepare(ds):
    ds = ds.map(lambda x, y: (data_augmentation(x, training=True), y),
                num_parallel_calls=AUTOTUNE)
    return ds

train_ds = prepare(train_ds)
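# Optional sketch (not in the original post): preview one augmented batch to sanity-check
# the augmentation pipeline defined above. Values are clipped to [0, 1] for display,
# since RandomContrast can push pixels outside that range.
for images, labels in train_ds.take(1):
    plt.figure(figsize=(10, 4))
    for i in range(8):
        plt.subplot(2, 4, i + 1)
        plt.imshow(np.clip(images[i], 0, 1))
        plt.axis("off")
    plt.show()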
############################### Load mobilenet_v2 ################################
# Input preprocessing for the pretrained model
from tensorflow.python.keras.applications import mobilenet_v2
from tensorflow.python.keras import Input, regularizers
IMG_SIZE = (img_height, img_width, 3)

base_model = mobilenet_v2.MobileNetV2(
    input_shape=IMG_SIZE,
    include_top=False,  # Whether to include the top fully connected layer
    weights='imagenet')

inputs = Input(shape=IMG_SIZE)
# Build the model
x = base_model(inputs, training=False)  # Run the base model in inference mode
# Global average pooling
x = GlobalAveragePooling2D()(x)
# BatchNormalization
x = BatchNormalization()(x)
# Dropout
x = Dropout(0.8)(x)
# Dense
x = Dense(128, kernel_regularizer=regularizers.l2(0.1))(x)  # Dense layer reduced to 128 units, with L2 regularization
# BatchNormalization
x = BatchNormalization()(x)
# Activation
x = Activation('relu')(x)
# Output layer
outputs = Dense(4, kernel_regularizer=regularizers.l2(0.1))(x)  # Output layer changed to 4 neurons
# BatchNormalization
outputs = BatchNormalization()(outputs)
# Activation
outputs = Activation('softmax')(outputs)  # Activation changed to 'softmax'
# Assemble the model
model = Model(inputs, outputs)
# Print the model architecture
print(model.summary())
############################# Compile the model #########################################
# Define the optimizer
from tensorflow.python.keras.optimizers import adam_v2, rmsprop_v2
#from tensorflow.python.keras.optimizer_v2.gradient_descent import SGD
optimizer = adam_v2.Adam()
#optimizer = SGD(learning_rate=0.001)
#optimizer = rmsprop_v2.RMSprop()

# Commonly used optimizers:
#all_classes = {
# 'adadelta': adadelta_v2.Adadelta,
# 'adagrad': adagrad_v2.Adagrad,
# 'adam': adam_v2.Adam,
# 'adamax': adamax_v2.Adamax,
# 'experimentaladadelta': adadelta_experimental.Adadelta,
# 'experimentaladagrad': adagrad_experimental.Adagrad,
# 'experimentaladam': adam_experimental.Adam,
# 'experimentalsgd': sgd_experimental.SGD,
# 'nadam': nadam_v2.Nadam,
# 'rmsprop': rmsprop_v2.RMSprop,
#}

# Compile the model
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',  # Multi-class classification
              metrics=['accuracy'])

# Train the model
from tensorflow.python.keras.callbacks import ModelCheckpoint, Callback, EarlyStopping, ReduceLROnPlateau, LearningRateScheduler

NO_EPOCHS = 50
PATIENCE = 10
VERBOSE = 1

# Dynamic learning-rate schedule
annealer = LearningRateScheduler(lambda x: 1e-5 * 0.99 ** (x + NO_EPOCHS))

# Early stopping
earlystopper = EarlyStopping(monitor='loss', patience=PATIENCE, verbose=VERBOSE)

# Checkpoint the best weights
checkpointer = ModelCheckpoint('mtb_4_jet_best_model_mobilenetv3samll.h5',
                               monitor='val_accuracy',
                               verbose=VERBOSE,
                               save_best_only=True,
                               save_weights_only=True)

train_model = model.fit(train_ds,
                        epochs=NO_EPOCHS,
                        verbose=1,
                        validation_data=val_ds,
                        callbacks=[earlystopper, checkpointer, annealer])

# Save the model
model.save('mtb_4_jet_best_model_mobilenet.h5')
print("The trained model has been saved.")

from tensorflow.python.keras.models import load_model
# Reload the saved model (kept separate from train_model, which holds the training history)
model = load_model('mtb_4_jet_best_model_mobilenet.h5')
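# Optional (not in the original post): the ModelCheckpoint callback above saved the best
# validation weights (save_weights_only=True). To evaluate with those best weights rather
# than the final-epoch weights, restore them into the model before computing metrics.
model.load_weights('mtb_4_jet_best_model_mobilenetv3samll.h5')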
########################### Accuracy and loss curves #################################
import matplotlib.pyplot as plt

loss = train_model.history['loss']
acc = train_model.history['accuracy']
val_loss = train_model.history['val_loss']
val_acc = train_model.history['val_accuracy']
epoch = range(1, len(loss) + 1)

fig, ax = plt.subplots(1, 2, figsize=(10, 4))
ax[0].plot(epoch, loss, label='Train loss')
ax[0].plot(epoch, val_loss, label='Validation loss')
ax[0].set_xlabel('Epochs')
ax[0].set_ylabel('Loss')
ax[0].legend()
ax[1].plot(epoch, acc, label='Train acc')
ax[1].plot(epoch, val_acc, label='Validation acc')
ax[1].set_xlabel('Epochs')
ax[1].set_ylabel('Accuracy')
ax[1].legend()
#plt.show()
plt.savefig("loss-acc.pdf", dpi=300, format="pdf")

#################################### Confusion matrix #############################
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.keras.models import load_model
from matplotlib.pyplot import imshow
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import pandas as pd
import math
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

# Helper to plot a confusion matrix
def plot_cm(labels, predictions, class_names):
    # Build the confusion matrix
    conf_numpy = confusion_matrix(labels, predictions)
    # Convert the matrix to a DataFrame
    conf_df = pd.DataFrame(conf_numpy, index=class_names, columns=class_names)
    plt.figure(figsize=(8, 7))
    sns.heatmap(conf_df, annot=True, fmt="d", cmap="BuPu")
    plt.title('Confusion matrix', fontsize=15)
    plt.ylabel('Actual value', fontsize=14)
    plt.xlabel('Predictive value', fontsize=14)

val_pre = []
val_label = []
for images, labels in val_ds:
    for image, label in zip(images, labels):
        img_array = tf.expand_dims(image, 0)
        prediction = model.predict(img_array)
        val_pre.append(np.argmax(prediction, axis=-1))
        val_label.append(label.numpy())  # Convert labels to numpy arrays

class_names = ['COVID-19', 'Normal', 'Pneumonia', 'Tuberculosis']  # Change to your own class names
plot_cm(val_label, val_pre, class_names)
plt.savefig("val-cm.pdf", dpi=300, format="pdf")

precision_val, recall_val, f1_val, _ = precision_recall_fscore_support(val_label, val_pre, average='micro')
acc_val = accuracy_score(val_label, val_pre)
error_rate_val = 1 - acc_val
print("Validation sensitivity (recall):", recall_val,
      "Validation specificity:", precision_val,  # Specificity is ill-defined for multi-class problems; precision is used here instead
      "Validation accuracy:", acc_val,
      "Validation error rate:", error_rate_val,
      "Validation F1:", f1_val)

train_pre = []
train_label = []
for images, labels in train_ds:
    for image, label in zip(images, labels):
        img_array = tf.expand_dims(image, 0)
        prediction = model.predict(img_array)
        train_pre.append(np.argmax(prediction, axis=-1))
        train_label.append(label.numpy())

plot_cm(train_label, train_pre, class_names)
plt.savefig("train-cm.pdf", dpi=300, format="pdf")

precision_train, recall_train, f1_train, _ = precision_recall_fscore_support(train_label, train_pre, average='micro')
acc_train = accuracy_score(train_label, train_pre)
error_rate_train = 1 - acc_train
print("Training sensitivity (recall):", recall_train,
      "Training specificity:", precision_train,  # Specificity is ill-defined for multi-class problems; precision is used here instead
      "Training accuracy:", acc_train,
      "Training error rate:", error_rate_train,
      "Training F1:", f1_train)
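# Optional sketch (not in the original post): the prints above substitute precision for
# specificity. Per-class specificity can instead be read off the confusion matrix,
# where for class i: specificity_i = TN_i / (TN_i + FP_i).
cm_val = confusion_matrix(val_label, val_pre)
for i, name in enumerate(class_names):
    tp = cm_val[i, i]
    fp = cm_val[:, i].sum() - tp
    fn = cm_val[i, :].sum() - tp
    tn = cm_val.sum() - tp - fp - fn
    print("Validation specificity for", name, ":", tn / (tn + fp))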
################################ Model performance metrics ################################
from sklearn import metrics

def test_accuracy_report(model):
    print(metrics.classification_report(val_label, val_pre, target_names=class_names))
    score = model.evaluate(val_ds, verbose=0)
    print('Loss function: %s, accuracy:' % score[0], score[1])

test_accuracy_report(model)

def train_accuracy_report(model):
    print(metrics.classification_report(train_label, train_pre, target_names=class_names))
    score = model.evaluate(train_ds, verbose=0)
    print('Loss function: %s, accuracy:' % score[0], score[1])

train_accuracy_report(model)

################################ ROC/AUC curves ####################################
from sklearn import metrics
from sklearn.preprocessing import LabelBinarizer
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.keras.models import load_model
from matplotlib.pyplot import imshow
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import pandas as pd
import math

def plot_roc(name, labels, predictions, **kwargs):
    fp, tp, _ = metrics.roc_curve(labels, predictions)
    plt.plot(fp, tp, label=name, linewidth=2, **kwargs)
    plt.xlabel('False positives rate')
    plt.ylabel('True positives rate')
    ax = plt.gca()
    ax.set_aspect('equal')

# The labels need one-hot encoding
lb = LabelBinarizer()
lb.fit([0, 1, 2, 3])  # Fit the label encoder, assuming four classes
n_classes = 4         # Number of classes

val_pre_auc = []
val_label_auc = []
for images, labels in val_ds:
    for image, label in zip(images, labels):
        img_array = tf.expand_dims(image, 0)
        prediction_auc = model.predict(img_array)
        val_pre_auc.append(prediction_auc[0])
        val_label_auc.append(lb.transform([label])[0])  # Encode the labels with the label encoder
val_pre_auc = np.array(val_pre_auc)
val_label_auc = np.array(val_label_auc)
auc_score_val = [metrics.roc_auc_score(val_label_auc[:, i], val_pre_auc[:, i]) for i in range(n_classes)]

train_pre_auc = []
train_label_auc = []
for images, labels in train_ds:
    for image, label in zip(images, labels):
        img_array_train = tf.expand_dims(image, 0)
        prediction_auc = model.predict(img_array_train)
        train_pre_auc.append(prediction_auc[0])
        train_label_auc.append(lb.transform([label])[0])
train_pre_auc = np.array(train_pre_auc)
train_label_auc = np.array(train_label_auc)
auc_score_train = [metrics.roc_auc_score(train_label_auc[:, i], train_pre_auc[:, i]) for i in range(n_classes)]

for i in range(n_classes):
    plot_roc('validation AUC for class {0}: {1:.4f}'.format(i, auc_score_val[i]),
             val_label_auc[:, i], val_pre_auc[:, i], color="red", linestyle='--')
    plot_roc('training AUC for class {0}: {1:.4f}'.format(i, auc_score_train[i]),
             train_label_auc[:, i], train_pre_auc[:, i], color="blue", linestyle='--')
plt.legend(loc='lower right')
plt.savefig("roc.pdf", dpi=300, format="pdf")

for i in range(n_classes):
    print("Class {0} training-set AUC:".format(i), auc_score_train[i], "validation-set AUC:", auc_score_val[i])

################################ ROC/AUC curves (separate panels) ####################################
from sklearn import metrics
from sklearn.preprocessing import LabelBinarizer
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.keras.models import load_model
from matplotlib.pyplot import imshow
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import pandas as pd
import math

def plot_roc(ax, name, labels, predictions, **kwargs):
    fp, tp, _ = metrics.roc_curve(labels, predictions)
    ax.plot(fp, tp, label=name, linewidth=2, **kwargs)
    ax.plot([0, 1], [0, 1], color='orange', linestyle='--')
    ax.set_xlabel('False positives rate')
    ax.set_ylabel('True positives rate')
    ax.set_aspect('equal')

# The labels need one-hot encoding
lb = LabelBinarizer()
lb.fit([0, 1, 2, 3])  # Fit the label encoder, assuming four classes
n_classes = 4         # Number of classes

val_pre_auc = []
val_label_auc = []
for images, labels in val_ds:
    for image, label in zip(images, labels):
        img_array = tf.expand_dims(image, 0)
        prediction_auc = model.predict(img_array)
        val_pre_auc.append(prediction_auc[0])
        val_label_auc.append(lb.transform([label])[0])  # Encode the labels with the label encoder
val_pre_auc = np.array(val_pre_auc)
val_label_auc = np.array(val_label_auc)
auc_score_val = [metrics.roc_auc_score(val_label_auc[:, i], val_pre_auc[:, i]) for i in range(n_classes)]

train_pre_auc = []
train_label_auc = []
for images, labels in train_ds:
    for image, label in zip(images, labels):
        img_array_train = tf.expand_dims(image, 0)
        prediction_auc = model.predict(img_array_train)
        train_pre_auc.append(prediction_auc[0])
        train_label_auc.append(lb.transform([label])[0])
train_pre_auc = np.array(train_pre_auc)
train_label_auc = np.array(train_label_auc)
auc_score_train = [metrics.roc_auc_score(train_label_auc[:, i], train_pre_auc[:, i]) for i in range(n_classes)]

fig, axs = plt.subplots(n_classes, figsize=(5, 20))
for i in range(n_classes):
    plot_roc(axs[i], 'validation AUC for class {0}: {1:.4f}'.format(i, auc_score_val[i]),
             val_label_auc[:, i], val_pre_auc[:, i], color="red", linestyle='--')
    plot_roc(axs[i], 'training AUC for class {0}: {1:.4f}'.format(i, auc_score_train[i]),
             train_label_auc[:, i], train_pre_auc[:, i], color="blue", linestyle='--')
    axs[i].legend(loc='lower right')
plt.tight_layout()
plt.savefig("roc.pdf", dpi=300, format="pdf")

for i in range(n_classes):
    print("Class {0} training-set AUC:".format(i), auc_score_train[i], "validation-set AUC:", auc_score_val[i])
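As a quick usage example, here is a minimal inference sketch, not part of the original code, that loads the saved model and classifies a single chest X-ray. The file name 'test_xray.png' is a placeholder, and the 100x100 size and 1/255 rescaling match the preprocessing used above.

import numpy as np
import tensorflow as tf
from tensorflow.python.keras.models import load_model

class_names = ['COVID-19', 'Normal', 'Pneumonia', 'Tuberculosis']
model = load_model('mtb_4_jet_best_model_mobilenet.h5')

img = tf.keras.preprocessing.image.load_img('test_xray.png', target_size=(100, 100))  # placeholder file
x = tf.keras.preprocessing.image.img_to_array(img) / 255.0  # same rescaling as train_preprocessing
x = tf.expand_dims(x, 0)  # add the batch dimension

probs = model.predict(x)[0]
print("Predicted class:", class_names[int(np.argmax(probs))], "probability:", float(np.max(probs)))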
(b) The process of coaching GPT-4
(b1) The prompt: "Based on {code 1}, please rewrite and extend 《code 2》." Code 1: {the misjudged-case analysis section previously written in TensorFlow}; code 2: 《the earlier MobileNet modeling code to be modified》.
Then tweak the result as needed for your specific situation, with GPT's help, of course.
3. Results
(1) Learning curves
(2) Confusion matrices
(3) Performance metrics
(4) ROC curves
(4.1) Combined:
(4.2) Separate:
4. Data
Link: https://pan.baidu.com/s/1rqu15KAUxjNBaWYfEmPwgQ?pwd=xfyn
Extraction code: xfyn