python - Training loss is far lower than test loss, even when using the same data
Published: 2022-05-13 08:47:52 237
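I am using the same data for training and testing (which is not best practice), so in theory the loss should be exactly the same in both. However, during training my loss is usually around 1e-07, while during testing it is actually about 0.05.
Ideally, the loss during training and the loss during testing should be very similar, yet they are very different.
Here is my training loop: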
losses = []
try:
    for epoch in range(args.epochs):
        for (index, img) in trainDataloader:
            # Step the optimizer once per image in the batch.
            for i, image in enumerate(img):
                # circlenet.zero_grad()
                optimizer.zero_grad()
                output = circlenet(image)
                # print(index[i])
                loss = nn.functional.mse_loss(
                    output,
                    torch.tensor(index[i], device=device, dtype=torch.float),
                )
                loss = loss.to(device)
                loss.backward()
                optimizer.step()
        if epoch % 50 == 0:
            if args.cuda:
                GPUs = GPUtil.getGPUs()
                print(GPUs[0].temperature, "C")
        if epoch % args.saveevery == 0:
            circlenet.cpu()
            torch.save({"model": circlenet.state_dict(), "optimizer": optimizer.state_dict()}, f"{args.save_dir}/weights.pth")
            circlenet.to(device)
        # `loss` here is the loss of the last image processed in this epoch.
        losses.append(loss.item())
        print(f"Epoch: {epoch + 1: <6} Loss: {loss.item()}")
except KeyboardInterrupt:
    # Save a checkpoint if training is interrupted.
    torch.save({"model": circlenet.state_dict(), "optimizer": optimizer.state_dict()}, f"{args.save_dir}/weights.pth")
import matplotlib.pyplot as plt
plt.plot(losses)
plt.show()
Here is my testing method:
rand = random.randint(0, len(os.listdir("data/imgs/")) - 1)
import cv2
# use the network
circlenet.eval()
img = (PIL.Image.open(f"data/imgs/img{rand}.jpg"))
img = transforms.functional.pil_to_tensor(img).to(device)
img = img.type(torch.FloatTensor)
img = img.to(device)
with torch.no_grad():
    out = circlenet(img)
out = out.cpu().numpy()
out = out.tolist()
imgcv = cv2.imread(f"data/imgs/img{rand}.jpg")
print("Output: ", out)
print(rand)
# remove first and last characters
ans = data[rand - 1]
print("Answer: ", ans)
loss = nn.functional.mse_loss(torch.tensor(out, dtype=torch.float, device=device), torch.tensor(ans, dtype=torch.float, device=device))
print("Loss: ", loss.item())
cv2.circle(imgcv, (round(ans[1] * 256), round(ans[2] * 144)), 2, (255, 255, 0), 2) # answer
color = (0, 255, 0) if round(out[0]) == 1 else (0, 0, 255)
cv2.circle(imgcv, (round(out[1] * 256), round(out[2] * 144)), 4, color, 2)
imgcv = cv2.resize(imgcv, (480, 270))
cv2.imshow("output", imgcv)
cv2.waitKey(0)
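The two numbers being compared are not measured the same way: the training figure is the loss of one sample while the network is in training mode, and the test figure is the loss of one randomly chosen image after `circlenet.eval()`. A like-for-like check would average the loss over every sample with the network in eval mode. The following is only a minimal sketch of that idea. `mean_eval_loss`, `toy_model` and `toy_samples` are hypothetical stand-ins, the real `circlenet` architecture is not shown in the question, and the presence of a Dropout layer here is purely an assumption used to make the two modes differ.

```python
import torch
from torch import nn


def mean_eval_loss(model, samples):
    """Average per-sample MSE over (input, target) pairs with the model in eval mode."""
    was_training = model.training
    model.eval()  # Dropout/BatchNorm switch to their inference behaviour
    total = 0.0
    with torch.no_grad():
        for x, target in samples:
            total += nn.functional.mse_loss(model(x), target).item()
    model.train(was_training)  # restore whatever mode the caller had
    return total / len(samples)


# Hypothetical stand-in for circlenet and its data (not the asker's model).
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 3))
toy_samples = [(torch.randn(4), torch.randn(3)) for _ in range(16)]

# Same data, same weights, measured once in each mode.
toy_model.train()
train_mode_loss = sum(
    nn.functional.mse_loss(toy_model(x), y).item() for x, y in toy_samples
) / len(toy_samples)
eval_mode_loss = mean_eval_loss(toy_model, toy_samples)

print(f"train-mode mean loss: {train_mode_loss:.4f}")
print(f"eval-mode mean loss:  {eval_mode_loss:.4f}")
```

If the eval-mode average over the full training set is still far from the training figure, the gap more likely comes from the model's mode-dependent layers, the input preprocessing, or the label lookup than from random sample choice.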
Some output from training:
Epoch: 94 Loss: 7.115558560144564e-07
Epoch: 95 Loss: 5.9022491768701e-05
Epoch: 96 Loss: 2.5865596398944035e-05
Epoch: 97 Loss: 9.173281227958796e-07
Epoch: 98 Loss: 8.050536962400656e-06
Epoch: 99 Loss: 8.39896165416576e-06
Epoch: 100 Loss: 7.107677788553701e-07
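Each value in this log is a single `loss.item()` reading, which, as the training loop appears to be structured, belongs to whichever image was processed last in that epoch rather than to the epoch as a whole. A quick summary of the seven logged values shows how much such single-sample readings move around; this is only arithmetic on the numbers printed above.

```python
# Loss values copied from the training log above (epochs 94 to 100).
logged = [
    7.115558560144564e-07,
    5.9022491768701e-05,
    2.5865596398944035e-05,
    9.173281227958796e-07,
    8.050536962400656e-06,
    8.39896165416576e-06,
    7.107677788553701e-07,
]

mean_logged = sum(logged) / len(logged)
spread = max(logged) / min(logged)  # ratio between largest and smallest reading

print(f"min={min(logged):.3e} max={max(logged):.3e} mean={mean_logged:.3e}")
print(f"largest reading is {spread:.1f}x the smallest")
```

The readings differ by a factor of roughly 80 within seven epochs, so "around 1e-07" describes the best of them rather than a typical value.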
Test output:
You are running on device: NVIDIA GeForce RTX 3050 Ti Laptop GPU
Current statistics:
| ID | GPU | MEM |
------------------
| 0 | 40% | 12% |
55.0 C
Output: [0.9986587166786194, 0.6712906360626221, 0.6456944346427917]
870
Answer: [1.0, 0.3328125, 0.8268518518518518]
Loss: 0.04912909120321274
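To see where the 0.049 comes from, the MSE can be recomputed by hand from the printed `Output` and `Answer` vectors. The component names `flag`, `x` and `y` are an interpretation based on how the test script uses `out[0]`, `out[1] * 256` and `out[2] * 144`; they are not stated in the question.

```python
# Values copied from the test output above.
output = [0.9986587166786194, 0.6712906360626221, 0.6456944346427917]
answer = [1.0, 0.3328125, 0.8268518518518518]

squared_errors = [(o - a) ** 2 for o, a in zip(output, answer)]
total = sum(squared_errors)
mse = total / len(squared_errors)

# Component names are assumed from the drawing code in the test script.
for name, err in zip(["flag", "x", "y"], squared_errors):
    print(f"{name}: squared error {err:.6f} ({err / total:.1%} of total)")
print(f"MSE: {mse:.6f}")
```

The first component is almost exact, and nearly all of the loss comes from the two coordinates, which are off by about 0.34 and 0.18 in normalized units. A prediction this far from the label on an image the network has trained on suggests checking whether `data[rand - 1]` really is the label for `img{rand}.jpg`, and whether the test image is preprocessed the same way as in `trainDataloader`.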