Using a one-click "lazy" package from a Bilibili tutorial, my first experiment succeeded: the simplest possible A/B binary classification. The results looked good, with around 95% accuracy on both the training and test sets.
However, when testing on new data, A is recognized almost perfectly, while B is almost always wrong (everything gets classified as A).
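In hindsight, overall accuracy hid the problem; looking at per-class recall shows it immediately. A minimal sketch with made-up numbers (the 90/10 split and the predictions are hypothetical, just to illustrate the failure mode):

```python
# Hypothetical labels illustrating the failure: overall accuracy looks fine
# while class B is almost never recognized.
y_true = ["A"] * 90 + ["B"] * 10
y_pred = ["A"] * 99 + ["B"]  # model has collapsed onto class A

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(cls):
    """Fraction of true `cls` samples that were predicted as `cls`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    total = sum(1 for t in y_true if t == cls)
    return hits / total

print(f"accuracy: {accuracy:.2f}")     # 0.91 -- looks acceptable
print(f"recall A: {recall('A'):.2f}")  # 1.00
print(f"recall B: {recall('B'):.2f}")  # 0.10 -- the real problem
```

A class-imbalanced test set plus a single accuracy number is exactly how this kind of collapse stays hidden.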
After discussing with classmates, it might be due to too much background interference. I plan to switch to object detection. Coincidentally, I had seen a yolov5 tutorial before, so I’ll put it to use (●’◡’●).
labelimg kept crashing at first, but I finally solved it by creating a fresh environment with conda and installing it via pip.
Some mystical tips for labelimg:
1. First select the image folder
2. Then select the label (save) folder
3. Start labeling
4. Finally switch the save format to YOLO and save
5. Remember to turn on auto-save
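For reference, the steps above produce plain-text YOLO labels: one line per box, with a class id followed by normalized center/size coordinates. A quick sketch for sanity-checking a label line (the line, box values, and image size are made up):

```python
# One line of a YOLO-format label file: class_id x_center y_center width height,
# all coordinates normalized to [0, 1] relative to the image dimensions.
line = "0 0.5 0.5 0.25 0.40"  # hypothetical box
cls_id, xc, yc, w, h = line.split()
img_w, img_h = 640, 480  # hypothetical image size

# Convert back to pixel corner coordinates to eyeball whether a label is sane.
x1 = (float(xc) - float(w) / 2) * img_w
y1 = (float(yc) - float(h) / 2) * img_h
x2 = (float(xc) + float(w) / 2) * img_w
y2 = (float(yc) + float(h) / 2) * img_h
print(int(x1), int(y1), int(x2), int(y2))  # 240 144 400 336
```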
Got tired after labeling for a while, lying down Orz, will continue tomorrow
After discussing with teachers and classmates, I believe the main problem is that I was doing direct whole-image classification, where color, background, and similar factors are obvious sources of interference. Object detection is a better fit for this task.
Currently, half of the data has been labeled. Once all the labeling is complete, I'll proceed to refine the second batch of results.
Today is another day of data labeling.
Yesterday I labeled on Windows, but my Mac is lighter and thinner, so I brought it out for labeling instead. Unexpectedly, the labels I copied over couldn't be used directly, so I had to redo all of this morning's labeling.
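I never pinned down exactly why the copied labels failed, but one known labelimg gotcha is that YOLO class ids are just line indices into classes.txt, so a classes.txt with a different line order silently scrambles every label. A quick check, with hypothetical file contents standing in for the two copies:

```python
# Hypothetical contents of classes.txt from the Windows and Mac label folders.
win_classes = "A\nB\n"
mac_classes = "B\nA\n"  # same class names, but a different order

def class_list(text):
    """Return class names in file order; YOLO class ids are these line indices."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]

if class_list(win_classes) != class_list(mac_classes):
    # Having the same set of names is not enough: id 0 now means different things.
    print("class order differs: YOLO class ids will not line up")
```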
Later, labelimg started acting up again, crashing intermittently, with a very poor experience. The mystical little tricks didn’t work anymore.
I started looking for alternative tools and found Label Studio, but it is very laggy when deployed locally, taking several seconds to save and several more to load each image, so I kept searching.
I discovered that Baidu's PaddlePaddle has an interactive labeling tool called EISeg, and started trying out Paddle and its related packages.
Meanwhile, the model has already finished 100 epochs of training on the data I labeled this morning, and the validation results look pretty good.
But the actual test results were disappointing at first, the same as last time.
It turns out I blamed the model wrongly: something I hadn't noticed was running on the computer, it got too hot, and everything became extremely laggy. Now that it has cooled down, it actually works quite well.