Do I need to run object detection prior to classification to isolate individual birds?

I’m helping my daughter set up a camera that will identify birds that show up at her bird feeders using this model (https://tfhub.dev/google/aiy/vision/classifier/birds_V1/1). I’ve got it working (I haven’t started running it with the camera yet, but I’ve already figured out how to get the frames off of it), BUT it seems to have a LOT of trouble making an identification when there are multiple birds in the shot. For example, a picture of an empty bird feeder came back as chickadee at ~20%, but a picture with 8 sparrows came back as sparrow at only ~15%. Similarly, a very clear shot of a cardinal with 5 other birds came back as cardinal, but only at 21%. I do understand that the model isn’t going to be perfectly accurate and that 20% means the model isn’t very confident (I’m not concerned about that), but I need to set some minimum threshold, and I’m concerned this means that any time there are multiple birds at the feeders the system will basically stop working, which would end up being most of the time.

So do I need to run an object detection model on the images, clip out individual images of “birds”, and then run this identification model on each one? If so, would anyone have a suggestion for an easy-ish way to do this? I’m far from a competent coder, so advice/suggestions are welcome!
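
To make the question concrete, here is a rough sketch of the glue code I imagine sitting between the two models. It is only a sketch under assumptions: I’m assuming the detector returns boxes as normalized [ymin, xmin, ymax, xmax] values with one score per box, and that the classifier returns one probability per label. The function names `boxes_to_crops` and `top_label`, and the `min_score`, `pad`, and `threshold` values, are made up for illustration and are not part of either model.

```python
def boxes_to_crops(boxes, scores, width, height, min_score=0.5, pad=0.1):
    """Turn detector output into pixel crop rectangles.

    ASSUMPTION: boxes are normalized [ymin, xmin, ymax, xmax] in 0..1 and
    scores holds one detection confidence per box. Returns a list of
    (left, top, right, bottom) integer tuples clamped to the image.
    """
    crops = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # skip weak detections
        # Grow the box a little so the bird is not cut off at the edges.
        dy = (ymax - ymin) * pad
        dx = (xmax - xmin) * pad
        left = max(0, round((xmin - dx) * width))
        top = max(0, round((ymin - dy) * height))
        right = min(width, round((xmax + dx) * width))
        bottom = min(height, round((ymax + dy) * height))
        if right > left and bottom > top:
            crops.append((left, top, right, bottom))
    return crops


def top_label(probabilities, labels, threshold=0.5):
    """Return (label, probability) for the best class, or None if it is
    below the threshold. ASSUMPTION: one probability per label, same order."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < threshold:
        return None
    return labels[best], probabilities[best]
```

The idea would be to crop each rectangle out of the frame (with Pillow, `image.crop((left, top, right, bottom))` takes the tuple in that order), resize each crop to whatever the bird model expects, and keep only the results where `top_label` does not return None. The threshold would presumably need tuning against real feeder shots, including the empty-feeder case.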

submitted by /u/StrongAbbreviations5