iNaturalist strikes out on its own (baynature.org)
395 points by kscottz on Sept 14, 2023 | 78 comments



How the hell does the Seek by iNaturalist app work so well while also being small and performant enough to do the job completely offline on a phone? You should really try it out for IDing animals and plants if you haven't; it's like a real-life Pokédex. Have they released any information (e.g. a whitepaper) about how the model works or how it was trained? The ability to classify things incrementally and phylogenetically makes it helpful to narrow down your own search even when it doesn't know the exact species. I've been surprised by it even IDing the insects that made specific galls on random leaves or plants.


I reverse engineered their stuff a bit. I downloaded their Android APK and found a tensorflow lite model inside. I found that it accepts 299x299px RGB input and spits out probabilities/scores for about 25,000 species. The phylogenetic ranking is performed separately (outside of the model) based on thresholds (if it isn't confident enough about any species, it seems to only provide genus, family, etc.) They just have a CSV file that defines the taxonomic ranks of each species.
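If you want to poke at something similar yourself, driving a TFLite classifier that way is only a few lines. Here's a rough sketch - the file names, the input normalization, and the 0.8 threshold are all guesses on my part, not values from their app:

    import csv
    import numpy as np
    import tensorflow as tf
    from PIL import Image

    # Load the extracted TFLite model (hypothetical file name).
    interpreter = tf.lite.Interpreter(model_path="optimized_model.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # 299x299 RGB input; assumes a float model scaled to [0, 1]
    # (the real model may be quantized and expect uint8).
    img = Image.open("photo.jpg").convert("RGB").resize((299, 299))
    x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]  # ~25,000 species scores

    # The CSV maps each output index to its taxonomic ranks (made-up columns).
    with open("taxonomy.csv") as f:
        taxa = list(csv.DictReader(f))

    best = int(np.argmax(scores))
    if scores[best] > 0.8:  # threshold is a guess
        print("species:", taxa[best]["species"])
    else:
        print("genus (low confidence):", taxa[best]["genus"])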

I use it to automatically tag pictures that I take. I took up bird photography a few years ago and it's become a very serious hobby. I just run my Python script (which wraps their TF model): it extracts JPG thumbnails from my RAW photos, automatically crops them based on EXIF data (the focus point and the focus distance), and then feeds them into the model. This cropping was critical - I can't just throw the model a downsampled 45 megapixel image straight from the camera; usually the subject is too small in the frame. I store the results in a SQLite database, so now I can quickly pull up all photos of a given species, and even sort them by other EXIF values like focus distance. I pipe the results of arbitrary SQLite queries into my own custom RAW photo viewer and I can quickly browse the photos. (e.g. "Show me all Green Heron photos sorted by focus distance.") The species identification results aren't perfect, but they are very good. And I store the score too, so I can know how confident the model was.
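The cataloging side of a pipeline like that is tiny. This is just the shape of it, with made-up table and column names, not my actual script:

    import sqlite3

    con = sqlite3.connect("photos.db")
    con.execute("""
        CREATE TABLE IF NOT EXISTS photos (
            path TEXT PRIMARY KEY,
            species TEXT,
            score REAL,
            focus_distance REAL
        )
    """)

    def record(path, species, score, focus_distance):
        # One row per photo: model output plus the EXIF fields I care about.
        con.execute(
            "INSERT OR REPLACE INTO photos VALUES (?, ?, ?, ?)",
            (path, species, score, focus_distance),
        )
        con.commit()

    # e.g. "Show me all Green Heron photos sorted by focus distance."
    rows = con.execute(
        "SELECT path FROM photos WHERE species = ? ORDER BY focus_distance",
        ("Green Heron",),
    ).fetchall()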

One cool thing was that it revealed that I had photographed a Blackpoll Warbler in 2020 when I was a new and budding birder. I didn't think I had ever seen one. But I saw it listed in the program results, and was able to confirm by revisiting the photo.

I don't know if they've changed anything recently. Judging by some of their code on GitHub, it looked like they were also working on considering location when determining species, but the model I found doesn't seem to do that.

I can't tell you anything about how the model was actually trained, but this information may still be useful in understanding how the app operates.

Of course, I haven't published any of this code because the model isn't my own work.


I don't use Seek, but the iNaturalist website filters computer vision matches using a "Seen Nearby" feature:

> The “Seen Nearby” label on the computer vision suggestions indicates that there is a Research Grade observation, or an observation that would be research grade if it wasn't captive, of that taxon that is:

> - within nine 1-degree grid cells in and around the observation's coordinates and

> - observed around that time of year (in a three calendar month range, in any year).

https://www.inaturalist.org/pages/help#computer-vision
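Read literally, those two rules are easy to approximate. A rough, unofficial sketch (not iNaturalist's actual code):

    import math

    def seen_nearby(obs_lat, obs_lon, obs_month, prior_lat, prior_lon, prior_month):
        # Nine 1-degree cells = the observation's cell plus its 8 neighbors
        # (ignores longitude wraparound at +/-180 for simplicity).
        same_area = (
            abs(math.floor(obs_lat) - math.floor(prior_lat)) <= 1
            and abs(math.floor(obs_lon) - math.floor(prior_lon)) <= 1
        )
        # Three-calendar-month window centered on the observation's month,
        # in any year, wrapping around Dec/Jan.
        month_diff = min((obs_month - prior_month) % 12, (prior_month - obs_month) % 12)
        return same_area and month_diff <= 1

    print(seen_nearby(37.8, -122.3, 12, 38.2, -121.9, 1))  # True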

As for how the model was trained, it's fairly well documented on the blog, including the different platforms used as well as changes in training techniques. Previously the model was updated twice per year, since it required several months to train. For the past year they've been using a transfer learning approach: the base model is trained on the images, then updated roughly once a month to reflect changes in taxa. The v2.0 model was trained on 60,000 taxa and 30 million photos. There are far more taxa on iNaturalist; however, there is a threshold of ~100 observations before a new species is included in the model.

https://www.inaturalist.org/blog/83370-a-new-computer-vision...

https://www.inaturalist.org/blog/75633-a-new-computer-vision...
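To illustrate the transfer-learning idea from those posts - this is not their actual pipeline, just the general shape: keep the trained vision backbone frozen and refit only the classification head when the taxa list changes.

    import tensorflow as tf

    # Frozen feature extractor; the real backbone and weights may differ.
    backbone = tf.keras.applications.Xception(
        include_top=False, weights="imagenet",
        input_shape=(299, 299, 3), pooling="avg",
    )
    backbone.trainable = False

    num_taxa = 60_000  # grows as new species pass the ~100-observation threshold
    model = tf.keras.Sequential([
        backbone,
        tf.keras.layers.Dense(num_taxa, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # model.fit(images, labels, ...)  # far cheaper than retraining from scratch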


>It looked like they were also working on considering location when determining species, but the model I use doesn't do that.

I do this with fish for very different work, and there's a good chance the model for your species just doesn't exist yet. For fish we have 6,000 distribution models based on sightings (aquamaps.org), but there are at least 20,000 species. These models have levels of certainty ranging from 'expert had a look and fixed it slightly manually' to 'automatically made based on just three sightings' to 'no model, as we don't have great sightings data'. So it may be that the model uses location, just not for the species you have?


Well, there's no way to feed lat/lng or similar into this particular tensorflow model.


That is actually surprising. Surely they use location at some point in the ID process. It's possible they have a secondary location-based model to do sorting/ranking after the initial detection?

Merlin's bird detection system is almost non-functional without location.


Yeah, that's true! You can't really do that; these models are just polygons, and all we do is double-check whether the other methods' predictions overlap with these polygons as a second step.
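That second step is basically just a point-in-polygon test. A toy version (using shapely, with the range polygon invented here rather than loaded from real data):

    from shapely.geometry import Point, Polygon

    # Toy range polygon in (lon, lat); a real one would come from distribution data.
    yellowfin_range = Polygon([(-180, -40), (180, -40), (180, 40), (-180, 40)])

    def passes_range_check(lat, lon, range_polygon, buffer_deg=1.0):
        # shapely uses (x, y) = (lon, lat); a small buffer forgives edge cases.
        point = Point(lon, lat)
        return range_polygon.buffer(buffer_deg).contains(point)

    print(passes_range_check(55.0, 10.0, yellowfin_range))  # False - too far north

    # predictions = [("Thunnus albacares", 0.62), ...]
    # kept = [(sp, s) for sp, s in predictions
    #         if passes_range_check(obs_lat, obs_lon, ranges[sp])]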


Sounds like a real-life Pokémon Snap. You should add a digital professor who gives you points based on how good your recent photos are. (Size of subject in photo, focus, in-frame, and if the animal is doing something interesting.)


That doesn't work well even when it's the only game mechanic and everything else is designed around making it work.

https://www.awkwardzombie.com/comic/angle-tangle

It's not likely to work well on actual photos of actual wildlife.


That just adds to the fun.


That sounds like an awesome setup! Would you be willing to share your script with another bird photography enthusiast?


Comments like these are why I lurk on HN. Genius solution.

As a birder I have thousands of bird photos and would pay for this service.


This post fits the username perfectly.


You wrote your own custom RAW photo viewer? Like, including parsing? That's incredibly cool, do you share it anywhere?

Also why not just darktable / digikam?


I would pay for this


Thanks for sharing - I was curious too but didn’t delve in myself.


If you're willing, it's totally fine to share your work with the model itself removed.



I'd never heard of this app, but your description made me want to install it. When I googled it I was surprised at the app ratings:

Apple: 4.8

Google Play: 3.4

The most common issue mentioned by negative Play Store reviews is the camera not focusing on the right thing, and needing to try many different angles before something is recognized correctly. This probably has nothing to do with the underlying model, which I assume is the same on both platforms.


Camera zoom is definitely annoying; there's no way to control how zoomed in it is.

And yes, it often takes as much as a minute to identify a species, because you have to keep adjusting zoom and angle and trying to catch every important feature.

That said, once you are used to it, it becomes less noticeable and just feels like part of the game.


I'm curious why this seems (from the reviews) to be an issue only on Android?


I tried this app for a while, and there are definitely some rough edges. My partner's phone was much quicker at recognizing plants and flowers, so after a while I gave up and we just used her phone instead.

And then there's the issue of many plants being slightly misidentified, including many really common flowers. I don't really know how people get close enough to wild animals to identify them; I had no luck with animal life, and after a while I started mistrusting its results and cross-referencing with Google Lens anyway.


Our own app Birda has this issue too; most of the 1-star reviews are 'not the app I was looking for' or 'not a game'.


I wish I had the same experience as you. The vast majority of the time I point it at tree leaves in South East Asia, it tells me "Dicots" and stops there. Only rarely do I get the actual full classification; the last time that happened was for a common almond tree.


It's very bad at trees for some reason. Also mushrooms, but I thought that might be intentional so they don't get blamed for someone eating something poisonous that was misidentified.

PlantNet often works better for trees.


Trees are generally difficult to classify well with computer vision. It's hard for the models to establish context because, at a scale where you can see the whole tree, you tend to include lots of background. If you include a bark photo, it's often ambiguous whether there's growth/stuff/weathering on top. Flowers tend to be good inputs.

The training imagery is also really inconsistent in iNaturalist, and again for plants it's hard to establish context. These are mostly smartphone pictures that non-experts have captured in the app. While someone might have verified that a plant is a particular species, because there isn't any foreground/background segmentation, there are often confounding objects in the frame. On top of that you only get a ~300px input to classify from. With animals I'd say it's much more common for photographers to isolate the subject. There's also massive class imbalance in iNat; a large number of the observations are things like mallards (i.e. common species in city parks).

I guess the best solution would be to heavily incorporate geographic priors and expected species in the location (which I think is partly done already).
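As a toy illustration of what incorporating geographic priors could mean (made-up numbers, not iNat's actual method): multiply each vision score by a prior for how likely the species is at that place and season, then renormalize.

    def combine(vision_scores, geo_prior):
        # Down-weight species the location data says are unlikely here.
        combined = {sp: s * geo_prior.get(sp, 0.01) for sp, s in vision_scores.items()}
        total = sum(combined.values()) or 1.0
        return {sp: v / total for sp, v in combined.items()}

    print(combine({"Quercus agrifolia": 0.4, "Quercus robur": 0.5},
                  {"Quercus agrifolia": 0.9, "Quercus robur": 0.05}))
    # The locally common coast live oak now outranks the visually similar English oak.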


Flowers are crucial for human IDs as well. A lot of tropical tree leaves are very similar, so without context they're virtually impossible to visually distinguish.


Yeah, this is a good point. I've done some work with local experts doing tree ID from drone images over rainforest, and there were several species where they would need to see the underside of the leaves to get any closer than the family.


My experience has been great with mushrooms, just to add another datapoint. I mean, it's often about as good as you can get by eye without breaking out the lab equipment.


It seems to do well for trees for me in California.


For trees, try to photograph the flowers, the seeds, the bark, the leaves (both sides), the trunk growth habit (especially the bottom portion), and the upper branches' growth habit. Often, when asking it to suggest a species, switching between these will make progress.


Probably because many people around the world participated in classifying what was posted?

I am guessing. Please tell me if that is correct. How do they prevent false labels?


Any observation can be submitted, but an observation has to be verified by a different observer. Most identifiers are folks with more experience identifying things locally, and the data quality is high. There's very little incentive to game the system, and if something is misidentified, other iNatters can add identifications correcting the mistake, which happens regularly - various scientists/specialists tend to sweep observations in their taxa of note and correct issues. There are criteria for a "high quality" observation, including being verified, and only observations that meet them are used for training.


There are hundreds of thousands of "false" labels. Pictures can be classified many times.


I've always wondered: how do you determine truth on sites like this?


You ask actual experts for identification.


Most (or all?) of iNaturalist's code is open source (typically MIT), see https://github.com/inaturalist

A lot of iNaturalist data is open data, see https://github.com/inaturalist/inaturalist-open-data

(It's up to each user to decide how they want to license their observations, photos, and sounds. The options are all-rights-reserved [no license], CC0, CC-BY, and various other CC licenses.)


You can also get all research-grade observation data from a DarwinCore archive[1] that's updated monthly and exported to GBIF[2]. They also have a pretty good API[3], and if you happen to want to use that in a python application or script, I maintain a client library[4] for it.

[1]: https://www.inaturalist.org/pages/developers

[2]: https://www.gbif.org/dataset/50c9509d-22c7-4a22-a47d-8c48425...

[3]: https://api.inaturalist.org/v1/docs/

[4]: https://github.com/pyinat/pyinaturalist
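For example, pulling a few research-grade monarch records looks roughly like this (parameter names mirror the /observations API; see the docs for the full signature):

    from pyinaturalist import get_observations

    response = get_observations(
        taxon_name="Danaus plexippus",  # monarch butterfly
        quality_grade="research",
        per_page=5,
    )
    for obs in response["results"]:
        print(obs["observed_on"], obs["place_guess"])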


In case anyone else is wondering why it's not on Fdroid: https://github.com/inaturalist/iNaturalistAndroid/issues/654

(tldr: uses Google Maps and analytics)


I wonder if the IzzyOnDroid repository [1] (for F-Droid) will accept them. Their inclusion policy is "slightly less strict than F-Droid’s" in their words, but reading it [2] doesn't immediately clarify to me if this one would be allowed or not.

[1] https://apt.izzysoft.de/fdroid/index/info [2] https://gitlab.com/IzzyOnDroid/repo/#what-are-the-requiremen...


You can download it from their GitHub, apparently.

I just limp along with aurora, though.


Do you know if this includes the model this commenter was talking about? https://news.ycombinator.com/item?id=37516652

Feels like that'd be useful for all sorts of stuff.


I'm a big fan of their "Seek" app and have used it a bunch. It has a similar "game" loop to Pokémon Go, but without any mtx, plus you learn a lot about your local ecosystems.

If you have kids I highly recommend pulling out the app during a nature walk.


In case others have the same question I did...

Microtransactions, sometimes abbreviated as mtx, are a business model where users can purchase virtual goods with micropayments within a game. Microtransactions are often used in free-to-play games to provide a revenue source for the developers.


I do the same thing with the iNaturalist app. I joke with my wife that I'm filling out my Pokédex. Currently at 164 species.


I'm so glad this happened and they weren't acquired by some shitty tech company.


Perhaps the most successful "natural" social network out there - well deserved. Now if we can just get citizen scientists cheap scopes and better camera lenses, so we can actually make those photos of critters < 2mm useful (at scale), we'd see research potential rocket.


With the adoption of multi-lens phone cameras allowing better zooming, the general quality of insect observation photos has improved (some - there are still a lot of terrible photos). I actually started submitting a lot more insect observations on iNaturalist after I got an iPhone 12, and other folks on the forums have mentioned that upgraded phone cameras have improved their observations and increased the range of species they can submit. I'd love more zoom; there are micro-moths and lots of small aquatic inverts I'd be happy to be able to observe. Individuals do have some incentive to find new species over time to increase species counts - I'm up to 590 identified insect species and am trying to top my bird species count of 632. Improved cameras over time help improve model training and add species that are harder to photograph as tech improves.



Oh, a tiger moth - nice! I have been working on my Lepidoptera for a while; I always love seeing those, and sphinx moths are so cool. I'm up to six species of sphinx moths; the tersa sphinx is my favorite:

https://www.inaturalist.org/observations/176100012

Also, that cottonwood dagger - whoa! I checked out your life list; you've found a lot of really interesting insect species. I'm in Austin and the differences are pretty dramatic and quite interesting. You can see my insect life list here: https://www.inaturalist.org/lifelists/steven_bach?view=tree&...


That moth is really neat! The cottonwood dagger was what led me to iNaturalist, as I had no idea what it was. Been posting stuff since and have learned a lot.


What camera do you use? This one is so nicely zoomed https://www.inaturalist.org/observations/132443637

Almost looks like an image from an electron microscope the way the grass stands up.


Just an iPhone - the grass isn't real, though.


If anyone has suggestions for a smartphone scope/lens, I'd be happy to hear them.


The Moment lenses and cases are probably the most well-received lenses for mobile devices. They have macro lenses as well: https://www.shopmoment.com/mobile/phone-lenses


Great news. Seek is the #1 app I recommend to parents for their children if they're going to be using a smartphone. A tool that helps us engage with and appreciate our natural environment, with nice gamification. It helped me identify some of my favorite sages and other aromatic plants.


I'm a huge fan of the Seek app. It does a much better job than the attempts Apple makes at IDing plants and animals in photos I've taken.

I'm sure it's an issue of the image not having enough data, but Apple seems to always try and pin down a species or two whereas Seek seems to only tell me what it knows.


Has anyone used PlantNet's app[0] and compared accuracy to iNaturalist? I've been using the former and it seems to identify things successfully, though I can't say I've used it to identify more "exotic" flora. Might give iNaturalist a try.

[0]: https://identify.plantnet.org/


I’ve been using PlantNet over iNaturalist. One reason is that PlantNet gives confidence scores for identification. In general PN seems very good for European plants.


I haven't used PlantNet, though I've been working on learning plant identification/taxonomy and have been using iNaturalist. Their models are limited by their inputs: if you've got a wild angiosperm that's flowering, it's pretty good. With human cultivars, even with flowers, it's not always great - though no shade there, that's what I'd expect from visually trained models that focus on plants in the wild. If you aim it at a leaf there's not enough visual data to categorize; if you aim it at a full plant it's so-so. I mostly stick to flowering plants and have an easy time.


I try PlantNet when Seek can't identify something. It often feels more successful because it always gives a list of possibilities, but when the top answer is like 40% probable I don't actually know if it's doing any better than Seek.

Downsides are that it requires service, and obviously it doesn't work for animals. Seek works perfectly in the middle of the wilderness with zero service, it always feels incredible.


I'd never heard of iNaturalist until recently. Then it turned out one of my young adult family members had started developing quite a reputation on there - has found a few undescribed/ambiguous/out of range species of animal in quite a short period of time, and I think that it might have a strong influence on their coming career.


This is so beautiful! Congrats to everyone for pulling off a great project - now it can actually continue to do great things, with a better future ahead.

Very positive to see


Anecdotal, but I've gotten better results from iNaturalist than from the built in camera search on my phone, especially for insects.


For insects, iNaturalist has the best trained models I know of, though of course not all insects are visually identifiable from a photo, especially beetles. It's actually incredibly good with birds as well. I post observations to iNaturalist daily and use the AI all the time for identification; even when I know the species, the models usually figure it out as well, and it saves typing.

I have used iNaturalist since 2014. The quality wasn't great initially, but with continued observations adding high-quality annotations to the data, their training improved pretty quickly; by 2017-18 it was already really good.


> especially beetles

Dang. I was just hoping maybe I could identify the beetles that come out and crawl around my floor around dusk.

I've been trying Seek on them, and when it identifies a species it says "Strawberry Seed Beetle", but that seems unlikely unless strawberry seed beetles are common household pests.


J.B.S. Haldane joked that if some god or divine being had actually created all living organisms on Earth, then that creator must have an "inordinate fondness for beetles." They're incredibly diverse; this article has a nice table that sums things up: https://en.wikipedia.org/wiki/Global_biodiversity

There are ~6,400 mammal species total, while insects clock in at ~1,000,000. Of the insects, Coleoptera (the beetles) are estimated at 360-400k species. There are about 2,000,000 described species in Animalia total, and a bit less than a quarter of them are beetles.

If you use iNaturalist, it's more helpful on that count than Seek, since some beetles are more identifiable than others and there are entomologists who might be able to help confirm some IDs. I've had luck with a number of ground beetles and some other species.


I also came to the conclusion that I should try iNaturalist. It doesn't appear to be especially lively, but I did find several "nearby" observations of what I can only assume is the same beetle, and one had been labeled (by the uploader) Harpalus sinicus. (Context: middle of Shanghai.)

The Harpalus identification seems pretty safe. Identifying the species from a photo seems like a hopeless task, so further identification would presumably have to rely on background knowledge like "there's only one common Harpalus species in your area".

The strawberry seed beetle, as far as I can tell, is in fact nearly visually identical, so in some sense Seek made a respectable guess.


Seek is iNaturalist. If you want, after Seek identifies it you can hit a button to post to the iNaturalist website, and then human users will review and add their own identification.


I love the app. Each week I discover a new-to-me species of insect in my garden. When the AI fails to work (relatively rarely), the community usually identifies it manually pretty quickly.

I’ve had less success with trees but it’s still pretty good.

The in-app camera focus is a bit iffy. Usually I take the photo using the phone's native camera app to work around it.


What's their model for deciding when to say a new species has been found, rather than just a variant of a known species?

Or more generally, what's the definition of a species? It can't be just whether they can produce offspring together, because some different species have been shown to produce viable offspring.


Like a lot of biology, it's complicated, and often you need multiple forms of evidence to support a species classification if the viable-offspring test (the biological species concept) doesn't suit.

So there are lots of different models for different types of organisms. And some organisms reproduce asexually, so reproductive compatibility would never be enough.

Two simple options are evolutionary lineage (effectively genetic distance) or physical characteristics (morphology): https://bio.libretexts.org/Courses/University_of_California_...


If they could drop their dependencies on Google Services for Seek I would be ecstatic. The identification works perfectly, but I would love to be able to view/post locations.


Are there any APKs available outside of the Play Store?


Seems like there are APKs on the GitHub repo: https://github.com/inaturalist/iNaturalistAndroid/releases


Hopefully they'll get the app into F-Droid.


Old thread, but looks like due to non-FOSS dependencies, it's not likely to happen: https://github.com/inaturalist/iNaturalistAndroid/issues/104...


Good on ‘em!

Great service, and great app.





